How To Make A GIF Using Python | An Application with The United States Wind Turbine Database

Figure 1: GIF demonstration using random data and 'gifly' algorithm.

The graphics interchange format, or GIF, has become widely used for animation in both the general internet and science/data communities. GIFs are a great way to consolidate images into one dynamic file while avoiding video codec issues between platforms. The GIF overcomes cross-compatibility issues by being self-contained as a looping image file without any audio. Its format is standardized, so the animation remains unchanged whether it is placed in PowerPoint, Word, HTML, or elsewhere. Its simplicity and consistency make it a great candidate for data visualization applications where sound is not needed and repeated looping is beneficial.


Algorithm and Python Function

Below is the function used to create each GIF in this tutorial. Each time it is called, the function saves the current matplotlib figure as a .png file in a nearby folder. When the final figure has been saved, the library 'imageio' reads the .png files in order, assembles them with the selected frame interval, and saves them as an animated .gif file. The GitHub repository for creating a Python GIF can be found at the project's page, entitled 'gifly.' You can also click on the GIF above to be taken to the project's page.

import matplotlib.pyplot as plt
import os,imageio

def gif_maker(gif_name,png_dir,gif_indx,num_gifs,dpi=90):
    # make png path if it doesn't exist already
    if not os.path.exists(png_dir):
        os.makedirs(png_dir)
    # save each .png for GIF
    # lower dpi gives a smaller, grainier GIF; higher dpi gives larger, clearer GIF
    # os.path.join keeps the file inside png_dir even without a trailing slash
    plt.savefig(os.path.join(png_dir,'frame_'+str(gif_indx)+'_.png'),dpi=dpi)
    plt.close('all') # comment this out if you're just updating the x,y data
    if gif_indx==num_gifs-1:
        # sort the .png files based on index used above
        images,image_file_names = [],[]
        for file_name in os.listdir(png_dir):
            if file_name.endswith('.png'):
                image_file_names.append(file_name)
        sorted_files = sorted(image_file_names, key=lambda y: int(y.split('_')[1]))
        # define some GIF parameters
        frame_length = 0.5 # seconds between frames
        end_pause = 4 # seconds to stay on last frame
        # loop through files, join them to image array, and write to the GIF named by gif_name
        for ii in range(0,len(sorted_files)):
            file_path = os.path.join(png_dir, sorted_files[ii])
            if ii==len(sorted_files)-1:
                # repeat the last frame so the GIF pauses before looping
                for jj in range(0,int(end_pause/frame_length)):
                    images.append(imageio.imread(file_path))
            else:
                images.append(imageio.imread(file_path))
        # the duration is the time spent on each image (1/duration is frame rate)
        # note: newer imageio releases may expect milliseconds here; check your version
        imageio.mimsave(gif_name, images,'GIF',duration=frame_length)
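The numeric sort key in gif_maker matters once there are ten or more frames, because a plain alphabetical sort would place 'frame_10_.png' before 'frame_2_.png'. The short sketch below illustrates this with made-up file names that follow the 'frame_<index>_.png' pattern used above. It is only an illustration and is not part of the gifly repository.

```python
# hypothetical frame names following the 'frame_<index>_.png' pattern
names = ['frame_' + str(i) + '_.png' for i in (0, 1, 2, 10, 11)]

# plain string sort: compares character by character
alphabetical = sorted(names)

# numeric sort: pulls out the index between the underscores, as gif_maker does
numeric = sorted(names, key=lambda y: int(y.split('_')[1]))

print(alphabetical)
print(numeric)
```

With the alphabetical ordering, frames 10 and 11 would play before frames 1 and 2, which is why the index is converted to an integer before sorting.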

Simple Implementation of Gifly

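The code below is the script used to create the wavy GIF at the top of this page. It uses simple random number generation and sinusoidal warping, and updates the figure using matplotlib's 'set_ydata' and 'set_xdata.' This method creates GIFs very quickly because it only has to update the data in the existing figure and save it as a .png file. Note that this approach requires the plt.close('all') line in gif_maker to be commented out, as its inline comment indicates; otherwise the figure is closed after the first frame.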

import numpy as np
import matplotlib.pyplot as plt
from gifly import gif_maker
plt.style.use('ggplot')
interv = (0,1000)
x = np.linspace(interv[0],interv[1],1000)
x = x+(100*np.sin(0.01*x))
y = np.linspace(interv[0],interv[1],1000)+np.random.random(len(x))*20
tot_gifs = 20
x_plot,y_plot = [],[]
axes = plt.gca()
axes.set_ylim([np.min(y),np.max(y)])
axes.set_xlim([np.min(x),np.max(x)])
plot1, = axes.plot(0,0)
for ii in range(0,tot_gifs):
    # append the next slice of points, then update the existing line in place
    x_plot.extend(x[ii*int(len(x)/tot_gifs):(ii+1)*int(len(x)/tot_gifs)])
    y_plot.extend(y[ii*int(len(x)/tot_gifs):(ii+1)*int(len(x)/tot_gifs)])
    plot1.set_xdata(x_plot)
    plot1.set_ydata(y_plot)
    gif_maker('straight_line_noise.gif','./gif_maker_png/',ii,tot_gifs,dpi=120)
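
For planning purposes, the approximate running time of one loop of the finished GIF follows from the two parameters hard-coded in gif_maker, frame_length and end_pause. The helper below uses a hypothetical name, gif_length_seconds, introduced only for this illustration. It mirrors how the function repeats the last frame int(end_pause/frame_length) times, and it assumes the viewer honors the requested frame duration.

```python
def gif_length_seconds(num_frames, frame_length=0.5, end_pause=4):
    """Approximate playback time of one loop, mirroring gif_maker's frame list."""
    # every frame except the last is shown once
    regular = num_frames - 1
    # the last frame is appended int(end_pause/frame_length) times
    repeats = int(end_pause / frame_length)
    return (regular + repeats) * frame_length

# 20 frames, as in the script above
print(gif_length_seconds(20))
```

For the 20-frame example this gives 19 regular frames plus 8 repeats of the last frame, each lasting 0.5 seconds.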

More Complex GIFs

I will be using the United States Wind Turbine Database (USWTDB) [download here] in this section of the tutorial. It contains over 50,000 points, and I will be using some more advanced plotting and analysis tools, so please see the Python documentation for any unfamiliar methods you come across.

The wind turbine database has its own viewer (admittedly better than anything we'll be creating here), so if you'd like to see that and inspect the points, head here. The data, for reference, look like the figure shown below, which we will be attempting to replicate in .gif format to show the installation of turbines over time.

Figure 2: U.S. Wind Turbine Database viewer

Figure 3: Distribution of Turbine Power Capacity [in kW]

The code below is the analysis script used to produce the GIF in Figure 4. One may notice that I re-plot the Basemap on each iteration; this is because I was unable to overlay the data atop the previous Basemap during looping. Perhaps someone is capable of solving this issue, but I was not. As a result, the computation time is quite long for large loops. In my case, however, a few seconds or minutes for a high-quality animation is not an issue.

#!/usr/bin/python
from mpl_toolkits.basemap import Basemap
import matplotlib
import matplotlib.colors as colors
import matplotlib.pyplot as plt
import numpy as np
import csv
from gifly import gif_maker
from scipy import stats
font = {'family' : 'sans-serif',
'size' : 26}
matplotlib.rc('font', **font)
# Grabbing the .csv data
lats,lons,capacity,year = [],[],[],[]
with open('./uswtdbCSV/uswtdb_v1_1_20180710.csv') as csvfile:
    reader = csv.DictReader(csvfile,delimiter=',')
    for data in reader:
        # only taking continental U.S. data and getting rid of unknown years
        if float(data['p_year'])<1950.0 or float(data['ylat'])>50 or\
           float(data['ylat'])<24 or float(data['xlong'])>-66 or\
           float(data['xlong'])<-124 or float(data['t_cap'])<0:
            continue
        lats.append(float(data['ylat']))
        lons.append(float(data['xlong']))
        capacity.append(float(data['t_cap']))
        year.append(float(data['p_year']))
# sorting the data based on year the turbine was built
y = np.argsort(year)
year_sort = np.array(year)[y]
lats_sort = np.array(lats)[y]
lons_sort = np.array(lons)[y]
capacity_sort = np.array(capacity)[y]
# plot and loop parameters
zoom_scale = 3
curr_year = year_sort[0]
x_array,y_array,cap_array,color_array = [],[],[],[]
# Setup the bounding box for the zoom and bounds of the map
bbox = [np.min(lats_sort)-zoom_scale,np.max(lats_sort)+zoom_scale,\
np.min(lons_sort)-zoom_scale,np.max(lons_sort)+zoom_scale]
# create the basemap for lat/lon scatter plotting
m = Basemap(projection='merc',llcrnrlat=bbox[0],urcrnrlat=bbox[1],\
llcrnrlon=bbox[2],urcrnrlon=bbox[3],lat_ts=10,resolution=None)
# directory to be created for .png files that the GIF will need
png_dir = './png_files_size/'
# indexing for loop year
gif_indx = 0
# set capacity bounds for size interpolation
cap_min = np.min(capacity_sort)
cap_max = np.max(capacity_sort)
colormap = plt.cm.coolwarm
normalize = matplotlib.colors.Normalize(vmin = cap_min,vmax = cap_max)
loop_size = len(year_sort)
num_gifs = len(np.unique(year_sort))
for pp in range(0,loop_size):
    if year_sort[pp]==curr_year:
        x,y = m(lons_sort[pp],lats_sort[pp])
        x_array.append(x)
        y_array.append(y)
        cap_array.append(np.interp(capacity_sort[pp],[cap_min,cap_max],[30,200]))
        color_array.append(capacity_sort[pp])
        if pp!=loop_size-1:
            continue
    # reaching this point means the year changed (or the data ended): draw a frame
    # recreate figure each loop
    fig = plt.figure(figsize=(12,7))
    m = Basemap(projection='merc',llcrnrlat=bbox[0],urcrnrlat=bbox[1],\
                llcrnrlon=bbox[2],urcrnrlon=bbox[3],lat_ts=10,resolution=None)
    m.bluemarble() # this plots the earth-like contour to the U.S. map
    # scatter new data with the color and size changes
    scat1 = plt.scatter(x_array,y_array,s=cap_array,c = color_array,edgecolors='#444444',alpha=0.5,cmap=colormap,norm=normalize)
    plt.colorbar(scat1,label='Average Power [kW]')
    plt.ylabel(str(year_sort[pp-1])) # updated year
    gif_maker('wind_turbine_yearly_with_colors.gif',png_dir,gif_indx,num_gifs,90)
    gif_indx+=1
    if year_sort[pp]!=curr_year:
        # start the new year with the point that triggered this frame,
        # which would otherwise be left out of every later frame
        curr_year = year_sort[pp]
        x,y = m(lons_sort[pp],lats_sort[pp])
        x_array.append(x)
        y_array.append(y)
        cap_array.append(np.interp(capacity_sort[pp],[cap_min,cap_max],[30,200]))
        color_array.append(capacity_sort[pp])
Figure 4: Turbine installation from 1981 to 2018.
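
One possible way around re-drawing the whole figure for each frame, offered here as a sketch rather than a confirmed fix, is to create the scatter once and then update it in place with matplotlib's PathCollection methods set_offsets, set_sizes, and set_array. The example below uses plain matplotlib axes and random stand-in data (no Basemap and no USWTDB data), so whether it carries over to a Basemap background would still need to be checked.

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; frames are only drawn, not shown
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# stand-in data: 100 random points with a made-up 'capacity' value
xs, ys = rng.random(100), rng.random(100)
caps = rng.uniform(0, 3000, 100)

fig, ax = plt.subplots(figsize=(6, 4))
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
# draw the scatter once, with fixed color limits so colors stay comparable
scat = ax.scatter(xs[:1], ys[:1], s=30, c=caps[:1],
                  cmap='coolwarm', vmin=0, vmax=3000)

n_frames = 5
for ii in range(n_frames):
    n_show = (ii + 1) * len(xs) // n_frames  # points visible in this frame
    scat.set_offsets(np.column_stack([xs[:n_show], ys[:n_show]]))  # positions
    scat.set_sizes(np.interp(caps[:n_show], [0, 3000], [30, 200]))  # marker sizes
    scat.set_array(caps[:n_show])  # values mapped to colors
    fig.canvas.draw()
    # a call such as gif_maker(...) could save the frame here,
    # provided its plt.close('all') line is commented out

n_points_final = len(scat.get_offsets())
plt.close(fig)
```

Because only the scatter is modified, the background would be drawn a single time, which is where most of the time goes in the script above.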


More Complicated Plot with Marker Size and Color Changes

The scatter plot above shows the temporal variation of wind turbine installations across the continental U.S. However, I don't find it very insightful regarding the distribution of actual wind power, so I decided to take fuller advantage of Python's plotting capabilities as well as the range of data available in the USWTDB.

In the GIF below, the reader may notice that the colors differ between the original .png files and the final .gif. I believe this is because of the color limitations of the .gif format. The information is still there, though the colors appear much dimmer than they should.
wind_turbine_dist_size.gif
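
The dimming is consistent with a known property of the format: each GIF frame is limited to a palette of at most 256 colors, while a .png saved by matplotlib can hold far more. The sketch below uses a synthetic gradient image rather than one of the turbine frames. It counts the distinct colors in an RGB array and then applies a crude uniform 3-3-2 bit quantization to show how much color detail is lost when squeezing into 256 colors. GIF writers generally choose their palettes differently, so this illustrates the limit itself, not what imageio does internally.

```python
import numpy as np

def count_colors(rgb):
    """Number of distinct (R,G,B) triplets in an HxWx3 uint8 image."""
    return len(np.unique(rgb.reshape(-1, 3), axis=0))

# synthetic 64x64 image: red varies along x, green along y, blue is their mean
h = w = 64
r = np.tile(np.linspace(0, 255, w), (h, 1))
g = np.tile(np.linspace(0, 255, h)[:, None], (1, w))
b = (r + g) / 2
rgb = np.dstack([r, g, b]).astype(np.uint8)

# keep the top 3 bits of red and green and the top 2 bits of blue,
# which allows at most 8*8*4 = 256 distinct colors
quantized = rgb.copy()
quantized[..., 0] &= 0b11100000
quantized[..., 1] &= 0b11100000
quantized[..., 2] &= 0b11000000

print(count_colors(rgb), 'colors before,', count_colors(quantized), 'after')
```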

The script for reproducing the temporal capacity distribution of wind turbines can be found below. 

#!/usr/bin/python
from mpl_toolkits.basemap import Basemap
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import csv
from gifly import gif_maker
font = {'family' : 'sans-serif',
'size' : 26}
matplotlib.rc('font', **font)
# Grabbing the .csv data
lats,lons,capacity,year = [],[],[],[]
with open('./uswtdbCSV/uswtdb_v1_1_20180710.csv') as csvfile:
    reader = csv.DictReader(csvfile,delimiter=',')
    for data in reader:
        # only taking continental U.S. data and getting rid of unknown years
        if float(data['p_year'])<1950.0 or float(data['ylat'])>50 or\
           float(data['ylat'])<24 or float(data['xlong'])>-66 or\
           float(data['xlong'])<-124 or float(data['t_cap'])<0:
            continue
        lats.append(float(data['ylat']))
        lons.append(float(data['xlong']))
        capacity.append(float(data['t_cap']))
        year.append(float(data['p_year']))
# sorting the data based on year the turbine was built
y = np.argsort(year)
year_sort = np.array(year)[y]
lats_sort = np.array(lats)[y]
lons_sort = np.array(lons)[y]
capacity_sort = np.array(capacity)[y]
# plot and loop parameters
zoom_scale = 3
curr_year = year_sort[0]
x_array,y_array = [],[]
# Setup the bounding box for the zoom and bounds of the map
bbox = [np.min(lats_sort)-zoom_scale,np.max(lats_sort)+zoom_scale,\
np.min(lons_sort)-zoom_scale,np.max(lons_sort)+zoom_scale]
# create the basemap for lat/lon scatter plotting
m = Basemap(projection='merc',llcrnrlat=bbox[0],urcrnrlat=bbox[1],\
llcrnrlon=bbox[2],urcrnrlon=bbox[3],lat_ts=10,resolution=None)
# directory to be created for .png files that the GIF will need
png_dir = './png_files/'
# indexing for loop year
gif_indx = 0
loop_size = len(year_sort)
num_gifs = len(np.unique(year_sort))
for pp in range(0,loop_size):
    if year_sort[pp]==curr_year:
        x,y = m(lons_sort[pp],lats_sort[pp])
        x_array.append(x)
        y_array.append(y)
        if pp!=loop_size-1:
            continue
    # reaching this point means the year changed (or the data ended): draw a frame
    # recreate figure each loop
    fig = plt.figure(figsize=(12,7))
    m = Basemap(projection='merc',llcrnrlat=bbox[0],urcrnrlat=bbox[1],\
                llcrnrlon=bbox[2],urcrnrlon=bbox[3],lat_ts=10,resolution=None)
    m.bluemarble() # this plots the earth-like contour to the U.S. map
    # scatter new data (linewidths must be a number, not a string)
    plt.scatter(x_array,y_array,s=20,c='#D5D8DC',linewidths=0.3,edgecolors='#34495E',alpha=0.8)
    plt.ylabel(str(year_sort[pp-1])) # updated year
    gif_maker('function_test.gif',png_dir,gif_indx,num_gifs,90)
    gif_indx+=1
    if year_sort[pp]!=curr_year:
        # start the new year with the point that triggered this frame,
        # which would otherwise be left out of every later frame
        curr_year = year_sort[pp]
        x,y = m(lons_sort[pp],lats_sort[pp])
        x_array.append(x)
        y_array.append(y)
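
The year-by-year accumulation in the loop above can also be expressed with boolean masks, which some readers may find easier to follow. The sketch below uses a tiny made-up array of installation years (not the USWTDB data) and a hypothetical helper name, frames_by_year, to build the set of point indices for each frame: every turbine installed in or before a given year belongs to that year's frame. It is an alternative formulation, not the code used for the GIFs here.

```python
import numpy as np

def frames_by_year(years):
    """Yield (year, indices), where indices select all points up to that year."""
    years = np.asarray(years)
    for yr in np.unique(years):  # np.unique returns the years in sorted order
        yield int(yr), np.flatnonzero(years <= yr)

# hypothetical installation years for six turbines
demo_years = [1981.0, 1981.0, 1983.0, 1990.0, 1990.0, 1990.0]

for yr, idx in frames_by_year(demo_years):
    # in a real script: scatter the points selected by idx and call gif_maker(...)
    print(yr, len(idx))
```

The number of frames produced this way equals len(np.unique(years)), which matches the num_gifs value passed to gif_maker in the scripts above.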

Conclusion

Above, I presented a Python function for creating GIFs by saving .png files to a nearby directory and assembling them with the 'imageio' library. This method has served me well in my data visualization work, and I hope it will be useful for a scientist or researcher somewhere in industry or academia. The wind turbine database proved to be a very interesting application, and the visualizations produced using the 'gifly' method gave meaningful insight into the boom of wind turbine energy.

Several issues arose while creating this tutorial. First, I couldn't avoid re-plotting the Basemap on every iteration. Perhaps someone else is able to update the scatter atop the Basemap, but I was not. Second, in the transition from the .png images to the .gif image, some colors were lost, which at times produced an animation that differed noticeably from the .png images. This can be mitigated by sticking to a conventional 256-color palette when saving the .png files.

