
How to plot in Python

The response to the question “How do I make plots in Python?” used to be simple: Matplotlib was the only way. However, Python is now the language of data science, and it offers a lot more options.

In this article, we illustrate how to use each of the four most popular Python plotting libraries—Matplotlib, Seaborn, Plotly, and Bokeh—as well as a couple of promising newcomers: Altair, with its expressive API, and Pygal, with its beautiful SVG performance. We’ll also take a look at pandas’ extremely useful plotting API.

Plotting in Python

Matplotlib

Matplotlib is the oldest and most widely used Python library for visualization, graphing, and plotting. First released in 2003, it is part of the SciPy stack, an open-source scientific computing ecosystem comparable to MATLAB.

Installing Matplotlib

Pip is the simplest way to install matplotlib. In the terminal, run the following command:

pip install matplotlib

The other option is to do a manual download and install it.

Graphing x and y coordinates

The plot() function draws points (markers) in a diagram.

By default, plot() draws a line from point to point.

The function takes parameters that specify the points in the diagram.

The first parameter is an array containing the points on the x-axis.

The second parameter is an array containing the points on the y-axis.

If we want to plot a line from (1, 3) to (8, 10), we must pass two arrays to the plot function: [1, 8] and [3, 10].

Example 1: Plotting a Linear Line

import matplotlib.pyplot as plt
import numpy as np

x_coords = np.array([5, 12])
y_coords = np.array([7, 14])
plt.plot(x_coords, y_coords)
plt.show()

Figure: Example 1: Plotting a Linear Line

Plotting a line

# importing the required module
import matplotlib.pyplot as plt

# values in the x axis
x_coords = [3,4,5]
# corresponding values in the y axis
y_coords = [4,6,3]

# finally, getting the points  plotted
plt.plot(x_coords, y_coords)

# name the x axis as 'the x axis'
plt.xlabel('the x axis')

# name the y axis as 'the y axis.'
plt.ylabel('the y axis')

# giving a title to the graph
plt.title('The initial Graph by matplotlib.pyplot  !')

# display the plot
plt.show()

Figure: Plotting a line

The code is largely self-explanatory. The steps were as follows:

  • Define the x-axis values and the corresponding y-axis values as lists.
  • Use the .plot() function to plot them on a canvas.
  • Using the .xlabel() and .ylabel() functions, give the x- and y-axes a name.
  • Using the .title() function, give your plot a title.
  • Finally, use the .show() function to display your plot.

Using the same plot to plot two or more lines

import matplotlib.pyplot as plt

# first line points
first_x = [1,2,3]
first_y = [2,4,1]

# plotting the first line points
plt.plot(first_x, first_y, label = "first line")

# second line points
second_x = [1,2,3]
second_y = [4,1,3]

# plotting the second line points
plt.plot(second_x, second_y, label = "second line")

# x axis name
plt.xlabel(' the x axis')

# y axis name
plt.ylabel('the y axis')

# give a title to the graph
plt.title(' Plotting Two lines on the same graph!')

# legend -key
plt.legend()

# display the plot
plt.show()

Figure: Using the same plot to plot two or more lines

On the same graph, we plot two lines. We distinguish them by giving each a name (label), passed as an argument to the .plot() function.

The legend is a small rectangular box that shows the style and color of each line next to its label. Using the .legend() function, we can add a legend to our plot.

How to customize the Plot

We’ll go through some basic customization that can be applied to almost any plot.

import matplotlib.pyplot as plt

# values in the x axis
x_coords = [3,4,5,6,7,8]
# corresponding values in the y axis
y_coords = [4,6,3,7,4,8]

# finally plotting the points
plt.plot(x_coords, y_coords, color='green', linestyle='dashed', linewidth = 3,
		marker='o', markerfacecolor='r', markersize=12)

# setting x and y axis range
plt.ylim(3,10)
plt.xlim(3,10)

# x axis naming
plt.xlabel('the x axis')

# y axis naming
plt.ylabel('the y axis')

# specify graph title
plt.title('Graph Customizations !')

# display the plot
plt.show()

Figure: Graph customizations

As you can see, we’ve made several changes, including

  • setting the line width, line style, and line color.
  • setting the marker, the color of the marker’s face, and the size of the marker.
  • overriding the axis ranges on the x and y axes. If the ranges are not overridden, pyplot’s auto-scale feature sets the axis range and scale.

Bar Graph

import matplotlib.pyplot as plt

# x-coordinates of the bars (each bar is centered on its coordinate by default)
left = [3, 4, 5, 6, 7]

# bars' height
height = [12, 26, 38, 42, 7]

# bars' labels
bar_labels = ['first', 'second', 'third', 'fourth', 'fifth']


# bar chart plotting; tick_label names each bar on the x-axis
plt.bar(left, height, tick_label = bar_labels,
		width = 0.8, color = ['red', 'blue'])

# the x-axis'  naming
plt.xlabel(' the x axis')

# the y-axis  naming
plt.ylabel('the y axis')

#  title of the chart
plt.title(' The Bar Chart!')

# display the plot
plt.show()

Figure: Bar graph in matplotlib

To build a bar chart, we use the plt.bar() function.

We pass it the x-coordinates of the bars and the heights of the bars. By default, each bar is centered on its x-coordinate.

By defining tick labels, you can also give x-axis coordinates a name.
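
As a minimal sketch of the tick-label idea, the snippet below names three bars through the tick_label argument of bar() and reads back where the bars were placed. The variable names and the non-interactive Agg backend are assumptions made so the sketch runs without a display; they are not part of the example above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

positions = [3, 4, 5]                    # x-coordinates of the bars
heights = [12, 26, 38]                   # bar heights
names = ['first', 'second', 'third']     # names shown under the bars

fig, ax = plt.subplots()
bars = ax.bar(positions, heights, tick_label=names, width=0.8)

# With the default align='center', each bar is centered on its x-coordinate.
bar_centers = [bar.get_x() + bar.get_width() / 2 for bar in bars]
bar_heights = [bar.get_height() for bar in bars]
print(bar_centers, bar_heights)
```

Replacing the print call with plt.show() displays the chart in an interactive session.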

Histogram

import matplotlib.pyplot as plt

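# ages of the people surveyed (the data to be binned)
ages = [7,10,75,45,35,50,55,50,48,45,49,
		65,12,18,62,23,95,82,37,26,25,45]

# set the number of intervals and the range of values
# (named value_range so that Python's built-in range is not shadowed)
value_range = (0, 100)
bins = 10

# histogram plotting
plt.hist(ages, bins, value_range, color = 'blue',
		histtype = 'bar', rwidth = 0.65)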

# label for the x-axis
plt.xlabel('age')

# label for the  frequency
plt.ylabel('The count of people')

# title  plotting
plt.title('The Histogram')

# display the plot
plt.show()

Figure: Histogram in matplotlib

To plot a histogram, we use the plt.hist() function.
The ages list is passed as the data to be counted.
A tuple containing the min and max values may be used to define the range.
The next step is to “bin” the range of values, which means dividing the entire range into a series of intervals and counting the values that fall into each interval. We’ve set bins = 10 in this case. As a result, there are 10 intervals, each 100/10 = 10 units wide.
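
To see the numbers behind the bars, the sketch below bins the same ages with NumPy's np.histogram(), using the same bin count and range. np.histogram() is not used in the example above; it is brought in here only to print the counts and interval edges.

```python
import numpy as np

ages = [7, 10, 75, 45, 35, 50, 55, 50, 48, 45, 49,
        65, 12, 18, 62, 23, 95, 82, 37, 26, 25, 45]

# 10 equal-width intervals between 0 and 100
counts, edges = np.histogram(ages, bins=10, range=(0, 100))

# Each interval is [left, right), except the last, which also includes 100.
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:5.0f} to {right:5.0f}: {count}")
```

The counts add up to the number of ages in the list, since every value lies inside the range.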

Scatter graph

import matplotlib.pyplot as plt

# values of the  x-axis
x = [3,4,6,8,7,8,9,10,11,12]

# y-axis values
y = [4,6,7,9,8,10,11,13,14,14]

# scatter plot
plt.scatter(x, y, label= "stars", color= "blue",
			marker= "*", s=30)

# the x axis label
plt.xlabel('the x axis')

# the y axis label
plt.ylabel('the y axis')

# title for the  plot
plt.title('The scatter plot')

# display the legend
plt.legend()

# display the plot
plt.show()

Figure: Scatter plot in matplotlib

To draw a scatter plot, we use the plt.scatter() function.
We define the x- and y-axis values in the same way as we do for a line.
The marker argument defines the character to use as a marker. The s parameter specifies its size.

Pie-Chart

import matplotlib.pyplot as plt

# labels definition
my_hobbies = ['walk', 'read', 'dance', 'work']

# each labels' portion
slices = [5, 9, 10, 8]

# each label's color
colors = ['r', 'y', 'g', 'b']

# pie chart - plotting
plt.pie(slices, labels = my_hobbies, colors=colors,
		startangle=90, shadow = True, explode = (0, 0, 0.1, 0),
		radius = 1.2, autopct = '%1.1f%%')

# plot the legend
plt.legend()

# display the plot
plt.show()

Figure: Pie-Chart in matplotlib

Using the plt.pie() function, we create a pie chart.
To begin, we use a list called my_hobbies to define the labels.
Then, using a separate list called slices, we give each label’s portion.
A list of named colors sets the color for each wedge.
If shadow = True, a shadow is drawn beneath the pie.
The startangle argument rotates the start of the pie chart by the given number of degrees counterclockwise from the x-axis.
The explode argument sets the fraction of the radius by which each wedge is offset.
The autopct argument formats the value shown on each wedge. We’ve set it to display the percentage to one decimal place.
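
The percentages that autopct formats can be worked out by hand. The sketch below is an illustration in plain Python, not Matplotlib's internal code: it converts the slices into percentages of the total and applies the same '%1.1f%%' format string to each one.

```python
slices = [5, 9, 10, 8]
total = sum(slices)

# share of each slice, as a percentage of the whole pie
percentages = [100 * value / total for value in slices]

# apply the autopct format string: one decimal place plus a literal % sign
wedge_texts = ['%1.1f%%' % pct for pct in percentages]
print(wedge_texts)
```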

How to plot an equation’s curves?

# the needed modules are imported
import matplotlib.pyplot as plt
import numpy as np

#set the x coordinates here
x_coords = np.arange(0, 2*(np.pi), 0.1)

# setting the corresponding y - coordinates
y_coords = np.sin(x_coords)

# plot the given points
plt.plot(x_coords, y_coords)

plt.title("How to plot an equation's curves")

# display the plot
plt.show()

Figure: How to plot an equation’s curves

NumPy

NumPy is a general-purpose array-processing package in Python.

We use the np.arange() method to set the x-axis values, with the first two arguments giving the start and end of the range and the third giving the step-wise increment. The result is a NumPy array.

To get the corresponding y-axis values, we simply apply NumPy’s predefined np.sin() function to that array.

Finally, we use the plt.plot() function to plot the points using the x and y arrays. So far, we have explored several types of plots we can make with matplotlib; there are more that we have not covered here.
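
As a quick check of what np.arange() and np.sin() return, the sketch below builds the same arrays and inspects them without plotting anything.

```python
import numpy as np

# start, stop (not included), step
x_coords = np.arange(0, 2 * np.pi, 0.1)

# np.sin is applied element by element, so the result has the same shape
y_coords = np.sin(x_coords)

print(type(x_coords).__name__, x_coords.shape, y_coords.shape)
print(x_coords[:3], x_coords[-1])
```

Because the stop value is excluded, the last x-coordinate stays just below 2π.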

Subplots

Subplots are multiple plots drawn within a single figure.

When we want to display two or more plots in the same figure, we need to use subplots.

Approach 1:

# importing required modules
import matplotlib.pyplot as plt
import numpy as np

# generation of coordinates
def create_plot(ptype):

	# x-axis values set
	x_coords = np.arange(-10, 10, 0.01)
	
	# y-axis values set
	if ptype == 'linear':
		y_coords = x_coords
	elif ptype == 'quadratic':
		y_coords  = x_coords **2
	elif ptype == 'cubic':
		y_coords  = x_coords**3
	elif ptype == 'quartic':
		y_coords  = x_coords **4
			
	return(x_coords , y_coords )

# set your preferred style
plt.style.use('fivethirtyeight')

#  figure creation
_fig = plt.figure()

# in the figure define subplots and their positions
plot_1 = _fig.add_subplot(221)
plot_2  = _fig.add_subplot(222)
plot_3  = _fig.add_subplot(223)
plot_4  = _fig.add_subplot(224)

# plotting points on every single subplot
x_coords, y_coords = create_plot('linear')
plot_1.plot(x_coords, y_coords, color ='r')
plot_1.set_title('$y_1 = x$')

x_coords, y_coords = create_plot('quadratic')
plot_2.plot(x_coords, y_coords, color ='b')
plot_2.set_title('$y_2 = x^2$')

x_coords, y_coords = create_plot('cubic')
plot_3.plot(x_coords, y_coords, color ='g')
plot_3.set_title('$y_3 = x^3$')

x_coords, y_coords = create_plot('quartic')
plot_4.plot(x_coords, y_coords, color ='k')
plot_4.set_title('$y_4 = x^4$')

# adjust the space between subplots
_fig.subplots_adjust(hspace=.5,wspace=0.5)

# display the plot
plt.show()

Figure: Approach 1

Let’s take a look at this program one step at a time:

plt.style.use('fivethirtyeight')

Plots can be styled in various ways, including using one of the available styles or creating your own.

_fig = plt.figure()

All plot elements are contained in a top-level container called a figure. Here, we create a figure named _fig, which will contain all of our subplots.

plot_1 = _fig.add_subplot(221)
plot_2 = _fig.add_subplot(222)
plot_3 = _fig.add_subplot(223)
plot_4 = _fig.add_subplot(224)

To define subplots and their locations, we use the _fig.add_subplot method. Its signature looks like this:

add_subplot(nrows, ncols, plot_number)

When you add a subplot to a figure, the figure is split into ‘nrows’ * ‘ncols’ sub-axes. The ‘plot_number’ parameter specifies the subplot that the function call must create. ‘plot_number’ can be anything from 1 to ‘nrows’ * ‘ncols’.

If all three parameters have values less than 10, add_subplot can be called with a single int parameter, with the hundreds digit representing ‘nrows’, the tens digit representing ‘ncols’, and the units digit representing ‘plot_number’. It means that we can write add_subplot(234) instead of add_subplot(2, 3, 4).
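
The equivalence of the two call styles can be checked directly. The sketch below creates one subplot with the three-digit shorthand and one with three separate arguments, each on its own figure, and compares where they land. The figure names and the Agg backend are assumptions chosen so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

fig_short = plt.figure()
ax_short = fig_short.add_subplot(234)       # three-digit shorthand

fig_long = plt.figure()
ax_long = fig_long.add_subplot(2, 3, 4)     # nrows, ncols, plot number

# (x0, y0, width, height) of each subplot, in figure coordinates
short_bounds = ax_short.get_position().bounds
long_bounds = ax_long.get_position().bounds
print(short_bounds)
print(long_bounds)
```

Both calls select the fourth cell of a 2 x 3 grid, which is the first cell of the bottom row.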

This diagram will demonstrate how positions are defined:

x_coords, y_coords = create_plot('linear')
plot_1.plot(x_coords, y_coords, color ='r')
plot_1.set_title('$y_1 = x$')

Then, on each subplot, we plot our points. First, we use the create_plot function to generate the x- and y-axis coordinates by specifying the type of curve we want.

Then, using the .plot method, we plot those points on our subplot. The set_title method sets the title of a subplot. When you put $ at the beginning and end of the title text, ‘_’ (underscore) is read as a subscript and ‘^’ is read as a superscript.

_fig.subplots_adjust(hspace=.5,wspace=0.5)

This method adjusts the spacing between subplots.

plt.show()

Finally, we use the plt.show() method to display the current figure.

Approach 2:

# importing required modules
import matplotlib.pyplot as plt
import numpy as np

# generation of coordinates
def create_plot(ptype):
	# set values for the x-axis
	x_coords = np.arange(0, 5, 0.01)
	
	# set the values for y-axis
	if ptype == 'sin':
		# a sine wave
		y_coords = np.sin(2*np.pi*x_coords)
	elif ptype == 'exp':
		# exponential function is  negative
		y_coords = np.exp(-x_coords)
	elif ptype == 'hybrid':
		# sine wave is  damped
		y_coords = (np.sin(2*np.pi*x_coords))*(np.exp(-x_coords))
			
	return(x_coords, y_coords)

# set the style to use
plt.style.use('ggplot')

# defining subplots and their positions
plot_1 = plt.subplot2grid((11,1), (0,0), rowspan = 3, colspan = 1)
plot_2 = plt.subplot2grid((11,1), (4,0), rowspan = 3, colspan = 1)
plot_3 = plt.subplot2grid((11,1), (8,0), rowspan = 3, colspan = 1)

# plotting points on each subplot
x_coords, y_coords = create_plot('sin')
plot_1.plot(x_coords, y_coords, label = 'sine wave', color ='b')
x_coords, y_coords = create_plot('exp')
plot_2.plot(x_coords, y_coords, label = 'negative exponential', color = 'r')
x_coords, y_coords = create_plot('hybrid')
plot_3.plot(x_coords, y_coords, label = 'damped sine wave', color = 'g')

# show legends of each subplot
plot_1.legend()
plot_2.legend()
plot_3.legend()

# function to show plot
plt.show()

Figure: Approach 2

Let’s go through some of the most critical aspects of this program:

plot_1 = plt.subplot2grid((11,1), (0,0), rowspan = 3, colspan = 1)
plot_2 = plt.subplot2grid((11,1), (4,0), rowspan = 3, colspan = 1)
plot_3 = plt.subplot2grid((11,1), (8,0), rowspan = 3, colspan = 1)

subplot2grid is similar to “pyplot.subplot,” but it employs 0-based indexing and allows the subplot to occupy several cells.

Let’s take a look at the subplot2grid method’s arguments:

  1. argument 1: the geometry of the grid, as (rows, columns)
  2. argument 2: the grid position of the subplot, as (row, column)
  3. argument 3: (rowspan) the number of rows that the subplot covers
  4. argument 4: (colspan) the number of columns that the subplot covers

This diagram will help to clarify the concept:

Each subplot in our example spans three rows and one column, and the subplots are separated by two empty rows (the 4th and 8th rows of the grid, at indices 3 and 7).

x_coords, y_coords = create_plot('sin')
plot_1.plot(x_coords, y_coords, label ='sine wave', color ='b')

There’s nothing remarkable about this section because the syntax for plotting points on a subplot remains the same.

plot_1.legend()

Calling .legend() on a subplot shows that subplot’s labels in a legend.

plt.show()

Finally, the plt.show() function is used to display the current figure.

Note: Based on the above two examples, we may conclude that the subplot() method should be used when the plots are uniform in size, while the subplot2grid() method should be used when we want more flexibility in the location and sizes of our subplots.

Plotting in three dimensions

Matplotlib makes it easy to build 3-D graphs. Let’s examine some of the most important and widely used 3-D plots.

How to Plot Points

from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
from matplotlib import style
import numpy as np

# custom style – is set here
style.use('ggplot')

#construct a new plotting figure
_fig = plt.figure()

# Make a new subplot on our diagram and make the projection 3d
ax_1 = _fig.add_subplot(111, projection='3d')

# define x, y, z co-ordinates
x_coords = np.random.randint(0, 10, size = 20)
y_coords = np.random.randint(0, 10, size = 20)
z_coords = np.random.randint(0, 10, size = 20)

# plotting the points on subplot
ax_1.scatter(x_coords, y_coords, z_coords, c = 'm', marker = 'o')

# setting labels for the axes
ax_1.set_xlabel('the x axis')
ax_1.set_ylabel('the y axis')
ax_1.set_zlabel('the z axis')

# display the plot
plt.show()

Figure: Make a new subplot on our diagram with a 3d projection

The above program’s output will give you a window where you can rotate or zoom the plot. In the resulting plot, darker points are nearer to the viewer than lighter ones.

This section breaks down the code’s most relevant features

from mpl_toolkits.mplot3d import axes3d

It is the module required to plot in three dimensions.

ax_1 = _fig.add_subplot(111, projection='3d')

On our figure, we construct a subplot and set the projection argument to 3d.

ax_1.scatter(x_coords, y_coords, z_coords, c = 'm', marker = 'o')

To plot the points in XYZ space, we now use the .scatter() function.

Plotting lines

# required modules  imported
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
from matplotlib import style
import numpy as np

# deciding on a unique style to use
style.use('ggplot')

# construct a new plotting figure
_fig = plt.figure()

# create a new subplot on our figure
ax_1 = _fig.add_subplot(111, projection='3d')

# defining x, y, z co-ordinates
x_coords = np.random.randint(0, 10, size = 5)
y_coords = np.random.randint(0, 10, size = 5)
z_coords = np.random.randint(0, 10, size = 5)

# plotting a line through the points on the subplot
ax_1.plot(x_coords, y_coords, z_coords)

# setting the labels
ax_1.set_xlabel('the x axis')
ax_1.set_ylabel('the y axis')
ax_1.set_zlabel('the z axis')

plt.show()

A screenshot of the above program’s 3-D plot will look like this:

Figure: Lines plotting in 3d

The following is the key difference between this program and the previous one:

ax_1.plot(x_coords, y_coords, z_coords)

To draw a line through a sequence of 3-D points, we used the .plot() method, which accepts x, y, and z coordinates on a 3-D subplot. The .plot_wireframe() method expects 2-D coordinate arrays, so it is better suited to surfaces.

Plotting bars

# importing required modules
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
from matplotlib import style
import numpy as np

# setting a custom style to use
style.use('ggplot')

# construct a new plotting figure
_fig = plt.figure()

# add a new subplot to our diagram
ax_1 = _fig.add_subplot(111, projection='3d')

# defining x, y, z co-ordinates for bar position
x_coords = [3,4,5,6,7,8,9,10,11,12]
y_coords = [6,5,3,8,7,5,9,7,5,9]
z_coords = np.zeros(10)

# size of bars
_dx = np.ones(10)			 # length measured along the x-axis
_dy = np.ones(10)			 # length measured along the y-axis
_dz = [5,7,8,6,10,12,9,9,14,13] # height of bar

# establishing a color scheme
color = []
for val in _dz:
	if val > 7:
		color.append('r')
	else:
		color.append('b')

# bars  plotting
ax_1.bar3d(x_coords, y_coords, z_coords, _dx, _dy , _dz , color = color)

# setting axes labels
ax_1.set_xlabel('the x axis')
ax_1.set_ylabel('the y axis')
ax_1.set_zlabel('the z axis')

plt.show()

Here is a screenshot of the resulting 3-D plot:

Figure: Bars Plotting in 3d

Let’s go through some of the most critical aspects of this program:

x_coords = [3,4,5,6,7,8,9,10,11,12]
y_coords = [6,5,3,8,7,5,9,7,5,9]
z_coords = np.zeros(10)

The base positions of bars are described here. When z = 0 is set, all bars begin on the XY plane.

_dx = np.ones(10) # length measured along the x-axis
_dy = np.ones(10) # length measured along the y-axis
_dz = [5,7,8,6,10,12,9,9,14,13] # height of bar

The size of each bar is given by _dx, _dy, and _dz. Think of a bar as a cuboid whose extents along the x, y, and z axes are _dx, _dy, and _dz, respectively.

for val in _dz:
  if val > 7:
    color.append('r')
  else:
    color.append('b')

We set the color of each bar in a list. Bars with a height greater than 7 are red, and all other bars are blue.
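
The same color rule can be written more compactly. The sketch below is an alternative spelling of the loop, using a list comprehension and the threshold of 7 from the program; the names bar_heights and bar_colors are illustrative only.

```python
bar_heights = [5, 7, 8, 6, 10, 12, 9, 9, 14, 13]

# 'r' for bars taller than 7, 'b' for everything else
bar_colors = ['r' if height > 7 else 'b' for height in bar_heights]
print(bar_colors)
```

A bar of height exactly 7 is not greater than 7, so it is colored blue.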

ax_1.bar3d(x_coords, y_coords, z_coords, _dx, _dy, _dz, color = color)

Finally, we use the .bar3d() function to plot the bars.

Curve plotting

# importing required modules
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
from matplotlib import style
import numpy as np

# deciding on a unique style to use
style.use('ggplot')

# construct a new plotting figure
_fig = plt.figure()

# create a new subplot on our figure
ax_1 = _fig.add_subplot(111, projection='3d')

# get points for a mesh grid
u, v = np.mgrid[0:2*np.pi:200j, 0:np.pi:100j]

# setting x, y, z co-ordinates
x_coords=np.cos(u)*np.sin(v)
y_coords=np.sin(u)*np.sin(v)
z_coords=np.cos(v)

# plot the sphere as a wireframe
ax_1.plot_wireframe(x_coords, y_coords, z_coords, rstride = 8, cstride = 8, linewidth = 1)

plt.show()

This program’s output would look like this:

Figure: Curve plotting in 3d

In this example, we plotted a sphere as a wireframe mesh.

The vital points worth considering include:

u, v = np.mgrid[0:2*np.pi:200j, 0:np.pi:100j]

We use np.mgrid to obtain the points that make up the mesh.

x_coords=np.cos(u)*np.sin(v)
y_coords=np.sin(u)*np.sin(v)
z_coords=np.cos(v)

It is nothing more than a sphere’s parametric equation.
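
A quick numerical check confirms that these coordinates describe a unit sphere. The sketch below rebuilds the grid and verifies that x² + y² + z² equals 1 at every grid point; it assumes nothing beyond NumPy.

```python
import numpy as np

# 200 values of u in [0, 2*pi] and 100 values of v in [0, pi]
u, v = np.mgrid[0:2*np.pi:200j, 0:np.pi:100j]

x = np.cos(u) * np.sin(v)
y = np.sin(u) * np.sin(v)
z = np.cos(v)

# every point of a unit sphere satisfies x^2 + y^2 + z^2 = 1
radius_squared = x**2 + y**2 + z**2
print(u.shape, np.allclose(radius_squared, 1.0))
```

The imaginary step (200j, 100j) tells np.mgrid how many points to generate, with both end points included.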

ax_1.plot_wireframe(x_coords, y_coords, z_coords, rstride = 8, cstride = 8, linewidth = 1)

Again, we use the .plot_wireframe() method. In this case, the rstride and cstride arguments set the row and column step used to sample the grid, which controls how dense the mesh is.

Plotting Without a Line

To plot only the markers, you can use the shortcut string notation parameter ‘o’, which means ‘rings’.

import matplotlib.pyplot as plt
import numpy as np

x_coords = np.array([3, 10])
y_coords = np.array([5, 12])

plt.plot(x_coords, y_coords, 'o')
plt.show()

Figure: Plotting without a line

Several Points

You can plot as many points as you like, as long as both axes have the same number of points.

As an illustration, draw a line in a diagram from position (3, 5) to position (4, 10), then to position (8, 3), and finally to position (10, 12):

import matplotlib.pyplot as plt
import numpy as np
x_coords = np.array([3, 4, 8, 10])
y_coords = np.array([5, 10, 3, 12])

plt.plot(x_coords, y_coords)
plt.show()

Figure: Plotting several points

X-Points by default

If the x-axis points are not specified, they get the default values 0, 1, 2, 3, and so on, up to one less than the number of y-points.

So, if we take a similar example but leave out the x-points, the diagram looks like this:

Plotting without x-points as an example:

import matplotlib.pyplot as plt
import numpy as np

y_coords = np.array([5, 11, 3, 12, 7, 9])

plt.plot(y_coords)
plt.show()

Figure: X-Points by default
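
The default x-values can be read back from the plotted line. The sketch below plots the same y-points without x-points and asks the resulting line for its x-data; the Agg backend is an assumption made so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np

y_coords = np.array([5, 11, 3, 12, 7, 9])

fig, ax = plt.subplots()
(line,) = ax.plot(y_coords)          # no x-points given

# the x-values Matplotlib filled in for us
default_x = line.get_xdata().tolist()
print(default_x)
```

With six y-points, the x-values run from 0 to 5.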

Matplotlib allows you to fine-tune your plots; for example, you can specify the x-position of each bar in a bar plot.

The racing results are plotted in Matplotlib as follows:

import matplotlib.pyplot as plt
time = [0, 1, 2, 3]
position = [0, 100, 200, 300]

plt.plot(time, position)
plt.xlabel('Time (hr)')
plt.ylabel('Position (km)')
plt.show()

Figure: Racing results

Pythonistas typically use the Matplotlib plotting library to display numeric data in plots, graphs, and charts. Matplotlib offers this functionality through two APIs (Application Programming Interfaces):

  • The pyplot API is a collection of functions that act on a current figure and axes, which makes matplotlib work like MATLAB.
  • The OO (Object-Oriented) API works with the figure and axes objects directly, which gives greater flexibility than pyplot. The OO API also gives you direct access to the backend layer of matplotlib.
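
To make the difference concrete, the sketch below draws the racing data twice, once through pyplot functions and once through explicit figure and axes objects. It is a minimal illustration rather than a recommendation of either style, and the Agg backend is an assumption made so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

time = [0, 1, 2, 3]
position = [0, 100, 200, 300]

# pyplot style: each function acts on the implicit "current" figure and axes
plt.figure()
plt.plot(time, position)
plt.xlabel('Time (hr)')
plt.ylabel('Position (km)')
pyplot_axes = plt.gca()              # fetch the axes pyplot has been drawing on

# object-oriented style: the same plot built through explicit objects
fig, ax = plt.subplots()
ax.plot(time, position)
ax.set_xlabel('Time (hr)')
ax.set_ylabel('Position (km)')

print(pyplot_axes.get_xlabel(), ax.get_xlabel())
```

Both styles produce the same kind of axes object; the OO style simply keeps a named reference to it, which helps when a figure contains several subplots.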

How to Use the plot() Function to Make a Simple Plot

The matplotlib.pyplot.plot() function offers a single interface for making various plot types.

In the simplest case, plot() draws values as x,y coordinates in a data plot. Here, plot() takes two parameters that define the plot coordinates:

  • An array of X-axis coordinates.
  • An array of Y-axis coordinates.

By passing the two arrays [4, 10] and [6, 11], a line spanning from x=4, y=6 to x=10, y=11 can be plotted:

import matplotlib.pyplot as plt
import numpy as np

# coordinates on the x axis
x_coords = np.array([4, 10])

# coordinates on the Y axis
y_coords = np.array([6, 11])

plt.plot(x_coords, y_coords)
plt.show()

Figure: How to Use the plot() Function to Make a Simple Plot

Markers and Linestyles: Modify the Look of a Plot

The matplotlib keywords marker and linestyle can be used to customize the appearance of data in a plot without changing the data values.

The marker argument draws a ‘marker’ at each data value in a plot.

The linestyle argument changes the appearance of the lines between data values, or removes them entirely. In this example, the marker “o” marks every data value with a circle, and the linestyle “-.” draws dash-dot lines between them:

import matplotlib.pyplot as plt
import numpy as np

x_coords = np.array([4, 14, 5, 11])

# Customize the linestyle for each data value:
plt.plot(x_coords, marker = "o", linestyle = "-.")
plt.show()

Figure: Use markers and linestyles to modify the look of a plot

A partial list of string characters that can be used as markers and line styles is as follows:

  • “-” solid line style
  • “--” dashed line style
  • “-.” dash-dot line style
  • “ ” no line
  • “o” circle marker
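
These characters can be passed either through the marker and linestyle keywords or combined into a single format string. The sketch below draws the same data three ways and reads the resulting styles back from the line objects; the variable names and the Agg backend are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

values = [4, 14, 5, 11]
fig, ax = plt.subplots()

# keywords: circle markers joined by a dashed line
(keyword_line,) = ax.plot(values, marker="o", linestyle="--")

# format string: the same style written as one short string
(format_line,) = ax.plot(values, "o--")

# format string with a marker only: no connecting line is drawn
(markers_only,) = ax.plot(values, "o")

print(keyword_line.get_marker(), keyword_line.get_linestyle())
print(format_line.get_marker(), format_line.get_linestyle())
```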

Matplotlib also supports other plot types, such as scatter plots. The scatter() function displays data values as a set of x,y coordinates, each drawn as a single dot.

In this example, two arrays of equal length are plotted, one holding the X-axis values and the other the Y-axis values. Each pair of values is drawn as a dot:

Example of a Matplotlib Scatter Plot

import matplotlib.pyplot as plt

# values in the X axis
x_coords = [4,5,9,31,10,7,15,13,24,35]

# values in the Y axis
y_coords = [6,9,57,45,4,6,13,24,35,46]

# plotting a scatter
plt.scatter(x_coords, y_coords)

plt.show()

Figure: Example of a Matplotlib Scatter Plot

Multiple Data Sets in One Plot with Matplotlib

Matplotlib can draw several datasets in a single plot. We’ll plot two different data sets, (x_data1, y_data1) and (x_data2, y_data2), in this example:

import matplotlib.pyplot as plt
import numpy as np

# seed the random number generator (NumPy requires a seed between 0 and 2**32 - 1)
np.random.seed(54848499)

# Creation of random data
xdata = np.random.random([6, 12])  

# Creation of two datasets from the random floats
x_data1 = xdata[0, :]  
x_data2 = xdata[1, :]  

# Sort the data in both datasets:
x_data1.sort()  
x_data2.sort()

# Creation of y data points
y_data1 = x_data1 ** 2
y_data2 = 1 - x_data2 ** 4

# data plotting
plt.plot(x_data1, y_data1)  
plt.plot(x_data2, y_data2)  

# Set lower and upper limits for  x,y
plt.xlim([0, 1])  
plt.ylim([0, 1])  

plt.title("Multiple Datasets in One Plot")
plt.show()

Figure: Multiple Data Sets in One Plot with Matplotlib

Subplots with Matplotlib

Matplotlib can also be used to generate complex figures with multiple plots. Multiple axes are enclosed in one figure and shown in subplots in this example:

import matplotlib.pyplot as plt
import numpy as np

# Make a figure with two rows and two columns of subplots like follows
fig, ax = plt.subplots(2, 2)

x = np.linspace(25,30 , 125)

# Within a single figure, index four axes arrays in four subplots:
ax[0, 0].plot(x, np.sin(x), 'g') #row=0, column=0
ax[1, 0].plot(range(100), 'b') #row=1, column=0
ax[0, 1].plot(x, np.cos(x), 'r') #row=0, column=1
ax[1, 1].plot(x, np.tan(x), 'k') #row=1, column=1

plt.show()

Figure: Subplots with Matplotlib

Plotting the Phase Spectrum in Matplotlib

A phase spectrum plot shows the phase of each frequency component of a signal.

In this advanced example, we’ll plot the phase spectrum of a signal made of a sine wave plus filtered noise:

import matplotlib.pyplot as plt
import numpy as np

# pseudo-random numbers  generation
np.random.seed(0)

# interval of sampling  
dt = 0.01

# Frequency of  sampling
Fs = 1 / dt

# noise generation
t = np.arange(0, 10, dt)
res = np.random.randn(len(t))
r = np.exp(-t / 0.05)

# Convolution of 2 signals or functions
conv_res = np.convolve(res, r)*dt
conv_res = conv_res[:len(t)]
s = 0.5 * np.sin(1.5 * np.pi * t) + conv_res

# plot creation
fig, ax = plt.subplots()
ax.plot(t, s)

# the phase spectrum function plots
ax.phase_spectrum(s, Fs = Fs)

plt.title("Plotting of Phase Spectrum ")
plt.show()

Figure: Plotting the Phase Spectrum in Matplotlib

3D Plot with Matplotlib

By adding a Z-axis, Matplotlib can also handle 3D plots. We’ve already made a 2D scatter plot; in this example, we’ll make a 3D scatter plot:

from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

fig = plt.figure()

# Creation of a single 3D subplot
ax = fig.add_subplot(111, projection='3d')

# '111' is a MATlab convention for creating a grid with one row and one column that is utilized by Matplotlib.
# The new Axes location is the first cell in the grid.
# Create x,y,z coordinates:

x_coords =[3,4,5,6,7,8,9,10,11,12]
y_coords =[13,6,4,7,15,6,12,4,6,10]
z_coords =[4,5,6,7,7,9,11,13,21,11]

# Create a 3D scatter plot with x,y,z orthogonal axes and blue "o" markers:
ax.scatter(x_coords, y_coords, z_coords, c='blue', marker="o")

# Create x,y,z axis labels:
ax.set_xlabel(' the x Axis')
ax.set_ylabel('the y Axis')
ax.set_zlabel('the z Axis')

plt.show()

Figure: 3D Plot with Matplotlib

What Is a Matplotlib Backend and How Can I Use It?

Matplotlib can output to a wide range of formats and display targets. Plots are typically displayed in a data scientist’s Jupyter notebook, but they can also be displayed inside an application.

In this example, the TkAgg backend renders the figure with Agg (Anti-Grain Geometry), the FigureCanvasTkAgg class embeds it in a Tkinter window, and the Tk mainloop() function shows the plot:

import tkinter as tk

import matplotlib
matplotlib.use("TkAgg")
from matplotlib.figure import Figure

# canvas class that draws an Agg-rendered figure inside a Tk widget
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg

root = tk.Tk()

_fig = Figure(figsize=(5, 4), dpi=100)
plot = _fig.add_subplot(1, 1, 1)

x_coords = [ 0.1, 0.2, 0.3, 0.4 ]
y_coords = [ -0.1, -0.2, -0.3, -0.4 ]

plot.plot(x_coords, y_coords, color="red", marker="o",  linestyle="--")

canvas = FigureCanvasTkAgg(_fig, master=root)
canvas.draw()
canvas.get_tk_widget().grid(row=0, column=0)

root.mainloop()

Seaborn

Seaborn is an abstraction layer built on top of Matplotlib that provides a user-friendly interface for quickly creating various useful plot types.

It does not, however, make any concessions in terms of control! You still have full control since Seaborn provides escape hatches to access the underlying Matplotlib properties. Seaborn creates some of the most attractive statistical graphs that are very informative.

import seaborn as sns
import matplotlib.pyplot as plt

# default theme is applied
sns.set_theme()

# Load an example dataset
tips_data = sns.load_dataset("tips")

# Plot tip against total bill, one panel per time of day, colored and styled by sex
sns.relplot(
    data=tips_data,
    x="total_bill", y="tip", col="time",
    hue="sex", style="sex", size="size", facet_kws=dict(sharex=False),
)
plt.show()

Figure: Seaborn sample graph

Plotly

Plotly is a plotting ecosystem that includes a Python plotting library. It comes with three different interfaces:

  • An object-oriented interface
  • An imperative interface for specifying your plot using JSON-like data structures
  • Plotly Express, a high-level interface similar to Seaborn

Plotly plots are made to be used in web applications. Plotly is a JavaScript library at its heart! The plots are drawn with D3 and stack.gl.

By passing JSON to the JavaScript library, you can build Plotly libraries in other languages. That is exactly what the official Python and R libraries do. The Python Plotly API has also been ported to run in the web browser.

Example Scatter Plot
import plotly.graph_objects as go
import numpy as np

x_coords = np.linspace(0, 100, 1000)
y_coords = np.sin(x_coords)

fig = go.Figure(data=go.Scatter(x=x_coords, y=y_coords, mode='markers'))

fig.show()
Example Scatter Plot in Plotly
Scatter and Line Plots
import plotly.graph_objects as go

# Create random data with numpy
import numpy as np
np.random.seed(1)

count_val = 1000
random_x = np.linspace(0, 100, count_val)
random_y0 = np.random.randn(count_val) + 5
random_y1 = np.random.randn(count_val)
random_y2 = np.random.randn(count_val) - 5

_fig = go.Figure()

# Add traces
_fig.add_trace(go.Scatter(x=random_x, y=random_y0,
                    mode='markers',
                    name='markers only'))
_fig.add_trace(go.Scatter(x=random_x, y=random_y1,
                    mode='lines+markers',
                    name='lines & markers'))
_fig.add_trace(go.Scatter(x=random_x, y=random_y2,
                    mode='lines',
                    name='lines only'))

_fig.show()
Scatter and Line Plots in plotly
Bubble Scatter Plots in Plotly
import plotly.graph_objects as go

x_coords =[3, 4, 5, 6]
y_coords =[15, 16, 17, 18]

fig = go.Figure(data=go.Scatter(
    x=x_coords,
    y=y_coords,
    mode='markers',
    marker=dict(size=[50, 70, 90, 110],
                color=[0, 1, 2, 3])
))

fig.show()
Bubble Scatter Plots in Plotly

Style Scatter Plot in Plotly

import plotly.graph_objects as go
import numpy as np


var_counts = np.linspace(0, 10, 100)

_fg = go.Figure()

_fg.add_trace(go.Scatter(
    x=var_counts, y=np.sin(var_counts),
    name='Sine',
    mode='markers',
    marker_color='rgba(180, 0, 0, .7)'
))

_fg.add_trace(go.Scatter(
    x=var_counts, y=np.cos(var_counts),
    name='Cosine',
    marker_color='rgba(250, 187, 190, 1)'
))

# update_traces() applies the given settings to all traces at once
_fg.update_traces(mode='markers', marker_line_width=2, marker_size=10)
_fg.update_layout(title='Style Scatter Plots',
                  yaxis_zeroline=True, xaxis_zeroline=True)


_fg.show()
Style Scatter Plot in Plotly

Bokeh

Bokeh (pronounced “BOE-kay”) specializes in interactive plots, so a static example doesn’t do it justice. Like Plotly’s, Bokeh’s plots are designed to be embedded in web apps, and they are saved as HTML files.

Bokeh allows you to build interactive, JavaScript-powered visualizations that can be seen in a web browser.

Building a visualization with Bokeh is essentially a two-step process. First, you choose from Bokeh’s building blocks. Second, you customize those building blocks to meet your specific requirements.

Bokeh does this by combining two elements:

  • A Python library for specifying your visualization’s content and interactive features.
  • BokehJS, a JavaScript library that works in the background to display your interactive visualizations in a web browser.

Bokeh produces all of the necessary JavaScript and HTML code based on your Python code.
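
To get a feel for what Bokeh generates, the following sketch asks for the complete HTML document as a string instead of opening a browser. It assumes that the file_html() helper from bokeh.embed and the CDN resources object are available in your Bokeh version; the plot itself is a throwaway example.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

# Describe the plot in Python only
p = figure(title="Generated HTML example", x_axis_label="x", y_axis_label="y")
p.line([1, 2, 3, 4, 5], [6, 7, 2, 4, 5], line_width=2)

# Ask Bokeh for a complete, stand-alone HTML document as a string
html = file_html(p, resources=CDN, title="Generated HTML example")

print(len(html), "characters of generated HTML")
print("<script" in html)  # the page pulls in BokehJS through script tags
```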

How to install Bokeh on Debian-based distros (e.g., Ubuntu)

Run the following command in the terminal:

pip install bokeh
Drawing a Line Chart using Bokeh
from bokeh.plotting import figure, show

# data preparation
x_coords = [1, 2, 3, 4, 5]
y1_coords = [6, 7, 2, 4, 5]
y2_coords = [2, 3, 4, 5, 6]
y3_coords = [4, 5, 5, 7, 2]

# create a new plot with a title and axis labels
p = figure(title="Multiple line example", x_axis_label="x", y_axis_label="y")

# add multiple renderers
p.line(x_coords, y1_coords, legend_label="Temp.", line_color="blue", line_width=2)
p.line(x_coords, y2_coords, legend_label="Rate", line_color="red", line_width=2)
p.line(x_coords, y3_coords, legend_label="Objects", line_color="green", line_width=2)

# results are displayed using the subsequent command
show(p)
Drawing a Line Chart using Bokeh
Drawing a Bar Chart using Bokeh
from bokeh.plotting import figure, show

# data preparation
x_coords = [1, 2, 3, 4, 5]
y1_coords = [6, 7, 2, 4, 5]
y2_coords = [2, 3, 4, 5, 6]
y3_coords = [4, 5, 5, 7, 2]

# Make a new plot with a title and labels for the axes.
p = figure(title=" Bar Chart Example in Bokeh", x_axis_label="x", y_axis_label="y")

# addition of multiple renderers
p.line(x_coords, y1_coords, legend_label="Temp.", line_color="blue", line_width=2)
p.vbar(x=x_coords, top=y2_coords, legend_label="Rate", width=0.5, bottom=0, color="red")
p.circle(x_coords, y3_coords, legend_label="Objects", line_color="yellow", size=12)

# results are shown here
show(p)
Drawing a Bar Chart using Bokeh
Glyphs Customizations in Bokeh
from bokeh.plotting import figure, show

# prepare some data
x_coords = [1, 2, 3, 4, 5]
y_coords = [6, 7, 7, 9, 4]

# create a new plot with a title and axis labels
p = figure(title="Customizing Glyphs properties in Bokeh", x_axis_label="x", y_axis_label="y")

# add a circle renderer with additional arguments
circle = p.circle(
    x_coords,
    y_coords,
    legend_label="Objects",
    fill_color="red",
    fill_alpha=0.5,
    line_color="blue",
    size=80,
)

# modify the color of the glyph
glyph = circle.glyph
glyph.fill_color = "green"

# display results as a diagrammatic representation
show(p)
Glyphs Customizations in Bokeh
Combined Line and Glyphs in Bokeh
from bokeh.plotting import figure, show

# data preparation
x_coords = [3, 4, 5, 6, 7]
y1_coords = [5, 6, 6, 8, 3]
y2_coords = [3, 4, 5, 6, 7]

# new plot
p = figure(title="Line and Glyphs Combined")

# add line and circle renderers with legend_label arguments
line = p.line(x_coords, y1_coords, legend_label="Line.", line_color="green", line_width=1.5)
circle = p.circle(
    x_coords,
    y2_coords,
    legend_label=" The Objects",
    fill_color="red",
    fill_alpha=0.5,
    line_color="green",
    size=80,
)

#  declare the legend to be positioned in the top left corner
p.legend.location = "top_left"

# addition of legend title
p.legend.title = "Legend title"

# legend text appearance can be modified as follows
p.legend.label_text_font = "times"
p.legend.label_text_font_style = "italic"
p.legend.label_text_color = "navy"

# legend border and background is altered as below
p.legend.border_line_width = 3
p.legend.border_line_color = "navy"
p.legend.border_line_alpha = 0.8
p.legend.background_fill_color = "navy"
p.legend.background_fill_alpha = 0.2

# display the results
show(p)
Combined Line and Glyphs in Bokeh

Vectorizing Glyph Properties

In this section, you’ll use vectors of data to influence aspects of your plot and its elements.

Color vectorization

So far, you’ve used properties like fill_color to assign certain colors to a glyph.

To change colors depending on the values in a variable, pass a variable containing color information to the fill_color property, as shown below.

import random

from bokeh.plotting import figure, show

# generate some data (0-25 for x, random values for y)
x = list(range(0, 26))
y = random.sample(range(0, 100), 26)

# make a list of RGB hex colors derived from the values of y
colors = ["#%02x%02x%02x" % (255, int(round(value * 255 / 100)), 255) for value in y]

# creation of a new plot
p = figure(
    title="Example of vectorized colors in Bokeh",
    sizing_mode="stretch_width",
    max_width=500,
    plot_height=250,
)

# addition of both line and circle renderers
line = p.line(x, y, line_color="green", line_width=1)
circle = p.circle(x, y, fill_color=colors, line_color="red", size=15)

# display results
show(p)
Color vectorization
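
The color list in the example above is built with ordinary string formatting. The following sketch isolates that step, assuming values between 0 and 100; the helper name value_to_hex is hypothetical and is not part of Bokeh.

```python
def value_to_hex(value):
    """Map a value between 0 and 100 to a hex color (hypothetical helper)."""
    green = int(round(value * 255 / 100))       # scale 0-100 to 0-255
    return "#%02x%02x%02x" % (255, green, 255)  # red and blue stay at full intensity


colors = [value_to_hex(value) for value in (0, 20, 100)]
print(colors)  # ['#ff00ff', '#ff33ff', '#ffffff']
```

Low values come out magenta and high values fade toward white, which is the gradient visible in the plot.
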
Vectorizing colors and sizes

Apply the same technique to your renderer’s radius argument to build a plot with colors and sizes in proportion to your data as shown.

import numpy as np

from bokeh.plotting import figure, show

# data  generation
N = 1000
x_coords = np.random.random(size=N) * 100
y_coords = np.random.random(size=N) * 100

# based on the data given, generate colors and  radii
radii = y_coords / 100 * 2
colors = ["#%02x%02x%02x" % (130, int(round(value * 255 / 100)), 216) for value in y_coords]

# establish a  plot with a given size
p = figure(
    title="Vectorized Radii & Colors in Bokeh ",
    sizing_mode="stretch_width",
    max_width=500,
    plot_height=250,
)

# addition of a circle renderer
p.circle(
    x_coords,
    y_coords,
    radius=radii,
    fill_color=colors,
    fill_alpha=0.6,
    line_color="grey",
)

# display the results
show(p)
Vectorizing colors and sizes
Palettes for color mapping

Bokeh ships with dozens of pre-defined color palettes, including Brewer, D3, and Matplotlib palettes, that you can use to map colors to your data.

from bokeh.io import show
from bokeh.palettes import Turbo256
from bokeh.plotting import figure
from bokeh.transform import linear_cmap

# data generation
x_coords = list(range(-32, 33))
y_coords = [i**2 for i in x_coords ]

# create a linear color mapper
mapper = linear_cmap(field_name="y", palette=Turbo256, low=min(y_coords), high=max(y_coords))

# plot creation
_plot = figure(plot_width=500, plot_height=250)

# circle renderer created with color mapper
_plot.circle(x_coords, y_coords, color=mapper, size=10)

show(_plot)
Palettes for color mapping

Combining Plots
from bokeh.layouts import row
from bokeh.plotting import figure, show

# data preparation
x_coords = list(range(11))
y0_coords = x_coords
y1_coords = [10 - i for i in x_coords]
y2_coords = [abs(i - 5) for i in x_coords]

# create three plots with one renderer each
first_plot = figure(plot_width=250, plot_height=250, background_fill_color="#fafafa")
first_plot.circle(x_coords, y0_coords, size=12, color="#0000FF", alpha=0.8)

second_plot = figure(plot_width=250, plot_height=250, background_fill_color="#fafafa")
second_plot.triangle(x_coords, y1_coords, size=12, color="#00FF7F", alpha=0.8)

third_plot = figure(plot_width=250, plot_height=250, background_fill_color="#fafafa")
third_plot.square(x_coords, y2_coords, size=12, color="#FFFF00", alpha=0.8)

# place the plots in a row that scales automatically with the browser window's width
show(row(children=[first_plot, second_plot, third_plot], sizing_mode="scale_width"))
Combining Plots in Bokeh
Data collection and filtering

To import and filter data, you’ll use a variety of sources and structures in this section.

Making use of ColumnDataSource

The ColumnDataSource is Bokeh’s own data structure. So far, you’ve passed data to Bokeh as sequences such as Python lists and NumPy arrays, and Bokeh has automatically converted them into ColumnDataSource objects.

To construct a ColumnDataSource directly, follow these steps:

  • First, import ColumnDataSource.
  • Next, create a dict with your data. The dict’s keys are the column names (strings), and its values are lists or arrays of data.
  • Then, pass your dict to ColumnDataSource as the data argument.
  • Finally, use your ColumnDataSource as the source for your renderer.
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource

# a dict  is created as the foundation for ColumnDataSource
data = {'x_values': [1, 2, 3, 4, 5],
        'y_values': [6, 7, 2, 3, 6]}

# ColumnDataSource is then created based on the previous dict created initially
source = ColumnDataSource(data=data)

#  a plot and renderer with ColumnDataSource data is subsequently created
p = figure()
p.circle(x='x_values', y='y_values', source=source)
show(p)
Making use of ColumnDataSource
Converting pandas data

To use data from a pandas DataFrame, pass the DataFrame to a ColumnDataSource:

source = ColumnDataSource(df)
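
A slightly fuller sketch of the conversion is shown here. The DataFrame contents are made up for illustration and reuse the column names from the ColumnDataSource example above.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# Made-up DataFrame; each column becomes a column of the data source
df = pd.DataFrame({
    "x_values": [1, 2, 3, 4, 5],
    "y_values": [6, 7, 2, 3, 6],
})

source = ColumnDataSource(df)

# The column names are now available to renderers by name
print(sorted(source.data.keys()))
print(list(source.data["y_values"]))
```

A renderer can then refer to the columns by name, for example p.circle(x='x_values', y='y_values', source=source).
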
Data filtering

Bokeh has several filtering options. If you want to construct a specific subset of the data in your ColumnDataSource, use these filters.

These filtered subsets are referred to as “views” in Bokeh. The CDSView class in Bokeh represents views. Pass a CDSView object to your renderer’s view argument to plot with a filtered subset of data.

A CDSView object has two properties:

  • source: the ColumnDataSource to which the filters should be applied.
  • filters: a list of Filter objects.

The IndexFilter is the most basic filter. An IndexFilter takes a set of index locations and provides a view that only shows the data points corresponding to those positions.

If your ColumnDataSource has a list of five values and you apply an IndexFilter with [0,2,4], the resulting view will only show the first, third, and fifth entries from your original list:

from bokeh.layouts import gridplot
from bokeh.models import CDSView, ColumnDataSource, IndexFilter
from bokeh.plotting import figure, show

# ColumnDataSource is first created from a dict
vals = ColumnDataSource(data=dict(x=[3, 4, 5, 6, 7], y=[1, 2, 3, 4, 5]))

# using an IndexFilter create a view with the following index positions [0, 2, 4]
view = CDSView(source=vals, filters=[IndexFilter([0, 2, 4])])

# define the setup tools
setup_tools = ["box_select", "hover", "reset"]

# The first plot is created with all data in the ColumnDataSource
p = figure(plot_height=300, plot_width=300, tools=setup_tools )
p.circle(x="x", y="y", size=10, hover_color="red", source=vals)

# The second plot is created with a subset of ColumnDataSource, based on view
p_filtered = figure(plot_height=300, plot_width=300, tools=setup_tools)
p_filtered.circle(x="x", y="y", size=10, hover_color="red", source=vals, view=view)

# plots next to each other  are both shown in a gridplot layout
show(gridplot([[p, p_filtered]]))
Data filtering
Using Widgets

You’ll add interactive widgets to your plots in this section.

Adding widgets

Widgets are extra visual elements that can be added to your display. For example, widgets can display additional information or control elements of your Bokeh page interactively, as illustrated below.

from bokeh.layouts import layout
from bokeh.models import Div, RangeSlider, Spinner
from bokeh.plotting import figure, show

# data preparation
x_coords = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
y_coords = [4, 5, 5, 7, 2, 6, 4, 9, 1, 3]

# create a plot with circle glyphs
p = figure(x_range=(1, 9), plot_width=500, plot_height=250)
points = p.circle(x=x_coords, y=y_coords, size=30, fill_color="#21a7df")

# set up textarea (div)
div = Div(
    text="""
          <p>Use this control element to adjust the circle's size:</p>
          """,
    width=200,
    height=30,
)

#  spinner setup
spinner = Spinner(
    title="Circle size",
    low=5,
    high=50,
    step=5,
    value=points.glyph.size,
    width=200,
)
spinner.js_link("value", points.glyph, "size")

#  RangeSlider Setup
range_slider = RangeSlider(
    title="Adjust x-axis range",
    start=0,
    end=10,
    step=1,
    value=(p.x_range.start, p.x_range.end),
)
range_slider.js_link("value", p.x_range, "start", attr_selector=0)
range_slider.js_link("value", p.x_range, "end", attr_selector=1)

# layout creation (a separate name avoids shadowing the imported layout() function)
page_layout = layout(
    [
        [div, spinner],
        [range_slider],
        [p],
    ]
)

# result display
show(page_layout)
Using Widgets
Displaying and exporting

You generated, altered, and merged visualizations in the previous steps. You’ll utilize a variety of approaches to show and export your visualizations in this section.

Making a stand-alone HTML document

None of the examples so far set an output file explicitly: show() saved each visualization to an HTML file automatically. To choose the file yourself, use the output_file() function. The resulting HTML file contains all of the information needed to display your plot.

output_file() takes a number of arguments. For example:

  • filename: the HTML file’s filename.
  • title: your document’s title (to be used in the HTML’s <title> tag).

When you use the show() function, Bokeh generates an HTML file. This function also launches a web browser to display the HTML file.

If you only want Bokeh to generate the file and not open it in a browser, use the save() function instead. Import it first, then use save() exactly as you used show().

from bokeh.plotting import figure, output_file, save

# prepare some data
x_coords = [3, 4, 5, 6, 7]
y_coords = [4, 5, 5, 7, 2]

# set the output to a static HTML file
output_file(filename="custom_filename.html", title="HTML File Output in Bokeh")

# specifics of a new plot given here
p = figure(sizing_mode="stretch_width", max_width=500, plot_height=250)

# circle renderer added
circle = p.circle(x_coords, y_coords, fill_color="red", size=15)

# results saved to a file
save(p)
Making a stand-alone HTML document
Displaying in a Jupyter notebook

If you’re using Jupyter notebooks, replace Bokeh’s output_file() with output_notebook().

To see your visualization right inside your notebook, use the show() function.

Exporting PNG files

Exporting PNG or SVG files may require additional dependencies.

Bokeh uses Selenium to generate PNG and SVG files. Selenium lets Bokeh run in a browser without a graphical user interface (GUI), and Bokeh uses this browser to render the PNG or SVG files. For this to work, Selenium must be able to access either a Firefox browser (through a package called geckodriver) or a Chromium browser (through the chromedriver package).

Make sure that you have the required packages installed by using one of the following commands, depending on whether you’re using conda or pip:

conda install selenium geckodriver firefox -c conda-forge

or

pip install selenium

With pip, this command installs Selenium only. Firefox and geckodriver have to be installed separately.
from bokeh.io import export_png
from bokeh.plotting import figure

# data preparation
x_coords = [3, 4, 5, 6, 7]
y_coords = [4, 5, 5, 7, 2]

# a new plot is created with fixed dimensions
p = figure(plot_width=350, plot_height=250)

# a circle renderer is added here
circle = p.circle(x_coords, y_coords, fill_color="red", size=15)

# the results are saved to a file
export_png(p, filename="plot.png")
Summary of Bokeh’s plotting tools

Even a straightforward graph like this has interactive elements. To investigate, use the tools to the right of the plot:

  • Pan tool – move the graph within your plot.
  • Box zoom tool – zoom into a specific area of your plot.
  • Wheel zoom tool – zoom in and out with a mouse wheel.
  • Save tool – save the current view of your plot as a PNG file.
  • Reset tool – return to the plot’s default settings.
  • Help tool – learn more about the tools available in Bokeh.
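
The tools listed above can also be requested by name when the figure is created. The following is a minimal sketch; the tool names are passed as a comma-separated string, and the data is made up.

```python
from bokeh.plotting import figure

# Request a specific set of tools by name instead of relying on the defaults
p = figure(tools="pan,box_zoom,wheel_zoom,save,reset,help")
p.line([1, 2, 3, 4, 5], [6, 7, 2, 4, 5])

# Each name becomes a tool object on the plot's toolbar
tool_names = [type(tool).__name__ for tool in p.toolbar.tools]
print(tool_names)
```

Leaving a name out of the string removes that tool from the toolbar.
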
Building visualizations in a nutshell

You’ve just finished all of the basic steps required by Bokeh’s bokeh.plotting interface for most simple visualizations. These included:

Step 1: Getting the data ready

Although you used a simple Python list, other types of serialized data will also work.

Step 2: Calling the figure() function

figure() produces a plot with the most commonly used default settings. You may change your plot’s title, tools, and axes labels, among other things.

Step 3: Adding renderers

To make a line, you used line(). Renderers have many options for specifying visual attributes, including colors, legends, and widths.

Step 4: Asking Bokeh to show or save the results using show() or save()

These functions save your plot to an HTML file and, in the case of show(), also display it in a web browser.
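
The four steps can be sketched in one short script. The output location below is a hypothetical temporary directory chosen so that the sketch does not overwrite anything; swap save() for show() if you want the result opened in a browser.

```python
import os
import tempfile

from bokeh.plotting import figure, save
from bokeh.resources import CDN

# Step 1: prepare the data
x_coords = [1, 2, 3, 4, 5]
y_coords = [6, 7, 2, 4, 5]

# Step 2: call figure() to create the plot
p = figure(title="Four steps in one place", x_axis_label="x", y_axis_label="y")

# Step 3: add a renderer
p.line(x_coords, y_coords, legend_label="Temp.", line_width=2)

# Step 4: save the result to an HTML file (hypothetical temporary location)
output_path = os.path.join(tempfile.mkdtemp(), "four_steps.html")
save(p, filename=output_path, resources=CDN, title="Four steps in one place")

print("Saved to", output_path)
```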

Altair

Altair is built on a declarative plotting language, or “visualization grammar,” called Vega. This means it has a well-thought-through API that scales well for complex plots, saving you from getting lost in nested-for-loop hell.

As with Bokeh, Altair outputs its plots as HTML files.

Bar chart using Altair

import altair as alt
from vega_datasets import data

data_source = data.wheat()

bar_chart = alt.Chart(data_source).mark_bar().encode(
    x='year:O',
    y="wheat:Q",
    # Highlight one bar with a conditional color:
    # if the year is 1700, the test returns True and the bar is red;
    # otherwise, the bar is steelblue.
    color=alt.condition(
        alt.datum.year == 1700,
        alt.value('red'),
        alt.value('steelblue')
    )
).properties(width=750)
bar_chart.save('bar_chart.html')
Altair

Pygal

Pygal is primarily concerned with appearance. It creates SVG plots by default, so you can zoom in and out as much as you like without them being pixellated. Pygal plots also have some built-in interactivity features, making it another underappreciated choice for embedding plots in a web app.

import pygal
from pygal.style import Style

# custom color palette for the chart
custom_style = Style(
  colors=('#E80080', '#404040', '#9BC850'))


def getFactorial(n):
    if n == 1 or n == 0:
        return 1
    else:
        return n * getFactorial(n-1)

listOfFactorials = [getFactorial(i) for i in range(5)]

# prepare the bar plot with the custom style, then add the data
bar_chart = pygal.Bar(height=400, style=custom_style)
bar_chart.add('Factorial List', listOfFactorials)
bar_chart.render_in_browser()

And here’s the graph:

Pygal

Pandas

Pandas is an extremely popular data science library for Python. It not only allows you to do scalable data manipulation, but it also has a convenient plotting API. Because it operates directly on data frames, the pandas example is the shortest code snippet in this article, even shorter than the Seaborn code.

Whether you’re just getting to know a dataset or preparing to publish your findings, visualization is an essential tool. pandas, Python’s popular data analysis library, provides several options for visualizing your data with .plot(). Even if you’re just getting started with pandas, you’ll soon be able to create simple plots that enhance your understanding of your data.

Questions worth keeping in mind when working with pandas plots:

  • What are the various types of pandas plots, and when should they be used?
  • How can a histogram give you a quick overview of your data?
  • How can a scatter plot help you find correlation?
  • How can you examine various groups and their proportions?

Since the pandas API is a wrapper around Matplotlib, you can also use the underlying Matplotlib API for finer control over your plots.

Here’s a pandas plot of the college majors dataset. Again, the code is incredibly short and to the point!

import pandas as pd
import matplotlib.pyplot as plt

_url_download = ("https://raw.githubusercontent.com/fivethirtyeight/"
     "data/master/college-majors/recent-grads.csv")

df = pd.read_csv(_url_download)

print(type(df))
print(df)
df.plot(x="Rank", y=["P25th", "Median", "P75th"])
plt.show()
pandas plot of the college majors data

Observations from the plot:

The median income decreases as the rank gets worse. This is to be expected because the rank is determined by the median income.

Some majors have wide gaps between the 25th and 75th percentiles. People with these degrees may earn significantly less or significantly more than the median income.

Other majors have very small gaps between the 25th and 75th percentiles. People with these degrees earn salaries very close to the median income.

This first plot already hints that there’s a lot more to discover in the data! For example, some majors have a wide range of earnings, while others have a rather narrow range. You can use a variety of plot types to uncover these differences.

.plot() accepts a number of optional parameters. The kind parameter, in particular, accepts eleven different string values and determines the type of plot you’ll create:

  • “area” is for area plots.
  • “bar” is for vertical bar charts.
  • “barh” is for horizontal bar charts.
  • “box” is for box plots.
  • “hexbin” is for hexbin plots.
  • “hist” is for histograms.
  • “kde” is for kernel density estimation charts.
  • “density” is an alias for “kde”.
  • “line” is for line graphs.
  • “pie” is for pie charts.
  • “scatter” is for scatter plots.

The default value is “line”.

Line graphs, such as the one above, give a good overview of your data. You can use them to spot general trends. They rarely provide deep insight, but they can point you toward where to look more closely.

If you don’t pass any parameters to .plot(), it generates a line plot with the index on the x-axis and all the numeric columns on the y-axis. Although this works well for datasets with only a few columns, it looks like a mess for the college majors dataset, which has many numeric columns.

Note: As an alternative to passing strings to the kind parameter of .plot(), you can call the following methods on the DataFrame’s .plot accessor (for example, df.plot.bar()) to create the plot types described above:

  • .area()
  • .bar()
  • .barh()
  • .box()
  • .hexbin()
  • .hist()
  • .kde()
  • .density()
  • .line()
  • .pie()
  • .scatter()

The examples in this article use the .plot() interface and pass strings to the kind parameter. You are encouraged to try out the methods listed above as well.

Matplotlib: A Look Behind the Scenes

Matplotlib generates the plot behind the scenes when you call .plot() on a DataFrame object. To verify this, try out two code snippets. To begin, use Matplotlib to generate a plot using two columns from your DataFrame:

import pandas as pd
import matplotlib.pyplot as plt

_url_download = ("https://raw.githubusercontent.com/fivethirtyeight/"
     "data/master/college-majors/recent-grads.csv")

df = pd.read_csv(_url_download)
plt.plot(df["Rank"], df["P75th"])
plt.show()
Matplotlib: A Look Behind the Scenes

Using the DataFrame object’s .plot() method, you can make the exact same graph:

df.plot(x="Rank", y="P75th")

.plot() wraps pyplot.plot(), so the result is a graph that looks just like the one you made with Matplotlib.

You can use either pyplot.plot() or df.plot() to build the same graph from columns of a DataFrame object. If you already have a DataFrame instance, however, df.plot() offers cleaner syntax than pyplot.plot().
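
One practical consequence of this wrapping is that .plot() hands back a Matplotlib object that you can keep customizing. The sketch below assumes a tiny made-up table with the same column names as the dataset; the numbers are not real earnings data.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.axes
import pandas as pd

# Made-up numbers in the shape of the Rank and P75th columns
sample = pd.DataFrame({"Rank": [1, 2, 3, 4], "P75th": [125000, 90000, 105000, 80000]})

ax = sample.plot(x="Rank", y="P75th")

# The return value is a Matplotlib Axes, so Matplotlib methods work on it
print(isinstance(ax, matplotlib.axes.Axes))
ax.set_ylabel("75th percentile earnings")
ax.set_title("Customized through Matplotlib")
```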

Now that you know that the DataFrame object’s .plot() method is a wrapper for Matplotlib’s pyplot.plot(), let’s look at the various types of plots you can make and how to make them.

Examine for Correlation

Frequently, you’ll want to check whether two columns in a dataset are related. If you pick a major with higher median earnings, do you also have a lower risk of unemployment? Build a scatter plot with those two columns as a first step:

import pandas as pd
import matplotlib.pyplot as plt

_url_download = ("https://raw.githubusercontent.com/fivethirtyeight/"
"data/master/college-majors/recent-grads.csv")

df = pd.read_csv(_url_download)

df.plot(x="Median", y="Unemployment_rate", kind="scatter")
plt.show()

You should see a plot that appears to be completely random, such as this:

Examine for Correlation

A quick look at this graph shows that there is no significant correlation between earnings and unemployment rate.

While a scatter plot is an excellent tool for getting a first impression of a possible correlation, it is far from conclusive evidence. You can use .corr() to get a quick overview of the correlations between different columns. If you suspect a correlation between two values, you can use various methods to verify your hunch and measure how strong the correlation is.
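
As a minimal sketch of .corr(), the snippet below uses made-up columns whose relationships are obvious by construction, so the output is easy to check by eye. The column names are hypothetical and are not part of the college majors dataset.

```python
import pandas as pd

# Made-up columns chosen so the expected correlations are obvious
sample = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "rises_with_x": [2, 4, 6, 8, 10],
    "falls_with_x": [10, 8, 6, 4, 2],
})

# Pairwise Pearson correlation between all numeric columns
correlations = sample.corr()
print(correlations.round(2))

# Correlation between two specific columns
rising = sample["x"].corr(sample["rises_with_x"])
falling = sample["x"].corr(sample["falls_with_x"])
```

Values close to 1 or -1 indicate a strong linear relationship, and values near 0 indicate little or none. On the real dataset, df[["Median", "Unemployment_rate"]].corr() would give the figure for the two columns plotted above.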

However, keep in mind that even if two values are correlated, it does not follow that changing one would cause the other to change. In other words, correlation does not imply causation.

Make a plan

Python provides a variety of ways to plot the same data with minimal code. While all of these methods will get you started quickly in creating charts, they require some local configuration.

Final Thoughts

In this article, you learned how to use Python and several of its plotting libraries to visualize your dataset. You’ve also seen how a few simple plots can help you understand your data and guide your analysis.

In summary, this tutorial showed you how to do the following:

  • Use a scatter plot to check for correlation.
  • Use bar plots to examine categories and pie plots to examine their ratios.
  • Determine which plot is best for your current project.
  • Use a histogram to see how a dataset is distributed.
  • Visualize data using .plot() and a small DataFrame.

You’re now able to build on this experience and experiment with even more advanced visualizations.

If you have any questions or suggestions, please leave them in the contact section below.
