Plotting in Python: an overview

Author

Marie-Hélène Burle

There are many packages that provide plotting functionality in Python. Here is an overview of the most popular ones.

matplotlib

matplotlib is a very popular Python plotting library because it gives full control over every element of a plot, supports many plot types, and produces graphs well suited for publication.

The downside of this level of control is a verbose, imperative syntax. We will cover matplotlib in more detail in the next course section.

Example

import matplotlib.pyplot as plt
import numpy as np

# make data:
np.random.seed(1)
x = np.random.uniform(-3, 3, 256)
y = np.random.uniform(-3, 3, 256)
z = (1 - x/2 + x**5 + y**3) * np.exp(-x**2 - y**2)
levels = np.linspace(z.min(), z.max(), 7)

# plot:
fig, ax = plt.subplots()

ax.plot(x, y, 'o', markersize=2, color='grey')
ax.tricontourf(x, y, z, levels=levels)

ax.set(xlim=(-3, 3), ylim=(-3, 3))

plt.show()
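
To make the "full control, but verbose" trade-off concrete, here is a minimal sketch in which each visual element is set by its own call. The data and labels are made up for illustration and are not part of the course material.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up data, for illustration only
x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots(figsize=(5, 3))

# Each element of the plot is configured explicitly, one call at a time
ax.plot(x, np.sin(x), color="tab:blue", linewidth=1.5, label="sin(x)")
ax.set_xlabel("x (radians)")
ax.set_ylabel("amplitude")
ax.set_xticks([0, np.pi, 2 * np.pi])
ax.set_xticklabels(["0", "π", "2π"])

# Hide the top and right borders of the plotting area
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)

ax.legend(frameon=False)
fig.tight_layout()
```

Outside a notebook, add plt.show() at the end to display the figure.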

seaborn

Seaborn is a higher-level library built on top of matplotlib. This means that the options are more limited, but the declarative syntax is very easy to use, the defaults look great, and it has lots of statistical functionality, making it an excellent library for exploratory data analysis (EDA) on tabular data. We will also cover seaborn in more detail later in this course.

Example

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_theme(style="dark")

# Simulate data from a bivariate Gaussian
n = 10000
mean = [0, 0]
cov = [(2, .4), (.4, .2)]
rng = np.random.RandomState(0)
x, y = rng.multivariate_normal(mean, cov, n).T

# Draw a combo histogram and scatterplot with density contours
f, ax = plt.subplots(figsize=(6, 6))
sns.scatterplot(x=x, y=y, s=5, color=".15")
sns.histplot(x=x, y=y, bins=50, pthresh=.1, cmap="mako")
sns.kdeplot(x=x, y=y, levels=5, color="w", linewidths=1);
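
The example above passes arrays directly. As a rough sketch of the declarative style on tabular data, the snippet below hands seaborn a pandas DataFrame and only names the columns to plot. The table and its column names (group, score) are invented for illustration.

```python
import pandas as pd
import seaborn as sns

# Small made-up table, for illustration only
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "score": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# Name the columns; seaborn aggregates each group (mean by default)
# and draws an error bar around each estimate
ax = sns.barplot(data=df, x="group", y="score")
ax.set_title("Mean score per group")
```

The aggregation and error bars come from seaborn itself, which is the kind of built-in statistical functionality mentioned above. The returned object is a regular matplotlib Axes, so it can still be adjusted with matplotlib calls.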

Website

bokeh

Bokeh creates interactive plots that are great for dashboards and web apps. It is more efficient than other libraries for streaming data or interacting with large datasets. It does, however, have a fairly steep learning curve.

Example

from bokeh.models import LogColorMapper
from bokeh.palettes import Viridis6 as palette
from bokeh.plotting import output_notebook, figure, show
from bokeh.sampledata.unemployment import data as unemployment
from bokeh.sampledata.us_counties import data as counties

output_notebook()

palette = tuple(reversed(palette))

counties = {
    code: county for code, county in counties.items() if county["state"] == "tx"
}

county_xs = [county["lons"] for county in counties.values()]
county_ys = [county["lats"] for county in counties.values()]

county_names = [county['name'] for county in counties.values()]
county_rates = [unemployment[county_id] for county_id in counties]
color_mapper = LogColorMapper(palette=palette)

data = dict(
    x=county_xs,
    y=county_ys,
    name=county_names,
    rate=county_rates,
)

TOOLS = "pan,wheel_zoom,reset,hover,save"

p = figure(
    title="Texas Unemployment, 2009", tools=TOOLS,
    x_axis_location=None, y_axis_location=None,
    tooltips=[
        ("Name", "@name"), ("Unemployment rate", "@rate%"), ("(Long, Lat)", "($x, $y)"),
    ])
p.grid.grid_line_color = None
p.hover.point_policy = "follow_mouse"

p.patches('x', 'y', source=data,
          fill_color={'field': 'rate', 'transform': color_mapper},
          fill_alpha=0.7, line_color="white", line_width=0.5)

show(p)

Demos

You can find more examples on the demo page:

Website

plotly

Plotly also creates interactive plots. It comes with two APIs: an older, lower-level, imperative one (Plotly Graph Objects) that allows for more control, and a newer, higher-level, declarative one (Plotly Express) that is very easy to use. Its static plots, however, are less sophisticated than matplotlib's, and it is not as good as Bokeh for dashboard interactions.

We will have a quick look at plotly later in the course.

Example

You need to run the following to render plotly plots in Jupyter notebooks:

import plotly.io as pio
pio.renderers.default = 'notebook'

import plotly.express as px

# load the example tips dataset that ships with plotly
df = px.data.tips()

fig = px.density_contour(df, x="total_bill", y="tip")
fig.update_traces(contours_coloring="fill", contours_showlabels=True)
fig.show()
[Figure: filled density contour plot; x-axis: total_bill, y-axis: tip]

Website

Vega-Altair

Vega is a low-level, declarative grammar for interactive visualizations, in which plots are specified as JSON.

Vega-Lite is a higher-level, simpler JSON grammar for interactive plots, built on top of Vega.

Finally, Vega-Altair is a Python API built on top of Vega-Lite. It thus provides a high-level, simple, declarative library for interactive plots. It is ideal for statistical visualization and EDA.

Example

import altair as alt
source = "https://cdn.jsdelivr.net/npm/vega-datasets/data/us-employment.csv"

predicate = alt.datum.nonfarm_change > 0
color = alt.when(predicate).then(alt.value("steelblue")).otherwise(alt.value("orange"))

alt.Chart(source).mark_bar().encode(
    alt.X("month:T").title("Month"),
    alt.Y("nonfarm_change:Q").title("Month to month change in nonfarm employment"),
    color=color,
    tooltip=[alt.Tooltip("nonfarm:Q", title="Nonfarm employment that month")]
).properties(width=650).interactive()
[Figure: interactive bar chart; x-axis: Month (2006 to 2015), y-axis: month to month change (about -1,000 to 600)]

Website

plotnine

Plotnine is a Python adaptation of the popular R library ggplot2, which is based on the grammar of graphics.

Example

from plotnine import (
    ggplot,
    aes,
    theme_matplotlib,
    theme_set,
    geom_tile,
)

from plotnine.data import faithfuld

# Set default theme for all the plots
theme_set(theme_matplotlib())

# Map columns to aesthetics, then add a tile layer
(
    ggplot(faithfuld, aes("waiting", "eruptions", fill="density"))
    + geom_tile()
)

Website

Summary

Library      Syntax        Interactivity   Ideal for              Weaknesses
-----------  ------------  --------------  ---------------------  ---------------------------
matplotlib   Imperative    Mostly no       Publications, arrays   Verbose
seaborn      Declarative   No              Statistical EDA        Limited for rare plot types
bokeh        Imperative    Yes             Live web apps          Styling is less intuitive
plotly       Both          Yes             Dashboards             Static output less flexible
Altair       Declarative   Yes             Fast EDA, web          Fewer custom options
plotnine     Declarative   No              R users                Somewhat niche