Plot 2D histograms

The examples below use a NumPy ndarray df containing dummy data (a pandas DataFrame also works), which can be loaded with:

from plothist_utils import get_dummy_data

df = get_dummy_data()

Note

This page presents the functions of plothist step by step and explains how to use them. To reproduce the examples, visit the example gallery, which contains a standalone script for each example that you can run directly.

Simple plot

Plotting a simple 2D histogram is very similar to plotting a 1D histogram. The function make_2d_hist() is used to create the 2D histogram, and the function plot_2d_hist() is used to plot it.

from plothist import make_2d_hist, plot_2d_hist

name_x = "variable_0"
name_y = "variable_1"

h = make_2d_hist([df[name_x], df[name_y]], bins=[10, 10])

fig, ax, ax_colorbar = plot_2d_hist(h, colorbar_kwargs={"label": "Entries"})

ax.set_xlabel(name_x)
ax.set_ylabel(name_y)

ax.set_xlim(-9, 9)
ax.set_ylim(-9, 9)

fig.savefig("2d_hist_simple.svg", bbox_inches="tight")

Figure: simple 2D histogram
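
To make explicit what such a histogram holds, the sketch below reproduces the counting step with NumPy alone. It is an illustration rather than plothist's implementation: np.histogram2d stands in for make_2d_hist(), the random x and y arrays stand in for the dummy data, and the explicit range is an assumption made for this example.

```python
import numpy as np

# Hypothetical stand-in for df["variable_0"] and df["variable_1"]
rng = np.random.default_rng(seed=0)
x = rng.normal(size=1000)
y = rng.normal(size=1000)

# 10 x 10 equal-width bins over an explicitly chosen range
counts, x_edges, y_edges = np.histogram2d(
    x, y, bins=[10, 10], range=[[-9, 9], [-9, 9]]
)

print(counts.shape)       # one cell per (x bin, y bin) pair
print(len(x_edges))       # n bins are delimited by n + 1 edges
print(int(counts.sum()))  # number of entries that fall inside the range
```

Each cell of counts is the number of (x, y) pairs that fall into the corresponding rectangle, which is the quantity the color scale encodes in the plot.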

Change the colormap

The argument cmap in pcolormesh_kwargs can be used to change the colormap. Any matplotlib colormap can be used (see Style and colors). By default, pcolormesh_kwargs = {'cmap': 'viridis'}.

The following, more complex example builds a discrete colormap:

from matplotlib.colors import ListedColormap

from plothist import get_color_palette

# 0 entries will be white, the rest will have one color from the plasma colormap per entry value
cmap = ListedColormap(
    ["white", *list(get_color_palette("plasma", int(h.values().max()) * 2 - 1))]
)

fig, ax, ax_colorbar = plot_2d_hist(
    h, colorbar_kwargs={"label": "Entries"}, pcolormesh_kwargs={"cmap": cmap}
)

ax.set_xlabel(name_x)
ax.set_ylabel(name_y)

ax.set_xlim(-4.5, 4.5)
ax.set_ylim(-4.5, 4.5)

fig.savefig("2d_hist_simple_discrete_colormap.svg", bbox_inches="tight")

Figure: simple 2D histogram with a discrete colormap
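
The same idea can be sketched with matplotlib alone, which may help to see how the colors line up with the bin contents. This is a minimal sketch under stated assumptions: the counts array is a hypothetical stand-in for h.values(), the plasma colors are sampled directly from matplotlib instead of get_color_palette(), and the BoundaryNorm is an addition that is not part of the example above.

```python
import matplotlib
import numpy as np
from matplotlib.colors import BoundaryNorm, ListedColormap

# Hypothetical bin contents standing in for h.values()
counts = np.array([[0, 1, 2], [3, 0, 5]])
max_count = int(counts.max())

# White for empty bins, then one plasma color per count value 1..max_count
white = np.array([[1.0, 1.0, 1.0, 1.0]])
plasma_colors = matplotlib.colormaps["plasma"](np.linspace(0, 1, max_count))
cmap = ListedColormap(np.vstack([white, plasma_colors]))

# Boundaries at -0.5, 0.5, ..., max_count + 0.5, so that every integer
# count falls into its own color slot
norm = BoundaryNorm(np.arange(-0.5, max_count + 1), cmap.N)

print(cmap.N)               # number of discrete colors
print(norm(0), norm(max_count))  # color index for an empty and for the fullest bin
```

If pcolormesh_kwargs is forwarded unchanged to matplotlib's pcolormesh (an assumption not verified here), both objects could be passed as pcolormesh_kwargs={"cmap": cmap, "norm": norm}.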

Correlations with variable registry

The variable registry helps to manage several variables and to plot the correlations between them (see Variable registry for how to use it).

import os
import time
from itertools import combinations

from plothist import (
    create_variable_registry,
    get_variable_from_registry,
    make_2d_hist,
    plot_2d_hist,
    update_variable_registry_binning,
)

# No need to redo this step if the registry was already created before
variable_keys = ["variable_0", "variable_1", "variable_2"]
unique_id = str(int(time.time() * 1000))[-8:]  # unique ID based on current time
temporary_registry_path = f"./_temporary_variable_registry_{unique_id}.yaml"
create_variable_registry(variable_keys, path=temporary_registry_path)
update_variable_registry_binning(df, variable_keys, path=temporary_registry_path)

# Make a correlation plot for each pair of variables
variable_keys_combinations = list(combinations(variable_keys, 2))

figs = []

for variable_keys_combination in variable_keys_combinations:
    variable0 = get_variable_from_registry(
        variable_keys_combination[0], path=temporary_registry_path
    )
    variable1 = get_variable_from_registry(
        variable_keys_combination[1], path=temporary_registry_path
    )

    h = make_2d_hist(
        [df[variable0["name"]], df[variable1["name"]]],
        bins=(variable0["bins"], variable1["bins"]),
        range=(variable0["range"], variable1["range"]),
    )

    fig, ax, ax_colorbar = plot_2d_hist(h, colorbar_kwargs={"label": "Entries"})

    ax.set_xlabel(variable0["name"])
    ax.set_ylabel(variable1["name"])

    ax.set_xlim(variable0["range"])
    ax.set_ylim(variable1["range"])

    figs.append(fig)

for i, fig in enumerate(figs):
    fig.savefig(f"2d_hist_correlations_{i}.svg", bbox_inches="tight")

os.remove(temporary_registry_path)

Figures: three 2D correlation plots, one for each pair of variables
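
The pairing step of the loop above can be looked at in isolation. The short sketch below uses only the standard library and shows which pairs itertools.combinations produces for the three keys; unpacking each pair into key_x and key_y is a stylistic choice of this sketch, not something the example above requires.

```python
from itertools import combinations

variable_keys = ["variable_0", "variable_1", "variable_2"]

# Every unordered pair appears exactly once, in the order of the input list
pairs = list(combinations(variable_keys, 2))

for key_x, key_y in pairs:
    print(f"{key_x} vs {key_y}")

# n variables give n * (n - 1) / 2 correlation plots
n = len(variable_keys)
n_plots = n * (n - 1) // 2
```

Because the number of plots grows quadratically with the number of variables, it can be worth restricting variable_keys before looping over a large registry.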

Uneven binning

Instead of passing a number of bins and a range to make_2d_hist(), you can set the bins parameter to a list containing all the bin edges of each axis:

from plothist import make_2d_hist, plot_2d_hist

name_x = "variable_0"
name_y = "variable_1"
# Bins [-10,0], [0,10] for variable_0,
# and bins [-10,-5], [-5,0], [0,5], [5,10] for variable_1
bins = [[-10, 0, 10], [-10, -5, 0, 5, 10]]

h = make_2d_hist([df[name_x], df[name_y]], bins=bins)

fig, ax, ax_colorbar = plot_2d_hist(h, colorbar_kwargs={"label": "Entries"})

ax.set_xlabel(name_x)
ax.set_ylabel(name_y)

fig.savefig("2d_hist_uneven.svg", bbox_inches="tight")

Figure: 2D histogram with uneven binning
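
How a list of edges translates into bins can be checked with a small NumPy sketch. It is only an illustration: np.histogram2d stands in for make_2d_hist(), and the uniformly distributed x and y arrays are hypothetical data, not the dummy data used above.

```python
import numpy as np

# Hypothetical data covering the full range of the edges
rng = np.random.default_rng(seed=1)
x = rng.uniform(-10, 10, size=500)
y = rng.uniform(-10, 10, size=500)

x_edges = [-10, 0, 10]          # 3 edges -> 2 bins
y_edges = [-10, -5, 0, 5, 10]   # 5 edges -> 4 bins

counts, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])

# Width of each bin along each axis, derived from consecutive edges
x_widths = np.diff(x_edges)
y_widths = np.diff(y_edges)

print(counts.shape)
print(x_widths, y_widths)
```

With explicit edges no separate range is needed, since the first and last edge of each axis already define it.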

Display 1D distributions

To plot a 2D histogram together with its x and y projections, use the function plot_2d_hist_with_projections(). The x projection is displayed above the 2D histogram and the y projection to its right. A colorbar is drawn as well and can be labeled through colorbar_kwargs.

from plothist import make_2d_hist, plot_2d_hist_with_projections

name_x = "variable_0"
name_y = "variable_1"

h = make_2d_hist([df[name_x], df[name_y]])

(
    fig,
    ax_2d,
    ax_x_projection,
    ax_y_projection,
    ax_colorbar,
) = plot_2d_hist_with_projections(
    h,
    xlabel="variable_0",
    ylabel="variable_1",
    ylabel_x_projection="Entries",
    xlabel_y_projection="Entries",
    offset_x_labels=False,
    colorbar_kwargs={"label": "Entries"},
)

fig.savefig("2d_hist_with_projections.svg", bbox_inches="tight")

Figure: 2D histogram with x and y projections
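
A projection is the 2D histogram summed over the other axis. The NumPy sketch below illustrates this relation; it assumes that the projections drawn by plot_2d_hist_with_projections() are equivalent to such sums, and it uses hypothetical random data and np.histogram2d in place of the plothist functions.

```python
import numpy as np

# Hypothetical stand-in for the two variables
rng = np.random.default_rng(seed=2)
x = rng.normal(size=1000)
y = rng.normal(size=1000)

# Same 20 equal-width bins on both axes
edges = np.linspace(-10, 10, 21)
counts, _, _ = np.histogram2d(x, y, bins=[edges, edges])

# Summing over the y bins gives the x projection, and vice versa
x_projection = counts.sum(axis=1)
y_projection = counts.sum(axis=0)

# The x projection matches the 1D histogram of x, because every entry
# lies inside the y range and is therefore counted in some y bin
hist_x, _ = np.histogram(x, bins=edges)
print(np.array_equal(x_projection, hist_x))
```

If some entries fell outside the y range, they would be missing from the x projection but still present in a 1D histogram of x, so the two would differ.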