Seaborn Plots with 2 Legends

Posted here because I will inevitably forget this painfully worked-out answer for having legends for two different types of plots in Seaborn…

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# We will need to access some of these matplotlib classes directly
from matplotlib.lines import Line2D # For points and lines
from matplotlib.patches import Patch # For KDE and other plots
from matplotlib.legend import Legend
from matplotlib import cm

# Initialise random number generator
rng = np.random.default_rng(seed=42)

# Generate sample of 25 numbers
n = 25

clusters = []

for c in range(0,3):
    # Crude way to get different distributions
    # for each cluster
p = rng.integers(low=1, high=6, size=4)
df = pd.DataFrame({
'x': rng.normal(p[0], p[1], n),
'y': rng.normal(p[2], p[3], n),
'name': f"Cluster {c+1}"
})
clusters.append(df)

# Flatten to a single data frame
clusters = pd.concat(clusters)

# Now do the same for data to feed into
# the second (scatter) plot...
n = 8
points = []

for c in range(0,2):

p = rng.integers(low=1, high=6, size=4)

df = pd.DataFrame({
'x': rng.normal(p[0], p[1], n),
'y': rng.normal(p[2], p[3], n),
'name': f"Group {c+1}"
})
points.append(df)

points = pd.concat(points)

# And create the figure
f, ax = plt.subplots(figsize=(8,8))

# The KDE-plot generates a Legend 'as usual'
k = sns.kdeplot(
data=clusters,
x='x', y='y',
hue='name',
shade=True,
thresh=0.05,
n_levels=2,
alpha=0.2,
ax=ax,
)

# Notice that we access this legend via the
# axis to turn off the frame, set the title,
# and adjust the patch alpha level so that
# it closely matches the alpha of the KDE-plot
ax.get_legend().set_frame_on(False)
ax.get_legend().set_title("Clusters")
for lh in ax.get_legend().get_patches():
lh.set_alpha(0.2)

# You would probably want to sort your data
# frame or set the hue and style order in order
# to ensure consistency for your own application
# but this works for demonstration purposes
groups = points.name.unique()
markers = ['o', 'v', 's', 'X', 'D', '<', '>']
colors = cm.get_cmap('Dark2').colors

# Generate the scatterplot: notice that Legend is
# off (otherwise this legend would overwrite the
# first one) and that we're setting the hue, style,
# markers, and palette using the 'name' parameter
# from the data frame and the number of groups in
# the data.
p = sns.scatterplot(
data=points,
x="x",
y="y",
hue='name',
style='name',
markers=markers[:len(groups)],
palette=colors[:len(groups)],
legend=False,
s=30,
alpha=1.0
)

# Here's the 'magic' -- we use zip to link together
# the group name, the color, and the marker style. You
# *cannot* retreive the marker style from the scatterplot
# since that information is lost when rendered as a
# PathCollection (as far as I can tell). Anyway, this allows
# us to loop over each group in the second data frame and
# generate a 'fake' Line2D plot (with zero elements and no
# line-width in our case) that we can add to the legend. If
# you were overlaying a line plot or a second plot that uses
# patches you'd have to tweak this accordingly.
patches = []
for x in zip(groups, colors[:len(groups)], markers[:len(groups)]):
patches.append(Line2D([0],[0], linewidth=0.0, linestyle='',
color=x[1], markerfacecolor=x[1],
marker=x[2], label=x[0], alpha=1.0))

# And add these patches (with their group labels) to the new
# legend item and place it on the plot.
leg = Legend(ax, patches, labels=groups,
loc='upper left', frameon=False, title='Groups')
ax.add_artist(leg);

# Done
plt.show();

A Seaborn plot showing 2 legends for different types of plots.

The Full Stack: Tools & Processes for Urban Data Scientists

Recently, I was asked to give talks at both UCL’s CASA and the ETH Future Cities Lab in Singapore for students and staff new to ‘urban data science’ and the sorts of workflows involved in collecting, processing, analysing, and reporting on urban geo-data. Developing the talk proved to be a rather enjoyable opportunity to reflect on more than a decade in commercial data mining and academic research – not only did I realise how far I had come, I realised how far the domain had come in that time.

Continue reading

Multiple MySQL Servers on a Single Machine

Note: this was previously posted at simulacra.info, but I am in the process of (re)organising my technical notes and tutorials.

A bit of a dry post here, but I thought I’d share my experience of trying to get two instances of MySQL (and two different versions, to boot) running simultaneously on a single piece of hardware as I’ve spent the past two days tearing my hear out and swearing profusely (mostly) under my breath. Continue reading

The MapThing Processing Library

MapThing allows you to perform a range of useful mapping (in the geographical sense) functions within Processing and offers a collection of classes for reading ESRI-compliant Shape files (a.k.a. shapefiles), CSV point data, and GPX files, and then displaying them as part of a sketch. Continue reading

Plotting & iGraph on Lion and Mountain Lion

Note: this was previously posted at simulacra.info, but I am in the process of (re)organising my technical notes and tutorials.

After giving up on Gephi (again, I really should learn), I decided it was time to get to grips with Python and iGraph since I really need to produce multiple iterations of a graph. The matmos at CASA have, of course, been touting Python for ages, but I’ve just not had the time/incentive to install and, more importantly, actually get around to using it… until now. Continue reading

Powered by WordPress
Theme: Esquire by Matthew Buchanan.