Seaborn Plots with 2 Legends

Posted here because I will inevitably forget this painfully worked-out answer for having legends for two different types of plots in Seaborn…

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# We will need to access some of these matplotlib classes directly
from matplotlib.lines import Line2D # For points and lines
from matplotlib.patches import Patch # For KDE and other plots
from matplotlib.legend import Legend
from matplotlib import cm

# Initialise random number generator
rng = np.random.default_rng(seed=42)

# Generate sample of 25 numbers
n = 25

clusters = []

for c in range(0,3):
    # Crude way to get different distributions
    # for each cluster
    p = rng.integers(low=1, high=6, size=4)
    df = pd.DataFrame({
        'x': rng.normal(p[0], p[1], n),
        'y': rng.normal(p[2], p[3], n),
        'name': f"Cluster {c+1}"
    })
    clusters.append(df)

# Flatten to a single data frame
clusters = pd.concat(clusters)

# Now do the same for data to feed into
# the second (scatter) plot...
n = 8
points = []

for c in range(0,2):
    p = rng.integers(low=1, high=6, size=4)
    df = pd.DataFrame({
        'x': rng.normal(p[0], p[1], n),
        'y': rng.normal(p[2], p[3], n),
        'name': f"Group {c+1}"
    })
    points.append(df)

points = pd.concat(points)

# And create the figure
f, ax = plt.subplots(figsize=(8,8))

# The KDE-plot generates a Legend 'as usual'
k = sns.kdeplot(
    data=clusters,
    x='x', y='y',
    hue='name',
    fill=True,    # called 'shade' before seaborn 0.11
    thresh=0.05,
    levels=2,     # called 'n_levels' before seaborn 0.11
    alpha=0.2,
    ax=ax,
)

# Notice that we access this legend via the
# axis to turn off the frame, set the title,
# and adjust the patch alpha level so that
# it closely matches the alpha of the KDE-plot
ax.get_legend().set_frame_on(False)
ax.get_legend().set_title("Clusters")
for lh in ax.get_legend().get_patches():
    lh.set_alpha(0.2)

# You would probably want to sort your data
# frame or set the hue and style order in order
# to ensure consistency for your own application
# but this works for demonstration purposes
groups = points.name.unique()
markers = ['o', 'v', 's', 'X', 'D', '<', '>']
colors = plt.get_cmap('Dark2').colors  # cm.get_cmap was removed in matplotlib 3.9

# Generate the scatterplot: notice that Legend is
# off (otherwise this legend would overwrite the
# first one) and that we're setting the hue, style,
# markers, and palette using the 'name' parameter
# from the data frame and the number of groups in
# the data.
p = sns.scatterplot(
    data=points,
    x="x",
    y="y",
    hue='name',
    style='name',
    markers=markers[:len(groups)],
    palette=list(colors[:len(groups)]),
    legend=False,
    s=30,
    alpha=1.0,
    ax=ax,
)

# Here's the 'magic' -- we use zip to link together
# the group name, the color, and the marker style. You
# *cannot* retrieve the marker style from the scatterplot
# since that information is lost when rendered as a
# PathCollection (as far as I can tell). Anyway, this allows
# us to loop over each group in the second data frame and
# generate a 'fake' Line2D plot (with zero elements and no
# line-width in our case) that we can add to the legend. If
# you were overlaying a line plot or a second plot that uses
# patches you'd have to tweak this accordingly.
patches = []
for name, color, marker in zip(groups, colors[:len(groups)], markers[:len(groups)]):
    patches.append(Line2D([0],[0], linewidth=0.0, linestyle='',
                          color=color, markerfacecolor=color,
                          marker=marker, label=name, alpha=1.0))

# And add these patches (with their group labels) to the new
# legend item and place it on the plot.
leg = Legend(ax, patches, labels=groups,
             loc='upper left', frameon=False, title='Groups')
ax.add_artist(leg);

# Done
plt.show();

A Seaborn plot showing 2 legends for different types of plots.
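
If you are overlaying something other than a scatter plot, the same trick carries over with a different proxy artist. The sketch below is one way to wrap it up, not the only one: `marker_handles` and `patch_handles` are hypothetical helper names of my own (they are not part of Seaborn or Matplotlib), and the labels and colours are placeholders. It builds the first legend through `ax.legend()` and attaches the second by hand, so neither overwrites the other.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
from matplotlib.patches import Patch
from matplotlib.legend import Legend


def marker_handles(labels, colors, markers):
    """Hypothetical helper: one marker-only Line2D proxy per label."""
    return [
        Line2D([0], [0], linestyle='', marker=m, color=c,
               markerfacecolor=c, label=str(lab))
        for lab, c, m in zip(labels, colors, markers)
    ]


def patch_handles(labels, colors, alpha=0.2):
    """Hypothetical helper: one filled Patch proxy per label."""
    return [Patch(facecolor=c, edgecolor=c, alpha=alpha, label=str(lab))
            for lab, c in zip(labels, colors)]


fig, ax = plt.subplots(figsize=(4, 4))

# First legend: created the usual way, so it becomes the legend
# that ax.get_legend() returns
first = ax.legend(
    handles=patch_handles(["Cluster 1", "Cluster 2"], ["C0", "C1"]),
    loc='upper right', frameon=False, title='Clusters')

# Second legend: built directly and attached as an extra artist,
# which leaves the first legend in place
handles = marker_handles(["Group 1", "Group 2"], ["C2", "C3"], ['o', 'v'])
second = Legend(ax, handles, labels=[h.get_label() for h in handles],
                loc='upper left', frameon=False, title='Groups')
ax.add_artist(second)
```

For an overlaid line plot you would drop `linestyle=''` and set a `linewidth` on the `Line2D` proxies instead; for a second filled plot, `patch_handles` is the one to use.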

The Full Stack: Tools & Processes for Urban Data Scientists

Recently, I was asked to give talks at both UCL’s CASA and the ETH Future Cities Lab in Singapore for students and staff new to ‘urban data science’ and the sorts of workflows involved in collecting, processing, analysing, and reporting on urban geo-data. Developing the talk proved to be a rather enjoyable opportunity to reflect on more than a decade in commercial data mining and academic research – not only did I realise how far I had come, I realised how far the domain had come in that time.


Bridging the Qual/Quant Divide

I’ve been in my new post in the Geography department at King’s College London for nearly nine months now and, together with another new-ish colleague, have been asked to design a programme to teach quantitative research methods to students who often seem to think that their interests are solely qualitative.

Fear of Failure

An ongoing preoccupation of many governments, but perhaps most especially this one, has been the fostering of innovation and the training of the next generation of entrepreneurs. The positioning of tertiary education under Business, Innovation & Skills is one obvious sign of this focus and so, as I noted before, is the Government’s investment in (and messaging around) ‘Tech City’.

Multiple MySQL Servers on a Single Machine

Note: this was previously posted at simulacra.info, but I am in the process of (re)organising my technical notes and tutorials.

A bit of a dry post here, but I thought I’d share my experience of trying to get two instances of MySQL (and two different versions, to boot) running simultaneously on a single piece of hardware as I’ve spent the past two days tearing my hair out and swearing profusely (mostly) under my breath.

The MapThing Processing Library

MapThing allows you to perform a range of useful mapping (in the geographical sense) functions within Processing and offers a collection of classes for reading ESRI-compliant Shape files (a.k.a. shapefiles), CSV point data, and GPX files, and then displaying them as part of a sketch.

Plotting & iGraph on Lion and Mountain Lion

Note: this was previously posted at simulacra.info, but I am in the process of (re)organising my technical notes and tutorials.

After giving up on Gephi (again, I really should learn), I decided it was time to get to grips with Python and iGraph since I really need to produce multiple iterations of a graph. The matmos at CASA have, of course, been touting Python for ages, but I’ve just not had the time/incentive to install and, more importantly, actually get around to using it… until now.

Extracting files from Moodle MBZ Archives with Python

These days it seems that just about every university is using Moodle, the “open-source community-based tools for learning”, to manage the delivery of course material and handling of deadlines, assignments, etc. Now I’m a fan of the OS community, but Moodle has… quirks.

Academic Presentations: the Anti-TED Talk

After a few months back on the conference speaking/attendance circuit, I’ve had something of a refresher course in the joys of academic meetings and decided it was time to write up the range of feelings, from irritation to rage, that have been stirred up as a result. I’m not going to name names in this piece, because in nearly every case the absence of value at the conference had little or nothing to do with the organisers and everything to do with the speakers and the audience.
