Seaborn Plots with 2 Legends

Posted here because I will inevitably forget this painfully worked-out answer for having legends for two different types of plots in Seaborn…

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# We will need to access some of these matplotlib classes directly
from matplotlib.lines import Line2D # For points and lines
from matplotlib.patches import Patch # For KDE and other plots
from matplotlib.legend import Legend
from matplotlib import cm

# Initialise random number generator
rng = np.random.default_rng(seed=42)

# Generate sample of 25 numbers
n = 25

clusters = []

for c in range(0,3):
    # Crude way to get different distributions
    # for each cluster
    p = rng.integers(low=1, high=6, size=4)
    df = pd.DataFrame({
        'x': rng.normal(p[0], p[1], n),
        'y': rng.normal(p[2], p[3], n),
        'name': f"Cluster {c+1}"
    })
    clusters.append(df)

# Flatten to a single data frame
clusters = pd.concat(clusters)

# Now do the same for data to feed into
# the second (scatter) plot...
n = 8
points = []

for c in range(0,2):
    p = rng.integers(low=1, high=6, size=4)
    df = pd.DataFrame({
        'x': rng.normal(p[0], p[1], n),
        'y': rng.normal(p[2], p[3], n),
        'name': f"Group {c+1}"
    })
    points.append(df)

points = pd.concat(points)

# And create the figure
f, ax = plt.subplots(figsize=(8,8))

# The KDE-plot generates a Legend 'as usual'
k = sns.kdeplot(
    data=clusters,
    x='x', y='y',
    hue='name',
    fill=True,  # called 'shade' before Seaborn 0.11
    thresh=0.05,
    n_levels=2,
    alpha=0.2,
    ax=ax,
)

# Notice that we access this legend via the
# axis to turn off the frame, set the title,
# and adjust the patch alpha level so that
# it closely matches the alpha of the KDE-plot
ax.get_legend().set_frame_on(False)
ax.get_legend().set_title("Clusters")
for lh in ax.get_legend().get_patches():
    lh.set_alpha(0.2)

# You would probably want to sort your data
# frame or set the hue and style order in order
# to ensure consistency for your own application
# but this works for demonstration purposes
groups = points.name.unique()
markers = ['o', 'v', 's', 'X', 'D', '<', '>']
# plt.get_cmap is used here because cm.get_cmap is
# deprecated in recent Matplotlib releases
colors = plt.get_cmap('Dark2').colors

# Generate the scatterplot: notice that Legend is
# off (otherwise this legend would overwrite the
# first one) and that we're setting the hue, style,
# markers, and palette using the 'name' parameter
# from the data frame and the number of groups in
# the data.
p = sns.scatterplot(
    data=points,
    x="x",
    y="y",
    hue='name',
    style='name',
    markers=markers[:len(groups)],
    palette=colors[:len(groups)],
    legend=False,
    s=30,
    alpha=1.0,
    ax=ax,  # draw on the same axes as the KDE-plot
)

# Here's the 'magic' -- we use zip to link together
# the group name, the color, and the marker style. You
# *cannot* retrieve the marker style from the scatterplot
# since that information is lost when rendered as a
# PathCollection (as far as I can tell). Anyway, this allows
# us to loop over each group in the second data frame and
# generate a 'fake' Line2D plot (with zero elements and no
# line-width in our case) that we can add to the legend. If
# you were overlaying a line plot or a second plot that uses
# patches you'd have to tweak this accordingly.
patches = []
for x in zip(groups, colors[:len(groups)], markers[:len(groups)]):
    patches.append(Line2D([0],[0], linewidth=0.0, linestyle='',
                          color=x[1], markerfacecolor=x[1],
                          marker=x[2], label=x[0], alpha=1.0))

# And add these patches (with their group labels) to the new
# legend item and place it on the plot.
leg = Legend(ax, patches, labels=groups,
             loc='upper left', frameon=False, title='Groups')
ax.add_artist(leg);

# Done
plt.show();
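
The comment about sorting the data frame or fixing the hue and style order can be made concrete with a small mapping. What follows is only a sketch: `style_maps` is a hypothetical helper of my own naming (not part of Seaborn or Matplotlib), and it assumes you have no more groups than there are markers and colours available. It ties each group name to one colour and one marker, so the scatterplot and the hand-built legend cannot drift apart.

```python
import matplotlib.pyplot as plt


def style_maps(names, markers=('o', 'v', 's', 'X', 'D', '<', '>'), cmap='Dark2'):
    """Hypothetical helper: give every group name a fixed colour and marker.

    Returns the sorted group order plus two dicts keyed by group name.
    """
    order = sorted(set(names))
    colors = plt.get_cmap(cmap).colors
    if len(order) > min(len(markers), len(colors)):
        raise ValueError("more groups than available markers or colours")
    palette = dict(zip(order, colors))      # name -> RGB tuple
    marker_map = dict(zip(order, markers))  # name -> marker symbol
    return order, palette, marker_map


order, palette, marker_map = style_maps(['Group 2', 'Group 1', 'Group 2'])

# Seaborn accepts these directly, for example:
#   sns.scatterplot(data=points, x='x', y='y', hue='name', style='name',
#                   hue_order=order, style_order=order,
#                   palette=palette, markers=marker_map, legend=False)
```

The same dictionaries can then drive the loop that builds the `Line2D` legend entries, looking up `palette[name]` and `marker_map[name]` for each name in `order`.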

A Seaborn plot showing 2 legends for different types of plots.
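
The code comments mention that overlaying a second plot that uses patches would need some tweaking. Here is a minimal sketch of what that might look like, using Matplotlib only so that the legend mechanics are easier to see. `proxy_handles` is a hypothetical helper (my naming, not a library function), and the labels and colours are placeholders rather than anything read back from a real plot.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
from matplotlib.patches import Patch
from matplotlib.legend import Legend


def proxy_handles(labels, colors, markers=None, alpha=1.0):
    """Hypothetical helper: build legend-only artists.

    With markers: one marker-only Line2D per label (scatter-style entries).
    Without markers: one filled Patch per label (KDE or area-style entries).
    """
    if markers is None:
        return [Patch(facecolor=c, edgecolor=c, alpha=alpha, label=l)
                for l, c in zip(labels, colors)]
    return [Line2D([0], [0], linestyle='', marker=m, color=c,
                   markerfacecolor=c, alpha=alpha, label=l)
            for l, c, m in zip(labels, colors, markers)]


fig, ax = plt.subplots(figsize=(4, 4))
colors = plt.get_cmap('Dark2').colors

# First legend: created the usual way, so it becomes the axes' own legend
area_handles = proxy_handles(['Cluster 1', 'Cluster 2'], colors[:2], alpha=0.2)
first = ax.legend(handles=area_handles, loc='upper right',
                  frameon=False, title='Clusters')

# Second legend: built from the Legend class and added as an extra artist,
# which leaves the first legend in place
point_handles = proxy_handles(['Group 1', 'Group 2'], colors[2:4],
                              markers=['o', 'v'])
second = Legend(ax, point_handles,
                labels=[h.get_label() for h in point_handles],
                loc='upper left', frameon=False, title='Groups')
ax.add_artist(second)
```

The point of the design is that an axes only keeps track of one 'own' legend, so any further legend has to be added as a plain artist. Whether the entries are `Line2D` or `Patch` objects only changes how the legend key is drawn.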

The Full Stack: Tools & Processes for Urban Data Scientists

Recently, I was asked to give talks at both UCL’s CASA and the ETH Future Cities Lab in Singapore for students and staff new to ‘urban data science’ and the sorts of workflows involved in collecting, processing, analysing, and reporting on urban geo-data. Developing the talk proved to be a rather enjoyable opportunity to reflect on more than a decade in commercial data mining and academic research – not only did I realise how far I had come, I realised how far the domain had come in that time.


Pint of Science: Curious About the Housing Crisis?

As a follow-on to my earlier piece on Hex-Binning Land Registry Data, here’s a talk I gave on the housing crisis as part of the Pint of Science Festival a couple of weeks back.


Hex Binning Land Registry Data

One of the known problems with choropleth maps is that small zones, even if they contain very significant values, tend to get lost in amongst much larger zones. A current example is that the ridings in London are much smaller than those outside of London, so it can be hard to tell what’s happening in the capital if you are looking at a map of the entire UK. One solution to this is the hexagonal bin.

2 Funded PhD Positions at King’s

It’s been a long time coming, but I’m really pleased to be able to share details about two PhDs at King’s for which I have funding: one to look at the growth and evolution of the UK’s landline network, and one to look at the interface between smart city systems and urban governance. Read on for details about each.


‘Mapping the Space of Flows’: the geography of the London Mega-City Region

I’m pleased to be able to post here the penultimate version of an article that Duncan Smith and I recently had accepted to Regional Studies. In this article we look at ways of combining ‘big data’ from a telecoms network with standard BRES employment data to generate a more nuanced understanding of where ‘work’ happens in the Greater Southeast of England across several key sectors.

Bridging the Qual/Quant Divide

I’ve been in my new post in the Geography department at King’s College London for nearly nine months now and, together with another new-ish colleague, have been asked to design a programme to teach quantitative research methods to students who often seem to think that their interests are solely qualitative.

Fear of Failure

An ongoing preoccupation of many governments, but perhaps most especially this one, has been the fostering of innovation and the training of the next generation of entrepreneurs. The positioning of tertiary education under Business, Innovation & Skills is one obvious sign of this focus and so, as I noted before, is the Government’s investment in (and messaging around) ‘Tech City’.

Multiple MySQL Servers on a Single Machine

Note: this was previously posted at simulacra.info, but I am in the process of (re)organising my technical notes and tutorials.

A bit of a dry post here, but I thought I’d share my experience of trying to get two instances of MySQL (and two different versions, to boot) running simultaneously on a single piece of hardware as I’ve spent the past two days tearing my hair out and swearing profusely (mostly) under my breath.

The MapThing Processing Library

MapThing allows you to perform a range of useful mapping (in the geographical sense) functions within Processing and offers a collection of classes for reading ESRI-compliant Shape files (a.k.a. shapefiles), CSV point data, and GPX files, and then displaying them as part of a sketch.
