Seaborn Plots with 2 Legends

Posted here because I will inevitably forget this painfully worked-out answer for having legends for two different types of plots in Seaborn…

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# We will need to access some of these matplotlib classes directly
from matplotlib.lines import Line2D # For points and lines
from matplotlib.patches import Patch # For KDE and other plots
from matplotlib.legend import Legend
from matplotlib import cm

# Initialise random number generator
rng = np.random.default_rng(seed=42)

# Generate sample of 25 numbers
n = 25

clusters = []

for c in range(3):
    # Crude way to get different distributions
    # for each cluster
    p = rng.integers(low=1, high=6, size=4)
    df = pd.DataFrame({
        'x': rng.normal(p[0], p[1], n),
        'y': rng.normal(p[2], p[3], n),
        'name': f"Cluster {c+1}"
    })
    clusters.append(df)

# Flatten to a single data frame
clusters = pd.concat(clusters)

# Now do the same for data to feed into
# the second (scatter) plot...
n = 8
points = []

for c in range(2):
    p = rng.integers(low=1, high=6, size=4)
    df = pd.DataFrame({
        'x': rng.normal(p[0], p[1], n),
        'y': rng.normal(p[2], p[3], n),
        'name': f"Group {c+1}"
    })
    points.append(df)

points = pd.concat(points)

# And create the figure
f, ax = plt.subplots(figsize=(8,8))

# The KDE-plot generates a Legend 'as usual'
# (`fill` and `levels` replace the deprecated
# `shade` and `n_levels` parameters)
k = sns.kdeplot(
    data=clusters,
    x='x', y='y',
    hue='name',
    fill=True,
    thresh=0.05,
    levels=2,
    alpha=0.2,
    ax=ax,
)

# Notice that we access this legend via the
# axis to turn off the frame, set the title,
# and adjust the patch alpha level so that
# it closely matches the alpha of the KDE-plot
ax.get_legend().set_frame_on(False)
ax.get_legend().set_title("Clusters")
for lh in ax.get_legend().get_patches():
    lh.set_alpha(0.2)

# You would probably want to sort your data
# frame or set the hue and style order in order
# to ensure consistency for your own application
# but this works for demonstration purposes
groups = points.name.unique()
markers = ['o', 'v', 's', 'X', 'D', '<', '>']
# plt.get_cmap is used here because cm.get_cmap is
# deprecated in recent Matplotlib releases
colors = plt.get_cmap('Dark2').colors

# Generate the scatterplot: notice that Legend is
# off (otherwise this legend would overwrite the
# first one) and that we're setting the hue, style,
# markers, and palette using the 'name' parameter
# from the data frame and the number of groups in
# the data.
p = sns.scatterplot(
    data=points,
    x="x",
    y="y",
    hue='name',
    style='name',
    markers=markers[:len(groups)],
    palette=list(colors[:len(groups)]),
    legend=False,
    s=30,
    alpha=1.0,
    ax=ax,  # draw on the same axes as the KDE-plot
)

# Here's the 'magic' -- we use zip to link together
# the group name, the color, and the marker style. You
# *cannot* retrieve the marker style from the scatterplot
# since that information is lost when rendered as a
# PathCollection (as far as I can tell). Anyway, this allows
# us to loop over each group in the second data frame and
# generate a 'fake' Line2D plot (with a single dummy point and
# no line-width in our case) that we can add to the legend. If
# you were overlaying a line plot or a second plot that uses
# patches you'd have to tweak this accordingly.
handles = []
for name, color, marker in zip(groups, colors, markers):
    handles.append(Line2D([0], [0], linewidth=0.0, linestyle='',
                          color=color, markerfacecolor=color,
                          marker=marker, label=name, alpha=1.0))

# And add these handles (with their group labels) to the new
# legend item and place it on the plot.
leg = Legend(ax, handles, labels=list(groups),
             loc='upper left', frameon=False, title='Groups')
ax.add_artist(leg)

# Done
plt.show()

A Seaborn plot showing 2 legends for different types of plots.
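
For future me: stripped of the Seaborn parts, the trick boils down to hand-built 'proxy' handles plus `ax.add_artist()` to stop the second legend replacing the first. What follows is only a minimal sketch in plain Matplotlib, not the code that produced the figure above. The helper name `add_two_legends`, the labels, and the colours are all made up for illustration, and the Agg backend is an assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
from matplotlib.patches import Patch


def add_two_legends(ax, patch_specs, marker_specs):
    """Hypothetical helper: attach a patch legend and a marker legend to `ax`.

    patch_specs:  list of (label, color) pairs
    marker_specs: list of (label, color, marker) triples
    """
    # Proxy patches stand in for the filled (KDE-style) layers
    patch_handles = [Patch(facecolor=color, alpha=0.2, label=label)
                     for label, color in patch_specs]
    first = ax.legend(handles=patch_handles, title="Clusters",
                      loc="upper right", frameon=False)
    # A later ax.legend() call would replace this legend, so
    # re-add it to the axes as an ordinary artist first
    ax.add_artist(first)

    # Proxy lines with a marker but no visible line stand in for the points
    marker_handles = [Line2D([], [], linestyle="", marker=marker,
                             color=color, label=label)
                      for label, color, marker in marker_specs]
    second = ax.legend(handles=marker_handles, title="Groups",
                       loc="upper left", frameon=False)
    return first, second


fig, ax = plt.subplots(figsize=(4, 4))
first, second = add_two_legends(
    ax,
    patch_specs=[("Cluster 1", "tab:blue"), ("Cluster 2", "tab:orange")],
    marker_specs=[("Group 1", "tab:green", "o"), ("Group 2", "tab:purple", "v")],
)
fig.canvas.draw()  # force a render to check both legends draw cleanly
```

The difference from the code above is only in how the legends get attached: here both come from `ax.legend()`, with the first one re-added by hand, whereas the post builds the second `Legend` object directly and adds that.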

The Full Stack: Tools & Processes for Urban Data Scientists

Recently, I was asked to give talks at both UCL’s CASA and the ETH Future Cities Lab in Singapore for students and staff new to ‘urban data science’ and the sorts of workflows involved in collecting, processing, analysing, and reporting on urban geo-data. Developing the talk proved to be a rather enjoyable opportunity to reflect on more than a decade in commercial data mining and academic research – not only did I realise how far I had come, I realised how far the domain had come in that time.

Continue reading

Land Registry Consultation #2: Reasons to Respond

In some circles (e.g. mine) news that the government is trying (again) to sell off the Land Registry has caused something of a stir. The curtain closed on the first act of this drama in March 2014, by which time 91% of respondents to the consultation opposed the Land Registry’s transition to a service delivery company. Apparently, it wasn’t the overwhelming opposition from, well, everyone that scuppered the deal, it was Vince Cable.

Government appears to have decided that if your first consultation doesn’t go the way you want, then why not try again with a more radical option? Should you worry?

Continue reading

Installing PostgreSQL Extensions on Mac OS X

I’ve been making a lot of use of PostgreSQL and PostGIS for working with geo-data over the past year and, having finally gotten over my hatred of the non-standard administrative commands, I am seriously impressed with what this setup makes possible. Even on a MacBook Air with just 8GB of RAM! However, one area where I’ve run into problems is the use of extensions on OS X so this post is intended as a handy reference for how to install them.

Continue reading

Peer Programming for Academics

I recently wrote up some thoughts on the value of peer programming as a tool for academic use in course planning and administration. The short answer: really useful but, like all things, best used in moderation.

Read more at: Peer Programming for Academics

Mapping the Changing Affordability of Manchester

Building on yesterday’s post about my London affordability maps, here are the equivalent maps for the Manchester area (sorry Liverpool, I’ll get there!) from 1997 and 2012. It’s obviously a very different picture in terms of price, volume and distribution; these differences were well-known anecdotally but a lot of the detail was hidden until the Land Registry opened up its pricing data and, for my money, this represents one of the most useful and timely open data sets available.

Continue reading

Mapping the Changing Affordability of London

Last night I discovered how many of my friends watch C4’s Dispatches since quite a few of them texted me to say that they had seen me talking about property affordability on “The Great British Property Divide”. However, since Dispatches has to somehow keep the running time down to just 30 minutes, there’s not much of a chance in the show to really explore the data underpinning my chat with Morland. So, with that in mind, below are links to A0-sized static data visualisations.

Continue reading

History of Telephony: Funded PhD Award with King’s College London, BT and the Science Museum Group

Applications are invited for an AHRC-funded doctoral student to join King’s College London, BT Archives, and the Science Museum Group in late September 2015 or early January 2016 to investigate the impact of the telephone landline network on British society and culture(s).

The project is informed by the rise of the Internet and social media, the interest this has generated in understanding how networks grow and evolve over time, and how this can be connected to wider changes in society. The comprehensive historical and technical archive managed by BT represents a unique resource for researchers, grounding an analysis of ‘impact’ in an understanding of the network as an object materialised through a range of artefacts: from physical cables and switches, to abstract statistics on usage by homes and businesses.

Continue reading

Pint of Science: Curious About the Housing Crisis?

As a follow-on to my earlier piece on Hex-Binning Land Registry Data, here’s a talk I gave on the housing crisis as part of the Pint of Science Festival a couple of weeks back.

Continue reading

Hex Binning Land Registry Data

One of the known problems with choropleth maps is that small zones, even if they contain very significant values, tend to get lost in amongst much larger zones. A current example is that the ridings in London are much smaller than those outside of London, so it can be hard to tell what’s happening in the capital if you are looking at a map of the entire UK. One solution to this is the hexagonal bin.

Continue reading
