For a recent talk in my department I talked a little bit about agent-based modeling, and in the process I came across the simple but quite interesting SIR model in epidemiology. The inspiration for this post was Simon Dobson's post on Epidemic spreading processes, which provides a much more detailed scientific background and takes you through some of the code step by step. As a brief introduction: in the SIR model each individual is Susceptible, Infected or Recovered, and the disease spreads from infected individuals to the susceptible individuals they are connected to.
I've made some minor tweaks to the model by adding vaccinated and dead states. I've also unified the function-based approach into a single Parameterized class, which takes care of initializing, running and visualizing the network.
In this blog post I'll primarily look at how we can quickly create complex visualizations of this model using HoloViews. In the process I'll look at some predictions this model can make about herd immunity, but I won't be giving it any rigorous scientific treatment.
Here's the code for the model, which relies only on numpy, networkx, param and holoviews, with matplotlib in the background.
import collections
import itertools
import math
import numpy as np
np.seterr(divide='ignore')
import numpy.random as rnd
import networkx as nx
import param
import holoviews as hv
SPREADING_SUSCEPTIBLE = 'S'
SPREADING_VACCINATED = 'V'
SPREADING_INFECTED = 'I'
SPREADING_RECOVERED = 'R'
DEAD = 'D'
class SRI_Model(param.Parameterized):
    """
    Implementation of the SIR epidemiology model
    using NetworkX and HoloViews for visualization.
    This code has been adapted from Simon Dobson's
    code here:
    http://www.simondobson.org/complex-networks-complex-processes/epidemic-spreading.html
    In addition to his basic parameters I've added
    additional states to the model, a node may be
    in one of the following states:
      * Susceptible: Can catch the disease from a connected node.
      * Vaccinated: Immune to infection.
      * Infected: Has the disease and may pass it on to any connected node.
      * Recovered: Immune to infection.
      * Dead: Edges are removed from graph.
    """
    network = param.ClassSelector(class_=nx.Graph, default=None, doc="""
        A custom NetworkX graph, instead of the default Erdos-Renyi graph.""")
    visualize = param.Boolean(default=True, doc="""
        Whether to compute layout of network for visualization.""")
    # Initial parameters
    N = param.Integer(default=1000, doc="""
        Number of nodes to simulate.""")
    mean_connections = param.Number(default=10, doc="""
        Mean number of connections to make to other nodes.""")
    pSick = param.Number(default=0.01, doc="""
        Probability of a node to be initialized in sick state.""", bounds=(0, 1))
    pVaccinated = param.Number(default=0.1, bounds=(0, 1), doc="""
        Probability of a node to be initialized in vaccinated state.""")
    # Simulation parameters
    pInfect = param.Number(default=0.3, doc="""
        Probability of infection on each time step.""", bounds=(0, 1))
    pRecover = param.Number(default=0.05, doc="""
        Probability of recovering if infected on each timestep.""", bounds=(0, 1))
    pDeath = param.Number(default=0.1, doc="""
        Probability of death if infected on each timestep.""", bounds=(0, 1))
    def __init__(self, **params):
        super(SRI_Model, self).__init__(**params)
        # Test against None explicitly: an empty Graph is falsy
        if self.network is None:
            self.g = nx.erdos_renyi_graph(self.N, float(self.mean_connections)/self.N)
        else:
            self.g = self.network
            # Keep N in sync with the supplied graph so the rates in stats() stay correct
            self.N = self.g.order()
        self.vaccinated, self.infected = self.spreading_init()
        self.model = self.spreading_make_sir_model()
        self.color_mapping = [SPREADING_SUSCEPTIBLE,
                              SPREADING_VACCINATED,
                              SPREADING_INFECTED,
                              SPREADING_RECOVERED, DEAD]
        if self.visualize:
            self.pos = nx.spring_layout(self.g, iterations=50,
                                        k=2/(math.sqrt(self.g.order())))
    def spreading_init(self):
        """Initialise the network with vaccinated, susceptible and infected states."""
        vaccinated, infected = 0, []
        for i in self.g.node.keys():
            self.g.node[i]['transmissions'] = 0
            if rnd.random() <= self.pVaccinated:
                self.g.node[i]['state'] = SPREADING_VACCINATED
                vaccinated += 1
            elif rnd.random() <= self.pSick:
                self.g.node[i]['state'] = SPREADING_INFECTED
                infected.append(i)
            else:
                self.g.node[i]['state'] = SPREADING_SUSCEPTIBLE
        return vaccinated, infected
    def spreading_make_sir_model(self):
        """Return an SIR model function for given infection and recovery probabilities."""
        # model (local rule) function
        def model(g, i):
            if g.node[i]['state'] == SPREADING_INFECTED:
                # infect susceptible neighbours with probability pInfect
                for m in g.neighbors(i):
                    if g.node[m]['state'] == SPREADING_SUSCEPTIBLE:
                        if rnd.random() <= self.pInfect:
                            g.node[m]['state'] = SPREADING_INFECTED
                            self.infected.append(m)
                            g.node[i]['transmissions'] += 1
                # recover with probability pRecover
                if rnd.random() <= self.pRecover:
                    g.node[i]['state'] = SPREADING_RECOVERED
                elif rnd.random() <= self.pDeath:
                    edges = [edge for edge in self.g.edges() if i in edge]
                    g.node[i]['state'] = DEAD
                    g.remove_edges_from(edges)
        return model
    def step(self):
        """Run a single step of the model over the graph."""
        for i in self.g.node.keys():
            self.model(self.g, i)
    def run(self, steps):
        """
        Run the network for the specified number of time steps
        """
        for i in range(steps):
            self.step()
    def network_data(self):
        """
        Return the network edge paths and node positions,
        requires visualize parameter to be enabled.
        """
        if not self.visualize:
            raise Exception("Enable visualize option to get network data.")
        points = np.array([self.pos[v] for v in self.g.nodes_iter()])
        paths = []
        for e in self.g.edges_iter():
            xs = [ self.pos[e[0]][0], self.pos[e[1]][0] ]
            ys = [ self.pos[e[0]][1], self.pos[e[1]][1] ]
            # list() keeps this working on Python 3, where zip is lazy
            paths.append(np.array(list(zip(xs, ys))))
        return paths, points
    def stats(self):
        """
        Return an ItemTable with statistics on the network data.
        """
        state_labels = hv.OrderedDict([('S', 'Susceptible'), ('V', 'Vaccinated'), ('I', 'Infected'),
                                       ('R', 'Recovered'), ('D', 'Dead')])
        counts = collections.Counter()
        transmissions = []
        for n in self.g.nodes_iter():
            state = state_labels[self.g.node[n]['state']]
            counts[state] += 1
            if n in self.infected:
                transmissions.append(self.g.node[n]['transmissions'])
        data = hv.OrderedDict([(l, counts[l])
                               for l in state_labels.values()])
        infected = len(set(self.infected))
        unvaccinated = float(self.N-self.vaccinated)
        data['$R_0$'] = np.mean(transmissions) if transmissions else 0
        data['Death rate DR'] = np.divide(float(data['Dead']),self.N)
        data['Infection rate IR'] = np.divide(float(infected), self.N)
        if unvaccinated:
            unvaccinated_dr = data['Dead']/unvaccinated
            unvaccinated_ir = infected/unvaccinated
        else:
            unvaccinated_dr = 0
            unvaccinated_ir = 0
        data['Unvaccinated DR'] = unvaccinated_dr
        data['Unvaccinated IR'] = unvaccinated_ir
        return hv.ItemTable(data)
    def animate(self, steps):
        """
        Run the network for the specified number of steps accumulating animations
        of the network nodes and edges changing states and curves tracking the
        spread of the disease.
        """
        if not self.visualize:
            raise Exception("Enable visualize option to compute network visualizations.")
        # Declare HoloMap for network animation and counts array
        network_hmap = hv.HoloMap(key_dimensions=['Time'])
        sird = np.zeros((steps, 5))
        # Declare dimensions and labels
        spatial_dims = [hv.Dimension('x', range=(-1.1, 1.1)),
                        hv.Dimension('y', range=(-1.1, 1.1))]
        state_labels = ['Susceptible', 'Vaccinated', 'Infected', 'Recovered', 'Dead']
        # Text annotation
        nlabel = hv.Text(0.9, 0.05, 'N=%d' % self.N)
        for i in range(steps):
            # Get path, point, states and count data
            paths, points = self.network_data()
            states = [self.color_mapping.index(self.g.node[n]['state'])
                      for n in self.g.nodes_iter()]
            state_array = np.array(states, ndmin=2).T
            (sird[i, :], _) = np.histogram(state_array, bins=list(range(6)))
            # Create network path and node Elements
            network_paths = hv.Path(paths, key_dimensions=spatial_dims)
            network_nodes = hv.Points(np.hstack([points, state_array]),
                                      key_dimensions=spatial_dims,
                                      value_dimensions=['State'])
            # Create overlay and accumulate in network HoloMap
            network_hmap[i] = (network_paths * network_nodes * nlabel).relabel(group='Network', label='SRI')
            self.step()
        # Create Overlay of Curves (list() keeps zip usable on Python 3)
        extents = (-1, -1, steps, np.max(sird)+2)
        curves = hv.NdOverlay({label: hv.Curve(list(zip(range(steps), sird[:, i])), extents=extents,
                                               key_dimensions=['Time'], value_dimensions=['Count'])
                               for i, label in enumerate(state_labels)},
                              key_dimensions=[hv.Dimension('State', values=state_labels)])
        # Animate VLine on top of Curves
        distribution = hv.HoloMap({i: (curves * hv.VLine(i)).relabel(group='Counts', label='SRI')
                                   for i in range(steps)}, key_dimensions=['Time'])
        return network_hmap + distribution
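One detail of the local rule is worth spelling out: recovery and death are tested with two separate random draws, and death is only tested if the node did not recover on that step. As a minimal, dependency-free sketch (the helper name `infected_outcomes` is hypothetical and not part of the class above), the per-step outcome probabilities for an infected node can be worked out like this. It ignores the network entirely and only looks at a single infected node.

```python
def infected_outcomes(p_recover, p_death):
    """Per-step outcome probabilities for a single infected node.

    Mirrors the if/elif in the local rule: recovery is tested first,
    death is only tested if the node did not recover on this step.
    """
    recover = p_recover
    die = (1.0 - p_recover) * p_death
    leave = recover + die  # probability of leaving the infected state
    return {
        'recover': recover,
        'die': die,
        'stay_infected': 1.0 - leave,
        # Geometric waiting time: expected number of updates spent infected
        'expected_steps_infected': 1.0 / leave if leave else float('inf'),
        # Chance that an infection eventually ends in death rather than recovery
        'eventual_death': die / leave if leave else 0.0,
    }

# Class defaults: pRecover=0.05, pDeath=0.1
outcomes = infected_outcomes(p_recover=0.05, p_death=0.1)
for name, value in outcomes.items():
    print('%s: %.3f' % (name, value))
```

With the class defaults an infected node therefore dies on a given step with probability 0.095 rather than 0.1, because the death draw is skipped whenever the node recovers first.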
HoloViews allows us to define various style options in advance on the Store.options object.
hv.notebook_extension()
# Increase dpi and select the slider widget
%output dpi=120 holomap='widgets'
# Set colors and style options for the Element types
from holoviews import Store, Options
from holoviews.core.options import Palette
opts = Store.options()
opts.Path = Options('style', linewidth=0.2, color='k')
opts.ItemTable = Options('plot', aspect=1.2, fig_size=150)
opts.Curve = Options('style', color=Palette('hot_r'))
opts.Histogram = Options('plot', bgcolor='w', show_grid=False)
opts.Overlay = Options('plot', show_frame=False)
opts.HeatMap = Options('plot', show_values=False, show_grid=False,
aspect=1.5, xrotation=90)
opts.Overlay.Network = Options('plot', xaxis=None, yaxis=None, bgcolor='w')
opts.Overlay.Counts = Options('plot', aspect=1.2, show_grid=True)
opts.Points = {'style': Options(cmap='hot_r', s=50, edgecolors='k'),
'plot': Options(color_index=2)}
opts.VLine = {'style': Options(color='k', linewidth=1),
'plot': Options(show_grid=True)}
Next we'll simply enable the Seaborn plot style defaults because they look a bit better than the HoloViews defaults for this kind of data.
import seaborn
seaborn.set()
Having defined the model and the plotting options, we can run some real experiments. In particular we can investigate the effect of vaccination on our model.
We'll initialize our model with only 50 individuals, who will on average make 10 connections to other individuals. Then we will infect a small fraction of the population ($p=0.15$) so we can track how the disease spreads. To really drive the point home we'll use a very infectious and deadly disease.
experiment1_params = dict(pInfect=0.08, pRecover=0.08, pSick=0.15,
N=50, mean_connections=10, pDeath=0.1)
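To get a feel for what these settings mean, here is a small back-of-the-envelope sketch in plain Python. It assumes the Erdős–Rényi construction used in the model's `__init__` (edge probability `mean_connections / N`) and the two-stage initialization in `spreading_init`, where only nodes that were not vaccinated get a chance to start out sick. The helper name `expected_initial_conditions` is my own and purely illustrative, and the 10% vaccination rate is the one used in the first run.

```python
def expected_initial_conditions(N, mean_connections, pVaccinated, pSick):
    """Expected graph and population sizes before the first step."""
    edge_p = float(mean_connections) / N   # probability of each possible edge
    possible_edges = N * (N - 1) / 2.0     # undirected graph without self-loops
    return {
        'edge_probability': edge_p,
        'expected_edges': edge_p * possible_edges,
        'expected_degree': edge_p * (N - 1),
        'expected_vaccinated': N * pVaccinated,
        # The sickness draw is only reached by unvaccinated nodes
        'expected_infected': N * (1.0 - pVaccinated) * pSick,
    }

expected = expected_initial_conditions(N=50, mean_connections=10,
                                       pVaccinated=0.1, pSick=0.15)
for name, value in expected.items():
    print('%s: %.2f' % (name, value))
```

Note that the expected degree comes out at 9.8 rather than exactly 10, because a node cannot connect to itself.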
Here we'll investigate the spread of the disease in a population with a 10% vaccination rate:
sri_model = SRI_Model(pVaccinated=0.1, **experiment1_params)
sri_model.animate(21)
In figure A we can observe how the disease quickly spreads across almost the entire unvaccinated population. Additionally, we can track the number of individuals in each state in B. As the disease spreads unimpeded, most individuals either die or recover and thereby gain immunity. Individuals that die are no longer part of the network, so their connections to other individuals are deleted; this is why we can see the network thin out as the disease wreaks havoc among the population.
Next we can view a breakdown of the final state of the simulation including infection and death rates:
sri_model.stats()
As you can see, both the infection and death rates are very high in this population. The disease reached a large percentage of all individuals, causing death in a large fraction of them. Among the unvaccinated population the rates are of course even higher, with more than 90% infected and more than 40% dead. The disease spread through our network completely unimpeded. Now let's see what happens if a large fraction of the population is vaccinated.
If we increase the initial probability of being vaccinated to $p=0.65$ we'll be able to observe how this affects the spread of the disease through the network:
sri_model = SRI_Model(pVaccinated=0.65, **experiment1_params)
sri_model.animate(21)
Even though we can still see the disease spreading among non-vaccinated individuals, we can also observe how the vaccinated individuals stop the spread. If most of an infected individual's connections are vaccinated, the disease has far fewer opportunities to spread. Unlike in the low-vaccination population, the disease does not stop because too many individuals have died off; rather, it quickly runs out of steam, so that a majority of the initially susceptible but healthy population remains completely unaffected.
This is what's known as herd immunity, and it's very important because a small percentage of any population cannot be vaccinated, usually because they are immunocompromised. However, when a larger percentage of people decide that they do not want to get vaccinated (for various and invariably stupid reasons), they place the rest of the population in danger, particularly those who cannot get vaccinated for health reasons.
Let's look at what higher vaccination rates did to our experimental population:
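To put a rough number on how vaccinated neighbours slow the spread, here is a sketch rather than anything rigorous. On each step an infected node tries to infect every susceptible neighbour independently with probability `pInfect`. If we assume, purely for illustration, that a node has about `mean_connections` neighbours and that all of its unvaccinated neighbours are still susceptible, we get an optimistic upper estimate of the new infections it causes per step. The function name is hypothetical.

```python
def expected_new_infections_per_step(mean_connections, pVaccinated, pInfect):
    """Upper estimate of infections caused by one infected node in one step.

    Assumes every unvaccinated neighbour is still susceptible, which
    overestimates the spread once the epidemic is under way.
    """
    susceptible_neighbours = mean_connections * (1.0 - pVaccinated)
    return susceptible_neighbours * pInfect

# The two vaccination rates from the first experiment (pInfect=0.08)
low_vaccination = expected_new_infections_per_step(10, 0.10, 0.08)
high_vaccination = expected_new_infections_per_step(10, 0.65, 0.08)
print('10%% vaccinated: %.2f new infections per step' % low_vaccination)
print('65%% vaccinated: %.2f new infections per step' % high_vaccination)
```

Under these assumptions, raising the vaccination rate from 10% to 65% cuts the estimate from about 0.72 to about 0.28 new infections per infected node per step.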
sri_model.stats()
The precipitous drop in the whole population's infection and death rates is easily explained by the fact that a smaller fraction of the population was susceptible to the disease in the first place. However, as herd immunity would predict, a smaller fraction of the unvaccinated population contracted and died of the disease as well. I hope this toy example once again emphasizes how important vaccination and herd immunity are.
Before we take a more systematic look at herd immunity, we'll increase the population size to 1000 individuals and have a look at what our virulent disease does to this population; if nothing else it'll produce a pretty plot. If you're running this notebook live you could also try out one of the interactive backends at this point. To choose mpld3 as a backend run:
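The distinction between whole-population rates and unvaccinated rates is easy to lose, so here is a minimal sketch of the arithmetic behind the last four rows of the stats table. The function name `epidemic_rates` and the counts passed to it are made up for illustration; they are not output from the simulations above.

```python
def epidemic_rates(N, vaccinated, ever_infected, dead):
    """Infection and death rates for the whole and the unvaccinated population."""
    unvaccinated = float(N - vaccinated)
    rates = {
        'Infection rate IR': ever_infected / float(N),
        'Death rate DR': dead / float(N),
    }
    # Vaccinated nodes never get infected in this model, so every case
    # and every death belongs to the unvaccinated group.
    if unvaccinated:
        rates['Unvaccinated IR'] = ever_infected / unvaccinated
        rates['Unvaccinated DR'] = dead / unvaccinated
    else:
        rates['Unvaccinated IR'] = 0
        rates['Unvaccinated DR'] = 0
    return rates

# Hypothetical counts, NOT simulation results
example = epidemic_rates(N=50, vaccinated=30, ever_infected=8, dead=4)
for name, value in sorted(example.items()):
    print('%s: %.2f' % (name, value))
```

Dividing by the unvaccinated count removes the trivial effect of simply having fewer susceptible individuals, which is why the unvaccinated rates are the ones to watch for herd immunity.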
%output backend='matplotlib:mpld3' size=100
or for nbagg:
%output backend='matplotlib:nbagg' widgets='live'
Instead we'll choose a video backend so you can look at the network in full screen:
%%output holomap='scrubber' size=150
sri_model_lv = SRI_Model(pVaccinated=0.1, **dict(experiment1_params, N=1000))
sri_layout = sri_model_lv.animate(31)
sri_layout.Network.SRI[::2]
sri_model_hv = SRI_Model(pVaccinated=0.65, visualize=False, **dict(experiment1_params, N=1000))
sri_model_hv.run(100)
(sri_model_lv.stats().relabel('Low Vaccination Population') +
sri_model_hv.stats().relabel('High Vaccination Population'))
As we can see, the effect we observed in our smaller simulations above still holds. Unvaccinated individuals are much safer in the high-vaccination population than they are in the low-vaccination population.
%output size=80
Now let's conduct a more systematic experiment by varying the vaccination rate and the number of connections between individuals. In Experiment 1 we saw that vaccination could drastically reduce infection and death rates even among the unvaccinated population. Here we'll use a much less deadly disease, as we're primarily interested in how the disease spreads through populations with more or fewer connections and different vaccination rates. We'll also use a larger population (N=1000) to get a more representative sample.
experiment2_params = dict(N=1000, pInfect=0.05, pRecover=0.05,
pSick=0.05, pDeath=0.001, visualize=False)
Now we explore the parameter space: we'll run the model for vaccination rates from 0% to 100% in 5% increments and for increasing numbers of connections. To speed the whole thing up we've disabled computing the network layout with the visualize parameter and will only be collecting the final simulation statistics. Finally, we can simply deconstruct our data into a pandas data frame.
exp2_dims = ['Connections', 'pVaccinated']
hmap = hv.HoloMap(key_dimensions=exp2_dims)
vacc_rates = np.linspace(0, 1, 21)
mean_conns = [2**i for i in range(7)]
for v, c in itertools.product(vacc_rates, mean_conns):
    sri_model = SRI_Model(mean_connections=c, pVaccinated=v, **experiment2_params)
    sri_model.run(100)
    hmap[c, v] = sri_model.stats()
df = hmap.dframe()
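For reference, the size of this parameter grid can be checked without running a single simulation. The sketch below rebuilds the same grid in plain Python, swapping `np.linspace(0, 1, 21)` for a list comprehension so that it has no dependencies; the values agree up to floating point rounding.

```python
import itertools

# 0%, 5%, ..., 100% vaccination: 21 values
vacc_rates = [i / 20.0 for i in range(21)]
# Mean connections doubling from 1 up to 64: 7 values
mean_conns = [2 ** i for i in range(7)]

# Every (vaccination rate, mean connections) pair, in the order the loop visits them
grid = list(itertools.product(vacc_rates, mean_conns))
print('%d simulations of 100 steps each' % len(grid))
print('first: %r, last: %r' % (grid[0], grid[-1]))
```

That is 147 independent model runs, which is why skipping the layout computation is worthwhile here.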
Before we start visualizing this data let's have a look at it:
df[::20]
Using the HoloViews pandas and seaborn extensions we can now perform regressions on the vaccination rates against infection and death rates. However since we also varied the mean number of connections between individuals in the network we want to consider these variables independently. By assigning the number of connections to a HoloMap we can view each plot independently with a widget.
Let's define the quantities we want to visualize:
quantities = ['Unvaccinated IR', 'Infection rate IR', 'Death rate DR', '$R_0$']
state_labels = ['Susceptible', 'Vaccinated', 'Infected', 'Recovered', 'Dead']
%%opts Regression (order=2 x_bins=10)
hv.Layout([hv.Table(df).to.regression('pVaccinated', var, ['Connections'])
for var in quantities]).cols(2)
%%opts Layout [fig_size=200]
%%opts Trisurface (cmap='Reds_r' linewidth=0.1)
(hv.Table(df).to.trisurface(['pVaccinated', 'Connections'],
'$R_0$', [], group='$R_0$') +
hv.Table(df).to.trisurface(['pVaccinated', 'Connections'],
'Infection rate IR', [], group='Infection Rate'))
By varying the number of connections we can observe second order effects that would usually be invisible to us. After playing around with it for a little we can draw the following conclusions:
These results emphasize how important it is to maintain high vaccination rates in the highly connected societies we live in today. Even more importantly they show how important it is to continue vaccination programs in developing countries where they'll have the greatest impact.
We can also present the data in a different way examining all the data at once in a HeatMap.
%%output dpi=80 size=100
%%opts Layout [fig_inches=(12,7) aspect_weight=1]
group_colors = zip(quantities, ['Blues', 'Reds', 'Greens', 'Purples'])
hv.Layout([hv.Table(df).to.heatmap(['pVaccinated', 'Connections'],
q, [], group=q)(style=dict(cmap=c)).hist()
for q, c in group_colors]).display('all').cols(2)
This view highlights the gradient along which vaccination is highly effective and then becomes less effective as the saturation of the colors increases.
Remember "It's only a model", and a fairly simple one at that, but it provides some very clear predictions, which we've also observed in the real world. Getting the most out of models like this, or even far more complex simulations, requires tools that allow us to make sense of interactions between many variables. I hope this post has gone some way towards persuading you that HoloViews is that tool; if so, have a look at the HoloViews website, our Tutorials and other Examples.
The SRI_Model class provided above is deliberately very customizable. If you want to play with the model further, try varying some other parameters and explore the effects on the model, or supply your own network via the network parameter. There are a variety of tools to extract NetworkX Graph structures, so you could put together a model of your own social network. Hope you enjoyed it and have fun!