Analyzers

Each Starsim module can have its own results, which get added to the full list of results in the Sim object. For example, the ss.Pregnancy module adds results like sim.results.pregnancy.pregnant, and the ss.HIV module adds results like sim.results.hiv.new_infections. If you are writing your own module, you can add whatever custom results you want. However, another option is to create an Analyzer to store results that you might need for one particular analysis but won’t need all the time. An Analyzer is very similar to other Starsim modules in its structure, but the general idea of an analyzer is that it gets called at the end of a timestep, and reports of the state of things after everything else has been updated without changing any of the module states itself.

Simple usage

For simple reporting, it’s possible to use a single function as an analyzer. In this case, the function receives a single argument, sim, which it has full access to. For example, if you wanted to know the number of connections in the network on each timestep, you could write a small analyzer as follows:

import numpy as np
import starsim as ss
import matplotlib.pyplot as plt
ss.options(jupyter=True)

# Store the number of edges
n_edges = []

def count_edges(sim):
    """ Print out the number of edges in the network on each timestep """
    network = sim.networks[0] # Get the first network
    n = len(network)
    n_edges.append(n)
    print(f'Number of edges for network {network.name} on step {sim.ti}: {n}')
    return

# Create the sim
pars = dict(
    diseases = 'sis',
    networks = 'mf',
    analyzers = count_edges,
    demographics = True,
)

# Run the sim
sim = ss.Sim(pars).run()
sim.plot()

# Plot the number of edges
plt.figure()
plt.plot(sim.timevec, n_edges)
plt.title('Number of edges over time')
plt.ylim(bottom=0)
plt.show()
Initializing sim with 10000 agents
  Running 2000.01.01 ( 0/51) (0.00 s)  ———————————————————— 2%
Number of edges for network mfnet on step 0: 3286
Number of edges for network mfnet on step 1: 3319
Number of edges for network mfnet on step 2: 3357
Number of edges for network mfnet on step 3: 3395
Number of edges for network mfnet on step 4: 3426
Number of edges for network mfnet on step 5: 3475
Number of edges for network mfnet on step 6: 3509
Number of edges for network mfnet on step 7: 3548
Number of edges for network mfnet on step 8: 3593
Number of edges for network mfnet on step 9: 3618
  Running 2010.01.01 (10/51) (0.30 s)  ••••———————————————— 22%
Number of edges for network mfnet on step 10: 3634
Number of edges for network mfnet on step 11: 3682
Number of edges for network mfnet on step 12: 3713
Number of edges for network mfnet on step 13: 3740
Number of edges for network mfnet on step 14: 3775
Number of edges for network mfnet on step 15: 3816
Number of edges for network mfnet on step 16: 3850
Number of edges for network mfnet on step 17: 3896
Number of edges for network mfnet on step 18: 3945
Number of edges for network mfnet on step 19: 3987
  Running 2020.01.01 (20/51) (0.49 s)  ••••••••———————————— 41%
Number of edges for network mfnet on step 20: 4010
Number of edges for network mfnet on step 21: 4048
Number of edges for network mfnet on step 22: 4085
Number of edges for network mfnet on step 23: 4124
Number of edges for network mfnet on step 24: 4149
Number of edges for network mfnet on step 25: 4164
Number of edges for network mfnet on step 26: 4200
Number of edges for network mfnet on step 27: 4250
Number of edges for network mfnet on step 28: 4299
Number of edges for network mfnet on step 29: 4314
  Running 2030.01.01 (30/51) (0.57 s)  ••••••••••••———————— 61%
Number of edges for network mfnet on step 30: 4346
Number of edges for network mfnet on step 31: 4373
Number of edges for network mfnet on step 32: 4409
Number of edges for network mfnet on step 33: 4465
Number of edges for network mfnet on step 34: 4506
Number of edges for network mfnet on step 35: 4559
Number of edges for network mfnet on step 36: 4609
Number of edges for network mfnet on step 37: 4642
Number of edges for network mfnet on step 38: 4694
Number of edges for network mfnet on step 39: 4742
  Running 2040.01.01 (40/51) (0.65 s)  ••••••••••••••••———— 80%
Number of edges for network mfnet on step 40: 4768
Number of edges for network mfnet on step 41: 4830
Number of edges for network mfnet on step 42: 4887
Number of edges for network mfnet on step 43: 4947
Number of edges for network mfnet on step 44: 5010
Number of edges for network mfnet on step 45: 5047
Number of edges for network mfnet on step 46: 5109
Number of edges for network mfnet on step 47: 5165
Number of edges for network mfnet on step 48: 5216
Number of edges for network mfnet on step 49: 5259
  Running 2050.01.01 (50/51) (0.73 s)  •••••••••••••••••••• 100%

Number of edges for network mfnet on step 50: 5283
Figure(896x672)

Advanced usage

Suppose we wanted to create an analyzer that would report on the number of new HIV infections in pregnant women:

import starsim as ss
import starsim_examples as sse
import pandas as pd

class HIV_preg(ss.Analyzer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        return
    
    def init_results(self):
        super().init_results()
        self.define_results(
            ss.Result('new_infections_pregnancy'),
        )
        return

    def step(self):
        sim = self.sim
        ti = sim.ti
        hiv = sim.diseases.hiv
        pregnant = sim.demographics.pregnancy.pregnant
        newly_infected = hiv.ti_infected == ti
        self.results['new_infections_pregnancy'][ti] = len((newly_infected & pregnant).uids)
        return

pregnancy = ss.Pregnancy(fertility_rate=pd.read_csv('test_data/nigeria_asfr.csv'))
hiv = sse.HIV(beta={'mfnet':[0.5,0.25]})
sim = ss.Sim(diseases=hiv, networks='mfnet', demographics=pregnancy, analyzers=HIV_preg())
sim.run()
print(f'Total infections among pregnant women: {sim.results.hiv_preg.new_infections_pregnancy.sum()}')
Initializing sim with 10000 agents
  Running 2000.01.01 ( 0/51) (0.00 s)  ———————————————————— 2%
  Running 2010.01.01 (10/51) (0.11 s)  ••••———————————————— 22%
  Running 2020.01.01 (20/51) (0.24 s)  ••••••••———————————— 41%
  Running 2030.01.01 (30/51) (0.40 s)  ••••••••••••———————— 61%
  Running 2040.01.01 (40/51) (0.57 s)  ••••••••••••••••———— 80%
  Running 2050.01.01 (50/51) (0.76 s)  •••••••••••••••••••• 100%

Total infections among pregnant women: 151.0

Analyzers are ideal for adding custom results, and because they get added to the sim in the same way as any other result, they also get automatically exported in the same format, e.g. using sim.to_df().

Here’s a plot of the results from our HIV in pregnancy analyzer:

import matplotlib.pyplot as plt

res = sim.results.hiv_preg

plt.figure()
plt.bar(res.timevec.years, res.new_infections_pregnancy)
plt.title('HIV infections acquired during pregnancy')
plt.show()

Built-in analyzers

Starsim comes with two built-in analyzers, an infection log and an age analyzer.

Infection log

The infection log produces a line list of infections. It’s integrated into the logic of ss.Disease in order to track infection sources and targets, which are not kept beyond this (for memory reasons).

# Demonstrate infection log
sim = ss.Sim(n_agents=1000, dt=0.2, dur=15, diseases='sir', networks='random', analyzers='infection_log')
sim.run()
infection_log = sim.analyzers[0]
infection_log.plot()
Initializing sim with 1000 agents
  Running 2000.01.01 ( 0/76) (0.00 s)  ———————————————————— 1%
  Running 2002.01.01 (10/76) (0.03 s)  ••—————————————————— 14%
  Running 2004.01.01 (20/76) (0.06 s)  •••••——————————————— 28%
  Running 2006.01.01 (30/76) (0.09 s)  ••••••••———————————— 41%
  Running 2008.01.01 (40/76) (0.12 s)  ••••••••••—————————— 54%
  Running 2010.01.01 (50/76) (0.15 s)  •••••••••••••——————— 67%
  Running 2012.01.01 (60/76) (0.18 s)  ••••••••••••••••———— 80%
  Running 2014.01.01 (70/76) (0.20 s)  ••••••••••••••••••—— 93%
Figure(672x480)

You can see how this raster plot aligns with the peak of infections:

sim.diseases.sir.plot()
Figure(672x480)

(Note: ss.infection_log() also has an .animate() method, which we will leave for you to try out!)

Dynamics by age

This analyzer illustrates how you could track infections by age. Since it illustrates some additional key principles of building analyzers, here it is in full:

class dynamics_by_age(ss.Analyzer):
    def __init__(self, state, age_bins=(0, 20, 40, 100)):
        super().__init__()
        self.state = state
        self.age_bins = age_bins
        self.mins = age_bins[:-1]
        self.maxes = age_bins[1:]
        self.hist = {k: [] for k in self.mins}
        return

    def step(self):
        people = self.sim.people
        for min, max in zip(self.mins, self.maxes):
            mask = (people.age >= min) & (people.age < max)
            self.hist[min].append(people.states[self.state][mask].sum())
        return

    def finalize_results(self):
        """ Convert to an array """
        super().finalize_results()
        for k,hist in self.hist.items():
            self.hist[k] = np.array(hist)
        return

    def plot(self, **kwargs):
        kw = ss.plot_args(kwargs)
        with ss.style(**kw.style):
            fig = plt.figure(**kw.fig)
            for minage, maxage in zip(self.mins, self.maxes):
                plt.plot(self.sim.t.timevec, self.hist[minage], label=f'Age {minage}-{maxage}', **kw.plot)
            plt.legend(**kw.legend)
            plt.xlabel('Model time')
            plt.ylabel('Count')
            plt.ylim(bottom=0)
        return ss.return_fig(fig, **kw.return_fig)

# Demonstrate
by_age = dynamics_by_age('sis.infected', age_bins=(0, 10, 30, 100))
sim = ss.Sim(diseases='sis', networks='random', analyzers=by_age, copy_inputs=False)
sim.run()
by_age.plot()
Initializing sim with 10000 agents
  Running 2000.01.01 ( 0/51) (0.00 s)  ———————————————————— 2%
  Running 2010.01.01 (10/51) (0.07 s)  ••••———————————————— 22%
  Running 2020.01.01 (20/51) (0.13 s)  ••••••••———————————— 41%
  Running 2030.01.01 (30/51) (0.18 s)  ••••••••••••———————— 61%
  Running 2040.01.01 (40/51) (0.24 s)  ••••••••••••••••———— 80%
  Running 2050.01.01 (50/51) (0.30 s)  •••••••••••••••••••• 100%

Figure(672x480)

Since we are using a random network, we wouldn’t expect any differences in transission by age, so what you’re seeing here is the difference in age bin size.