utils

Numerical utilities and other helper functions

Classes

Name Description
ndict A dictionary-like class that provides additional functionalities for handling named items.

ndict

utils.ndict(
    self,
    *args,
    nameattr='name',
    type=None,
    strict=True,
    overwrite=False,
    **kwargs,
)

A dictionary-like class that provides additional functionalities for handling named items.

Parameters

Name Type Description Default
nameattr str The attribute of the item to use as the dict key (i.e., all items should have this attribute defined) 'name'
type type The expected type of items. None
strict bool If True, only items with the specified attribute will be accepted. True
overwrite bool If True, allow adding an item whose key is already present. False

Examples:

networks = ss.ndict(ss.MFNet(), ss.MaternalNet())
networks = ss.ndict([ss.MFNet(), ss.MaternalNet()])
networks = ss.ndict({'mf':ss.MFNet(), 'maternal':ss.MaternalNet()})
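
The keying behavior described above can be sketched without Starsim. This is a minimal illustration, not the library's implementation: `NamedDict` and `Net` are hypothetical names, and raising `ValueError` on a duplicate key is an assumption about how `overwrite=False` might behave.

```python
class NamedDict(dict):
    """Hypothetical stand-in for ss.ndict: keys each item by a name attribute."""

    def __init__(self, *args, nameattr='name', overwrite=False):
        super().__init__()
        self.nameattr = nameattr
        self.overwrite = overwrite
        self.extend(*args)

    def extend(self, *args):
        """Add items given individually, as a list/tuple, or as a dict."""
        for arg in args:
            if isinstance(arg, dict):
                for key, item in arg.items():
                    self._add(key, item)
            elif isinstance(arg, (list, tuple)):
                for item in arg:
                    self._add(getattr(item, self.nameattr), item)
            else:
                self._add(getattr(arg, self.nameattr), arg)
        return self

    def _add(self, key, item):
        # Assumed behavior: refuse duplicate keys unless overwrite is set
        if key in self and not self.overwrite:
            raise ValueError(f'Key "{key}" already present; set overwrite=True to replace it')
        self[key] = item


class Net:
    """Hypothetical item type that exposes a name attribute."""

    def __init__(self, name):
        self.name = name


nets = NamedDict(Net('mf'), [Net('maternal')])
print(list(nets.keys()))  # ['mf', 'maternal']
```

Keying on an attribute means the three call styles shown in the examples (separate items, a list, or a dict) all end up with the same lookup-by-name interface.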

Methods

Name Description
copy Shallow copy
extend Add new items to the ndict, by item, list, or dict
merge Merge another dictionary with this one

copy

utils.ndict.copy()

Shallow copy

extend

utils.ndict.extend(*args, **kwargs)

Add new items to the ndict, by item, list, or dict

merge

utils.ndict.merge(other)

Merge another dictionary with this one

Functions

Name Description
find_contacts Variation on Network.find_contacts() that avoids sorting.
load Alias to Sciris sc.loadany()
plot_args Process known plotting kwargs.
return_fig Postprocess the figure: by default, show it instead of returning it when running in Jupyter. For internal use; not intended to be called by the user.
save Alias to Sciris sc.save()
show Shortcut for matplotlib.pyplot.show()
standardize_data Standardize formats of input data
standardize_netkey Network keys can be uppercase or lowercase, and may or may not end in the suffix ‘net’; this function standardizes them
validate_sim_data Validate data intended to be compared to the sim outputs, e.g. for calibration
warn Helper function to handle warnings – shortcut to warnings.warn

find_contacts

utils.find_contacts(p1, p2, inds)

Variation on Network.find_contacts() that avoids sorting.

A set is returned here rather than a sorted array so that custom tracing interventions can efficiently add extra people. For a version that sorts by default, see Network.find_contacts(). The inds argument must be an int64 array, since that is what true() and related functions return by default.
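
The idea of returning an unsorted set can be sketched in plain Python. This is an illustrative stand-in, not the library's code: `find_contacts_sketch` is a hypothetical name, and plain lists are used here in place of the int64 arrays the real function expects.

```python
def find_contacts_sketch(p1, p2, inds):
    """Return the unsorted set of people who share an edge with anyone in inds."""
    inds = set(inds)  # set membership makes each lookup constant time
    contacts = set()
    for a, b in zip(p1, p2):  # each (a, b) pair is one edge in the network
        if a in inds:
            contacts.add(b)
        if b in inds:
            contacts.add(a)
    return contacts


# Edges: 0-1, 1-2, 2-3, 3-4
p1 = [0, 1, 2, 3]
p2 = [1, 2, 3, 4]
contacts = find_contacts_sketch(p1, p2, inds=[2])
contacts.add(9)  # a tracing intervention can cheaply add extra people
print(sorted(contacts))  # [1, 3, 9]
```

Because the result is a set, adding people afterwards needs no re-sorting or de-duplication; sorting is deferred until (and unless) an ordered array is actually required.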

load

utils.load(filename, **kwargs)

Alias to Sciris sc.loadany()

Since Starsim uses Sciris for saving objects, they can be loaded back using this function. This can also be used to load other objects of known type (e.g. JSON), although this usage is discouraged.

Parameters

Name Type Description Default
filename str / path the name of the file to load required
kwargs dict passed to sc.loadany() {}

Returns

Name Type Description
The loaded object

plot_args

utils.plot_args(kwargs=None, _debug=False, **defaults)

Process known plotting kwargs.

This function handles arguments to sim.plot() and other plotting functions by splitting known kwargs among all the different aspects of the plot.

Note: the kwargs supplied to the parent function should be supplied as the first argument of this function; keyword arguments to this function are treated as default values that will be overwritten by user-supplied values in kwargs. The argument “_debug” is used internally to print debugging output, but is not typically set by the user.

Parameters

Name Type Description Default
fig_kw dict passed to sc.getrowscols(), then plt.subplots() and plt.figure() required
plot_kw dict passed to plt.plot() required
data_kw dict passed to plt.scatter(), for plotting the data required
style_kw dict passed to sc.options.with_style(), for controlling the detailed plotting style required
**kwargs dict parsed among the above dictionaries None

Returns

Name Type Description
A dict-of-dicts with plotting arguments, for use with subsequent plotting commands

Valid kwarg arguments are:

- fig: 'figsize', 'nrows', 'ncols', 'ratio', 'num', 'dpi', 'facecolor'
- plot: 'alpha', 'c', 'lw', 'linewidth', 'marker', 'markersize', 'ms'
- data: 'data_alpha', 'data_color', 'data_size'
- style: 'font', 'fontsize', 'interactive'
- return_fig: 'do_show', 'is_jupyter', 'is_reticulate'

Examples:

kw = ss.plot_args(kwargs, fig_kw=dict(figsize=(10,10))) # Explicit way to set figure size, passed to `plt.figure()` eventually
kw = ss.plot_args(kwargs, figsize=(10,10)) # Shortcut since known keyword
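
The splitting of known keywords into groups can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: `split_plot_kwargs` and `KNOWN` are hypothetical names, the return_fig group is omitted, and raising `ValueError` for unrecognized keywords is a choice made for this sketch.

```python
# Known shortcut keywords for each group, taken from the list above
KNOWN = {
    'fig_kw': ['figsize', 'nrows', 'ncols', 'ratio', 'num', 'dpi', 'facecolor'],
    'plot_kw': ['alpha', 'c', 'lw', 'linewidth', 'marker', 'markersize', 'ms'],
    'data_kw': ['data_alpha', 'data_color', 'data_size'],
    'style_kw': ['font', 'fontsize', 'interactive'],
}


def split_plot_kwargs(kwargs=None, **defaults):
    """Sort keyword arguments into per-group dicts; user values override defaults."""
    out = {group: {} for group in KNOWN}
    for source in (defaults, kwargs or {}):  # processed in order, so kwargs win
        source = dict(source)  # copy so the caller's dict is not modified
        for group, names in KNOWN.items():
            out[group].update(source.pop(group, {}))  # explicit sub-dict, e.g. fig_kw=...
            for name in names:
                if name in source:
                    out[group][name] = source.pop(name)  # shortcut, e.g. figsize=...
        if source:  # assumed behavior: reject anything left over
            raise ValueError(f'Unrecognized plotting arguments: {sorted(source)}')
    return out


kw = split_plot_kwargs({'figsize': (10, 10), 'lw': 2}, dpi=150, lw=1)
print(kw['fig_kw'])   # {'dpi': 150, 'figsize': (10, 10)}
print(kw['plot_kw'])  # {'lw': 2}
```

Here `lw=1` acts as a default that the user-supplied `lw=2` replaces, while `dpi=150` survives because the user did not set it.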

return_fig

utils.return_fig(fig, **kwargs)

Postprocess the figure: by default, show it instead of returning it when running in Jupyter. This function is for internal use and is not intended to be called by the user.

save

utils.save(filename, obj, **kwargs)

Alias to Sciris sc.save()

While some Starsim objects have their own save methods, this function can be used to save any arbitrary object. It can then be loaded with ss.load().

Parameters

Name Type Description Default
filename str / path the name of the file to save required
obj any the object to save required
kwargs dict passed to sc.save() {}
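
The save-then-load round trip can be illustrated with the standard library alone. This sketch assumes that Sciris stores objects as compressed pickles; `save_sketch` and `load_sketch` are hypothetical stand-ins, not the functions documented here, and the filename is made up for the example.

```python
import gzip
import pickle
import tempfile
from pathlib import Path


def save_sketch(filename, obj):
    """Write any picklable object to a gzip-compressed file."""
    with gzip.open(filename, 'wb') as f:
        pickle.dump(obj, f)
    return Path(filename)


def load_sketch(filename):
    """Read back an object written by save_sketch()."""
    with gzip.open(filename, 'rb') as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as folder:
    path = save_sketch(Path(folder) / 'results.obj', {'beta': 0.05})
    restored = load_sketch(path)

print(restored)  # {'beta': 0.05}
```

As with any pickle-based format, files should only be loaded from sources that are trusted.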

show

utils.show(**kwargs)

Shortcut for matplotlib.pyplot.show()

standardize_data

utils.standardize_data(
    data=None,
    metadata=None,
    min_year=1800,
    out_of_range=0,
    default_age=0,
    default_year=2024,
)

Standardize formats of input data

Input data can arrive in many different forms. This function accepts a variety of data structures, and converts them into a Pandas Series containing one variable, based on specified metadata, or an ss.Dist if the data is already an ss.Dist object.

The metadata is a dictionary that defines columns of the dataframe or keys of the dictionary to use as indices in the output Series. It should contain:

  • metadata['data_cols']['value'] specifying the name of the column/key to draw values from
  • metadata['data_cols']['year'] optionally specifying the column containing year values; otherwise the default year will be used
  • metadata['data_cols']['age'] optionally specifying the column containing age values; otherwise the default age will be used
  • metadata['data_cols'][<arbitrary>] optionally specifying any other columns to use as indices. These will form part of the multi-index for the standardized Series output.

If a sex column is part of the index, the metadata can also optionally specify a string mapping to convert the sex labels in the input data into the ‘f’/‘m’ labels used by Starsim. For example, the metadata can contain an additional key like metadata['sex_keys'] = {'Female':'f','Male':'m'}, which maps the strings ‘Female’ and ‘Male’ in the original data to ‘f’ and ‘m’, respectively.
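
The metadata layout described above can be illustrated with a small pure-Python sketch. It is not the library's implementation: a plain dict keyed by tuples stands in for the multi-indexed Pandas Series, `index_records` is a hypothetical helper, and the column names 'Year', 'Age', 'Sex', and 'Rate' are invented for the example.

```python
records = [
    {'Year': 2000, 'Age': 0, 'Sex': 'Female', 'Rate': 0.010},
    {'Year': 2000, 'Age': 0, 'Sex': 'Male', 'Rate': 0.012},
]

metadata = {
    'data_cols': {'year': 'Year', 'age': 'Age', 'sex': 'Sex', 'value': 'Rate'},
    'sex_keys': {'Female': 'f', 'Male': 'm'},
}


def index_records(records, metadata, default_age=0, default_year=2024):
    """Key each value by (year, age, *other index columns), relabeling sex."""
    cols = metadata['data_cols']
    sex_keys = metadata.get('sex_keys', {})
    extra = [name for name in cols if name not in ('year', 'age', 'value')]
    out = {}
    for rec in records:
        # Year and age come first; fall back to the defaults if no column is given
        year = rec[cols['year']] if 'year' in cols else default_year
        age = rec[cols['age']] if 'age' in cols else default_age
        key = [year, age]
        for name in extra:  # remaining index columns, in metadata order
            label = rec[cols[name]]
            if name == 'sex':
                label = sex_keys.get(label, label)  # e.g. 'Female' -> 'f'
            key.append(label)
        out[tuple(key)] = rec[cols['value']]
    return out


print(index_records(records, metadata))
# {(2000, 0, 'f'): 0.01, (2000, 0, 'm'): 0.012}
```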

Parameters

Name Type Description Default
data (pandas.DataFrame, pandas.Series, dict, int, float) An associative array or a number, with the input data to be standardized. None
metadata dict Dictionary specifying index columns, the value column, and optionally a mapping for sex labels None
min_year float Optionally specify a minimum year allowed in the data. Default is 1800. 1800
out_of_range float Value to use for negative ages - typically 0 is a reasonable choice but other values (e.g., np.inf or np.nan) may be useful depending on the calculation. This will automatically be added to the dataframe with an age of -np.inf 0

Returns:

- A `pd.Series` for all supported formats of `data` *except* an `ss.Dist`. This series will contain index columns for 'year'
  and 'age' (in that order) and then subsequent index columns for any other variables specified in the metadata, in the order
  they appeared in the metadata (except for year and age appearing first).
- An `ss.Dist` instance - if the `data` input is an `ss.Dist`, that same object will be returned by this function

standardize_netkey

utils.standardize_netkey(key)

Network keys can be uppercase or lowercase, and may or may not end in the suffix ‘net’; this function standardizes them
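
One plausible reading of this standardization is sketched below: lowercase the key, then drop a trailing 'net' if present. The function name `standardize_netkey_sketch` is hypothetical, and the exact rules used by the library are an assumption here.

```python
def standardize_netkey_sketch(key):
    """Lowercase a network key and strip a trailing 'net' suffix, if any."""
    return key.lower().removesuffix('net')  # str.removesuffix needs Python 3.9+


for key in ['MFNet', 'mf', 'MaternalNet', 'maternal']:
    print(key, '->', standardize_netkey_sketch(key))
# MFNet -> mf
# mf -> mf
# MaternalNet -> maternal
# maternal -> maternal
```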

validate_sim_data

utils.validate_sim_data(data=None, die=None)

Validate data intended to be compared to the sim outputs, e.g. for calibration

Parameters

Name Type Description Default
data df / dict a dataframe (or dict) of data, with a column “time” plus data columns of the form “module.result”, e.g. “hiv.new_infections” None
die bool whether to raise an exception if the data cannot be converted (default: die if data is not None but cannot be converted) None
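
The expected layout of the data can be sketched with a plain dict of columns. This is an illustration of the "time" plus "module.result" convention only, not the library's validation logic; `check_sim_data_columns` is a hypothetical helper and the numbers are made up.

```python
def check_sim_data_columns(data):
    """Return the data column names, raising ValueError if the layout is wrong."""
    if 'time' not in data:
        raise ValueError('Data must include a "time" column')
    n_rows = len(data['time'])
    columns = [col for col in data if col != 'time']
    for col in columns:
        module, dot, result = col.partition('.')
        if not (module and dot and result):
            raise ValueError(f'Column "{col}" is not of the form "module.result"')
        if len(data[col]) != n_rows:
            raise ValueError(f'Column "{col}" does not match the length of "time"')
    return columns


data = {
    'time': [2020, 2021, 2022],
    'hiv.new_infections': [120, 110, 95],
}
print(check_sim_data_columns(data))  # ['hiv.new_infections']
```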

warn

utils.warn(msg, category=None, verbose=None, die=None)

Helper function to handle warnings – shortcut to warnings.warn
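
A helper of this kind might look like the sketch below, which either raises, warns, or stays silent. It is an assumption-laden illustration: `warn_sketch` is a hypothetical name, explicit booleans replace the `None` defaults in the signature above (which presumably defer to global options), and `RuntimeError` is an arbitrary choice of exception.

```python
import warnings


def warn_sketch(msg, category=None, verbose=True, die=False):
    """Raise if die is set; otherwise emit a warning unless verbose is off."""
    if die:
        raise RuntimeError(msg)
    if verbose:
        # stacklevel=2 attributes the warning to the caller of warn_sketch()
        warnings.warn(msg, category=category or UserWarning, stacklevel=2)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')  # make sure the warning is not filtered out
    warn_sketch('Parameter out of range')

print(len(caught))  # 1
```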