utils
Numerical utilities and other helper functions
Classes
| Name | Description |
|---|---|
| ndict | A dictionary-like class that provides additional functionalities for handling named items. |
| shrink | Define a class to indicate an object has been shrunken |
ndict
utils.ndict(
*args,
nameattr='name',
type=None,
strict=True,
overwrite=False,
**kwargs,
)
A dictionary-like class that provides additional functionalities for handling named items.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| nameattr | str | The attribute of the item to use as the dict key (i.e., all items should have this attribute defined) | 'name' |
| type | type | The expected type of items. | None |
| strict | bool | If True, only items with the specified attribute will be accepted. | True |
| overwrite | bool | Whether to allow adding an item whose key already exists, overwriting the existing entry | False |
Examples:
networks = ss.ndict(ss.MFNet(), ss.MaternalNet())
networks = ss.ndict([ss.MFNet(), ss.MaternalNet()])
networks = ss.ndict({'mf':ss.MFNet(), 'maternal':ss.MaternalNet()})
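The keying behaviour described in the parameter table can be sketched in plain Python. `NamedDict` and `Net` below are hypothetical stand-ins written for illustration, not Starsim classes, and raising `ValueError` on a duplicate key is an assumption; the real `ndict` may signal the conflict differently.

```python
class NamedDict(dict):
    """Hypothetical stand-in for a name-keyed dictionary (not Starsim's ndict)."""

    def __init__(self, *items, nameattr='name', overwrite=False):
        super().__init__()
        self.nameattr = nameattr
        self.overwrite = overwrite
        for item in items:
            self.add(item)

    def add(self, item):
        # Use the item's own attribute (by default "name") as the key
        key = getattr(item, self.nameattr)
        if key in self and not self.overwrite:
            raise ValueError(f'Key "{key}" is already present')  # Assumed error type
        self[key] = item


class Net:
    """Hypothetical named item."""

    def __init__(self, name):
        self.name = name


nets = NamedDict(Net('mf'), Net('maternal'))
keys = list(nets)

# With overwrite=False, adding a second item named 'mf' is rejected
try:
    nets.add(Net('mf'))
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

Keying on an attribute means callers never have to repeat the name when adding an item, at the cost of requiring every item to define that attribute.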
Methods
| Name | Description |
|---|---|
| copy | Shallow copy |
| extend | Add new items to the ndict, by item, list, or dict |
| get | Get an entry from this ndict |
| merge | Merge another dictionary with this one |
copy
utils.ndict.copy()
Shallow copy
extend
utils.ndict.extend(*args, **kwargs)
Add new items to the ndict, by item, list, or dict
get
utils.ndict.get(key, default=None, match_case=False)
Get an entry from this ndict
See also sim.get_module(). Note: if a type is supplied that matches more than one entry, this function only returns the first match. Use Sim.get_modules() to handle multiple matches.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| key | str / type | the object to get, either a type (e.g. ss.SIR) or a string (e.g. 'sir') | required |
| default | obj | what to return if not found (default None) | None |
| match_case | bool | if False (default), ignore case with string matching | False |
Example:
sim = ss.Sim(diseases=ss.SIR(name='MySIR'), networks='random')
sim.run()
# All return sim.diseases.MySIR
sim.diseases.get('MySIR')
sim.diseases.get('mysir')
sim.diseases.get(ss.SIR)
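The lookup rules for `get` (string keys matched case-insensitively unless `match_case` is set, type keys returning the first matching entry) can be sketched as follows. `get_entry`, `SIR`, and `SIS` are hypothetical names used only for this illustration; this is not Starsim's implementation.

```python
def get_entry(entries, key, default=None, match_case=False):
    """Illustrative lookup by type or by (optionally case-insensitive) string."""
    if isinstance(key, type):
        # A type may match several entries; only the first one is returned
        for value in entries.values():
            if isinstance(value, key):
                return value
        return default
    for name, value in entries.items():
        if name == key or (not match_case and name.lower() == key.lower()):
            return value
    return default


class SIR:  # Hypothetical module classes
    pass


class SIS:
    pass


diseases = {'MySIR': SIR(), 'sis': SIS()}
by_exact = get_entry(diseases, 'MySIR')
by_lower = get_entry(diseases, 'mysir')
by_type = get_entry(diseases, SIR)
strict_miss = get_entry(diseases, 'mysir', default='missing', match_case=True)
```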
merge
utils.ndict.merge(other)
Merge another dictionary with this one
shrink
utils.shrink()
Define a class to indicate an object has been shrunken
Functions
| Name | Description |
|---|---|
| apply_age_range | Return a boolean mask for values in arr that fall within the age range. |
| combine_rands | Efficient algorithm for combining two arrays of random integers into an array of floats. |
| find_contacts | Variation on Network.find_contacts() that avoids sorting. |
| format_axes | Standard formatting for axis results; not for the user |
| get_result_plot_label | Helper function for getting the label to plot for a result; not for the user |
| load | Alias to Sciris sc.loadany() |
| match_result_keys | Ensure that the user-provided keys match available ones, and raise an exception if not |
| nlist_to_dict | Convert a list of named items (e.g. modules, states) to a dictionary; not for the user |
| parse_age_range | Parse an age range string into lower and upper bounds. |
| plot_args | Process known plotting kwargs. |
| return_fig | Do postprocessing on the figure: by default, don’t return if in Jupyter, but show instead; not for the user |
| save | Alias to Sciris sc.save() |
| show | Shortcut for matplotlib.pyplot.show() |
| standardize_data | Standardize formats of input data |
| standardize_netkey | Networks can be upper or lowercase, and have a suffix ‘net’ or not; this function standardizes them |
| validate_sim_data | Validate data intended to be compared to the sim outputs, e.g. for calibration |
| warn | Helper function to handle warnings – shortcut to warnings.warn |
apply_age_range
utils.apply_age_range(age_string, arr)
Return a boolean mask for values in arr that fall within the age range.
For bracket/interval notation, respects inclusive [ ] vs exclusive ( ) bounds. For other formats, uses standard conventions:
- `'5-9'`, `'5 to 9'` → `[5, 9)` (inclusive lower, exclusive upper)
- `'<5'` → `[0, 5)`
- `'>95'` → `(95, inf)`
- `'95+'` → `[95, inf)`
- `'[15,25)'` → `>= 15` and `< 25`
- `'(15,25]'` → `> 15` and `<= 25`
Example usage:
>>> import numpy as np
>>> ages = np.array([14, 15, 20, 24, 25])
>>> ss.apply_age_range('[15,25)', ages)
array([False, True, True, True, False])
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| age_string | str | A string specifying an age range | required |
| arr | array | A numeric array to filter | required |
Returns
| Name | Type | Description |
|---|---|---|
| | | A boolean array of the same shape as arr. |
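To make the inclusive/exclusive bracket semantics concrete, here is a minimal pure-Python sketch that handles only the bracket notation. `interval_mask` is a hypothetical helper, not part of Starsim, and it returns a plain list rather than a NumPy array.

```python
import re


def interval_mask(age_string, values):
    """Illustrative mask for bracket notation such as '[15,25)' or '(15,25]'."""
    pattern = r'\s*([\[\(])\s*([^,]+?)\s*,\s*([^,]+?)\s*([\]\)])\s*'
    match = re.fullmatch(pattern, age_string)
    if match is None:
        raise ValueError(f'Not bracket notation: {age_string!r}')
    left, lower, upper, right = match.groups()
    lower, upper = float(lower), float(upper)
    mask = []
    for v in values:
        above = v >= lower if left == '[' else v > lower    # '[' is inclusive
        below = v <= upper if right == ']' else v < upper   # ']' is inclusive
        mask.append(above and below)
    return mask


ages = [14, 15, 20, 24, 25]
closed_open = interval_mask('[15,25)', ages)  # 15 included, 25 excluded
open_closed = interval_mask('(15,25]', ages)  # 15 excluded, 25 included
```

The two calls differ only at the boundary ages 15 and 25, which is exactly the distinction that `parse_age_range` discards.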
combine_rands
utils.combine_rands(a, b)
Efficient algorithm for combining two arrays of random integers into an array of floats.
See ss.multi_random() for the user-facing version.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| a | array | array of random integers between 0 and np.iinfo(np.uint64).max, as from ss.rand_raw() | required |
| b | array | ditto, same size as a | required |
Returns
| Name | Type | Description |
|---|---|---|
| | | A new array of random numbers the same size as a and b |
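As a rough illustration of how two unsigned 64-bit integers might be combined into one float in [0, 1], the sketch below mixes them with wrapped arithmetic and an XOR, then rescales. The specific mixing step is an assumption chosen for illustration and may not match what `combine_rands` actually does; `combine_pair` is a hypothetical name, and plain Python integers stand in for NumPy arrays.

```python
import random

MAX_U64 = 2**64 - 1  # Same value as np.iinfo(np.uint64).max


def combine_pair(a, b):
    """Mix two unsigned 64-bit integers into a float in [0, 1] (illustrative)."""
    product = (a * b) & MAX_U64      # Wrap around as uint64 arithmetic would
    difference = (a - b) & MAX_U64   # Likewise for the subtraction
    mixed = product ^ difference     # Assumed mixing step: XOR of the two
    return mixed / MAX_U64           # Rescale to the unit interval


rng = random.Random(1)
a = [rng.getrandbits(64) for _ in range(5)]
b = [rng.getrandbits(64) for _ in range(5)]
combined = [combine_pair(x, y) for x, y in zip(a, b)]
```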
find_contacts
utils.find_contacts(p1, p2, inds)
Variation on Network.find_contacts() that avoids sorting.
A set is returned here rather than a sorted array so that custom tracing interventions can efficiently add extra people. For a version with sorting by default, see Network.find_contacts(). Indices must be an int64 array, since this is what functions such as true() return by default.
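A minimal sketch of the set-based approach, assuming `p1` and `p2` are parallel sequences holding the two endpoints of each edge (that layout is an assumption, as is the helper name `find_contacts_sketch`). Plain lists are used here instead of int64 arrays.

```python
def find_contacts_sketch(p1, p2, inds):
    """Collect everyone who shares an edge with any index in inds (illustrative)."""
    targets = set(inds)
    contacts = set()
    for a, b in zip(p1, p2):
        if a in targets:
            contacts.add(b)  # b is a contact of a
        if b in targets:
            contacts.add(a)  # Edges are treated as bidirectional
    return contacts


# Edges: 0-1, 0-2, 1-3, 2-4
p1 = [0, 0, 1, 2]
p2 = [1, 2, 3, 4]
contacts_of_0 = find_contacts_sketch(p1, p2, inds=[0])
contacts_of_1_and_2 = find_contacts_sketch(p1, p2, inds=[1, 2])

# Because the result is a set, extra people can be added cheaply
contacts_of_0.add(99)
```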
format_axes
utils.format_axes(ax, res, n_ticks=None, show_module=None)
Standard formatting for axis results; not for the user
get_result_plot_label
utils.get_result_plot_label(res, show_module=None)
Helper function for getting the label to plot for a result; not for the user
load
utils.load(filename, **kwargs)
Alias to Sciris sc.loadany()
Since Starsim uses Sciris for saving objects, they can be loaded back using this function. This can also be used to load other objects of known type (e.g. JSON), although this usage is discouraged.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| filename | str / path | the name of the file to load | required |
| kwargs | dict | passed to sc.loadany() | {} |
Returns
| Name | Type | Description |
|---|---|---|
| | | The loaded object |
match_result_keys
utils.match_result_keys(results, key, show_skipped=False, flattened=False)
Ensure that the user-provided keys match available ones, and raise an exception if not
nlist_to_dict
utils.nlist_to_dict(nlist, die=True)
Convert a list of named items (e.g. modules, states) to a dictionary; not for the user
parse_age_range
utils.parse_age_range(age_string)
Parse an age range string into lower and upper bounds.
Extracts the numeric bounds from a variety of age range formats. Bracket/interval notation (e.g. [15,25) vs (15,25]) is accepted, but the brackets are stripped: all formats return a plain (lower, upper) tuple. To get a boolean mask that respects bracket semantics (inclusive vs exclusive bounds), use apply_age_range() instead.
Example usage:
>>> ss.parse_age_range('5-9')
(5.0, 9.0)
>>> ss.parse_age_range('[15,25)')
(15.0, 25.0)
>>> ss.parse_age_range('(15,25]')
(15.0, 25.0)
Supported formats
- '5-9' or '5 to 9'
- '<5': returns (0.0, 5.0)
- '95+': returns (95.0, np.inf)
- '>95': returns (95.0, np.inf)
- '[15,25)', '(15,25]', '[15,25]', '(15,25)': bracket notation; brackets are stripped, returns (15.0, 25.0) in all cases
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| age_string | str | A string specifying an age range | required |
Returns
| Name | Type | Description |
|---|---|---|
| | tuple | A tuple (lower, upper) as floats. |
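The supported formats can be summarised with a small stand-alone parser. `parse_range_sketch` is a hypothetical re-implementation for illustration only; it uses `float('inf')` where the documentation mentions `np.inf` (the two compare equal) and does no error handling for malformed input.

```python
import re


def parse_range_sketch(age_string):
    """Illustrative parser returning (lower, upper) as floats."""
    s = age_string.strip()
    if s.startswith('<'):
        return (0.0, float(s[1:]))
    if s.startswith('>'):
        return (float(s[1:]), float('inf'))
    if s.endswith('+'):
        return (float(s[:-1]), float('inf'))
    if s[0] in '[(' and s[-1] in '])':
        # Brackets are stripped; inclusivity information is discarded
        lower, upper = s[1:-1].split(',')
        return (float(lower), float(upper))
    # Remaining formats: '5-9' or '5 to 9'
    lower, upper = re.split(r'\s*(?:-|to)\s*', s)
    return (float(lower), float(upper))


dash = parse_range_sketch('5-9')
words = parse_range_sketch('5 to 9')
under = parse_range_sketch('<5')
plus = parse_range_sketch('95+')
bracket = parse_range_sketch('(15,25]')
```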
plot_args
utils.plot_args(kwargs=None, _debug=False, **defaults)
Process known plotting kwargs.
This function handles arguments to sim.plot() and other plotting functions by splitting known kwargs among all the different aspects of the plot.
Note: the kwargs supplied to the parent function should be supplied as the first argument of this function; keyword arguments to this function are treated as default values that will be overwritten by user-supplied values in kwargs. The argument “_debug” is used internally to print debugging output, but is not typically set by the user.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| fig_kw | dict | passed to sc.getrowscols(), then plt.subplots() and plt.figure() | required |
| plot_kw | dict | passed to plt.plot() | required |
| data_kw | dict | passed to plt.scatter(), for plotting the data | required |
| style_kw | dict | passed to sc.options.with_style(), for controlling the detailed plotting style | required |
| **kwargs | dict | parsed among the above dictionaries | None |
Returns
| Name | Type | Description |
|---|---|---|
| | | A dict-of-dicts with plotting arguments, for use with subsequent plotting commands |
Valid kwarg arguments are:
- fig: 'figsize', 'nrows', 'ncols', 'ratio', 'num', 'dpi', 'facecolor'
- plot: 'alpha', 'c', 'lw', 'linewidth', 'marker', 'markersize', 'ms'
- data: 'data_alpha', 'data_color', 'data_size'
- style: 'font', 'fontsize', 'interactive'
- return_fig: 'do_show', 'is_jupyter', 'is_reticulate'
Examples:
kw = ss.plot_args(kwargs, fig_kw=dict(figsize=(10,10))) # Explicit way to set figure size, passed to `plt.figure()` eventually
kw = ss.plot_args(kwargs, figsize=(10,10)) # Shortcut since known keyword
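The splitting of known keywords, with user-supplied values overriding defaults, might look roughly like the sketch below. `split_plot_kwargs` is a hypothetical simplification: it covers only the first four keyword groups listed above, ignores the explicit `fig_kw`-style dictionaries, and raising `ValueError` for unknown keys is an assumption about behaviour the documentation does not specify.

```python
KNOWN = {
    'fig': ['figsize', 'nrows', 'ncols', 'ratio', 'num', 'dpi', 'facecolor'],
    'plot': ['alpha', 'c', 'lw', 'linewidth', 'marker', 'markersize', 'ms'],
    'data': ['data_alpha', 'data_color', 'data_size'],
    'style': ['font', 'fontsize', 'interactive'],
}


def split_plot_kwargs(kwargs=None, **defaults):
    """Sort flat plotting kwargs into per-purpose dictionaries (illustrative)."""
    merged = {**defaults, **(kwargs or {})}  # User kwargs overwrite defaults
    out = {group: {} for group in KNOWN}
    for key, value in merged.items():
        for group, names in KNOWN.items():
            if key in names:
                out[group][key] = value
                break
        else:
            raise ValueError(f'Unrecognized plotting argument: {key}')  # Assumed
    return out


user_kwargs = {'figsize': (10, 10), 'alpha': 0.5}
kw = split_plot_kwargs(user_kwargs, figsize=(6, 4), lw=2)
```

Here the default `figsize=(6, 4)` is replaced by the user's `(10, 10)`, while the default `lw=2` survives because the user did not set it.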
return_fig
utils.return_fig(fig, **kwargs)
Do postprocessing on the figure: by default, don’t return if in Jupyter, but show instead; not for the user
save
utils.save(filename, obj, **kwargs)
Alias to Sciris sc.save()
While some Starsim objects have their own save methods, this function can be used to save any arbitrary object. It can then be loaded with ss.load().
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| filename | str / path | the name of the file to save | required |
| obj | any | the object to save | required |
| kwargs | dict | passed to sc.save() | {} |
show
utils.show(**kwargs)
Shortcut for matplotlib.pyplot.show()
standardize_data
utils.standardize_data(
data=None,
metadata=None,
min_year=1800,
out_of_range=0,
default_age=0,
default_year=2024,
)
Standardize formats of input data
Input data can arrive in many different forms. This function accepts a variety of data structures, and converts them into a Pandas Series containing one variable, based on specified metadata, or an ss.Dist if the data is already an ss.Dist object.
The metadata is a dictionary that defines columns of the dataframe or keys of the dictionary to use as indices in the output Series. It should contain:
- metadata['data_cols']['value'], specifying the name of the column/key to draw values from
- metadata['data_cols']['year'], optionally specifying the column containing year values; otherwise the default year will be used
- metadata['data_cols']['age'], optionally specifying the column containing age values; otherwise the default age will be used
- metadata['data_cols'][<arbitrary>], optionally specifying any other columns to use as indices. These will form part of the multi-index for the standardized Series output.
If a sex column is part of the index, the metadata can also optionally specify a string mapping to convert the sex labels in the input data into the ‘m’/‘f’ labels used by Starsim. In that case, the metadata can contain an additional key like metadata['sex_keys'] = {'Female':'f','Male':'m'}, which in this case would map the strings ‘Female’ and ‘Male’ in the original data to ‘f’ and ‘m’ respectively.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| data | (pandas.DataFrame, pandas.Series, dict, int, float) | An associative array or a number, with the input data to be standardized. | None |
| metadata | dict | Dictionary specifying index columns, the value column, and optionally mapping for sex labels | None |
| min_year | float | Optionally specify a minimum year allowed in the data. Default is 1800. | 1800 |
| out_of_range | float | Value to use for negative ages - typically 0 is a reasonable choice but other values (e.g., np.inf or np.nan) may be useful depending on the calculation. This will automatically be added to the dataframe with an age of -np.inf. | 0 |
Returns:
- A `pd.Series` for all supported formats of `data` *except* an `ss.Dist`. This series will contain index columns for 'year'
and 'age' (in that order) and then subsequent index columns for any other variables specified in the metadata, in the order
they appeared in the metadata (except for year and age appearing first).
- An `ss.Dist` instance - if the `data` input is an `ss.Dist`, that same object will be returned by this function
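To illustrate how the metadata drives the output index, the sketch below applies a metadata dictionary to a list of row dictionaries. Everything here is a simplified stand-in: `standardize_rows` is a hypothetical helper, the column names `Time`, `Sex`, and `rate` are invented for the example, and a plain dict keyed by (year, age, ...) tuples takes the place of the multi-indexed Pandas Series the real function returns.

```python
def standardize_rows(rows, metadata, default_age=0, default_year=2024):
    """Index values by (year, age, <other columns>) using metadata (illustrative)."""
    cols = metadata['data_cols']
    sex_keys = metadata.get('sex_keys', {})
    extra = [k for k in cols if k not in ('value', 'year', 'age')]
    out = {}
    for row in rows:
        # Fall back to the defaults when no year/age column is specified
        year = row[cols['year']] if 'year' in cols else default_year
        age = row[cols['age']] if 'age' in cols else default_age
        rest = []
        for k in extra:
            label = row[cols[k]]
            if k == 'sex':
                label = sex_keys.get(label, label)  # e.g. 'Female' -> 'f'
            rest.append(label)
        out[(year, age, *rest)] = row[cols['value']]
    return out


metadata = {
    'data_cols': {'year': 'Time', 'sex': 'Sex', 'value': 'rate'},  # Hypothetical columns
    'sex_keys': {'Female': 'f', 'Male': 'm'},
}
rows = [
    {'Time': 2020, 'Sex': 'Female', 'rate': 0.01},
    {'Time': 2020, 'Sex': 'Male', 'rate': 0.02},
]
series_like = standardize_rows(rows, metadata)
```

Because the example metadata has no age column, every key carries the default age of 0 in the second position, mirroring the year-then-age ordering described above.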
standardize_netkey
utils.standardize_netkey(key)
Networks can be upper or lowercase, and have a suffix ‘net’ or not; this function standardizes them
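One plausible normalisation is to lowercase the key and drop a trailing 'net'. The canonical form chosen here (lowercase, no suffix) is an assumption, since the documentation does not state which variant the function returns, and `standardize_key_sketch` is a hypothetical name.

```python
def standardize_key_sketch(key):
    """Lowercase a network key and strip a trailing 'net' (assumed canonical form)."""
    key = key.lower()
    return key[:-3] if key.endswith('net') else key


variants = ['MFNet', 'mfnet', 'MF', 'mf']
standardized = [standardize_key_sketch(k) for k in variants]
```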
validate_sim_data
utils.validate_sim_data(data=None, die=None)
Validate data intended to be compared to the sim outputs, e.g. for calibration
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| data | df / dict | a dataframe (or dict) of data, with a column “time” plus data columns of the form “module.result”, e.g. “hiv.new_infections” | None |
| die | bool | whether to raise an exception if the data cannot be converted (default: die if data is not None but cannot be converted) | None |
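The expected layout of the data can be shown with a small dict and a structural check. `check_sim_data` is a hypothetical helper that only verifies the shape described in the table (a "time" column plus "module.result" columns of equal length); the checks and conversions the real function performs are not documented here, so treat this purely as a sketch.

```python
def check_sim_data(data):
    """Check for a 'time' column plus 'module.result' columns (illustrative)."""
    if 'time' not in data:
        raise ValueError('Data must contain a "time" column')
    n = len(data['time'])
    for key, values in data.items():
        if key == 'time':
            continue
        module, sep, result = key.partition('.')
        if not (module and sep and result):
            raise ValueError(f'Column "{key}" is not of the form "module.result"')
        if len(values) != n:
            raise ValueError(f'Column "{key}" does not match the length of "time"')
    return True


data = {
    'time': [2020, 2021, 2022],
    'hiv.new_infections': [120, 110, 95],  # Made-up values for illustration
}
is_valid = check_sim_data(data)

try:
    check_sim_data({'time': [2020], 'new_infections': [1]})
    bad_key_rejected = False
except ValueError:
    bad_key_rejected = True
```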
warn
utils.warn(msg, category=None, verbose=None, die=None)
Helper function to handle warnings – shortcut to warnings.warn