cache module

Caching module to avoid multiple evaluations of a discipline.

class gemseo.core.cache.AbstractCache(tolerance=0.0, name=None)

Bases: collections.abc.Mapping

An abstract base class for caches with a dictionary-like interface.

Caches are mainly used to store the MDODiscipline evaluations.

A cache entry is defined by:

  • input data in the form of a dictionary of objects associated with input names, i.e. {"input_name": object},

  • output data in the form of a dictionary of NumPy arrays associated with output names, i.e. {"output_name": array},

  • optional Jacobian data in the form of a nested dictionary of NumPy arrays associating output and input names, i.e. {"output_name": {"input_name": array}}.

Example

The evaluation of the function \(y=f(x)=(x^2, 2x^3)\) and its derivative at \(x=1\) leads to caching the entry defined by:

  • the input data: \(1.\),

  • the output data: \((1., 2.)\),

  • the Jacobian data: \((2., 6.)^T\).

>>> from numpy import array
>>> input_data = {"x": array([1.])}
>>> output_data = {"y": array([1., 2.])}
>>> jacobian_data = {"y": {"x": array([[2.], [6.]])}}
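
As a quick sanity check of the values above, the following standalone sketch evaluates the function and approximates its Jacobian at \(x=1\) with a central finite difference. It uses plain Python floats instead of NumPy arrays, and the helper names `f` and `finite_difference_jacobian` are illustrative assumptions, not part of GEMSEO.

```python
def f(x):
    """Return y = (x^2, 2x^3) as a tuple of floats."""
    return (x**2, 2.0 * x**3)


def finite_difference_jacobian(func, x, step=1e-6):
    """Approximate dy/dx component-wise with a central difference."""
    forward = func(x + step)
    backward = func(x - step)
    return tuple((fw - bw) / (2.0 * step) for fw, bw in zip(forward, backward))


outputs = f(1.0)
jacobian = finite_difference_jacobian(f, 1.0)

print(outputs)                                  # (1.0, 2.0)
print([round(value, 6) for value in jacobian])  # [2.0, 6.0]
```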

For this input_data, one can cache the output data:

>>> cache.cache_outputs(input_data, output_data)

as well as the Jacobian data:

>>> cache.cache_jacobian(input_data, jacobian_data)

Caches have an abc.Mapping interface, making them easy to set (cache[input_data] = (output_data, jacobian_data)), access (cache_entry = cache[input_data]) and update (cache.update(other_cache)).

Note

cache_entry is a CacheEntry with the ordered fields inputs, outputs and jacobian, accessible either by index, e.g. input_data = cache_entry[0], or by name, e.g. input_data = cache_entry.inputs.

One can also get the number of cache entries with size = len(cache) and iterate over the cache, e.g. for input_data, output_data, _ in cache, for index, (input_data, _, jacobian_data) in enumerate(cache) or [entry.outputs for entry in cache].
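
The dictionary-like behaviour described above can be illustrated with a toy class built on collections.abc.Mapping. This is only a sketch under stated assumptions: `ToyCache` and `Entry` are hypothetical names, lists of floats stand in for NumPy arrays, and nothing here reflects how GEMSEO actually stores or looks up its entries.

```python
from collections import namedtuple
from collections.abc import Mapping

# Hypothetical stand-in for a cache entry: three ordered, named fields.
Entry = namedtuple("Entry", ["inputs", "outputs", "jacobian"])


class ToyCache(Mapping):
    """A toy in-memory cache keyed by the input data (not GEMSEO's code)."""

    def __init__(self):
        self._entries = {}

    @staticmethod
    def _key(input_data):
        # Dictionaries are not hashable, so build a sorted tuple key instead.
        return tuple(sorted((name, tuple(values)) for name, values in input_data.items()))

    def _get_or_create(self, input_data):
        key = self._key(input_data)
        return key, self._entries.get(key, Entry(dict(input_data), None, None))

    def cache_outputs(self, input_data, output_data):
        key, entry = self._get_or_create(input_data)
        self._entries[key] = entry._replace(outputs=dict(output_data))

    def cache_jacobian(self, input_data, jacobian_data):
        key, entry = self._get_or_create(input_data)
        self._entries[key] = entry._replace(jacobian=dict(jacobian_data))

    def __getitem__(self, input_data):
        return self._entries[self._key(input_data)]

    def __iter__(self):
        # Following the text above, iteration yields entries rather than keys.
        return iter(self._entries.values())

    def __len__(self):
        return len(self._entries)


cache = ToyCache()
input_data = {"x": [1.0]}
cache.cache_outputs(input_data, {"y": [1.0, 2.0]})
cache.cache_jacobian(input_data, {"y": {"x": [[2.0], [6.0]]}})

entry = cache[input_data]
print(len(cache))                 # 1
print(entry.outputs)              # {'y': [1.0, 2.0]}
print(entry[0] is entry.inputs)   # True
print([e.jacobian for e in cache])
```

Caching the outputs and the Jacobian for the same input data updates a single entry, which is why the length stays at one.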

See also

SimpleCache to store the last discipline evaluation. MemoryFullCache to store all the discipline evaluations in memory. HDF5Cache to store all the discipline evaluations in an HDF5 file.

Parameters
  • tolerance (float) –

    The tolerance below which two input arrays are considered equal: norm(new_array-cached_array)/(1+norm(cached_array)) <= tolerance. If this is the case for all the input names, then the cached output data shall be returned rather than re-evaluating the discipline. A small positive tolerance, e.g. 2 * numpy.finfo(float).eps, can save CPU time by avoiding a re-evaluation when the inputs differ only by round-off.

    By default it is set to 0.0.

  • name (str | None) –

    A name for the cache. If None, use the class name.

    By default it is set to None.

Return type

None
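
The equality criterion given for the tolerance parameter can be written out as a short standalone sketch. It assumes the Euclidean norm (the text only says "norm") and uses lists of floats in place of NumPy arrays. The function names `norm`, `is_close` and `inputs_match` are illustrative, not GEMSEO functions.

```python
import math


def norm(values):
    """Euclidean norm of a sequence of floats (assumed norm)."""
    return math.sqrt(sum(value * value for value in values))


def is_close(new_array, cached_array, tolerance=0.0):
    """Apply norm(new - cached) / (1 + norm(cached)) <= tolerance."""
    difference = [new - old for new, old in zip(new_array, cached_array)]
    return norm(difference) / (1.0 + norm(cached_array)) <= tolerance


def inputs_match(new_inputs, cached_inputs, tolerance=0.0):
    """Return True if the criterion holds for every input name."""
    if new_inputs.keys() != cached_inputs.keys():
        return False
    return all(
        is_close(new_inputs[name], cached_inputs[name], tolerance)
        for name in new_inputs
    )


cached = {"x": [1.0, 2.0]}
perturbed = {"x": [1.0, 2.0 + 1e-12]}

print(inputs_match(cached, cached))                    # True: identical inputs
print(inputs_match(perturbed, cached))                 # False with tolerance=0.0
print(inputs_match(perturbed, cached, tolerance=1e-9)) # True: within tolerance
```

With the default tolerance of 0.0, only exactly equal inputs count as a match. A small positive value lets inputs that differ by round-off reuse the cached outputs.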

abstract cache_jacobian(input_data, jacobian_data)

Cache the input and Jacobian data.

Parameters
  • input_data (Mapping[str, Any]) – The input data to cache.

  • jacobian_data (Mapping[str, Mapping[str, numpy.ndarray]]) – The Jacobian data to cache.

Return type

None

abstract cache_outputs(input_data, output_data)

Cache input and output data.

Parameters
  • input_data (Mapping[str, Any]) – The input data to cache.

  • output_data (Mapping[str, Any]) – The output data to cache.

Return type

None

clear()

Clear the cache.

Return type

None

export_to_dataset(name=None, by_group=True, categorize=True, input_names=None, output_names=None)

Build a Dataset from the cache.

Parameters
  • name (str | None) –

    A name for the dataset. If None, use the name of the cache.

    By default it is set to None.

  • by_group (bool) –

    Whether to store the data by group in Dataset.data, in the sense of one unique NumPy array per group. If categorize is False, there is a unique group: Dataset.PARAMETER_GROUP. If categorize is True, the groups are stored in Dataset.INPUT_GROUP and Dataset.OUTPUT_GROUP. If by_group is False, store the data by variable names.

    By default it is set to True.

  • categorize (bool) –

    Whether to distinguish between the different groups of variables. Otherwise, group all the variables in Dataset.PARAMETER_GROUP.

    By default it is set to True.

  • input_names (Iterable[str] | None) –

    The names of the inputs to be exported. If None, use all the inputs.

    By default it is set to None.

  • output_names (Iterable[str] | None) –

    The names of the outputs to be exported. If None, use all the outputs.

    By default it is set to None.

Returns

A dataset version of the cache.

Return type

Dataset
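
The effect of by_group and categorize on the layout of the exported data can be pictured with plain dictionaries. The sketch below is a simplified illustration only: the function `arrange`, the helper `flatten` and the three group-name strings are assumptions made for the example, and the real Dataset class is not used.

```python
# Placeholder group names, chosen for illustration.
INPUT_GROUP = "inputs"
OUTPUT_GROUP = "outputs"
PARAMETER_GROUP = "parameters"


def flatten(entry, names):
    """Concatenate the values of the named variables into one flat row."""
    variables = {**entry[0], **entry[1]}  # merge input and output data
    return [value for name in names for value in variables[name]]


def arrange(entries, input_names, output_names, by_group=True, categorize=True):
    """Arrange (input_data, output_data) pairs as {key: rows}, one row per entry."""
    all_names = [*input_names, *output_names]
    if not by_group:
        # One block of rows per variable name.
        return {name: [flatten(e, [name]) for e in entries] for name in all_names}
    if categorize:
        # One block of rows per group: inputs and outputs kept apart.
        return {
            INPUT_GROUP: [flatten(e, input_names) for e in entries],
            OUTPUT_GROUP: [flatten(e, output_names) for e in entries],
        }
    # A single group holding all the variables.
    return {PARAMETER_GROUP: [flatten(e, all_names) for e in entries]}


entries = [
    ({"x": [1.0]}, {"y": [1.0, 2.0]}),
    ({"x": [2.0]}, {"y": [4.0, 16.0]}),
]

print(arrange(entries, ["x"], ["y"]))
print(arrange(entries, ["x"], ["y"], categorize=False))
print(arrange(entries, ["x"], ["y"], by_group=False))
```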

get(k[, d]) → D[k] if k in D, else d. d defaults to None.

items() → a set-like object providing a view on D's items

keys() → a set-like object providing a view on D's keys

values() → an object providing a view on D's values

property input_names: list[str]

The names of the inputs of the last entry.

abstract property last_entry: gemseo.core.cache.CacheEntry

The last cache entry.

name: str

The name of the cache.

property names_to_sizes: dict[str, int]

The sizes of the variables of the last entry.

property output_names: list[str]

The names of the outputs of the last entry.

tolerance: float

The tolerance below which two input arrays are considered equal.

class gemseo.core.cache.AbstractFullCache(tolerance=0.0, name=None)

Bases: gemseo.core.cache.AbstractCache

Abstract cache to store all the data, either in memory or on the disk.

See also

MemoryFullCache: store all the data in memory. HDF5Cache: store all the data in an HDF5 file.

Parameters
  • tolerance (float) –

    The tolerance below which two input arrays are considered equal: norm(new_array-cached_array)/(1+norm(cached_array)) <= tolerance. If this is the case for all the input names, then the cached output data shall be returned rather than re-evaluating the discipline. A small positive tolerance, e.g. 2 * numpy.finfo(float).eps, can save CPU time by avoiding a re-evaluation when the inputs differ only by round-off.

    By default it is set to 0.0.

  • name (str | None) –

    A name for the cache. If None, use the class name.

    By default it is set to None.

Return type

None

cache_jacobian(input_data, jacobian_data)

Cache the input and Jacobian data.

Parameters
  • input_data (Mapping[str, Any]) – The input data to cache.

  • jacobian_data (Mapping[str, Mapping[str, numpy.ndarray]]) – The Jacobian data to cache.

Return type

None

cache_outputs(input_data, output_data)

Cache input and output data.

Parameters
  • input_data (Mapping[str, Any]) – The input data to cache.

  • output_data (Mapping[str, Any]) – The output data to cache.

Return type

None

clear()

Clear the cache.

Return type

None

export_to_dataset(name=None, by_group=True, categorize=True, input_names=None, output_names=None)

Build a Dataset from the cache.

Parameters
  • name (str | None) –

    A name for the dataset. If None, use the name of the cache.

    By default it is set to None.

  • by_group (bool) –

    Whether to store the data by group in Dataset.data, in the sense of one unique NumPy array per group. If categorize is False, there is a unique group: Dataset.PARAMETER_GROUP. If categorize is True, the groups are stored in Dataset.INPUT_GROUP and Dataset.OUTPUT_GROUP. If by_group is False, store the data by variable names.

    By default it is set to True.

  • categorize (bool) –

    Whether to distinguish between the different groups of variables. Otherwise, group all the variables in Dataset.PARAMETER_GROUP.

    By default it is set to True.

  • input_names (Iterable[str] | None) –

    The names of the inputs to be exported. If None, use all the inputs.

    By default it is set to None.

  • output_names (Iterable[str] | None) –

    The names of the outputs to be exported. If None, use all the outputs.

    By default it is set to None.

Returns

A dataset version of the cache.

Return type

Dataset

export_to_ggobi(file_path, input_names=None, output_names=None)

Export the cache to an XML file for the GGobi tool.

Parameters
  • file_path (str) – The path of the file to which the cache is exported.

  • input_names (Iterable[str] | None) –

    The names of the inputs to export. If None, export all of them.

    By default it is set to None.

  • output_names (Iterable[str] | None) –

    The names of the outputs to export. If None, export all of them.

    By default it is set to None.

Return type

None

get(k[, d]) → D[k] if k in D, else d. d defaults to None.

items() → a set-like object providing a view on D's items

keys() → a set-like object providing a view on D's keys

update(other_cache)

Update from another cache.

Parameters

other_cache (gemseo.core.cache.AbstractFullCache) – The cache used to update the current one.

Return type

None

values() → an object providing a view on D's values

property input_names: list[str]

The names of the inputs of the last entry.

property last_entry: gemseo.core.cache.CacheEntry

The last cache entry.

lock: multiprocessing.context.BaseContext.RLock

The lock used for both multithreading and multiprocessing.

Ensure safe multiprocessing and multithreading concurrent access to the cache.

lock_hashes: multiprocessing.context.BaseContext.RLock

The lock used for both multithreading and multiprocessing.

Ensure safe multiprocessing and multithreading concurrent access to the cache.
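
The role of a re-entrant lock in guarding concurrent writes can be sketched as follows. This is an assumption-laden illustration: it uses threading.RLock as a stand-in for the multiprocessing lock named above, a plain dictionary as the store, and a hypothetical function `cache_outputs` that is unrelated to the method of the same name.

```python
import threading

lock = threading.RLock()  # stand-in for the multiprocessing RLock
entries = {}              # toy store: {key: {output_name: value}}


def cache_outputs(key, outputs):
    """Merge outputs into the entry for key while holding the lock."""
    with lock:
        # An RLock is re-entrant: the owning thread may acquire it again,
        # e.g. when one locked method calls another locked method.
        with lock:
            entries.setdefault(key, {}).update(outputs)


threads = [
    threading.Thread(target=cache_outputs, args=(i % 5, {f"y{i}": float(i)}))
    for i in range(50)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(len(entries))                                  # 5
print(sum(len(outputs) for outputs in entries.values()))  # 50
```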

name: str

The name of the cache.

property names_to_sizes: dict[str, int]

The sizes of the variables of the last entry.

property output_names: list[str]

The names of the outputs of the last entry.

tolerance: float

The tolerance below which two input arrays are considered equal.

class gemseo.core.cache.CacheEntry(inputs, outputs=None, jacobian=None)

Bases: tuple

The entry of a cache.

Create new instance of CacheEntry(inputs, outputs, jacobian)

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

inputs

The input data as dict[str, ndarray].

jacobian

The Jacobian data as dict[str, dict[str, ndarray]].

outputs

The output data as dict[str, ndarray].
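
Since the entry is a tuple with named fields, its behaviour can be approximated with collections.namedtuple. The sketch below assumes a hypothetical `Entry` type that mirrors the documented signature, with outputs and jacobian defaulting to None. Lists of floats stand in for NumPy arrays.

```python
from collections import namedtuple

# Hypothetical re-creation: three ordered fields, the last two optional.
Entry = namedtuple("Entry", ["inputs", "outputs", "jacobian"], defaults=(None, None))

entry = Entry({"x": [1.0]}, outputs={"y": [1.0, 2.0]})

# Fields can be unpacked in order, read by index or read by name.
inputs, outputs, jacobian = entry

print(entry[1] is entry.outputs)  # True
print(entry.jacobian)             # None
print(entry.index(None))          # 2
print(entry.count(None))          # 1
```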

gemseo.core.cache.hash_data_dict(data)

Hash data using xxh3_64 from the xxhash library.

Parameters

data (Mapping[str, ndarray | int | float]) – The data to hash.

Returns

The hash value of the data.

Return type

int

Examples

>>> from gemseo.core.cache import hash_data_dict
>>> from numpy import array
>>> data = {'x':array([1.,2.]),'y':array([3.])}
>>> hash_data_dict(data)
13252388834746642440
>>> hash_data_dict(data,'x')
4006190450215859422
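
To show the general idea of reducing a data dictionary to a single 64-bit integer, here is a standalone sketch. It is not the GEMSEO implementation: hashlib.blake2b with an 8-byte digest stands in for xxh3_64, lists of floats stand in for NumPy arrays, and visiting the names in sorted order is an assumption made so that the result does not depend on insertion order. The name `toy_hash_data_dict` is hypothetical, and its values will not match those printed above.

```python
import hashlib
import struct


def toy_hash_data_dict(data):
    """Hash a {name: sequence of floats} mapping to a 64-bit integer."""
    hasher = hashlib.blake2b(digest_size=8)  # stand-in for xxh3_64
    for name in sorted(data):  # assumed: order-independent hashing
        hasher.update(name.encode("utf-8"))
        values = data[name]
        # Pack the floats as little-endian doubles before hashing.
        hasher.update(struct.pack(f"<{len(values)}d", *values))
    return int.from_bytes(hasher.digest(), "little")


data = {"x": [1.0, 2.0], "y": [3.0]}
reordered = {"y": [3.0], "x": [1.0, 2.0]}
modified = {"x": [1.0, 2.5], "y": [3.0]}

print(toy_hash_data_dict(data) == toy_hash_data_dict(reordered))  # True
print(toy_hash_data_dict(data) == toy_hash_data_dict(modified))   # False
```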

gemseo.core.cache.to_real(data)

Convert a NumPy array to a float NumPy array.

Parameters

data (numpy.ndarray) – The NumPy array to be converted to real.

Returns

A float NumPy array.

Return type

numpy.ndarray