hdf5_cache module

Caching module to store all the entries in an HDF file.

class gemseo.caches.hdf5_cache.HDF5Cache(hdf_file_path='cache.hdf5', hdf_node_path='node', tolerance=0.0, name=None)[source]

Bases: AbstractFullCache

Cache using disk HDF5 file to store the data.

Parameters:
  • hdf_file_path (str | Path) –

    The path of the HDF file. Initializing the cache creates a singleton to access the HDF file; this singleton is used for multithreading/multiprocessing access with a lock.

    By default it is set to “cache.hdf5”.

  • hdf_node_path (str) –

    The name of the node of the HDF file.

    By default it is set to “node”.

  • name (str) – A name for the cache. If None, use hdf_node_name.

  • tolerance (float) –

    The tolerance below which two input arrays are considered equal.

    By default it is set to 0.0.

Warning

This class relies on some multiprocessing features; it is therefore necessary to protect its execution with an if __name__ == '__main__': statement when working on Windows. Currently, the use of an HDF5Cache is not supported in parallel on Windows platforms, due to the way subprocesses are forked on this architecture. The method DOEScenario.set_optimization_history_backup() is recommended as an alternative.
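
A minimal instantiation sketch using the guard recommended above; the file and node names are illustrative, not library defaults:

    from gemseo.caches.hdf5_cache import HDF5Cache

    if __name__ == "__main__":
        # Store entries in "my_cache.hdf5" under the HDF node "my_node".
        cache = HDF5Cache(
            hdf_file_path="my_cache.hdf5",
            hdf_node_path="my_node",
            tolerance=0.0,
        )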

cache_jacobian(input_data, jacobian_data)

Cache the input and Jacobian data.

Parameters:
  • input_data (Mapping[str, Any]) – The data containing the input data to cache.

  • jacobian_data (Mapping[str, Mapping[str, ndarray]]) – The data containing the Jacobian data to cache.

Return type:

None
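
A hedged sketch of caching a Jacobian, assuming GEMSEO's nested {output_name: {input_name: 2D ndarray}} convention for jacobian_data; the file name and values are illustrative:

    from numpy import array
    from gemseo.caches.hdf5_cache import HDF5Cache

    cache = HDF5Cache(hdf_file_path="my_cache.hdf5")
    input_data = {"x": array([1.0, 2.0])}
    # d(y)/d(x) as a 2D NumPy array (assumed convention).
    jacobian_data = {"y": {"x": array([[2.0, 0.5]])}}
    cache.cache_jacobian(input_data, jacobian_data)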

cache_outputs(input_data, output_data)

Cache input and output data.

Parameters:
  • input_data (Mapping[str, Any]) – The data containing the input data to cache.

  • output_data (Mapping[str, Any]) – The data containing the output data to cache.

Return type:

None
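
A short sketch of caching an evaluation; the variable names and values are illustrative:

    from numpy import array
    from gemseo.caches.hdf5_cache import HDF5Cache

    cache = HDF5Cache(hdf_file_path="my_cache.hdf5")
    cache.cache_outputs({"x": array([1.0, 2.0])}, {"y": array([3.0])})
    # The documented last_entry property now exposes this entry.
    entry = cache.last_entry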

clear()[source]

Clear the cache.

Return type:

None

export_to_dataset(name=None, by_group=True, categorize=True, input_names=None, output_names=None)

Build a Dataset from the cache.

Parameters:
  • name (str | None) – A name for the dataset. If None, use the name of the cache.

  • by_group (bool) –

    Whether to store the data by group in Dataset.data, in the sense of one unique NumPy array per group. If categorize is False, there is a unique group: Dataset.PARAMETER_GROUP. If categorize is True, the groups are stored in Dataset.INPUT_GROUP and Dataset.OUTPUT_GROUP. If by_group is False, store the data by variable names.

    By default it is set to True.

  • categorize (bool) –

    Whether to distinguish between the different groups of variables. Otherwise, group all the variables in Dataset.PARAMETER_GROUP.

    By default it is set to True.

  • input_names (Iterable[str] | None) – The names of the inputs to be exported. If None, use all the inputs.

  • output_names (Iterable[str] | None) – The names of the outputs to be exported. If None, use all the outputs. If an output name is also an input name, the output name is suffixed with [out].

Returns:

A dataset version of the cache.

Return type:

Dataset
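
A sketch of exporting the cache; the dataset name is illustrative and the keyword values shown are the documented defaults:

    from gemseo.caches.hdf5_cache import HDF5Cache

    cache = HDF5Cache(hdf_file_path="my_cache.hdf5")
    dataset = cache.export_to_dataset(name="doe_data", by_group=True, categorize=True)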

export_to_ggobi(file_path, input_names=None, output_names=None)

Export the cache to an XML file for the ggobi tool.

Parameters:
  • file_path (str) – The path of the file to which the cache is exported.

  • input_names (Iterable[str] | None) – The names of the inputs to export. If None, export all of them.

  • output_names (Iterable[str] | None) – The names of the outputs to export. If None, export all of them.

Return type:

None
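
A one-call sketch; the file name is illustrative, and all inputs and outputs are exported by default:

    from gemseo.caches.hdf5_cache import HDF5Cache

    cache = HDF5Cache(hdf_file_path="my_cache.hdf5")
    cache.export_to_ggobi("cache_for_ggobi.xml")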

get(k[, d]) → D[k] if k in D, else d. d defaults to None.
items() → a set-like object providing a view on D's items
keys() → a set-like object providing a view on D's keys
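
These views follow the standard mapping protocol stated above; the exact key and value types are determined by the cache implementation and are not specified here. A generic sketch:

    from gemseo.caches.hdf5_cache import HDF5Cache

    cache = HDF5Cache(hdf_file_path="my_cache.hdf5")
    # Iterate over the cached entries through the dict-like views.
    for key in cache.keys():
        # get() behaves like dict.get: it returns the value for key,
        # or the default d (None) when the key is absent.
        value = cache.get(key)
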
update(other_cache)

Update from another cache.

Parameters:

other_cache (AbstractFullCache) – The cache whose entries are used to update the current one.

Return type:

None
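
A sketch of merging two caches; both file names are illustrative:

    from gemseo.caches.hdf5_cache import HDF5Cache

    cache = HDF5Cache(hdf_file_path="my_cache.hdf5")
    other_cache = HDF5Cache(hdf_file_path="other_cache.hdf5")
    cache.update(other_cache)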

static update_file_format(hdf_file_path)[source]

Update the format of an HDF5 file.

Parameters:

hdf_file_path (str | Path) – The path to an HDF5 file.

Return type:

None
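
Since this is a static method, it can be called without instantiating the cache; the file name is illustrative (presumably a cache file in an older format, which is an assumption):

    from gemseo.caches.hdf5_cache import HDF5Cache

    HDF5Cache.update_file_format("old_cache.hdf5")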

values() → an object providing a view on D's values
property hdf_file: HDF5FileSingleton

The HDF file handler.

property hdf_node_name: str

The name of the HDF node.

property input_names: list[str]

The names of the inputs of the last entry.

property last_entry: CacheEntry

The last cache entry.

lock: RLock

The lock used for both multithreading and multiprocessing.

Ensure safe multiprocessing and multithreading concurrent access to the cache.

lock_hashes: RLock

The lock used for both multithreading and multiprocessing.

Ensure safe multiprocessing and multithreading concurrent access to the cache.

name: str

The name of the cache.

property names_to_sizes: dict[str, int]

The sizes of the variables of the last entry.

For a NumPy array, its size is used. For a container, its length is used. Otherwise, a size of 1 is used.
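
The sizing rule can be illustrated with a standalone sketch (this is not the library code):

    import numpy as np

    def size_of(value):
        """Mirror the documented rule: NumPy array -> size, container -> length, else 1."""
        if isinstance(value, np.ndarray):
            return value.size
        if hasattr(value, "__len__"):
            return len(value)
        return 1

    assert size_of(np.zeros((2, 3))) == 6
    assert size_of([1.0, 2.0]) == 2
    assert size_of(3.14) == 1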

property output_names: list[str]

The names of the outputs of the last entry.

tolerance: float

The tolerance below which two input arrays are considered equal.
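
A hedged sketch of the tolerance semantics: with a nonzero tolerance, an input array closer to a cached one than the tolerance should be treated as equal (file name and values are illustrative):

    from numpy import array
    from gemseo.caches.hdf5_cache import HDF5Cache

    cache = HDF5Cache(hdf_file_path="tol_cache.hdf5", tolerance=1e-8)
    cache.cache_outputs({"x": array([1.0])}, {"y": array([2.0])})
    # Per the definition above, array([1.0 + 1e-12]) lies within the
    # tolerance of the cached array([1.0]), so the cache treats it as
    # the same input.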

Examples using HDF5Cache

HDF5 cache