
hdf5_cache module

Caching module to avoid multiple evaluations of a discipline.

class gemseo.caches.hdf5_cache.HDF5Cache(hdf_file_path, hdf_node_path, tolerance=0.0, name=None)[source]

Bases: gemseo.core.cache.AbstractFullCache

Cache using an HDF5 file on disk to store the data.

Initialize a singleton to access an HDF5 file. This singleton is used for multithreaded/multiprocessing access with a Lock.

Initialize the cache tolerance. By default, the cache is not approximate. The user may choose to trade exact matching for CPU time by setting a small tolerance, e.g. 2 * finfo(float).eps.

Parameters
  • hdf_file_path (str) – Path of the HDF file.

  • hdf_node_path (str) – Node path within the HDF file.

  • tolerance (float) – Tolerance below which two input vectors are considered equal, in which case the cached data are returned. If 0, no approximation is made. Default: 0.

  • name (str) – Name of the cache.

Examples

>>> from gemseo.caches.hdf5_cache import HDF5Cache
>>> cache = HDF5Cache('my_cache.hdf5', 'my_node')
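A minimal sketch of the tolerance option; the tolerance value below is only an illustrative choice, not a recommendation:

>>> from numpy import finfo
>>> approx_cache = HDF5Cache('my_cache.hdf5', 'my_node', tolerance=2 * finfo(float).eps)
>>> # inputs closer than the tolerance may reuse cached outputs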
clear()[source]

Clear the cache.

Examples

>>> from gemseo.caches.hdf5_cache import HDF5Cache
>>> from numpy import array
>>> cache = HDF5Cache('my_cache.hdf5', 'my_node')
>>> for index in range(5):
...     data = {'x': array([1.]) * index, 'y': array([.2]) * index}
...     cache.cache_outputs(data, ['x'], data, ['y'])
>>> cache.get_length()
5
>>> cache.clear()
>>> cache.get_length()
0
get_data(index, **options)[source]

Get the data associated with a sample ID.

Parameters
  • index (str) – Sample ID.

  • options – Options passed to the _read_data() method.

Returns

The input data, output data and Jacobian.

Return type

dict
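A minimal sketch of retrieving a cached sample, assuming that sample IDs are assigned from 1 in caching order and that the returned dictionary is keyed by the inputs, outputs and jacobian group names:

>>> from gemseo.caches.hdf5_cache import HDF5Cache
>>> from numpy import array
>>> cache = HDF5Cache('my_cache.hdf5', 'my_node')
>>> data = {'x': array([1.]), 'y': array([2.])}
>>> cache.cache_outputs(data, ['x'], data, ['y'])
>>> sample = cache.get_data(1)  # assumption: the first cached sample has ID 1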

class gemseo.caches.hdf5_cache.HDF5FileSingleton(*args, **kwargs)[source]

Bases: object

Singleton to access an HDF5 file. Used for multithreaded/multiprocessing access with a Lock.

Constructor

Parameters

hdf_file_path – Path to the HDF5 file.

HASH_TAG = 'hash'
INPUTS_GROUP = 'inputs'
JACOBIAN_GROUP = 'jacobian'
OUTPUTS_GROUP = 'outputs'
clear(hdf_node_path)[source]

Clear the data in the cache.

Parameters

hdf_node_path – Node path to clear.

has_group(group_number, group_name, hdf_node_path)[source]

Check if a group is present in the HDF file.

Parameters
  • group_number – Number of the group.

  • group_name – Name of the group where data is written.

  • hdf_node_path – Name of the main HDF group.

Returns

True if the group exists

read_data(group_number, group_name, hdf_node_path, h5_open_file=None)[source]

Read a data dictionary from the HDF file.

Parameters
  • group_number – Number of the group.

  • group_name – Name of the group where data is written.

  • hdf_node_path – Name of the main HDF group.

  • h5_open_file – The already opened file, if any; passing it improves performance but is incompatible with multiprocessing/multithreading.

Returns

The data dictionary and the Jacobian.

read_hashes(hashes_dict, hdf_node_path)[source]

Read the hashes in the HDF file.

Parameters
  • hashes_dict – Dictionary of hashes to fill.

  • hdf_node_path – Name of the main HDF group.

Returns

The maximum group number (max_group).

write_data(data, data_names, group_name, group_num, hdf_node_path, h5_open_file=None)[source]

Cache input data to avoid re-evaluation.

Parameters
  • data – The data to cache.

  • data_names – List of data names.

  • group_name – The inputs, outputs or jacobian group.

  • group_num – Number of the group.

  • hdf_node_path – Name of the main HDF group.

  • h5_open_file – The already opened file, if any; passing it improves performance but is incompatible with multiprocessing/multithreading.
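
A minimal sketch of the low-level round trip through the singleton, using only the signatures documented above and assuming read_data returns the data dictionary and the Jacobian as documented; the file path, node name and group number are illustrative, and in normal use HDF5Cache drives these calls:

>>> from numpy import array
>>> from gemseo.caches.hdf5_cache import HDF5FileSingleton
>>> singleton = HDF5FileSingleton('my_cache.hdf5')
>>> inputs = {'x': array([1.])}
>>> singleton.write_data(inputs, ['x'], HDF5FileSingleton.INPUTS_GROUP, 1, 'my_node')
>>> singleton.has_group(1, HDF5FileSingleton.INPUTS_GROUP, 'my_node')
True
>>> data, jac = singleton.read_data(1, HDF5FileSingleton.INPUTS_GROUP, 'my_node')
>>> data['x']
array([1.])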