
hdf5_cache module

Caching module to avoid multiple evaluations of a discipline.

class gemseo.caches.hdf5_cache.HDF5Cache(hdf_file_path, hdf_node_path, tolerance=0.0, name=None)[source]

Bases: gemseo.core.cache.AbstractFullCache

Cache using an HDF5 file on disk to store the data.

Initialize a singleton to access an HDF5 file. This singleton is used for multithreaded/multiprocessing access with a Lock.

Initialize the cache tolerance. By default, the cache is not approximate. The user may choose to trade exact matching for CPU time by setting a small tolerance, e.g. 2 * finfo(float).eps.

Parameters
  • hdf_file_path (str) – Path of the HDF file.

  • hdf_node_path (str) – Node path within the HDF file.

  • tolerance (float) – Tolerance below which two input vectors are considered equal, in which case the cached data are returned. If 0, no approximation is made. Default: 0.

  • name (str) – Name of the cache.

Examples

>>> from gemseo.caches.hdf5_cache import HDF5Cache
>>> cache = HDF5Cache('my_cache.hdf5', 'my_node')
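A minimal sketch of the tolerance option; the tolerance value below is only an illustrative choice, not a recommendation:

>>> from numpy import finfo
>>> approx_cache = HDF5Cache('my_cache.hdf5', 'my_node', tolerance=2 * finfo(float).eps)
>>> # inputs closer than the tolerance may reuse cached outputs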
clear()[source]

Clear the cache.

Examples

>>> from gemseo.caches.hdf5_cache import HDF5Cache
>>> from numpy import array
>>> cache = HDF5Cache('my_cache.hdf5', 'my_node')
>>> for index in range(5):
...     data = {'x': array([1.]) * index, 'y': array([.2]) * index}
...     cache.cache_outputs(data, ['x'], data, ['y'])
>>> cache.get_length()
5
>>> cache.clear()
>>> cache.get_length()
0
get_data(index, **options)[source]

Get the data associated with a sample ID.

Parameters
  • index (str) – Sample ID.

  • options – Options passed to the _read_data() method.

Returns

The input data, output data and Jacobian.

Return type

dict
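A minimal sketch of retrieving a cached sample, assuming that sample IDs are assigned from 1 in caching order and that the returned dictionary is keyed by the inputs, outputs and jacobian group names:

>>> from gemseo.caches.hdf5_cache import HDF5Cache
>>> from numpy import array
>>> cache = HDF5Cache('my_cache.hdf5', 'my_node')
>>> data = {'x': array([1.]), 'y': array([2.])}
>>> cache.cache_outputs(data, ['x'], data, ['y'])
>>> sample = cache.get_data(1)  # assumption: the first cached sample has ID 1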

class gemseo.caches.hdf5_cache.HDF5FileSingleton(*args, **kwargs)[source]

Bases: object

Singleton to access an HDF5 file. Used for multithreaded/multiprocessing access with a Lock.

Constructor

Parameters

hdf_file_path – Path to the HDF5 file.

HASH_TAG = 'hash'
INPUTS_GROUP = 'inputs'
JACOBIAN_GROUP = 'jacobian'
OUTPUTS_GROUP = 'outputs'
clear(hdf_node_path)[source]

Clear the data in the cache.

Parameters

hdf_node_path – Node path to clear.

has_group(group_number, group_name, hdf_node_path)[source]

Check if a group is present in the HDF file.

Parameters
  • group_number – Number of the group.

  • group_name – Name of the group where data is written.

  • hdf_node_path – Name of the main HDF group.

Returns

True if the group exists

read_data(group_number, group_name, hdf_node_path, h5_open_file=None)[source]

Read a data dictionary from the HDF file.

Parameters
  • group_number – Number of the group.

  • group_name – Name of the group where data is written.

  • hdf_node_path – Name of the main HDF group.

  • h5_open_file – The already opened file, if any; passing it improves performance but is incompatible with multiprocessing/multithreading.

Returns

The data dictionary and the Jacobian.

read_hashes(hashes_dict, hdf_node_path)[source]

Read the hashes in the HDF file.

Parameters
  • hashes_dict – Dictionary of hashes to fill.

  • hdf_node_path – Name of the main HDF group.

Returns

The maximum group number (max_group).

write_data(data, data_names, group_name, group_num, hdf_node_path, h5_open_file=None)[source]

Cache input data to avoid re-evaluation.

Parameters
  • data – The data to cache.

  • data_names – List of data names.

  • group_name – The inputs, outputs or jacobian group.

  • group_num – Number of the group.

  • hdf_node_path – Name of the main HDF group.

  • h5_open_file – The already opened file, if any; passing it improves performance but is incompatible with multiprocessing/multithreading.
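
A minimal sketch of the low-level round trip through the singleton, using only the signatures documented above and assuming read_data returns the data dictionary and the Jacobian as documented; the file path, node name and group number are illustrative, and in normal use HDF5Cache drives these calls:

>>> from numpy import array
>>> from gemseo.caches.hdf5_cache import HDF5FileSingleton
>>> singleton = HDF5FileSingleton('my_cache.hdf5')
>>> inputs = {'x': array([1.])}
>>> singleton.write_data(inputs, ['x'], HDF5FileSingleton.INPUTS_GROUP, 1, 'my_node')
>>> singleton.has_group(1, HDF5FileSingleton.INPUTS_GROUP, 'my_node')
True
>>> data, jac = singleton.read_data(1, HDF5FileSingleton.INPUTS_GROUP, 'my_node')
>>> data['x']
array([1.])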