hdf5_file_singleton module

HDF5 file singleton used by the HDF5 cache.

class gemseo.caches.hdf5_file_singleton.HDF5FileSingleton(*args, **kwargs)

Bases: object

Singleton to access an HDF file.

A lock guards access to the file from multiple threads or processes.

Parameters

hdf_file_path (str) – The path to the HDF file.

Return type

None
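The singleton-with-lock idea can be sketched with the standard library alone. This is a minimal illustration, not the GEMSEO implementation: the class name PathSingleton is hypothetical, the sketch assumes one shared instance per file path, and it uses threading.RLock where the documented lock attribute is a multiprocessing lock.

```python
import threading


class PathSingleton:
    """Hypothetical stand-in: one shared instance per file path."""

    _instances = {}  # maps a file path to its unique instance
    _instances_lock = threading.Lock()  # guards the registry itself

    def __new__(cls, file_path):
        with cls._instances_lock:
            instance = cls._instances.get(file_path)
            if instance is None:
                instance = super().__new__(cls)
                instance.file_path = file_path
                # Reentrant, so a method holding the lock can call
                # another method that acquires it again.
                instance.lock = threading.RLock()
                cls._instances[file_path] = instance
            return instance


first = PathSingleton("cache.hdf5")
second = PathSingleton("cache.hdf5")
other = PathSingleton("other.hdf5")
```

Here first and second are the same object and therefore share one lock, while other has its own. Sharing the lock is what lets every user of the same file serialize its reads and writes.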

clear(hdf_node_path)
Parameters

hdf_node_path (str) – The name of the HDF group to clear.

Return type

None

has_group(index, group, hdf_node_path)

Check if an entry has data corresponding to a given group.

Parameters
  • index (int) – The index of the entry.

  • group (str) – The name of the group.

  • hdf_node_path (str) – The name of the HDF group where the entries are stored.

Returns

Whether the entry has data for this group.

Return type

bool

read_data(index, group, hdf_node_path, h5_open_file=None)

Read the data for a given index and group.

Parameters
  • index (int) – The index of the entry.

  • group (str) – The name of the group.

  • hdf_node_path (str) – The name of the HDF group where the entries are stored.

  • h5_open_file (h5py.File | None) –

    The opened HDF file. Passing it improves performance but is incompatible with multiprocessing and multithreading. If None, the file is opened.

    By default it is set to None.

Returns

The group data and the input data hash.

Return type

tuple[Data | None, int | None]

read_hashes(hashes_to_indices, hdf_node_path)

Read the hashes in the HDF file.

Parameters
  • hashes_to_indices (dict[str, numpy.ndarray]) – The indices associated with the hashes.

  • hdf_node_path (str) – The name of the HDF group where the entries are stored.

Returns

The maximum index.

Return type

int
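The contract described above (fill a mapping from hashes to entry indices, then return the maximum index) can be sketched as follows. This is an assumption-laden illustration rather than the library code: collect_hashes and stored_hashes are hypothetical names, a plain dictionary stands in for the HDF file, integer hashes are assumed, and Python lists stand in for NumPy arrays.

```python
def collect_hashes(stored_hashes, hashes_to_indices):
    """Fill hashes_to_indices in place and return the maximum index.

    stored_hashes is a hypothetical stand-in for the file content:
    it maps an entry index to the hash of that entry's input data.
    """
    max_index = 0
    for index, hash_value in stored_hashes.items():
        # Several entries may share a hash, so each hash maps to a list.
        hashes_to_indices.setdefault(hash_value, []).append(index)
        max_index = max(max_index, index)
    return max_index


stored = {1: 111, 2: 222, 3: 111}
mapping = {}
highest = collect_hashes(stored, mapping)
```

The returned maximum index tells a cache which index to give the next entry it writes.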

classmethod update_file_format(hdf_file_path)

Update the format of an HDF5 file.

GEMSEO 3.2.0 added a HDF5FileSingleton.FILE_FORMAT_VERSION to the HDF5 files to support their maintenance and evolution. In particular, GEMSEO 3.2.0 fixed the hashing of the data dictionaries.

Parameters

hdf_file_path (str | Path) – The path to an HDF5 file.

Return type

None
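A version-gated update of this kind can be sketched as below. The sketch only illustrates the pattern and makes several assumptions: file_attributes is a hypothetical dictionary standing in for the file metadata, the "version" key name is invented for the example, a file without that key is treated as version 1, and migrate is a caller-supplied placeholder for the actual rewriting of the hashes.

```python
FILE_FORMAT_VERSION = 2  # value documented for HDF5FileSingleton


def needs_update(file_attributes):
    """Return whether the file predates the current format version.

    Assumption: a file written before versioning has no "version" key
    and is treated as version 1.
    """
    return file_attributes.get("version", 1) < FILE_FORMAT_VERSION


def update(file_attributes, migrate):
    """Run the migration once, then stamp the current version."""
    if needs_update(file_attributes):
        migrate()
        file_attributes["version"] = FILE_FORMAT_VERSION


calls = []
old_file = {}  # no version recorded
update(old_file, lambda: calls.append("migrated"))
update(old_file, lambda: calls.append("migrated"))  # already up to date
```

Stamping the version after migrating makes the update idempotent: running it a second time on the same file does nothing.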

write_data(data, group, index, hdf_node_path, h5_open_file=None)

Cache input data to avoid re-evaluation.

Parameters
  • data (Data) – The data containing the values of the names to cache.

  • group (str) – The name of the group, either AbstractFullCache._INPUTS_GROUP, AbstractFullCache._OUTPUTS_GROUP or AbstractFullCache._JACOBIAN_GROUP.

  • index (int) – The index of the entry in the cache.

  • hdf_node_path (str) – The name of the HDF group to store the entries.

  • h5_open_file (h5py.File | None) –

    The opened HDF file. Passing it improves performance but is incompatible with multiprocessing and multithreading. If None, the file is opened.

    By default it is set to None.

Return type

None
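The way write_data, has_group and read_data fit together can be illustrated with an in-memory stand-in. InMemoryStore is hypothetical and does not use HDF5 or GEMSEO. The group names "inputs" and "outputs" and the rule that the hash is computed from the input data are also assumptions made for the example.

```python
HASH_TAG = "hash"  # label documented for HDF5FileSingleton


class InMemoryStore:
    """Hypothetical stand-in for the HDF file: entries[index][group]."""

    def __init__(self):
        self.entries = {}

    def write_data(self, data, group, index):
        entry = self.entries.setdefault(index, {})
        entry[group] = dict(data)
        if group == "inputs":
            # Assumption: the hash is computed from the input data only.
            entry[HASH_TAG] = hash(tuple(sorted(data.items())))

    def has_group(self, index, group):
        return group in self.entries.get(index, {})

    def read_data(self, index, group):
        # Mirrors the documented return: the group data and the hash,
        # each of which is None when missing.
        entry = self.entries.get(index, {})
        return entry.get(group), entry.get(HASH_TAG)


store = InMemoryStore()
store.write_data({"x": 1.0}, "inputs", 1)
store.write_data({"y": 2.0}, "outputs", 1)
inputs, input_hash = store.read_data(1, "inputs")
missing = store.read_data(2, "inputs")
```

Reading an index that was never written yields (None, None), which is why callers typically check has_group or test the result against None before using it.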

FILE_FORMAT_VERSION: ClassVar[int] = 2

The version of the file format.

HASH_TAG: ClassVar[str] = 'hash'

The label for the hash.

hdf_file_path: str

The path to the HDF file.

lock: multiprocessing.context.BaseContext.RLock

The lock used for multithreaded and multiprocess access.