Database

class Database(name='', input_space=None)[source]

Storage of MDOFunction evaluations.

A Database is typically attached to an OptimizationProblem to store the evaluations of its objective, constraints and observables.

A Database can thus be an optimization history or, in the case of a DOE, a collection of samples.

It is useful when simulations are costly because it avoids re-evaluating functions at points where they have already been evaluated.

See also

NormDBFunction

It can also be post-processed by a BasePost to visualize its content, e.g. OptHistoryView, which generates a series of graphs to visualize the histories of the objective, constraints and design variables.

A Database can be saved to an HDF file for portability and cold post-processing with its method to_hdf(). A database can also be initialized from an HDF file as database = Database.from_hdf(file_path).

Note

Saving an OptimizationProblem to an HDF file using its method to_hdf also saves its Database.

The database is based on a two-level dictionary-like mapping such as {x: {output_name: output_value, ...}, ...} with:

x: the input value as a HashableNdarray wrapping a NumPy array that can be accessed as x.array; if the input variables have different types, they are promoted to the single type that can represent all of them, for instance integers are promoted to floats; if the user does not provide an input space at instantiation, then after the first call to the store() method, the input_space includes a single variable called DEFAULT_INPUT_NAME, with the right dimension;

output_name: the name of the function that has been evaluated at x, the name of its gradient (the gradient of a function called "f" is typically denoted as "@f") or any additional information related to the methods which use the database;

output_value: the output value, typically a float or a 1D-array for a function output, a 1D- or 2D-array for a gradient or a list for the iteration.
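
The two-level mapping described above can be pictured with a minimal pure-Python sketch. It is not GEMSEO's implementation: the `to_key` and `store` helpers are hypothetical, and a plain tuple of floats stands in for HashableNdarray only to keep the example self-contained.

```python
# Hypothetical sketch of a two-level mapping {x: {output_name: output_value}}.
# A tuple of floats stands in for HashableNdarray (assumption for illustration).


def to_key(values):
    """Promote all the components to float and return a hashable key."""
    return tuple(float(value) for value in values)


database = {}


def store(x, outputs):
    """Merge the outputs into the entry associated with the input value x."""
    database.setdefault(to_key(x), {}).update(outputs)


store([1, 2.5], {"f": 3.0})
store([1, 2.5], {"@f": [1.0, 1.0]})  # same input value: the entry is enriched
store([0, 0], {"f": 0.0})

# The integer 1 was promoted to 1.0, so this key reaches the same entry.
entry = database[to_key([1.0, 2.5])]
```

Because the key is hashable, looking up an already evaluated input value is a dictionary access, which is what makes it cheap to avoid re-evaluating a costly function.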

Parameters:

name (str) --
The name to be given to the database. If empty, use the class name.

By default it is set to "".
input_space (DesignSpace | None) -- The input space associated with this database. If None, create a default DesignSpace.

classmethod from_hdf(file_path='optimization_history.h5', name='', hdf_node_path='', log=True)[source]

Create a database from an HDF file.

The order of the values for each key is not guaranteed to be preserved.

Parameters:

file_path (str | Path) --
The path of the HDF file.

By default it is set to "optimization_history.h5".
name (str) --
The name of the database.

By default it is set to "".
hdf_node_path (str) --
The path of the HDF node from which the database should be imported. If empty, the root node is considered.

By default it is set to "".
log (bool) --
Whether to log the import of the database.

By default it is set to True.

Returns:

The database defined in the file.

Return type:

Database

classmethod get_gradient_name(name)[source]

Return the name of the gradient related to a function.

This name is the concatenation of a GRAD_TAG, e.g. '@', and the name of the function, e.g. 'f'. With this example, the name of the gradient is '@f'.

Parameters: name (str) -- The name of a function.
Returns: The name of the gradient based on the name of the function.
Return type: str
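
The naming rule can be sketched in a few lines. The standalone function `gradient_name` is a hypothetical stand-in written for illustration; only the tag value "@" and the resulting name "@f" come from the description above.

```python
GRAD_TAG = "@"  # same value as the documented GRAD_TAG class attribute


def gradient_name(name: str) -> str:
    """Return the gradient name as the tag followed by the function name."""
    return f"{GRAD_TAG}{name}"


grad_f = gradient_name("f")
```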

static get_hashable_ndarray(original_array, copy=False)[source]

Convert an array to a hashable array.

This hashable array basically represents a key of the database.

Parameters:

original_array (ndarray | HashableNdarray) -- An array.
copy (bool) --
Whether to copy the original array.

By default it is set to False.

Returns:

A hashable array wrapping the original array.

Raises:

KeyError -- If the original array is neither an array nor a HashableNdarray.

Return type:

HashableNdarray

add_new_iter_listener(function, output_names=())[source]

Add a function to be called when a new iteration is stored to the database.

Parameters:

function (ListenerType) -- The function to be called; it must have one argument that is the current input value.
output_names (Iterable[str]) --
The names of the output variables whose values are to be stored in the database by this listener.

By default it is set to ().

Returns:

Whether the function has been added; otherwise, it was already attached to the database.

Return type:

bool
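
The "added or already attached" return value can be illustrated with a minimal observer sketch. `ListenerRegistry` and `record` are hypothetical names, and the class is an assumption about the general pattern, not the GEMSEO code.

```python
class ListenerRegistry:
    """Hypothetical registry of callbacks taking the current input value."""

    def __init__(self):
        self._listeners = []

    def add(self, function) -> bool:
        """Attach a function; return False if it was already attached."""
        if function in self._listeners:
            return False
        self._listeners.append(function)
        return True

    def notify(self, x_vect) -> None:
        """Call every attached function with the input value."""
        for function in self._listeners:
            function(x_vect)


seen = []


def record(x_vect):
    """Keep track of the input values passed to the listener."""
    seen.append(x_vect)


registry = ListenerRegistry()
first = registry.add(record)   # newly attached
second = registry.add(record)  # already attached, so not added again
registry.notify((1.0, 2.0))    # record is called exactly once
```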

add_store_listener(function, output_names=())[source]

Add a function to be called when an item is stored to the database.

Parameters:

function (ListenerType) -- The function to be called.
output_names (Iterable[str]) --
The names of the output variables whose values are to be stored in the database by this listener.

By default it is set to ().

Returns:

Whether the function has been added; otherwise, it was already attached to the database.

Return type:

bool

check_output_history_is_empty(output_name)[source]

Check if the history of an output is empty.

Parameters: output_name (str) -- The name of the output.
Returns: Whether the history of the output is empty.
Return type: bool

clear()[source]

Clear the database.

Return type: None

clear_from_iteration(iteration)[source]

Delete the items after a given iteration.

Parameters: iteration (int) -- An iteration between 1 and the number of iterations; it can also be a negative integer if counting from the last iteration (e.g. -2 for the penultimate iteration).
Return type: None

clear_listeners(new_iter_listeners=(), store_listeners=())[source]

Clear all the listeners.

Parameters:

new_iter_listeners (Iterable[ListenerType] | None) --
The functions to be removed that were notified of a new iteration. If empty, remove all such functions. If None, keep all these functions.

By default it is set to ().
store_listeners (Iterable[ListenerType] | None) --
The functions to be removed that were notified of a new entry in the database. If empty, remove all such functions. If None, keep all these functions.

By default it is set to ().

Returns:

The listeners that were notified of a new iteration and the listeners that were notified of a new entry in the database.

Return type:

tuple[Iterable[ListenerType], Iterable[ListenerType]]

filter(output_names)[source]

Keep only some outputs and remove the other ones.

Parameters: output_names (Iterable[str]) -- The names of the outputs that must be kept.
Return type: None
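
On a plain two-level dictionary, the effect of such a filter could look like the sketch below. The function `filter_outputs` and the tuple keys are hypothetical. Whether GEMSEO treats gradient names or internal entries specially is not covered here.

```python
def filter_outputs(database, output_names):
    """Keep only the requested outputs in each entry, in place."""
    kept = set(output_names)
    for outputs in database.values():
        # Iterate over a copy of the names since the entry is modified.
        for name in list(outputs):
            if name not in kept:
                del outputs[name]


database = {
    (0.0,): {"f": 1.0, "g": 2.0, "@f": [0.5]},
    (1.0,): {"f": 3.0, "g": 4.0},
}
filter_outputs(database, ["f"])
```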

get_function_history(function_name, with_x_vect=False)[source]

Return the history of a function output.

Parameters:

function_name (str) -- The name of the function.
with_x_vect (bool) --
Whether to also return the input history.

By default it is set to False.

Returns:

The history of the function output, and possibly the input history.

Raises:

KeyError -- When the database contains no output value for this function.

Return type:

ndarray | tuple[ndarray, ndarray]

get_function_names(skip_grad=True)[source]

Return the names of the outputs contained in the database.

Parameters:

skip_grad (bool) --

Whether to skip the names of gradient functions.

By default it is set to True.

Returns:

The names of the outputs in alphabetical order.

Return type:

list[str]
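
A minimal sketch of this behavior on a plain dictionary, assuming that gradient names are recognized by the "@" prefix. The helper `function_names` is hypothetical and may differ from what GEMSEO does with other internal entries.

```python
GRAD_TAG = "@"  # prefix of gradient names, as documented for the class


def function_names(database, skip_grad=True):
    """Collect the output names of all the entries, sorted alphabetically."""
    names = set()
    for outputs in database.values():
        names.update(outputs)
    if skip_grad:
        names = {name for name in names if not name.startswith(GRAD_TAG)}
    return sorted(names)


database = {
    (0.0,): {"g": 2.0, "f": 1.0, "@f": [0.5]},
    (1.0,): {"f": 3.0},
}
without_grad = function_names(database)
with_grad = function_names(database, skip_grad=False)
```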

get_function_value(function_name, x_vect_or_iteration, tolerance=0.0)[source]

Return the output value of a function corresponding to a given input value.

Parameters:

function_name (str) -- The name of the required output function.
x_vect_or_iteration (ndarray | HashableNdarray | int) -- An input value or an iteration between 1 and the number of iterations; it can also be a negative integer if counting from the last iteration (e.g. -2 for the penultimate iteration).
tolerance (float) --
The relative tolerance \(\epsilon\) such that the input value \(x\) is considered as equal to the input value \(x_{\text{database}}\) stored in the database if \(\|x-x_{\text{database}}\|/\|x_{\text{database}}\|\leq\epsilon\).

By default it is set to 0.0.

Returns:

The output value of the function at the given input value if any, otherwise None.

Return type:

float | ndarray | list[int] | None
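
The relative tolerance criterion can be checked by hand with the sketch below. The helpers `norm` and `matches` are hypothetical, and the Euclidean norm is an assumption since the formula does not name the norm. The test is written as a product rather than a quotient so that a zero stored input value does not cause a division by zero.

```python
import math


def norm(vector):
    """Return the Euclidean norm of a sequence of numbers."""
    return math.sqrt(sum(value * value for value in vector))


def matches(x, x_database, tolerance=0.0):
    """Return True if ||x - x_database|| <= tolerance * ||x_database||."""
    difference = [a - b for a, b in zip(x, x_database)]
    return norm(difference) <= tolerance * norm(x_database)


x_database = (1.0, 2.0)
x = (1.0, 2.001)

exact = matches(x_database, x_database)       # identical points always match
loose = matches(x, x_database, tolerance=1e-3)
strict = matches(x, x_database, tolerance=1e-4)
```

With the default tolerance of 0.0, only an input value identical to a stored one matches.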

get_gradient_history(function_name, with_x_vect=False)[source]

Return the history of the gradient of a function.

Parameters:

function_name (str) -- The name of the function for which we want the gradient history.
with_x_vect (bool) --
Whether the input history should be returned as well.

By default it is set to False.

Returns:

The history of the gradient of the function output, and possibly the input history.

Return type:

ndarray | tuple[ndarray, ndarray]

get_history(function_names=(), add_missing_tag=False, missing_tag='NA')[source]

Return the history of the inputs and outputs.

This includes the inputs, functions and gradients.

Parameters:

function_names (Iterable[str]) --
The names of functions.

By default it is set to ().
add_missing_tag (bool) --
Whether to add the tag missing_tag to the iterations where data are missing.

By default it is set to False.
missing_tag (str | float) --
The tag to represent missing data.

By default it is set to "NA".

Returns:

The history of the output values, then the history of the input values.

Raises:

ValueError -- When a function has no values in the database.

Return type:

tuple[list[list[float | ndarray]], list[ndarray]]
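
The role of the missing tag can be illustrated as follows. This is a sketch under assumptions: the function `history` is hypothetical, and the layout (one row of output values per stored input value) is a guess consistent with the return type, not a confirmed description of GEMSEO's behavior.

```python
def history(database, function_names, add_missing_tag=False, missing_tag="NA"):
    """Return the output rows and the input values, in storage order."""
    output_history, input_history = [], []
    for x, outputs in database.items():
        row = []
        for name in function_names:
            if name in outputs:
                row.append(outputs[name])
            elif add_missing_tag:
                row.append(missing_tag)
        if row:  # skip the entries with nothing to report
            output_history.append(row)
            input_history.append(x)
    return output_history, input_history


database = {(0.0,): {"f": 1.0, "g": 2.0}, (1.0,): {"f": 3.0}}
tagged, inputs = history(database, ["f", "g"], add_missing_tag=True)
untagged, _ = history(database, ["f", "g"])
```

With the tag, every row has the same length, which makes the history easier to convert to a rectangular array.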

get_history_array(function_names=(), add_missing_tag=False, missing_tag='NA', input_names=(), with_x_vect=True)[source]

Return the database as a 2D array shaped as (n_iterations, n_features).

The features are the outputs of interest and possibly the input variables.

Parameters:

function_names (Iterable[str]) --
The names of the functions whose output values must be returned. If empty, use all the functions.

By default it is set to ().
input_names (str | Iterable[str]) --
The names of the input variables to name the columns of the x_vect when with_x_vect is True. These names must match the dimension of the design vector. If empty, the i-th column is named "x_i".

By default it is set to ().
add_missing_tag (bool) --
If True, add the tag specified in missing_tag for data that are not available.

By default it is set to False.
missing_tag (str | float) --
The tag that is added for data that are not available.

By default it is set to "NA".
with_x_vect (bool) --
If True, the input variables are returned in the history as np.hstack((get_output_history, x_vect_history)).

By default it is set to True.

Raises:

ValueError -- If the number of names does not match the dimension of the design vector.

Returns:

The history as a 2D array whose rows are observations and columns are the variables, the names of these columns and the names of the functions.

Return type:

tuple[NumberArray, list[str], Iterable[str]]

get_iteration(x_vect)[source]

Return the iteration of an input value in the database.

Parameters: x_vect (ndarray) -- The input value.
Returns: The iteration of the input value in the database.
Raises: KeyError -- If the required input value is not found.
Return type: int

get_last_n_x_vect(n)[source]

Return the last n input values.

Parameters: n (int) -- The number of last iterations to be considered.
Returns: The last n input values.
Raises: ValueError -- If the number n is higher than the number of iterations.
Return type: list[ndarray]
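
A minimal sketch of this contract on a plain list of input values. The function `last_n_inputs` and its error message are hypothetical.

```python
def last_n_inputs(input_history, n):
    """Return the last n input values of the history."""
    if n > len(input_history):
        raise ValueError(
            f"The number n ({n}) is higher than "
            f"the number of iterations ({len(input_history)})."
        )
    # input_history[-0:] would return the whole list, hence the guard.
    return input_history[-n:] if n > 0 else []


input_history = [(0.0,), (1.0,), (2.0,)]
last_two = last_n_inputs(input_history, 2)

try:
    last_n_inputs(input_history, 4)
except ValueError as error:
    message = str(error)
```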

get_x_vect(iteration)[source]

Return the input value at a specified iteration.

Parameters: iteration (int) -- An iteration between 1 and the number of iterations; it can also be a negative integer if counting from the last iteration (e.g. -2 for the penultimate iteration).
Returns: The input value at this iteration.
Return type: ndarray
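
The iteration convention (1-based, with negative values counting from the end) can be made concrete by converting an iteration into a 0-based list index. The helper `iteration_to_index` is hypothetical, and raising a ValueError for an out-of-range iteration is an assumption of this sketch.

```python
def iteration_to_index(iteration, n_iterations):
    """Convert a 1-based or negative iteration into a 0-based index."""
    if 1 <= iteration <= n_iterations:
        return iteration - 1
    if -n_iterations <= iteration <= -1:
        return n_iterations + iteration
    raise ValueError(f"Invalid iteration {iteration} for {n_iterations} iterations.")


n_iterations = 5
first = iteration_to_index(1, n_iterations)
last = iteration_to_index(-1, n_iterations)
penultimate = iteration_to_index(-2, n_iterations)
```

Note that 0 is not a valid iteration under this convention.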

get_x_vect_history()[source]

Return the history of the input vector.

Returns: The history of the input vector.
Return type: list[ndarray]

notify_new_iter_listeners(x_vect=None)[source]

Notify the listeners that a new iteration is ongoing.

Parameters: x_vect (ndarray | HashableNdarray | None) -- The input value. If None, use the input value of the last iteration.
Return type: None

notify_store_listeners(x_vect=None)[source]

Notify the listeners that a new entry was stored in the database.

Parameters: x_vect (ndarray | HashableNdarray | None) -- The input value. If None, use the input value of the last iteration.
Return type: None

remove_empty_entries()[source]

Remove the entries that do not have output values.

Return type: None
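
On a plain two-level dictionary, removing the entries without output values could be sketched as follows. The standalone function below is hypothetical and only illustrates the idea.

```python
def remove_empty_entries(database):
    """Delete, in place, the input values that have no output value."""
    # Collect the keys first: a dict cannot change size while iterated.
    empty_keys = [x for x, outputs in database.items() if not outputs]
    for x in empty_keys:
        del database[x]


database = {(0.0,): {"f": 1.0}, (1.0,): {}, (2.0,): {"f": 3.0}}
remove_empty_entries(database)
```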

store(x_vect, outputs)[source]

Store the output values associated with the input value.

Parameters:

x_vect (ndarray | HashableNdarray) -- The input value.
outputs (Mapping[str, float | ndarray | list[int]]) -- The output values corresponding to the input value, indexed by output names.

Return type:

None

to_dataset(name='', export_gradients=False, input_values=(), dataset_class=<class 'gemseo.datasets.dataset.Dataset'>, input_group='parameters', output_group='parameters', gradient_group='gradients', optimization_metadata=None, groups_to_variables=mappingproxy({}))[source]

Export the database to a Dataset.

Parameters:

name (str) --
The name to be given to the dataset. If empty, use the name of the database.

By default it is set to "".
export_gradients (bool) --
Whether to export the gradients of the functions if the latter are available in the database of the problem.

By default it is set to False.
input_values (Iterable[RealArray]) --
The input values to be considered. If empty, consider all the input values of the database.

By default it is set to ().
dataset_class (type[Dataset]) --
The dataset class.

By default it is set to <class 'gemseo.datasets.dataset.Dataset'>.
input_group (str) --
The name of the group to store the input values.

By default it is set to "parameters".
output_group (str) --
The name of the group to store the output values. This argument is ignored when groups_to_variables is defined.

By default it is set to "parameters".
gradient_group (str) --
The name of the group to store the gradient values.

By default it is set to "gradients".
groups_to_variables (Mapping[str, Iterable[str]]) --
The mapping from the group names to the names of the variables to be stored in these groups.

By default it is set to {}.
optimization_metadata (OptimizationMetadata | None)

Returns:

A dataset built from the database.

Return type:

Dataset

to_ggobi(function_names=(), file_path='opt_hist.xml', input_names=())[source]

Export the database to an XML file for the ggobi tool.

Parameters:

function_names (Iterable[str]) --
The names of functions. If empty, use all the functions.

By default it is set to ().
file_path (str | Path) --
The path to the XML file.

By default it is set to "opt_hist.xml".
input_names (str | Iterable[str]) --
The names of the input variables. If empty, use input_names.

By default it is set to ().

Return type:

None

to_hdf(file_path='optimization_history.h5', append=False, hdf_node_path='')[source]

Export the optimization database to an HDF file.

When exporting to an HDF file, the order of the values for each entry is not guaranteed to be preserved.

Parameters:

file_path (str | Path) --
The path of the HDF file.

By default it is set to "optimization_history.h5".
append (bool) --
Whether to append the data to the file.

By default it is set to False.
hdf_node_path (str) --
The path of the HDF node in which the database should be exported. If empty, the root node is considered.

By default it is set to "".

Return type:

None

update_from_hdf(file_path='optimization_history.h5', hdf_node_path='')[source]

Update the current database from an HDF file.

Parameters:

file_path (str | Path) --
The path of the HDF file.

By default it is set to "optimization_history.h5".
hdf_node_path (str) --
The path of the HDF node from which the database should be imported. If empty, the root node is considered.

By default it is set to "".

Return type:

None

update_from_opendace(database_file)[source]

Update the current database from an opendace XML database.

Parameters: database_file (str | Path) -- The path to an opendace database.
Return type: None

DEFAULT_INPUT_NAME: ClassVar[str] = 'input': The default input name.

GRAD_TAG: ClassVar[str] = '@'

The tag prefixing a function name to make it a gradient name.

E.g. "@f" is the name of the gradient of "f" when GRAD_TAG == "@".

MISSING_VALUE_TAG: ClassVar[str] = 'NA': The tag for a missing value.

property input_space: DesignSpace: The input space.

property last_item: Mapping[str, float | ndarray | list[int]]: The last item of the database.

property listener_output_names: list[str]: The names of the output variables whose values are stored by listeners.

property n_iterations: int

The number of iterations.

This is the number of entries in the database.

name: str: The name of the database.
