gemseo.algos.database module#

A database of function calls and design variables.

class Database(name='', input_space=None)[source]#

Bases: Mapping

Storage of MDOFunction evaluations.

A Database is typically attached to an OptimizationProblem to store the evaluations of its objective, constraints and observables.

A Database can thus be an optimization history or, in the case of a DOE, a collection of samples.

It is useful when simulations are costly because it avoids re-evaluating functions at points where they have already been evaluated.

See also

NormDBFunction

A Database can also be post-processed by a BasePost to visualize its content, e.g. by OptHistoryView, which generates a series of graphs showing the histories of the objective, constraints and design variables.

A Database can be saved to an HDF file for portability and cold post-processing with its method to_hdf(). A database can also be initialized from an HDF file as database = Database.from_hdf(file_path).

Note

Saving an OptimizationProblem to an HDF file using its method to_hdf also saves its Database.

The database is based on a two-level dictionary-like mapping such as {x: {output_name: output_value, ...}, ...} with:

  • x: the input value as a HashableNdarray wrapping a NumPy array that can be accessed as x.array; if the input variables have different types, they are promoted to the single type that can represent all of them (for instance, integer is promoted to float); if the user does not provide an input space at instantiation, then after the first call to the store() method the input_space includes a single variable called DEFAULT_INPUT_NAME, with the right dimension;

  • output_name: the name of the function that has been evaluated at x, the name of its gradient (the gradient of a function called "f" is typically denoted as "@f"), or the name of any additional information related to the methods which use the database;

  • output_value: the output value, typically a float or a 1D-array for a function output, a 1D- or 2D-array for a gradient or a list for the iteration.
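
The two-level mapping described above can be pictured with a toy model. This is only a sketch using the standard library, not GEMSEO's implementation: the class ToyHashableVector and the dictionary toy_database are hypothetical names, and a tuple of floats stands in for the wrapped NumPy array.

```python
from __future__ import annotations


class ToyHashableVector:
    """Hypothetical stand-in for a hashable wrapper around an input vector."""

    def __init__(self, values):
        # Promote every component to float, mirroring the promotion of mixed
        # input types to one common type.
        self.array = tuple(float(v) for v in values)

    def __hash__(self):
        return hash(self.array)

    def __eq__(self, other):
        return isinstance(other, ToyHashableVector) and self.array == other.array


# First level: input value -> second level: output name -> output value.
toy_database = {}

x = ToyHashableVector([1, 2.5])  # the integer 1 is stored as 1.0
toy_database[x] = {"f": 7.25, "@f": [2.0, 5.0]}

# An equal input value built later hits the same entry,
# so "f" does not need to be evaluated again at this point.
same_x = ToyHashableVector([1.0, 2.5])
cached_f = toy_database[same_x]["f"]
print(x.array, cached_f)
```

Hashing the wrapped values is what makes the input usable as a dictionary key, which is presumably why the real keys are HashableNdarray objects rather than plain arrays.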

Parameters:
  • name (str) --

    The name to be given to the database. If empty, use the class name.

    By default it is set to "".

  • input_space (DesignSpace | None) -- The input space associated with this database. If None, create a default DesignSpace.

add_new_iter_listener(function)[source]#

Add a function to be called when a new iteration is stored to the database.

Parameters:

function (Callable[[ndarray | HashableNdarray], None]) -- The function to be called; it must take one argument, the current input value.

Returns:

Whether the function has been added; otherwise, it was already attached to the database.

Return type:

bool

add_store_listener(function)[source]#

Add a function to be called when an item is stored to the database.

Parameters:

function (Callable[[ndarray | HashableNdarray], None]) -- The function to be called.

Returns:

Whether the function has been added; otherwise, it was already attached to the database.

Return type:

bool
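
The return convention shared by add_new_iter_listener() and add_store_listener() can be illustrated with a small stand-alone registry. This is a sketch under assumptions, not the library's code: ToyListeners and record are hypothetical names, and only the behavior stated above (one argument per listener, False when the function is already attached) is modeled.

```python
class ToyListeners:
    """Hypothetical registry mimicking the add-listener return convention."""

    def __init__(self):
        self._store_listeners = []

    def add_store_listener(self, function):
        # Return False when the function is already attached, True otherwise.
        if function in self._store_listeners:
            return False
        self._store_listeners.append(function)
        return True

    def notify_store_listeners(self, x_vect):
        # Each listener receives the input value as its single argument.
        for function in self._store_listeners:
            function(x_vect)


seen = []


def record(x_vect):
    """Listener that keeps track of the input values it was notified of."""
    seen.append(x_vect)


registry = ToyListeners()
first_attempt = registry.add_store_listener(record)
second_attempt = registry.add_store_listener(record)  # already attached
registry.notify_store_listeners((1.0, 2.0))
print(first_attempt, second_attempt, seen)
```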

check_output_history_is_empty(output_name)[source]#

Check if the history of an output is empty.

Parameters:

output_name (str) -- The name of the output.

Returns:

Whether the history of the output is empty.

Return type:

bool

clear()[source]#

Clear the database.

Return type:

None

clear_from_iteration(iteration)[source]#

Delete the items after a given iteration.

Parameters:

iteration (int) -- An iteration between 1 and the number of iterations; it can also be a negative integer if counting from the last iteration (e.g. -2 for the penultimate iteration).

Return type:

None
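
The iteration convention used here (1-based, with negative values counting from the end) can be sketched as follows. The helpers iteration_to_index and clear_after are hypothetical, the ValueError is an assumption about how an invalid iteration might be rejected, and the sketch assumes that the given iteration itself is kept.

```python
def iteration_to_index(iteration, n_iterations):
    """Convert a 1-based or negative iteration into a 0-based position."""
    if 1 <= iteration <= n_iterations:
        return iteration - 1
    if -n_iterations <= iteration <= -1:
        return n_iterations + iteration
    # Assumption: 0 and out-of-range values are treated as errors.
    raise ValueError(f"Invalid iteration {iteration} for {n_iterations} iterations.")


def clear_after(entries, iteration):
    """Return a copy of the entries up to and including the given iteration."""
    keep = iteration_to_index(iteration, len(entries)) + 1
    return dict(list(entries.items())[:keep])


# Dictionaries preserve insertion order, so the position gives the iteration.
entries = {(0.0,): {"f": 3.0}, (1.0,): {"f": 2.0}, (2.0,): {"f": 1.0}}
trimmed = clear_after(entries, -2)  # keep everything up to the penultimate entry
print(list(trimmed))
```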

clear_listeners(new_iter_listeners=(), store_listeners=())[source]#

Clear all the listeners.

Parameters:
  • new_iter_listeners (Iterable[Callable[[ndarray | HashableNdarray], None]] | None) --

    The functions to be removed that were notified of a new iteration. If empty, remove all such functions. If None, keep all these functions.

    By default it is set to ().

  • store_listeners (Iterable[Callable[[ndarray | HashableNdarray], None]] | None) --

    The functions to be removed that were notified of a new entry in the database. If empty, remove all such functions. If None, keep all these functions.

    By default it is set to ().

Returns:

The listeners that were notified of a new iteration and the listeners that were notified of a new entry in the database.

Return type:

tuple[Iterable[Callable[[ndarray | HashableNdarray], None]], Iterable[Callable[[ndarray | HashableNdarray], None]]]

filter(output_names)[source]#

Keep only some outputs and remove the other ones.

Parameters:

output_names (Iterable[str]) -- The names of the outputs that must be kept.

Return type:

None
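
On the two-level mapping, filtering amounts to dropping every output name that is not in the list to keep. The function toy_filter below is a hypothetical sketch operating on a plain dictionary, not the method's actual implementation.

```python
def toy_filter(entries, output_names):
    """Keep only the requested outputs in each entry, in place."""
    keep = set(output_names)
    for outputs in entries.values():
        # Iterate over a copy of the names because the mapping is modified.
        for name in list(outputs):
            if name not in keep:
                del outputs[name]


entries = {
    (0.0,): {"f": 1.0, "g": 2.0, "@f": [1.0]},
    (1.0,): {"g": 3.0},
}
toy_filter(entries, ["f"])
print(entries)
```

In this toy example the second entry ends up with no output values, which is the kind of entry that remove_empty_entries() is meant to remove.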

classmethod from_hdf(file_path='optimization_history.h5', name='', hdf_node_path='', log=True)[source]#

Create a database from an HDF file.

Parameters:
  • file_path (str | Path) --

    The path of the HDF file.

    By default it is set to "optimization_history.h5".

  • name (str) --

    The name of the database.

    By default it is set to "".

  • hdf_node_path (str) --

    The path of the HDF node from which the database should be imported. If empty, the root node is considered.

    By default it is set to "".

  • log (bool) --

    Whether to log the import of the database.

    By default it is set to True.

Returns:

The database defined in the file.

Return type:

Database

get_function_history(function_name, with_x_vect=False)[source]#

Return the history of a function output.

Parameters:
  • function_name (str) -- The name of the function.

  • with_x_vect (bool) --

    Whether to also return the input history.

    By default it is set to False.

Returns:

The history of the function output, and possibly the input history.

Raises:

KeyError -- When the database contains no output value for this function.

Return type:

ndarray | tuple[ndarray, ndarray]

get_function_names(skip_grad=True)[source]#

Return the names of the outputs contained in the database.

Parameters:

skip_grad (bool) --

Whether to skip the names of gradient functions.

By default it is set to True.

Returns:

The names of the outputs in alphabetical order.

Return type:

list[str]

get_function_value(function_name, x_vect_or_iteration, tolerance=0.0)[source]#

Return the output value of a function corresponding to a given input value.

Parameters:
  • function_name (str) -- The name of the required output function.

  • x_vect_or_iteration (ndarray | HashableNdarray | int) -- An input value or an iteration between 1 and the number of iterations; it can also be a negative integer if counting from the last iteration (e.g. -2 for the penultimate iteration).

  • tolerance (float) --

    The relative tolerance \(\epsilon\) such that the input value \(x\) is considered as equal to the input value \(x_{\text{database}}\) stored in the database if \(\|x-x_{\text{database}}\|/\|x_{\text{database}}\|\leq\epsilon\).

    By default it is set to 0.0.

Returns:

The output value of the function at the given input value if any, otherwise None.

Return type:

float | ndarray | list[int] | None
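
The relative tolerance test can be written out as a short check. This sketch assumes the Euclidean norm, which the text does not specify, and the function is_close_to_stored is a hypothetical helper rather than part of the API.

```python
import math


def is_close_to_stored(x, x_database, tolerance=0.0):
    """Check ||x - x_database|| <= tolerance * ||x_database||."""
    difference = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_database)))
    reference = math.sqrt(sum(b**2 for b in x_database))
    # Multiplying instead of dividing avoids a division by zero
    # when the stored input value is the zero vector.
    return difference <= tolerance * reference


x_database = (1.0, 2.0)
exact_match = is_close_to_stored((1.0, 2.0), x_database)
near_match = is_close_to_stored((1.0, 2.001), x_database, tolerance=1e-3)
far_match = is_close_to_stored((1.0, 2.1), x_database, tolerance=1e-3)
print(exact_match, near_match, far_match)
```

With the default tolerance of 0.0, only an input value identical to a stored one passes this test.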

get_gradient_history(function_name, with_x_vect=False)[source]#

Return the history of the gradient of a function.

Parameters:
  • function_name (str) -- The name of the function for which we want the gradient history.

  • with_x_vect (bool) --

    Whether the input history should be returned as well.

    By default it is set to False.

Returns:

The history of the gradient of the function output, and possibly the input history.

Return type:

ndarray | tuple[ndarray, ndarray]

classmethod get_gradient_name(name)[source]#

Return the name of the gradient related to a function.

This name is the concatenation of a GRAD_TAG, e.g. '@', and the name of the function, e.g. 'f'. With this example, the name of the gradient is '@f'.

Parameters:

name (str) -- The name of a function.

Returns:

The name of the gradient based on the name of the function.

Return type:

str
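
The naming rule is plain string concatenation, and the same prefix can be used to tell gradient names apart from function names. In this sketch, toy_gradient_name and toy_function_names are hypothetical helpers; the filtering by prefix is an assumption about how a skip_grad option could work, not a description of the library's code.

```python
GRAD_TAG = "@"  # the tag value used in the example above


def toy_gradient_name(name):
    """Prefix a function name with the gradient tag."""
    return f"{GRAD_TAG}{name}"


def toy_function_names(output_names, skip_grad=True):
    """Return the output names in alphabetical order, optionally without gradients."""
    names = sorted(output_names)
    if skip_grad:
        names = [name for name in names if not name.startswith(GRAD_TAG)]
    return names


gradient_name = toy_gradient_name("f")
without_gradients = toy_function_names(["g", "@f", "f"])
with_gradients = toy_function_names(["g", "@f", "f"], skip_grad=False)
print(gradient_name, without_gradients, with_gradients)
```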

static get_hashable_ndarray(original_array, copy=False)[source]#

Convert an array to a hashable array.

This hashable array basically represents a key of the database.

Parameters:
  • original_array (ndarray | HashableNdarray) -- An array.

  • copy (bool) --

    Whether to copy the original array.

    By default it is set to False.

Returns:

A hashable array wrapping the original array.

Raises:

KeyError -- If the original array is neither an array nor a HashableNdarray.

Return type:

HashableNdarray

get_history(function_names=(), add_missing_tag=False, missing_tag='NA')[source]#

Return the history of the inputs and outputs.

This includes the inputs, functions and gradients.

Parameters:
  • function_names (Iterable[str]) --

    The names of functions.

    By default it is set to ().

  • add_missing_tag (bool) --

    Whether to add the tag missing_tag to the iterations where data are missing.

    By default it is set to False.

  • missing_tag (str | float) --

    The tag to represent missing data.

    By default it is set to "NA".

Returns:

The history of the output values, then the history of the input values.

Raises:

ValueError -- When a function has no values in the database.

Return type:

tuple[list[list[float | ndarray]], list[ndarray]]

get_history_array(function_names=(), add_missing_tag=False, missing_tag='NA', input_names=(), with_x_vect=True)[source]#

Return the database as a 2D array shaped as (n_iterations, n_features).

The features are the outputs of interest and possibly the input variables.

Parameters:
  • function_names (Iterable[str]) --

    The names of the functions whose output values must be returned. If empty, use all the functions.

    By default it is set to ().

  • input_names (str | Iterable[str]) --

    The names of the input variables. If empty, use input_names.

    By default it is set to ().

  • add_missing_tag (bool) --

    If True, add the tag specified in missing_tag for data that are not available.

    By default it is set to False.

  • missing_tag (str | float) --

    The tag that is added for data that are not available.

    By default it is set to "NA".

  • with_x_vect (bool) --

    If True, the input variables are returned in the history as np.hstack((get_output_history, x_vect_history)).

    By default it is set to True.

Returns:

The history as a 2D array whose rows are observations and whose columns are the variables, the names of these columns and the names of the functions.

Return type:

tuple[NumberArray, list[str], Iterable[str]]
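
The (n_iterations, n_features) layout can be pictured with nested lists, with the output columns first and the input columns appended when with_x_vect is true. The function toy_history_rows is a hypothetical sketch restricted to scalar outputs; unlike the real method, it always inserts the missing tag and does not return column names.

```python
def toy_history_rows(entries, function_names, missing_tag="NA", with_x_vect=True):
    """Build one row per entry: output values first, then the input components."""
    rows = []
    for x_vect, outputs in entries.items():
        # Use the tag wherever an output was not stored for this input value.
        row = [outputs.get(name, missing_tag) for name in function_names]
        if with_x_vect:
            row.extend(x_vect)
        rows.append(row)
    return rows


entries = {
    (0.0, 1.0): {"f": 3.0, "g": -1.0},
    (0.5, 1.5): {"f": 2.0},  # "g" was not evaluated here
}
rows = toy_history_rows(entries, ["f", "g"])
outputs_only = toy_history_rows(entries, ["f", "g"], with_x_vect=False)
print(rows)
```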

get_iteration(x_vect)[source]#

Return the iteration of an input value in the database.

Parameters:

x_vect (ndarray) -- The input value.

Returns:

The iteration of the input value in the database.

Raises:

KeyError -- If the required input value is not found.

Return type:

int

get_last_n_x_vect(n)[source]#

Return the last n input values.

Parameters:

n (int) -- The number of last iterations to be considered.

Returns:

The last n input values.

Raises:

ValueError -- If the number n is higher than the number of iterations.

Return type:

list[ndarray]

get_x_vect(iteration)[source]#

Return the input value at a specified iteration.

Parameters:

iteration (int) -- An iteration between 1 and the number of iterations; it can also be a negative integer if counting from the last iteration (e.g. -2 for the penultimate iteration).

Returns:

The input value at this iteration.

Return type:

ndarray

get_x_vect_history()[source]#

Return the history of the input vector.

Returns:

The history of the input vector.

Return type:

list[ndarray]

notify_new_iter_listeners(x_vect=None)[source]#

Notify the listeners that a new iteration is ongoing.

Parameters:

x_vect (ndarray | HashableNdarray | None) -- The input value. If None, use the input value of the last iteration.

Return type:

None

notify_store_listeners(x_vect=None)[source]#

Notify the listeners that a new entry was stored in the database.

Parameters:

x_vect (ndarray | HashableNdarray | None) -- The input value. If None, use the input value of the last iteration.

Return type:

None

remove_empty_entries()[source]#

Remove the entries that do not have output values.

Return type:

None

store(x_vect, outputs)[source]#

Store the output values associated with the input values.

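Parameters:
  • x_vect (DatabaseKeyType) -- The input value.

  • outputs (DatabaseValueType) -- The output values associated with the input value, as a mapping from output names to output values.

Return type: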

None

to_dataset(name='', export_gradients=False, input_values=(), dataset_class=<class 'gemseo.datasets.dataset.Dataset'>, input_group='parameters', output_group='parameters', gradient_group='gradients')[source]#

Export the database to a Dataset.

Parameters:
  • name (str) --

    The name to be given to the dataset. If empty, use the name of the database.

    By default it is set to "".

  • export_gradients (bool) --

    Whether to export the gradients of the functions if the latter are available in the database of the problem.

    By default it is set to False.

  • input_values (Iterable[RealArray]) --

    The input values to be considered. If empty, consider all the input values of the database.

    By default it is set to ().

  • dataset_class (type[Dataset]) --

    The dataset class.

    By default it is set to <class 'gemseo.datasets.dataset.Dataset'>.

  • input_group (str) --

    The name of the group to store the input values.

    By default it is set to "parameters".

  • output_group (str) --

    The name of the group to store the output values.

    By default it is set to "parameters".

  • gradient_group (str) --

    The name of the group to store the gradient values.

    By default it is set to "gradients".

Returns:

A dataset built from the database.

Return type:

Dataset

to_ggobi(function_names=(), file_path='opt_hist.xml', input_names=())[source]#

Export the database to an XML file for the ggobi tool.

Parameters:
  • function_names (Iterable[str]) --

    The names of functions. If empty, use all the functions.

    By default it is set to ().

  • file_path (str | Path) --

    The path to the XML file.

    By default it is set to "opt_hist.xml".

  • input_names (str | Iterable[str]) --

    The names of the input variables. If empty, use input_names.

    By default it is set to ().

Return type:

None

to_hdf(file_path='optimization_history.h5', append=False, hdf_node_path='')[source]#

Export the optimization database to an HDF file.

Parameters:
  • file_path (str | Path) --

    The path of the HDF file.

    By default it is set to "optimization_history.h5".

  • append (bool) --

    Whether to append the data to the file.

    By default it is set to False.

  • hdf_node_path (str) --

    The path of the HDF node in which the database should be exported. If empty, the root node is considered.

    By default it is set to "".

Return type:

None

update_from_hdf(file_path='optimization_history.h5', hdf_node_path='')[source]#

Update the current database from an HDF file.

Parameters:
  • file_path (str | Path) --

    The path of the HDF file.

    By default it is set to "optimization_history.h5".

  • hdf_node_path (str) --

    The path of the HDF node from which the database should be imported. If empty, the root node is considered.

    By default it is set to "".

Return type:

None

update_from_opendace(database_file)[source]#

Update the current database from an opendace XML database.

Parameters:

database_file (str | Path) -- The path to an opendace database.

Return type:

None

DEFAULT_INPUT_NAME: ClassVar[str] = 'input'#

The default input name.

GRAD_TAG: ClassVar[str] = '@'#

The tag prefixing a function name to make it a gradient name.

E.g. "@f" is the name of the gradient of "f" when GRAD_TAG == "@".

MISSING_VALUE_TAG: ClassVar[str] = 'NA'#

The tag for a missing value.

property input_space: DesignSpace#

The input space.

property last_item: Mapping[str, float | ndarray | list[int]]#

The last item of the database.

property n_iterations: int#

The number of iterations.

This is the number of entries in the database.

name: str#

The name of the database.

DatabaseKeyType(*args, **kwargs)#

The type of a Database key.

alias of ndarray | HashableNdarray

DatabaseValueType(*args, **kwargs)#

The type of a Database value.

alias of Mapping[str, float | ndarray | list[int]]

FunctionOutputValueType(*args, **kwargs)#

The type of a function output value stored in a Database.

alias of float | ndarray | list[int]

ListenerType(*args, **kwargs)#

The type of a listener attached to a Database.

alias of Callable[[ndarray | HashableNdarray], None]