gemseo.datasets.io_dataset module

A Dataset to store input and output values.

class IODataset(data=None, index=None, columns=None, dtype=None, copy=None, *, dataset_name='')

Bases: Dataset

A Dataset to store input and output values.

Warning

A Dataset behaves like any multi-index DataFrame, but instantiating it directly with the constructor, dataset = Dataset(data, ...), can lead to inconsistencies (multi-index levels, index values, dtypes, ...). Prefer building it with the dedicated methods, e.g. dataset = Dataset(); dataset.add_variable("x", data).

Notes

The columns of a data structure (NumPy array, DataFrame, Dataset, ...) are called features. The features of a Dataset include all the components of all the variables of all the groups.

Parameters:
  • data (ndarray | Iterable | dict | DataFrame | None) -- See DataFrame.

  • index (Axes | None) -- See DataFrame.

  • columns (Axes | None) -- See DataFrame.

  • dtype (Dtype | None) -- See DataFrame.

  • copy (bool | None) -- See DataFrame.

  • dataset_name (str) --

    The name of the dataset.

    By default it is set to "".

add_input_group(data, variable_names='i', variable_names_to_n_components=None)

Add the data related to the input group.

Parameters:
  • data (ndarray | Iterable[Any] | Any) -- The data.

  • variable_names (str | Iterable[str]) --

    The names of the variables. If empty, use DEFAULT_VARIABLE_NAME.

    By default it is set to "i".

  • variable_names_to_n_components (dict[str, int] | None) -- The number of components of each variable. This argument is ignored if variable_names is empty. If None, all the variables are assumed to have a single component.

Return type:

None
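The role of variable_names_to_n_components can be pictured as assigning consecutive columns of data to the variables. The sketch below is a plain-Python illustration of that reading, not GEMSEO's implementation: split_columns is a hypothetical helper, and the assumption that columns are allotted in the order of variable_names is ours.

```python
def split_columns(variable_names, variable_names_to_n_components=None):
    """Map each variable name to the column indices it would occupy.

    Hypothetical helper for illustration; not part of GEMSEO.
    """
    if isinstance(variable_names, str):
        variable_names = [variable_names]
    columns = {}
    start = 0
    for name in variable_names:
        if variable_names_to_n_components is None:
            # Documented default: every variable has a single component.
            n_components = 1
        else:
            n_components = variable_names_to_n_components[name]
        columns[name] = list(range(start, start + n_components))
        start += n_components
    return columns


layout = split_columns(["a", "b"], {"a": 1, "b": 2})
print(layout)  # {'a': [0], 'b': [1, 2]}

single = split_columns(["a", "b"])
print(single)  # {'a': [0], 'b': [1]}
```

Under this reading, a data array with three columns and the mapping {"a": 1, "b": 2} would give "a" the first column and "b" the last two.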

add_input_variable(variable_name, data, components=())

Add data related to an input variable.

Parameters:
  • variable_name (str) -- The name of the variable.

  • data (ndarray | Iterable[Any] | Any) -- The data, either an array shaped as (n_entries, n_features), an array shaped as (n_entries,) that will be reshaped as (n_entries, 1) or a scalar that will be converted into an array shaped as (n_entries, 1).

  • components (int | Iterable[int]) --

    The components considered. If empty, use [0, ..., n_features - 1].

    By default it is set to ().

Return type:

None
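The shape rules for data can be sketched with NumPy alone. The helper as_feature_matrix below is hypothetical and only mirrors the documented behaviour (2D arrays kept as is, 1D arrays turned into one column, scalars repeated for every entry); it is not GEMSEO's code. It also reads the default components as the indices 0 to n_features - 1.

```python
import numpy as np


def as_feature_matrix(data, n_entries):
    """Return data as a 2D array following the documented shape rules.

    Hypothetical helper for illustration; not GEMSEO's implementation.
    """
    array = np.asarray(data)
    if array.ndim == 0:
        # Scalar: repeat the value for every entry, in a single column.
        return np.full((n_entries, 1), array.item())
    if array.ndim == 1:
        # 1D array of n_entries values: a single column.
        return array.reshape(-1, 1)
    # Already shaped as (n_entries, n_features).
    return array


matrix = as_feature_matrix([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], n_entries=3)
default_components = list(range(matrix.shape[1]))
print(matrix.shape, default_components)  # (3, 2) [0, 1]

column = as_feature_matrix([1.0, 2.0, 3.0], n_entries=3)
constant = as_feature_matrix(7.0, n_entries=3)
print(column.shape, constant.shape)  # (3, 1) (3, 1)
```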

add_output_group(data, variable_names='o', variable_names_to_n_components=None)

Add the data related to the output group.

Parameters:
  • data (ndarray | Iterable[Any] | Any) -- The data.

  • variable_names (str | Iterable[str]) --

    The names of the variables. If empty, use DEFAULT_VARIABLE_NAME.

    By default it is set to "o".

  • variable_names_to_n_components (dict[str, int] | None) -- The number of components of each variable. This argument is ignored if variable_names is empty. If None, all the variables are assumed to have a single component.

Return type:

None

add_output_variable(variable_name, data, components=())

Add data related to an output variable.

Parameters:
  • variable_name (str) -- The name of the variable.

  • data (ndarray | Iterable[Any] | Any) -- The data, either an array shaped as (n_entries, n_features), an array shaped as (n_entries,) that will be reshaped as (n_entries, 1) or a scalar that will be converted into an array shaped as (n_entries, 1).

  • components (int | Iterable[int]) --

    The components considered. If empty, use [0, ..., n_features - 1].

    By default it is set to ().

Return type:

None

INPUT_GROUP: Final[str] = 'inputs'

The group name for the input variables.

OUTPUT_GROUP: Final[str] = 'outputs'

The group name for the output variables.

property input_dataset: IODataset

The view of the input dataset.

property input_names: list[str]

The names of the inputs.

Warning

The names are sorted with the Python function sorted.
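Sorting with the built-in sorted means the order follows string comparison by code point, not the insertion order or a "natural" numeric order. The short example below uses made-up variable names to show the two effects that most often surprise: uppercase letters come before lowercase ones, and "x10" comes before "x2".

```python
# Illustrative names only; they do not come from any particular dataset.
names = ["x2", "x10", "X1", "a"]

sorted_names = sorted(names)
print(sorted_names)  # ['X1', 'a', 'x10', 'x2']
```

If the order of the variables matters downstream, it is safer to look values up by name than to rely on their position in this list.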

property n_samples: int

The number of samples.

property output_dataset: IODataset

The view of the output dataset.

property output_names: list[str]

The names of the outputs.

Warning

The names are sorted with the Python function sorted.

property samples: list[int | str]

The ordered samples.