The input-output dataset

The IODataset defines two special group names, INPUT_GROUP and OUTPUT_GROUP. This kind of Dataset is useful for supervised machine learning and sensitivity analysis, where the variables are naturally split into inputs and outputs.

from __future__ import annotations

from gemseo.datasets.io_dataset import IODataset

First, we instantiate the IODataset:

dataset = IODataset()

and add some input and output variables using the methods add_input_variable() and add_output_variable(), which are based on Dataset.add_variable():

dataset.add_input_variable("a", [[1.0, 2.0], [4.0, 5.0]])
dataset.add_input_variable("b", [[3.0], [6.0]])
dataset.add_output_variable("c", [[-1.0], [-2.0]])

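as well as another variable that is neither an input nor an output; as the printed dataset shows, it is stored in the group named parameters: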

dataset.add_variable("x", [[10.0], [20.0]])
print(dataset)
GROUP     inputs           outputs parameters
VARIABLE       a         b       c          x
COMPONENT      0    1    0       0          0
0            1.0  2.0  3.0    -1.0       10.0
1            4.0  5.0  6.0    -2.0       20.0

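We could also do the same with the methods add_input_group() and add_output_group(), which take the data of a whole group at once, together with the variable names and, where needed, their sizes: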

dataset = IODataset()
dataset.add_input_group(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], ["a", "b"], {"a": 2, "b": 1}
)
dataset.add_output_group([[-1.0], [-2.0]], ["c"])
dataset.add_variable("x", [[10.0], [20.0]])
print(dataset)
GROUP     inputs           outputs parameters
VARIABLE       a         b       c          x
COMPONENT      0    1    0       0          0
0            1.0  2.0  3.0    -1.0       10.0
1            4.0  5.0  6.0    -2.0       20.0
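
To make the role of the sizes argument concrete, here is a minimal plain-Python sketch of how a two-dimensional array could be split column-wise into named variables. The helper split_columns is hypothetical and is not part of GEMSEO; it only mirrors the column layout shown in the output above, and it assumes that a variable without an explicit size has a single component.

```python
def split_columns(data, names, sizes=None):
    """Split each row of ``data`` into named variables.

    Hypothetical helper, not GEMSEO's implementation.
    Assumption: a variable missing from ``sizes`` has size 1.
    """
    sizes = sizes or {}
    widths = [sizes.get(name, 1) for name in names]
    if any(len(row) != sum(widths) for row in data):
        raise ValueError("Row length does not match the total variable size.")
    result = {name: [] for name in names}
    for row in data:
        start = 0
        # Consume the row from left to right, one variable at a time.
        for name, width in zip(names, widths):
            result[name].append(list(row[start : start + width]))
            start += width
    return result


inputs = split_columns(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], ["a", "b"], {"a": 2, "b": 1}
)
outputs = split_columns([[-1.0], [-2.0]], ["c"])
print(inputs)
print(outputs)
```

With these arguments, the first two columns go to "a" and the third one to "b", which matches the per-variable data passed to add_input_variable() earlier.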

Then, we can easily access the names of the input and output variables:

print(dataset.input_names)
print(dataset.output_names)
['a', 'b']
['c']

and those of all variables:

print(dataset.variable_names)
['a', 'b', 'c', 'x']

The IODataset also provides the number of samples:

print(dataset.n_samples)
2

and the sample indices:

print(dataset.samples)
[0, 1]

Lastly, we can get the input data and the output data as IODataset views:

print(dataset.input_dataset)
GROUP     inputs
VARIABLE       a         b
COMPONENT      0    1    0
0            1.0  2.0  3.0
1            4.0  5.0  6.0
print(dataset.output_dataset)
GROUP     outputs
VARIABLE        c
COMPONENT       0
0            -1.0
1            -2.0
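
The printed headers suggest that each column is labelled by a group, a variable name and a component index, and that the input and output views simply keep the columns of one group. The following plain-Python sketch illustrates that idea under this assumption; the functions names_in_group and select_group are hypothetical and do not reproduce GEMSEO's internals.

```python
# Column labels mirroring the printed header: (group, variable, component).
columns = [
    ("inputs", "a", 0),
    ("inputs", "a", 1),
    ("inputs", "b", 0),
    ("outputs", "c", 0),
    ("parameters", "x", 0),
]
# One row per sample, in the same order as the column labels.
rows = [
    [1.0, 2.0, 3.0, -1.0, 10.0],
    [4.0, 5.0, 6.0, -2.0, 20.0],
]


def names_in_group(columns, group):
    """Return the sorted variable names belonging to ``group``."""
    return sorted({variable for g, variable, _ in columns if g == group})


def select_group(columns, rows, group):
    """Return, row by row, the values of the columns belonging to ``group``."""
    keep = [i for i, (g, _, _) in enumerate(columns) if g == group]
    return [[row[i] for i in keep] for row in rows]


print(names_in_group(columns, "inputs"))
print(select_group(columns, rows, "inputs"))
print(select_group(columns, rows, "outputs"))
```

Filtering on the group label is enough to recover the same values as in the two views printed above, while the variable "x" is left out of both.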
