Introduction to dataset
The dataset module implements the concept of a dataset,
which is a key element for machine learning, post-processing
and data analysis, among other tasks.
A Dataset is an object
defined by data stored as a dictionary of 2D numpy arrays,
whose rows are samples, a.k.a. realizations, and whose columns are features,
a.k.a. parameters or variables. The keys of this dictionary are either
names of groups of variables or names of variables.
A Dataset is also defined by
a list of variable names, a dictionary of variable sizes
and a dictionary of variable groups.
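The layout described above can be pictured with plain numpy objects. This is only a minimal sketch, not the Dataset class itself: the variable names, the sizes, the group names and the choice of mapping each variable to its group are all hypothetical and chosen for illustration.

```python
import numpy as np

# Hypothetical example values; none of them come from the library.
n_samples = 5
rng = np.random.default_rng(0)

# Data stored per group: rows are samples, columns are features.
data = {
    "inputs": rng.random((n_samples, 3)),   # holds x (size 1) and y (size 2)
    "outputs": rng.random((n_samples, 1)),  # holds z (size 1)
}

# Metadata describing the columns (assumed layout).
variable_names = ["x", "y", "z"]
variable_sizes = {"x": 1, "y": 2, "z": 1}
variable_groups = {"x": "inputs", "y": "inputs", "z": "outputs"}

# Consistency check: each group array has one row per sample and
# as many columns as the total size of the variables in that group.
for group, array in data.items():
    expected = sum(
        variable_sizes[name]
        for name in variable_names
        if variable_groups[name] == group
    )
    assert array.shape == (n_samples, expected)
```

Keeping the sizes and groups next to the arrays is what makes it possible to slice a group array back into named variables.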
A Dataset can be set either from a numpy array or from a file.
An AbstractFullCache, among other objects,
can also be exported to a Dataset.
From a Dataset, we can easily access its length and get the data,
either as a 2D array or as dictionaries indexed by the variable names.
We can get either the whole data,
the data associated with a group or the data associated with a list of variables.
It is also possible to export the
Dataset to an AbstractFullCache or a pandas DataFrame.
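The access patterns described above can be sketched with plain numpy and pandas. The helper functions get_by_names and get_as_array, the sample values and the column labels such as "y_0" are hypothetical and are not the Dataset API. They only illustrate how per-group arrays can be read as a whole, by group or by variable, and then turned into a DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one 2D array per group, rows are samples.
data = {
    "inputs": np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]),
    "outputs": np.array([[7.0], [8.0]]),
}
names = ["x", "y", "z"]
sizes = {"x": 1, "y": 2, "z": 1}
groups = {"x": "inputs", "y": "inputs", "z": "outputs"}


def get_by_names(selected):
    """Return a dictionary of 2D arrays indexed by variable names."""
    result = {}
    offsets = {}  # next unread column in each group array
    for name in names:
        group = groups[name]
        start = offsets.get(group, 0)
        offsets[group] = start + sizes[name]
        if name in selected:
            result[name] = data[group][:, start:start + sizes[name]]
    return result


def get_as_array(selected):
    """Return the selected variables as a single 2D array."""
    by_name = get_by_names(selected)
    return np.hstack([by_name[name] for name in names if name in selected])


length = len(data["inputs"])        # number of samples
whole = get_as_array(names)         # whole data as one 2D array
inputs_only = data["inputs"]        # data associated with a group
y_and_z = get_by_names(["y", "z"])  # data associated with a list of variables

# Export to a pandas DataFrame, one column per variable component.
columns = [f"{name}_{i}" for name in names for i in range(sizes[name])]
frame = pd.DataFrame(whole, columns=columns)
```

Labelling each component separately keeps multi-dimensional variables such as y distinguishable once the data are flattened into DataFrame columns.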