In [None]:
%matplotlib inline


# Dataset from a cache

In this example, we will see how to build a :class:`.Dataset` from the entries
of an :class:`.AbstractFullCache`.
For that, we need to import a cache class, here :class:`.MemoryFullCache`:


In [None]:
from gemseo.api import configure_logger
from gemseo.caches.memory_full_cache import MemoryFullCache
from numpy import array

configure_logger()

## Synthetic data
Let us consider a :class:`.MemoryFullCache` storing two variables:

- `x`, of dimension 1, which is a cache input,
- `y`, of dimension 2, which is a cache output.



In [None]:
cache = MemoryFullCache()
cache[{"x": array([1.0])}] = ({"y": array([2.0, 3.0])}, None)
cache[{"x": array([4.0])}] = ({"y": array([5.0, 6.0])}, None)

## Create a dataset
We can easily build a dataset from this :class:`.MemoryFullCache`,
either by separating the inputs from the outputs (default option):



In [None]:
dataset = cache.export_to_dataset("toy_cache")
print(dataset)

or by gathering all the variables in a single default group of parameters:



In [None]:
dataset = cache.export_to_dataset("toy_cache", categorize=False)
print(dataset)
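
To make the two layouts concrete, the sketch below mimics them with plain Python dictionaries, without calling GEMSEO. The helper `group_variables` is hypothetical, and the group names `inputs`, `outputs` and `parameters` are assumptions made for illustration; the output printed by the cells above remains the reference.

```python
def group_variables(input_names, output_names, categorize=True):
    """Return a hypothetical mapping from group name to variable names."""
    if categorize:
        # Inputs and outputs are kept in two separate groups.
        return {"inputs": list(input_names), "outputs": list(output_names)}
    # All the variables share a single group (name assumed here).
    return {"parameters": list(input_names) + list(output_names)}


print(group_variables(["x"], ["y"]))
print(group_variables(["x"], ["y"], categorize=False))
```

With `categorize=True`, a variable is looked up through its group, which is what the group-based accessors further down rely on.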

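## Access properties
We first rebuild the dataset with the default option,
so that the inputs are separated from the outputs: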



In [None]:
dataset = cache.export_to_dataset("toy_cache")

### Variable names
We can access the variable names:



In [None]:
print(dataset.variables)

### Variable sizes
We can access the variable sizes:



In [None]:
print(dataset.sizes)

### Variable groups
We can access the variable groups:



In [None]:
print(dataset.groups)

## Access data
### Access by group
We can get the data by group, either as an array (default option):



In [None]:
print(dataset.get_data_by_group("inputs"))

or as a dictionary indexed by the variable names:



In [None]:
print(dataset.get_data_by_group("inputs", True))

### Access by variable name
We can get the data by variable names,
either as a dictionary indexed by the variable names (default option):



In [None]:
print(dataset.get_data_by_names(["x"]))

or as an array:



In [None]:
print(dataset.get_data_by_names(["x", "y"], False))
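
The array form can be pictured as a column-wise concatenation of the requested variables, with one row per cache entry. The sketch below reproduces this idea with plain Python lists standing in for NumPy arrays, using the two entries stored in the cache above. The helpers `to_array` and `to_dict` are hypothetical and are not part of GEMSEO; the actual layout should be checked against the printed output.

```python
# Values stored in the cache above, one row per entry.
data = {
    "x": [[1.0], [4.0]],
    "y": [[2.0, 3.0], [5.0, 6.0]],
}


def to_array(data, names):
    """Concatenate the requested variables column-wise, row by row."""
    n_rows = len(data[names[0]])
    return [
        [value for name in names for value in data[name][row]]
        for row in range(n_rows)
    ]


def to_dict(rows, names, sizes):
    """Split the columns back into one block per variable."""
    result = {}
    start = 0
    for name in names:
        stop = start + sizes[name]
        result[name] = [row[start:stop] for row in rows]
        start = stop
    return result


rows = to_array(data, ["x", "y"])
print(rows)  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print(to_dict(rows, ["x", "y"], {"x": 1, "y": 2}) == data)  # True
```

The variable sizes are what make the round trip possible: without them, the three columns could not be attributed to `x` and `y` again.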

### Access all data
We can get all the data, either as a large array:



In [None]:
print(dataset.get_all_data())

or as a dictionary indexed by variable names:



In [None]:
print(dataset.get_all_data(as_dict=True))

We can get these data sorted by category, either with a large array for each
category:



In [None]:
print(dataset.get_all_data(by_group=False))

or with a dictionary indexed by variable names:



In [None]:
print(dataset.get_all_data(by_group=False, as_dict=True))