# Built-in datasets

## Dataset factory

This module contains a factory to instantiate a `Dataset` from its class name.
The class can be internal to GEMSEO or located in an external module whose path
is provided to the constructor. The factory also provides a list of the available
dataset types and lets you test whether a given type is available.
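The name-based lookup described above can be illustrated with a minimal, self-contained sketch. It is not GEMSEO's implementation: the `Dataset` stand-in, the two subclasses, and the `DatasetFactory` methods `class_names`, `is_available` and `create` are all hypothetical names chosen for illustration, and the sketch only discovers classes that are already imported (it does not scan external module paths).

```python
class Dataset:
    """Stand-in base class (hypothetical, not GEMSEO's own Dataset)."""

    def __init__(self, name=""):
        self.name = name or type(self).__name__


class IrisDataset(Dataset):
    """Placeholder subclass used only to populate the factory."""


class RosenbrockDataset(Dataset):
    """Placeholder subclass used only to populate the factory."""


class DatasetFactory:
    """Map class names to Dataset subclasses and instantiate them on demand."""

    def __init__(self):
        # Walk the subclass tree so indirect subclasses are registered too.
        self._classes = {}
        stack = list(Dataset.__subclasses__())
        while stack:
            cls = stack.pop()
            self._classes[cls.__name__] = cls
            stack.extend(cls.__subclasses__())

    @property
    def class_names(self):
        """Sorted names of the dataset classes known to the factory."""
        return sorted(self._classes)

    def is_available(self, class_name):
        """Tell whether a dataset class with this name is registered."""
        return class_name in self._classes

    def create(self, class_name, **options):
        """Instantiate the dataset class registered under this name."""
        if not self.is_available(class_name):
            raise ValueError(f"Unknown dataset class: {class_name!r}")
        return self._classes[class_name](**options)


factory = DatasetFactory()
dataset = factory.create("RosenbrockDataset")
print(factory.class_names, dataset.name)
```

Looking classes up by name keeps calling code independent of where each class is defined, which is the point of routing instantiation through a factory.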

## Burgers dataset

This `Dataset` contains solutions to Burgers’ equation with
periodic boundary conditions on the interval \([0, 2\pi]\) for different
time steps:

\[u_t + u u_x = \nu u_{xx}.\]

An analytical expression for the solution can be obtained using the Cole-Hopf transform:

\[u(t, x) = -2 \nu \frac{\phi_x}{\phi},\]

where \(\phi\) is a solution to the heat equation \(\phi_t = \nu \phi_{xx}\).

This `Dataset` is based on a full-factorial
design of experiments. Each sample corresponds to a given time step \(t\),
while each feature corresponds to a given spatial point \(x\).
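As an illustration of this layout, the sketch below builds such a table in plain Python. It rests on assumptions that are not taken from GEMSEO: the heat-equation solution \(\phi(t, x) = a + e^{-\nu k^2 t} \cos(kx)\) with \(a > 1\), the viscosity value, the grid sizes, and all function names are chosen here only because they give a simple, \(2\pi\)-periodic example of the Cole-Hopf transform \(u = -2 \nu \phi_x / \phi\).

```python
import math


def heat_solution(t, x, nu, offset=2.0, k=1):
    """Return phi and phi_x for phi = offset + exp(-nu k^2 t) cos(k x).

    This phi satisfies phi_t = nu * phi_xx and stays positive for offset > 1.
    """
    decay = math.exp(-nu * k**2 * t)
    phi = offset + decay * math.cos(k * x)
    phi_x = -k * decay * math.sin(k * x)
    return phi, phi_x


def burgers_solution(t, x, nu=0.1):
    """Cole-Hopf transform u = -2 nu phi_x / phi of the heat solution above."""
    phi, phi_x = heat_solution(t, x, nu)
    return -2.0 * nu * phi_x / phi


def build_samples(n_t=5, n_x=8, t_final=2.0, nu=0.1):
    """Return the time steps, the spatial points and one row of u per time step."""
    times = [t_final * i / (n_t - 1) for i in range(n_t)]
    points = [2.0 * math.pi * j / (n_x - 1) for j in range(n_x)]
    # One sample (row) per time step, one feature (column) per spatial point.
    rows = [[burgers_solution(t, x, nu) for x in points] for t in times]
    return times, points, rows


times, points, rows = build_samples()
print(len(rows), "samples x", len(rows[0]), "features")
```

Each row holds the solution over the whole spatial grid at one time step, so the table has as many samples as time steps and as many features as spatial points.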

## Iris dataset

This `Dataset` is one of the best known in the machine learning literature.

It was introduced by the statistician Ronald Fisher in his 1936 paper “The use of multiple measurements in taxonomic problems”, Annals of Eugenics, 7 (2): 179–188.

It contains 150 instances of iris plants:

- 50 Iris Setosa,
- 50 Iris Versicolour,
- 50 Iris Virginica.

Each instance is characterized by:

- its sepal length in cm,
- its sepal width in cm,
- its petal length in cm,
- its petal width in cm.

This `Dataset` can be used for either clustering or classification.

## Rosenbrock dataset

This `Dataset` contains 100 evaluations
of the well-known Rosenbrock function:

\[f(x, y) = (1 - x)^2 + 100 (y - x^2)^2.\]

This function is known for its global minimum at the point \((1, 1)\), its banana-shaped valley, and the difficulty of reaching that minimum.

This `Dataset` is based on a full-factorial
design of experiments.
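A full-factorial sampling of this kind can be sketched in a few lines of Python. The sketch assumes the usual two-variable form of the Rosenbrock function, 10 levels per variable (giving 100 evaluations) and the domain \([-2, 2]^2\). The grid layout and the domain are illustrative choices, not values stated here, and the function names are hypothetical.

```python
from itertools import product


def rosenbrock(x, y):
    """Two-variable Rosenbrock function, equal to zero at (1, 1)."""
    return (1.0 - x) ** 2 + 100.0 * (y - x**2) ** 2


def full_factorial(lower, upper, n_levels):
    """Return every combination of n_levels evenly spaced values per variable."""
    step = (upper - lower) / (n_levels - 1)
    levels = [lower + step * i for i in range(n_levels)]
    return list(product(levels, repeat=2))


# Assumed setting: 10 levels per variable on [-2, 2], hence 10 * 10 = 100 points.
inputs = full_factorial(-2.0, 2.0, 10)
samples = [(x, y, rosenbrock(x, y)) for x, y in inputs]
print(len(samples))
```

Because the grid size grows as the number of levels raised to the number of variables, a full-factorial design is practical here only because the function has two inputs.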