iris module

Iris dataset.

This is one of the best-known datasets in the machine learning literature.

It was introduced by the statistician Ronald Fisher in his 1936 paper “The use of multiple measurements in taxonomic problems”, Annals of Eugenics, 7 (2): 179-188.

It contains 150 instances of iris plants:

  • 50 Iris Setosa,

  • 50 Iris Versicolour,

  • 50 Iris Virginica.

Each instance is characterized by:

  • its sepal length in cm,

  • its sepal width in cm,

  • its petal length in cm,

  • its petal width in cm.

This Dataset can be used for either clustering or classification.
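
As a rough illustration of the layout described above, the sketch below models one instance and the class balance in plain Python. The `IrisSample` class and its field names are hypothetical and are not part of GEMSEO; the single record shown is the first row of the commonly distributed version of the data and is there only to illustrate the shape of an instance.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class IrisSample:
    """One iris plant (hypothetical container); measurements are in cm."""

    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float
    species: str


# First row of the commonly distributed data, used only to show the layout.
first_sample = IrisSample(5.1, 3.5, 1.4, 0.2, "Iris Setosa")

# The dataset is balanced: 50 plants for each of the three species.
species_names = ["Iris Setosa", "Iris Versicolour", "Iris Virginica"]
labels = [name for name in species_names for _ in range(50)]
class_counts = Counter(labels)

print(len(labels))  # 150
print(class_counts)
```

For clustering, only the four measurements would be used; for classification, the species serves as the target.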

More information about the Iris dataset

gemseo.problems.dataset.iris.create_iris_dataset(as_io=False, as_numeric=True)

Iris dataset parametrization.

Parameters:
  • as_io (bool) –

    Whether to use Input/Output group names.

    By default it is set to False.

  • as_numeric (bool) –

    Whether to use numeric labels for the species rather than string ones.

    By default it is set to True.

Returns:

The Iris dataset.

Return type:

Dataset
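
To make the difference between string and numeric labels concrete, here is a minimal sketch in plain Python of how string labels could be turned into integer codes. It is not GEMSEO's implementation: the `encode_labels` helper is hypothetical, and the particular integer assigned to each species is an assumption of this sketch.

```python
def encode_labels(labels):
    """Map each distinct string label to an integer code.

    Codes follow the order of first appearance. This ordering is an
    assumption of the sketch, not documented GEMSEO behaviour.
    """
    codes = {}
    encoded = []
    for label in labels:
        if label not in codes:
            codes[label] = len(codes)
        encoded.append(codes[label])
    return encoded, codes


string_labels = ["Iris Setosa", "Iris Versicolour", "Iris Virginica", "Iris Setosa"]
numeric_labels, mapping = encode_labels(string_labels)
print(numeric_labels)  # [0, 1, 2, 0]
```

With the real function, the default as_numeric=True returns numeric labels, while as_numeric=False keeps the string ones.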