gemseo.mlearning.classification.algos.base_classifier module

The base class for classification algorithms.

class BaseClassifier(data, settings_model=None, **settings)

Bases: BaseMLSupervisedAlgo

The base class for classification algorithms.

Parameters:
  • data (Dataset) -- The learning dataset.

  • settings_model (BaseMLAlgoSettings | None) -- The machine learning algorithm settings as a Pydantic model. If None, use **settings.

  • **settings (Any) -- The machine learning algorithm settings. These arguments are ignored when settings_model is not None.

Raises:

ValueError -- When both the variable and the group it belongs to have a transformer.

Settings

alias of BaseClassifierSettings

predict_proba(input_data, hard=True)

Predict the probability of belonging to each class from input data.

The input data can be passed either as a NumPy array, e.g. array([1., 2., 3.]), or as a dictionary of NumPy arrays indexed by variable name, e.g. {'a': array([1.]), 'b': array([2., 3.])}.

If the NumPy arrays are two-dimensional, their i-th row holds the input data of the i-th sample; if they are one-dimensional, they describe a single sample.

The type of the output data and the dimension of the output arrays will be consistent with the type of the input data and the size of the input arrays.
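The input conventions described above can be sketched with plain NumPy. This is only an illustration, not GEMSEO code: the helper as_samples is hypothetical, and the concatenation order of the dictionary entries is an assumption made for the example.

```python
import numpy as np


def as_samples(input_data):
    """Return the data as a 2D (n_samples, n_features) array.

    Hypothetical helper, not part of GEMSEO. The second returned value
    tells whether the input described a single sample (1D array).
    """
    array = np.asarray(input_data, dtype=float)
    is_single_sample = array.ndim == 1
    return np.atleast_2d(array), is_single_sample


# A 1D array is a single sample with three input components.
one_sample = np.array([1.0, 2.0, 3.0])
samples_1, is_single_1 = as_samples(one_sample)

# A 2D array holds one sample per row.
two_samples = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
samples_2, is_single_2 = as_samples(two_samples)

# The dictionary form splits the same sample by variable name.
# Assumption: the variables are concatenated in the order "a", "b".
named_sample = {"a": np.array([1.0]), "b": np.array([2.0, 3.0])}
flat_sample = np.concatenate([named_sample[name] for name in ("a", "b")])
```

Here one_sample and named_sample describe the same single sample, while two_samples describes two samples of the same three input components.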

Parameters:
  • input_data (DataType) -- The input data.

  • hard (bool) --

    Whether classification should be hard (True) or soft (False).

    By default it is set to True.

Returns:

The probability of belonging to each class.

Return type:

DataType
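The difference between hard and soft probabilities can be illustrated with a minimal NumPy sketch. It assumes that hard classification assigns probability 1 to the most likely class and 0 to the others; the helper harden and the probability values are hypothetical, and the exact layout of the arrays returned by predict_proba is not reproduced here.

```python
import numpy as np


def harden(soft_probabilities):
    """Turn soft class probabilities into 0/1 indicators.

    Hypothetical helper, not part of GEMSEO. Each row is a sample,
    each column a class; the most likely class gets probability 1.
    """
    soft = np.atleast_2d(np.asarray(soft_probabilities, dtype=float))
    hard = np.zeros_like(soft)
    # Put a 1 in the column of the most likely class of each sample.
    hard[np.arange(soft.shape[0]), soft.argmax(axis=1)] = 1.0
    return hard


# Illustrative soft probabilities for two samples and three classes.
soft = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
hard = harden(soft)
```

In this sketch, the first sample is assigned entirely to the first class and the second sample to the third class, whereas the soft version keeps the graded membership of each sample.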

n_classes: int

The number of classes computed when calling learn().