rmse_measure module

The root mean squared error to measure the quality of a regression algorithm.

The rmse_measure module implements the concept of root mean squared error measures for machine learning algorithms.

This concept is implemented through the RMSEMeasure class, which overrides the MSEMeasure.evaluate_*() methods.

The root mean squared error (RMSE) is defined by

\[\operatorname{RMSE}(\hat{y})=\sqrt{\frac{1}{n}\sum_{i=1}^n(\hat{y}_i-y_i)^2},\]

where \(\hat{y}_i\) are the predictions, \(y_i\) are the corresponding data points and \(n\) is the number of samples.
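
As an illustration of the formula above, the following minimal sketch computes the RMSE in plain Python, independently of GEMSEO. The function name `rmse` and the sample values are assumptions made for this example only; they are not part of the library.

```python
import math


def rmse(predictions, observations):
    """Return the root mean squared error between two equal-length sequences."""
    if len(predictions) != len(observations):
        raise ValueError("predictions and observations must have the same length")
    n = len(observations)
    # Squared difference between each prediction and its data point.
    squared_errors = [(p - o) ** 2 for p, o in zip(predictions, observations)]
    return math.sqrt(sum(squared_errors) / n)


# The errors are 0, 3 and 4, so the RMSE is sqrt((0 + 9 + 16) / 3).
value = rmse([1.0, 5.0, 7.0], [1.0, 2.0, 3.0])
```

Because of the square root, the result is expressed in the same unit as the output, unlike the mean squared error.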

class gemseo.mlearning.quality_measures.rmse_measure.RMSEMeasure(algo, fit_transformers=True)

Bases: MSEMeasure

The root mean squared error measure for machine learning.

Parameters:
  • algo (MLRegressionAlgo) – A machine learning algorithm for regression.

  • fit_transformers (bool) –

    Whether to re-fit the transformers when using resampling techniques. If False, use the transformers of the algorithm fitted from the whole learning dataset.

    By default it is set to True.

evaluate_bootstrap(n_replicates=100, samples=None, multioutput=True, seed=None, as_dict=False)

Evaluate the quality measure using the bootstrap technique.

Parameters:
  • n_replicates (int) –

    The number of bootstrap replicates.

    By default it is set to 100.

  • samples (Sequence[int] | None) – The indices of the learning samples. If None, use the whole learning dataset.

  • multioutput (bool) –

    If True, return the quality measure for each output component. Otherwise, average these measures.

    By default it is set to True.

  • seed (int | None) – The seed of the pseudo-random number generator. If None, then an unpredictable generator will be used.

  • as_dict (bool) –

    Whether to express the measure as a dictionary whose keys are the output names.

    By default it is set to False.

Returns:

The value of the quality measure.

Return type:

MeasureType
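
The bootstrap technique can be sketched in plain Python as follows. This is not GEMSEO's implementation: it assumes the common out-of-bag variant, in which each replicate learns from indices drawn with replacement and is tested on the samples that were never drawn. The function `bootstrap_rmse` and the constant (mean) predictor standing in for a regression algorithm are hypothetical.

```python
import math
import random


def bootstrap_rmse(y, n_replicates=100, seed=None):
    """Estimate the RMSE of a constant (mean) predictor by out-of-bag bootstrap."""
    rng = random.Random(seed)  # seed=None gives an unpredictable generator
    n = len(y)
    mse_values = []
    for _ in range(n_replicates):
        # Draw n learning indices with replacement.
        learn = [rng.randrange(n) for _ in range(n)]
        drawn = set(learn)
        # Samples that were never drawn form the out-of-bag test set.
        test = [i for i in range(n) if i not in drawn]
        if not test:
            continue  # Rare case: every sample was drawn, nothing to test on.
        # Hypothetical stand-in model: predict the mean of the learning outputs.
        prediction = sum(y[i] for i in learn) / n
        mse_values.append(sum((y[i] - prediction) ** 2 for i in test) / len(test))
    if not mse_values:
        raise ValueError("no replicate produced an out-of-bag sample")
    # Average the squared errors over the replicates, then take the root.
    return math.sqrt(sum(mse_values) / len(mse_values))


y = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
estimate = bootstrap_rmse(y, n_replicates=50, seed=1)
```

Passing the same integer `seed` twice reproduces the same estimate, which mirrors the role of the `seed` argument described above.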

evaluate_kfolds(n_folds=5, samples=None, multioutput=True, randomize=True, seed=None, as_dict=False)

Evaluate the quality measure using the k-folds technique.

Parameters:
  • n_folds (int) –

    The number of folds.

    By default it is set to 5.

  • samples (Sequence[int] | None) – The indices of the learning samples. If None, use the whole learning dataset.

  • multioutput (bool) –

    If True, return the quality measure for each output component. Otherwise, average these measures.

    By default it is set to True.

  • randomize (bool) –

    Whether to shuffle the samples before dividing them in folds.

    By default it is set to True.

  • seed (int | None) – The seed of the pseudo-random number generator. If None, then an unpredictable generator will be used.

  • as_dict (bool) –

    Whether to express the measure as a dictionary whose keys are the output names.

    By default it is set to False.

Returns:

The value of the quality measure.

Return type:

MeasureType
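
A minimal sketch of the k-folds technique, again in plain Python and independent of GEMSEO, may help to picture what `n_folds`, `randomize` and `seed` control. The function `kfolds_rmse`, the round-robin split and the constant (mean) predictor are assumptions made for illustration; the library may split the samples and aggregate the errors differently.

```python
import math
import random


def kfolds_rmse(y, n_folds=5, randomize=True, seed=None):
    """Estimate the RMSE of a constant (mean) predictor by k-fold cross-validation."""
    if not 2 <= n_folds <= len(y):
        raise ValueError("n_folds must lie between 2 and the number of samples")
    indices = list(range(len(y)))
    if randomize:
        # Shuffle the samples before dividing them into folds.
        random.Random(seed).shuffle(indices)
    # Deal the indices into n_folds disjoint folds (round-robin split).
    folds = [indices[k::n_folds] for k in range(n_folds)]
    squared_error_sum = 0.0
    for fold in folds:
        held_out = set(fold)
        learn = [i for i in indices if i not in held_out]
        # Hypothetical stand-in model: predict the mean of the learning outputs.
        prediction = sum(y[i] for i in learn) / len(learn)
        squared_error_sum += sum((y[i] - prediction) ** 2 for i in fold)
    # Every sample is tested exactly once, so divide by the sample count.
    return math.sqrt(squared_error_sum / len(y))


# Without shuffling, the folds are [0, 2] and [1, 3].
value = kfolds_rmse([1.0, 2.0, 3.0, 4.0], n_folds=2, randomize=False)
```

In this sketch each sample is used for testing exactly once, so the squared errors are pooled over all folds before taking the root.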

evaluate_learn(samples=None, multioutput=True, as_dict=False)

Evaluate the quality measure from the learning dataset.

Parameters:
  • samples (Sequence[int] | None) – The indices of the learning samples. If None, use the whole learning dataset.

  • multioutput (bool) –

    If True, return the quality measure for each output component. Otherwise, average these measures.

    By default it is set to True.

  • as_dict (bool) –

    Whether to express the measure as a dictionary whose keys are the output names.

    By default it is set to False.

Returns:

The value of the quality measure.

Return type:

MeasureType
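
The effect of `multioutput` and `as_dict` on the returned value can be illustrated with the sketch below. It is a simplified stand-in written in plain Python, not the library's code: the function `rmse_per_output` and the output names `y_1` and `y_2` are hypothetical, each output is assumed to have a single component, and the averaging is assumed to be a plain arithmetic mean of the per-output values.

```python
import math


def rmse_per_output(predictions, observations, output_names,
                    multioutput=True, as_dict=False):
    """Compute the RMSE of each output column, optionally averaged or keyed by name."""
    n_samples = len(observations)
    values = []
    for j in range(len(output_names)):
        # Mean squared error of the j-th output over all samples.
        mse = sum(
            (predictions[i][j] - observations[i][j]) ** 2 for i in range(n_samples)
        ) / n_samples
        values.append(math.sqrt(mse))
    if not multioutput:
        # Assumption: a plain arithmetic mean of the per-output values.
        return sum(values) / len(values)
    if as_dict:
        # One entry per output name.
        return dict(zip(output_names, values))
    return values


predictions = [[1.0, 5.0], [3.0, 7.0]]   # rows are samples, columns are outputs
observations = [[1.0, 2.0], [3.0, 4.0]]
names = ["y_1", "y_2"]                   # hypothetical output names

per_output = rmse_per_output(predictions, observations, names)
by_name = rmse_per_output(predictions, observations, names, as_dict=True)
averaged = rmse_per_output(predictions, observations, names, multioutput=False)
```

Here the first output is predicted exactly while the second is off by 3 for every sample, so the per-output view exposes a discrepancy that the averaged value hides.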

evaluate_test(test_data, samples=None, multioutput=True, as_dict=False)

Evaluate the quality measure using a test dataset.

Parameters:
  • test_data (IODataset) – The test dataset.

  • samples (Sequence[int] | None) – The indices of the learning samples. If None, use the whole learning dataset.

  • multioutput (bool) –

    If True, return the quality measure for each output component. Otherwise, average these measures.

    By default it is set to True.

  • as_dict (bool) –

    Whether to express the measure as a dictionary whose keys are the output names.

    By default it is set to False.

Returns:

The value of the quality measure.

Return type:

MeasureType
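
Evaluating on a test dataset amounts to fitting the model on learning samples and measuring the error on data it has not seen. The sketch below shows this idea in plain Python under stated assumptions: the functions `fit_line` and `holdout_rmse` are hypothetical, and a one-dimensional least-squares line stands in for the regression algorithm.

```python
import math


def fit_line(x, y):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    variance = sum((xi - x_mean) ** 2 for xi in x)
    covariance = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = covariance / variance
    return slope, y_mean - slope * x_mean


def holdout_rmse(x_learn, y_learn, x_test, y_test, samples=None):
    """Fit on the selected learning samples and return the RMSE on the test data."""
    if samples is None:
        # Use the whole learning dataset.
        samples = range(len(x_learn))
    slope, intercept = fit_line(
        [x_learn[i] for i in samples], [y_learn[i] for i in samples]
    )
    squared_errors = [
        (slope * xt + intercept - yt) ** 2 for xt, yt in zip(x_test, y_test)
    ]
    return math.sqrt(sum(squared_errors) / len(y_test))


# The learning data follow y = 2x + 1; the second test point deviates by 1.
x_learn, y_learn = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
x_test, y_test = [4.0, 5.0], [9.0, 12.0]
value = holdout_rmse(x_learn, y_learn, x_test, y_test)
```

The test errors are 0 and 1, so the resulting value is the square root of 0.5.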

algo: MLAlgo

The machine learning algorithm, usually already trained.