mse_measure module

The mean squared error to measure the quality of a regression algorithm.

The mse_measure module implements the concept of mean squared error measures for machine learning algorithms.

This concept is implemented through the MSEMeasure class, which overrides the MLErrorMeasure._compute_measure() method.

The mean squared error (MSE) is defined by

\[\operatorname{MSE}(\hat{y})=\frac{1}{n}\sum_{i=1}^n(\hat{y}_i-y_i)^2,\]

where \(\hat{y}\) are the predictions, \(y\) are the data points and \(n\) is the number of samples.
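As an illustration of the formula above, here is a minimal pure-Python sketch. The function name `mean_squared_error` is chosen for this example and is not part of the module.

```python
def mean_squared_error(predictions, observations):
    """Return the mean of the squared differences between two sequences."""
    if len(predictions) != len(observations):
        raise ValueError("predictions and observations must have the same length")
    n = len(observations)
    return sum((p - y) ** 2 for p, y in zip(predictions, observations)) / n


# The errors are 0.5, 0.0 and -1.0, so the squared errors are 0.25, 0.0 and 1.0.
mse = mean_squared_error([1.5, 2.0, 2.0], [1.0, 2.0, 3.0])
print(mse)  # (0.25 + 0.0 + 1.0) / 3
```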

Classes:

MSEMeasure(algo)

The Mean Squared Error measure for machine learning.

class gemseo.mlearning.qual_measure.mse_measure.MSEMeasure(algo)

Bases: gemseo.mlearning.qual_measure.error_measure.MLErrorMeasure

The Mean Squared Error measure for machine learning.

Attributes

algo (MLAlgo) – The machine learning algorithm.

Parameters

algo (MLRegressionAlgo) – A machine learning algorithm for regression.

Return type

None

Attributes:

`BOOTSTRAP`
`KFOLDS`
`LEARN`
`LOO`
`SMALLER_IS_BETTER`
`TEST`

Methods:

`evaluate`([method, samples])	Evaluate the quality measure.
`evaluate_bootstrap`([n_replicates, samples, …])	Evaluate the quality measure using the bootstrap technique.
`evaluate_kfolds`([n_folds, samples, multioutput])	Evaluate the quality measure using the k-folds technique.
`evaluate_learn`([samples, multioutput])	Evaluate the quality measure using the learning dataset.
`evaluate_loo`([samples, multioutput])	Evaluate the quality measure using the leave-one-out technique.
`evaluate_test`(test_data[, samples, multioutput])	Evaluate the quality measure using a test dataset.
`is_better`(val1, val2)	Compare the quality between two values.

BOOTSTRAP = 'bootstrap'

KFOLDS = 'kfolds'

LEARN = 'learn'

LOO = 'loo'

SMALLER_IS_BETTER = True

TEST = 'test'

evaluate(method='learn', samples=None, **options)

Evaluate the quality measure.

Parameters

method (str) – The name of the method to evaluate the quality measure.
samples (Optional[List[int]]) – The indices of the learning samples. If None, use the whole learning dataset.
**options (Optional[Union[List[int], bool, int, gemseo.core.dataset.Dataset]]) – The options of the estimation method (e.g. 'test_data' for the 'test' method, 'n_replicates' for the bootstrap one, ...).

Returns

The value of the quality measure.

Raises

ValueError – If the name of the method is unknown.

Return type

Union[float, numpy.ndarray]
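The dispatch described above can be pictured with the following sketch. It is a hypothetical stand-in written for this page, not the gemseo implementation: the two handler functions are stubs that only report how they were called, and only two of the method names are wired in.

```python
def evaluate_learn(samples=None, multioutput=True):
    """Stub handler (hypothetical): report the arguments it received."""
    return ("learn", samples, multioutput)


def evaluate_kfolds(n_folds=5, samples=None, multioutput=True):
    """Stub handler (hypothetical): report the arguments it received."""
    return ("kfolds", n_folds, samples, multioutput)


def evaluate(method="learn", samples=None, **options):
    """Forward the call to the handler registered under the method name."""
    handlers = {"learn": evaluate_learn, "kfolds": evaluate_kfolds}
    try:
        handler = handlers[method]
    except KeyError:
        raise ValueError(f"Unknown method: {method!r}") from None
    # The extra keyword options are passed through untouched.
    return handler(samples=samples, **options)


print(evaluate("kfolds", n_folds=3))
```

The point of the sketch is that the keyword options are specific to the chosen method, and that an unrecognized method name raises a ValueError.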

evaluate_bootstrap(n_replicates=100, samples=None, multioutput=True)

Evaluate the quality measure using the bootstrap technique.

Parameters

n_replicates (int) – The number of bootstrap replicates.
samples (Optional[List[int]]) – The indices of the learning samples. If None, use the whole learning dataset.
multioutput (bool) – If True, return the quality measure for each output component. Otherwise, average these measures.

Returns

The value of the quality measure.

Return type

Union[float, numpy.ndarray]
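For readers unfamiliar with the bootstrap technique, the sketch below shows one common variant: each replicate trains on samples drawn with replacement and is scored on the samples that were never drawn (the out-of-bag samples). This is an assumption about the general technique, not a description of what gemseo does internally. The names `bootstrap_mse`, `fit_constant` and `predict_constant` are hypothetical, and the constant model is only a toy.

```python
import random


def bootstrap_mse(xs, ys, fit, predict, n_replicates=100, seed=0):
    """Average the out-of-bag MSE over several bootstrap replicates."""
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(n_replicates):
        # Draw n training indices with replacement.
        train = [rng.randrange(n) for _ in range(n)]
        drawn = set(train)
        # The samples that were never drawn form the out-of-bag test set.
        test = [i for i in range(n) if i not in drawn]
        if not test:
            continue  # every sample was drawn, nothing to score
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors = [(predict(model, xs[i]) - ys[i]) ** 2 for i in test]
        scores.append(sum(errors) / len(errors))
    if not scores:
        raise ValueError("No replicate left any sample out of the bag.")
    return sum(scores) / len(scores)


def fit_constant(xs, ys):
    """Toy model: always predict the mean of the training outputs."""
    return sum(ys) / len(ys)


def predict_constant(model, x):
    return model


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
flat = bootstrap_mse(xs, [2.0] * 5, fit_constant, predict_constant)
varied = bootstrap_mse(xs, [0.0, 1.0, 2.0, 3.0, 4.0], fit_constant, predict_constant)
print(flat, varied)
```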

evaluate_kfolds(n_folds=5, samples=None, multioutput=True)

Evaluate the quality measure using the k-folds technique.

Parameters

n_folds (int) – The number of folds.
samples (Optional[List[int]]) – The indices of the learning samples. If None, use the whole learning dataset.
multioutput (bool) – If True, return the quality measure for each output component. Otherwise, average these measures.

Returns

The value of the quality measure.

Return type

Union[float, numpy.ndarray]
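The k-folds technique can be sketched as follows, under stated assumptions: the samples are split into `n_folds` groups without shuffling, each group is held out once while a model is trained on the rest, and the squared errors of all held-out predictions are pooled. The helper names and the one-dimensional least-squares line are illustrative choices, not the gemseo implementation, which may split and aggregate differently.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept


def kfolds_mse(xs, ys, n_folds=5):
    """Pool the squared errors of the held-out predictions over all folds."""
    n = len(xs)
    squared_errors = []
    for fold in range(n_folds):
        # Sample i belongs to fold i % n_folds (deterministic, no shuffling).
        test = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared_errors += [(predict_line(model, xs[i]) - ys[i]) ** 2 for i in test]
    return sum(squared_errors) / len(squared_errors)


xs = [float(i) for i in range(10)]
exact = [2.0 * x + 1.0 for x in xs]
noisy = [y + (0.5 if i % 2 else -0.5) for i, y in enumerate(exact)]
print(kfolds_mse(xs, exact), kfolds_mse(xs, noisy))
```

On the exactly linear data the held-out error is zero up to rounding, while the noisy data gives a strictly positive estimate.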

evaluate_learn(samples=None, multioutput=True)

Evaluate the quality measure using the learning dataset.

Parameters

samples (Optional[List[int]]) – The indices of the learning samples. If None, use the whole learning dataset.
multioutput (bool) – If True, return the quality measure for each output component. Otherwise, average these measures.

Returns

The value of the quality measure.

Return type

Union[float, numpy.ndarray]
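The effect of the `multioutput` argument can be illustrated with a short sketch. It assumes that each sample is a list with one value per output component. The function names are hypothetical and the code is not taken from gemseo.

```python
def mse_per_output(predictions, observations):
    """Return one MSE per output component (column)."""
    n = len(observations)
    n_outputs = len(observations[0])
    return [
        sum((p[j] - y[j]) ** 2 for p, y in zip(predictions, observations)) / n
        for j in range(n_outputs)
    ]


def mse(predictions, observations, multioutput=True):
    """Return the per-output MSEs, or their average if multioutput is False."""
    per_output = mse_per_output(predictions, observations)
    if multioutput:
        return per_output
    return sum(per_output) / len(per_output)


predictions = [[1.0, 0.0], [2.0, 2.0]]
observations = [[1.0, 1.0], [2.0, 0.0]]
# First output: both errors are 0. Second output: errors are -1 and 2.
print(mse(predictions, observations))  # per-output values
print(mse(predictions, observations, multioutput=False))  # their average
```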

evaluate_loo(samples=None, multioutput=True)

Evaluate the quality measure using the leave-one-out technique.

Parameters

samples (Optional[List[int]]) – The indices of the learning samples. If None, use the whole learning dataset.
multioutput (bool) – If True, return the quality measure for each output component. Otherwise, average these measures.

Returns

The value of the quality measure.

Return type

Union[float, numpy.ndarray]
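Leave-one-out is the k-folds technique with as many folds as samples: each sample is predicted by a model trained on all the others. The sketch below uses a toy model that predicts the mean of the training outputs. The names are hypothetical and the code illustrates the general technique rather than the gemseo implementation.

```python
def loo_mse(ys):
    """Leave-one-out MSE of a model predicting the mean of the training outputs."""
    n = len(ys)
    total = 0.0
    for left_out in range(n):
        # Train on every output except the one left out.
        train = ys[:left_out] + ys[left_out + 1:]
        prediction = sum(train) / len(train)
        total += (prediction - ys[left_out]) ** 2
    return total / n


# Held-out predictions are 2.5, 2.0 and 1.5, so the squared errors are
# 2.25, 0.0 and 2.25, which average to 1.5.
loo = loo_mse([1.0, 2.0, 3.0])
print(loo)
```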

evaluate_test(test_data, samples=None, multioutput=True)

Evaluate the quality measure using a test dataset.

Parameters

test_data (gemseo.core.dataset.Dataset) – The test dataset.
samples (Optional[List[int]]) – The indices of the learning samples. If None, use the whole learning dataset.
multioutput (bool) – If True, return the quality measure for each output component. Otherwise, average these measures.

Returns

The value of the quality measure.

Return type

Union[float, numpy.ndarray]

classmethod is_better(val1, val2)

Compare the quality between two values.

This method returns True if the first value is better than the second one.

For most measures, such as the MSE, a smaller value is better than a larger one. For some, such as an R2 measure, a higher value is better. This comparison method handles both cases, regardless of the type of measure.

Parameters

val1 (float) – The value of the first quality measure.
val2 (float) – The value of the second quality measure.

Returns

Whether val1 is of better quality than val2.

Return type

bool
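One way to picture this comparison is a class attribute that flips the direction of the test, in the spirit of the SMALLER_IS_BETTER attribute listed above. The classes below are hypothetical stand-ins written for this example. Whether gemseo implements the comparison exactly this way is an assumption.

```python
class ToyMeasure:
    """Hypothetical base class: smaller values are better by default."""

    SMALLER_IS_BETTER = True

    @classmethod
    def is_better(cls, val1, val2):
        """Return whether val1 is of better quality than val2."""
        if cls.SMALLER_IS_BETTER:
            return val1 < val2
        return val1 > val2


class ToyMSE(ToyMeasure):
    """An error measure: keeps the default direction."""


class ToyR2(ToyMeasure):
    """A score where higher is better: flips the direction."""

    SMALLER_IS_BETTER = False


print(ToyMSE.is_better(0.1, 0.5))  # a smaller error is better
print(ToyR2.is_better(0.9, 0.5))  # a larger score is better
```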