r2_measure module
The R2 measure to assess the quality of a regression algorithm.
The r2_measure module implements the concept of R2 measures for machine learning algorithms.
This concept is implemented through the R2Measure class, which overrides the MLErrorMeasure._compute_measure() method.
The R2 is defined by
\[R_2 = 1 - \frac{\sum_i (\hat{y}_i - y_i)^2}{\sum_i (y_i - \bar{y})^2},\]
where \(\hat{y}\) are the predictions, \(y\) are the data points and \(\bar{y}\) is the mean of \(y\).
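The definition above can be checked numerically. The following is a minimal NumPy sketch of the formula only, not GEMSEO's implementation; the function name `r2_score` and the sample values are illustrative assumptions.

```python
import numpy as np


def r2_score(y, y_hat):
    """Return 1 - SS_res / SS_tot for 1-D observations y and predictions y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y_hat - y) ** 2)  # squared prediction errors
    ss_tot = np.sum((y - y.mean()) ** 2)  # squared deviations from the mean
    return 1.0 - ss_res / ss_tot


# Illustrative data (assumption, not taken from GEMSEO).
y = np.array([1.0, 2.0, 3.0, 4.0])
perfect = r2_score(y, y)  # predictions equal to the data
mean_only = r2_score(y, np.full_like(y, y.mean()))  # always predict the mean
one_off = r2_score(y, np.array([1.0, 2.0, 3.0, 5.0]))  # one point off by 1
```

A perfect model reaches 1, a model that always predicts the mean reaches 0, and a model worse than the mean yields a negative value.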
- class gemseo.mlearning.quality_measures.r2_measure.R2Measure(algo, fit_transformers=True)[source]
Bases: MLErrorMeasure
The R2 measure for machine learning.
- Parameters:
algo (MLRegressionAlgo) – A machine learning algorithm for regression.
fit_transformers (bool) – Whether to re-fit the transformers when using resampling techniques. If False, use the transformers of the algorithm fitted from the whole learning dataset. By default it is set to True.
- compute_bootstrap_measure(n_replicates=100, samples=None, multioutput=True, seed=None, as_dict=False)[source]
Evaluate the quality measure using the bootstrap technique.
- Parameters:
n_replicates (int) – The number of bootstrap replicates. By default it is set to 100.
samples (list[int] | None) – The indices of the learning samples. If None, use the whole learning dataset.
multioutput (bool) – Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure is returned. By default it is set to True.
seed (int | None) – The seed of the pseudo-random number generator. If None, an unpredictable generator will be used.
as_dict (bool) – Whether the full quality measure is returned as a mapping from algo.output_names to quality measures. Otherwise, the full quality measure is returned as an array stacking these quality measures according to the order of algo.output_names. By default it is set to False.
- Returns:
The value of the quality measure.
- Return type:
NoReturn
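The effect of the multioutput and as_dict options can be pictured with plain NumPy. This is a hedged sketch of the idea only: the helper `r2_per_output`, the output names "lift" and "drag" and the data are hypothetical, and the exact shapes returned by GEMSEO are not reproduced here.

```python
import numpy as np


def r2_per_output(y, y_hat):
    """Return one R2 value per output column of 2-D arrays (samples x outputs)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y_hat - y) ** 2, axis=0)
    ss_tot = np.sum((y - y.mean(axis=0)) ** 2, axis=0)
    return 1.0 - ss_res / ss_tot


# Hypothetical output names and data, for illustration only.
output_names = ["lift", "drag"]
y = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
y_hat = y.copy()
y_hat[3, 0] = 5.0  # degrade the prediction of the first output only

per_output = r2_per_output(y, y_hat)  # analogous to multioutput=True
averaged = per_output.mean()  # analogous to multioutput=False
as_mapping = dict(zip(output_names, per_output))  # analogous to as_dict=True
```

Keeping one value per output makes it visible that only the first output is poorly predicted, which the average alone would hide.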
- compute_cross_validation_measure(n_folds=5, samples=None, multioutput=True, randomize=True, seed=None, as_dict=False)[source]
- Parameters:
n_folds (int) – The number of folds. By default it is set to 5.
samples (list[int] | None) – The indices of the learning samples. If None, use the whole learning dataset.
multioutput (bool) – Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure is returned. By default it is set to True.
randomize (bool) – Whether to shuffle the samples before dividing them into folds. By default it is set to True.
seed (int | None) – The seed of the pseudo-random number generator. If None, an unpredictable generator is used.
as_dict (bool) – Whether the full quality measure is returned as a mapping from algo.output_names to quality measures. Otherwise, the full quality measure is returned as an array stacking these quality measures according to the order of algo.output_names. By default it is set to False.
- Returns:
The value of the quality measure.
- Return type:
MeasureType
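The roles of n_folds, randomize and seed can be illustrated with a small k-fold loop. The sketch below is one common way to cross-validate an R2 (pooling the out-of-fold predictions before applying the formula) and is an assumption, not necessarily what GEMSEO does internally; the straight-line fit stands in for any regression algorithm, and `kfold_r2` is a hypothetical name.

```python
import numpy as np


def kfold_r2(x, y, n_folds=5, randomize=True, seed=None):
    """Cross-validated R2 of a straight-line fit, pooling out-of-fold predictions."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    indices = np.arange(len(y))
    if randomize:
        # Shuffle the sample indices before splitting them into folds.
        indices = np.random.default_rng(seed).permutation(indices)
    y_hat = np.empty_like(y)
    for fold in np.array_split(indices, n_folds):
        train = np.setdiff1d(indices, fold)  # all samples outside the fold
        slope, intercept = np.polyfit(x[train], y[train], deg=1)
        y_hat[fold] = slope * x[fold] + intercept  # predict the held-out samples
    ss_res = np.sum((y_hat - y) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Illustrative noise-free linear data (assumption).
x = np.linspace(0.0, 1.0, 20)
y = 3.0 * x + 1.0
score = kfold_r2(x, y, n_folds=5, seed=0)
```

Because each prediction is made by a model that never saw the corresponding sample, the resulting value estimates generalization quality rather than goodness of fit on the learning data.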
- evaluate_bootstrap(n_replicates=100, samples=None, multioutput=True, seed=None, as_dict=False)
Evaluate the quality measure using the bootstrap technique.
- Parameters:
n_replicates (int) – The number of bootstrap replicates. By default it is set to 100.
samples (list[int] | None) – The indices of the learning samples. If None, use the whole learning dataset.
multioutput (bool) – Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure is returned. By default it is set to True.
seed (int | None) – The seed of the pseudo-random number generator. If None, an unpredictable generator will be used.
as_dict (bool) – Whether the full quality measure is returned as a mapping from algo.output_names to quality measures. Otherwise, the full quality measure is returned as an array stacking these quality measures according to the order of algo.output_names. By default it is set to False.
- Returns:
The value of the quality measure.
- Return type:
NoReturn
- evaluate_kfolds(n_folds=5, samples=None, multioutput=True, randomize=True, seed=None, as_dict=False)
- Parameters:
n_folds (int) – The number of folds. By default it is set to 5.
samples (list[int] | None) – The indices of the learning samples. If None, use the whole learning dataset.
multioutput (bool) – Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure is returned. By default it is set to True.
randomize (bool) – Whether to shuffle the samples before dividing them into folds. By default it is set to True.
seed (int | None) – The seed of the pseudo-random number generator. If None, an unpredictable generator is used.
as_dict (bool) – Whether the full quality measure is returned as a mapping from algo.output_names to quality measures. Otherwise, the full quality measure is returned as an array stacking these quality measures according to the order of algo.output_names. By default it is set to False.
- Returns:
The value of the quality measure.
- Return type:
MeasureType
- SMALLER_IS_BETTER: ClassVar[bool] = False
Whether to minimize or maximize the measure.
- algo: MLAlgo
The machine learning algorithm usually trained.
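Since SMALLER_IS_BETTER is False for the R2, comparing candidate models means keeping the largest value. A minimal sketch of how such a flag might drive a selection follows; the helper `select_best`, the model names and the scores are made-up illustrations, not GEMSEO code or results.

```python
def select_best(scores, smaller_is_better):
    """Return the key of the best score, minimizing or maximizing as requested."""
    chooser = min if smaller_is_better else max
    return chooser(scores, key=scores.get)


# Made-up R2 values for three hypothetical models.
r2_scores = {"linear": 0.72, "rbf": 0.91, "polynomial": 0.85}
best = select_best(r2_scores, smaller_is_better=False)
```

The same helper would pick the smallest value for an error measure such as a mean squared error, where smaller is better.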