Measure the quality of a machine learning algorithm#
Introduction#
Measuring the quality of a machine learning algorithm.
- class BaseMLAlgoQuality(algo, fit_transformers=True)[source]
The base class to assess the quality of a machine learning algorithm.
This measure can be minimized (e.g. MSEMeasure) or maximized (e.g. R2Measure).
It can be evaluated from the training dataset, from a test dataset or using resampling techniques such as bootstrap, cross-validation or leave-one-out.
The machine learning algorithm is usually trained. If it is not trained but the evaluation technique requires it, the quality measure will train it.
Lastly, the transformers of the algorithm fitted from the training dataset can be used as is by the resampling methods or re-fitted for each algorithm trained on a subset of the training dataset.
- Parameters:
algo (BaseMLAlgo) -- A machine learning algorithm.
fit_transformers (bool) --
Whether to re-fit the transformers when using resampling techniques. If False, use the transformers of the algorithm fitted from the whole training dataset.
By default it is set to True.
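The difference between the two settings of fit_transformers can be pictured with a toy example. This is only a sketch under assumptions: `fit_scaler` and `transform` are hypothetical helpers standing in for a transformer (here a min-max scaler), not functions of the library.

```python
def fit_scaler(values):
    """Return (offset, scale) of a toy min-max scaler fitted on values."""
    low, high = min(values), max(values)
    # Avoid a zero scale when all the values are equal.
    return low, (high - low) or 1.0


def transform(value, scaler):
    """Apply a fitted toy scaler to a value."""
    offset, scale = scaler
    return (value - offset) / scale


training_inputs = [0.0, 1.0, 2.0, 10.0]
# A subset of the training samples, as used by one resampling step.
subset = training_inputs[:3]

# Analogue of fit_transformers=False: reuse the scaler of the whole dataset.
global_scaler = fit_scaler(training_inputs)
# Analogue of fit_transformers=True: re-fit the scaler on the subset only.
local_scaler = fit_scaler(subset)

scaled_with_global = [transform(x, global_scaler) for x in subset]
scaled_with_local = [transform(x, local_scaler) for x in subset]
```

With the re-fitted scaler the subset spans the whole unit interval, whereas the global scaler keeps it compressed because the sample at 10.0 was part of the original fit.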
- class EvaluationFunctionName(*values)[source]
The name of the function associated with an evaluation method.
- class EvaluationMethod(*values)[source]
The evaluation method.
- BOOTSTRAP = 'BOOTSTRAP'
The name of the method to evaluate the measure by bootstrap.
- KFOLDS = 'KFOLDS'
The name of the method to evaluate the measure by cross-validation.
- LEARN = 'LEARN'
The name of the method to evaluate the measure on the training dataset.
- LOO = 'LOO'
The name of the method to evaluate the measure by leave-one-out.
- TEST = 'TEST'
The name of the method to evaluate the measure on a test dataset.
- classmethod is_better(val1, val2)[source]
Compare the quality between two values.
This method returns True if the first one is better than the second one.
For most measures, a smaller value is "better" than a larger one (MSE etc.). But for some, like an R2-measure, higher values are better than smaller ones. This comparison method correctly handles this, regardless of the type of measure.
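The comparison rule described here can be sketched as follows. This is a minimal illustration rather than the library's implementation: `ToyQuality`, `ToyMSE` and `ToyR2` are hypothetical classes that only mimic the SMALLER_IS_BETTER flag.

```python
class ToyQuality:
    """Hypothetical stand-in mimicking the SMALLER_IS_BETTER convention."""

    SMALLER_IS_BETTER = True

    @classmethod
    def is_better(cls, val1, val2):
        """Return True if val1 is better than val2 for this measure."""
        if cls.SMALLER_IS_BETTER:
            return val1 < val2
        return val1 > val2


class ToyMSE(ToyQuality):
    """An error-like measure: smaller is better (inherited default)."""


class ToyR2(ToyQuality):
    """A score-like measure: larger is better."""

    SMALLER_IS_BETTER = False


# The same pair of values is ranked differently by the two measures.
mse_first_is_better = ToyMSE.is_better(0.1, 0.2)
r2_first_is_better = ToyR2.is_better(0.1, 0.2)
```

Keeping the direction in a class attribute lets calling code compare two values without knowing which kind of measure it is handling.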
- abstractmethod compute_bootstrap_measure(n_replicates=100, samples=(), multioutput=True, seed=None, store_resampling_result=False)[source]
Evaluate the quality of the ML model using the bootstrap technique.
- Parameters:
n_replicates (int) --
The number of bootstrap replicates.
By default it is set to 100.
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
seed (int | None) -- The seed of the pseudo-random number generator. If None, an unpredictable generator will be used.
store_resampling_result (bool) --
Whether to store the \(n\) machine learning algorithms and associated predictions generated by the resampling stage where \(n\) is the number of bootstrap replicates.
By default it is set to False.
- Returns:
The quality of the ML model.
- Return type:
float | ndarray[tuple[Any, ...], dtype[floating[Any]]] | dict[str, ndarray[tuple[Any, ...], dtype[floating[Any]]]]
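As an illustration of the bootstrap technique, the sketch below uses one common variant: each replicate trains on samples drawn with replacement and is scored on the samples that were not drawn (the out-of-bag samples). It assumes a toy one-dimensional least-squares model; `fit_line` and `bootstrap_mse` are hypothetical helpers, and the library may differ in the details.

```python
import random


def fit_line(xs, ys):
    """Fit y = a * x + b by ordinary least squares (toy 1D model)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate resample (all x equal): fall back to a constant model.
        return 0.0, y_mean
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, y_mean - a * x_mean


def bootstrap_mse(xs, ys, n_replicates=100, seed=None):
    """Average the out-of-bag MSE over bootstrap replicates."""
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(n_replicates):
        # Draw n training indices with replacement.
        train = [rng.randrange(n) for _ in range(n)]
        # Out-of-bag samples: those never drawn in this replicate.
        drawn = set(train)
        test = [i for i in range(n) if i not in drawn]
        if not test:
            continue  # Rare: every sample was drawn; skip this replicate.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [(a * xs[i] + b - ys[i]) ** 2 for i in test]
        scores.append(sum(errors) / len(errors))
    return sum(scores) / len(scores)


xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]  # Noise-free linear data.
bootstrap_score = bootstrap_mse(xs, ys, n_replicates=50, seed=0)
```

Passing a seed makes the resampling reproducible, which is the role the seed argument plays in the signature above.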
- abstractmethod compute_cross_validation_measure(n_folds=5, samples=(), multioutput=True, randomize=True, seed=None, store_resampling_result=False)[source]
Evaluate the quality of the ML model using the k-folds technique.
- Parameters:
n_folds (int) --
The number of folds.
By default it is set to 5.
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
randomize (bool) --
Whether to shuffle the samples before dividing them in folds.
By default it is set to True.
seed (int | None) -- The seed of the pseudo-random number generator. If None, an unpredictable generator is used.
store_resampling_result (bool) --
Whether to store the \(n\) machine learning algorithms and associated predictions generated by the resampling stage where \(n\) is the number of folds.
By default it is set to False.
- Returns:
The quality of the ML model.
- Return type:
float | ndarray[tuple[Any, ...], dtype[floating[Any]]] | dict[str, ndarray[tuple[Any, ...], dtype[floating[Any]]]]
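The k-folds technique can be sketched as below, assuming a toy model that predicts the mean of its training outputs. `make_folds` and `cross_validation_mse` are hypothetical helpers; in particular, pooling the squared errors of all folds before averaging is one possible convention, not necessarily the library's.

```python
import random


def make_folds(n_samples, n_folds=5, randomize=True, seed=None):
    """Split the indices 0..n_samples-1 into n_folds nearly equal folds."""
    indices = list(range(n_samples))
    if randomize:
        random.Random(seed).shuffle(indices)
    # Fold k takes every n_folds-th index, starting at position k.
    return [indices[k::n_folds] for k in range(n_folds)]


def cross_validation_mse(ys, n_folds=5, randomize=True, seed=None):
    """Cross-validated MSE of a toy model predicting the training mean."""
    folds = make_folds(len(ys), n_folds, randomize, seed)
    squared_errors = []
    for test in folds:
        held_out = set(test)
        # Train on every sample that is not in the current fold.
        train = [i for i in range(len(ys)) if i not in held_out]
        prediction = sum(ys[i] for i in train) / len(train)
        squared_errors.extend((prediction - ys[i]) ** 2 for i in test)
    return sum(squared_errors) / len(squared_errors)


folds = make_folds(10, n_folds=5, randomize=True, seed=1)
cv_score = cross_validation_mse(
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], n_folds=3, randomize=False
)
```

Each sample is held out exactly once, so every prediction is made by a model that never saw the corresponding sample.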
- abstractmethod compute_learning_measure(samples=(), multioutput=True)[source]
Evaluate the quality of the ML model from the training dataset.
- Parameters:
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
- Returns:
The quality of the ML model.
- Return type:
float | ndarray[tuple[Any, ...], dtype[floating[Any]]] | dict[str, ndarray[tuple[Any, ...], dtype[floating[Any]]]]
- compute_leave_one_out_measure(samples=(), multioutput=True, store_resampling_result=True)[source]
Evaluate the quality of the ML model using the leave-one-out technique.
- Parameters:
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
store_resampling_result (bool) --
Whether to store the \(n\) machine learning algorithms and associated predictions generated by the resampling stage where \(n\) is the number of learning samples.
By default it is set to True.
- Returns:
The quality of the ML model.
- Return type:
float | ndarray[tuple[Any, ...], dtype[floating[Any]]] | dict[str, ndarray[tuple[Any, ...], dtype[floating[Any]]]]
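Leave-one-out can be seen as cross-validation with as many folds as learning samples. The sketch below assumes a toy model that predicts the mean of its training outputs; `leave_one_out_mse` is a hypothetical helper, not part of the library.

```python
def leave_one_out_mse(ys):
    """Leave-one-out MSE of a toy model predicting the training mean."""
    n = len(ys)
    total = sum(ys)
    squared_errors = []
    for y in ys:
        # Train on all the samples except the current one, then predict it.
        prediction = (total - y) / (n - 1)
        squared_errors.append((prediction - y) ** 2)
    return sum(squared_errors) / n


loo_score = leave_one_out_mse([1.0, 2.0, 3.0])
```

Since one model is trained per learning sample, the cost grows with the size of the training dataset, which is worth keeping in mind before enabling store_resampling_result.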
- abstractmethod compute_test_measure(test_data, samples=(), multioutput=True)[source]
Evaluate the quality of the ML model from a test dataset.
- Parameters:
test_data (Dataset) -- The test dataset.
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
- Returns:
The quality of the ML model.
- Return type:
float | ndarray[tuple[Any, ...], dtype[floating[Any]]] | dict[str, ndarray[tuple[Any, ...], dtype[floating[Any]]]]
- SMALLER_IS_BETTER: ClassVar[bool] = True
Whether a smaller value of the measure is better (True, the measure is minimized) or a larger one is (False, the measure is maximized).
- algo: BaseMLAlgo
The machine learning algorithm whose quality we want to measure.
Measures for supervised models#
Introduction#
The base class to assess the quality of a regressor.
- class BaseRegressorQuality(algo, fit_transformers=True)[source]
The base class to assess the quality of a regressor.
- Parameters:
algo (BaseMLSupervisedAlgo) -- A machine learning algorithm for supervised learning.
fit_transformers (bool) --
Whether to re-fit the transformers when using resampling techniques. If False, use the transformers of the algorithm fitted from the whole training dataset.
By default it is set to True.
- compute_bootstrap_measure(n_replicates=100, samples=(), multioutput=True, seed=None, as_dict=False, store_resampling_result=False)[source]
Evaluate the quality of the ML model using the bootstrap technique.
- Parameters:
n_replicates (int) --
The number of bootstrap replicates.
By default it is set to 100.
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
seed (int | None) -- The seed of the pseudo-random number generator. If None, an unpredictable generator will be used.
as_dict (bool) --
Whether the full quality measure is returned as a mapping from algo.output_names to quality measures. Otherwise, the full quality measure is returned as an array stacking these quality measures according to the order of algo.output_names.
By default it is set to False.
store_resampling_result (bool) --
Whether to store the \(n\) machine learning algorithms and associated predictions generated by the resampling stage where \(n\) is the number of bootstrap replicates.
By default it is set to False.
- Returns:
The quality of the ML model.
- Return type:
MeasureType
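The effect of the multioutput and as_dict options can be pictured with the sketch below. It is an assumption-laden illustration: `mse_per_output` is a hypothetical helper, the output names "y1" and "y2" are made up, and each output is taken to be scalar, which may not match how the library handles multi-dimensional outputs.

```python
def mse_per_output(predictions, observations):
    """Return one MSE per output component (one column per component)."""
    n_samples = len(observations)
    n_outputs = len(observations[0])
    return [
        sum(
            (predictions[i][j] - observations[i][j]) ** 2
            for i in range(n_samples)
        )
        / n_samples
        for j in range(n_outputs)
    ]


output_names = ["y1", "y2"]  # Hypothetical output names.
observations = [[1.0, 10.0], [2.0, 20.0]]
predictions = [[1.0, 12.0], [2.0, 18.0]]

# Analogue of multioutput=True: one value per output component.
per_output = mse_per_output(predictions, observations)
# Analogue of multioutput=False: the average over the components.
averaged = sum(per_output) / len(per_output)
# Analogue of as_dict=True: the per-component values indexed by output name.
as_mapping = dict(zip(output_names, per_output))
```

The mapping form is convenient when the outputs have different units or scales and a single averaged value would be hard to interpret.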
- compute_cross_validation_measure(n_folds=5, samples=(), multioutput=True, randomize=True, seed=None, as_dict=False, store_resampling_result=False)[source]
Evaluate the quality of the ML model using the k-folds technique.
- Parameters:
n_folds (int) --
The number of folds.
By default it is set to 5.
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
randomize (bool) --
Whether to shuffle the samples before dividing them in folds.
By default it is set to True.
seed (int | None) -- The seed of the pseudo-random number generator. If None, an unpredictable generator is used.
as_dict (bool) --
Whether the full quality measure is returned as a mapping from algo.output_names to quality measures. Otherwise, the full quality measure is returned as an array stacking these quality measures according to the order of algo.output_names.
By default it is set to False.
store_resampling_result (bool) --
Whether to store the \(n\) machine learning algorithms and associated predictions generated by the resampling stage where \(n\) is the number of folds.
By default it is set to False.
- Returns:
The quality of the ML model.
- Return type:
MeasureType
- compute_learning_measure(samples=(), multioutput=True, as_dict=False)[source]
Evaluate the quality of the ML model from the training dataset.
- Parameters:
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
as_dict (bool) --
Whether the full quality measure is returned as a mapping from algo.output_names to quality measures. Otherwise, the full quality measure is returned as an array stacking these quality measures according to the order of algo.output_names.
By default it is set to False.
- Returns:
The quality of the ML model.
- Return type:
MeasureType
- compute_leave_one_out_measure(samples=(), multioutput=True, as_dict=False, store_resampling_result=False)[source]
Evaluate the quality of the ML model using the leave-one-out technique.
- Parameters:
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
as_dict (bool) --
Whether the full quality measure is returned as a mapping from algo.output_names to quality measures. Otherwise, the full quality measure is returned as an array stacking these quality measures according to the order of algo.output_names.
By default it is set to False.
store_resampling_result (bool) --
Whether to store the \(n\) machine learning algorithms and associated predictions generated by the resampling stage where \(n\) is the number of learning samples.
By default it is set to False.
- Returns:
The quality of the ML model.
- Return type:
MeasureType
- compute_test_measure(test_data, samples=(), multioutput=True, as_dict=False)[source]
Evaluate the quality of the ML model from a test dataset.
- Parameters:
test_data (IODataset) -- The test dataset.
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
as_dict (bool) --
Whether the full quality measure is returned as a mapping from algo.output_names to quality measures. Otherwise, the full quality measure is returned as an array stacking these quality measures according to the order of algo.output_names.
By default it is set to False.
- Returns:
The quality of the ML model.
- Return type:
MeasureType
- algo: BaseMLSupervisedAlgo
The machine learning algorithm whose quality we want to measure.
The MSE#
The mean squared error to assess the quality of a regressor.
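The mean squared error (MSE) is defined by
\[\operatorname{MSE}(\hat{y}) = \frac{1}{n} \sum_{i=1}^n (\hat{y}_i - y_i)^2,\]
where \(\hat{y}\) are the predictions, \(y\) are the data points and \(n\) is the number of data points.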
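A minimal pure-Python sketch of the mean squared error, for illustration only; `mean_squared_error` is a hypothetical helper and not the library's implementation.

```python
def mean_squared_error(y_pred, y_true):
    """Return the mean of the squared differences between two sequences."""
    if len(y_pred) != len(y_true):
        raise ValueError("Predictions and data points must have the same length.")
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)


# Only the last prediction is wrong, by 2: the MSE is (0 + 0 + 4) / 3.
mse = mean_squared_error([1.0, 2.0, 4.0], [1.0, 2.0, 2.0])
```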
- class MSEMeasure(algo, fit_transformers=True)[source]
The mean squared error to assess the quality of a regressor.
- Parameters:
algo (BaseRegressor) -- A machine learning algorithm for regression.
fit_transformers (bool) --
Whether to re-fit the transformers when using resampling techniques. If False, use the transformers of the algorithm fitted from the whole training dataset.
By default it is set to True.
The R2#
The R2 score to assess the quality of a regressor.
The R2 score is defined by
\[R_2(\hat{y}) = 1 - \frac{\sum_i (\hat{y}_i - y_i)^2}{\sum_i (y_i - \bar{y})^2},\]
where \(\hat{y}\) are the predictions, \(y\) are the data points and \(\bar{y}\) is the mean of \(y\).
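A minimal pure-Python sketch of the R2 score, for illustration only; `r2_score` is a hypothetical helper written here and not the library's implementation.

```python
def r2_score(y_pred, y_true):
    """Return 1 minus the ratio of residual to total sum of squares."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((p - t) ** 2 for p, t in zip(y_pred, y_true))
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("The R2 score is undefined for constant data points.")
    return 1.0 - ss_res / ss_tot


# Exact predictions give the best possible score.
perfect = r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
# Always predicting the mean of the data points gives a score of zero.
mean_only = r2_score([2.0, 2.0, 2.0], [1.0, 2.0, 3.0])
```

A model that does worse than predicting the mean gets a negative score, so the measure is bounded above by 1 but not below.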
- class R2Measure(algo, fit_transformers=True)[source]
The R2 score to assess the quality of a regressor.
- Parameters:
algo (BaseRegressor) -- A machine learning algorithm for regression.
fit_transformers (bool) --
Whether to re-fit the transformers when using resampling techniques. If False, use the transformers of the algorithm fitted from the whole training dataset.
By default it is set to True.
- compute_bootstrap_measure(n_replicates=100, samples=(), multioutput=True, seed=None, as_dict=False, store_resampling_result=False)[source]
Evaluate the quality of the ML model using the bootstrap technique.
- Parameters:
n_replicates (int) --
The number of bootstrap replicates.
By default it is set to 100.
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
seed (int | None) -- The seed of the pseudo-random number generator. If None, an unpredictable generator will be used.
as_dict (bool) --
Whether the full quality measure is returned as a mapping from algo.output_names to quality measures. Otherwise, the full quality measure is returned as an array stacking these quality measures according to the order of algo.output_names.
By default it is set to False.
store_resampling_result (bool) --
Whether to store the \(n\) machine learning algorithms and associated predictions generated by the resampling stage where \(n\) is the number of bootstrap replicates.
By default it is set to False.
- Returns:
The quality of the ML model.
- Return type:
MeasureType
- compute_cross_validation_measure(n_folds=5, samples=(), multioutput=True, randomize=True, seed=None, as_dict=False, store_resampling_result=False)[source]
Evaluate the quality of the ML model using the k-folds technique.
- Parameters:
n_folds (int) --
The number of folds.
By default it is set to 5.
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
randomize (bool) --
Whether to shuffle the samples before dividing them in folds.
By default it is set to True.
seed (int | None) -- The seed of the pseudo-random number generator. If None, an unpredictable generator is used.
as_dict (bool) --
Whether the full quality measure is returned as a mapping from algo.output_names to quality measures. Otherwise, the full quality measure is returned as an array stacking these quality measures according to the order of algo.output_names.
By default it is set to False.
store_resampling_result (bool) --
Whether to store the \(n\) machine learning algorithms and associated predictions generated by the resampling stage where \(n\) is the number of folds.
By default it is set to False.
- Returns:
The quality of the ML model.
- Return type:
MeasureType
- SMALLER_IS_BETTER: ClassVar[bool] = False
Whether a smaller value of the measure is better (True, the measure is minimized) or a larger one is (False, the measure is maximized).
The F1 (for classification)#
The F1 score to assess the quality of a classifier.
The F1 score is defined by
\[F_1 = 2 \, \frac{\mathit{precision} \times \mathit{recall}}{\mathit{precision} + \mathit{recall}},\]
where \(\mathit{precision}\) is the number of correctly predicted positives divided by the total number of predicted positives and \(\mathit{recall}\) is the number of correctly predicted positives divided by the total number of actual positives.
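A minimal pure-Python sketch of the F1 score for a binary problem, for illustration only. `f1_score` is a hypothetical helper; how the library aggregates several classes or outputs is not covered here, and returning 0 when there is no correctly predicted positive is a convention chosen for this sketch.

```python
def f1_score(y_pred, y_true, positive=1):
    """Return the harmonic mean of precision and recall for one class."""
    true_positives = sum(
        1 for p, t in zip(y_pred, y_true) if p == positive and t == positive
    )
    if true_positives == 0:
        # Convention of this sketch: no correct positive means a zero score.
        return 0.0
    predicted_positives = sum(1 for p in y_pred if p == positive)
    actual_positives = sum(1 for t in y_true if t == positive)
    precision = true_positives / predicted_positives
    recall = true_positives / actual_positives
    return 2 * precision * recall / (precision + recall)


# One correct positive out of two predicted and three actual positives:
# precision = 1/2, recall = 1/3, hence F1 = 0.4.
f1 = f1_score([1, 1, 0, 0], [1, 0, 1, 1])
```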
- class F1Measure(algo, fit_transformers=True)[source]
The F1 score to assess the quality of a classifier.
- Parameters:
algo (BaseClassifier) -- A machine learning algorithm for classification.
fit_transformers (bool) --
Whether to re-fit the transformers when using resampling techniques. If False, use the transformers of the algorithm fitted from the whole training dataset.
By default it is set to True.
- SMALLER_IS_BETTER: ClassVar[bool] = False
Whether a smaller value of the measure is better (True, the measure is minimized) or a larger one is (False, the measure is maximized).
- algo: BaseClassifier
The machine learning algorithm whose quality we want to measure.
Measures for clustering models#
Introduction#
The base class to assess the quality of a clusterer.
- class BaseClustererQuality(algo, fit_transformers=True)[source]
The base class to assess the quality of a clusterer.
- Parameters:
algo (BaseClusterer) -- A machine learning algorithm for clustering.
fit_transformers (bool) --
Whether to re-fit the transformers when using resampling techniques. If False, use the transformers of the algorithm fitted from the whole training dataset.
By default it is set to True.
- compute_learning_measure(samples=(), multioutput=True)[source]
Evaluate the quality of the ML model from the training dataset.
- Parameters:
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
- Returns:
The quality of the ML model.
- Return type:
MeasureType
- algo: BaseClusterer
The machine learning algorithm whose quality we want to measure.
The silhouette#
The silhouette score to assess the quality of a clusterer.
The silhouette coefficient \(s_i\) is a measure of how similar a point \(x_i\) is to its own cluster \(C_{k_i}\) (cohesion) compared to other clusters (separation):
\[s_i = \frac{b_i - a_i}{\max(a_i, b_i)}\]
with \(a_i=\frac{1}{|C_{k_i}|-1} \sum_{j\in C_{k_i}\setminus\{i\} } \|x_i-x_j\|\) and \(b_i = \underset{\ell=1,\cdots,K\atop{\ell\neq k_i}}{\min} \frac{1}{|C_\ell|} \sum_{j\in C_\ell} \|x_i-x_j\|\)
where
\(K\) is the number of clusters,
\(C_k\) are the indices of the points belonging to the cluster \(k\),
\(|C_k|\) is the size of \(C_k\).
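The quantities \(a_i\) and \(b_i\) can be computed directly from their definitions, as in the sketch below. `silhouette_coefficients` is a hypothetical helper using the Euclidean distance; assigning 0 to a point that is alone in its cluster is a common convention assumed here, not something stated by the source.

```python
import math


def silhouette_coefficients(points, labels):
    """Return the silhouette coefficient of each point."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("At least two clusters are required.")
    coefficients = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            coefficients.append(0.0)  # Assumed convention for singletons.
            continue
        # a_i: mean distance to the other points of the same cluster.
        a_i = sum(math.dist(points[i], points[j]) for j in own if j != i)
        a_i /= len(own) - 1
        # b_i: smallest mean distance to the points of another cluster.
        b_i = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denominator = max(a_i, b_i)
        coefficients.append(0.0 if denominator == 0.0 else (b_i - a_i) / denominator)
    return coefficients


# Two well-separated clusters on a line.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
labels = [0, 0, 1, 1]
coefficients = silhouette_coefficients(points, labels)
silhouette = sum(coefficients) / len(coefficients)
```

Averaging the coefficients gives a single score in \([-1, 1]\); values close to 1 indicate compact and well-separated clusters, as in this example.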
- class SilhouetteMeasure(algo, fit_transformers=True)[source]
The silhouette score to assess the quality of a clusterer.
- Parameters:
algo (BasePredictiveClusterer) -- A clustering algorithm.
fit_transformers (bool) --
Whether to re-fit the transformers when using resampling techniques. If False, use the transformers of the algorithm fitted from the whole training dataset.
By default it is set to True.
- compute_bootstrap_measure(n_replicates=100, samples=(), multioutput=True, seed=None)[source]
Evaluate the quality of the ML model using the bootstrap technique.
- Parameters:
n_replicates (int) --
The number of bootstrap replicates.
By default it is set to 100.
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
seed (int | None) -- The seed of the pseudo-random number generator. If None, an unpredictable generator will be used.
- Returns:
The quality of the ML model.
- Return type:
MeasureType
- compute_cross_validation_measure(n_folds=5, samples=(), multioutput=True, randomize=True, seed=None)[source]
Evaluate the quality of the ML model using the k-folds technique.
- Parameters:
n_folds (int) --
The number of folds.
By default it is set to 5.
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
randomize (bool) --
Whether to shuffle the samples before dividing them in folds.
By default it is set to True.
seed (int | None) -- The seed of the pseudo-random number generator. If None, an unpredictable generator is used.
- Returns:
The quality of the ML model.
- Return type:
MeasureType
- compute_test_measure(test_data, samples=(), multioutput=True)[source]
Evaluate the quality of the ML model from a test dataset.
- Parameters:
test_data (Dataset) -- The test dataset.
samples (Sequence[int]) --
The indices of the learning samples. If empty, use the whole training dataset.
By default it is set to ().
multioutput (bool) --
Whether the quality measure is returned for each component of the outputs. Otherwise, the average quality measure.
By default it is set to True.
- Returns:
The quality of the ML model.
- Return type:
MeasureType
- SMALLER_IS_BETTER: ClassVar[bool] = False
Whether a smaller value of the measure is better (True, the measure is minimized) or a larger one is (False, the measure is maximized).