distribution module
This module defines the notion of distribution of a machine learning algorithm.
Once an MLAlgo has been trained,
assessing its quality is important before using it.
One can measure not only its global quality (e.g. from an MLQualityMeasure)
but also its local one.
The MLRegressorDistribution class addresses the latter case,
by quantifying the robustness of a machine learning algorithm to a learning point.
The more robust it is,
the less variability it has around this point.
Note
For now, only instances of MLRegressionAlgo are considered,
not arbitrary MLAlgo instances.
The MLRegressorDistribution can be particularly useful to:
- study the robustness of an MLAlgo w.r.t. learning dataset elements,
- evaluate acquisition criteria for adaptive learning purposes (see MLDataAcquisition and MLDataAcquisitionCriterion),
- etc.
Two classes derive from the abstract MLRegressorDistribution class:
- KrigingDistribution: the MLRegressionAlgo is a Kriging model and this assessor takes advantage of the underlying Gaussian stochastic process,
- RegressorDistribution: this class is based on sampling methods, such as bootstrap, cross-validation or leave-one-out.
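To make the sampling-based idea concrete, here is a minimal bootstrap sketch written with NumPy only. It does not use the gemseo-mlearning API: the function `bootstrap_predictions`, the polynomial regressor and the toy data are all hypothetical and only illustrate how resampling the learning set yields a local mean and variance of the prediction.

```python
import numpy as np


def bootstrap_predictions(x_learn, y_learn, x_new, n_resamples=200, degree=2, seed=0):
    """Return an (n_resamples, len(x_new)) array of bootstrap predictions.

    Hypothetical helper: a polynomial fit stands in for the regression model.
    """
    rng = np.random.default_rng(seed)
    n_samples = len(x_learn)
    predictions = np.empty((n_resamples, len(x_new)))
    for i in range(n_resamples):
        # Draw a learning set of the same size, with replacement.
        indices = rng.integers(0, n_samples, size=n_samples)
        coefficients = np.polyfit(x_learn[indices], y_learn[indices], degree)
        predictions[i] = np.polyval(coefficients, x_new)
    return predictions


# Toy learning set on [0, 1]; the sine term plays the role of a perturbation.
x_learn = np.linspace(0.0, 1.0, 20)
y_learn = x_learn**2 + 0.05 * np.sin(25.0 * x_learn)

# One point inside the learning domain, one far outside.
x_new = np.array([0.5, 2.0])
predictions = bootstrap_predictions(x_learn, y_learn, x_new)
mean = predictions.mean(axis=0)
variance = predictions.var(axis=0)
```

The spread of the replicates at a given point is what a sampling-based distribution reports as local variability; in this sketch it is larger at the extrapolation point than inside the learning domain.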
See also
KrigingDistribution, RegressorDistribution, MLDataAcquisition, MLDataAcquisitionCriterion, MLDataAcquisitionCriterionFactory
- class gemseo_mlearning.adaptive.distribution.MLRegressorDistribution(algo)
Bases: object
Distribution related to a regression model.
- Parameters:
algo (MLRegressionAlgo) – A regression model.
- change_learning_set(learning_set)
Re-train the machine learning algorithm on a new learning set, in place of the initial one.
- Parameters:
learning_set (Dataset) – The new learning set.
- Return type:
None
- abstract compute_confidence_interval(input_data, level=0.95)
Predict the lower bounds and upper bounds from input data.
The user can specify the input data either as a NumPy array, e.g. array([1., 2., 3.]), or as a dictionary, e.g. {'a': array([1.]), 'b': array([2., 3.])}.
The output data type will be consistent with the input data type.
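As an illustration of what lower and upper bounds at a given level can look like, the sketch below assumes a Gaussian predictive distribution with known mean and standard deviation, which is a natural setting for a Kriging model. This is an assumption made for the example, not the documented implementation, and `gaussian_confidence_interval` is a hypothetical name.

```python
from statistics import NormalDist

import numpy as np


def gaussian_confidence_interval(mean, standard_deviation, level=0.95):
    """Return the (lower, upper) bounds of a centered Gaussian interval.

    Hypothetical helper assuming a Gaussian predictive distribution.
    """
    # Two-sided interval: the upper bound is the (1 + level) / 2 quantile.
    quantile = NormalDist().inv_cdf(0.5 + level / 2.0)
    mean = np.asarray(mean, dtype=float)
    half_width = quantile * np.asarray(standard_deviation, dtype=float)
    return mean - half_width, mean + half_width


lower, upper = gaussian_confidence_interval([1.0, 2.0], [0.1, 0.5])
```

For a sampling-based distribution, empirical quantiles of the replicate predictions would be a reasonable alternative to the Gaussian quantile used here.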
- abstract compute_expected_improvement(input_data, fopt, maximize=False)
Compute the expected improvement from input data.
The user can specify the input data either as a NumPy array, e.g. array([1., 2., 3.]), or as a dictionary, e.g. {'a': array([1.]), 'b': array([2., 3.])}.
The output data type will be consistent with the input data type.
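The expected improvement can be sketched as the average gain over the current best value fopt. The example below estimates it from a set of sampled predictions (one row per replicate model, one column per input point); the function name and the sample-based estimator are assumptions made for illustration, not the library's implementation.

```python
import numpy as np


def expected_improvement(predictions, fopt, maximize=False):
    """Estimate the expected improvement from sampled predictions.

    Hypothetical helper: `predictions` has shape (n_replicates, n_points).
    """
    predictions = np.asarray(predictions, dtype=float)
    if maximize:
        # Improvement means exceeding the current best value.
        improvement = np.maximum(predictions - fopt, 0.0)
    else:
        # Improvement means falling below the current best value.
        improvement = np.maximum(fopt - predictions, 0.0)
    return improvement.mean(axis=0)


predictions = np.array([[0.0, 3.0], [2.0, 5.0]])
ei_min = expected_improvement(predictions, fopt=1.0)  # array([0.5, 0. ])
ei_max = expected_improvement(predictions, fopt=1.0, maximize=True)  # array([0.5, 3. ])
```

A point where no replicate improves on fopt gets an expected improvement of zero, which is why this quantity is commonly used as an acquisition criterion.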
- abstract compute_mean(input_data)
Compute the mean from input data.
The user can specify the input data either as a NumPy array, e.g. array([1., 2., 3.]), or as a dictionary, e.g. {'a': array([1.]), 'b': array([2., 3.])}.
The output data type will be consistent with the input data type.
- compute_standard_deviation(input_data, *args, **kwargs)
Evaluate func with either array or dictionary-based input data.
Firstly, the pre-processing stage converts the input data to a NumPy data array, if these data are expressed as a dictionary of NumPy data arrays.
Then, the processing evaluates the function func from this NumPy input data array.
Lastly, the post-processing transforms the output data to a dictionary of output NumPy data arrays if the input data were passed as a dictionary of NumPy data arrays.
- Parameters:
algo (MLSupervisedAlgo) – The supervised learning algorithm.
input_data (DataType) – The input data.
*args (Any) – The positional arguments of the function func.
**kwargs (Any) – The keyword arguments of the function func.
- Returns:
The output data with the same type as the input one.
- Return type:
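The three stages described above (pre-processing, processing, post-processing) can be sketched as a decorator. The names `accept_array_or_dict`, `input_names` and `output_names_to_sizes` are hypothetical and the code is not the library's implementation; it only shows how a function written for NumPy arrays can also accept and return dictionaries.

```python
from functools import wraps

import numpy as np


def accept_array_or_dict(input_names, output_names_to_sizes):
    """Let an array-based function also handle dictionary-based input data."""

    def decorator(func):
        @wraps(func)
        def wrapper(input_data, *args, **kwargs):
            as_dict = isinstance(input_data, dict)
            if as_dict:
                # Pre-processing: concatenate the named arrays in a fixed order.
                input_data = np.concatenate(
                    [np.atleast_1d(input_data[name]) for name in input_names]
                )
            # Processing: evaluate the function from the NumPy array.
            output_data = func(input_data, *args, **kwargs)
            if not as_dict:
                return output_data
            # Post-processing: split the output array into named arrays.
            result, start = {}, 0
            for name, size in output_names_to_sizes.items():
                result[name] = output_data[..., start : start + size]
                start += size
            return result

        return wrapper

    return decorator


@accept_array_or_dict(input_names=["a", "b"], output_names_to_sizes={"y": 1})
def sum_of_squares(input_data):
    return np.array([np.sum(input_data**2)])


array_result = sum_of_squares(np.array([1.0, 2.0, 3.0]))
dict_result = sum_of_squares({"a": np.array([1.0]), "b": np.array([2.0, 3.0])})
```

With this design the wrapped function only ever sees a flat array, and the output type mirrors the input type: an array in gives an array out, a dictionary in gives a dictionary out.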
- abstract compute_variance(input_data)
Compute the variance from input data.
The user can specify the input data either as a NumPy array, e.g. array([1., 2., 3.]), or as a dictionary, e.g. {'a': array([1.]), 'b': array([2., 3.])}.
The output data type will be consistent with the input data type.
- predict(input_data)
Predict the output of the original machine learning algorithm.
The user can specify the input data either as a NumPy array, e.g. array([1., 2., 3.]), or as a dictionary, e.g. {'a': array([1.]), 'b': array([2., 3.])}.
The output data type will be consistent with the input data type.
- algo: MLRegressionAlgo
The regression model.