distribution module
This module defines the notion of distribution of a machine learning algorithm.
Once an MLAlgo has been trained, assessing its quality is important before using it.
One can measure not only its global quality (e.g. from an MLQualityMeasure) but also its local one.
The MLRegressorDistribution class addresses the latter case, by quantifying the robustness of a machine learning algorithm to a learning point.
The more robust it is, the less variability it has around this point.
Note
For now, only instances of MLRegressionAlgo are considered, rather than any MLAlgo.
The MLRegressorDistribution can be particularly useful to:
- study the robustness of an MLAlgo w.r.t. learning dataset elements,
- evaluate acquisition criteria for adaptive learning purposes (see MLDataAcquisition and MLDataAcquisitionCriterion),
- etc.
The abstract MLRegressorDistribution class is derived into two classes:
- KrigingDistribution: the MLRegressionAlgo is a Kriging model and this assessor takes advantage of the underlying Gaussian stochastic process,
- RegressorDistribution: this class is based on sampling methods, such as bootstrap, cross-validation or leave-one-out.
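To illustrate the sampling-based approach, here is a minimal sketch of bootstrap: the learning set is resampled with replacement, one model is fitted per resample, and the spread of the predictions at a point serves as a local variability measure. It uses NumPy only. The polynomial model and the helper name `bootstrap_predictions` are assumptions made for this example and are not part of gemseo_mlearning.

```python
import numpy as np


def bootstrap_predictions(x, y, x_new, n_resamples=50, degree=2, seed=0):
    """Return the predictions at x_new of models fitted on bootstrap resamples.

    Hypothetical helper: a polynomial least-squares fit stands in for the regressor.
    """
    rng = np.random.default_rng(seed)
    n_samples = x.size
    predictions = np.empty((n_resamples, np.size(x_new)))
    for i in range(n_resamples):
        # Draw n_samples learning points with replacement.
        indices = rng.integers(0, n_samples, size=n_samples)
        coefficients = np.polyfit(x[indices], y[indices], degree)
        predictions[i] = np.polyval(coefficients, x_new)
    return predictions


x = np.linspace(0.0, 1.0, 20)
y = np.sin(2.0 * np.pi * x)
x_new = np.array([0.25, 0.5])
predictions = bootstrap_predictions(x, y, x_new)
# Empirical mean and standard deviation of the resampled models at each new point.
mean = predictions.mean(axis=0)
standard_deviation = predictions.std(axis=0)
```

A small standard deviation at a point suggests that the model is robust to the choice of learning points there.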
See also
KrigingDistribution, RegressorDistribution, MLDataAcquisition, MLDataAcquisitionCriterion, MLDataAcquisitionCriterionFactory
- class gemseo_mlearning.adaptive.distribution.MLRegressorDistribution(algo)
Bases: object
Distribution related to a regression model.
- Parameters
algo (gemseo.mlearning.regression.regression.MLRegressionAlgo) – A regression model.
- Return type
None
- change_learning_set(learning_set)
Re-train the machine learning algorithm relying on the new learning set.
- Parameters
learning_set (gemseo.core.dataset.Dataset) – The new learning set.
- Return type
None
- compute_confidence_interval(input_data, level=0.95)
Predict the lower bounds and upper bounds from input data.
The user can specify the input data either as a NumPy array, e.g. array([1., 2., 3.]), or as a dictionary, e.g. {'a': array([1.]), 'b': array([2., 3.])}.
The output data type will be consistent with the input data type.
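As an illustration of what lower and upper bounds at a given level can look like, the sketch below assumes a Gaussian predictive distribution (as with a Kriging model) and builds the centered interval mean ± z × standard deviation, where z is the standard normal quantile of order (1 + level) / 2. The helper name `gaussian_confidence_interval` is hypothetical. This is not presented as the computation performed by the library.

```python
from statistics import NormalDist

import numpy as np


def gaussian_confidence_interval(mean, standard_deviation, level=0.95):
    """Return the lower and upper bounds of a centered Gaussian interval.

    Hypothetical helper: it assumes a Gaussian predictive distribution.
    """
    # Quantile of the standard normal distribution, about 1.96 for level=0.95.
    quantile = NormalDist().inv_cdf(0.5 + level / 2.0)
    mean = np.asarray(mean, dtype=float)
    half_width = quantile * np.asarray(standard_deviation, dtype=float)
    return mean - half_width, mean + half_width


lower, upper = gaussian_confidence_interval([1.0, 2.0], [0.1, 0.5])
```

With a sampling-based distribution, one could instead take empirical quantiles of the sampled predictions, e.g. with `numpy.quantile`.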
- compute_expected_improvement(input_data, *args, **kwargs)
Evaluate ‘predict’ with either array or dictionary-based input data.
Firstly, the pre-processing stage converts the input data to a NumPy data array, if these data are expressed as a dictionary of NumPy data arrays.
Then, the processing evaluates the function ‘predict’ from this NumPy input data array.
Lastly, the post-processing transforms the output data to a dictionary of output NumPy data arrays if the input data were passed as a dictionary of NumPy data arrays.
- Parameters
input_data (Union[numpy.ndarray, Mapping[str, numpy.ndarray]]) – The input data.
*args – The positional arguments of the function ‘predict’.
**kwargs – The keyword arguments of the function ‘predict’.
- Returns
The output data with the same type as the input one.
- Return type
Union[numpy.ndarray, Mapping[str, numpy.ndarray]]
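The pre-processing, processing and post-processing stages described above can be sketched as a decorator. This is a minimal sketch under stated assumptions, not the implementation used by the library: the decorator name `accept_array_or_dict`, the fixed `input_names` ordering and the single `output_name` are hypothetical.

```python
from collections.abc import Mapping
from functools import wraps

import numpy as np


def accept_array_or_dict(input_names, output_name):
    """Make a function of a NumPy array also accept a dictionary of arrays.

    Hypothetical decorator: the names and ordering conventions are assumptions.
    """

    def decorator(func):
        @wraps(func)
        def wrapper(input_data, *args, **kwargs):
            as_dict = isinstance(input_data, Mapping)
            if as_dict:
                # Pre-processing: concatenate the arrays in the order of input_names.
                input_data = np.concatenate(
                    [input_data[name] for name in input_names]
                )
            # Processing: evaluate the function from the NumPy array.
            output_data = func(input_data, *args, **kwargs)
            if as_dict:
                # Post-processing: return a dictionary since the input was one.
                return {output_name: output_data}
            return output_data

        return wrapper

    return decorator


@accept_array_or_dict(input_names=["a", "b"], output_name="y")
def predict(input_data):
    """Toy prediction: the sum of the inputs, as a 1D array."""
    return np.array([input_data.sum()])


array_output = predict(np.array([1.0, 2.0, 3.0]))
dict_output = predict({"a": np.array([1.0]), "b": np.array([2.0, 3.0])})
```

Both calls evaluate the same underlying function; only the container type of the result differs, matching the type of the input.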
- compute_mean(input_data, *args, **kwargs)
Evaluate ‘predict’ with either array or dictionary-based input data.
Firstly, the pre-processing stage converts the input data to a NumPy data array, if these data are expressed as a dictionary of NumPy data arrays.
Then, the processing evaluates the function ‘predict’ from this NumPy input data array.
Lastly, the post-processing transforms the output data to a dictionary of output NumPy data arrays if the input data were passed as a dictionary of NumPy data arrays.
- Parameters
input_data (Union[numpy.ndarray, Mapping[str, numpy.ndarray]]) – The input data.
*args – The positional arguments of the function ‘predict’.
**kwargs – The keyword arguments of the function ‘predict’.
- Returns
The output data with the same type as the input one.
- Return type
Union[numpy.ndarray, Mapping[str, numpy.ndarray]]
- compute_standard_deviation(input_data, *args, **kwargs)
Evaluate ‘predict’ with either array or dictionary-based input data.
Firstly, the pre-processing stage converts the input data to a NumPy data array, if these data are expressed as a dictionary of NumPy data arrays.
Then, the processing evaluates the function ‘predict’ from this NumPy input data array.
Lastly, the post-processing transforms the output data to a dictionary of output NumPy data arrays if the input data were passed as a dictionary of NumPy data arrays.
- Parameters
input_data (Union[numpy.ndarray, Mapping[str, numpy.ndarray]]) – The input data.
*args – The positional arguments of the function ‘predict’.
**kwargs – The keyword arguments of the function ‘predict’.
- Returns
The output data with the same type as the input one.
- Return type
Union[numpy.ndarray, Mapping[str, numpy.ndarray]]
- compute_variance(input_data, *args, **kwargs)
Evaluate ‘predict’ with either array or dictionary-based input data.
Firstly, the pre-processing stage converts the input data to a NumPy data array, if these data are expressed as a dictionary of NumPy data arrays.
Then, the processing evaluates the function ‘predict’ from this NumPy input data array.
Lastly, the post-processing transforms the output data to a dictionary of output NumPy data arrays if the input data were passed as a dictionary of NumPy data arrays.
- Parameters
input_data (Union[numpy.ndarray, Mapping[str, numpy.ndarray]]) – The input data.
*args – The positional arguments of the function ‘predict’.
**kwargs – The keyword arguments of the function ‘predict’.
- Returns
The output data with the same type as the input one.
- Return type
Union[numpy.ndarray, Mapping[str, numpy.ndarray]]
- predict(input_data)
Predict the output of the original machine learning algorithm.
The user can specify the input data either as a NumPy array, e.g. array([1., 2., 3.]), or as a dictionary, e.g. {'a': array([1.]), 'b': array([2., 3.])}.
The output data type will be consistent with the input data type.
- Parameters
input_data (Union[numpy.ndarray, Mapping[str, numpy.ndarray]]) – The input data.
- Returns
The predicted output data.
- Return type
Union[numpy.ndarray, Mapping[str, numpy.ndarray]]
- algo: gemseo.mlearning.regression.regression.MLRegressionAlgo
The regression model.
- property learning_set: gemseo.core.dataset.Dataset
The learning dataset used by the original machine learning algorithm.