distribution module
This module defines the notion of distribution of a machine learning algorithm.
Once an MLAlgo has been trained,
it is important to assess its quality before using it.
One can measure not only its global quality (e.g. with an MLQualityMeasure)
but also its local quality.
The MLRegressorDistribution class addresses the latter case,
by quantifying the robustness of a machine learning algorithm to a learning point.
The more robust it is,
the less variability it has around this point.
Note
For now, only instances of MLRegressionAlgo are considered,
not arbitrary instances of MLAlgo.
The MLRegressorDistribution can be particularly useful to:
- study the robustness of an MLAlgo w.r.t. learning dataset elements,
- evaluate acquisition criteria for adaptive learning purposes (see MLDataAcquisition and MLDataAcquisitionCriterion),
- etc.
The abstract MLRegressorDistribution class is derived into two classes:
- KrigingDistribution: the MLRegressionAlgo is a Kriging model and this assessor takes advantage of the underlying Gaussian stochastic process,
- RegressorDistribution: this class is based on sampling methods, such as bootstrap, cross-validation or leave-one-out.
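The sampling-based idea behind the second class can be illustrated with a minimal bootstrap sketch. It does not use gemseo_mlearning: the helper names `fit_line` and `bootstrap_distribution`, the straight-line model and the toy learning set are all assumptions made for illustration. The sketch refits a simple regression on resampled learning sets and summarizes the spread of the predictions at a given point.

```python
import random
import statistics


def fit_line(points):
    """Fit y = slope * x + intercept by ordinary least squares on (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0.0:
        # Degenerate resample (all x identical): fall back to a constant model.
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / var_x
    return slope, mean_y - slope * mean_x


def bootstrap_distribution(points, x_new, n_replicates=200, seed=0):
    """Return the mean and standard deviation of bootstrap predictions at x_new."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_replicates):
        # Resample the learning set with replacement and refit the model.
        resample = rng.choices(points, k=len(points))
        slope, intercept = fit_line(resample)
        predictions.append(slope * x_new + intercept)
    return statistics.fmean(predictions), statistics.stdev(predictions)


# Hypothetical learning set, roughly following y = x.
learning_set = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.2), (3.0, 2.8), (4.0, 4.1)]
mean_inside, std_inside = bootstrap_distribution(learning_set, x_new=2.0)
mean_outside, std_outside = bootstrap_distribution(learning_set, x_new=10.0)
```

In this toy setting, the standard deviation is smaller at a point inside the learning domain than at a point far outside it, which matches the notion of local robustness described above.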
See also
KrigingDistribution, RegressorDistribution, MLDataAcquisition, MLDataAcquisitionCriterion, MLDataAcquisitionCriterionFactory
- class gemseo_mlearning.adaptive.distribution.MLRegressorDistribution(algo)
Bases: object
Distribution related to a regression model.
- Parameters
algo (gemseo.mlearning.regression.regression.MLRegressionAlgo) – A regression model.
- Return type
None
- change_learning_set(learning_set)
Re-train the machine learning algorithm with a new learning set, which replaces the initial one.
- Parameters
learning_set (gemseo.core.dataset.Dataset) – The new learning set.
- Return type
None
- compute_confidence_interval(input_data, level=0.95)
Predict the lower bounds and upper bounds from input data.
The user can specify the input data either as a NumPy array, e.g. array([1., 2., 3.]), or as a dictionary, e.g. {'a': array([1.]), 'b': array([2., 3.])}.
The output data type will be consistent with the input data type.
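As an illustration of how lower and upper bounds can follow from a mean and a standard deviation, here is a minimal sketch under a Gaussian assumption, which fits the Kriging case mentioned above. The function `gaussian_confidence_interval` is hypothetical and is not the library's implementation; it also assumes that `level` denotes the coverage probability of a two-sided interval.

```python
from statistics import NormalDist


def gaussian_confidence_interval(mean, std, level=0.95):
    """Return the (lower, upper) bounds of a two-sided Gaussian interval.

    Assumption: the prediction at the input point is Gaussian with the given
    mean and standard deviation, and `level` is the coverage probability.
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1.")
    # Standard normal quantile of order (1 + level) / 2, about 1.96 for 0.95.
    quantile = NormalDist().inv_cdf(0.5 + level / 2.0)
    return mean - quantile * std, mean + quantile * std


lower, upper = gaussian_confidence_interval(mean=2.0, std=0.5, level=0.95)
```

With a mean of 2.0 and a standard deviation of 0.5, the 95% interval is approximately [1.02, 2.98]. A sampling-based distribution could instead use empirical quantiles of the resampled predictions.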
- compute_expected_improvement(input_data, *args, **kwargs)
Evaluate ‘predict’ with either array or dictionary-based input data.
First, the pre-processing stage converts the input data to a NumPy array if they are expressed as a dictionary of NumPy arrays.
Then, the processing stage evaluates the function ‘predict’ from this NumPy input array.
Lastly, the post-processing stage converts the output data to a dictionary of NumPy arrays if the input data were passed as a dictionary of NumPy arrays.
- Parameters
input_data (Union[numpy.ndarray, Mapping[str, numpy.ndarray]]) – The input data.
*args – The positional arguments of the function ‘predict’.
**kwargs – The keyword arguments of the function ‘predict’.
- Returns
The output data with the same type as the input one.
- Return type
Union[numpy.ndarray, Mapping[str, numpy.ndarray]]
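The three stages described above (pre-processing, processing, post-processing) can be sketched as a small wrapper. This is an illustrative sketch only: the helper `with_array_or_dict`, the input names `a` and `b`, the single output name `y` and the toy `predict` function are assumptions, not the library's actual decorator.

```python
import numpy as np


def with_array_or_dict(func, input_names, output_name):
    """Wrap an array-based function so that it also accepts dictionaries.

    Hypothetical helper: `input_names` gives the concatenation order of the
    dictionary entries and `output_name` labels the single output.
    """

    def wrapper(input_data, *args, **kwargs):
        is_dict = isinstance(input_data, dict)
        if is_dict:
            # Pre-processing: concatenate the dictionary values into one array.
            input_data = np.concatenate(
                [np.atleast_1d(input_data[name]) for name in input_names]
            )
        # Processing: evaluate the array-based function.
        output = func(np.asarray(input_data, dtype=float), *args, **kwargs)
        # Post-processing: return a dictionary only if a dictionary was passed.
        return {output_name: output} if is_dict else output

    return wrapper


def predict(x):
    """Toy array-based model: the sum of the inputs."""
    return np.array([x.sum()])


flexible_predict = with_array_or_dict(predict, ["a", "b"], "y")
array_result = flexible_predict(np.array([1.0, 2.0, 3.0]))
dict_result = flexible_predict({"a": np.array([1.0]), "b": np.array([2.0, 3.0])})
```

Both calls evaluate the same underlying function; only the container type of the result differs, following the type of the input.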
- compute_mean(input_data, *args, **kwargs)
Evaluate ‘predict’ with either array or dictionary-based input data.
First, the pre-processing stage converts the input data to a NumPy array if they are expressed as a dictionary of NumPy arrays.
Then, the processing stage evaluates the function ‘predict’ from this NumPy input array.
Lastly, the post-processing stage converts the output data to a dictionary of NumPy arrays if the input data were passed as a dictionary of NumPy arrays.
- Parameters
input_data (Union[numpy.ndarray, Mapping[str, numpy.ndarray]]) – The input data.
*args – The positional arguments of the function ‘predict’.
**kwargs – The keyword arguments of the function ‘predict’.
- Returns
The output data with the same type as the input one.
- Return type
Union[numpy.ndarray, Mapping[str, numpy.ndarray]]
- compute_standard_deviation(input_data, *args, **kwargs)
Evaluate ‘predict’ with either array or dictionary-based input data.
First, the pre-processing stage converts the input data to a NumPy array if they are expressed as a dictionary of NumPy arrays.
Then, the processing stage evaluates the function ‘predict’ from this NumPy input array.
Lastly, the post-processing stage converts the output data to a dictionary of NumPy arrays if the input data were passed as a dictionary of NumPy arrays.
- Parameters
input_data (Union[numpy.ndarray, Mapping[str, numpy.ndarray]]) – The input data.
*args – The positional arguments of the function ‘predict’.
**kwargs – The keyword arguments of the function ‘predict’.
- Returns
The output data with the same type as the input one.
- Return type
Union[numpy.ndarray, Mapping[str, numpy.ndarray]]
- compute_variance(input_data, *args, **kwargs)
Evaluate ‘predict’ with either array or dictionary-based input data.
First, the pre-processing stage converts the input data to a NumPy array if they are expressed as a dictionary of NumPy arrays.
Then, the processing stage evaluates the function ‘predict’ from this NumPy input array.
Lastly, the post-processing stage converts the output data to a dictionary of NumPy arrays if the input data were passed as a dictionary of NumPy arrays.
- Parameters
input_data (Union[numpy.ndarray, Mapping[str, numpy.ndarray]]) – The input data.
*args – The positional arguments of the function ‘predict’.
**kwargs – The keyword arguments of the function ‘predict’.
- Returns
The output data with the same type as the input one.
- Return type
Union[numpy.ndarray, Mapping[str, numpy.ndarray]]
- predict(input_data)
Predict the output of the original machine learning algorithm.
The user can specify the input data either as a NumPy array, e.g. array([1., 2., 3.]), or as a dictionary, e.g. {'a': array([1.]), 'b': array([2., 3.])}.
The output data type will be consistent with the input data type.
- Parameters
input_data (Union[numpy.ndarray, Mapping[str, numpy.ndarray]]) – The input data.
- Returns
The predicted output data.
- Return type
Union[numpy.ndarray, Mapping[str, numpy.ndarray]]
- algo: gemseo.mlearning.regression.regression.MLRegressionAlgo
The regression model.
- property learning_set: gemseo.core.dataset.Dataset
The learning dataset used by the original machine learning algorithm.