Regression algorithms¶
Warning
Some capabilities may require installing GEMSEO with all its features, and others may depend on plugins.
Note
Some features of the wrapped libraries may not be exposed through GEMSEO.
GaussianProcessRegressor¶
Module: gemseo.mlearning.regression.algos.gpr
- Required parameters
data : IODataset
The learning dataset.
- Optional parameters
alpha : float | RealArray, optional
The nugget effect to regularize the model.
By default it is set to 1e-10.
bounds : __Bounds | Mapping[str, __Bounds], optional
The lower and upper bounds of the parameter length scales when kernel is None. Either a unique lower-upper pair common to all the inputs or lower-upper pairs for some of them. When bounds is empty or when an input has no pair, the lower bound is 0.01 and the upper bound is 100.
By default it is set to ().
input_names : Iterable[str], optional
The names of the input variables. If empty, consider all the input variables of the learning dataset.
By default it is set to ().
kernel : Kernel | None, optional
The kernel specifying the covariance model. If None, use a Matérn(2.5) kernel.
By default it is set to None.
n_restarts_optimizer : int, optional
The number of restarts of the optimizer.
By default it is set to 10.
optimizer : str | Callable, optional
The optimization algorithm to find the parameter length scales.
By default it is set to fmin_l_bfgs_b.
output_names : Iterable[str], optional
The names of the output variables. If empty, consider all the output variables of the learning dataset.
By default it is set to ().
random_state : int | None, optional
The random state passed to the random number generator. Use an integer for reproducible results.
By default it is set to 0.
transformer : TransformerType, optional
The strategies to transform the variables. The values are instances of BaseTransformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the BaseTransformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.
By default it is set to {}.
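For illustration, the minimal sketch below fits this model on a small one-dimensional dataset and widens the default length-scale bounds; the IODataset group helpers and the learn/predict calls are assumptions about the GEMSEO API rather than a verified recipe:

    from numpy import array, linspace, sin

    from gemseo.datasets.io_dataset import IODataset
    from gemseo.mlearning.regression.algos.gpr import GaussianProcessRegressor

    # Build a toy learning dataset y = sin(6x) on [0, 1].
    x = linspace(0.0, 1.0, 20).reshape(-1, 1)
    y = sin(6.0 * x)
    dataset = IODataset()
    dataset.add_input_group(x, variable_names=["x"])    # assumed helper
    dataset.add_output_group(y, variable_names=["y"])   # assumed helper

    # Wider length-scale bounds than the default (0.01, 100) and a smaller nugget.
    model = GaussianProcessRegressor(dataset, bounds=(1e-3, 1e3), alpha=1e-8)
    model.learn()
    prediction = model.predict({"x": array([[0.5]])})   # assumed prediction API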
GradientBoostingRegressor¶
Module: gemseo.mlearning.regression.algos.gradient_boosting
- Required parameters
data : IODataset
The learning dataset.
- Optional parameters
input_names : Iterable[str], optional
The names of the input variables. If empty, consider all the input variables of the learning dataset.
By default it is set to ().
n_estimators : int, optional
The number of boosting stages to perform.
By default it is set to 100.
output_names : Iterable[str], optional
The names of the output variables. If empty, consider all the output variables of the learning dataset.
By default it is set to ().
transformer : TransformerType, optional
The strategies to transform the variables. The values are instances of BaseTransformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the BaseTransformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.
By default it is set to {}.
**parameters : Any
The parameters of the machine learning algorithm.
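As a sketch only, the extra keyword arguments are assumed here to be forwarded through **parameters to the underlying scikit-learn estimator; max_depth and learning_rate below are such pass-through assumptions, and the dataset helpers are assumptions as well:

    from numpy import linspace

    from gemseo.datasets.io_dataset import IODataset
    from gemseo.mlearning.regression.algos.gradient_boosting import GradientBoostingRegressor

    # Toy learning dataset y = x^2 on [0, 1].
    x = linspace(0.0, 1.0, 50).reshape(-1, 1)
    y = x**2
    dataset = IODataset()
    dataset.add_input_group(x, variable_names=["x"])    # assumed helper
    dataset.add_output_group(y, variable_names=["y"])   # assumed helper

    # n_estimators is documented above; the other two are assumed pass-through options.
    model = GradientBoostingRegressor(dataset, n_estimators=50, max_depth=2, learning_rate=0.1)
    model.learn()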
LinearRegressor¶
Module: gemseo.mlearning.regression.algos.linreg
- Required parameters
data : IODataset
The learning dataset.
- Optional parameters
fit_intercept : bool, optional
Whether to fit the intercept.
By default it is set to True.
input_names : Iterable[str], optional
The names of the input variables. If empty, consider all the input variables of the learning dataset.
By default it is set to ().
l2_penalty_ratio : float, optional
The penalty ratio related to the l2 regularization. If 1, use the Ridge penalty. If 0, use the Lasso penalty. Between 0 and 1, use the ElasticNet penalty.
By default it is set to 1.0.
output_names : Iterable[str], optional
The names of the output variables. If empty, consider all the output variables of the learning dataset.
By default it is set to ().
penalty_level : float, optional
The penalty level, greater than or equal to 0. If 0, there is no penalty.
By default it is set to 0.0.
random_state : int | None, optional
The random state passed to the random number generator when there is a penalty. Use an integer for reproducible results.
By default it is set to 0.
transformer : TransformerType, optional
The strategies to transform the variables. The values are instances of BaseTransformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the BaseTransformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.
By default it is set to {}.
**parameters : float | str | bool | None
The parameters of the machine learning algorithm.
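For illustration, the sketch below requests a ridge-like fit by combining penalty_level > 0 with l2_penalty_ratio = 1.0, as described above; the dataset helpers and the coefficients/intercept attribute names are assumptions:

    from numpy import linspace

    from gemseo.datasets.io_dataset import IODataset
    from gemseo.mlearning.regression.algos.linreg import LinearRegressor

    # Toy learning dataset y = 2x + 1 on [0, 1].
    x = linspace(0.0, 1.0, 30).reshape(-1, 1)
    y = 2.0 * x + 1.0
    dataset = IODataset()
    dataset.add_input_group(x, variable_names=["x"])    # assumed helper
    dataset.add_output_group(y, variable_names=["y"])   # assumed helper

    # penalty_level > 0 with l2_penalty_ratio = 1.0 gives the Ridge penalty.
    model = LinearRegressor(dataset, penalty_level=0.1, l2_penalty_ratio=1.0)
    model.learn()
    print(model.coefficients, model.intercept)          # assumed attribute names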
MLPRegressor¶
Module: gemseo.mlearning.regression.algos.mlp
- Required parameters
data : IODataset
The learning dataset.
- Optional parameters
hidden_layer_sizes : tuple[int], optional
The number of neurons per hidden layer.
By default it is set to (100,).
input_names : Iterable[str], optional
The names of the input variables. If empty, consider all the input variables of the learning dataset.
By default it is set to ().
output_names : Iterable[str], optional
The names of the output variables. If empty, consider all the output variables of the learning dataset.
By default it is set to ().
transformer : TransformerType, optional
The strategies to transform the variables. The values are instances of BaseTransformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the BaseTransformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.
By default it is set to {}.
**parameters : Any
The parameters of the machine learning algorithm.
MOERegressor¶
Module: gemseo.mlearning.regression.algos.moe
- Required parameters
data : IODataset
The learning dataset.
- Optional parameters
hard : bool, optional
Whether clustering/classification should be hard or soft.
By default it is set to True.
input_names : Iterable[str], optional
The names of the input variables. If empty, consider all the input variables of the learning dataset.
By default it is set to ().
output_names : Iterable[str], optional
The names of the output variables. If empty, consider all the output variables of the learning dataset.
By default it is set to ().
transformer : TransformerType, optional
The strategies to transform the variables. The values are instances of BaseTransformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the BaseTransformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.
By default it is set to {}.
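For illustration, a hard mixture of experts is sketched below on a piecewise function; the set_clusterer/set_classifier/set_regressor helpers, the algorithm names passed to them and the dataset helpers are assumptions about the GEMSEO API:

    from numpy import linspace, sign

    from gemseo.datasets.io_dataset import IODataset
    from gemseo.mlearning.regression.algos.moe import MOERegressor

    # Piecewise toy dataset, a natural case for local experts.
    x = linspace(-1.0, 1.0, 40).reshape(-1, 1)
    y = sign(x) * x**2
    dataset = IODataset()
    dataset.add_input_group(x, variable_names=["x"])    # assumed helper
    dataset.add_output_group(y, variable_names=["y"])   # assumed helper

    model = MOERegressor(dataset, hard=True)
    model.set_clusterer("KMeans", n_clusters=2)         # assumed helper and algorithm name
    model.set_classifier("KNNClassifier")               # assumed helper and algorithm name
    model.set_regressor("LinearRegressor")              # assumed helper
    model.learn()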
OTGaussianProcessRegressor¶
Module: gemseo.mlearning.regression.algos.ot_gpr
- Required parameters
data : IODataset
The learning dataset.
- Optional parameters
covariance_model : Iterable[CovarianceModelType] | CovarianceModelType, optional
The covariance model of the Gaussian process. Either an OpenTURNS covariance model class, an instance of an OpenTURNS covariance model class, the name of a covariance model, or a list of OpenTURNS covariance model classes, instances and covariance model names, whose size is equal to the output dimension.
By default it is set to Matern52.
input_names : Iterable[str], optional
The names of the input variables. If empty, consider all the input variables of the learning dataset.
By default it is set to ().
multi_start_algo_name : DOEAlgorithmName, optional
The name of the DOE algorithm for multi-start optimization of the covariance model parameters.
By default it is set to OT_OPT_LHS.
multi_start_algo_options : StrKeyMapping, optional
The options of the DOE algorithm for multi-start optimization of the covariance model parameters.
By default it is set to {}.
multi_start_n_samples : int, optional
The number of starting points for multi-start optimization of the covariance model parameters; if 0, do not use multi-start optimization.
By default it is set to 10.
optimization_space : DesignSpace | None, optional
The covariance model parameter space; the size of a variable must take into account the size of the output space.
By default it is set to None.
optimizer : OptimizationAlgorithmImplementation, optional
The solver used to optimize the covariance model parameters.
By default it is set to an OpenTURNS TNC optimizer with its default settings.
output_names : Iterable[str], optional
The names of the output variables. If empty, consider all the output variables of the learning dataset.
By default it is set to ().
transformer : TransformerType, optional
The strategies to transform the variables. The values are instances of BaseTransformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the BaseTransformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.
By default it is set to {}.
trend : Trend, optional
The name of the trend.
By default it is set to constant.
use_hmat : bool | None, optional
Whether to use HMAT or LAPACK as the linear algebra method. If None, use HMAT when the learning size is greater than MAX_SIZE_FOR_LAPACK.
By default it is set to None.
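For illustration, the sketch below selects the covariance model by name and enables multi-start optimization of its parameters; the accepted covariance model names and the dataset helpers are assumptions:

    from numpy import linspace, sin

    from gemseo.datasets.io_dataset import IODataset
    from gemseo.mlearning.regression.algos.ot_gpr import OTGaussianProcessRegressor

    # Toy learning dataset y = sin(6x) on [0, 1].
    x = linspace(0.0, 1.0, 25).reshape(-1, 1)
    y = sin(6.0 * x)
    dataset = IODataset()
    dataset.add_input_group(x, variable_names=["x"])    # assumed helper
    dataset.add_output_group(y, variable_names=["y"])   # assumed helper

    model = OTGaussianProcessRegressor(
        dataset,
        covariance_model="SquaredExponential",  # assumed to be an accepted model name
        multi_start_n_samples=5,
    )
    model.learn()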
PCERegressor¶
Module: gemseo.mlearning.regression.algos.pce
- Required parameters
data : IODataset | None
The learning dataset, required in the case of the least-squares regression or when discipline is None in the case of quadrature.
probability_space : ParameterSpace
The set of random input variables defined by OTDistribution instances.
- Optional parameters
cleaning_options : CleaningOptions | None, optional
The options of the `CleaningStrategy`_. If None, use DEFAULT_CLEANING_OPTIONS.
By default it is set to None.
degree : int, optional
The polynomial degree of the PCE.
By default it is set to 2.
discipline : MDODiscipline | None, optional
The discipline to be sampled if use_quadrature is True and data is None.
By default it is set to None.
hyperbolic_parameter : float, optional
The \(q\)-quasi norm parameter of the `hyperbolic and anisotropic enumerate function`_, defined over the interval \(]0,1]\).
By default it is set to 1.0.
input_names : Iterable[str], optional
The names of the input variables. If empty, consider all the input variables of the learning dataset.
By default it is set to ().
n_quadrature_points : int, optional
The total number of quadrature points used by the quadrature strategy to compute the marginal number of points by input dimension when discipline is not None. If 0, use \((1+P)^d\) points, where \(d\) is the dimension of the input space and \(P\) is the polynomial degree of the PCE.
By default it is set to 0.
output_names : Iterable[str], optional
The names of the output variables. If empty, consider all the output variables of the learning dataset.
By default it is set to ().
transformer : TransformerType, optional
The strategies to transform the variables. The values are instances of BaseTransformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the BaseTransformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.
By default it is set to {}.
use_cleaning : bool, optional
Whether to use the `CleaningStrategy`_ algorithm. Otherwise, use a fixed truncation strategy (`FixedStrategy`_).
By default it is set to False.
use_lars : bool, optional
Whether to use the `LARS`_ algorithm in the case of the least-squares regression.
By default it is set to False.
use_quadrature : bool, optional
Whether to estimate the coefficients of the PCE by a quadrature rule; if so, use the quadrature points stored in data or sample discipline; otherwise, estimate the coefficients by least-squares regression.
By default it is set to False.
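For illustration, the sketch below builds a least-squares PCE whose dataset inputs match the random variables declared in the probability space; the distribution name, its keyword arguments and the dataset helpers are assumptions about the GEMSEO API:

    from numpy import linspace

    from gemseo.algos.parameter_space import ParameterSpace
    from gemseo.datasets.io_dataset import IODataset
    from gemseo.mlearning.regression.algos.pce import PCERegressor

    # Toy learning dataset y = x^3 on [0, 1].
    x = linspace(0.0, 1.0, 30).reshape(-1, 1)
    y = x**3
    dataset = IODataset()
    dataset.add_input_group(x, variable_names=["x"])    # assumed helper
    dataset.add_output_group(y, variable_names=["y"])   # assumed helper

    # The random input "x" must carry the same name as the dataset input.
    space = ParameterSpace()
    space.add_random_variable("x", "OTUniformDistribution", minimum=0.0, maximum=1.0)

    model = PCERegressor(dataset, probability_space=space, degree=3)
    model.learn()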
PolynomialRegressor¶
Module: gemseo.mlearning.regression.algos.polyreg
- Required parameters
data : IODataset
The learning dataset.
degree : int
The polynomial degree.
- Optional parameters
fit_intercept : bool, optional
Whether to fit the intercept.
By default it is set to True.
input_names : Iterable[str], optional
The names of the input variables. If empty, consider all the input variables of the learning dataset.
By default it is set to ().
l2_penalty_ratio : float, optional
The penalty ratio related to the l2 regularization. If 1, the penalty is the Ridge penalty. If 0, this is the Lasso penalty. Between 0 and 1, the penalty is the ElasticNet penalty.
By default it is set to 1.0.
output_names : Iterable[str], optional
The names of the output variables. If empty, consider all the output variables of the learning dataset.
By default it is set to ().
penalty_level : float, optional
The penalty level, greater than or equal to 0. If 0, there is no penalty.
By default it is set to 0.0.
transformer : TransformerType, optional
The strategies to transform the variables. The values are instances of BaseTransformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the BaseTransformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.
By default it is set to {}.
**parameters : float | str | bool | None
The parameters of the machine learning algorithm.
RBFRegressor¶
Module: gemseo.mlearning.regression.algos.rbf
- Required parameters
data : IODataset
The learning dataset.
- Optional parameters
der_function : Callable[[RealArray], RealArray] | None, optional
The derivative of the radial basis function, only to be provided if function is a callable and if the use of the model with its derivative is required. If None and if function is a callable, an error will be raised. If None and if function is a string, the class will look for its internal implementation and will raise an error if it is missing. The der_function shall take three arguments (input_data, norm_input_data, eps). For an RBF of the form function(\(r\)), der_function(\(x\), \(|x|\), \(\epsilon\)) shall return \(\epsilon^{-1} x/|x| f'(|x|/\epsilon)\).
By default it is set to None.
epsilon : float | None, optional
An adjustable constant for Gaussian or multiquadric functions. If None, use the average distance between input data.
By default it is set to None.
function : Function | Callable[[float, float], float], optional
The radial basis function taking a radius \(r\) as input, representing a distance between two points. If it is a string, then it must be one of the following:
- "multiquadric" for \(\sqrt{(r/\epsilon)^2 + 1}\),
- "inverse" for \(1/\sqrt{(r/\epsilon)^2 + 1}\),
- "gaussian" for \(\exp(-(r/\epsilon)^2)\),
- "linear" for \(r\),
- "cubic" for \(r^3\),
- "quintic" for \(r^5\),
- "thin_plate" for \(r^2\log(r)\).
If it is a callable, then it must take the two arguments self and r as inputs, e.g. lambda self, r: sqrt((r/self.epsilon)**2 + 1) for the multiquadric function. The epsilon parameter will be available as self.epsilon. Other keyword arguments passed in will be available as well.
By default it is set to multiquadric.
input_names : Iterable[str], optional
The names of the input variables. If empty, consider all the input variables of the learning dataset.
By default it is set to ().
norm : str | Callable[[RealArray, RealArray], float], optional
The distance metric to be used, either a distance function name known by SciPy or a function that computes the distance between two points.
By default it is set to euclidean.
output_names : Iterable[str], optional
The names of the output variables. If empty, consider all the output variables of the learning dataset.
By default it is set to ().
smooth : float, optional
The degree of smoothness; 0 corresponds to an interpolation of the learning points.
By default it is set to 0.0.
transformer : TransformerType, optional
The strategies to transform the variables. The values are instances of BaseTransformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the BaseTransformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.
By default it is set to {}.
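For illustration, the sketch below passes the multiquadric function as a callable with the two-argument (self, r) signature described above; the dataset helpers are assumptions:

    from numpy import linspace, sqrt

    from gemseo.datasets.io_dataset import IODataset
    from gemseo.mlearning.regression.algos.rbf import RBFRegressor

    # Toy learning dataset y = x^2 on [0, 1].
    x = linspace(0.0, 1.0, 20).reshape(-1, 1)
    y = x**2
    dataset = IODataset()
    dataset.add_input_group(x, variable_names=["x"])    # assumed helper
    dataset.add_output_group(y, variable_names=["y"])   # assumed helper

    # The multiquadric function written as a callable, as in the description above;
    # smooth=0.0 keeps the default exact interpolation of the learning points.
    model = RBFRegressor(
        dataset,
        function=lambda self, r: sqrt((r / self.epsilon) ** 2 + 1),
        smooth=0.0,
    )
    model.learn()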
RandomForestRegressor¶
Module: gemseo.mlearning.regression.algos.random_forest
- Required parameters
data : IODataset
The learning dataset.
- Optional parameters
input_names : Iterable[str], optional
The names of the input variables. If empty, consider all the input variables of the learning dataset.
By default it is set to ().
n_estimators : int, optional
The number of trees in the forest.
By default it is set to 100.
output_names : Iterable[str], optional
The names of the output variables. If empty, consider all the output variables of the learning dataset.
By default it is set to ().
random_state : int | None, optional
The random state passed to the random number generator. Use an integer for reproducible results.
By default it is set to 0.
transformer : TransformerType, optional
The strategies to transform the variables. The values are instances of BaseTransformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the BaseTransformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.
By default it is set to {}.
RegressorChain¶
Module: gemseo.mlearning.regression.algos.regressor_chain
- Required parameters
data : IODataset
The learning dataset.
- Optional parameters
input_names : Iterable[str], optional
The names of the input variables. If empty, consider all the input variables of the learning dataset.
By default it is set to ().
output_names : Iterable[str], optional
The names of the output variables. If empty, consider all the output variables of the learning dataset.
By default it is set to ().
transformer : TransformerType, optional
The strategies to transform the variables. The values are instances of BaseTransformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the BaseTransformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.
By default it is set to {}.
**parameters : Any
The parameters of the machine learning algorithm.
SVMRegressor¶
Module: gemseo.mlearning.regression.algos.svm
- Required parameters
data : IODataset
The learning dataset.
- Optional parameters
input_names : Iterable[str], optional
The names of the input variables. If empty, consider all the input variables of the learning dataset.
By default it is set to ().
kernel : str, optional
The kernel type to be used.
By default it is set to rbf.
output_names : Iterable[str], optional
The names of the output variables. If empty, consider all the output variables of the learning dataset.
By default it is set to ().
transformer : TransformerType, optional
The strategies to transform the variables. The values are instances of BaseTransformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the BaseTransformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.
By default it is set to {}.
**parameters : Any
The parameters of the machine learning algorithm.
TPSRegressor¶
Module: gemseo.mlearning.regression.algos.thin_plate_spline
- Required parameters
data : IODataset
The learning dataset.
- Optional parameters
input_names : Iterable[str], optional
The names of the input variables. If empty, consider all the input variables of the learning dataset.
By default it is set to ().
norm : str | Callable[[NumberArray, NumberArray], float], optional
The distance metric to be used, either a distance function name known by SciPy or a function that computes the distance between two points.
By default it is set to euclidean.
output_names : Iterable[str], optional
The names of the output variables. If empty, consider all the output variables of the learning dataset.
By default it is set to ().
smooth : float, optional
The degree of smoothness; 0 corresponds to an interpolation of the learning points.
By default it is set to 0.0.
transformer : TransformerType, optional
The strategies to transform the variables. The values are instances of BaseTransformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the BaseTransformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.
By default it is set to {}.