# Surrogate disciplines

A SurrogateDiscipline is built from the name of an MLRegressionAlgo and its options. The available names and their options are listed below.

Warning

Some algorithms may require installing GEMSEO with all its features, and others may depend on plugins.

Note

Not all the features of the wrapped algorithm libraries are necessarily exposed through GEMSEO.

## GaussianProcessRegressor

Required parameters
• data : Dataset

The learning dataset.

Optional parameters
• alpha : float | ndarray, optional

The nugget effect to regularize the model.

By default it is set to 1e-10.

• bounds : __Bounds | Mapping[str, __Bounds] | None, optional

The lower and upper bounds of the parameter length scales when kernel is None. Either a unique lower-upper pair common to all the inputs or lower-upper pairs for some of them. When bounds is None or when an input has no pair, the lower bound is 0.01 and the upper bound is 100.

By default it is set to None.

• input_names : Iterable[str] | None, optional

The names of the input variables. If None, consider all the input variables of the learning dataset.

By default it is set to None.

• kernel : Kernel | None, optional

The kernel specifying the covariance model. If None, use a Matérn(2.5).

By default it is set to None.

• n_restarts_optimizer : int, optional

The number of restarts of the optimizer.

By default it is set to 10.

• optimizer : str | Callable, optional

The optimization algorithm to find the parameter length scales.

By default it is set to fmin_l_bfgs_b.

• output_names : Iterable[str] | None, optional

The names of the output variables. If None, consider all the output variables of the learning dataset.

By default it is set to None.

• random_state : int | None, optional

The seed used to initialize the centers. If None, the random number generator is the RandomState instance used by numpy.random.

By default it is set to None.

• transformer : TransformerType, optional

The strategies to transform the variables. The values are instances of Transformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the Transformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.

By default it is set to {}.
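
The `bounds` and `kernel` descriptions above can be illustrated with a minimal standard-library sketch. The helper names `resolve_length_scale_bounds` and `matern_52` are hypothetical and are not part of GEMSEO. The sketch only mirrors the documented fallback rule (0.01 and 100 when no pair is given) and the usual closed form of the Matérn covariance with smoothness 5/2.

```python
import math

# Default pair quoted in the description of ``bounds``.
DEFAULT_BOUNDS = (0.01, 100.0)


def resolve_length_scale_bounds(bounds, input_names):
    """Return one (lower, upper) pair per input name (hypothetical helper)."""
    if bounds is None:
        return {name: DEFAULT_BOUNDS for name in input_names}
    if isinstance(bounds, dict):
        # Pairs given for some inputs only: the others fall back to the default.
        return {name: tuple(bounds.get(name, DEFAULT_BOUNDS)) for name in input_names}
    # Otherwise, a single pair shared by all the inputs.
    return {name: tuple(bounds) for name in input_names}


def matern_52(distance, length_scale=1.0):
    """Matérn covariance with smoothness 5/2 for a non-negative distance."""
    scaled = math.sqrt(5.0) * distance / length_scale
    return (1.0 + scaled + scaled**2 / 3.0) * math.exp(-scaled)
```

For instance, `resolve_length_scale_bounds({"x": (0.5, 5.0)}, ["x", "y"])` keeps the pair given for `x` and uses the default pair for `y`.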

## GradientBoostingRegressor

Required parameters
• data : Dataset

The learning dataset.

Optional parameters
• input_names : Iterable[str], optional

The names of the input variables. If None, consider all the input variables of the learning dataset.

By default it is set to None.

• n_estimators : int, optional

The number of boosting stages to perform.

By default it is set to 100.

• output_names : Iterable[str], optional

The names of the output variables. If None, consider all the output variables of the learning dataset.

By default it is set to None.

• transformer : Mapping[str, TransformerType] | None, optional

The strategies to transform the variables. The values are instances of Transformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the Transformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.

By default it is set to None.

• **parameters : Any

The parameters of the machine learning algorithm.

## LinearRegressor

Required parameters
• data : Dataset

The learning dataset.

Optional parameters
• fit_intercept : bool, optional

Whether to fit the intercept.

By default it is set to True.

• input_names : Iterable[str] | None, optional

The names of the input variables. If None, consider all the input variables of the learning dataset.

By default it is set to None.

• l2_penalty_ratio : float, optional

The penalty ratio related to the l2 regularization. If 1, use the Ridge penalty. If 0, use the Lasso penalty. Between 0 and 1, use the ElasticNet penalty.

By default it is set to 1.0.

• output_names : Iterable[str] | None, optional

The names of the output variables. If None, consider all the output variables of the learning dataset.

By default it is set to None.

• penalty_level : float, optional

The penalty level, greater than or equal to 0. If 0, there is no penalty.

By default it is set to 0.0.

• transformer : TransformerType, optional

The strategies to transform the variables. The values are instances of Transformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the Transformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.

By default it is set to {}.

• **parameters : float | int | str | bool | None

The parameters of the machine learning algorithm.
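
How `penalty_level` and `l2_penalty_ratio` interact can be sketched as follows. This is an illustration under assumptions, not the library's implementation: the function name is hypothetical, the l2 term is assumed to be the squared Euclidean norm, and the wrapped library may scale the terms differently.

```python
def regularization_penalty(coefficients, penalty_level=0.0, l2_penalty_ratio=1.0):
    """Blend of l2 (Ridge) and l1 (Lasso) penalties (assumed form)."""
    if penalty_level < 0:
        raise ValueError("penalty_level must be greater than or equal to 0.")
    if not 0 <= l2_penalty_ratio <= 1:
        raise ValueError("l2_penalty_ratio must lie between 0 and 1.")
    l1_term = sum(abs(c) for c in coefficients)
    l2_term = sum(c * c for c in coefficients)  # assumption: squared l2 norm
    return penalty_level * (
        l2_penalty_ratio * l2_term + (1 - l2_penalty_ratio) * l1_term
    )
```

With a ratio of 1 only the Ridge term remains, with a ratio of 0 only the Lasso term remains, and intermediate values give an ElasticNet-style blend.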

## MLPRegressor

Required parameters
• data : Dataset

The learning dataset.

Optional parameters
• hidden_layer_sizes : tuple[int], optional

The number of neurons per hidden layer.

By default it is set to (100,).

• input_names : Iterable[str], optional

The names of the input variables. If None, consider all the input variables of the learning dataset.

By default it is set to None.

• output_names : Iterable[str], optional

The names of the output variables. If None, consider all the output variables of the learning dataset.

By default it is set to None.

• transformer : Mapping[str, TransformerType] | None, optional

The strategies to transform the variables. The values are instances of Transformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the Transformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.

By default it is set to None.

• **parameters : Any

The parameters of the machine learning algorithm.

## MOERegressor

Required parameters
• data : Dataset

The learning dataset.

Optional parameters
• hard : bool, optional

Whether clustering/classification should be hard or soft.

By default it is set to True.

• input_names : Iterable[str] | None, optional

The names of the input variables. If None, consider all the input variables of the learning dataset.

By default it is set to None.

• output_names : Iterable[str] | None, optional

The names of the output variables. If None, consider all the output variables of the learning dataset.

By default it is set to None.

• transformer : TransformerType, optional

The strategies to transform the variables. The values are instances of Transformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the Transformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.

By default it is set to {}.
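
The difference between the hard and soft settings of `hard` can be sketched for a single scalar output. The function below is hypothetical and assumes that the local model predictions and the cluster membership probabilities have already been computed. It is a simplified picture of a mixture of experts, not GEMSEO's code.

```python
def mixture_prediction(local_predictions, cluster_probabilities, hard=True):
    """Combine the local model predictions for one input point."""
    if hard:
        # Hard: keep only the local model of the most probable cluster.
        best = max(
            range(len(cluster_probabilities)),
            key=cluster_probabilities.__getitem__,
        )
        return local_predictions[best]
    # Soft: weight each local prediction by its cluster probability.
    return sum(p * y for p, y in zip(cluster_probabilities, local_predictions))
```

The soft variant yields a prediction that varies smoothly across cluster boundaries, whereas the hard variant can jump from one local model to another.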

## OTGaussianProcessRegressor

Required parameters
• data : Dataset

The learning dataset.

Optional parameters
• input_names : Iterable[str], optional

The names of the input variables. If None, consider all the input variables of the learning dataset.

By default it is set to None.

• output_names : Iterable[str], optional

The names of the output variables. If None, consider all the output variables of the learning dataset.

By default it is set to None.

• transformer : TransformerType | None, optional

The strategies to transform the variables. The values are instances of Transformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the Transformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.

By default it is set to None.

• use_hmat : bool | None, optional

Whether to use HMAT rather than LAPACK as the linear algebra method. If None, use HMAT when the learning size is greater than MAX_SIZE_FOR_LAPACK.

By default it is set to None.
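
The selection rule described for `use_hmat` can be sketched as below. The function is hypothetical, and the threshold is passed explicitly because the value of MAX_SIZE_FOR_LAPACK is not stated here.

```python
def should_use_hmat(use_hmat, learning_size, max_size_for_lapack):
    """Return True for HMAT and False for LAPACK (hypothetical helper)."""
    if use_hmat is None:
        # Automatic choice: HMAT only for large learning sets.
        return learning_size > max_size_for_lapack
    return bool(use_hmat)
```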

## PCERegressor

Required parameters
• data : Dataset | None

The learning dataset, required for the least-squares regression or, in the case of quadrature, when discipline is None.

• probability_space : ParameterSpace

The set of random input variables defined by OTDistribution instances.

Optional parameters
• cleaning_options : CleaningOptions | None, optional

The options of the CleaningStrategy. If None, use DEFAULT_CLEANING_OPTIONS.

By default it is set to None.

• degree : int, optional

The polynomial degree of the PCE.

By default it is set to 2.

• discipline : MDODiscipline | None, optional

The discipline to be sampled if use_quadrature is True and data is None.

By default it is set to None.

• hyperbolic_parameter : float, optional

The $$q$$-quasi norm parameter of the hyperbolic and anisotropic enumerate function, defined over the interval $$]0,1]$$.

By default it is set to 1.0.

• input_names : Iterable[str] | None, optional

The names of the input variables. If None, consider all the input variables of the learning dataset.

By default it is set to None.

• n_quadrature_points : int, optional

The total number of quadrature points used by the quadrature strategy to compute the marginal number of points by input dimension when discipline is not None. If 0, use $$(1+P)^d$$ points, where $$d$$ is the dimension of the input space and $$P$$ is the polynomial degree of the PCE.

By default it is set to 0.

• output_names : Iterable[str] | None, optional

The names of the output variables. If None, consider all the output variables of the learning dataset.

By default it is set to None.

• transformer : TransformerType, optional

The strategies to transform the variables. The values are instances of Transformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the Transformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.

By default it is set to {}.

• use_cleaning : bool, optional

Whether to use the CleaningStrategy algorithm. Otherwise, use a fixed truncation strategy (FixedStrategy).

By default it is set to False.

• use_lars : bool, optional

Whether to use the LARS algorithm in the case of the least-squares regression.

By default it is set to False.

• use_quadrature : bool, optional

Whether to estimate the coefficients of the PCE by a quadrature rule. If so, use the quadrature points stored in data or sample discipline. Otherwise, estimate the coefficients by least-squares regression.

By default it is set to False.
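
Two quantities mentioned in the PCERegressor options can be illustrated with a small standard-library sketch: the default quadrature size $$(1+P)^d$$ and the effect of the $$q$$-quasi norm on the retained polynomial multi-indices. The function names are hypothetical, and the truncation shown is the isotropic hyperbolic rule only. The enumerate function of the wrapped library also handles anisotropic weights and index ordering, which are not reproduced here.

```python
from itertools import product


def default_quadrature_size(degree, dimension):
    """Number of quadrature points (1 + P)^d used when none is requested."""
    return (1 + degree) ** dimension


def hyperbolic_multi_indices(degree, dimension, q=1.0, tolerance=1e-9):
    """Multi-indices whose q-quasi norm does not exceed the degree."""
    if not 0 < q <= 1:
        raise ValueError("q must lie in the interval ]0, 1].")
    kept = []
    for index in product(range(degree + 1), repeat=dimension):
        quasi_norm = sum(a**q for a in index) ** (1.0 / q)
        # The tolerance absorbs floating-point round-off in the powers.
        if quasi_norm <= degree + tolerance:
            kept.append(index)
    return kept
```

With `q=1.0` the rule reduces to the usual total-degree truncation. Smaller values of `q` discard interaction terms first, which gives a sparser basis.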

## PolynomialRegressor

Required parameters
• data : Dataset

The learning dataset.

• degree : int

The polynomial degree.

Optional parameters
• fit_intercept : bool, optional

Whether to fit the intercept.

By default it is set to True.

• input_names : Iterable[str] | None, optional

The names of the input variables. If None, consider all the input variables of the learning dataset.

By default it is set to None.

• l2_penalty_ratio : float, optional

The penalty ratio related to the l2 regularization. If 1, the penalty is the Ridge penalty. If 0, this is the Lasso penalty. Between 0 and 1, the penalty is the ElasticNet penalty.

By default it is set to 1.0.

• output_names : Iterable[str] | None, optional

The names of the output variables. If None, consider all the output variables of the learning dataset.

By default it is set to None.

• penalty_level : float, optional

The penalty level, greater than or equal to 0. If 0, there is no penalty.

By default it is set to 0.0.

• transformer : TransformerType, optional

The strategies to transform the variables. The values are instances of Transformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the Transformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.

By default it is set to {}.

• **parameters : float | int | str | bool | None

The parameters of the machine learning algorithm.

## RBFRegressor

Required parameters
• data : Dataset

The learning dataset.

Optional parameters
• der_function : Callable[[ndarray], ndarray] | None, optional

The derivative of the radial basis function, only to be provided if function is a callable and if the use of the model with its derivative is required. If None and if function is a callable, an error will be raised. If None and if function is a string, the class will look for its internal implementation and will raise an error if it is missing. The der_function shall take three arguments (input_data, norm_input_data, eps). For an RBF of the form function($$r$$), der_function($$x$$, $$|x|$$, $$\epsilon$$) shall return $$\epsilon^{-1} x/|x| f'(|x|/\epsilon)$$.

By default it is set to None.

• epsilon : float | None, optional

An adjustable constant for Gaussian or multiquadric functions. If None, use the average distance between input data.

By default it is set to None.

• function : str | Callable[[float, float], float], optional

The radial basis function taking a radius $$r$$ as input, representing a distance between two points. If it is a string, then it must be one of the following:

• "multiquadric" for $$\sqrt{(r/\epsilon)^2 + 1}$$,

• "inverse" for $$1/\sqrt{(r/\epsilon)^2 + 1}$$,

• "gaussian" for $$\exp(-(r/\epsilon)^2)$$,

• "linear" for $$r$$,

• "cubic" for $$r^3$$,

• "quintic" for $$r^5$$,

• "thin_plate" for $$r^2\log(r)$$.

If it is a callable, then it must take the two arguments self and r as inputs, e.g. lambda self, r: sqrt((r/self.epsilon)**2 + 1) for the multiquadric function. The epsilon parameter will be available as self.epsilon. Other keyword arguments passed in will be available as well.

By default it is set to multiquadric.

• input_names : Iterable[str] | None, optional

The names of the input variables. If None, consider all the input variables of the learning dataset.

By default it is set to None.

• norm : str | Callable[[ndarray, ndarray], float], optional

The distance metric to be used, either a distance function name known by SciPy or a function that computes the distance between two points.

By default it is set to euclidean.

• output_names : Iterable[str] | None, optional

The names of the output variables. If None, consider all the output variables of the learning dataset.

By default it is set to None.

• smooth : float, optional

The degree of smoothness. If 0, the model interpolates the learning points.

By default it is set to 0.0.

• transformer : TransformerType, optional

The strategies to transform the variables. The values are instances of Transformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the Transformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.

By default it is set to {}.
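
The named radial basis functions listed under `function` and the contract of `der_function` can be sketched in plain Python. Everything below is illustrative: the dictionary and the function `multiquadric_derivative` are hypothetical names, plain lists stand in for arrays, and the thin-plate value at $$r=0$$ is set to its limit 0 as a convention of this sketch.

```python
import math

# Radial basis functions of the radius r, with the constant eps (epsilon).
RADIAL_BASIS_FUNCTIONS = {
    "multiquadric": lambda r, eps: math.sqrt((r / eps) ** 2 + 1),
    "inverse": lambda r, eps: 1 / math.sqrt((r / eps) ** 2 + 1),
    "gaussian": lambda r, eps: math.exp(-((r / eps) ** 2)),
    "linear": lambda r, eps: r,
    "cubic": lambda r, eps: r**3,
    "quintic": lambda r, eps: r**5,
    # Assumption: use the limit value 0 at r = 0.
    "thin_plate": lambda r, eps: r**2 * math.log(r) if r > 0 else 0.0,
}


def multiquadric_derivative(input_data, norm_input_data, eps):
    """Return eps^{-1} x/|x| f'(|x|/eps) for f(u) = sqrt(u^2 + 1)."""
    if norm_input_data == 0:
        # The gradient of the multiquadric vanishes at the origin.
        return [0.0 for _ in input_data]
    u = norm_input_data / eps
    f_prime = u / math.sqrt(u * u + 1)
    factor = f_prime / (eps * norm_input_data)
    return [factor * x_i for x_i in input_data]
```

The derivative follows the three-argument form (input_data, norm_input_data, eps) described for `der_function`.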

## RandomForestRegressor

Required parameters
• data : Dataset

The learning dataset.

Optional parameters
• input_names : Iterable[str] | None, optional

The names of the input variables. If None, consider all the input variables of the learning dataset.

By default it is set to None.

• n_estimators : int, optional

The number of trees in the forest.

By default it is set to 100.

• output_names : Iterable[str] | None, optional

The names of the output variables. If None, consider all the output variables of the learning dataset.

By default it is set to None.

• transformer : TransformerType, optional

The strategies to transform the variables. The values are instances of Transformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the Transformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.

By default it is set to {}.
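
The way the keys of `transformer` can name either variables or groups of variables may be pictured with the sketch below. The function is hypothetical, strings stand in for Transformer instances, and letting a variable-level entry take precedence over a group-level one is an assumption of this sketch rather than documented behavior.

```python
def expand_transformers(transformer, input_names, output_names):
    """Map each variable name to the transformer that would apply to it."""
    groups = {"inputs": list(input_names), "outputs": list(output_names)}
    per_variable = {}
    # A group key applies its transformer to all the variables of the group.
    for key, value in transformer.items():
        if key in groups:
            for name in groups[key]:
                per_variable[name] = value
    # Assumption: an entry keyed by a variable name overrides its group entry.
    for key, value in transformer.items():
        if key not in groups:
            per_variable[key] = value
    return per_variable
```

For example, `expand_transformers({"inputs": "scaler"}, ["x1", "x2"], ["y"])` assigns the same placeholder to `x1` and `x2` and leaves `y` untransformed.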

## RegressorChain

Required parameters
• data : Dataset

The learning dataset.

Optional parameters
• input_names : Iterable[str], optional

The names of the input variables. If None, consider all the input variables of the learning dataset.

By default it is set to None.

• output_names : Iterable[str], optional

The names of the output variables. If None, consider all the output variables of the learning dataset.

By default it is set to None.

• transformer : Mapping[str, TransformerType] | None, optional

The strategies to transform the variables. The values are instances of Transformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the Transformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.

By default it is set to None.

• **parameters : Any

The parameters of the machine learning algorithm.

## SVMRegressor

Required parameters
• data : Dataset

The learning dataset.

Optional parameters
• input_names : Iterable[str] | None, optional

The names of the input variables. If None, consider all the input variables of the learning dataset.

By default it is set to None.

• kernel : str, optional

The kernel type to be used.

By default it is set to rbf.

• output_names : Iterable[str] | None, optional

The names of the output variables. If None, consider all the output variables of the learning dataset.

By default it is set to None.

• transformer : Mapping[str, TransformerType] | None, optional

The strategies to transform the variables. The values are instances of Transformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the Transformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.

By default it is set to None.

• **parameters : Any

The parameters of the machine learning algorithm.

## TPSRegressor

Required parameters
• data : Dataset

The learning dataset.

Optional parameters
• input_names : Iterable[str], optional

The names of the input variables. If None, consider all the input variables of the learning dataset.

By default it is set to None.

• norm : str | Callable[[ndarray, ndarray], float], optional

The distance metric to be used, either a distance function name known by SciPy or a function that computes the distance between two points.

By default it is set to euclidean.

• output_names : Iterable[str], optional

The names of the output variables. If None, consider all the output variables of the learning dataset.

By default it is set to None.

• smooth : float, optional

The degree of smoothness. If 0, the model interpolates the learning points.

By default it is set to 0.0.

• transformer : Mapping[str, TransformerType] | None, optional

The strategies to transform the variables. The values are instances of Transformer while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the Transformer will be applied to all the variables of this group. If IDENTITY, do not transform the variables.

By default it is set to None.

• **parameters : Any