High-level functions#
The gemseo.mlearning package
includes high-level functions
to list the available regression model class names,
get their options
and create regression models from these class names.
from __future__ import annotations
from gemseo import configure_logger
from gemseo import create_benchmark_dataset
from gemseo.mlearning import create_regression_model
from gemseo.mlearning import get_regression_models
from gemseo.mlearning import get_regression_options
configure_logger()
<RootLogger root (INFO)>
Available models#
Use the get_regression_models() function
to list the available model class names:
get_regression_models()
['GaussianProcessRegressor', 'GradientBoostingRegressor', 'LinearRegressor', 'MLPRegressor', 'MOERegressor', 'OTGaussianProcessRegressor', 'PCERegressor', 'PolynomialRegressor', 'RBFRegressor', 'RandomForestRegressor', 'RegressorChain', 'SMTRegressor', 'SVMRegressor', 'TPSRegressor']
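For instance, a script can check that a given class name is available
before creating the corresponding model; a minimal sketch:
"RBFRegressor" in get_regression_models()
True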
Available model options#
Use the get_regression_options() function
to get the options of a model
from its class name:
get_regression_options("GaussianProcessRegressor", pretty_print=False)
{'additionalProperties': False, 'description': 'The settings of the Gaussian process regressor from scikit-learn.', 'properties': {'transformer': {'description': 'The strategies to transform the variables.\n\nThe values are instances of :class:`.BaseTransformer`\nwhile the keys are the names of\neither the variables\nor the groups of variables,\ne.g. ``"inputs"`` or ``"outputs"``\nin the case of the regression algorithms.\nIf a group is specified,\nthe :class:`.BaseTransformer` will be applied\nto all the variables of this group.\nIf :attr:`.IDENTITY`, do not transform the variables.', 'title': 'Transformer', 'type': 'object'}, 'parameters': {'description': 'Other parameters.', 'title': 'Parameters', 'type': 'object'}, 'input_names': {'default': [], 'description': 'The names of the input variables', 'items': {'type': 'string'}, 'title': 'Input Names', 'type': 'array'}, 'output_names': {'default': [], 'description': 'The names of the output variables', 'items': {'type': 'string'}, 'title': 'Output Names', 'type': 'array'}, 'kernel': {'anyOf': [{}, {'type': 'null'}], 'default': None, 'description': 'The kernel specifying the covariance model.\n\nIf ``None``, use a Matérn(2.5).', 'title': 'Kernel'}, 'bounds': {'anyOf': [{'items': {}, 'type': 'array'}, {'maxItems': 2, 'minItems': 2, 'prefixItems': [{'type': 'number'}, {'type': 'number'}], 'type': 'array'}, {'additionalProperties': {'maxItems': 2, 'minItems': 2, 'prefixItems': [{'type': 'number'}, {'type': 'number'}], 'type': 'array'}, 'type': 'object'}], 'default': [], 'description': 'The lower and upper bounds of the length scales.\n\nEither a unique lower-upper pair common to all the inputs\nor lower-upper pairs for some of them.\nWhen ``bounds`` is empty or when an input has no pair,\nthe lower bound is 0.01 and the upper bound is 100.\n\nThis argument is ignored when ``kernel`` is ``None``.', 'title': 'Bounds'}, 'alpha': {'default': 1e-10, 'description': 'The nugget effect to regularize the model.', 'title': 'Alpha', 'type': 'number'}, 'optimizer': {'anyOf': [{'type': 'string'}, {}], 'default': 'fmin_l_bfgs_b', 'description': 'The optimization algorithm to find the parameter length scales.', 'title': 'Optimizer'}, 'n_restarts_optimizer': {'default': 10, 'description': 'The number of restarts of the optimizer.', 'minimum': 0, 'title': 'N Restarts Optimizer', 'type': 'integer'}, 'random_state': {'anyOf': [{'minimum': 0, 'type': 'integer'}, {'type': 'null'}], 'default': 0, 'description': 'The random state parameter.\n\nIf ``None``, use the global random state instance from ``numpy.random``.\nCreating the model multiple times will produce different results.\nIf ``int``, use a new random number generator seeded by this integer.\nThis will produce the same results.', 'title': 'Random State'}}, 'title': 'GaussianProcessRegressor_Settings', 'type': 'object'}
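For a more readable view, the same function can be called with pretty_print=True,
which, under this sketch's assumption, also displays the options as a formatted table
in addition to returning the dictionary:
get_regression_options("GaussianProcessRegressor", pretty_print=True)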
See also
The functions
get_regression_models()
and get_regression_options()
can be very useful for developers.
As a user,
it may be easier to consult this page
to find out about the available algorithms and their options.
Creation#
Given a training dataset, e.g.
dataset = create_benchmark_dataset("RosenbrockDataset", opt_naming=False)
use the create_regression_model() function
to create a regression model from its class name and settings:
model = create_regression_model("RBFRegressor", data=dataset)
model.learn()
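Once trained, the model can be evaluated at a new input point;
a minimal sketch, assuming the Rosenbrock dataset names its two-dimensional input variable "x":
from numpy import array
model.predict({"x": array([1.0, 1.0])})
The result is a dictionary mapping the output names to the predicted values.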