API

The following examples apply the machine learning API to regression models: listing the available models, inspecting the options of a model, and building a model from a dataset.

from __future__ import division, unicode_literals

from gemseo.api import (
    configure_logger,
    create_design_space,
    create_discipline,
    create_scenario,
)
from gemseo.mlearning.api import (
    create_regression_model,
    get_regression_models,
    get_regression_options,
)

configure_logger()

Out:

<RootLogger root (INFO)>

Get available regression models

print(get_regression_models())

Out:

['GaussianProcessRegression', 'LinearRegression', 'MixtureOfExperts', 'PCERegression', 'PolynomialRegression', 'RBFRegression', 'RandomForestRegressor']

Get regression model options

print(get_regression_options("GaussianProcessRegression"))

Out:

+---------------------------+--------------------------------------------------------------------------------------------+---------------------------+
|            Name           |                                        Description                                         |            Type           |
+---------------------------+--------------------------------------------------------------------------------------------+---------------------------+
|           alpha           |                         The nugget effect to regularize the model.                         |           number          |
|        input_names        |  The names of the input variables. if none, consider all input variables mentioned in the  |            null           |
|                           |                                     learning dataset.                                      |                           |
|           kernel          |                    The kernel function. if none, use a ``matern(2.5)``.                    |            null           |
|    n_restarts_optimizer   |                          The number of restarts of the optimizer.                          |          integer          |
|         optimizer         |                  The optimization algorithm to find the hyperparameters.                   |           string          |
|        output_names       | The names of the output variables. if none, consider all input variables mentioned in the  |            null           |
|                           |                                     learning dataset.                                      |                           |
|        random_state       |    The seed used to initialize the centers. if none, the random number generator is the    |            null           |
|                           |                        randomstate instance used by `numpy.random`.                        |                           |
|        transformer        |           The strategies to transform the variables. the values are instances of           |            null           |
|                           |  :class:`.transformer` while the keys are the names of either the variables or the groups  |                           |
|                           |  of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. if a   |                           |
|                           | group is specified, the :class:`.transformer` will be applied to all the variables of this |                           |
|                           |                      group. if none, do not transform the variables.                       |                           |
+---------------------------+--------------------------------------------------------------------------------------------+---------------------------+
    INFO - 14:41:02: +---------------------------+--------------------------------------------------------------------------------------------+---------------------------+
    INFO - 14:41:02: |            Name           |                                        Description                                         |            Type           |
    INFO - 14:41:02: +---------------------------+--------------------------------------------------------------------------------------------+---------------------------+
    INFO - 14:41:02: |           alpha           |                         The nugget effect to regularize the model.                         |           number          |
    INFO - 14:41:02: |        input_names        |  The names of the input variables. if none, consider all input variables mentioned in the  |            null           |
    INFO - 14:41:02: |                           |                                     learning dataset.                                      |                           |
    INFO - 14:41:02: |           kernel          |                    The kernel function. if none, use a ``matern(2.5)``.                    |            null           |
    INFO - 14:41:02: |    n_restarts_optimizer   |                          The number of restarts of the optimizer.                          |          integer          |
    INFO - 14:41:02: |         optimizer         |                  The optimization algorithm to find the hyperparameters.                   |           string          |
    INFO - 14:41:02: |        output_names       | The names of the output variables. if none, consider all input variables mentioned in the  |            null           |
    INFO - 14:41:02: |                           |                                     learning dataset.                                      |                           |
    INFO - 14:41:02: |        random_state       |    The seed used to initialize the centers. if none, the random number generator is the    |            null           |
    INFO - 14:41:02: |                           |                        randomstate instance used by `numpy.random`.                        |                           |
    INFO - 14:41:02: |        transformer        |           The strategies to transform the variables. the values are instances of           |            null           |
    INFO - 14:41:02: |                           |  :class:`.transformer` while the keys are the names of either the variables or the groups  |                           |
    INFO - 14:41:02: |                           |  of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. if a   |                           |
    INFO - 14:41:02: |                           | group is specified, the :class:`.transformer` will be applied to all the variables of this |                           |
    INFO - 14:41:02: |                           |                      group. if none, do not transform the variables.                       |                           |
    INFO - 14:41:02: +---------------------------+--------------------------------------------------------------------------------------------+---------------------------+
{'$schema': 'http://json-schema.org/schema#', 'type': 'object', 'properties': {'transformer': {'description': 'The strategies to transform the variables. The values are instances of :class:`.Transformer` while the keys are the names of either the variables or the groups of variables, e.g. "inputs" or "outputs" in the case of the regression algorithms. If a group is specified, the :class:`.Transformer` will be applied to all the variables of this group. If None, do not transform the variables.', 'type': 'null'}, 'input_names': {'description': 'The names of the input variables. If None, consider all input variables mentioned in the learning dataset.', 'type': 'null'}, 'output_names': {'description': 'The names of the output variables. If None, consider all input variables mentioned in the learning dataset.', 'type': 'null'}, 'kernel': {'description': 'The kernel function. If None, use a ``Matern(2.5)``.', 'type': 'null'}, 'alpha': {'description': 'The nugget effect to regularize the model.', 'type': 'number'}, 'optimizer': {'description': 'The optimization algorithm to find the hyperparameters.', 'type': 'string'}, 'n_restarts_optimizer': {'description': 'The number of restarts of the optimizer.', 'type': 'integer'}, 'random_state': {'description': 'The seed used to initialize the centers. If None, the random number generator is the RandomState instance used by `numpy.random`.', 'type': 'null'}}, 'required': ['alpha', 'n_restarts_optimizer', 'optimizer']}
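
The last line of the output above is a JSON-schema-like dictionary, which is easier to use programmatically than the printed table. The following is a minimal sketch, assuming the returned value has the structure shown in that line; the `schema` literal is copied from the output with the descriptions omitted, and `split_options` is a hypothetical helper, not part of GEMSEO.

```python
# Excerpt of the dictionary printed above (descriptions omitted for brevity).
schema = {
    "type": "object",
    "properties": {
        "transformer": {"type": "null"},
        "input_names": {"type": "null"},
        "output_names": {"type": "null"},
        "kernel": {"type": "null"},
        "alpha": {"type": "number"},
        "optimizer": {"type": "string"},
        "n_restarts_optimizer": {"type": "integer"},
        "random_state": {"type": "null"},
    },
    "required": ["alpha", "n_restarts_optimizer", "optimizer"],
}


def split_options(schema):
    """Hypothetical helper: split option names by membership in ``required``.

    Returns two alphabetically sorted lists: the names listed under
    ``required`` and the remaining property names.
    """
    required = sorted(schema.get("required", []))
    optional = sorted(set(schema["properties"]) - set(required))
    return required, optional


required, optional = split_options(schema)
print("required:", required)
print("optional:", optional)
```

With the schema above, the options listed under `required` are the ones with a concrete type (`number`, `integer`, `string`), while the others are typed `null`.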

Create regression model

# Define a discipline with two outputs that are linear in x_1 and x_2.
expressions_dict = {"y_1": "1+2*x_1+3*x_2", "y_2": "-1-2*x_1-3*x_2"}
discipline = create_discipline(
    "AnalyticDiscipline", name="func", expressions_dict=expressions_dict
)

# Sample both inputs over the unit interval.
design_space = create_design_space()
design_space.add_variable("x_1", l_b=0.0, u_b=1.0)
design_space.add_variable("x_2", l_b=0.0, u_b=1.0)

# Cache every evaluation in memory so that the DOE samples can be exported.
discipline.set_cache_policy(discipline.MEMORY_FULL_CACHE)
scenario = create_scenario(
    [discipline], "DisciplinaryOpt", "y_1", design_space, scenario_type="DOE"
)
scenario.execute({"algo": "fullfact", "n_samples": 9})

# Build the learning dataset from the cache and train a linear regression model.
dataset = discipline.cache.export_to_dataset()
model = create_regression_model("LinearRegression", data=dataset)
model.learn()

print(model)

Out:

    INFO - 14:41:02:
    INFO - 14:41:02: *** Start DOE Scenario execution ***
    INFO - 14:41:02: DOEScenario
    INFO - 14:41:02:    Disciplines: func
    INFO - 14:41:02:    MDOFormulation: DisciplinaryOpt
    INFO - 14:41:02:    Algorithm: fullfact
    INFO - 14:41:02: Optimization problem:
    INFO - 14:41:02:    Minimize: y_1(x_1, x_2)
    INFO - 14:41:02:    With respect to: x_1, x_2
    INFO - 14:41:02: Full factorial design required. Number of samples along each direction for a design vector of size 2 with 9 samples: 3
    INFO - 14:41:02: Final number of samples for DOE = 9 vs 9 requested
    INFO - 14:41:02: DOE sampling:   0%|          | 0/9 [00:00<?, ?it]
    INFO - 14:41:02: DOE sampling: 100%|██████████| 9/9 [00:00<00:00, 580.44 it/sec, obj=6]
    INFO - 14:41:02: Optimization result:
    INFO - 14:41:02: Objective value = 1.0
    INFO - 14:41:02: The result is feasible.
    INFO - 14:41:02: Status: None
    INFO - 14:41:02: Optimizer message: None
    INFO - 14:41:02: Number of calls to the objective function by the optimizer: 9
    INFO - 14:41:02: Design space:
    INFO - 14:41:02: +------+-------------+-------+-------------+-------+
    INFO - 14:41:02: | name | lower_bound | value | upper_bound | type  |
    INFO - 14:41:02: +------+-------------+-------+-------------+-------+
    INFO - 14:41:02: | x_1  |      0      |   0   |      1      | float |
    INFO - 14:41:02: | x_2  |      0      |   0   |      1      | float |
    INFO - 14:41:02: +------+-------------+-------+-------------+-------+
    INFO - 14:41:02: *** DOE Scenario run terminated ***
/home/docs/checkouts/readthedocs.org/user_builds/gemseo/conda/stable/lib/python3.8/site-packages/sklearn/linear_model/_base.py:148: FutureWarning: 'normalize' was deprecated in version 1.0 and will be removed in 1.2. Please leave the normalize parameter to its default value to silence this warning. The default behavior of this estimator is to not do any normalization. If normalization is needed please use sklearn.preprocessing.StandardScaler instead.
  warnings.warn(
LinearRegression(fit_intercept=True, l2_penalty_ratio=1.0, penalty_level=0.0)
   based on the scikit-learn library
   built from 9 learning samples
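
The log above reports that a full-factorial design with 9 samples in 2 dimensions uses 3 levels per direction, and that the best objective value is 1.0. The sketch below reproduces these numbers in plain Python. It is an illustration under stated assumptions, not GEMSEO's implementation: `full_factorial` and `y_1` are hypothetical helpers, and the levels are assumed to be evenly spaced between the bounds.

```python
from itertools import product


def full_factorial(bounds, n_samples):
    """Hypothetical helper: evenly spaced full-factorial grid.

    Uses the largest number of levels per direction such that
    levels ** dimension does not exceed n_samples.
    """
    dimension = len(bounds)
    levels = int(round(n_samples ** (1.0 / dimension)))
    # Correct possible floating-point rounding of the root.
    while levels ** dimension > n_samples:
        levels -= 1
    while (levels + 1) ** dimension <= n_samples:
        levels += 1
    if levels < 2:
        raise ValueError("At least two levels per direction are needed.")
    axes = [
        [low + (high - low) * i / (levels - 1) for i in range(levels)]
        for low, high in bounds
    ]
    return list(product(*axes))


def y_1(x_1, x_2):
    """First output of the analytic discipline defined in the example."""
    return 1 + 2 * x_1 + 3 * x_2


# Same bounds and sample budget as in the example.
samples = full_factorial([(0.0, 1.0), (0.0, 1.0)], n_samples=9)
best = min(samples, key=lambda point: y_1(*point))

print(len(samples), "samples")
print("best point:", best, "objective:", y_1(*best))
```

The minimum is reached at the lower bounds of both variables, which matches the design space values and the objective value reported in the log.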

Total running time of the script: ( 0 minutes 0.083 seconds)
