.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/mlearning/calibration/plot_selection.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_examples_mlearning_calibration_plot_selection.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_mlearning_calibration_plot_selection.py:


Machine learning algorithm selection example
============================================

In this example, we use the :class:`.MLAlgoSelection` class
to perform a grid search over different algorithms and hyperparameter values.

.. GENERATED FROM PYTHON SOURCE LINES 23-38

.. code-block:: Python

    from __future__ import annotations

    import matplotlib.pyplot as plt
    from numpy import linspace
    from numpy import sort
    from numpy.random import default_rng

    from gemseo.algos.design_space import DesignSpace
    from gemseo.datasets.io_dataset import IODataset
    from gemseo.mlearning.core.selection import MLAlgoSelection
    from gemseo.mlearning.quality_measures.mse_measure import MSEMeasure

    rng = default_rng(54321)

.. GENERATED FROM PYTHON SOURCE LINES 39-48

Build dataset
-------------

The data are generated from the function :math:`f(x)=x^2`.
The input data :math:`\{x_i\}_{i=1,\cdots,20}` are chosen at random
over the interval :math:`[0,1]`.
The output value :math:`y_i = f(x_i) + \varepsilon_i` corresponds to
the evaluation of :math:`f` at :math:`x_i`,
corrupted by a Gaussian noise :math:`\varepsilon_i`
with zero mean and standard deviation :math:`\sigma=0.05`.

.. GENERATED FROM PYTHON SOURCE LINES 48-56

.. code-block:: Python

    n = 20
    x = sort(rng.random(n))
    y = x**2 + rng.normal(0, 0.05, n)

    dataset = IODataset()
    dataset.add_variable("x", x[:, None], dataset.INPUT_GROUP)
    dataset.add_variable("y", y[:, None], dataset.OUTPUT_GROUP)

.. GENERATED FROM PYTHON SOURCE LINES 57-62

Build selector
--------------

We consider three regression models,
each with several candidate hyperparameter values.
Model quality is measured by the mean squared error,
estimated with a k-fold cross-validation scheme (5 folds).

.. GENERATED FROM PYTHON SOURCE LINES 62-87

.. code-block:: Python

    selector = MLAlgoSelection(
        dataset, MSEMeasure, measure_evaluation_method_name="KFOLDS", n_folds=5
    )
    selector.add_candidate(
        "LinearRegressor",
        penalty_level=[0, 0.1, 1, 10, 20],
        l2_penalty_ratio=[0, 0.5, 1],
        fit_intercept=[True],
    )
    selector.add_candidate(
        "PolynomialRegressor",
        degree=[2, 3, 4, 10],
        penalty_level=[0, 0.1, 1, 10],
        l2_penalty_ratio=[1],
        fit_intercept=[True, False],
    )
    rbf_space = DesignSpace()
    rbf_space.add_variable("epsilon", 1, "float", 0.01, 0.1, 0.05)
    selector.add_candidate(
        "RBFRegressor",
        calib_space=rbf_space,
        calib_algo={"algo": "fullfact", "n_samples": 16},
        smooth=[0, 0.01, 0.1, 1, 10, 100],
    )

.. GENERATED FROM PYTHON SOURCE LINES 88-90

Select best candidate
---------------------

.. GENERATED FROM PYTHON SOURCE LINES 90-93

.. code-block:: Python

    best_algo = selector.select()
    best_algo

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    PolynomialRegressor(degree=4, fit_intercept=True, l2_penalty_ratio=1, penalty_level=0, random_state=0)
       based on the scikit-learn library
       built from 20 learning samples
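The grid search above ranks every candidate by its cross-validated mean squared
error and returns the one with the lowest score. The criterion itself can be
illustrated with plain NumPy, independently of GEMSEO: the ``kfold_mse`` helper
below is a hypothetical sketch, and a NumPy polynomial fit stands in for a
regression candidate.

.. code-block:: Python

    import numpy as np


    def kfold_mse(x, y, degree, n_folds=5):
        """Estimate the MSE of a polynomial model by k-fold cross-validation."""
        indices = np.arange(len(x))
        errors = []
        for fold in np.array_split(indices, n_folds):
            train = np.setdiff1d(indices, fold)
            # Fit on the training folds, evaluate on the held-out fold.
            coeffs = np.polyfit(x[train], y[train], degree)
            residuals = np.polyval(coeffs, x[fold]) - y[fold]
            errors.append(np.mean(residuals**2))
        return float(np.mean(errors))


    rng = np.random.default_rng(54321)
    x = np.sort(rng.random(20))
    y = x**2 + rng.normal(0, 0.05, 20)

    # Lower cross-validated MSE means a better candidate.
    scores = {degree: kfold_mse(x, y, degree) for degree in (2, 3, 4, 10)}
    best_degree = min(scores, key=scores.get)

This mirrors the role of :class:`.MSEMeasure` with ``"KFOLDS"`` in the selector:
each hyperparameter combination gets a score from data it was not trained on,
which penalizes overfitting configurations such as a very high polynomial degree.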
.. GENERATED FROM PYTHON SOURCE LINES 94-97

Plot results
------------

Plot the best model from each candidate algorithm.

.. GENERATED FROM PYTHON SOURCE LINES 97-106

.. code-block:: Python

    finex = linspace(0, 1, 1000)
    for candidate in selector.candidates:
        algo = candidate[0]
        print(algo)
        predy = algo.predict(finex[:, None])[:, 0]
        plt.plot(finex, predy, label=algo.SHORT_ALGO_NAME)
    plt.scatter(x, y, label="Training points")
    plt.legend()
    plt.show()

.. image-sg:: /examples/mlearning/calibration/images/sphx_glr_plot_selection_001.png
   :alt: plot selection
   :srcset: /examples/mlearning/calibration/images/sphx_glr_plot_selection_001.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    LinearRegressor(fit_intercept=True, l2_penalty_ratio=0, penalty_level=0, random_state=0)
       based on the scikit-learn library
       built from 20 learning samples
    PolynomialRegressor(degree=4, fit_intercept=True, l2_penalty_ratio=1, penalty_level=0, random_state=0)
       based on the scikit-learn library
       built from 20 learning samples
    RBFRegressor(epsilon=0.1, function=multiquadric, norm=euclidean, smooth=0.1)
       based on the SciPy library
       built from 20 learning samples

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 3.821 seconds)


.. _sphx_glr_download_examples_mlearning_calibration_plot_selection.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_selection.ipynb <plot_selection.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_selection.py <plot_selection.py>`

.. only:: html

  .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_