.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/mlearning/calibration/plot_selection.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_examples_mlearning_calibration_plot_selection.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_mlearning_calibration_plot_selection.py:


Machine learning algorithm selection example
============================================

In this example we use the :class:`.MLAlgoSelection` class to perform a grid
search over different algorithms and hyperparameter values.

.. GENERATED FROM PYTHON SOURCE LINES 23-32

.. code-block:: default

    import matplotlib.pyplot as plt
    import numpy as np
    from gemseo.algos.design_space import DesignSpace
    from gemseo.core.dataset import Dataset
    from gemseo.mlearning.core.selection import MLAlgoSelection
    from gemseo.mlearning.qual_measure.mse_measure import MSEMeasure

    np.random.seed(54321)

.. GENERATED FROM PYTHON SOURCE LINES 33-41

Build dataset
-------------

The data are sampled from a 1D function :math:`f:[0,1]\to[0,1]`, where
:math:`f(x)=x^2`. The inputs :math:`(x_i)_{i=1,\cdots,n}` are chosen randomly
from the interval :math:`[0,1]`. The outputs :math:`y_i = f(x_i) + \epsilon_i`
contain added noise, where :math:`\epsilon_i \sim \mathcal{N}(0,\sigma^2)`.
We choose :math:`n=20` and :math:`\sigma=0.05`.

.. GENERATED FROM PYTHON SOURCE LINES 41-49

.. code-block:: default

    n = 20
    x = np.sort(np.random.random(n))
    y = x**2 + np.random.normal(0, 0.05, n)
    dataset = Dataset()
    dataset.add_variable("x", x[:, None], Dataset.INPUT_GROUP)
    dataset.add_variable("y", y[:, None], Dataset.OUTPUT_GROUP, cache_as_input=False)

.. GENERATED FROM PYTHON SOURCE LINES 50-55

Build selector
--------------

We consider three regression models, each with different possible
hyperparameter values. A mean squared error quality measure is used with a
k-fold cross-validation scheme (5 folds).
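Before building the selector, it may help to see what the 5-fold
cross-validated MSE measure computes for a single candidate. The sketch below
is a minimal, self-contained illustration, not the GEMSEO implementation: the
helper ``kfold_mse`` and the plain ``numpy.polyfit`` model are ours, and the
toy data are regenerated with NumPy's ``Generator`` API, so the numbers differ
slightly from the example's.

.. code-block:: python

    import numpy as np


    def kfold_mse(x, y, degree, n_folds=5, seed=54321):
        """Mean squared error of a polynomial fit, estimated by k-fold CV.

        Illustrative helper only; this is not the GEMSEO implementation.
        """
        rng = np.random.default_rng(seed)
        # Shuffle the sample indices and split them into k disjoint folds.
        folds = np.array_split(rng.permutation(len(x)), n_folds)
        errors = []
        for k in range(n_folds):
            test = folds[k]
            train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            # Fit on the k-1 training folds only.
            coeffs = np.polyfit(x[train], y[train], degree)
            # Evaluate the squared prediction error on the held-out fold.
            residuals = np.polyval(coeffs, x[test]) - y[test]
            errors.append(np.mean(residuals**2))
        # The quality measure is the error averaged over the k folds.
        return float(np.mean(errors))


    # Toy data similar to the example's: noisy samples of f(x) = x**2.
    rng = np.random.default_rng(54321)
    x = np.sort(rng.random(20))
    y = x**2 + rng.normal(0.0, 0.05, 20)

    # A degree-2 polynomial matches the data; degree 10 overfits the noise.
    mse2 = kfold_mse(x, y, degree=2)
    mse10 = kfold_mse(x, y, degree=10)
    print(f"degree 2: {mse2:.5f}  degree 10: {mse10:.5f}")

The selector below does the analogous comparison, but over the full grid of
candidate algorithms and hyperparameter values, keeping the model whose
cross-validated measure is best.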
.. GENERATED FROM PYTHON SOURCE LINES 55-78

.. code-block:: default

    selector = MLAlgoSelection(dataset, MSEMeasure, eval_method="kfolds", n_folds=5)
    selector.add_candidate(
        "LinearRegressor",
        penalty_level=[0, 0.1, 1, 10, 20],
        l2_penalty_ratio=[0, 0.5, 1],
        fit_intercept=[True],
    )
    selector.add_candidate(
        "PolynomialRegressor",
        degree=[2, 3, 4, 10],
        penalty_level=[0, 0.1, 1, 10],
        l2_penalty_ratio=[1],
        fit_intercept=[True, False],
    )
    rbf_space = DesignSpace()
    rbf_space.add_variable("epsilon", 1, "float", 0.01, 0.1, 0.05)
    selector.add_candidate(
        "RBFRegressor",
        calib_space=rbf_space,
        calib_algo={"algo": "fullfact", "n_samples": 16},
        smooth=[0, 0.01, 0.1, 1, 10, 100],
    )

.. GENERATED FROM PYTHON SOURCE LINES 79-81

Select best candidate
---------------------

.. GENERATED FROM PYTHON SOURCE LINES 81-84

.. code-block:: default

    best_algo = selector.select()
    print(best_algo)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    PolynomialRegressor(degree=2, fit_intercept=False, l2_penalty_ratio=1, penalty_level=0)
       based on the scikit-learn library
       built from 20 learning samples

.. GENERATED FROM PYTHON SOURCE LINES 85-88

Plot results
------------

Plot the best models from each candidate algorithm.

.. GENERATED FROM PYTHON SOURCE LINES 88-97

.. code-block:: default

    finex = np.linspace(0, 1, 1000)
    for candidate in selector.candidates:
        algo = candidate[0]
        print(algo)
        predy = algo.predict(finex[:, None])[:, 0]
        plt.plot(finex, predy, label=algo.SHORT_ALGO_NAME)
    plt.scatter(x, y, label="Training points")
    plt.legend()
    plt.show()

.. image-sg:: /examples/mlearning/calibration/images/sphx_glr_plot_selection_001.png
   :alt: plot selection
   :srcset: /examples/mlearning/calibration/images/sphx_glr_plot_selection_001.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 Out:
 .. code-block:: none

    LinearRegressor(fit_intercept=True, l2_penalty_ratio=1, penalty_level=0.1)
       based on the scikit-learn library
       built from 20 learning samples
    PolynomialRegressor(degree=2, fit_intercept=False, l2_penalty_ratio=1, penalty_level=0)
       based on the scikit-learn library
       built from 20 learning samples
    RBFRegressor(epsilon=0.1, function='multiquadric', norm='euclidean', smooth=0.01)
       based on the SciPy library
       built from 20 learning samples

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 0.985 seconds)


.. _sphx_glr_download_examples_mlearning_calibration_plot_selection.py:

.. only:: html

   .. container:: sphx-glr-footer
      :class: sphx-glr-footer-example

      .. container:: sphx-glr-download sphx-glr-download-python

         :download:`Download Python source code: plot_selection.py <plot_selection.py>`

      .. container:: sphx-glr-download sphx-glr-download-jupyter

         :download:`Download Jupyter notebook: plot_selection.ipynb <plot_selection.ipynb>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_