.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/mlearning/quality_measure/plot_quality_measure_for_comparison.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_examples_mlearning_quality_measure_plot_quality_measure_for_comparison.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_mlearning_quality_measure_plot_quality_measure_for_comparison.py:


Quality measure for surrogate model comparison
==============================================

In this example, we use a quality measure class to compare the performance
of a mixture of experts (MoE) and a random forest algorithm under different
circumstances. We consider two datasets: a 1D function and the Rosenbrock
dataset (two inputs and one output).

.. GENERATED FROM PYTHON SOURCE LINES 33-35

Import
------

.. GENERATED FROM PYTHON SOURCE LINES 35-49

.. code-block:: default

    from __future__ import division, unicode_literals

    import matplotlib.pyplot as plt
    from numpy import hstack, linspace, meshgrid, sin

    from gemseo.api import configure_logger, load_dataset
    from gemseo.core.dataset import Dataset
    from gemseo.mlearning.api import create_regression_model
    from gemseo.mlearning.qual_measure.mse_measure import MSEMeasure
    from gemseo.mlearning.transform.scaler.min_max_scaler import MinMaxScaler

    configure_logger()

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

.. GENERATED FROM PYTHON SOURCE LINES 50-54

Test on 1D dataset
------------------

In this section, we create a dataset from an analytical expression of a 1D
function and compare the errors of the two regression models.

.. GENERATED FROM PYTHON SOURCE LINES 56-58

Create 1D dataset from expression
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 58-76
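The test function defined next is deliberately non-smooth: it follows a sine branch up to x = 0.7 and switches to a shifted quadratic branch beyond. As a quick stdlib-only check (independent of the gallery code, with hypothetical helper names), the two one-sided values at x = 0.7 differ by roughly 1.5 — exactly the kind of discontinuity a single global model struggles with and a mixture of experts can capture with local models:

```python
import math

def branch_left(x):
    # branch active for x <= 0.7: 3 + 0.5*sin(14*x)
    return 3.0 + 0.5 * math.sin(14.0 * x)

def branch_right(x):
    # branch active for x > 0.7: 3 + 0.8 + 6*(x - 1)**2
    return 3.0 + 0.8 + 6.0 * (x - 1.0) ** 2

# size of the jump between the two branches at the switch point
jump = branch_right(0.7) - branch_left(0.7)
print(round(jump, 3))
```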
.. code-block:: default

    def data_gen(x):
        return 3 + 0.5 * sin(14 * x) * (x <= 0.7) + (x > 0.7) * (0.8 + 6 * (x - 1) ** 2)

    x = linspace(0, 1, 25)
    y = data_gen(x)

    data = hstack((x[:, None], y[:, None]))

    variables = ["x", "y"]
    sizes = {"x": 1, "y": 1}
    groups = {"x": Dataset.INPUT_GROUP, "y": Dataset.OUTPUT_GROUP}

    dataset = Dataset("dataset_name")
    dataset.set_from_array(data, variables, sizes, groups)

.. GENERATED FROM PYTHON SOURCE LINES 77-79

Plot 1D data
~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 79-85

.. code-block:: default

    x_refined = linspace(0, 1, 500)
    y_refined = data_gen(x_refined)

    plt.plot(x_refined, y_refined)
    plt.scatter(x, y)
    plt.show()

.. image:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_001.png
    :alt: plot quality measure for comparison
    :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 86-88

Create regression algorithms
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 88-106

.. code-block:: default

    moe = create_regression_model(
        "MixtureOfExperts", dataset, transformer={"outputs": MinMaxScaler()}
    )
    moe.set_clusterer("GaussianMixture", n_components=4)
    moe.set_classifier("KNNClassifier", n_neighbors=3)
    moe.set_regressor(
        "PolynomialRegression", degree=5, l2_penalty_ratio=1, penalty_level=0.00005
    )

    randfor = create_regression_model(
        "RandomForestRegressor",
        dataset,
        transformer={"outputs": MinMaxScaler()},
        n_estimators=50,
    )

.. GENERATED FROM PYTHON SOURCE LINES 107-109

Compute measures (Mean Squared Error)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 109-112

.. code-block:: default

    measure_moe = MSEMeasure(moe)
    measure_randfor = MSEMeasure(randfor)

.. GENERATED FROM PYTHON SOURCE LINES 113-115

Evaluate on training set directly (keyword: 'learn')
****************************************************

.. GENERATED FROM PYTHON SOURCE LINES 115-147
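The 'learn' keyword evaluates the mean squared error on the very points the model was trained on, so it is an optimistic (resubstitution) estimate of quality. A stdlib-only sketch of the measure itself, not gemseo's implementation:

```python
def mse(y_true, y_pred):
    # mean squared error between observed and predicted values
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# A degenerate "model" that memorises its training data has zero learn
# error, which is why learn error alone can be misleading.
y_train = [3.0, 3.5, 2.8]
memorised = list(y_train)
print(mse(y_train, memorised))  # 0.0
```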
.. code-block:: default

    print("Learn:")
    print("Error MoE:", measure_moe.evaluate(method="learn"))
    print("Error Random Forest:", measure_randfor.evaluate(method="learn"))

    plt.figure()
    plt.plot(x_refined, moe.predict(x_refined[:, None]).flatten(), label="MoE")
    plt.plot(x_refined, randfor.predict(x_refined[:, None]).flatten(), label="RndFr")
    plt.scatter(x, y)
    plt.legend()
    plt.ylim(2, 5)
    plt.show()

    plt.figure()
    plt.plot(
        x_refined, moe.predict_local_model(x_refined[:, None], 0).flatten(), label="MoE 0"
    )
    plt.plot(
        x_refined, moe.predict_local_model(x_refined[:, None], 1).flatten(), label="MoE 1"
    )
    plt.plot(
        x_refined, moe.predict_local_model(x_refined[:, None], 2).flatten(), label="MoE 2"
    )
    plt.plot(
        x_refined, moe.predict_local_model(x_refined[:, None], 3).flatten(), label="MoE 3"
    )
    plt.plot(x_refined, moe.predict(x_refined[:, None]).flatten(), label="MoE")
    plt.plot(x_refined, randfor.predict(x_refined[:, None]).flatten(), label="RndFr")
    plt.scatter(x, y)
    plt.legend()
    plt.ylim(2, 5)
    plt.show()

.. rst-class:: sphx-glr-horizontal

    * .. image:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_002.png
         :alt: plot quality measure for comparison
         :class: sphx-glr-multi-img

    * .. image:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_003.png
         :alt: plot quality measure for comparison
         :class: sphx-glr-multi-img

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Learn:
    Error MoE: [0.00130025]
    Error Random Forest: [0.01388602]

.. GENERATED FROM PYTHON SOURCE LINES 148-153

Evaluate using cross validation (keyword: 'kfolds')
***************************************************

To better estimate the generalization error, we perform a k-fold
cross-validation, as well as a leave-one-out ('loo') evaluation. We also plot
the predictions from the last iteration of the algorithm.

.. GENERATED FROM PYTHON SOURCE LINES 153-169
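A k-fold measure fits the model k times, each time holding out one fold for testing, so every sample is used exactly once as test data. A stdlib-only sketch of the index splitting (a hypothetical helper, not gemseo's internals):

```python
def kfold_indices(n_samples, n_folds):
    # split sample indices into (train, test) pairs, one per fold
    indices = list(range(n_samples))
    fold_sizes = [
        n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        for i in range(n_folds)
    ]
    folds, start = [], 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds

# 25 training points (as in the 1D dataset above) split into 5 folds
folds = kfold_indices(25, 5)
print(len(folds), len(folds[0][0]), len(folds[0][1]))  # 5 20 5
```

With `n_folds` equal to the number of samples, the same splitting degenerates into leave-one-out, which is what the 'loo' keyword evaluates.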
.. code-block:: default

    print("K-folds:")
    print("Error MoE:", measure_moe.evaluate("kfolds"))
    print("Error Random Forest:", measure_randfor.evaluate("kfolds"))

    print("Loo:")
    print("Error MoE:", measure_moe.evaluate("loo"))
    print("Error Random Forest:", measure_randfor.evaluate("loo"))

    plt.plot(x_refined, moe.predict(x_refined[:, None]).flatten(), label="MoE")
    plt.plot(
        x_refined, randfor.predict(x_refined[:, None]).flatten(), label="Random Forest"
    )
    plt.scatter(x, y)
    plt.legend()
    plt.show()

.. image:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_004.png
    :alt: plot quality measure for comparison
    :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    K-folds:
    Error MoE: [0.29601814]
    Error Random Forest: [0.14403934]
    Loo:
    Error MoE: [0.13216698]
    Error Random Forest: [0.07299357]

.. GENERATED FROM PYTHON SOURCE LINES 170-174

Test on 2D dataset (Rosenbrock)
-------------------------------

In this section, we load the Rosenbrock dataset and compare the error
measures for the two regression models.

.. GENERATED FROM PYTHON SOURCE LINES 176-178

Load dataset
~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 178-191

.. code-block:: default

    dataset = load_dataset("RosenbrockDataset", opt_naming=False)
    x = dataset.get_data_by_group(dataset.INPUT_GROUP)
    y = dataset.get_data_by_group(dataset.OUTPUT_GROUP)
    Y = y.reshape((10, 10))

    refinement = 100
    x_refined = linspace(-2, 2, refinement)
    X_1_refined, X_2_refined = meshgrid(x_refined, x_refined)
    x_1_refined, x_2_refined = X_1_refined.flatten(), X_2_refined.flatten()
    x_refined = hstack((x_1_refined[:, None], x_2_refined[:, None]))

    print(dataset)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Rosenbrock
       Number of samples: 100
       Number of variables: 2
       Variables names and sizes by group:
          inputs: x (2)
          outputs: rosen (1)
       Number of dimensions (total = 3) by group:
          inputs: 2
          outputs: 1
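Assuming the dataset implements the classical two-dimensional Rosenbrock function, f(x_1, x_2) = (1 - x_1)^2 + 100 (x_2 - x_1^2)^2, the surface forms a narrow curved valley with a global minimum of 0 at (1, 1), and its values grow steeply away from the valley — which explains the large error magnitudes reported below. A quick stdlib-only check:

```python
def rosenbrock(x_1, x_2):
    # classical 2D Rosenbrock "banana" function
    return (1.0 - x_1) ** 2 + 100.0 * (x_2 - x_1 ** 2) ** 2

print(rosenbrock(1.0, 1.0))   # 0.0 at the global minimum
print(rosenbrock(-2.0, 2.0))  # 409.0, steep far from the valley
```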
.. GENERATED FROM PYTHON SOURCE LINES 192-194

Create regression algorithms
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 194-211

.. code-block:: default

    moe = create_regression_model(
        "MixtureOfExperts", dataset, transformer={"outputs": MinMaxScaler()}
    )
    moe.set_clusterer("KMeans", n_clusters=3)
    moe.set_classifier("KNNClassifier", n_neighbors=5)
    moe.set_regressor(
        "PolynomialRegression", degree=5, l2_penalty_ratio=1, penalty_level=0.1
    )

    randfor = create_regression_model(
        "RandomForestRegressor",
        dataset,
        transformer={"outputs": MinMaxScaler()},
        n_estimators=200,
    )

.. GENERATED FROM PYTHON SOURCE LINES 212-214

Compute measures (Mean Squared Error)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 214-226

.. code-block:: default

    measure_moe = MSEMeasure(moe)
    measure_randfor = MSEMeasure(randfor)

    print("Learn:")
    print("Error MoE:", measure_moe.evaluate(method="learn"))
    print("Error Random Forest:", measure_randfor.evaluate(method="learn"))

    print("K-folds:")
    print("Error MoE:", measure_moe.evaluate("kfolds"))
    print("Error Random Forest:", measure_randfor.evaluate("kfolds"))

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Learn:
    Error MoE: [8.19866424]
    Error Random Forest: [4567.22027112]
    K-folds:
    Error MoE: [8983.49635387]
    Error Random Forest: [104270.92710389]

.. GENERATED FROM PYTHON SOURCE LINES 227-229

Plot data
~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 229-233

.. code-block:: default

    plt.imshow(Y, interpolation="nearest")
    plt.colorbar()
    plt.show()

.. image:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_005.png
    :alt: plot quality measure for comparison
    :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 234-236

Plot predictions
~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 236-244
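Before examining the prediction maps, recall how a mixture of experts predicts: a classifier routes each input to one local regressor, and the routed expert's output becomes the prediction. A toy stdlib-only sketch with hypothetical experts and a threshold classifier (gemseo's version uses the KNN classifier on cluster labels instead):

```python
# two hypothetical local regressors ("experts"), one per region
local_models = {
    0: lambda x: 2.0 * x,         # expert for the "left" region
    1: lambda x: 10.0 - 3.0 * x,  # expert for the "right" region
}

def classify(x):
    # stand-in for the trained classifier: a simple threshold on the input
    return 0 if x < 2.0 else 1

def moe_predict(x):
    # route the input to its expert, then evaluate that local model
    return local_models[classify(x)](x)

print(moe_predict(1.0), moe_predict(3.0))  # 2.0 1.0
```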
.. code-block:: default

    moe.learn()
    randfor.learn()

    Y_pred_moe = moe.predict(x_refined).reshape((refinement, refinement))
    Y_pred_moe_0 = moe.predict_local_model(x_refined, 0).reshape((refinement, refinement))
    Y_pred_moe_1 = moe.predict_local_model(x_refined, 1).reshape((refinement, refinement))
    Y_pred_moe_2 = moe.predict_local_model(x_refined, 2).reshape((refinement, refinement))
    Y_pred_randfor = randfor.predict(x_refined).reshape((refinement, refinement))

.. GENERATED FROM PYTHON SOURCE LINES 245-247

Plot mixture of experts predictions
***********************************

.. GENERATED FROM PYTHON SOURCE LINES 247-251

.. code-block:: default

    plt.imshow(Y_pred_moe)
    plt.colorbar()
    plt.show()

.. image:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_006.png
    :alt: plot quality measure for comparison
    :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 252-254

Plot local models
*****************

.. GENERATED FROM PYTHON SOURCE LINES 254-269

.. code-block:: default

    plt.figure()
    plt.imshow(Y_pred_moe_0)
    plt.colorbar()
    plt.show()

    plt.figure()
    plt.imshow(Y_pred_moe_1)
    plt.colorbar()
    plt.show()

    plt.figure()
    plt.imshow(Y_pred_moe_2)
    plt.colorbar()
    plt.show()

.. rst-class:: sphx-glr-horizontal

    * .. image:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_007.png
         :alt: plot quality measure for comparison
         :class: sphx-glr-multi-img

    * .. image:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_008.png
         :alt: plot quality measure for comparison
         :class: sphx-glr-multi-img

    * .. image:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_009.png
         :alt: plot quality measure for comparison
         :class: sphx-glr-multi-img

.. GENERATED FROM PYTHON SOURCE LINES 270-272

Plot random forest predictions
******************************

.. GENERATED FROM PYTHON SOURCE LINES 272-275
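A random forest predicts by averaging the outputs of its individual trees; since each tree is piecewise constant, the averaged map tends to look blocky rather than smooth. A toy sketch of the averaging step with hypothetical tree outputs for a single input:

```python
# outputs of three hypothetical decision trees for one query point
tree_predictions = [380.0, 415.0, 409.0]

# the forest prediction is simply the arithmetic mean of the tree outputs
forest_prediction = sum(tree_predictions) / len(tree_predictions)
print(round(forest_prediction, 3))  # 401.333
```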
.. code-block:: default

    plt.imshow(Y_pred_randfor)
    plt.colorbar()
    plt.show()

.. image:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_010.png
    :alt: plot quality measure for comparison
    :class: sphx-glr-single-img

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 8.468 seconds)


.. _sphx_glr_download_examples_mlearning_quality_measure_plot_quality_measure_for_comparison.py:


.. only:: html

  .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_quality_measure_for_comparison.py <plot_quality_measure_for_comparison.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_quality_measure_for_comparison.ipynb <plot_quality_measure_for_comparison.ipynb>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_