.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/mlearning/quality_measure/plot_quality_measure_for_comparison.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_examples_mlearning_quality_measure_plot_quality_measure_for_comparison.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_mlearning_quality_measure_plot_quality_measure_for_comparison.py:


Quality measure for surrogate model comparison
==============================================

In this example, we use the quality measure class to compare the performance
of a mixture of experts (MoE) and a random forest algorithm under different
circumstances.

We consider two different datasets: a 1D function and the Rosenbrock dataset
(two inputs and one output).

.. GENERATED FROM PYTHON SOURCE LINES 31-33

Import
------

.. GENERATED FROM PYTHON SOURCE LINES 33-48

.. code-block:: default

    import matplotlib.pyplot as plt
    from gemseo.api import configure_logger
    from gemseo.api import load_dataset
    from gemseo.core.dataset import Dataset
    from gemseo.mlearning.api import create_regression_model
    from gemseo.mlearning.qual_measure.mse_measure import MSEMeasure
    from gemseo.mlearning.transform.scaler.min_max_scaler import MinMaxScaler
    from numpy import hstack
    from numpy import linspace
    from numpy import meshgrid
    from numpy import sin

    configure_logger()

.. GENERATED FROM PYTHON SOURCE LINES 49-53

Test on 1D dataset
------------------

In this section, we create a dataset from an analytical expression of a
1D function and compare the errors of the two regression models.

.. GENERATED FROM PYTHON SOURCE LINES 55-57

Create 1D dataset from expression
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 57-75

.. code-block:: default

    def data_gen(x):
        return 3 + 0.5 * sin(14 * x) * (x <= 0.7) + (x > 0.7) * (0.8 + 6 * (x - 1) ** 2)


    x = linspace(0, 1, 25)
    y = data_gen(x)
    data = hstack((x[:, None], y[:, None]))

    variables = ["x", "y"]
    sizes = {"x": 1, "y": 1}
    groups = {"x": Dataset.INPUT_GROUP, "y": Dataset.OUTPUT_GROUP}

    dataset = Dataset("dataset_name")
    dataset.set_from_array(data, variables, sizes, groups)

.. GENERATED FROM PYTHON SOURCE LINES 76-78

Plot 1D data
~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 78-84

.. code-block:: default

    x_refined = linspace(0, 1, 500)
    y_refined = data_gen(x_refined)
    plt.plot(x_refined, y_refined)
    plt.scatter(x, y)
    plt.show()

.. image-sg:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_001.png
   :alt: plot quality measure for comparison
   :srcset: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 85-87

Create regression algorithms
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 87-105

.. code-block:: default

    moe = create_regression_model(
        "MOERegressor", dataset, transformer={"outputs": MinMaxScaler()}
    )
    moe.set_clusterer("GaussianMixture", n_components=4)
    moe.set_classifier("KNNClassifier", n_neighbors=3)
    moe.set_regressor(
        "PolynomialRegressor", degree=5, l2_penalty_ratio=1, penalty_level=0.00005
    )

    randfor = create_regression_model(
        "RandomForestRegressor",
        dataset,
        transformer={"outputs": MinMaxScaler()},
        n_estimators=50,
    )

.. GENERATED FROM PYTHON SOURCE LINES 106-108

Compute measures (Mean Squared Error)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
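The ``MSEMeasure`` class used below scores a regression model by the mean
squared error (MSE) between its predictions and the reference outputs of the
dataset; with ``method="learn"`` the error is measured on the learning set
itself. As a reminder of what this quantity is, here is a minimal NumPy sketch
of the formula (an illustration only, with made-up numbers, not GEMSEO's
implementation):

.. code-block:: python

    from numpy import array

    # Hypothetical reference outputs and model predictions, for illustration only.
    y_true = array([3.0, 3.2, 3.5, 3.1])
    y_pred = array([3.1, 3.2, 3.4, 3.0])

    # Mean squared error: average of the squared prediction errors.
    mse = ((y_pred - y_true) ** 2).mean()
    print(mse)  # approximately 0.0075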
.. GENERATED FROM PYTHON SOURCE LINES 108-111

.. code-block:: default

    measure_moe = MSEMeasure(moe)
    measure_randfor = MSEMeasure(randfor)

.. GENERATED FROM PYTHON SOURCE LINES 112-114

Evaluate on training set directly (keyword: 'learn')
*****************************************************

.. GENERATED FROM PYTHON SOURCE LINES 114-146

.. code-block:: default

    print("Learn:")
    print("Error MoE:", measure_moe.evaluate(method="learn"))
    print("Error Random Forest:", measure_randfor.evaluate(method="learn"))

    plt.figure()
    plt.plot(x_refined, moe.predict(x_refined[:, None]).flatten(), label="MoE")
    plt.plot(x_refined, randfor.predict(x_refined[:, None]).flatten(), label="RndFr")
    plt.scatter(x, y)
    plt.legend()
    plt.ylim(2, 5)
    plt.show()

    plt.figure()
    plt.plot(
        x_refined, moe.predict_local_model(x_refined[:, None], 0).flatten(), label="MoE 0"
    )
    plt.plot(
        x_refined, moe.predict_local_model(x_refined[:, None], 1).flatten(), label="MoE 1"
    )
    plt.plot(
        x_refined, moe.predict_local_model(x_refined[:, None], 2).flatten(), label="MoE 2"
    )
    plt.plot(
        x_refined, moe.predict_local_model(x_refined[:, None], 3).flatten(), label="MoE 3"
    )
    plt.plot(x_refined, moe.predict(x_refined[:, None]).flatten(), label="MoE")
    plt.plot(x_refined, randfor.predict(x_refined[:, None]).flatten(), label="RndFr")
    plt.scatter(x, y)
    plt.legend()
    plt.ylim(2, 5)
    plt.show()

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_002.png
         :alt: plot quality measure for comparison
         :srcset: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_002.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_003.png
         :alt: plot quality measure for comparison
         :srcset: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_003.png
         :class: sphx-glr-multi-img

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Learn:
    Error MoE: [0.00130025]
    Error Random Forest: [0.01388602]

.. GENERATED FROM PYTHON SOURCE LINES 147-152

Evaluate using cross validation (keyword: 'kfolds')
****************************************************

To better estimate the generalization error, we perform a k-fold
cross-validation, as well as a leave-one-out (LOO) cross-validation.
We also plot the predictions from the last iteration of the algorithm.

.. GENERATED FROM PYTHON SOURCE LINES 152-168

.. code-block:: default

    print("K-folds:")
    print("Error MoE:", measure_moe.evaluate("kfolds"))
    print("Error Random Forest:", measure_randfor.evaluate("kfolds"))

    print("Loo:")
    print("Error MoE:", measure_moe.evaluate("loo"))
    print("Error Random Forest:", measure_randfor.evaluate("loo"))

    plt.plot(x_refined, moe.predict(x_refined[:, None]).flatten(), label="MoE")
    plt.plot(
        x_refined, randfor.predict(x_refined[:, None]).flatten(), label="Random Forest"
    )
    plt.scatter(x, y)
    plt.legend()
    plt.show()

.. image-sg:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_004.png
   :alt: plot quality measure for comparison
   :srcset: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_004.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    K-folds:
    Error MoE: [28.76108725]
    Error Random Forest: [27.38342059]
    Loo:
    Error MoE: [26.41217106]
    Error Random Forest: [27.84836118]
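Both cross-validation estimates are far larger than the errors measured on the
learning set, which illustrates why evaluating a surrogate on its own training
data is optimistic. For reference, k-fold cross-validation estimates the
generalization error by repeatedly holding out one fold of the learning set,
training the model on the remaining folds and measuring the error on the
held-out points. The following self-contained NumPy sketch illustrates this
splitting logic on toy data (the ``fit_and_predict`` helper is hypothetical;
this is a conceptual sketch, not GEMSEO's internal implementation):

.. code-block:: python

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0, 1, 25)
    y = np.sin(14 * x)  # toy data, for illustration only

    n_folds = 5
    folds = np.array_split(rng.permutation(x.size), n_folds)


    def fit_and_predict(x_train, y_train, x_test):
        # Hypothetical surrogate: a cubic polynomial fitted by least squares.
        coefficients = np.polyfit(x_train, y_train, deg=3)
        return np.polyval(coefficients, x_test)


    squared_errors = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        y_pred = fit_and_predict(x[train], y[train], x[test])
        squared_errors.append((y_pred - y[test]) ** 2)

    # The k-fold MSE is the mean squared error over all held-out points.
    print(np.concatenate(squared_errors).mean())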
.. GENERATED FROM PYTHON SOURCE LINES 169-173

Test on 2D dataset (Rosenbrock)
-------------------------------

In this section, we load the Rosenbrock dataset and compare the error
measures for the two regression models.

.. GENERATED FROM PYTHON SOURCE LINES 175-177

Load dataset
~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 177-190

.. code-block:: default

    dataset = load_dataset("RosenbrockDataset", opt_naming=False)
    x = dataset.get_data_by_group(dataset.INPUT_GROUP)
    y = dataset.get_data_by_group(dataset.OUTPUT_GROUP)
    Y = y.reshape((10, 10))

    refinement = 100
    x_refined = linspace(-2, 2, refinement)
    X_1_refined, X_2_refined = meshgrid(x_refined, x_refined)
    x_1_refined, x_2_refined = X_1_refined.flatten(), X_2_refined.flatten()
    x_refined = hstack((x_1_refined[:, None], x_2_refined[:, None]))

    print(dataset)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Rosenbrock
       Number of samples: 100
       Number of variables: 2
       Variables names and sizes by group:
          inputs: x (2)
          outputs: rosen (1)
       Number of dimensions (total = 3) by group:
          inputs: 2
          outputs: 1

.. GENERATED FROM PYTHON SOURCE LINES 191-193

Create regression algorithms
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 193-210

.. code-block:: default

    moe = create_regression_model(
        "MOERegressor", dataset, transformer={"outputs": MinMaxScaler()}
    )
    moe.set_clusterer("KMeans", n_clusters=3)
    moe.set_classifier("KNNClassifier", n_neighbors=5)
    moe.set_regressor(
        "PolynomialRegressor", degree=5, l2_penalty_ratio=1, penalty_level=0.1
    )

    randfor = create_regression_model(
        "RandomForestRegressor",
        dataset,
        transformer={"outputs": MinMaxScaler()},
        n_estimators=200,
    )

.. GENERATED FROM PYTHON SOURCE LINES 211-213

Compute measures (Mean Squared Error)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 213-225

.. code-block:: default

    measure_moe = MSEMeasure(moe)
    measure_randfor = MSEMeasure(randfor)

    print("Learn:")
    print("Error MoE:", measure_moe.evaluate(method="learn"))
    print("Error Random Forest:", measure_randfor.evaluate(method="learn"))

    print("K-folds:")
    print("Error MoE:", measure_moe.evaluate("kfolds"))
    print("Error Random Forest:", measure_randfor.evaluate("kfolds"))

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    Learn:
    Error MoE: [8.19866424]
    Error Random Forest: [4567.22027112]
    K-folds:
    Error MoE: [1.64097462e+13]
    Error Random Forest: [1.03057662e+13]

.. GENERATED FROM PYTHON SOURCE LINES 226-228

Plot data
~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 228-232

.. code-block:: default

    plt.imshow(Y, interpolation="nearest")
    plt.colorbar()
    plt.show()

.. image-sg:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_005.png
   :alt: plot quality measure for comparison
   :srcset: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_005.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 233-235

Plot predictions
~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 235-243

.. code-block:: default

    moe.learn()
    randfor.learn()
    Y_pred_moe = moe.predict(x_refined).reshape((refinement, refinement))
    Y_pred_moe_0 = moe.predict_local_model(x_refined, 0).reshape((refinement, refinement))
    Y_pred_moe_1 = moe.predict_local_model(x_refined, 1).reshape((refinement, refinement))
    Y_pred_moe_2 = moe.predict_local_model(x_refined, 2).reshape((refinement, refinement))
    Y_pred_randfor = randfor.predict(x_refined).reshape((refinement, refinement))
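Since the Rosenbrock dataset samples an analytical function, one can also
inspect the pointwise error of each surrogate on the refined grid. The sketch
below is an optional, complementary check that is not part of the original
example: it reuses the ``x_1_refined``, ``x_2_refined``, ``Y_pred_moe`` and
``Y_pred_randfor`` arrays defined above and assumes the standard
two-dimensional Rosenbrock function ``(1 - x1)**2 + 100 * (x2 - x1**2)**2``.

.. code-block:: python

    # Analytical Rosenbrock values on the refined grid (assumed expression).
    Y_true = (
        (1 - x_1_refined) ** 2 + 100 * (x_2_refined - x_1_refined ** 2) ** 2
    ).reshape((refinement, refinement))

    # Absolute error maps of the two surrogates.
    plt.figure()
    plt.imshow(abs(Y_pred_moe - Y_true))
    plt.colorbar()
    plt.title("MoE absolute error")
    plt.show()

    plt.figure()
    plt.imshow(abs(Y_pred_randfor - Y_true))
    plt.colorbar()
    plt.title("Random forest absolute error")
    plt.show()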
.. GENERATED FROM PYTHON SOURCE LINES 244-246

Plot mixture of experts predictions
***********************************

.. GENERATED FROM PYTHON SOURCE LINES 246-250

.. code-block:: default

    plt.imshow(Y_pred_moe)
    plt.colorbar()
    plt.show()

.. image-sg:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_006.png
   :alt: plot quality measure for comparison
   :srcset: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_006.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 251-253

Plot local models
*****************

.. GENERATED FROM PYTHON SOURCE LINES 253-268

.. code-block:: default

    plt.figure()
    plt.imshow(Y_pred_moe_0)
    plt.colorbar()
    plt.show()

    plt.figure()
    plt.imshow(Y_pred_moe_1)
    plt.colorbar()
    plt.show()

    plt.figure()
    plt.imshow(Y_pred_moe_2)
    plt.colorbar()
    plt.show()

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_007.png
         :alt: plot quality measure for comparison
         :srcset: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_007.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_008.png
         :alt: plot quality measure for comparison
         :srcset: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_008.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_009.png
         :alt: plot quality measure for comparison
         :srcset: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_009.png
         :class: sphx-glr-multi-img

.. GENERATED FROM PYTHON SOURCE LINES 269-271

Plot random forest predictions
******************************

.. GENERATED FROM PYTHON SOURCE LINES 271-274

.. code-block:: default

    plt.imshow(Y_pred_randfor)
    plt.colorbar()
    plt.show()

.. image-sg:: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_010.png
   :alt: plot quality measure for comparison
   :srcset: /examples/mlearning/quality_measure/images/sphx_glr_plot_quality_measure_for_comparison_010.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 6.344 seconds)


.. _sphx_glr_download_examples_mlearning_quality_measure_plot_quality_measure_for_comparison.py:


.. only:: html

    .. container:: sphx-glr-footer
        :class: sphx-glr-footer-example

        .. container:: sphx-glr-download sphx-glr-download-python

            :download:`Download Python source code: plot_quality_measure_for_comparison.py <plot_quality_measure_for_comparison.py>`

        .. container:: sphx-glr-download sphx-glr-download-jupyter

            :download:`Download Jupyter notebook: plot_quality_measure_for_comparison.ipynb <plot_quality_measure_for_comparison.ipynb>`


.. only:: html

    .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_