Gaussian process (GP) regression#

A GaussianProcessRegressor is a GP regression model based on scikit-learn.

Problem#

In this example, we represent the function \(f(x)=(6x-2)^2\sin(12x-4)\) [FSK08] by an AnalyticDiscipline:

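from gemseo import create_discipline

# Wrap the analytic expression of f as a discipline with input "x" and output "y".
discipline = create_discipline(
    "AnalyticDiscipline",
    name="f",
    expressions={"y": "(6*x-2)**2*sin(12*x-4)"},
)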

and seek to approximate it over the input space \([0,1]\):

from gemseo import create_design_space

input_space = create_design_space()
input_space.add_variable("x", lower_bound=0.0, upper_bound=1.0)

To do this, we create a training dataset with 6 equispaced points:

from gemseo import sample_disciplines

training_dataset = sample_disciplines(
    [discipline], input_space, "y", algo_name="PYDOE_FULLFACT", n_samples=6
)

INFO - 16:16:07: *** Start Sampling execution ***
INFO - 16:16:07: Sampling
INFO - 16:16:07:    Disciplines: f
INFO - 16:16:07:    MDO formulation: MDF
INFO - 16:16:07: Running the algorithm PYDOE_FULLFACT:
INFO - 16:16:07:     17%|█▋        | 1/6 [00:00<00:00, 597.65 it/sec]
INFO - 16:16:07:     33%|███▎      | 2/6 [00:00<00:00, 989.11 it/sec]
INFO - 16:16:07:     50%|█████     | 3/6 [00:00<00:00, 1295.74 it/sec]
INFO - 16:16:07:     67%|██████▋   | 4/6 [00:00<00:00, 1556.47 it/sec]
INFO - 16:16:07:     83%|████████▎ | 5/6 [00:00<00:00, 1781.02 it/sec]
INFO - 16:16:07:    100%|██████████| 6/6 [00:00<00:00, 1911.28 it/sec]
INFO - 16:16:07: *** End Sampling execution ***
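
For readers without GEMSEO at hand, the sampled function can be reproduced with the standard library alone. The following is a minimal sketch, assuming that the 6-point full-factorial design on \([0,1]\) includes both bounds; the names `forrester` and `equispaced` are illustrative and not part of GEMSEO.

```python
from math import sin


def forrester(x: float) -> float:
    """Evaluate f(x) = (6x - 2)^2 sin(12x - 4)."""
    return (6 * x - 2) ** 2 * sin(12 * x - 4)


def equispaced(lower: float, upper: float, n: int) -> list[float]:
    """Return n equispaced points in [lower, upper], bounds included."""
    step = (upper - lower) / (n - 1)
    return [lower + i * step for i in range(n)]


# Assumed training grid: 0.0, 0.2, ..., 1.0.
training_inputs = equispaced(0.0, 1.0, 6)
training_outputs = [forrester(x) for x in training_inputs]
for x, y in zip(training_inputs, training_outputs):
    print(f"x = {x:.1f}, y = {y:+.4f}")
```

The output values span a wide range over these six points, which is part of what makes this small dataset a demanding test for a surrogate model.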

Basics#

Training#

Then, we train a GP regression model from these samples:

from gemseo.mlearning import create_regression_model

model = create_regression_model("GaussianProcessRegressor", training_dataset)
model.learn()

Prediction#

Once it is built, we can predict the output value of \(f\) at a new input point:

from numpy import array

input_value = {"x": array([0.65])}
output_value = model.predict(input_value)
output_value

{'y': array([2.20380214])}

but cannot predict its Jacobian value:

try:
    model.predict_jacobian(input_value)
except NotImplementedError:
    print("The derivatives are not available for GaussianProcessRegressor.")

The derivatives are not available for GaussianProcessRegressor.
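
When a derivative estimate is still needed, a generic workaround, which is not part of the original example, is to differentiate the prediction numerically. Below is a minimal sketch of a central finite difference applied to any scalar callable; the helper `central_difference` and its step size are assumptions, and the analytic function \(f\) stands in for the prediction of a trained model.

```python
from math import cos, sin
from typing import Callable


def central_difference(
    predict: Callable[[float], float], x: float, step: float = 1e-6
) -> float:
    """Approximate d predict / dx at x with a central finite difference."""
    return (predict(x + step) - predict(x - step)) / (2 * step)


def f(x: float) -> float:
    """Stand-in for a model prediction: f(x) = (6x - 2)^2 sin(12x - 4)."""
    return (6 * x - 2) ** 2 * sin(12 * x - 4)


def df(x: float) -> float:
    """Analytic derivative of f, used only to check the approximation."""
    u = 6 * x - 2
    return 12 * u * sin(12 * x - 4) + 12 * u**2 * cos(12 * x - 4)


approx = central_difference(f, 0.65)
print(f"finite difference: {approx:.6f}, analytic: {df(0.65):.6f}")
```

With the model trained above, `predict` might be a thin wrapper such as `lambda x: model.predict({"x": array([x])})["y"][0]`, at the cost of two model evaluations per derivative.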

Uncertainty#

GP models are often valued for their ability to quantify the uncertainty of their predictions. Indeed, a GP model is a random process fully characterized by its mean function and its covariance function. Given an input point \(x\), the prediction is the mean at \(x\) and the uncertainty is the standard deviation at \(x\):

standard_deviation = model.predict_std(input_value)
standard_deviation

array([[0.3140468]])
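
For reference, these two quantities have closed forms in the standard zero-mean GP setting. The notation below is introduced here for illustration and is not taken from the GEMSEO source: \(K\) is the kernel matrix of the training inputs \(x_1,\ldots,x_n\), \(k_*(x)\) is the vector of kernel values \(k(x,x_i)\), \(y\) is the vector of training outputs and \(\alpha\) is the diagonal term described in the Alpha section. The actual model relies on scikit-learn and may scale the data, so this is a sketch rather than the exact implementation.

\[
\mu(x) = k_*(x)^\top (K + \alpha I)^{-1} y,
\qquad
\sigma^2(x) = k(x, x) - k_*(x)^\top (K + \alpha I)^{-1} k_*(x).
\]

Under these assumptions, the predicted value corresponds to \(\mu(x)\) and the predicted standard deviation to \(\sigma(x)\).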

Plotting#

Plotting the prediction against the reference function shows that the GP model interpolates the training points but is very inaccurate elsewhere. This case-dependent problem is due to poor automatic tuning of the kernel length scales. The Settings section below shows how to correct this.

from matplotlib import pyplot as plt

# Evaluate the reference function and the GP model on a 100-point grid.
test_dataset = sample_disciplines(
    [discipline], input_space, "y", algo_name="PYDOE_FULLFACT", n_samples=100
)
input_data = test_dataset.get_view(variable_names=model.input_names).to_numpy()
reference_output_data = test_dataset.get_view(variable_names="y").to_numpy().ravel()
predicted_output_data = model.predict(input_data).ravel()
# Compare the reference and the prediction.
plt.plot(input_data.ravel(), reference_output_data, label="Reference")
plt.plot(input_data.ravel(), predicted_output_data, label="Regression - Basics")
plt.grid()
plt.legend()
plt.show()

INFO - 16:16:07: *** Start Sampling execution ***
INFO - 16:16:07: Sampling
INFO - 16:16:07:    Disciplines: f
INFO - 16:16:07:    MDO formulation: MDF
INFO - 16:16:07: Running the algorithm PYDOE_FULLFACT:
INFO - 16:16:07:      1%|          | 1/100 [00:00<00:00, 4140.48 it/sec]
INFO - 16:16:07:    100%|██████████| 100/100 [00:00<00:00, 4401.57 it/sec]
INFO - 16:16:07: *** End Sampling execution ***
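
The visual comparison can be complemented by a scalar error measure such as the root mean squared error (RMSE) over the test points. The sketch below is not part of the original example: it uses only the standard library, and a piecewise-linear interpolant of the six training points stands in for the GP prediction, so the printed value says nothing about the actual model.

```python
from math import sin, sqrt


def f(x: float) -> float:
    """Reference function f(x) = (6x - 2)^2 sin(12x - 4)."""
    return (6 * x - 2) ** 2 * sin(12 * x - 4)


def rmse(reference: list[float], predicted: list[float]) -> float:
    """Root mean squared error between two sequences of equal length."""
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Six equispaced training points and 100 equispaced test points on [0, 1].
x_train = [i / 5 for i in range(6)]
y_train = [f(x) for x in x_train]
x_test = [i / 99 for i in range(100)]


def linear_interpolant(x: float) -> float:
    """Stand-in predictor: linear interpolation of the training points."""
    i = min(int(x * 5), 4)  # index of the left training point
    weight = (x - x_train[i]) / (x_train[i + 1] - x_train[i])
    return (1 - weight) * y_train[i] + weight * y_train[i + 1]


reference = [f(x) for x in x_test]
predicted = [linear_interpolant(x) for x in x_test]
print(f"RMSE of the stand-in predictor: {rmse(reference, predicted):.3f}")
```

Replacing `predicted` with the flattened output of `model.predict(input_data)` would give the corresponding figure for the GP model.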

Settings#

The GaussianProcessRegressor has many options defined in the GaussianProcessRegressor_Settings Pydantic model. Here are the main ones.

Kernel#

The kernel option defines the covariance function of the Gaussian process regressor and must be passed as a scikit-learn kernel object. The default kernel is the Matérn 5/2 covariance function with input length scales belonging to the interval \([0.01,100]\), initialized at 1 and optimized by the L-BFGS-B algorithm. We can replace this kernel with the Matérn 5/2 kernel with input length scales fixed at 1:

from sklearn.gaussian_process.kernels import Matern

model = create_regression_model(
    "GaussianProcessRegressor",
    training_dataset,
    kernel=Matern(length_scale=1.0, length_scale_bounds="fixed", nu=2.5),
)
model.learn()
predicted_output_data_1 = model.predict(input_data).ravel()

or a squared exponential covariance kernel with input length scales fixed at 1:

from sklearn.gaussian_process.kernels import RBF

model = create_regression_model(
    "GaussianProcessRegressor",
    training_dataset,
    kernel=RBF(length_scale=1.0, length_scale_bounds="fixed"),
)
model.learn()
predicted_output_data_2 = model.predict(input_data).ravel()

These two models are much better than the previous one, notably the one with the Matérn 5/2 kernel, which suggests that the weakness of the initial model lies in the length scales found by numerical optimization:

plt.plot(input_data.ravel(), reference_output_data, label="Reference")
plt.plot(input_data.ravel(), predicted_output_data, label="Regression - Basics")
plt.plot(
    input_data.ravel(), predicted_output_data_1, label="Regression - Kernel(Matern 2.5)"
)
plt.plot(input_data.ravel(), predicted_output_data_2, label="Regression - Kernel(RBF)")
plt.grid()
plt.legend()
plt.show()
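
For reference, the two kernels compared above have simple closed forms in one dimension. With \(r=|x-x'|\) and length scale \(\ell\) (notation introduced here for illustration), the scikit-learn `Matern` kernel with `nu=2.5` and the `RBF` kernel read:

\[
k_{5/2}(x, x') = \left(1 + \frac{\sqrt{5}\,r}{\ell} + \frac{5 r^2}{3 \ell^2}\right) \exp\left(-\frac{\sqrt{5}\,r}{\ell}\right),
\qquad
k_{\mathrm{RBF}}(x, x') = \exp\left(-\frac{r^2}{2 \ell^2}\right).
\]

Both are equal to 1 at \(r=0\) and decay as \(r/\ell\) grows, so a small \(\ell\) makes distant points nearly uncorrelated while a large \(\ell\) ties them together.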

Bounds#

The bounds option defines the bounds of the input length scales, here \([0.1,100]\) instead of the default \([0.01,100]\):

model = create_regression_model(
    "GaussianProcessRegressor", training_dataset, bounds=(1e-1, 1e2)
)
model.learn()

Increasing the lower bound can facilitate the training, as in this example:

predicted_output_data_ = model.predict(input_data).ravel()
plt.plot(input_data.ravel(), reference_output_data, label="Reference")
plt.plot(input_data.ravel(), predicted_output_data, label="Regression - Basics")
plt.plot(input_data.ravel(), predicted_output_data_, label="Regression - Bounds")
plt.grid()
plt.legend()
plt.show()
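
To get a feel for why the lower bound matters, one can look at the prior correlation that the Matérn 5/2 kernel assigns to two neighbouring training points, which are 0.2 apart here. The sketch below is an illustration, not the GEMSEO implementation: the function name `matern52` is hypothetical, and the length scales are simply the two lower bounds (0.01 and 0.1), the initial value (1) and the upper bound (100) mentioned above.

```python
from math import exp, sqrt


def matern52(r: float, length_scale: float) -> float:
    """Matérn 5/2 correlation between two points at distance r."""
    s = sqrt(5.0) * r / length_scale
    return (1.0 + s + s**2 / 3.0) * exp(-s)


spacing = 0.2  # distance between two neighbouring training points
for length_scale in (0.01, 0.1, 1.0, 100.0):
    correlation = matern52(spacing, length_scale)
    print(f"length scale = {length_scale:6.2f} -> correlation = {correlation:.3e}")
```

With a length scale close to 0.01, the correlation between neighbours is numerically negligible, so the training points barely inform the prediction between them. This is one plausible reading of the behaviour of the first model; the example itself does not print the fitted length scale, so it remains an assumption.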

Alpha#

The alpha parameter (default: 1e-10), often called the nugget effect, is the value added to the diagonal of the training kernel matrix to avoid overfitting. When alpha is equal to zero, the GP model interpolates the training points, at which the standard deviation is then equal to zero. The larger alpha is, the less interpolating the GP model is. For example, we can increase the value to 0.1:

# Keep the previous prediction (default alpha) for the comparison below.
predicted_output_data_1 = predicted_output_data_
model = create_regression_model(
    "GaussianProcessRegressor", training_dataset, bounds=(1e-1, 1e2), alpha=0.1
)
model.learn()

and see that the model moves away from the training points:

predicted_output_data_2 = model.predict(input_data).ravel()
plt.plot(input_data.ravel(), reference_output_data, label="Reference")
plt.plot(input_data.ravel(), predicted_output_data_1, label="Regression - Alpha(1e-10)")
plt.plot(input_data.ravel(), predicted_output_data_2, label="Regression - Alpha(1e-1)")
plt.grid()
plt.legend()
plt.show()
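
The role of alpha can also be checked numerically on a hand-written GP. The following is a minimal sketch under stated assumptions: a zero-mean GP with a Matérn 5/2 kernel whose length scale is fixed at 1, no data scaling, and a small pure-Python linear solver. The helpers `solve`, `matern52` and `gp_predict` are hypothetical and do not reproduce the GEMSEO or scikit-learn implementation, so the numbers differ from those of the model above.

```python
from math import exp, sin, sqrt


def f(x):
    """Function to approximate: f(x) = (6x - 2)^2 sin(12x - 4)."""
    return (6 * x - 2) ** 2 * sin(12 * x - 4)


def matern52(r, length_scale=1.0):
    """Matérn 5/2 correlation between two points at distance r."""
    s = sqrt(5.0) * r / length_scale
    return (1.0 + s + s**2 / 3.0) * exp(-s)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n + 1):
                a[row][k] -= factor * a[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        known = sum(a[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (a[row][n] - known) / a[row][row]
    return x


def gp_predict(x_train, y_train, x_new, alpha):
    """Return the zero-mean GP posterior mean and standard deviation at x_new."""
    gram = [
        [matern52(abs(xi - xj)) + (alpha if i == j else 0.0)
         for j, xj in enumerate(x_train)]
        for i, xi in enumerate(x_train)
    ]
    k_new = [matern52(abs(x_new - xi)) for xi in x_train]
    weights = solve(gram, y_train)  # (K + alpha I)^-1 y
    scaled = solve(gram, k_new)  # (K + alpha I)^-1 k_*
    mean = sum(k * w for k, w in zip(k_new, weights))
    variance = matern52(0.0) - sum(k * c for k, c in zip(k_new, scaled))
    return mean, sqrt(max(variance, 0.0))  # clamp tiny negative round-off


x_train = [i / 5 for i in range(6)]
y_train = [f(x) for x in x_train]
results = {}
for alpha in (1e-10, 0.1):
    predictions = [gp_predict(x_train, y_train, x, alpha) for x in x_train]
    max_gap = max(abs(mean - y) for (mean, _), y in zip(predictions, y_train))
    max_std = max(std for _, std in predictions)
    results[alpha] = (max_gap, max_std)
    print(f"alpha = {alpha:g}: max |mean - y| = {max_gap:.2e}, max std = {max_std:.2e}")
```

At the training points, the gap between the posterior mean and the data, as well as the standard deviation, is close to zero for the tiny alpha and clearly larger for alpha equal to 0.1, which matches the behaviour described above.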
