# Advanced mixture of experts

```
from __future__ import annotations
from gemseo import create_benchmark_dataset
from gemseo.mlearning import create_regression_model
from gemseo.mlearning.quality_measures.f1_measure import F1Measure
from gemseo.mlearning.quality_measures.mse_measure import MSEMeasure
from gemseo.mlearning.quality_measures.silhouette_measure import SilhouetteMeasure
```

In this example,
we seek to estimate the Rosenbrock function from the `RosenbrockDataset`.

```
dataset = create_benchmark_dataset("RosenbrockDataset", opt_naming=False)
```
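For reference, the function being approximated can be written in a few lines. This is a minimal sketch that assumes the usual two-dimensional form of the Rosenbrock function with coefficients 1 and 100; the grid and the domain `[-2, 2]^2` are illustrative assumptions and are not read from the `RosenbrockDataset` itself.

```python
def rosenbrock(x: float, y: float) -> float:
    """Two-dimensional Rosenbrock function with the usual coefficients 1 and 100."""
    return (1.0 - x) ** 2 + 100.0 * (y - x**2) ** 2


# Evaluate the function on a small 3 x 3 grid (illustrative domain, an assumption).
grid = [-2.0, 0.0, 2.0]
samples = [((x, y), rosenbrock(x, y)) for x in grid for y in grid]
for point, value in samples:
    print(point, value)
```

The global minimum is at `(1, 1)`, where the function is zero; the narrow curved valley around it is what makes the function a common test case for surrogate models.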

For that purpose,
we will use a `MOERegressor` in an advanced way:
we will not set the clustering, classification and regression algorithms
but select them according to their performance
from several candidates that we will provide.
Moreover,
for a given candidate,
we will propose several settings,
compare their performances
and select the best one.

## Initialization

First,
we initialize a `MOERegressor` with soft classification
by means of the high-level machine learning function `create_regression_model()`.

```
model = create_regression_model("MOERegressor", dataset, hard=False)
```
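To illustrate what soft classification means for a mixture of experts, here is a minimal sketch in plain Python. It assumes the common formulation in which a hard mixture uses only the local model of the most probable cluster, while a soft mixture weights every local model by its cluster membership probability. The functions `predict_hard` and `predict_soft` and the toy experts are hypothetical and are not part of the GEMSEO API.

```python
from typing import Callable, Sequence

Expert = Callable[[float], float]


def predict_hard(x: float, probabilities: Sequence[float], experts: Sequence[Expert]) -> float:
    """Use only the expert of the most probable cluster."""
    best = max(range(len(experts)), key=lambda k: probabilities[k])
    return experts[best](x)


def predict_soft(x: float, probabilities: Sequence[float], experts: Sequence[Expert]) -> float:
    """Weight each expert's prediction by its cluster membership probability."""
    return sum(p * expert(x) for p, expert in zip(probabilities, experts))


# Two toy local models and illustrative membership probabilities for x = 2.
experts = [lambda x: 2.0 * x, lambda x: 10.0 - x]
probabilities = [0.75, 0.25]

print(predict_hard(2.0, probabilities, experts))
print(predict_soft(2.0, probabilities, experts))
```

The soft variant blends the local models near cluster boundaries, which tends to give a smoother global prediction than switching abruptly from one expert to another.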

## Clustering

Then,
we add two clustering algorithms
with different numbers of clusters (called *components* for the Gaussian mixture)
and set the `SilhouetteMeasure` as clustering measure
to be evaluated on the learning set.
During the learning stage,
the mixture of experts will select the clustering algorithm
and the number of clusters
maximizing this measure,
since a higher silhouette coefficient indicates better-separated clusters.

```
model.set_clustering_measure(SilhouetteMeasure)
model.add_clusterer_candidate("KMeans", n_clusters=[2, 3, 4])
model.add_clusterer_candidate("GaussianMixture", n_components=[3, 4, 5])
```
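The selection principle can be sketched without any library. The code below computes the mean silhouette coefficient of two hypothetical labellings of the same one-dimensional points and keeps the one with the highest value. It is a simplified illustration of the criterion, not the implementation used by `SilhouetteMeasure`; the data and candidate names are made up.

```python
from statistics import mean


def silhouette(points, labels):
    """Return the mean silhouette coefficient of 1D points for a labelling."""
    pairs = list(zip(points, labels))
    scores = []
    for i, (p, label) in enumerate(pairs):
        # a: mean distance to the other points of the same cluster.
        same = [abs(p - q) for j, (q, lab) in enumerate(pairs) if lab == label and j != i]
        if not same:
            scores.append(0.0)  # usual convention for a singleton cluster
            continue
        a = mean(same)
        # b: smallest mean distance to the points of another cluster.
        b = min(
            mean(abs(p - q) for q, lab in pairs if lab == other)
            for other in set(labels)
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return mean(scores)


# Three well-separated groups of points and two candidate labellings.
points = [0.0, 1.0, 10.0, 11.0, 20.0, 21.0]
candidates = {
    "2 clusters": [0, 0, 1, 1, 1, 1],
    "3 clusters": [0, 0, 1, 1, 2, 2],
}
scores = {name: silhouette(points, labels) for name, labels in candidates.items()}
best = max(scores, key=scores.get)  # the silhouette is to be maximized
print(scores, best)
```

The coefficient lies between -1 and 1, so candidates with different numbers of clusters can be compared on the same scale.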

## Classification

We also add classification algorithms
with different settings
and set the `F1Measure` as classification measure
to be evaluated on the learning set.
During the learning stage,
the mixture of experts will select the classification algorithm and the settings
maximizing this measure,
since a higher F1 score indicates a better classifier.

```
model.set_classification_measure(F1Measure)
model.add_classifier_candidate("KNNClassifier", n_neighbors=[3, 4, 5])
model.add_classifier_candidate("RandomForestClassifier", n_estimators=[100])
```
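As an illustration of the criterion, the following sketch computes a binary F1 score for two hypothetical classifiers and keeps the better one. It assumes the standard definition `F1 = 2 TP / (2 TP + FP + FN)`; the way `F1Measure` handles more than two classes is not covered here, and the labels and candidate names are made up.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 score: harmonic mean of precision and recall."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


y_true = [1, 1, 1, 0, 0, 0]
predictions = {
    "candidate A": [1, 1, 0, 0, 0, 0],  # misses one positive sample
    "candidate B": [1, 1, 1, 1, 1, 0],  # raises two false alarms
}
scores = {name: f1_score(y_true, y_pred) for name, y_pred in predictions.items()}
best = max(scores, key=scores.get)  # the F1 score is to be maximized
print(scores, best)
```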

## Regression

We also add regression algorithms
and set the `MSEMeasure` as regression measure
to be evaluated on the learning set.
During the learning stage, for each cluster,
the mixture of experts will select the regression algorithm minimizing this measure.

```
model.set_regression_measure(MSEMeasure)
model.add_regressor_candidate("LinearRegressor")
model.add_regressor_candidate("RBFRegressor")
```
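The per-cluster selection can be sketched as follows. For each cluster, every candidate is fitted on the samples of that cluster and the one with the smallest mean squared error is kept. The two candidate fitting functions, `fit_linear` and `fit_quadratic`, and the data are hypothetical stand-ins chosen so that each cluster prefers a different model; they are not the GEMSEO regressors.

```python
from statistics import mean


def fit_linear(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def fit_quadratic(xs, ys):
    """Least-squares fit of y = coeff * x**2 (no linear or constant term)."""
    coeff = sum(x**2 * y for x, y in zip(xs, ys)) / sum(x**4 for x in xs)
    return lambda x: coeff * x**2


def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given samples."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


candidates = {"linear": fit_linear, "quadratic": fit_quadratic}
clusters = {
    0: ([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]),     # y = 2x + 1
    1: ([1.0, 2.0, 3.0, 4.0], [3.0, 12.0, 27.0, 48.0]),  # y = 3x^2
}

selected = {}
for cluster, (xs, ys) in clusters.items():
    errors = {name: mse(fit(xs, ys), xs, ys) for name, fit in candidates.items()}
    selected[cluster] = min(errors, key=errors.get)  # the MSE is to be minimized
print(selected)
```

Note that an error computed on the learning set favours flexible models; this sketch only illustrates the selection loop, not a recommended validation strategy.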

Note

We could also add candidates for some learning stages, e.g. clustering and regression, and set the machine learning algorithms for the remaining ones, e.g. classification.

## Training

Lastly, we learn the data and select the best machine learning algorithm for each of the clustering, classification and regression steps.

```
model.learn()
```

```
/home/docs/checkouts/readthedocs.org/user_builds/gemseo/envs/5.0.1/lib/python3.9/site-packages/sklearn/cluster/_kmeans.py:870: FutureWarning: The default value of `n_init` will change from 10 to 'auto' in 1.4. Set the value of `n_init` explicitly to suppress the warning
warnings.warn(
```
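The `FutureWarning` shown above is raised by scikit-learn and does not affect the result. If the message is unwanted, one option is to filter it locally with the standard `warnings` module. The sketch below uses a hypothetical `noisy_fit` function as a stand-in for the training call, so that it runs on its own; whether `n_init` can instead be forwarded to the clusterer through the candidate settings is not shown in this example and is left as an assumption to check.

```python
import warnings


def noisy_fit():
    """Hypothetical stand-in for a training call that emits a FutureWarning."""
    warnings.warn("The default value of `n_init` will change", FutureWarning)
    return "fitted"


def fit_quietly(fit):
    """Run a training call while ignoring FutureWarning inside this block only."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=FutureWarning)
        return fit()


print(fit_quietly(noisy_fit))
```

Because the filter is set inside `catch_warnings()`, the previous warning configuration is restored when the block exits.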

## Result

We can get information on this model,
on the sub-machine learning models selected among the candidates
and on their selected settings.
We can see that
a `KMeans` with four clusters has been selected for the clustering stage,
as well as a `RandomForestClassifier` for the classification stage
and a `RBFRegressor` for each cluster.

```
print(model)
```

```
MOERegressor(hard=False)
built from 100 learning samples
Clustering
KMeans(n_clusters=4, random_state=0, var_names=None)
Classification
RandomForestClassifier(n_estimators=100)
Regression
Local model 0
RBFRegressor(epsilon=None, function=multiquadric, norm=euclidean, smooth=0.0)
Local model 1
RBFRegressor(epsilon=None, function=multiquadric, norm=euclidean, smooth=0.0)
Local model 2
RBFRegressor(epsilon=None, function=multiquadric, norm=euclidean, smooth=0.0)
Local model 3
RBFRegressor(epsilon=None, function=multiquadric, norm=euclidean, smooth=0.0)
```

Note

By adding candidates,
and depending on the complexity of the function to be approximated,
one could obtain different regression models according to the clusters.
For example,
one could use a `PolynomialRegressor` with order 2
on a sub-part of the input space
and a `GaussianProcessRegressor` on another sub-part of the input space.

Once built,
this mixture of experts can be used as any `MLRegressionAlgo`.

See also

Another example
proposes a standard use of `MOERegressor`.

**Total running time of the script:** (0 minutes 0.590 seconds)