Advanced mixture of experts¶
from __future__ import absolute_import, division, print_function, unicode_literals
from gemseo.api import load_dataset
from gemseo.mlearning.api import create_regression_model
from gemseo.mlearning.qual_measure.f1_measure import F1Measure
from gemseo.mlearning.qual_measure.mse_measure import MSEMeasure
from gemseo.mlearning.qual_measure.silhouette import SilhouetteMeasure
In this example,
we seek to estimate the Rosenbrock function from the RosenbrockDataset.
dataset = load_dataset("RosenbrockDataset", opt_naming=False)
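For reference, the function being approximated can be written down directly. The sketch below evaluates the two-dimensional Rosenbrock function on a regular grid of 100 points, the same number of learning samples as reported in the output further down. The grid bounds and layout are assumptions made for illustration and may differ from what RosenbrockDataset actually contains; the `grid` helper is hypothetical.

```python
# Two-dimensional Rosenbrock function: f(x, y) = (1 - x)^2 + 100 * (y - x^2)^2.
def rosenbrock(x, y):
    return (1.0 - x) ** 2 + 100.0 * (y - x**2) ** 2


def grid(lower, upper, n_points):
    """Return n_points evenly spaced values between lower and upper (inclusive)."""
    step = (upper - lower) / (n_points - 1)
    return [lower + i * step for i in range(n_points)]


# Assumption: a 10 x 10 grid over [-2, 2]^2; the bounds are illustrative only.
axis = grid(-2.0, 2.0, 10)
samples = [((x, y), rosenbrock(x, y)) for x in axis for y in axis]
```

The function has its global minimum, 0, at (1, 1), and varies over several orders of magnitude on this domain, which is why a single global surrogate can struggle and local models may help.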
For that purpose,
we will use a MixtureOfExperts
in an advanced way:
we will not set the clustering, classification and regression algorithms
but select them according to their performance
from several candidates that we will provide.
Moreover,
for a given candidate,
we will propose several settings,
compare their performances
and select the best one.
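The selection principle described above can be sketched independently of any library: expand each candidate's settings into every combination, score each combination with a quality measure, and keep the best. All names here (`expand_settings`, `select_best`, `toy_score`) are hypothetical and do not reflect the internal implementation of MixtureOfExperts.

```python
from itertools import product


def expand_settings(grid):
    """Turn {"k": [2, 3]} into [{"k": 2}, {"k": 3}]; an empty grid gives [{}]."""
    names = sorted(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


def select_best(candidates, evaluate, smaller_is_better=True):
    """Return (name, settings, score) of the best candidate over all its settings.

    candidates: list of (name, grid) pairs; evaluate(name, settings) -> float.
    """
    best = None
    for name, grid in candidates:
        for settings in expand_settings(grid):
            score = evaluate(name, settings)
            if best is None:
                better = True
            elif smaller_is_better:
                better = score < best[2]
            else:
                better = score > best[2]
            if better:
                best = (name, settings, score)
    return best


# Hypothetical scoring function standing in for a real quality measure.
def toy_score(name, settings):
    offset = 0.0 if name == "A" else 0.5
    return abs(settings["k"] - 3) + offset


candidates = [("A", {"k": [2, 3, 4]}), ("B", {"k": [3, 5]})]
best = select_best(candidates, toy_score)
```

The `smaller_is_better` flag matters because some measures are errors to be minimized while others are scores to be maximized.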
Initialization¶
First,
we initialize a MixtureOfExperts
with soft classification
by means of the machine learning API function
create_regression_model().
model = create_regression_model("MixtureOfExperts", dataset, hard=False)
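To illustrate what soft classification usually means in a mixture of experts, the sketch below contrasts the two prediction modes at a single input point. It assumes that `hard=False` corresponds to weighting the local models by the class probabilities, which is the common definition; the function names and numbers are hypothetical.

```python
def predict_hard(probabilities, local_predictions):
    """Use only the local model of the most probable cluster."""
    best_cluster = max(range(len(probabilities)), key=probabilities.__getitem__)
    return local_predictions[best_cluster]


def predict_soft(probabilities, local_predictions):
    """Weight every local model by the probability of its cluster."""
    return sum(p * y for p, y in zip(probabilities, local_predictions))


# Hypothetical class probabilities and local model outputs at one input point.
probabilities = [0.5, 0.25, 0.25]
local_predictions = [10.0, 20.0, 40.0]

hard_value = predict_hard(probabilities, local_predictions)
soft_value = predict_soft(probabilities, local_predictions)
```

A soft prediction varies smoothly across cluster boundaries, whereas a hard prediction can jump when the most probable cluster changes.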
Clustering¶
Then,
we add two clustering algorithms
with different numbers of clusters (called components for the Gaussian Mixture)
and set the SilhouetteMeasure
as clustering measure
to be evaluated from the learning set.
During the learning stage,
the mixture of experts will select the clustering algorithm
and the number of clusters
giving the best value of this measure
(for the silhouette coefficient, higher is better).
model.set_clustering_measure(SilhouetteMeasure)
model.add_clusterer_candidate("KMeans", n_clusters=[2, 3, 4])
model.add_clusterer_candidate("GaussianMixture", n_components=[3, 4, 5])
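As a reminder of what the silhouette coefficient measures, here is a minimal pure-Python sketch for one-dimensional points. For each point it compares the mean distance to its own cluster (a) with the mean distance to the nearest other cluster (b) and returns the average of (b - a) / max(a, b). This is the textbook definition, not the library's implementation, and the `silhouette` helper is hypothetical.

```python
def silhouette(points, labels):
    """Mean silhouette coefficient for 1D points with integer cluster labels."""
    scores = []
    for i, (x_i, label_i) in enumerate(zip(points, labels)):
        # Distances from point i to every other point, grouped by cluster.
        by_cluster = {}
        for j, (x_j, label_j) in enumerate(zip(points, labels)):
            if i != j:
                by_cluster.setdefault(label_j, []).append(abs(x_i - x_j))
        own = by_cluster.pop(label_i, None)
        if own is None or not by_cluster:
            scores.append(0.0)  # convention: singleton cluster or single cluster
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


points = [0.0, 1.0, 10.0, 11.0]
good = silhouette(points, [0, 0, 1, 1])  # clusters match the two visible groups
bad = silhouette(points, [0, 1, 0, 1])   # clusters cut across the groups
```

Values close to 1 indicate compact, well-separated clusters; negative values indicate points that sit closer to another cluster than to their own.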
Classification¶
We also add classification algorithms
with different settings
and set the F1Measure
as classification measure
to be evaluated from the learning set.
During the learning stage,
the mixture of experts will select the classification algorithm and the settings
giving the best value of this measure
(for the F1 score, higher is better).
model.set_classification_measure(F1Measure)
model.add_classifier_candidate("KNNClassifier", n_neighbors=[3, 4, 5])
model.add_classifier_candidate("RandomForestClassifier", n_estimators=[100])
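The F1 score of a class is the harmonic mean of precision and recall. The sketch below computes it for one class from expected and predicted labels; how the library averages the score over several classes is not shown here and is left as an open assumption. The `f1_score` helper and the label lists are hypothetical.

```python
def f1_score(expected, predicted, positive):
    """F1 score of one class: harmonic mean of precision and recall."""
    pairs = list(zip(expected, predicted))
    true_pos = sum(1 for e, p in pairs if e == positive and p == positive)
    false_pos = sum(1 for e, p in pairs if e != positive and p == positive)
    false_neg = sum(1 for e, p in pairs if e == positive and p != positive)
    if true_pos == 0:
        return 0.0  # avoids dividing by zero when the class is never hit
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


expected = [0, 0, 1, 1, 1, 2]   # cluster labels, e.g. from the clustering stage
predicted = [0, 1, 1, 1, 2, 2]  # labels returned by a hypothetical classifier
score = f1_score(expected, predicted, positive=1)
```

In a mixture of experts, the classifier is trained to recover the cluster labels from the inputs, so a high F1 score means new points are routed to the right local model.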
Regression¶
We also add regression algorithms
and set the MSEMeasure
as regression measure
to be evaluated from the learning set.
During the learning stage, for each cluster,
the mixture of experts will select the regression algorithm minimizing this measure.
model.set_regression_measure(MSEMeasure)
model.add_regressor_candidate("LinearRegression")
model.add_regressor_candidate("RBFRegression")
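The per-cluster selection can be sketched as follows: for each cluster, compute the mean squared error of every candidate on that cluster's samples and keep the candidate with the lowest error. The candidates below are fixed, hypothetical functions rather than trained models, and `select_per_cluster` is an illustrative helper, not part of the library.

```python
def mse(expected, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((e - p) ** 2 for e, p in zip(expected, predicted)) / len(expected)


def select_per_cluster(inputs, outputs, labels, candidates):
    """Map each cluster label to the name of the candidate with the lowest MSE.

    candidates: dict name -> function mapping an input to a predicted output.
    """
    selection = {}
    for label in sorted(set(labels)):
        xs = [x for x, lab in zip(inputs, labels) if lab == label]
        ys = [y for y, lab in zip(outputs, labels) if lab == label]
        errors = {
            name: mse(ys, [model(x) for x in xs])
            for name, model in candidates.items()
        }
        selection[label] = min(errors, key=errors.get)
    return selection


# Hypothetical, already-fitted candidates; a real workflow would train them per cluster.
candidates = {"linear": lambda x: 2.0 * x, "quadratic": lambda x: x * x}
inputs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
outputs = [0.0, 2.0, 4.0, 9.0, 16.0, 25.0]
labels = [0, 0, 0, 1, 1, 1]
selection = select_per_cluster(inputs, outputs, labels, candidates)
```

Here the first cluster is best described by the linear candidate and the second by the quadratic one, which shows how different clusters can end up with different regression models.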
Note
We could also add candidates for some learning stages, e.g. clustering and regression, and set the machine learning algorithms for the remaining ones, e.g. classification.
Training¶
Lastly, we learn from the data and select the best machine learning algorithm for each of the clustering, classification and regression steps.
model.learn()
Result¶
We can get information on this model,
on the sub-machine learning models selected among the candidates
and on their selected settings.
We can see that
a KMeans
with four clusters has been selected for the clustering stage,
as well as a RandomForestClassifier
for the classification stage
and an RBFRegression
for each cluster.
print(model)
Out:
MixtureOfExperts(hard=False)
built from 100 learning samples
Clustering
KMeans(n_clusters=4, random_state=0, var_names=None)
Classification
RandomForestClassifier(n_estimators=100)
Regression
Local model 0
RBFRegression(epsilon=None, function='multiquadric')
Local model 1
RBFRegression(epsilon=None, function='multiquadric')
Local model 2
RBFRegression(epsilon=None, function='multiquadric')
Local model 3
RBFRegression(epsilon=None, function='multiquadric')
Note
By adding candidates,
and depending on the complexity of the function to be approximated,
one could obtain different regression models according to the clusters.
For example,
one could use a PolynomialRegression
with order 2
on a sub-part of the input space
and a GaussianProcessRegression
on another sub-part of the input space.
Once built,
this mixture of experts can be used as any MLRegressionAlgo.
See also
Another example
proposes a standard use of MixtureOfExperts.