
# moe module

## Mixture of Experts

The mixture of experts (MoE) regression model expresses the output as a weighted sum of local surrogate models, where the weights indicate the class of the input.

Inputs are assigned to clusters by a classification model. This classifier is trained on a training set whose labels are determined by a clustering algorithm. The outputs may be preprocessed through a sensor or a dimension reduction algorithm.
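The three steps described above (cluster, classify, fit one local model per cluster) can be sketched in plain Python. This is only an illustration under simplifying assumptions, not GEMSEO's implementation: inputs and outputs are scalars, the clustering is a small hand-written k-means applied to the inputs alone, the classifier is a nearest-centroid rule, and each local model is a least-squares line. The names `kmeans_1d`, `fit_line`, `train_moe` and `predict` are hypothetical.

```python
from statistics import mean


def kmeans_1d(values, n_clusters, n_iter=20):
    """Cluster scalar values (n_clusters >= 2); return labels and centroids."""
    lo, hi = min(values), max(values)
    # Spread the initial centroids evenly over the data range.
    centroids = [lo + (hi - lo) * k / (n_clusters - 1) for k in range(n_clusters)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assign each value to its nearest centroid.
        labels = [
            min(range(n_clusters), key=lambda k: abs(v - centroids[k]))
            for v in values
        ]
        # Move each centroid to the mean of its members.
        for k in range(n_clusters):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centroids[k] = mean(members)
    return labels, centroids


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b."""
    x_bar, y_bar = mean(xs), mean(ys)
    var = sum((x - x_bar) ** 2 for x in xs)
    a = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / var
    return a, y_bar - a * x_bar


def train_moe(xs, ys, n_clusters=2):
    # Step 1: clustering gives one label per training sample.
    labels, centroids = kmeans_1d(xs, n_clusters)

    # Step 2: the classifier maps a new input to a label
    # (here a nearest-centroid rule, an assumption of this sketch).
    def classify(x):
        return min(range(n_clusters), key=lambda k: abs(x - centroids[k]))

    # Step 3: one local regression model per cluster.
    experts = []
    for k in range(n_clusters):
        xk = [x for x, lab in zip(xs, labels) if lab == k]
        yk = [y for y, lab in zip(ys, labels) if lab == k]
        experts.append(fit_line(xk, yk))
    return classify, experts


def predict(x, classify, experts):
    """Hard prediction: evaluate only the expert of the predicted class."""
    a, b = experts[classify(x)]
    return a * x + b


# Toy data with two regimes: y = 2x on [0, 1] and y = 10 - x on [3, 4].
xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 3.0, 3.2, 3.4, 3.6, 3.8, 4.0]
ys = [2 * x if x <= 1.0 else 10 - x for x in xs]

classify, experts = train_moe(xs, ys)
print(predict(0.5, classify, experts), predict(3.5, classify, experts))
```

A single global line would fit neither regime well, whereas each local model only has to describe the samples of its own cluster.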

The classification may be either hard, in which case exactly one of the weights is equal to one and the others are equal to zero:

$y = \sum_{k=1}^K i_{C_k}(x) f_k(x),$

or soft, in which case the weights express the probabilities of belonging to each class:

$y = \sum_{k=1}^K \mathbb{P}(x\in C_k) f_k(x),$

where $$x$$ is the input, $$y$$ is the output, $$K$$ is the number of classes, $$(C_k)_{k=1,\cdots,K}$$ are the input spaces associated with the classes, $$i_{C_k}(x)$$ is the indicator of class $$k$$, $$\mathbb{P}(x\in C_k)$$ is the probability of class $$k$$ given $$x$$, and $$f_k(x)$$ is the local surrogate model on class $$k$$.
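The two formulas differ only in how the weights are computed. The sketch below evaluates both for a scalar input. It is an illustration under stated assumptions: the class probabilities come from a softmax over negative squared distances to hypothetical class centres, whereas in the model described here they would come from the trained classifier. The names `hard_weights`, `soft_weights` and `mixture` are hypothetical.

```python
import math


def hard_weights(x, centers):
    """Indicator weights: 1 for the nearest class centre, 0 elsewhere."""
    nearest = min(range(len(centers)), key=lambda k: abs(x - centers[k]))
    return [1.0 if k == nearest else 0.0 for k in range(len(centers))]


def soft_weights(x, centers, scale=1.0):
    """Class probabilities from a softmax over negative squared distances.

    This probability model is an assumption made for the example only.
    """
    scores = [-(((x - c) / scale) ** 2) for c in centers]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture(x, weights, experts):
    """Evaluate y = sum_k w_k(x) * f_k(x)."""
    return sum(w * f(x) for w, f in zip(weights, experts))


centers = [0.0, 2.0]                               # hypothetical class centres
experts = [lambda x: 2.0 * x, lambda x: 10.0 - x]  # local models f_1, f_2

x = 0.5
y_hard = mixture(x, hard_weights(x, centers), experts)  # only f_1 contributes
y_soft = mixture(x, soft_weights(x, centers), experts)  # both contribute
print(y_hard, y_soft)
```

With hard weights the prediction jumps from one local model to the other at the class boundary. With soft weights it blends the two models, so the transition between classes is gradual.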

This concept is implemented through the MixtureOfExperts class which inherits from the MLRegressionAlgo class.

class gemseo.mlearning.regression.moe.MixtureOfExperts(data, transformer=None, input_names=None, output_names=None, hard=True)

Mixture of experts regression.

Constructor.

Parameters
• data (Dataset) – learning dataset.

• transformer (dict(str)) – transformation strategy for data groups. If None, do not transform data. Default: None.

• input_names (list(str)) – names of the input variables.

• output_names (list(str)) – names of the output variables.

• hard (bool) – whether the clustering/classification is hard (True) or soft (False). Default: True.

ABBR = 'MoE'

class DataFormatters

Machine learning regression model decorators.

classmethod format_predict_class_dict(predict)

If input_data is passed as a dictionary, convert it to an ndarray and convert output_data to a dictionary. Otherwise, do nothing.

Parameters

predict – Method whose input_data and output_data are to be formatted.

LABELS = 'labels'

property labels

Cluster labels.

load_algo(directory)

Parameters

directory (str) – algorithm directory.

property n_clusters

Number of clusters.

predict_class(input_data, *args, **kwargs)

Wrapper function.

predict_local_model(input_data, *args, **kwargs)

set_classifier(classif_algo, **classif_params)

Set classification algorithm.

Parameters
• classif_algo (str) – classifier.

• classif_params – optional arguments for classification. If none are given, the default arguments are used.

set_clusterer(cluster_algo, **cluster_params)

Set clustering algorithm.

Parameters
• cluster_algo (str) – clusterer.

• cluster_params – optional arguments for clustering. If none are given, the default arguments are used.

set_regressor(regress_algo, **regress_params)

Set regression algorithm.

Parameters
• regress_algo (str) – regressor.

• regress_params – optional arguments for regression. If none are given, the default arguments are used.
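The three setters follow the same pattern: an algorithm name followed by keyword arguments forwarded to that algorithm. The helper below sketches how they might be called together. The algorithm names (`"KMeans"`, `"KNNClassifier"`, `"LinearRegression"`) and the keyword arguments (`n_clusters`, `n_neighbors`) are assumptions, not values confirmed by this page. `configure_moe` and `RecordingModel` are hypothetical; the latter is a stand-in that lets the sketch run without GEMSEO installed.

```python
def configure_moe(model, n_clusters=3):
    """Select the three sub-algorithms of a mixture-of-experts model.

    ``model`` is any object exposing set_clusterer, set_classifier and
    set_regressor with the signatures documented above. The algorithm
    names and keyword arguments used here are assumptions.
    """
    model.set_clusterer("KMeans", n_clusters=n_clusters)
    model.set_classifier("KNNClassifier", n_neighbors=5)
    model.set_regressor("LinearRegression")
    return model


class RecordingModel:
    """Stand-in that records setter calls instead of configuring anything."""

    def __init__(self):
        self.calls = []

    def set_clusterer(self, cluster_algo, **cluster_params):
        self.calls.append(("clusterer", cluster_algo, cluster_params))

    def set_classifier(self, classif_algo, **classif_params):
        self.calls.append(("classifier", classif_algo, classif_params))

    def set_regressor(self, regress_algo, **regress_params):
        self.calls.append(("regressor", regress_algo, regress_params))


stub = configure_moe(RecordingModel(), n_clusters=2)
for role, name, params in stub.calls:
    print(role, name, params)
```

With GEMSEO available, an instance such as `MixtureOfExperts(dataset, hard=False)` would presumably be passed in place of the stand-in, where `dataset` is an existing learning dataset. Training and prediction rely on methods inherited from MLRegressionAlgo, which this section does not document.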