cluster module
Clustering algorithm
The cluster module implements the concept of clustering models, a kind of unsupervised machine learning algorithm where the goal is to group data into clusters. Wherever possible, these methods should be able to predict the class of new data, as well as the probability of belonging to each class.
This concept is implemented through the MLClusteringAlgo class, which inherits from the MLUnsupervisedAlgo class.
- class gemseo.mlearning.cluster.cluster.MLClusteringAlgo(data, transformer=None, var_names=None, **parameters)
Bases: gemseo.mlearning.core.unsupervised.MLUnsupervisedAlgo
Clustering algorithm.
Inheriting classes should override the MLUnsupervisedAlgo._fit() method and, if possible, the MLClusteringAlgo._predict() and MLClusteringAlgo._predict_proba() methods.
Constructor.
- Parameters
data (Dataset) – learning dataset.
transformer (dict(str)) – transformation strategy for data groups. If None, do not scale data. Default: None.
var_names (list(str)) – names of the variables to consider.
parameters – algorithm parameters.
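The override pattern described for inheriting classes can be sketched as follows. This is a minimal, self-contained illustration that does not import GEMSEO: ToyClusteringAlgo is a hypothetical stand-in for the base class, ThresholdClustering is an invented toy algorithm, and the threshold parameter is an assumption made for the example. Only the idea that public methods delegate to the _fit() and _predict() hooks is taken from the text above.

```python
import numpy as np


class ToyClusteringAlgo:
    """Hypothetical stand-in for a clustering base class (not the GEMSEO API)."""

    def __init__(self, data, **parameters):
        self.data = np.asarray(data, dtype=float)  # learning samples, one per row
        self.parameters = parameters  # algorithm parameters
        self.labels = None  # to be assigned by _fit()

    def learn(self):
        # The public method delegates the actual work to the _fit() hook.
        self._fit(self.data)

    def predict(self, data):
        # The public method delegates to the _predict() hook.
        return self._predict(np.asarray(data, dtype=float))

    def _fit(self, data):
        raise NotImplementedError

    def _predict(self, data):
        raise NotImplementedError


class ThresholdClustering(ToyClusteringAlgo):
    """Toy algorithm: compare the first feature of each sample to a threshold."""

    def _fit(self, data):
        # Use the "threshold" parameter if given, else the mean of the first feature.
        self.threshold = self.parameters.get("threshold", data[:, 0].mean())
        self.labels = (data[:, 0] > self.threshold).astype(int)

    def _predict(self, data):
        return (data[:, 0] > self.threshold).astype(int)


algo = ThresholdClustering([[0.0], [1.0], [10.0], [11.0]])
algo.learn()
new_labels = algo.predict([[0.5], [9.0]])
```

In this sketch the subclass only supplies the two hooks; the constructor and the public learn() and predict() methods are inherited unchanged.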
- learn(samples=None)
Override the learn function to ensure that labels are defined, then identify the number of clusters.
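The check performed by learn can be illustrated with a small sketch. It assumes that the fitting step stores one integer label per learning sample and that the number of clusters is the number of distinct labels; count_clusters is a hypothetical helper written for this example, not a GEMSEO function.

```python
import numpy as np


def count_clusters(labels):
    """Return the number of distinct labels, failing if none were assigned."""
    if labels is None:
        # Assumed behavior: a fit that leaves labels undefined is an error.
        raise ValueError("The fitting step did not assign any labels.")
    return int(np.unique(labels).shape[0])


n_clusters = count_clusters(np.array([0, 2, 1, 0, 2]))
```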
- predict(data)
Predict the cluster of data.
- Parameters
data (dict(ndarray) or ndarray) – data (1D or 2D).
- Returns
clusters of data (“0D” or 1D).
- Return type
int or ndarray(int)
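The relation between input and output dimensions can be sketched as below. The example assumes a nearest-center assignment rule and hard-coded centers, both invented for illustration; predict_cluster is a hypothetical function and the dict(ndarray) input form is not covered.

```python
import numpy as np

# Hypothetical cluster centers, one per row (two clusters, two features).
CENTERS = np.array([[0.0, 0.0], [10.0, 10.0]])


def predict_cluster(data):
    """Assign each sample to its nearest center.

    A 1D input is treated as a single sample and yields an int;
    a 2D input is treated as one sample per row and yields a 1D array.
    """
    data = np.asarray(data, dtype=float)
    single = data.ndim == 1
    samples = np.atleast_2d(data)
    # Pairwise Euclidean distances, shape (n_samples, n_clusters).
    distances = np.linalg.norm(samples[:, None, :] - CENTERS[None, :, :], axis=2)
    clusters = distances.argmin(axis=1)
    return int(clusters[0]) if single else clusters


one = predict_cluster([1.0, 2.0])
many = predict_cluster([[1.0, 2.0], [9.0, 8.0]])
```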
- predict_proba(data, hard=True)
Predict the probability of belonging to each cluster.
- Parameters
data (dict(ndarray) or ndarray) – data (1D or 2D).
hard (bool) – whether to use hard clustering (True) or soft clustering (False). Default: True.
- Returns
probabilities of belonging to each cluster (1D or 2D, same as data).
- Return type
ndarray
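The difference between hard and soft clustering can be sketched as follows. In the hard case each sample gets probability 1 for its cluster and 0 for the others. The soft rule used here (a softmax of negative distances to hard-coded centers) is only an assumption chosen for illustration, since the actual rule depends on the algorithm; predict_proba_sketch and CENTERS are hypothetical names.

```python
import numpy as np

# Hypothetical cluster centers, one per row (two clusters, one feature).
CENTERS = np.array([[0.0], [4.0]])


def predict_proba_sketch(data, hard=True):
    """Return cluster membership probabilities (1D for 1D data, 2D for 2D data)."""
    data = np.asarray(data, dtype=float)
    single = data.ndim == 1
    samples = np.atleast_2d(data)
    # Pairwise Euclidean distances, shape (n_samples, n_clusters).
    distances = np.linalg.norm(samples[:, None, :] - CENTERS[None, :, :], axis=2)
    if hard:
        # One-hot rows: probability 1 for the nearest cluster, 0 elsewhere.
        probas = np.zeros_like(distances)
        probas[np.arange(len(samples)), distances.argmin(axis=1)] = 1.0
    else:
        # Assumed soft rule: softmax of the negative distances.
        weights = np.exp(-distances)
        probas = weights / weights.sum(axis=1, keepdims=True)
    return probas[0] if single else probas


hard_p = predict_proba_sketch([1.0])
soft_p = predict_proba_sketch([1.0], hard=False)
batch_p = predict_proba_sketch([[1.0], [3.5]], hard=False)
```

With either setting, each row of probabilities sums to one; only the hard case forces all the mass onto a single cluster.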