gemseo.mlearning.regression.algos.base_random_process_regressor module

A base class for regressors based on a random process.

A class implementing a Gaussian process regressor must derive from it.

class BaseRandomProcessRegressor(data, settings_model=None, **settings)

Bases: BaseRegressor

A base class for regressors based on a random process.

Parameters:
  • data (Dataset) -- The training dataset.

  • settings_model (BaseMLAlgoSettings | None) -- The machine learning algorithm settings as a Pydantic model. If None, use **settings.

  • **settings (Any) -- The machine learning algorithm settings. These arguments are ignored when settings_model is not None.

Raises:

ValueError -- When both a variable and the group it belongs to have a transformer.

abstractmethod compute_samples(input_data, n_samples, seed=None)

Sample a random vector from the conditioned Gaussian process.

Parameters:
  • input_data (RealArray) -- The \(N\) input points of dimension \(d\) at which to observe the conditioned Gaussian process; shaped as \((N, d)\).

  • n_samples (int) -- The number of samples \(M\).

  • seed (int | None) -- The seed for reproducible results.

Returns:

The output samples shaped as \((M, N, p)\) where \(p\) is the output dimension.

Return type:

RealArray
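
The shape convention \((M, N, p)\) can be illustrated with a minimal sketch. The function fake_compute_samples below is a hypothetical stand-in, not part of gemseo: it draws independent standard normal values rather than sampling a conditioned Gaussian process, and the output dimension \(p=2\) is an assumption made for the illustration. Only the shapes and the role of the seed follow the description above.

```python
import numpy as np


def fake_compute_samples(input_data, n_samples, seed=None):
    """Hypothetical stand-in for a concrete compute_samples() implementation.

    It draws independent standard normal values instead of sampling a real
    conditioned Gaussian process; only the shapes follow the documentation.
    """
    n_points = input_data.shape[0]
    output_dimension = 2  # assumed value of p for this illustration
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_samples, n_points, output_dimension))


# N = 4 input points of dimension d = 3, shaped as (N, d).
input_data = np.linspace(0.0, 1.0, 12).reshape(4, 3)

# M = 500 samples, shaped as (M, N, p).
samples = fake_compute_samples(input_data, n_samples=500, seed=1)

# Reducing over the first axis (the M samples) yields one empirical
# statistic per input point and per output component, shaped as (N, p).
empirical_mean = samples.mean(axis=0)
empirical_std = samples.std(axis=0, ddof=1)
```

With a fixed seed, two calls return identical arrays, which is what makes the empirical statistics reproducible.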

abstractmethod predict_covariance(input_data)

Predict the covariance matrix from input data.

Parameters:

input_data (RealArray) -- The \(N\) input points of dimension \(d\) at which to observe the conditioned Gaussian process; shaped as \((N, d)\).

Returns:

The posterior covariance matrix at the input points, of shape \((Np, Np)\) with \(p\) the output dimension. The covariance between the \(k\)-th output at the \(i\)-th input point and the \(l\)-th output at the \(j\)-th input point is located at the \(m\)-th row and \(n\)-th column with \(m=ip+k\), \(n=jp+l\), \(i,j\in\{0,\ldots,N-1\}\) and \(k,l\in\{0,\ldots,p-1\}\).

Return type:

RealArray

Warning

This statistic is expressed in relation to the transformed output space. You can sample the predict() method to estimate it in relation to the original output space if it is different from the transformed output space.
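
The index convention \(m=ip+k\), \(n=jp+l\) of the covariance matrix can be sketched in plain Python. The helper names flat_index and point_and_output are hypothetical and only serve this illustration; they are not gemseo functions.

```python
def flat_index(point_index, output_index, output_dimension):
    """Return m = i * p + k, using zero-based indices."""
    return point_index * output_dimension + output_index


def point_and_output(flat, output_dimension):
    """Invert flat_index: return (i, k) such that flat = i * p + k."""
    return divmod(flat, output_dimension)


p = 3  # output dimension

# Covariance between output k = 1 at input point i = 2
# and output l = 2 at input point j = 0.
row = flat_index(2, 1, p)     # 2 * 3 + 1 = 7
column = flat_index(0, 2, p)  # 0 * 3 + 2 = 2

# Going back from a row number to the (input point, output) pair.
i, k = point_and_output(row, p)  # (2, 1)
```

In other words, the entries are ordered point by point, with the \(p\) outputs of a given input point stored contiguously.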

abstractmethod predict_std(input_data)

Predict the standard deviation from input data.

The user can specify these input data either as a NumPy array, e.g. array([1., 2., 3.]) or as a dictionary of NumPy arrays, e.g. {'a': array([1.]), 'b': array([2., 3.])}.

If the NumPy arrays are two-dimensional, their i-th rows represent the input data of the i-th sample; if they are one-dimensional, there is a single sample.

Parameters:

input_data (DataType) -- The input data.

Returns:

The standard deviation at the query points.

Return type:

RealArray

Warning

This statistic is expressed in relation to the transformed output space. You can sample the predict() method to estimate it in relation to the original output space if it is different from the transformed output space.
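
For a Gaussian process, the variance of each output at each input point is the corresponding diagonal entry of the posterior covariance matrix. The sketch below shows how the index convention of predict_covariance() maps this diagonal to one standard deviation per input point and output. The covariance matrix is a hypothetical stand-in built by hand, and the \((N, p)\) layout of the result is an assumption, since the return shape of predict_std() is not stated here.

```python
import numpy as np

n_points, output_dimension = 2, 2  # N and p

# Hypothetical symmetric positive semi-definite matrix standing in for
# the result of predict_covariance(); its shape is (N * p, N * p).
factor = np.arange(16.0).reshape(4, 4)
covariance = factor @ factor.T

# The variances sit on the diagonal; the diagonal entry m = i * p + k
# corresponds to output k at input point i, so a row-major reshape
# gives an array whose entry (i, k) is the standard deviation sought.
std = np.sqrt(np.diag(covariance)).reshape(n_points, output_dimension)
```

This relation holds in the space in which the covariance is expressed, that is, the transformed output space.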