gemseo.mlearning.regression.algos.pce module#

Polynomial chaos expansion model.

openturns/latest/user_manual/response_surface/response_surface.html
.. _CleaningStrategy: https://openturns.github.io/openturns/latest/user_manual/response_surface/_generated/openturns.CleaningStrategy.html
.. _LARS: https://openturns.github.io/openturns/latest/theory/meta_modeling/polynomial_sparse_least_squares.html
.. _hyperbolic and anisotropic enumerate function: https://openturns.github.io/openturns/latest/user_manual/_generated/openturns.HyperbolicAnisotropicEnumerateFunction.html

The polynomial chaos expansion (PCE) model expresses an output variable as a weighted sum of polynomial functions which are orthonormal in the stochastic input space spanned by the random input variables:

\[Y = w_0 + w_1\phi_1(X) + w_2\phi_2(X) + \ldots + w_K\phi_K(X)\]

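where \(\phi_i(x)=\psi_{\tau_1(i),1}(x_1)\times\ldots\times \psi_{\tau_d(i),d}(x_d)\), \(d\) is the number of input variables and \(\psi_{k,j}\) is the univariate polynomial of degree \(k\) associated with the \(j\)-th input variable.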
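As an illustration of the tensorized form of \(\phi_i\), here is a minimal sketch in plain Python. It assumes independent inputs uniformly distributed on \([-1, 1]\), for which the Legendre polynomials scaled by \(\sqrt{2n+1}\) are orthonormal. The function names, the weights and the multi-indices are hypothetical and are not part of the GEMSEO or OpenTURNS API.

```python
import math


def legendre_orthonormal(n, x):
    """Evaluate the degree-n Legendre polynomial, orthonormal for U(-1, 1)."""
    if n == 0:
        return 1.0
    p_prev, p_curr = 1.0, x
    for k in range(1, n):
        # Bonnet's recurrence: (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - k * p_prev) / (k + 1)
    # E[P_n(X)^2] = 1 / (2n + 1) when X ~ U(-1, 1), hence the scaling.
    return math.sqrt(2 * n + 1) * p_curr


def evaluate_pce(weights, multi_indices, x):
    """Return sum_i w_i * prod_j psi_{tau_j(i), j}(x_j)."""
    return sum(
        w * math.prod(legendre_orthonormal(deg, xj) for deg, xj in zip(tau, x))
        for w, tau in zip(weights, multi_indices)
    )


# Hypothetical model with d = 2 inputs; the multi-index (0, 0) carries w_0.
multi_indices = [(0, 0), (1, 0), (1, 1)]
weights = [1.0, 0.5, 0.25]
y = evaluate_pce(weights, multi_indices, (0.2, -0.4))
```

Each multi-index plays the role of \((\tau_1(i),\ldots,\tau_d(i))\): it gives the degree of the univariate polynomial used for each input.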

Enumeration strategy#

The choice of the function \(\tau=(\tau_1,\ldots,\tau_d)\) is an enumeration strategy and \(\tau_j(i)\) is the polynomial degree of \(\psi_{\tau_j(i),j}\).
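One common choice is to enumerate the multi-indices by increasing total degree. The sketch below illustrates this idea in plain Python. The function name and the tie-breaking order within a given total degree are assumptions made for this example and may differ from the ordering used by OpenTURNS.

```python
from itertools import product


def enumerate_multi_indices(dimension, max_degree):
    """List the multi-indices with total degree <= max_degree.

    The indices are sorted by total degree, then by decreasing
    lexicographic order (an arbitrary tie-breaking choice for this sketch).
    """
    candidates = product(range(max_degree + 1), repeat=dimension)
    kept = [tau for tau in candidates if sum(tau) <= max_degree]
    return sorted(kept, key=lambda tau: (sum(tau), tuple(-t for t in tau)))


# taus[i] plays the role of (tau_1(i), ..., tau_d(i)) with d = 2.
taus = enumerate_multi_indices(dimension=2, max_degree=2)
```

Here, `taus[4]` is `(1, 1)`, meaning that the corresponding basis function is the product of two degree-one univariate polynomials.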

Distributions#

PCE models depend on random input variables and are often used to deal with uncertainty quantification problems.

If \(X_j\) is a Gaussian random variable, \((\psi_{ij})_{i\geq 0}\) is the Hermite basis. If \(X_j\) is a uniform random variable, \((\psi_{ij})_{i\geq 0}\) is the Legendre basis.

When the problem is deterministic, we can still use PCE models under the assumption that the input variables are independent uniform random variables. Then, the orthonormal function basis is the Legendre one.
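The pairing between a distribution and its polynomial family can be checked numerically. The sketch below, in plain Python, approximates \(E[\psi_m(X)\psi_n(X)]\) with a trapezoidal rule and is expected to return a matrix close to the identity for the scaled Legendre polynomials under \(U(-1, 1)\) and for the scaled probabilists' Hermite polynomials under \(N(0, 1)\). All function names are hypothetical, and the standardized distributions and the integration bounds for the Gaussian case are assumptions of this example.

```python
import math


def legendre(n, x):
    """Legendre polynomial scaled to be orthonormal for X ~ U(-1, 1)."""
    if n == 0:
        return 1.0
    p_prev, p_curr = 1.0, x
    for k in range(1, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - k * p_prev) / (k + 1)
    return math.sqrt(2 * n + 1) * p_curr


def hermite(n, x):
    """Probabilists' Hermite polynomial scaled to be orthonormal for X ~ N(0, 1)."""
    if n == 0:
        return 1.0
    h_prev, h_curr = 1.0, x
    for k in range(1, n):
        # He_{k+1}(x) = x He_k(x) - k He_{k-1}(x)
        h_prev, h_curr = h_curr, x * h_curr - k * h_prev
    return h_curr / math.sqrt(math.factorial(n))


def expectation(func, density, lower, upper, n_points=4001):
    """Approximate E[func(X)] with the composite trapezoidal rule."""
    step = (upper - lower) / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        x = lower + i * step
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        total += weight * func(x) * density(x)
    return total * step


def gram_matrix(basis, density, lower, upper, size=4):
    """Return the matrix of E[psi_m(X) psi_n(X)] for m, n < size."""
    return [
        [
            expectation(lambda x: basis(m, x) * basis(n, x), density, lower, upper)
            for n in range(size)
        ]
        for m in range(size)
    ]


def uniform_pdf(x):
    return 0.5


def gaussian_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


gram_legendre = gram_matrix(legendre, uniform_pdf, -1.0, 1.0)
# The Gaussian density is negligible outside [-10, 10].
gram_hermite = gram_matrix(hermite, gaussian_pdf, -10.0, 10.0)
```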

Degree#

The degree \(P\) of a PCE model is the maximum total degree of its basis functions: \(\max_i \text{degree}(\phi_i)=\max_i\sum_{j=1}^d\tau_j(i)=P\).
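The degree drives the size of the expansion. Assuming that every multi-index with total degree at most \(P\) is kept (no sparse selection and no hyperbolic truncation), the number of terms, intercept included, is \(\binom{d+P}{P}\). The following sketch, with hypothetical function names, computes this count and cross-checks it by brute-force enumeration.

```python
import math
from itertools import product


def basis_size(dimension, degree):
    """Number of multi-indices with total degree <= degree (intercept included)."""
    return math.comb(dimension + degree, degree)


def count_by_enumeration(dimension, degree):
    """Count the same multi-indices explicitly, as a cross-check."""
    return sum(
        1
        for tau in product(range(degree + 1), repeat=dimension)
        if sum(tau) <= degree
    )


# Sizes for a few (d, P) pairs, to show how quickly the basis grows.
sizes = {(d, p): basis_size(d, p) for d in (2, 5, 10) for p in (2, 3)}
```

With \(d=10\) and \(P=3\), the full basis already contains 286 terms, which is one motivation for sparse strategies.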

Estimation#

The coefficients \((w_1, w_2, \ldots, w_K)\) and the intercept \(w_0\) are estimated either by least-squares regression or by a quadrature rule. With least-squares regression, the `LARS`_ algorithm can be used to obtain a sparse expansion. In both cases, the `CleaningStrategy`_ can also remove the non-significant coefficients.
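To illustrate the quadrature approach, the sketch below projects a one-dimensional function onto an orthonormal Legendre basis, using \(w_i = E[Y\phi_i(X)]\) and a 3-point Gauss-Legendre rule. It assumes \(X \sim U(-1, 1)\) and the toy model \(Y = X^2\). The names are hypothetical and this is not how OpenTURNS implements the estimation.

```python
import math

# 3-point Gauss-Legendre rule on [-1, 1]; exact for polynomials up to degree 5.
NODES = (-math.sqrt(0.6), 0.0, math.sqrt(0.6))
WEIGHTS = (5 / 9, 8 / 9, 5 / 9)

# Orthonormal Legendre polynomials for X ~ U(-1, 1), degrees 0 to 2.
BASIS = (
    lambda x: 1.0,
    lambda x: math.sqrt(3) * x,
    lambda x: math.sqrt(5) * (3 * x * x - 1) / 2,
)


def model(x):
    """Toy model whose expansion is computed."""
    return x * x


def project(func):
    """Return the coefficients w_i = E[func(X) * phi_i(X)]."""
    # The factor 0.5 is the probability density of U(-1, 1).
    return [
        sum(0.5 * w * func(x) * phi(x) for x, w in zip(NODES, WEIGHTS))
        for phi in BASIS
    ]


coefficients = project(model)


def surrogate(x):
    """Evaluate the expansion built from the projected coefficients."""
    return sum(w * phi(x) for w, phi in zip(coefficients, BASIS))
```

Because the toy model is a polynomial of degree 2, the degree-2 expansion reproduces it up to rounding errors.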

Dependence#

The PCE model relies on the OpenTURNS class FunctionalChaosAlgorithm.

class PCERegressor(data, settings_model=None, **settings)[source]#

Bases: BaseFCERegressor

Polynomial chaos expansion model.

See Also: API documentation of the OpenTURNS class FunctionalChaosAlgorithm.

Parameters:

data (IODataset) -- The training dataset whose input space data.misc["input_space"] is expected to be a ParameterSpace defining the random input variables. The training dataset can be empty in the case of quadrature when discipline is not None.
settings_model (PCERegressor_Settings | None) -- The machine learning algorithm settings as a Pydantic model. If None, use **settings.
**settings (Any) -- The machine learning algorithm settings. These arguments are ignored when settings_model is not None.

Raises:

ValueError -- When both data and discipline are missing, when both data and discipline are provided, when discipline is provided in the case of least-squares regression, when data is missing in the case of least-squares regression, when the probability space does not contain the distribution of an input variable, when an input variable has a data transformer or when a probability distribution is not an OTDistribution.

Settings#

alias of PCERegressor_Settings

LIBRARY: ClassVar[str] = 'OpenTURNS'#

The name of the library of the wrapped machine learning algorithm.

SHORT_ALGO_NAME: ClassVar[str] = 'PCE'#

The short name of the machine learning algorithm, often an acronym.

Typically used for composite names, e.g. f"{algo.SHORT_ALGO_NAME}_{dataset.name}" or f"{algo.SHORT_ALGO_NAME}_{discipline.name}".

property covariance: RealArray#

The covariance matrix of the PCE model output.

Warning

This statistic is expressed in relation to the transformed output space. You can sample the predict() method to estimate it in relation to the original output space if it is different from the transformed output space.
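For an orthonormal basis, the mean and covariance of the output follow directly from the coefficients: the mean is the intercept and \(\text{Cov}(Y_a, Y_b)=\sum_{i\geq 1} w_i^{(a)} w_i^{(b)}\). The sketch below applies this standard identity to a hypothetical coefficient table. The layout of the table and the function name are assumptions of this example, not the internal representation used by the class.

```python
def pce_covariance(coefficients):
    """Covariance matrix of the outputs of a PCE with an orthonormal basis.

    coefficients[i][k] is the weight of the i-th basis function for the
    k-th output; row 0 holds the intercepts and is excluded from the sums.
    """
    n_outputs = len(coefficients[0])
    return [
        [sum(row[a] * row[b] for row in coefficients[1:]) for b in range(n_outputs)]
        for a in range(n_outputs)
    ]


# Hypothetical expansion with two outputs and three basis functions.
coefficients = [
    [1.0, -2.0],  # intercepts w_0, i.e. the output means
    [0.5, 0.0],
    [0.25, 1.0],
]
mean = coefficients[0]
covariance = pce_covariance(coefficients)
```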

property second_sobol_indices: list[dict[str, dict[str, float]]]#

The second-order Sobol' indices for the different output components.

Warning

These statistics are expressed in relation to the transformed output space. You can use a SobolAnalysis to estimate them in relation to the original output space if it is different from the transformed output space.
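With an orthonormal basis, a second-order Sobol' index can be read from the coefficients: it is the share of the output variance carried by the basis functions that involve exactly the two inputs considered. The sketch below illustrates this for a single output. The function name, the weights and the multi-indices are hypothetical, and inputs are identified by position rather than by name as in the property above.

```python
def second_order_sobol(weights, multi_indices, j, k):
    """Second-order Sobol' index of inputs j and k for an orthonormal PCE."""
    # Total variance: squared weights of all non-constant basis functions.
    variance = sum(w * w for w, tau in zip(weights, multi_indices) if any(tau))
    # Interaction part: basis functions depending on inputs j and k only.
    partial = sum(
        w * w
        for w, tau in zip(weights, multi_indices)
        if {pos for pos, deg in enumerate(tau) if deg > 0} == {j, k}
    )
    return partial / variance


# Hypothetical expansion with d = 3 inputs; (0, 0, 0) carries the intercept.
multi_indices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (0, 1, 2)]
weights = [2.0, 1.0, 1.0, 1.0, 1.0]

s_01 = second_order_sobol(weights, multi_indices, 0, 1)
s_12 = second_order_sobol(weights, multi_indices, 1, 2)
s_02 = second_order_sobol(weights, multi_indices, 0, 2)
```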