pce module

Polynomial chaos expansion model.

The polynomial chaos expansion (PCE) model expresses an output variable as a weighted sum of polynomial functions which are orthonormal in the stochastic input space spanned by the random input variables:

\[Y = w_0 + w_1\phi_1(X) + w_2\phi_2(X) + \ldots + w_K\phi_K(X)\]

where each \(\phi_i(x)=\psi_{\tau_1(i),1}(x_1)\times\ldots\times \psi_{\tau_d(i),d}(x_d)\) is a product of univariate polynomials, one per input variable, and \(d\) is the number of input variables.

Enumeration strategy

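The function \(\tau=(\tau_1,\ldots,\tau_d)\) maps the index \(i\) of a basis function to the degrees of its univariate factors: \(\tau_j(i)\) is the degree of the polynomial \(\psi_{\tau_j(i),j}\) in the \(j\)-th input variable. The choice of \(\tau\) is called an enumeration strategy.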
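As an illustration of what an enumeration strategy produces, the sketch below lists the multi-indices \(\tau(i)\) by increasing total degree. It is a minimal example in plain Python; the function name `graded_multi_indices` and the tie-breaking order are assumptions made for this example and are not necessarily the ordering used by OpenTURNS.

```python
from itertools import product


def graded_multi_indices(d, max_degree):
    """List the multi-indices of dimension d with total degree <= max_degree.

    The list is sorted by total degree first, then lexicographically.
    This ordering is an illustrative choice, not the library's.
    """
    indices = [
        tau
        for tau in product(range(max_degree + 1), repeat=d)
        if sum(tau) <= max_degree
    ]
    indices.sort(key=lambda tau: (sum(tau), tau))
    return indices


# Position i in the list plays the role of the index i, and tau[i][j] is the
# degree of the univariate polynomial in the (j+1)-th input variable.
tau = graded_multi_indices(d=2, max_degree=2)
for i, multi_index in enumerate(tau):
    print(i, multi_index)
```

Here position 0 holds the multi-index of zeros, that is, the constant basis function associated with the intercept.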

Distributions

PCE models depend on random input variables and are often used for uncertainty quantification.

If \(X_j\) is a Gaussian random variable, \((\psi_{ij})_{i\geq 0}\) is the Hermite basis. If \(X_j\) is a uniform random variable, \((\psi_{ij})_{i\geq 0}\) is the Legendre basis.

When the problem is deterministic, we can still use PCE models under the assumption that the input variables are independent uniform random variables. Then, the orthonormal function basis is the Legendre one.
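Which basis is orthonormal under which distribution can be checked numerically. The sketch below uses NumPy rather than OpenTURNS and is only an illustration: it computes the Gram matrix \(E[\psi_m(X)\psi_n(X)]\) by Gauss quadrature for normalized Legendre polynomials under a uniform distribution on \([-1, 1]\) and for normalized probabilists' Hermite polynomials under a standard Gaussian distribution. The function names, the choice of these two reference distributions and the number of quadrature nodes are assumptions of this example.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e, legendre


def gram_legendre_uniform(max_degree, n_nodes=20):
    """Gram matrix of normalized Legendre polynomials for X ~ U(-1, 1)."""
    x, w = legendre.leggauss(n_nodes)
    w = w / 2.0  # the density of U(-1, 1) is 1/2
    psi = np.array(
        [
            math.sqrt(2 * n + 1) * legendre.legval(x, [0] * n + [1])
            for n in range(max_degree + 1)
        ]
    )
    return (psi * w) @ psi.T


def gram_hermite_gaussian(max_degree, n_nodes=20):
    """Gram matrix of normalized Hermite polynomials for X ~ N(0, 1)."""
    x, w = hermite_e.hermegauss(n_nodes)
    w = w / math.sqrt(2.0 * math.pi)  # normalize exp(-x^2/2) to a density
    psi = np.array(
        [
            hermite_e.hermeval(x, [0] * n + [1]) / math.sqrt(math.factorial(n))
            for n in range(max_degree + 1)
        ]
    )
    return (psi * w) @ psi.T


# Both matrices should be numerically close to the identity matrix.
print(np.round(gram_legendre_uniform(3), 6))
print(np.round(gram_hermite_gaussian(3), 6))
```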

Degree

The degree \(P\) of a PCE model is the largest total degree among its basis functions, where the total degree of \(\phi_i\) is \(\sum_{j=1}^d\tau_j(i)\): \(P=\max_i \text{degree}(\phi_i)=\max_i\sum_{j=1}^d\tau_j(i)\).
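To give a sense of how the basis size grows with the degree, the sketch below counts the basis functions under the assumption that every multi-index with total degree at most \(P\) is retained (no sparsity and no other truncation). In that case the count, constant term included, is the binomial coefficient \(\binom{d+P}{P}\). The function name is hypothetical.

```python
from math import comb


def n_basis_functions(d, degree):
    """Count the multi-indices of dimension d with total degree <= degree.

    Assumes the full total-degree basis; the constant term is included.
    """
    return comb(d + degree, degree)


for d in (2, 5, 10):
    print(d, [n_basis_functions(d, degree) for degree in (1, 2, 3)])
```

With the notation of the expansion above, this count equals \(K+1\) because it includes the intercept \(w_0\).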

Estimation

The coefficients \((w_1, w_2, \ldots, w_K)\) and the intercept \(w_0\) are estimated either by least-squares regression or by a quadrature rule. With least-squares regression, a sparse strategy based on the LARS algorithm can be used. In both cases, the CleaningStrategy can additionally remove the non-significant coefficients.
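The least-squares option can be illustrated with a one-dimensional sketch in NumPy. This is not the OpenTURNS implementation: it assumes a single input uniformly distributed on \([-1, 1]\), a normalized Legendre basis and the toy function \(f(x)=x^2\), and the function names are hypothetical.

```python
import math

import numpy as np
from numpy.polynomial import legendre


def legendre_design_matrix(x, degree):
    """Evaluate the normalized Legendre polynomials of degree 0..degree at x."""
    return np.column_stack(
        [
            math.sqrt(2 * n + 1) * legendre.legval(x, [0] * n + [1])
            for n in range(degree + 1)
        ]
    )


def fit_pce_least_squares(x, y, degree):
    """Estimate (w_0, ..., w_degree) by ordinary least squares."""
    coefficients, *_ = np.linalg.lstsq(
        legendre_design_matrix(x, degree), y, rcond=None
    )
    return coefficients


rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=50)  # learning inputs
y = x**2  # learning outputs of the toy function
w = fit_pce_least_squares(x, y, degree=2)
print(w)
```

Because \(x^2\) lies in the span of the first three Legendre polynomials, the regression recovers its exact expansion: \(w_0 = 1/3\), \(w_1 = 0\) and \(w_2 = 2/(3\sqrt{5})\).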

Dependence

The PCE model relies on the OpenTURNS class FunctionalChaosAlgorithm.

class gemseo.mlearning.regression.pce.CleaningOptions(max_considered_terms=100, most_significant=20, significance_factor=0.0001)

Bases: object

The options of the CleaningStrategy.

Parameters:

max_considered_terms (int) – By default it is set to 100.

most_significant (int) – By default it is set to 20.

significance_factor (float) – By default it is set to 0.0001.

max_considered_terms: int = 100: The maximum number of coefficients of the polynomial basis to be considered.

most_significant: int = 20: The maximum number of efficient coefficients of the polynomial basis to be kept.

significance_factor: float = 0.0001: The threshold to select the efficient coefficients of the polynomial basis.
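To make the role of the three options concrete, the sketch below applies a simplified, one-pass filter to a vector of coefficients. It is not the OpenTURNS CleaningStrategy, which proceeds iteratively while building the basis. The function name is hypothetical, and the rule used here (discard a coefficient whose magnitude does not exceed `significance_factor` times the largest magnitude) is an assumption made for this example.

```python
import numpy as np


def clean_coefficients(
    coefficients,
    max_considered_terms=100,
    most_significant=20,
    significance_factor=1e-4,
):
    """Return the sorted indices of the coefficients kept by a one-pass filter."""
    w = np.asarray(coefficients, dtype=float)
    # Only the first max_considered_terms coefficients are examined.
    considered = np.arange(min(w.size, max_considered_terms))
    if considered.size == 0:
        return considered
    magnitude = np.abs(w[considered])
    # Assumed rule: the threshold is relative to the largest magnitude.
    threshold = significance_factor * magnitude.max()
    significant = considered[magnitude > threshold]
    # Keep at most most_significant terms, largest magnitudes first.
    order = np.argsort(-np.abs(w[significant]), kind="stable")
    return np.sort(significant[order[:most_significant]])


w = [1.0, 1e-6, 0.5, 0.0, 0.2]
print(clean_coefficients(w))
print(clean_coefficients(w, most_significant=2))
```

In this example the second and fourth coefficients fall below the relative threshold, and limiting `most_significant` to 2 additionally drops the smallest of the remaining three.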

class gemseo.mlearning.regression.pce.PCERegressor(data, probability_space, transformer=mappingproxy({}), input_names=None, output_names=None, degree=2, discipline=None, use_quadrature=False, use_lars=False, use_cleaning=False, hyperbolic_parameter=1.0, n_quadrature_points=0, cleaning_options=None)

Bases: MLRegressionAlgo

Polynomial chaos expansion model.

Examples using PCERegressor

PCE regression