gemseo.mlearning.regression.algos.polyreg module
Polynomial regression model.
Polynomial regression is a particular case of linear regression, where the input data is transformed before the regression is applied. This transform consists of creating a matrix of monomials by raising the input data to different powers up to a certain degree \(D\). In the case where there is only one input variable, the input data \((x_i)_{i=1, \dots, n}\in\mathbb{R}^n\) is transformed into the Vandermonde matrix:
\[\begin{pmatrix} x_1^1 & x_1^2 & \cdots & x_1^D\\ x_2^1 & x_2^2 & \cdots & x_2^D\\ \vdots & \vdots & \ddots & \vdots\\ x_n^1 & x_n^2 & \cdots & x_n^D \end{pmatrix}.\]
The output variable is expressed as a weighted sum of monomials:
\[y = w_0 + w_1 x^1 + w_2 x^2 + \dots + w_D x^D,\]
where the coefficients \(w_1, w_2, \dots, w_D\) and the intercept \(w_0\) are estimated by least-squares regression.
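The one-variable case can be sketched with NumPy alone. This is a minimal illustration, not the GEMSEO implementation: the sample data, the helper name `fit_polynomial` and the use of `numpy.linalg.lstsq` are assumptions made for the example. A column of ones is prepended to the Vandermonde matrix so that the intercept \(w_0\) is estimated together with the other coefficients.

```python
import numpy as np


def fit_polynomial(x, y, degree):
    """Least-squares fit of y = w_0 + w_1 x + ... + w_D x^D (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # Columns x^1, ..., x^D: the Vandermonde matrix without the constant column.
    powers = np.arange(1, degree + 1)
    vandermonde = x[:, None] ** powers
    # Prepend a column of ones so that the intercept w_0 is estimated as well.
    design = np.hstack([np.ones((x.size, 1)), vandermonde])
    weights, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return weights  # [w_0, w_1, ..., w_D]


# Illustrative, noise-free data generated from y = 1 + 2x - 3x^2.
x = np.linspace(-1.0, 1.0, 10)
y = 1.0 + 2.0 * x - 3.0 * x**2
weights = fit_polynomial(x, y, degree=2)
print(np.round(weights, 6))
```

With noise-free data of degree 2 and `degree=2`, the recovered weights match the generating coefficients up to floating-point error.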
In the case of a multidimensional input, i.e. \(X = (x_{ij})_{i=1,\dots,n; j=1,\dots,m}\), where \(n\) is the number of samples and \(m\) is the number of input variables, the Vandermonde matrix is built from all the monomials of degree \(d\) with \(1 \leq d \leq D\); e.g. for three variables \((x, y, z)\) and degree \(D=3\), the different terms are \(x\), \(y\), \(z\), \(x^2\), \(xy\), \(xz\), \(y^2\), \(yz\), \(z^2\), \(x^3\), \(x^2y\), etc. More generally, for \(m\) input variables, the total number of monomials of degree \(0 \leq d \leq D\), constant monomial included, is \(P = \binom{m+D}{m} = \frac{(m+D)!}{m!D!}\). In the case of the three input variables given above, the total number of monomials of degree less than or equal to three is thus \(P = \binom{6}{3} = 20\), of which \(P - 1 = 19\) are non-constant. The linear regression has to identify the coefficients \(w_1, \dots, w_{P-1}\) of the non-constant monomials, in addition to the intercept \(w_0\).
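The monomial count can be checked by direct enumeration. The sketch below uses only the Python standard library; the helper name `monomials` is hypothetical. Note that \(\binom{m+D}{m}\) also counts the constant monomial of degree 0, so enumerating degrees 1 to \(D\) yields one term fewer.

```python
from itertools import combinations_with_replacement
from math import comb


def monomials(variables, max_degree):
    """List the monomials of degree 1..max_degree as tuples of variable names."""
    terms = []
    for degree in range(1, max_degree + 1):
        # Each multiset of `degree` variables is one monomial,
        # e.g. ("x", "x", "y") stands for x^2 * y.
        terms.extend(combinations_with_replacement(variables, degree))
    return terms


terms = monomials(("x", "y", "z"), 3)
n_with_constant = comb(3 + 3, 3)  # binomial(m + D, m), constant monomial included
print(len(terms), n_with_constant)
```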
Dependence
The polynomial regression model relies on the LinearRegression and PolynomialFeatures classes of the scikit-learn library.
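As a rough illustration of how these two scikit-learn classes combine, the sketch below chains them in a pipeline. It does not reproduce how GEMSEO configures them internally; the data, the degree, and the choice of `include_bias=False` (leaving the intercept to `LinearRegression`) are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Illustrative, noise-free data: y = 1 + 2*x0 - x0*x1 on a small random sample.
rng = np.random.default_rng(0)
inputs = rng.uniform(-1.0, 1.0, size=(30, 2))
outputs = 1.0 + 2.0 * inputs[:, 0] - inputs[:, 0] * inputs[:, 1]

# PolynomialFeatures builds the monomials x0, x1, x0^2, x0*x1, x1^2;
# include_bias=False omits the constant column, so LinearRegression
# estimates the intercept itself.
model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(fit_intercept=True),
)
model.fit(inputs, outputs)

features = model.named_steps["polynomialfeatures"]
regression = model.named_steps["linearregression"]
print(features.n_output_features_)
print(regression.intercept_, regression.coef_)
```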
- class PolynomialRegressor(data, settings_model=None, **settings)
Bases: LinearRegressor
Polynomial regression model.
- Parameters:
data (Dataset) -- The learning dataset.
settings_model (BaseMLAlgoSettings | None) -- The machine learning algorithm settings as a Pydantic model. If None, use **settings.
**settings (Any) -- The machine learning algorithm settings. These arguments are ignored when settings_model is not None.
- Raises:
ValueError -- When both the variable and the group it belongs to have a transformer.
- Settings
alias of PolynomialRegressor_Settings
- get_coefficients(as_dict=False)
Return the regression coefficients of the linear model.
- Parameters:
as_dict (bool) -- If True, return the coefficients as a dictionary of NumPy arrays indexed by the names of the coefficients. Otherwise, return the coefficients as a NumPy array. For now, the only valid value is False. By default, it is set to False.
- Returns:
The regression coefficients of the linear model.
- Raises:
NotImplementedError -- If the coefficients are required as a dictionary.
- Return type: