# Linear regression¶

The linear regression surrogate discipline expresses the model output as a weighted sum of the model inputs:

$y = w_0 + w_1x_1 + w_2x_2 + ... + w_dx_d,$

where the coefficients $$(w_1, w_2, ..., w_d)$$ and the intercept $$w_0$$ are estimated by penalized least squares regression, i.e. by minimizing

$\|y - \hat{y}\|_2^2 + \alpha \left( \lambda \|w\|_2^2 + (1-\lambda) \|w\|_1 \right)$

over the learning data. They are easily accessible via the attributes coefficients and intercept.

The penalty level $$\alpha$$ is a non-negative parameter intended to prevent overfitting, while the penalty ratio $$\lambda\in [0, 1]$$ expresses the ratio between $$\ell_2$$- and $$\ell_1$$-regularization. When $$\lambda=1$$, there is no $$\ell_1$$-regularization, and a Ridge regression is performed. When $$\lambda=0$$, there is no $$\ell_2$$-regularization, and a Lasso regression is performed. For $$\lambda$$ between 0 and 1, an elastic net regression is performed.

One may also choose not to penalize the regression at all, by setting $$\alpha=0$$. In this case, a simple least squares regression is performed.
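To illustrate the role of $$\alpha$$, here is a minimal NumPy sketch of the Ridge case ($$\lambda=1$$), which has the closed-form solution $$w = (X^TX + \alpha P)^{-1}X^Ty$$ where $$P$$ excludes the intercept from the penalty. The data, the coefficient values, and the helper name `ridge_fit` are made up for the example and are not part of GEMSEO:

```python
import numpy as np

# Toy data generated from y = 1.0 + 2.0*x1 - 3.0*x2 (hypothetical coefficients).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression; the intercept w_0 is not penalized."""
    n, d = X.shape
    Xa = np.hstack([np.ones((n, 1)), X])  # prepend a column of ones for w_0
    P = np.eye(d + 1)
    P[0, 0] = 0.0  # exclude the intercept from the l2 penalty
    return np.linalg.solve(Xa.T @ Xa + alpha * P, Xa.T @ y)

w_ols = ridge_fit(X, y, alpha=0.0)    # alpha=0: plain least squares
w_ridge = ridge_fit(X, y, alpha=10.0) # alpha>0: coefficients shrunk toward zero
```

With $$\alpha=0$$ the noiseless toy weights are recovered exactly; increasing $$\alpha$$ shrinks the coefficient magnitudes, which is the overfitting protection described above.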

This concept is implemented through the LinearRegression class, which inherits from the MLRegressionAlgo class.

## Dependence¶

The linear model relies on the LinearRegression, Ridge, Lasso and ElasticNet classes of the scikit-learn library.

class gemseo.mlearning.regression.linreg.LinearRegression(data, transformer=None, input_names=None, output_names=None, fit_intercept=True, penalty_level=0.0, l2_penalty_ratio=1.0, **parameters)[source]

Linear regression.

Constructor.

Parameters
• data (Dataset) – learning dataset.

• transformer (dict(str)) – transformation strategy for data groups. If None, do not transform data. Default: None.

• input_names (list(str)) – names of the input variables.

• output_names (list(str)) – names of the output variables.

• fit_intercept (bool) – if True, fit intercept. Default: True.

• penalty_level (float) – penalty level greater than or equal to 0. If 0, there is no penalty. Default: 0.

• l2_penalty_ratio (float) – penalty ratio related to the l2 regularization. If 1, the penalty is the Ridge penalty. If 0, it is the Lasso penalty. Strictly between 0 and 1, it is the ElasticNet penalty. Default: 1.
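The way penalty_level and l2_penalty_ratio together determine the underlying scikit-learn estimator can be sketched with a small dispatch function; `select_estimator` is a hypothetical helper written for this illustration and is not part of the GEMSEO API:

```python
def select_estimator(penalty_level=0.0, l2_penalty_ratio=1.0):
    """Return the name of the scikit-learn class implied by the penalty
    settings (hypothetical helper mirroring the rules described above)."""
    if penalty_level == 0.0:
        return "LinearRegression"  # no penalty: simple least squares
    if l2_penalty_ratio == 1.0:
        return "Ridge"             # pure l2 penalty
    if l2_penalty_ratio == 0.0:
        return "Lasso"             # pure l1 penalty
    return "ElasticNet"            # mix of l1 and l2 penalties

print(select_estimator())                   # defaults: no penalty
print(select_estimator(1.0, 0.5))           # penalized, mixed ratio
```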