
# linreg module

## Linear regression

The linear regression surrogate discipline expresses the model output as a weighted sum of the model inputs:

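$y = w_0 + w_1x_1 + w_2x_2 + ... + w_dx_d,$

where the coefficients $$(w_1, w_2, ..., w_d)$$ and the intercept $$w_0$$ are estimated by least squares regression, optionally penalized by adding the term

$\alpha \left( \lambda \|w\|_2 + (1-\lambda) \|w\|_1 \right)$

to the least squares objective. The penalty acts on the fit of the coefficients, not on the model output itself. The fitted values are easily accessible via the properties coefficients and intercept.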

The penalty level $$\alpha$$ is a non-negative parameter intended to prevent overfitting, while the penalty ratio $$\lambda\in [0, 1]$$ expresses the ratio between $$\ell_2$$- and $$\ell_1$$-regularization. When $$\lambda=1$$, there is no $$\ell_1$$-regularization, and a Ridge regression is performed. When $$\lambda=0$$, there is no $$\ell_2$$-regularization, and a Lasso regression is performed. For $$\lambda$$ between 0 and 1, an elastic net regression is performed.

One may also choose not to penalize the regression at all, by setting $$\alpha=0$$. In this case, a simple least squares regression is performed.
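The effect of the penalty level can be illustrated with a minimal NumPy sketch. This is not the GEMSEO implementation: the helper `fit_linear` is hypothetical, and it assumes the squared $$\ell_2$$ norm of the usual Ridge formulation, solved in closed form on centered data so that the intercept is not penalized.

```python
import numpy as np


def fit_linear(x, y, penalty_level=0.0):
    """Fit y ~ w_0 + x @ w with an optional squared l2 penalty on w.

    Hypothetical helper for illustration only (not part of GEMSEO).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Center the data so that the intercept w_0 is left unpenalized.
    x_mean = x.mean(axis=0)
    y_mean = y.mean()
    xc = x - x_mean
    yc = y - y_mean
    # Penalized normal equations: (X^T X + alpha * I) w = X^T y.
    gram = xc.T @ xc + penalty_level * np.eye(x.shape[1])
    coefficients = np.linalg.solve(gram, xc.T @ yc)
    intercept = y_mean - x_mean @ coefficients
    return intercept, coefficients


# Noise-free data generated from y = 1 + 2 * x_1 - 3 * x_2.
rng = np.random.default_rng(0)
x = rng.uniform(size=(50, 2))
y = 1.0 + 2.0 * x[:, 0] - 3.0 * x[:, 1]

# penalty_level = 0: plain least squares recovers the generating weights.
w0, w = fit_linear(x, y)

# penalty_level > 0: the coefficients are shrunk towards zero.
w0_pen, w_pen = fit_linear(x, y, penalty_level=10.0)
```

With a zero penalty level the fit reduces to ordinary least squares; increasing it trades fidelity to the learning data for smaller coefficients.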

This concept is implemented through the LinearRegression class, which inherits from the MLRegressionAlgo class.

### Dependence

The linear model relies on the LinearRegression, Ridge, Lasso and ElasticNet classes of the scikit-learn library.
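As a rough sketch of how the penalty settings could map onto these four scikit-learn classes, consider the hypothetical helper below. The function name `select_estimator` and the dispatch logic are assumptions made for illustration, not the GEMSEO source. Note that scikit-learn's `ElasticNet` is parameterized by an `l1_ratio`, taken here as the complement of the $$\ell_2$$ penalty ratio, and that scikit-learn scales its penalty terms in its own way, so the correspondence is indicative only.

```python
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge


def select_estimator(penalty_level=0.0, l2_penalty_ratio=1.0, fit_intercept=True):
    """Pick a scikit-learn linear model from the penalty settings.

    Hypothetical helper for illustration only (not part of GEMSEO).
    """
    if penalty_level == 0.0:
        # No penalty: ordinary least squares.
        return LinearRegression(fit_intercept=fit_intercept)
    if l2_penalty_ratio == 1.0:
        # Pure l2 penalty: Ridge regression.
        return Ridge(alpha=penalty_level, fit_intercept=fit_intercept)
    if l2_penalty_ratio == 0.0:
        # Pure l1 penalty: Lasso regression.
        return Lasso(alpha=penalty_level, fit_intercept=fit_intercept)
    # Mixed penalty: elastic net, with l1_ratio = 1 - l2_penalty_ratio.
    return ElasticNet(
        alpha=penalty_level,
        l1_ratio=1.0 - l2_penalty_ratio,
        fit_intercept=fit_intercept,
    )


ols = select_estimator()
ridge = select_estimator(penalty_level=0.1, l2_penalty_ratio=1.0)
lasso = select_estimator(penalty_level=0.1, l2_penalty_ratio=0.0)
enet = select_estimator(penalty_level=0.1, l2_penalty_ratio=0.25)
```

Each returned estimator exposes the usual scikit-learn `fit` and `predict` methods, and its fitted `coef_` and `intercept_` attributes hold the weights and the intercept.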

class gemseo.mlearning.regression.linreg.LinearRegression(data, transformer=None, input_names=None, output_names=None, fit_intercept=True, penalty_level=0.0, l2_penalty_ratio=1.0, **parameters)

Linear regression

Constructor.

Parameters
• data (Dataset) – learning dataset.

• transformer (dict(str)) – transformation strategy for data groups. If None, do not transform data. Default: None.

• input_names (list(str)) – names of the input variables.

• output_names (list(str)) – names of the output variables.

• fit_intercept (bool) – if True, fit intercept. Default: True.

• penalty_level (float) – penalty level greater than or equal to 0. If 0, there is no penalty. Default: 0.

• l2_penalty_ratio (float) – penalty ratio related to the l2 regularization. If 1, the penalty is the Ridge penalty. If 0, the penalty is the Lasso penalty. Between 0 and 1, the penalty is the ElasticNet penalty. Default: 1.

ABBR = 'LinReg'
LIBRARY = 'scikit-learn'
property coefficients

Return the regression coefficients of the linear fit.

get_coefficients(as_dict=True)

Return the regression coefficients of the linear fit as a numpy array or as a dict.

Parameters

as_dict (bool) – if True, returns coefficients as a dictionary. Default: True.

get_intercept(as_dict=True)

Return the regression intercept of the linear fit as a numpy array or as a dict.

Parameters

as_dict (bool) – if True, returns intercept as a dictionary. Default: True.

property intercept

Return the regression intercepts of the linear fit.