ml_regressor_quality_viewer module

Visualization of the quality of a regression model.

class gemseo.post.mlearning.ml_regressor_quality_viewer.MLRegressorQualityViewer(algo)

Bases: object

Visualization of the quality of a regression model.

Parameters:

algo (BaseMLRegressionAlgo) – The regression algorithm.

class ReferenceDataset(value)

Bases: StrEnum

The reference dataset.

CROSS_VALIDATION = 'CROSS_VALIDATION'

The cross-validation dataset.

This is the learning dataset, split into \(K\) learning-validation partitions.

LEARNING = 'LEARNING'

The learning dataset.
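
The split into \(K\) learning-validation partitions can be pictured with a minimal sketch in plain Python. It does not call GEMSEO and is not its implementation; the helper name `k_fold_partitions` and its arguments are hypothetical and chosen for illustration only.

```python
import random


def k_fold_partitions(n_samples, n_folds=5, seed=0):
    """Return n_folds (learning, validation) pairs of sample indices.

    Hypothetical helper: each sample is used for validation exactly once.
    """
    indices = list(range(n_samples))
    # Shuffle reproducibly so that the folds do not follow the sample order.
    random.Random(seed).shuffle(indices)
    # Fold k takes every n_folds-th shuffled index, starting at position k.
    folds = [indices[k::n_folds] for k in range(n_folds)]
    partitions = []
    for k, validation in enumerate(folds):
        # The learning part is made of all the other folds.
        learning = [i for j, fold in enumerate(folds) if j != k for i in fold]
        partitions.append((sorted(learning), sorted(validation)))
    return partitions


for learning, validation in k_fold_partitions(6, n_folds=3):
    print("learn on", learning, "validate on", validation)
```

Every index appears in exactly one validation part, so the predictions gathered over the \(K\) partitions cover the whole learning dataset.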

plot_predictions_vs_observations(output, observations=ReferenceDataset.LEARNING, use_scatter_matrix=True, filter_scatters=True, save=True, show=False, n_folds=5, samples=(), seed=None, **options)

Plot the predictions versus the observations.

Parameters:
  • output (str | tuple[str, int]) – The name of the output of interest, and possibly the component of interest; if the latter is missing, use all the components of the output.

  • observations (ReferenceDataset | Dataset) –

    The validation dataset, given either as a ReferenceDataset member or as a Dataset.

    By default it is set to “LEARNING”.

  • use_scatter_matrix (bool) –

    Whether the method returns a ScatterMatrix. Otherwise, it returns a list of Scatter objects.

    By default it is set to True.

  • filter_scatters (bool) –

    Whether to display only the scatters with the quantity of interest on at least one of the axes. Otherwise, display all the scatters, including those showing an input or output as a function of another input or output.

    By default it is set to True.

  • save (bool) –

    Whether to save the plots.

    By default it is set to True.

  • show (bool) –

    Whether to show the plots.

    By default it is set to False.

  • n_folds (int) –

    The number of folds. Used only in the case of cross-validation.

    By default it is set to 5.

  • samples (Sequence[int]) –

    The indices of the learning samples. If empty, use the whole learning dataset. Used only in the case of cross-validation.

    By default it is set to ().

  • seed (int | None) – The seed of the pseudo-random number generator. If None, the seed of the i-th execution is SEED+i. Used only in the case of cross-validation.

  • **options (Any) – The options of the underlying DatasetPlot.

Returns:

The plots of the predictions versus the observations.

Return type:

list[Scatter] | ScatterMatrix
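
As a rough illustration of what such a plot displays, the sketch below builds the (observation, prediction) pairs for a toy least-squares line. It is written in plain Python and does not call GEMSEO; the data and the helper `fit_line` are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (toy helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up learning data: one input and one noisy, roughly linear output.
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
observations = [1.0, 3.1, 4.9, 7.2, 8.8]

slope, intercept = fit_line(inputs, observations)
predictions = [slope * x + intercept for x in inputs]

# One point per sample: observation on one axis, prediction on the other.
points = list(zip(observations, predictions))
for observation, prediction in points:
    print(f"observed {observation:.2f}  predicted {prediction:.2f}")
```

The closer the points lie to the diagonal, where the prediction equals the observation, the better the model reproduces the reference data.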

plot_residuals_vs_inputs(output, input_names=(), observations=ReferenceDataset.LEARNING, use_scatter_matrix=True, filter_scatters=True, save=True, show=False, n_folds=5, samples=(), seed=None, **options)

Plot the residuals of the model versus the inputs.

Parameters:
  • output (str | tuple[str, int]) – The name of the output of interest, and possibly the component of interest; if the latter is missing, use all the components of the output.

  • input_names (str | Iterable[str] | ()) –

    The names of the inputs to plot in addition to the model data. If empty, use all the inputs.

    By default it is set to ().

  • observations (ReferenceDataset | Dataset) –

    The validation dataset, given either as a ReferenceDataset member or as a Dataset.

    By default it is set to “LEARNING”.

  • use_scatter_matrix (bool) –

    Whether the method returns a ScatterMatrix. Otherwise, it returns a list of Scatter objects.

    By default it is set to True.

  • filter_scatters (bool) –

    Whether to display only the scatters with the quantity of interest on at least one of the axes. Otherwise, display all the scatters, including those showing an input or output as a function of another input or output.

    By default it is set to True.

  • save (bool) –

    Whether to save the plots.

    By default it is set to True.

  • show (bool) –

    Whether to show the plots.

    By default it is set to False.

  • n_folds (int) –

    The number of folds. Used only in the case of cross-validation.

    By default it is set to 5.

  • samples (Sequence[int]) –

    The indices of the learning samples. If empty, use the whole learning dataset. Used only in the case of cross-validation.

    By default it is set to ().

  • seed (int | None) – The seed of the pseudo-random number generator. If None, the seed of the i-th execution is SEED+i. Used only in the case of cross-validation.

  • **options (Any) – The options of the underlying DatasetPlot.

Returns:

The plots of the residuals of the model versus the inputs.

Return type:

list[Scatter] | ScatterMatrix
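
To see why residuals are worth plotting against an input, the sketch below fits a straight line to made-up quadratic data and lists the residuals next to the input values. It is plain Python and does not call GEMSEO; the helper `fit_line` is hypothetical, and the sign convention (observation minus prediction) is an assumption of this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (toy helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up data following y = x ** 2, which a straight line cannot capture.
inputs = [-2.0, -1.0, 0.0, 1.0, 2.0]
observations = [x**2 for x in inputs]

slope, intercept = fit_line(inputs, observations)
# Assumed convention: residual = observation - prediction.
residuals = [y - (slope * x + intercept) for x, y in zip(inputs, observations)]

for x, residual in zip(inputs, residuals):
    print(f"input {x:+.1f}  residual {residual:+.2f}")
```

Here the residuals are positive at both ends of the input range and negative in the middle. A pattern like this against an input suggests that the model misses part of the dependence on that input, whereas residuals scattered around zero without structure do not.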

plot_residuals_vs_observations(output, observations=ReferenceDataset.LEARNING, use_scatter_matrix=True, filter_scatters=True, save=True, show=False, n_folds=5, samples=(), seed=None, **options)

Plot the residuals of the model versus the observations.

Parameters:
  • output (str | tuple[str, int]) – The name of the output of interest, and possibly the component of interest; if the latter is missing, use all the components of the output.

  • observations (ReferenceDataset | Dataset) –

    The validation dataset, given either as a ReferenceDataset member or as a Dataset.

    By default it is set to “LEARNING”.

  • use_scatter_matrix (bool) –

    Whether the method returns a ScatterMatrix. Otherwise, it returns a list of Scatter objects.

    By default it is set to True.

  • filter_scatters (bool) –

    Whether to display only the scatters with the quantity of interest on at least one of the axes. Otherwise, display all the scatters, including those showing an input or output as a function of another input or output.

    By default it is set to True.

  • save (bool) –

    Whether to save the plots.

    By default it is set to True.

  • show (bool) –

    Whether to show the plots.

    By default it is set to False.

  • n_folds (int) –

    The number of folds. Used only in the case of cross-validation.

    By default it is set to 5.

  • samples (Sequence[int]) –

    The indices of the learning samples. If empty, use the whole learning dataset. Used only in the case of cross-validation.

    By default it is set to ().

  • seed (int | None) – The seed of the pseudo-random number generator. If None, the seed of the i-th execution is SEED+i. Used only in the case of cross-validation.

  • **options (Any) – The options of the underlying DatasetPlot.

Returns:

The plots of the residuals of the model versus the observations.

Return type:

list[Scatter] | ScatterMatrix
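
When the reference is the cross-validation dataset, each residual comes from a model that has not seen the corresponding sample. The sketch below mimics this idea in plain Python, without calling GEMSEO. The helpers `fit_line` and `cross_validated_residuals`, the strided fold layout, and the sign convention (observation minus prediction) are assumptions made for illustration, not the library's implementation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (toy helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cross_validated_residuals(xs, ys, n_folds):
    """Return one out-of-fold residual per sample (hypothetical helper)."""
    n = len(xs)
    residuals = [0.0] * n
    for k in range(n_folds):
        # Fold k validates on samples k, k + n_folds, k + 2 * n_folds, ...
        validation = set(range(k, n, n_folds))
        learn_x = [xs[i] for i in range(n) if i not in validation]
        learn_y = [ys[i] for i in range(n) if i not in validation]
        slope, intercept = fit_line(learn_x, learn_y)
        for i in validation:
            # Assumed convention: residual = observation - prediction.
            residuals[i] = ys[i] - (slope * xs[i] + intercept)
    return residuals


# Made-up data: one input and one noisy, roughly linear output.
inputs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
observations = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0]

residuals = cross_validated_residuals(inputs, observations, n_folds=3)
# One point per sample: observation on one axis, residual on the other.
points = list(zip(observations, residuals))
```

Plotting these pairs shows whether the error grows or changes sign with the magnitude of the observed output.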