empirical module¶

Class for the empirical estimation of statistics from a dataset.

Overview¶

The EmpiricalStatistics class inherits from the abstract Statistics class and estimates statistics from a Dataset using empirical estimators.

Construction¶

An EmpiricalStatistics is built from a Dataset and, optionally, variable names. When variable names are given, statistics are computed only for these variables. Otherwise, statistics are computed for all the variables available in the dataset. Lastly, the user can give a name to the EmpiricalStatistics object. By default, this name is the concatenation of ‘EmpiricalStatistics’ and the name of the Dataset.

class gemseo.uncertainty.statistics.empirical.EmpiricalStatistics(dataset, variable_names=None, name=None)[source]¶

Bases: Statistics

A toolbox to compute statistics empirically.

Unless otherwise stated, the statistics are computed variable-wise and component-wise, i.e. variable-by-variable and component-by-component. So, for the sake of readability, the methods named as compute_statistic() return dict[str, ndarray] objects whose keys are the names of the variables and whose values are the statistic estimated for the different components.
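The variable-wise, component-wise convention can be illustrated with a minimal NumPy sketch. It does not use GEMSEO and is not the library's implementation: the `samples` dictionary and the standalone `compute_mean` function are hypothetical stand-ins that only mimic the shape of the returned object (one entry per variable, one array entry per component).

```python
import numpy as np

# Hypothetical data: 4 observations of "x" (1 component) and "y" (2 components).
# Rows are observations, columns are components.
samples = {
    "x": np.array([[1.0], [2.0], [3.0], [4.0]]),
    "y": np.array([[0.0, 10.0], [2.0, 20.0], [4.0, 30.0], [6.0, 40.0]]),
}


def compute_mean(samples: dict[str, np.ndarray]) -> dict[str, np.ndarray]:
    """Return the empirical mean of each variable, component by component."""
    # Averaging over axis 0 (the observations) keeps one value per component.
    return {name: values.mean(axis=0) for name, values in samples.items()}


mean = compute_mean(samples)
print(mean["x"])  # array with one entry: the mean of the single component of "x"
print(mean["y"])  # array with two entries: one mean per component of "y"
```

The keys of the result are the variable names and each value has as many entries as the variable has components, which is the layout described above for the compute_statistic() methods.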

Examples

>>> from gemseo import (
...     create_discipline,
...     create_parameter_space,
...     create_scenario)
>>> from gemseo.uncertainty.statistics.empirical import EmpiricalStatistics
>>>
>>> expressions = {"y1": "x1+2*x2", "y2": "x1-3*x2"}
>>> discipline = create_discipline(
...     "AnalyticDiscipline", expressions=expressions
... )
>>>
>>> parameter_space = create_parameter_space()
>>> parameter_space.add_random_variable(
...     "x1", "OTUniformDistribution", minimum=-1, maximum=1
... )
>>> parameter_space.add_random_variable(
...     "x2", "OTUniformDistribution", minimum=-1, maximum=1
... )
>>>
>>> scenario = create_scenario(
...     [discipline],
...     "DisciplinaryOpt",
...     "y1",
...     parameter_space,
...     scenario_type="DOE"
... )
>>> scenario.execute({'algo': 'OT_MONTE_CARLO', 'n_samples': 100})
>>>
>>> dataset = scenario.to_dataset(opt_naming=False)
>>>
>>> statistics = EmpiricalStatistics(dataset)
>>> mean = statistics.compute_mean()

Parameters:

dataset (Dataset) – A dataset.
variable_names (Iterable[str] | None) – The variables of interest. Default: consider all the variables available in the dataset.
name (str | None) – A name for the object. Default: use the concatenation of the class and dataset names.

compute_a_value()¶

Compute the A-value \(\text{Aval}[X]\).

The A-value is the lower bound of the left-sided tolerance interval associated with a coverage level equal to 99% and a confidence level equal to 95%.

Returns: The component-wise A-value of the different variables.
Return type: dict[str, numpy.ndarray]
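To make the definition concrete, here is a sketch of one distribution-free way to obtain such a lower bound from order statistics, for a single component. This is an illustration of the tolerance-interval definition only. It is not taken from GEMSEO, and the estimator actually used by compute_a_value() may differ. The function names `binomial_sf` and `lower_tolerance_bound` are hypothetical.

```python
from math import comb


def binomial_sf(n: int, q: float, r: int) -> float:
    """Return P(B >= r) for B following a binomial distribution (n, q)."""
    return 1.0 - sum(comb(n, k) * q**k * (1.0 - q) ** (n - k) for k in range(r))


def lower_tolerance_bound(samples, coverage=0.99, confidence=0.95):
    """Distribution-free lower bound of a left-sided tolerance interval.

    The r-th smallest sample is a valid bound when
    P(Binomial(n, 1 - coverage) >= r) >= confidence.
    The largest such rank r gives the least conservative bound.
    """
    n = len(samples)
    q = 1.0 - coverage
    rank = 0
    # The probability decreases with the rank, so stop at the first failure.
    while rank < n and binomial_sf(n, q, rank + 1) >= confidence:
        rank += 1
    if rank == 0:
        raise ValueError("Too few samples for these coverage and confidence levels.")
    return sorted(samples)[rank - 1]


# With coverage 99% and confidence 95%, at least 299 samples are needed,
# and with exactly 299 samples the bound is the sample minimum.
data = [float(i) for i in range(1, 300)]
a_value = lower_tolerance_bound(data)
```

The sketch shows why this kind of bound is demanding in terms of sample size: with fewer than 299 observations, no order statistic satisfies the 99% coverage and 95% confidence requirement without a distributional assumption.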

Examples using EmpiricalStatistics¶

Empirical estimation of statistics