statistics module
Estimation of statistics from a dataset
Overview
The abstract Statistics class implements the concept of a statistics library. It is extended by concrete classes such as EmpiricalStatistics and ParametricStatistics.
Construction
A Statistics object is built from a Dataset and, optionally, a list of variable names. If such a list is given, statistics are computed only for these variables; otherwise, they are computed for all variables. Lastly, the user can name the Statistics object. By default, the name is the concatenation of the name of the concrete class deriving from Statistics and the name of the Dataset.
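The construction rules above can be sketched with hypothetical stand-in classes (`ToyDataset` and `ToyStatistics` are illustrative names, not part of the GEMSEO API). The underscore separator in the default name is an assumption, since the text only states that the two names are concatenated.

```python
class ToyDataset:
    """Hypothetical stand-in for a Dataset: a name and named columns."""

    def __init__(self, name, columns):
        self.name = name
        self.columns = columns  # dict: variable name -> list of values


class ToyStatistics:
    """Hypothetical stand-in mimicking the construction rules above."""

    def __init__(self, dataset, variables_names=None, name=None):
        self.dataset = dataset
        # Without an explicit list, consider every variable of the dataset.
        if variables_names is None:
            variables_names = list(dataset.columns)
        self.names = list(variables_names)
        # Default name: class name and dataset name (separator is assumed).
        if name is None:
            name = f"{type(self).__name__}_{dataset.name}"
        self.name = name


data = ToyDataset("trial", {"x": [1.0, 2.0], "y": [3.0, 4.0]})
all_stats = ToyStatistics(data)
x_stats = ToyStatistics(data, variables_names=["x"], name="only_x")
```

Here `all_stats` covers both variables and receives a default name, while `x_stats` is restricted to `x` and keeps the name supplied by the user.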
Capabilities
A Statistics object returns standard descriptive and statistical measures for the different variables:
Statistics.minimum(): the minimum value,
Statistics.maximum(): the maximum value,
Statistics.range(): the difference between the maximum and minimum values,
Statistics.mean(): the expectation, a.k.a. mean value,
Statistics.moment(): the central moment, which is the expected value of a specified integer power of the deviation from the mean,
Statistics.variance(): the variance, which is the mean squared deviation from the mean value,
Statistics.standard_deviation(): the standard deviation, which is the square root of the variance,
Statistics.quantile(): the quantile associated with a probability, which is the cut point dividing the range into a first continuous interval with this given probability and a second continuous interval with the complementary probability; common q-quantiles dividing the range into q continuous intervals with equal probabilities are also implemented:
    Statistics.median(), which implements the 2-quantile (50%),
    Statistics.quartile(), whose order (1, 2 or 3) implements the 4-quantiles (respectively 25%, 50% and 75%),
    Statistics.percentile(), whose order (1, 2, …, 99) implements the 100-quantiles (1%, 2%, …, 99%),
Statistics.probability(): the probability that the random variable is larger or smaller than a certain threshold,
Statistics.tolerance_interval(): the left-sided, right-sided or both-sided tolerance interval associated with a given coverage level and a given confidence level, which is a statistical interval within which, with some confidence level, a specified proportion of the random variable realizations falls (this proportion is the coverage level),
Statistics.a_value(): the A-value, which is the lower bound of the left-sided tolerance interval associated with a coverage level equal to 99% and a confidence level equal to 95%,
Statistics.b_value(): the B-value, which is the lower bound of the left-sided tolerance interval associated with a coverage level equal to 90% and a confidence level equal to 95%.
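To make the basic measures concrete, the following sketch computes their empirical counterparts on a small sample with the Python standard library only. It does not call GEMSEO. The use of the population variance (division by n) is an assumption: the text does not say which estimator a concrete Statistics class uses.

```python
import statistics

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

minimum = min(sample)
maximum = max(sample)
value_range = maximum - minimum
mean = statistics.fmean(sample)
# Population variance: mean squared deviation from the mean (assumed estimator).
variance = statistics.pvariance(sample)
# Standard deviation: square root of the variance.
standard_deviation = statistics.pstdev(sample)
# Median: the 2-quantile, i.e. the 50% cut point.
median = statistics.median(sample)
```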

class gemseo.uncertainty.statistics.statistics.Statistics(dataset, variables_names=None, name=None)
Bases: object
Abstract class for Statistics library interface.
Constructor
 Parameters
dataset (Dataset) – dataset from which the statistics are computed.
variables_names (list(str)) – list of variable names. If None, all variables of the dataset are considered. Default: None.
name (str) – name of the object. If None, use the concatenation of the class and dataset names. Default: None.

moment(order)
Compute the central moment for a given order.
 Parameters
order (int) – order of the moment.
 Returns
moment
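As an illustration of what a central moment of a given order is, here is a minimal empirical sketch in plain Python. The helper `central_moment` is a hypothetical name and not the GEMSEO implementation; it averages the deviations from the mean raised to the requested power.

```python
def central_moment(values, order):
    """Empirical central moment: mean of (x - mean) ** order."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** order for x in values) / len(values)


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
first = central_moment(sample, 1)   # deviations from the mean cancel out
second = central_moment(sample, 2)  # equals the population variance
third = central_moment(sample, 3)   # unnormalized measure of asymmetry
```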

percentile(order)
Compute the percentile.
 Parameters
order (int) – percentile order in 1, 2, …, 99, e.g. 4.
 Returns
percentile

probability(thresh, greater)
Compute the probability associated with a threshold.
 Parameters
thresh (float) – threshold.
greater (bool) – if True, compute the probability of exceeding the threshold; if False, compute the probability of being below it.
 Returns
probability
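An empirical version of this computation can be sketched as the fraction of observations on the requested side of the threshold. The helper name `empirical_probability` is hypothetical, and the choice of a strict inequality for the exceedance case (so that both cases sum to one) is an assumption, not something the documentation specifies.

```python
def empirical_probability(values, thresh, greater=True):
    """Fraction of values above the threshold, or at or below it."""
    if greater:
        hits = sum(1 for x in values if x > thresh)
    else:
        hits = sum(1 for x in values if x <= thresh)
    return hits / len(values)


sample = [1.0, 2.0, 3.0, 4.0, 5.0]
p_above = empirical_probability(sample, 3.0, greater=True)
p_below = empirical_probability(sample, 3.0, greater=False)
```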

quantile(prob)
Compute the quantile associated with a probability.
 Parameters
prob (float) – probability between 0 and 1
 Returns
quantile
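The relation between quantiles, the median, quartiles and percentiles can be illustrated with a small empirical sketch. The helper `empirical_quantile` is a hypothetical name, and linear interpolation between order statistics is only one of several common conventions; which convention a concrete Statistics class follows is not stated here.

```python
def empirical_quantile(values, prob):
    """Quantile by linear interpolation between order statistics."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must lie in [0, 1]")
    ordered = sorted(values)
    position = prob * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])


sample = [5.0, 1.0, 4.0, 2.0, 3.0]
median = empirical_quantile(sample, 0.5)                            # 2-quantile
quartiles = [empirical_quantile(sample, k / 4) for k in (1, 2, 3)]  # 4-quantiles
tenth_percentile = empirical_quantile(sample, 10 / 100)             # a 100-quantile
```

The median, quartiles and percentiles are thus the same computation evaluated at probabilities 1/2, k/4 and k/100 respectively.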

quartile(order)
Compute a quartile.
 Parameters
order (int) – quartile order, one of 1, 2 or 3.
 Returns
quartile

tolerance_interval(coverage, confidence=0.95, side='both')
Compute the tolerance interval (TI) for a given minimum percentage of the population and a given confidence level.
 Parameters
coverage (float) – minimum percentage of the population belonging to the TI.
confidence (float) – level of confidence in [0,1]. Default: 0.95.
side (str) – kind of interval: ‘lower’ for lower-sided TI, ‘upper’ for upper-sided TI and ‘both’ for both-sided TI. Default: ‘both’.
 Returns
tolerance limits
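To give a feel for how coverage and confidence interact, the sketch below uses a classical distribution-free argument, which is not necessarily the method used by the concrete Statistics classes. For n independent observations, the sample minimum is a lower tolerance bound with the given coverage and confidence as soon as 1 - coverage^n >= confidence. The helper `min_sample_size` is a hypothetical name.

```python
import math


def min_sample_size(coverage, confidence):
    """Smallest n for which the sample minimum is a lower tolerance bound.

    Distribution-free argument: the sample minimum lies below the
    (1 - coverage)-quantile with probability 1 - coverage ** n.
    """
    return math.ceil(math.log(1.0 - confidence) / math.log(coverage))


# Settings of the B-value (90% coverage) and A-value (99% coverage),
# both at a 95% confidence level.
n_b_value = min_sample_size(coverage=0.90, confidence=0.95)
n_a_value = min_sample_size(coverage=0.99, confidence=0.95)
```

Under this argument, raising the coverage from 90% to 99% increases the required sample size roughly tenfold, which shows why the A-value is much more demanding to estimate than the B-value.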