Probability distribution#

The package distributions#

Capabilities to create and manipulate probability distributions.

This package contains:

Lastly, the class OTDistributionFitter makes it possible to fit an OTDistribution to data using OpenTURNS.

The base class BaseDistribution#

The base class for probability distributions.

The base class BaseDistribution implements the concept of a probability distribution, which is a mathematical function giving the probabilities of occurrence of the different possible outcomes of a random variable for an experiment. The normal distribution, with its famous bell curve, is a well-known example of a probability distribution.

See also

This base class is extended by concrete classes, such as OTDistribution interfacing the OpenTURNS probability distributions and SPDistribution interfacing the SciPy probability distributions.

The BaseDistribution of a given uncertain variable is built from a distribution name (e.g. 'Normal' for OpenTURNS or 'norm' for SciPy), a set of parameters and optionally a standard representation of these parameters.

From a BaseDistribution, we can easily get statistics, such as mean and standard_deviation. We can also get the numerical range and the mathematical support.
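As a rough illustration of these statistics, the sketch below queries SciPy directly rather than going through BaseDistribution. The `loc` and `scale` values are arbitrary assumptions, and the SciPy methods `mean()` and `std()` only stand in for the `mean` and `standard_deviation` properties documented here.

```python
from scipy.stats import norm

# A frozen SciPy normal distribution; loc=2 and scale=3 are arbitrary choices.
distribution = norm(loc=2.0, scale=3.0)

# Expectation and standard deviation of the random variable.
mean = distribution.mean()
standard_deviation = distribution.std()

print(mean, standard_deviation)
```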

Note

We call mathematical support the set of values that the random variable can take in theory, e.g. \(]-\infty,+\infty[\) for a Gaussian variable, and numerical range the set of values that it can take in practice, taking into account the values rounded to zero in double precision. Both support and range are described in terms of lower and upper bounds.
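The difference between support and range can be illustrated with SciPy alone. This is a sketch of the idea, not the way the library computes its bounds: the threshold `tiny_level` and the quantile-based convention for the numerical range are assumptions made for the example.

```python
from scipy.stats import norm

# Mathematical support of a standard Gaussian variable: the whole real line.
math_lower_bound, math_upper_bound = norm.support()

# In double precision, the CDF saturates at 0 and 1 long before infinity,
# so values beyond some finite bounds are never reached in practice.
cdf_far_left = norm.cdf(-40.0)  # underflows to exactly 0.0
cdf_far_right = norm.cdf(10.0)  # rounds to exactly 1.0

# One possible (assumed) convention for a numerical range:
# the quantiles at a tiny probability level and at its complement.
tiny_level = 1e-12  # hypothetical threshold, chosen for illustration
num_lower_bound = norm.ppf(tiny_level)
num_upper_bound = norm.ppf(1.0 - tiny_level)

print(math_lower_bound, math_upper_bound)
print(num_lower_bound, num_upper_bound)
```

The support bounds are infinite, whereas the assumed numerical bounds are finite and lie a few standard deviations from the mean.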

We can also evaluate the cumulative distribution function (BaseDistribution.compute_cdf()) for the different marginals of the random variable, as well as the inverse cumulative distribution function (BaseDistribution.compute_inverse_cdf()). We can also plot them (plot()).
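The relationship between the two functions can be checked with a short SciPy sketch. Here `cdf` and `ppf` are SciPy's own method names, used as stand-ins for `compute_cdf()` and `compute_inverse_cdf()`; the standard normal distribution is an arbitrary choice.

```python
from scipy.stats import norm

distribution = norm()  # standard normal, used only as an example

# CDF: probability that the random variable is lower than or equal to 1.
probability = distribution.cdf(1.0)

# Inverse CDF: maps that probability back to the original value.
quantile = distribution.ppf(probability)

# The inverse CDF at 0.5 is the median, which is 0 for a standard normal.
median = distribution.ppf(0.5)

print(probability, quantile, median)
```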

Lastly, we can compute realizations of the random variable using the compute_samples() method.
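Sampling can be sketched in the same spirit with SciPy's `rvs` method, which plays the role of `compute_samples()` in this illustration. The seed and the sample size are arbitrary assumptions.

```python
import numpy as np
from scipy.stats import norm

n_samples = 1000
rng = np.random.default_rng(seed=0)  # fixed seed for reproducibility

# Draw realizations of a standard normal random variable.
samples = norm(loc=0.0, scale=1.0).rvs(size=n_samples, random_state=rng)

# The empirical moments approach the theoretical ones as n_samples grows.
empirical_mean = samples.mean()
empirical_std = samples.std(ddof=1)

print(samples.shape, empirical_mean, empirical_std)
```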

class BaseDistribution(interfaced_distribution, parameters, standard_parameters=mappingproxy({}), **options)[source]

The base class for probability distributions.

By default, this base class models the probability distribution of a scalar random variable. Child classes need to be adapted to model other types of random variables, e.g. random vectors (see BaseJointDistribution).

Parameters:
  • interfaced_distribution (str) -- The name of the probability distribution, typically the name of a class wrapped from an external library, such as "Normal" for OpenTURNS or "norm" for SciPy.

  • parameters (_ParametersT) -- The parameters of the probability distribution.

  • standard_parameters (StandardParametersType) --

    The parameters of the probability distribution used for string representation only (use parameters for computation). If empty, use parameters instead. For instance, let us consider an interfaced distribution named "Dirac" with positional parameters (this is the case of OTDistribution). Then, the string representation of BaseDistribution("Dirac", (1,), {"loc": 1}) is "Dirac(loc=1)" while the string representation of BaseDistribution("Dirac", (1,)) is "Dirac(1)". The same mechanism works for keyword parameters (this is the case of SPDistribution).

    By default it is set to {}.

  • **options (Any) -- The options of the probability distribution.
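The string-representation rule described for standard_parameters can be sketched as follows. The helper `format_distribution` is hypothetical and not part of the library; it only mimics the documented behaviour, where standard_parameters takes precedence when it is not empty.

```python
from collections.abc import Mapping


def format_distribution(name, parameters, standard_parameters=None):
    """Hypothetical helper mimicking the documented string representation."""
    # Fall back to the computation parameters when no standard ones are given.
    shown = standard_parameters or parameters
    if isinstance(shown, Mapping):
        # Keyword parameters are rendered as key=value pairs.
        arguments = ", ".join(f"{key}={value}" for key, value in shown.items())
    else:
        # Positional parameters are rendered as plain values.
        arguments = ", ".join(str(value) for value in shown)
    return f"{name}({arguments})"


with_standard = format_distribution("Dirac", (1,), {"loc": 1})
without_standard = format_distribution("Dirac", (1,))

print(with_standard)     # Dirac(loc=1)
print(without_standard)  # Dirac(1)
```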

abstract compute_cdf(value)[source]

Evaluate the cumulative distribution function (CDF).

Parameters:

value (_VariableT) -- The value of the random variable for which to evaluate the CDF.

Returns:

The value of the CDF.

Return type:

_VariableT

abstract compute_inverse_cdf(value)[source]

Evaluate the inverse cumulative distribution function (ICDF).

Parameters:

value (_VariableT) -- The probability for which to evaluate the ICDF.

Returns:

The value of the ICDF.

Return type:

_VariableT

abstract compute_samples(n_samples=1)[source]

Sample the random variable.

Parameters:

n_samples (int) --

The number of samples.

By default it is set to 1.

Returns:

The samples of the random variable.

Return type:

RealArray
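To show how a child class might fill in the three abstract methods above, here is a minimal sketch built on SciPy. The classes `SketchDistribution` and `SketchSciPyDistribution` are hypothetical names introduced for this example; they are not the library's BaseDistribution or SPDistribution, and their constructor is a simplifying assumption.

```python
from abc import ABC, abstractmethod

import numpy as np
from scipy import stats


class SketchDistribution(ABC):
    """Hypothetical stand-in for the documented abstract interface."""

    @abstractmethod
    def compute_cdf(self, value): ...

    @abstractmethod
    def compute_inverse_cdf(self, value): ...

    @abstractmethod
    def compute_samples(self, n_samples=1): ...


class SketchSciPyDistribution(SketchDistribution):
    """Hypothetical child class delegating to a frozen SciPy distribution."""

    def __init__(self, interfaced_distribution, parameters):
        # Look up the SciPy distribution by name, e.g. "norm", and freeze it.
        self.distribution = getattr(stats, interfaced_distribution)(**parameters)

    def compute_cdf(self, value):
        return self.distribution.cdf(value)

    def compute_inverse_cdf(self, value):
        return self.distribution.ppf(value)

    def compute_samples(self, n_samples=1):
        return np.asarray(self.distribution.rvs(size=n_samples))


normal = SketchSciPyDistribution("norm", {"loc": 0.0, "scale": 1.0})
cdf_at_zero = normal.compute_cdf(0.0)
samples = normal.compute_samples(5)

print(cdf_at_zero, samples.shape)
```

Because the base class declares its methods as abstract, it cannot be instantiated directly; only a child class implementing all three methods can.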

DEFAULT_VARIABLE_NAME: Final[str] = 'x'

The default name of the variable.

distribution: _DistributionT

The probability distribution of the random variable.

math_lower_bound: _VariableT

The mathematical lower bound of the random variable.

math_upper_bound: _VariableT

The mathematical upper bound of the random variable.

abstract property mean: _VariableT

The expectation of the random variable.

num_lower_bound: _VariableT

The numerical lower bound of the random variable.

num_upper_bound: _VariableT

The numerical upper bound of the random variable.

property range: RealArray

The numerical range.

The numerical range is the interval defined by the lower and upper bounds numerically reachable by the random variable.

abstract property standard_deviation: _VariableT

The standard deviation of the random variable.

property support: RealArray

The mathematical support.

The mathematical support is the interval defined by the theoretical lower and upper bounds of the random variable.

transformation: str

The transformation applied to the random variable noted "x".

E.g. "sin(x)".

Examples#

See the examples about probability distributions.