.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/uncertainty/distributions/plot_ot_distfactory.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_examples_uncertainty_distributions_plot_ot_distfactory.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_uncertainty_distributions_plot_ot_distfactory.py:


Fitting a distribution from data based on OpenTURNS
===================================================

.. GENERATED FROM PYTHON SOURCE LINES 25-35

.. code-block:: Python

    from __future__ import annotations

    from numpy.random import default_rng

    from gemseo import configure_logger
    from gemseo.uncertainty.distributions.openturns.fitting import OTDistributionFitter

    configure_logger()

.. GENERATED FROM PYTHON SOURCE LINES 36-42

In this example,
we will see how to fit a distribution from data.
For purely pedagogical reasons,
we consider a synthetic dataset made of 100 realizations of *'X'*,
a random variable distributed according to the standard normal distribution.
These samples are generated with the NumPy library.

.. GENERATED FROM PYTHON SOURCE LINES 42-46

.. code-block:: Python

    rng = default_rng(1)
    data = rng.normal(size=100)
    variable_name = "X"

.. GENERATED FROM PYTHON SOURCE LINES 47-51

Create a distribution fitter
----------------------------

Then,
we create an :class:`.OTDistributionFitter` from these data and this variable name:

.. GENERATED FROM PYTHON SOURCE LINES 51-53

.. code-block:: Python

    fitter = OTDistributionFitter(variable_name, data)

.. GENERATED FROM PYTHON SOURCE LINES 54-58

Fit a distribution
------------------

From this distribution fitter,
we can easily fit any distribution available in the OpenTURNS library:

.. GENERATED FROM PYTHON SOURCE LINES 58-60

.. code-block:: Python

    fitter.available_distributions

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ['Arcsine', 'Beta', 'Burr', 'Chi', 'ChiSquare', 'Dirichlet', 'Exponential', 'FisherSnedecor', 'Frechet', 'Gamma', 'GeneralizedPareto', 'Gumbel', 'Histogram', 'InverseNormal', 'Laplace', 'LogNormal', 'LogUniform', 'Logistic', 'MeixnerDistribution', 'Normal', 'Pareto', 'Rayleigh', 'Rice', 'Student', 'Trapezoidal', 'Triangular', 'TruncatedNormal', 'Uniform', 'VonMises', 'WeibullMax', 'WeibullMin']

.. GENERATED FROM PYTHON SOURCE LINES 61-63

For example,
we can fit a normal distribution:

.. GENERATED FROM PYTHON SOURCE LINES 63-66

.. code-block:: Python

    norm_dist = fitter.fit("Normal")
    norm_dist

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Normal([-0.0736121,0.855847])

.. GENERATED FROM PYTHON SOURCE LINES 67-68

or an exponential one:

.. GENERATED FROM PYTHON SOURCE LINES 68-71

.. code-block:: Python

    exp_dist = fitter.fit("Exponential")
    exp_dist

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Exponential([0.375357,-2.73774])

.. GENERATED FROM PYTHON SOURCE LINES 72-75

The returned object is an :class:`.OTDistribution`
that we can represent graphically
in terms of probability density and cumulative distribution functions:

.. GENERATED FROM PYTHON SOURCE LINES 75-77

.. code-block:: Python

    norm_dist.plot()

.. image-sg:: /examples/uncertainty/distributions/images/sphx_glr_plot_ot_distfactory_001.png
   :alt: Probability distribution of X
   :srcset: /examples/uncertainty/distributions/images/sphx_glr_plot_ot_distfactory_001.png
   :class: sphx-glr-single-img
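
As a rough illustration of what fitting a normal distribution and evaluating
its two characteristic functions involves,
the following standard-library sketch estimates a mean and a standard deviation from a sample,
then evaluates the probability density function and the cumulative distribution function
of the fitted distribution.
It uses neither GEMSEO nor OpenTURNS:
the moment-based estimator, the ``random`` generator and all variable names
are assumptions made for illustration,
so the estimates differ from the values shown above.

```python
import random
from statistics import NormalDist, fmean, stdev

# Hypothetical synthetic dataset: 100 draws from a standard normal distribution.
# The generator differs from NumPy's, so the sample is not the one used above.
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(100)]

# Estimate the two parameters of a normal distribution from the sample.
# Assumption: plain moment estimates; a library estimator may differ.
mu = fmean(data)
sigma = stdev(data)  # sample standard deviation (n - 1 denominator)
fitted = NormalDist(mu, sigma)

# Evaluate the probability density function (PDF)
# and the cumulative distribution function (CDF) at the estimated mean.
pdf_at_mean = fitted.pdf(mu)
cdf_at_mean = fitted.cdf(mu)

print(f"mu={mu:.3f}, sigma={sigma:.3f}")
print(f"pdf(mu)={pdf_at_mean:.3f}, cdf(mu)={cdf_at_mean:.3f}")
```

Evaluating ``fitted.pdf`` and ``fitted.cdf`` over a grid of points
would give the two curves that such a plot typically displays.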
.. GENERATED FROM PYTHON SOURCE LINES 78-85

Measure the goodness-of-fit
---------------------------

We can also measure the goodness-of-fit of a distribution
by means of a fitting criterion.
Some fitting criteria are based on significance tests,
each made of a test statistic, a p-value and a significance level.
We can access the names of all the available fitting criteria:

.. GENERATED FROM PYTHON SOURCE LINES 85-86

.. code-block:: Python

    fitter.available_criteria

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ['BIC', 'ChiSquared', 'Kolmogorov']

.. GENERATED FROM PYTHON SOURCE LINES 87-88

or only those of the significance tests:

.. GENERATED FROM PYTHON SOURCE LINES 88-90

.. code-block:: Python

    fitter.available_significance_tests

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [, ]

.. GENERATED FROM PYTHON SOURCE LINES 91-95

For example,
we can measure the goodness-of-fit of the previous distributions
by considering the Bayesian information criterion (BIC):

.. GENERATED FROM PYTHON SOURCE LINES 95-101

.. code-block:: Python

    quality_measure = fitter.compute_measure(norm_dist, "BIC")
    "Normal", quality_measure
    quality_measure = fitter.compute_measure(exp_dist, "BIC")
    "Exponential", quality_measure

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ('Exponential', 3.9597553873428653)

.. GENERATED FROM PYTHON SOURCE LINES 102-107

Here,
the fitted normal distribution is better than the fitted exponential one
in terms of BIC.
We can also use the Kolmogorov fitting criterion,
which is based on the Kolmogorov significance test:

.. GENERATED FROM PYTHON SOURCE LINES 107-112

.. code-block:: Python

    acceptable, details = fitter.compute_measure(norm_dist, "Kolmogorov")
    "Normal", acceptable, details
    acceptable, details = fitter.compute_measure(exp_dist, "Kolmogorov")
    "Exponential", acceptable, details

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ('Exponential', False, {'p-value': 4.8646243991869847e-11, 'statistics': 0.3434922163146683, 'level': 0.05})

.. GENERATED FROM PYTHON SOURCE LINES 113-126

In this case,
the :meth:`.OTDistributionFitter.compute_measure` method
returns a tuple with two values:

1. a boolean indicating whether the measured distribution is acceptable to model the data,
2. a dictionary containing the test statistic,
   the p-value and the significance level.

.. note::

   For significance tests,
   we can also change the significance level,
   whose default value is 0.05.
   For that, use the ``level`` argument.

.. GENERATED FROM PYTHON SOURCE LINES 128-146

Select an optimal distribution
------------------------------

Lastly,
we can also select an optimal :class:`.OTDistribution`
based on a collection of distribution names,
a fitting criterion,
a significance level
and a selection criterion:

- 'best': select the distribution minimizing
  (or maximizing, depending on the criterion) the criterion,
- 'first': select the first distribution
  for which the criterion is greater (or lower, depending on the criterion)
  than the level.

By default,
the :meth:`.OTDistributionFitter.select` method
uses a significance level equal to 0.05
and the 'best' selection criterion.

.. GENERATED FROM PYTHON SOURCE LINES 146-148

.. code-block:: Python

    selected_distribution = fitter.select(["Exponential", "Normal"], "Kolmogorov")
    selected_distribution

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Normal([-0.0736121,0.855847])

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.155 seconds)


.. _sphx_glr_download_examples_uncertainty_distributions_plot_ot_distfactory.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_ot_distfactory.ipynb <plot_ot_distfactory.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_ot_distfactory.py <plot_ot_distfactory.py>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    Gallery generated by Sphinx-Gallery