.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/uncertainty/distributions/plot_ot_distfactory.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_uncertainty_distributions_plot_ot_distfactory.py:

Fitting a distribution from data based on OpenTURNS
===================================================

.. GENERATED FROM PYTHON SOURCE LINES 25-37

.. code-block:: Python

    from __future__ import annotations

    from numpy.random import default_rng

    from gemseo import configure_logger
    from gemseo.uncertainty.distributions.openturns.distribution_fitter import (
        OTDistributionFitter,
    )

    configure_logger()

.. GENERATED FROM PYTHON SOURCE LINES 38-44

In this example,
we will see how to fit a distribution from data.
For purely pedagogical reasons,
we consider a synthetic dataset made of 100 realizations of *'X'*,
a random variable distributed according to the standard normal distribution.
These samples are generated with the NumPy library.

.. GENERATED FROM PYTHON SOURCE LINES 44-48

.. code-block:: Python

    rng = default_rng(1)
    data = rng.normal(size=100)
    variable_name = "X"

.. GENERATED FROM PYTHON SOURCE LINES 49-53

Create a distribution fitter
----------------------------
Then,
we create an :class:`.OTDistributionFitter`
from these data and this variable name:

.. GENERATED FROM PYTHON SOURCE LINES 53-55

.. code-block:: Python

    fitter = OTDistributionFitter(variable_name, data)

.. GENERATED FROM PYTHON SOURCE LINES 56-60

Fit a distribution
------------------
From this distribution fitter,
we can fit any distribution available in the OpenTURNS library.
The names of these distributions are listed by ``available_distributions``:

.. GENERATED FROM PYTHON SOURCE LINES 60-62

.. code-block:: Python

    fitter.available_distributions

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    ['Arcsine', 'Beta', 'Burr', 'Chi', 'ChiSquare', 'Dirichlet', 'Exponential', 'FisherSnedecor', 'Frechet', 'Gamma', 'GeneralizedPareto', 'Gumbel', 'Histogram', 'InverseNormal', 'Laplace', 'LogNormal', 'LogUniform', 'Logistic', 'MeixnerDistribution', 'Normal', 'Pareto', 'Rayleigh', 'Rice', 'Student', 'Trapezoidal', 'Triangular', 'TruncatedNormal', 'Uniform', 'VonMises', 'WeibullMax', 'WeibullMin']

.. GENERATED FROM PYTHON SOURCE LINES 63-65

For example,
we can fit a normal distribution:

.. GENERATED FROM PYTHON SOURCE LINES 65-68

.. code-block:: Python

    norm_dist = fitter.fit("Normal")
    norm_dist

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Normal(-0.07361212127294708, 0.8558467188443057)

.. GENERATED FROM PYTHON SOURCE LINES 69-70

or an exponential one:

.. GENERATED FROM PYTHON SOURCE LINES 70-73

.. code-block:: Python

    exp_dist = fitter.fit("Exponential")
    exp_dist

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Exponential(0.3753570045811942, -2.7377425032695566)

.. GENERATED FROM PYTHON SOURCE LINES 74-77

The returned object is an :class:`.OTDistribution`
that we can represent graphically
in terms of its probability density function
and cumulative distribution function:

.. GENERATED FROM PYTHON SOURCE LINES 77-79

.. code-block:: Python

    norm_dist.plot()

.. image-sg:: /examples/uncertainty/distributions/images/sphx_glr_plot_ot_distfactory_001.png
   :alt: Normal(-0.07361212127294708, 0.8558467188443057)
   :srcset: /examples/uncertainty/distributions/images/sphx_glr_plot_ot_distfactory_001.png
   :class: sphx-glr-single-img

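To make the fitting step above more concrete,
here is a minimal sketch, using only the Python standard library,
of what fitting a normal distribution to a sample can amount to:
estimating the mean and the standard deviation from the data.
The function ``fit_normal`` is a hypothetical helper introduced for illustration;
it is not part of GEMSEO or OpenTURNS,
and the estimators actually used by OpenTURNS may differ.

.. code-block:: Python

    import math
    import random


    def fit_normal(samples):
        """Estimate the mean and standard deviation of a sample.

        Hypothetical helper: uses the sample mean
        and the unbiased (n - 1) variance estimator.
        """
        n = len(samples)
        mean = sum(samples) / n
        variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
        return mean, math.sqrt(variance)


    # Synthetic dataset: 100 realizations of a standard normal variable.
    # The seed is arbitrary and only makes the sketch reproducible.
    random.seed(1)
    samples = [random.gauss(0.0, 1.0) for _ in range(100)]

    mu, sigma = fit_normal(samples)
    print(f"Normal({mu}, {sigma})")

With 100 samples,
the estimated parameters are expected to be close to,
but not exactly equal to,
the true values 0 and 1,
as is the case for the normal distribution fitted above.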
.. GENERATED FROM PYTHON SOURCE LINES 80-87

Measure the goodness-of-fit
---------------------------
We can also measure the goodness-of-fit of a distribution
by means of a fitting criterion.
Some fitting criteria are based on significance tests
made of a test statistic, a p-value and a significance level.
We can access the names of all the available fitting criteria:

.. GENERATED FROM PYTHON SOURCE LINES 87-88

.. code-block:: Python

    fitter.available_criteria

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    ['BIC', 'ChiSquared', 'Kolmogorov']

.. GENERATED FROM PYTHON SOURCE LINES 89-90

or only those of the significance tests:

.. GENERATED FROM PYTHON SOURCE LINES 90-92

.. code-block:: Python

    fitter.available_significance_tests

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    ['ChiSquared', 'Kolmogorov']

.. GENERATED FROM PYTHON SOURCE LINES 93-97

For example,
we can measure the goodness-of-fit of the previous distributions
by considering the `Bayesian information criterion (BIC) `_:

.. GENERATED FROM PYTHON SOURCE LINES 97-103

.. code-block:: Python

    quality_measure = fitter.compute_measure(norm_dist, "BIC")
    "Normal", quality_measure
    quality_measure = fitter.compute_measure(exp_dist, "BIC")
    "Exponential", quality_measure

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    ('Exponential', 3.9597553873428653)

.. GENERATED FROM PYTHON SOURCE LINES 104-109

Here,
the fitted normal distribution is better than the fitted exponential one
in terms of BIC (the lower the BIC, the better the fit).
We can also use the Kolmogorov fitting criterion,
which is based on the Kolmogorov significance test:

.. GENERATED FROM PYTHON SOURCE LINES 109-114

.. code-block:: Python

    acceptable, details = fitter.compute_measure(norm_dist, "Kolmogorov")
    "Normal", acceptable, details
    acceptable, details = fitter.compute_measure(exp_dist, "Kolmogorov")
    "Exponential", acceptable, details

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    ('Exponential', False, {'p-value': 4.864624399187062e-11, 'statistics': 0.3434922163146683, 'level': 0.05})

.. GENERATED FROM PYTHON SOURCE LINES 115-128

In this case,
the :meth:`.OTDistributionFitter.compute_measure` method
returns a tuple with two values:

1. a boolean indicating whether the measured distribution
   is acceptable to model the data,
2. a dictionary containing the test statistic,
   the p-value and the significance level.

.. note::
   The significance level used by the significance tests
   defaults to 0.05.
   To change it, use the ``level`` argument.

.. GENERATED FROM PYTHON SOURCE LINES 130-148

Select an optimal distribution
------------------------------
Lastly,
we can also select an optimal :class:`.OTDistribution`
based on a collection of distribution names,
a fitting criterion,
a significance level
and a selection criterion:

- 'best': select the distribution
  minimizing (or maximizing, depending on the criterion) the criterion,
- 'first': select the first distribution
  for which the criterion is greater (or lower, depending on the criterion)
  than the level.

By default,
the :meth:`.OTDistributionFitter.select` method
uses a significance level equal to 0.05
and the 'best' selection criterion.

.. GENERATED FROM PYTHON SOURCE LINES 148-150

.. code-block:: Python

    selected_distribution = fitter.select(["Exponential", "Normal"], "Kolmogorov")
    selected_distribution

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Normal(-0.07361212127294708, 0.8558467188443057)

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.067 seconds)

.. _sphx_glr_download_examples_uncertainty_distributions_plot_ot_distfactory.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_ot_distfactory.ipynb `

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_ot_distfactory.py `

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_ot_distfactory.zip `

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_