.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/uncertainty/distributions/plot_ot_distfactory.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_uncertainty_distributions_plot_ot_distfactory.py:

Fitting a distribution from data based on OpenTURNS
===================================================

.. GENERATED FROM PYTHON SOURCE LINES 25-34

.. code-block:: default

    from __future__ import annotations

    from gemseo import configure_logger
    from gemseo.uncertainty.distributions.openturns.fitting import OTDistributionFitter
    from numpy.random import randn
    from numpy.random import seed

    configure_logger()

.. GENERATED FROM PYTHON SOURCE LINES 35-41

In this example,
we will see how to fit a distribution from data.
For purely pedagogical reasons,
we consider a synthetic dataset made of 100 realizations of *'X'*,
a random variable distributed according to the standard normal distribution.
These samples are generated with the NumPy library.

.. GENERATED FROM PYTHON SOURCE LINES 41-45

.. code-block:: default

    seed(1)
    data = randn(100)
    variable_name = "X"

.. GENERATED FROM PYTHON SOURCE LINES 46-50

Create a distribution fitter
----------------------------
Then,
we create an :class:`.OTDistributionFitter` from these data and this variable name:

.. GENERATED FROM PYTHON SOURCE LINES 50-52

.. code-block:: default

    fitter = OTDistributionFitter(variable_name, data)

.. GENERATED FROM PYTHON SOURCE LINES 53-57

Fit a distribution
------------------
From this distribution fitter,
we can easily fit any distribution available in the OpenTURNS library:

.. GENERATED FROM PYTHON SOURCE LINES 57-59

.. code-block:: default

    print(fitter.available_distributions)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ['Arcsine', 'Beta', 'Burr', 'Chi', 'ChiSquare', 'Dirichlet', 'Exponential', 'FisherSnedecor', 'Frechet', 'Gamma', 'GeneralizedPareto', 'Gumbel', 'Histogram', 'InverseNormal', 'Laplace', 'LogNormal', 'LogUniform', 'Logistic', 'MeixnerDistribution', 'Normal', 'Pareto', 'Rayleigh', 'Rice', 'Student', 'Trapezoidal', 'Triangular', 'TruncatedNormal', 'Uniform', 'VonMises', 'WeibullMax', 'WeibullMin']

.. GENERATED FROM PYTHON SOURCE LINES 60-62

For example,
we can fit a normal distribution:

.. GENERATED FROM PYTHON SOURCE LINES 62-65

.. code-block:: default

    norm_dist = fitter.fit("Normal")
    print(norm_dist)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Normal([0.0605829,0.889615])

.. GENERATED FROM PYTHON SOURCE LINES 66-67

or an exponential one:

.. GENERATED FROM PYTHON SOURCE LINES 67-70

.. code-block:: default

    exp_dist = fitter.fit("Exponential")
    print(exp_dist)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Exponential([0.419342,-2.3241])

.. GENERATED FROM PYTHON SOURCE LINES 71-74

The returned object is an :class:`.OTDistribution`
that we can represent graphically
in terms of probability density function and cumulative distribution function:

.. GENERATED FROM PYTHON SOURCE LINES 74-76

.. code-block:: default

    norm_dist.plot()

.. image-sg:: /examples/uncertainty/distributions/images/sphx_glr_plot_ot_distfactory_001.png
   :alt: Probability distribution of X
   :srcset: /examples/uncertainty/distributions/images/sphx_glr_plot_ot_distfactory_001.png
   :class: sphx-glr-single-img

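
As a cross-check of the fitted normal distribution,
the following minimal sketch recomputes the sample mean
and the sample standard deviation of the same dataset with NumPy alone.
It assumes that the two printed parameters are the mean and the standard deviation,
and that the standard deviation is estimated with the ``n - 1`` denominator;
both are assumptions suggested by the printed values,
not statements about the OpenTURNS implementation.

.. code-block:: python

    import numpy as np

    # Same synthetic dataset as in the example: seed 1, 100 standard normal draws.
    np.random.seed(1)
    data = np.random.randn(100)

    # Moment estimate of the location parameter.
    mean = data.mean()
    # ddof=1 selects the n - 1 denominator (assumed to match the fitted value).
    std = data.std(ddof=1)

    print(f"mean = {mean:.6g}, standard deviation = {std:.6g}")

Under these assumptions,
the two values should agree with ``Normal([0.0605829,0.889615])`` up to rounding.
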
.. GENERATED FROM PYTHON SOURCE LINES 77-84

Measure the goodness-of-fit
---------------------------
We can also measure the goodness-of-fit of a distribution
by means of a fitting criterion.
Some fitting criteria are based on significance tests
made of a test statistic, a p-value and a significance level.
We can access the names of the available fitting criteria
and of those based on significance tests:

.. GENERATED FROM PYTHON SOURCE LINES 84-87

.. code-block:: default

    print(fitter.available_criteria)
    print(fitter.available_significance_tests)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ['BIC', 'ChiSquared', 'Kolmogorov']
    [, ]

.. GENERATED FROM PYTHON SOURCE LINES 88-92

For example,
we can measure the goodness-of-fit of the previous distributions
by considering the Bayesian information criterion (BIC):

.. GENERATED FROM PYTHON SOURCE LINES 92-98

.. code-block:: default

    quality_measure = fitter.compute_measure(norm_dist, "BIC")
    print("Normal: ", quality_measure)

    quality_measure = fitter.compute_measure(exp_dist, "BIC")
    print("Exponential: ", quality_measure)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Normal:  2.5939451287694295
    Exponential:  3.7381346286469945

.. GENERATED FROM PYTHON SOURCE LINES 99-104

Here,
the fitted normal distribution is better than the fitted exponential one
in terms of BIC, since its value is lower.
We can also use the Kolmogorov fitting criterion,
which is based on the Kolmogorov significance test:

.. GENERATED FROM PYTHON SOURCE LINES 104-109

.. code-block:: default

    acceptable, details = fitter.compute_measure(norm_dist, "Kolmogorov")
    print("Normal: ", acceptable, details)
    acceptable, details = fitter.compute_measure(exp_dist, "Kolmogorov")
    print("Exponential: ", acceptable, details)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Normal:  True {'p-value': 0.9879299613543082, 'statistics': 0.04330972976650932, 'level': 0.05}
    Exponential:  False {'p-value': 5.628454180958696e-11, 'statistics': 0.34248997332293696, 'level': 0.05}

.. GENERATED FROM PYTHON SOURCE LINES 110-122

In this case,
the :meth:`.OTDistributionFitter.compute_measure` method returns a tuple with two values:

1. a boolean indicating whether the measured distribution is acceptable to model the data,
2. a dictionary containing the test statistic, the p-value and the significance level.

.. note::
   The significance level of the significance tests defaults to 0.05.
   To change it, use the :code:`level` argument.

.. GENERATED FROM PYTHON SOURCE LINES 124-142

Select an optimal distribution
------------------------------
Lastly,
we can also select an optimal :class:`.OTDistribution`
based on a collection of distribution names,
a fitting criterion,
a significance level
and a selection criterion:

- 'best': select the distribution minimizing
  (or maximizing, depending on the criterion) the criterion,
- 'first': select the first distribution
  for which the criterion is greater (or lower, depending on the criterion)
  than the level.

By default,
the :meth:`.OTDistributionFitter.select` method
uses a significance level equal to 0.05
and the 'best' selection criterion.

.. GENERATED FROM PYTHON SOURCE LINES 142-144

.. code-block:: default

    selected_distribution = fitter.select(["Exponential", "Normal"], "Kolmogorov")
    print(selected_distribution)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Normal([0.0605829,0.889615])

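
The 'best' and 'first' selection criteria described above
can be illustrated with a minimal, library-independent sketch.
The function ``select_name`` and its arguments are hypothetical
and are not part of GEMSEO.
The sketch assumes a criterion for which larger values are better,
such as the p-value of a significance test,
and it reuses rounded versions of the Kolmogorov p-values printed earlier.
Raising an error when no candidate passes the test is a choice of this sketch only.

.. code-block:: python

    from __future__ import annotations


    def select_name(
        p_values: dict[str, float],
        level: float = 0.05,
        selection_criterion: str = "best",
    ) -> str:
        """Pick a distribution name from p-values (hypothetical helper)."""
        if selection_criterion == "best":
            # Keep the candidate with the largest p-value.
            return max(p_values, key=p_values.get)
        if selection_criterion == "first":
            # Keep the first candidate whose p-value exceeds the level.
            for name, p_value in p_values.items():
                if p_value > level:
                    return name
            raise ValueError("No candidate passes the test at this level.")
        raise ValueError(f"Unknown selection criterion: {selection_criterion}")


    # Rounded p-values of the Kolmogorov test printed earlier in this example.
    p_values = {"Exponential": 5.63e-11, "Normal": 0.988}
    best = select_name(p_values, selection_criterion="best")
    first = select_name(p_values, selection_criterion="first")
    print(best, first)

With these p-values, both criteria return ``"Normal"``:
it has the largest p-value,
and it is also the first candidate whose p-value exceeds 0.05.
The two criteria differ when several candidates pass the test,
in which case 'first' depends on the order of the candidate names.
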