.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/uncertainty/statistics/plot_param_stats.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here ` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_uncertainty_statistics_plot_param_stats.py:


Parametric estimation of statistics
===================================

In this example,
we want to estimate statistics from synthetic data.
These data are 500 realizations of x_0, x_1, x_2 and x_3
distributed in the following way:

- x_0: standard uniform distribution,
- x_1: standard normal distribution,
- x_2: standard Weibull distribution,
- x_3: standard exponential distribution.

These samples are generated with the NumPy library.

.. GENERATED FROM PYTHON SOURCE LINES 37-49

.. code-block:: default

    from gemseo.api import configure_logger
    from gemseo.api import create_dataset
    from gemseo.uncertainty.api import create_statistics
    from numpy import vstack
    from numpy.random import exponential
    from numpy.random import normal
    from numpy.random import rand
    from numpy.random import seed
    from numpy.random import weibull

    configure_logger()

.. GENERATED FROM PYTHON SOURCE LINES 50-52

Create synthetic data
---------------------

.. GENERATED FROM PYTHON SOURCE LINES 52-68

.. code-block:: default

    seed(0)
    n_samples = 500
    uniform_rand = rand(n_samples)
    normal_rand = normal(size=n_samples)
    weibull_rand = weibull(1.5, size=n_samples)
    exponential_rand = exponential(size=n_samples)
    data = vstack((uniform_rand, normal_rand, weibull_rand, exponential_rand)).T
    variables = ["x_0", "x_1", "x_2", "x_3"]
    print(data)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    [[ 0.5488135  -0.98551074  1.37408242  1.11379656]
     [ 0.71518937 -1.47183501  2.13236167  0.63548465]
     [ 0.60276338  1.64813493  0.52518717  3.2112956 ]
     ...
     [ 0.40171354 -0.21252304  0.30225024  4.00986833]
     [ 0.24841347 -0.76211451  0.364483    0.55896365]
     [ 0.50586638 -0.88778014  0.82654114  2.12919171]]

.. GENERATED FROM PYTHON SOURCE LINES 69-73

Create a :class:`.ParametricStatistics` object
----------------------------------------------

We create a :class:`.ParametricStatistics` object from these data
encapsulated in a :class:`.Dataset`:

.. GENERATED FROM PYTHON SOURCE LINES 73-76

.. code-block:: default

    dataset = create_dataset("Dataset", data, variables)

.. GENERATED FROM PYTHON SOURCE LINES 77-84

and a list of names of candidate probability distributions:
exponential, normal and uniform distributions
(see :meth:`.ParametricStatistics.get_available_distributions`).
We do not use the default fitting criterion ('BIC') but 'Kolmogorov'
(see :meth:`.ParametricStatistics.get_available_criteria`
and :meth:`.ParametricStatistics.get_significance_tests`).

.. GENERATED FROM PYTHON SOURCE LINES 84-91

.. code-block:: default

    tested_distributions = ["Exponential", "Normal", "Uniform"]
    analysis = create_statistics(
        dataset, tested_distributions=tested_distributions, fitting_criterion="Kolmogorov"
    )
    print(analysis)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    INFO - 10:07:18: Create ParametricStatistics_Dataset, a ParametricStatistics library.
    INFO - 10:07:18: | Set goodness-of-fit criterion: Kolmogorov.
    INFO - 10:07:18: | Set significance level of hypothesis test: 0.05.
    INFO - 10:07:18: Fit different distributions (Exponential, Normal, Uniform) per variable and compute the goodness-of-fit criterion.
    INFO - 10:07:18: | Fit different distributions for x_0.
    INFO - 10:07:18: | Fit different distributions for x_1.
    INFO - 10:07:18: | Fit different distributions for x_2.
    INFO - 10:07:18: | Fit different distributions for x_3.
    INFO - 10:07:18: Select the best distribution for each variable.
    INFO - 10:07:18: | The best distribution for x_0 is Uniform(class=Point name=Unnamed dimension=2 values=[0.00271509,1.00083]).
    INFO - 10:07:18: | The best distribution for x_1 is Normal(class=Point name=Unnamed dimension=2 values=[-0.100117,0.985312]).
    WARNING - 10:07:18: All criteria values are lower than the significance level 0.05.
    INFO - 10:07:18: | The best distribution for x_2 is Normal(class=Point name=Unnamed dimension=2 values=[0.9783,0.665983]).
    INFO - 10:07:18: | The best distribution for x_3 is Exponential(class=Point name=Unnamed dimension=2 values=[1.02231,7.35553e-05]).
    ParametricStatistics_Dataset
       n_samples: 500
       n_variables: 4
       variables: x_0, x_1, x_2, x_3
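Before inspecting the results,
it can be useful to list the distributions, fitting criteria and significance tests
supported by :class:`.ParametricStatistics`.
The short sketch below is not part of the generated example;
it assumes that the methods referenced above
(:meth:`.ParametricStatistics.get_available_distributions`,
:meth:`.ParametricStatistics.get_available_criteria`
and :meth:`.ParametricStatistics.get_significance_tests`)
can be called on the ``analysis`` instance without arguments.

.. code-block:: python

    # List the candidate distribution names, the available fitting criteria
    # and the subset of criteria that are significance tests
    # (assumed here to be no-argument methods of the analysis object).
    print(analysis.get_available_distributions())
    print(analysis.get_available_criteria())
    print(analysis.get_significance_tests())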
.. GENERATED FROM PYTHON SOURCE LINES 92-102

Print the fitting matrix
------------------------

At this step,
an optimal distribution has been selected for each variable
based on the tested distributions and on the Kolmogorov fitting criterion.
We can print the fitting matrix to see the goodness-of-fit measures
for each (variable, distribution) pair
as well as the selected distribution for each variable.
Note that in the case of significance tests,
the goodness-of-fit measures are the p-values.

.. GENERATED FROM PYTHON SOURCE LINES 102-104

.. code-block:: default

    print(analysis.get_fitting_matrix())

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    +----------+------------------------+------------------------+------------------------+-------------+
    | Variable |      Exponential       |         Normal         |        Uniform         |  Selection  |
    +----------+------------------------+------------------------+------------------------+-------------+
    |   x_0    | 1.602160180879313e-10  |  0.005823020521403932  |   0.7338504331264553   |   Uniform   |
    |   x_1    |  2.82659088382179e-53  |   0.8587721484840084   | 5.660300987516015e-18  |    Normal   |
    |   x_2    | 1.5387797946575896e-09 | 0.0016128012413438864  | 7.748433868335025e-67  |    Normal   |
    |   x_3    |   0.864074427829853    | 2.0987474708559965e-10 | 7.782983660200643e-152 | Exponential |
    +----------+------------------------+------------------------+------------------------+-------------+

.. GENERATED FROM PYTHON SOURCE LINES 105-110

Get statistics
--------------

From this :class:`.ParametricStatistics` instance,
we can easily get statistics for the different variables
based on the selected distributions.

.. GENERATED FROM PYTHON SOURCE LINES 112-115

Get minimum
~~~~~~~~~~~

Here is the minimum value for the different variables:

.. GENERATED FROM PYTHON SOURCE LINES 115-117

.. code-block:: default

    print(analysis.compute_minimum())

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    {'x_0': array([0.00271509]), 'x_1': array([-inf]), 'x_2': array([-inf]), 'x_3': array([7.3555332e-05])}

.. GENERATED FROM PYTHON SOURCE LINES 118-121

Get maximum
~~~~~~~~~~~

Here is the maximum value for the different variables:

.. GENERATED FROM PYTHON SOURCE LINES 121-123

.. code-block:: default

    print(analysis.compute_maximum())

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    {'x_0': array([1.00082739]), 'x_1': array([inf]), 'x_2': array([inf]), 'x_3': array([inf])}
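The infinite values come from the fitted distributions themselves:
the normal and exponential models have unbounded support,
so their parametric minimum or maximum is infinite.
For comparison,
the empirical extrema of the raw samples can be computed directly with NumPy.
The following sketch is an illustration added to the example;
it only relies on the ``data`` array and the ``variables`` list defined above.

.. code-block:: python

    # Empirical minima and maxima of the 500 raw samples, to be contrasted
    # with the parametric values derived from the fitted distributions.
    print({name: data[:, i].min() for i, name in enumerate(variables)})
    print({name: data[:, i].max() for i, name in enumerate(variables)})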
.. GENERATED FROM PYTHON SOURCE LINES 124-129

Get range
~~~~~~~~~

Here is the range,
i.e. the difference between the maximum and minimum values,
for the different variables:

.. GENERATED FROM PYTHON SOURCE LINES 129-131

.. code-block:: default

    print(analysis.compute_range())

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    {'x_0': array([0.99811229]), 'x_1': array([inf]), 'x_2': array([inf]), 'x_3': array([inf])}

.. GENERATED FROM PYTHON SOURCE LINES 132-135

Get mean
~~~~~~~~

Here is the mean value for the different variables:

.. GENERATED FROM PYTHON SOURCE LINES 135-137

.. code-block:: default

    print(analysis.compute_mean())

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    {'x_0': array([0.50177124]), 'x_1': array([-0.1001173]), 'x_2': array([0.97829969]), 'x_3': array([0.97825244])}

.. GENERATED FROM PYTHON SOURCE LINES 138-141

Get standard deviation
~~~~~~~~~~~~~~~~~~~~~~

Here is the standard deviation for the different variables:

.. GENERATED FROM PYTHON SOURCE LINES 141-143

.. code-block:: default

    print(analysis.compute_standard_deviation())

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    {'x_0': array([0.2881302]), 'x_1': array([0.98531188]), 'x_2': array([0.66598346]), 'x_3': array([0.97817888])}

.. GENERATED FROM PYTHON SOURCE LINES 144-147

Get variance
~~~~~~~~~~~~

Here is the variance for the different variables:

.. GENERATED FROM PYTHON SOURCE LINES 147-149

.. code-block:: default

    print(analysis.compute_variance())

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    {'x_0': array([0.08301901]), 'x_1': array([0.9708395]), 'x_2': array([0.44353397]), 'x_3': array([0.95683393])}

.. GENERATED FROM PYTHON SOURCE LINES 150-153

Get quantile
~~~~~~~~~~~~

Here is the quantile with level 80% for the different variables:

.. GENERATED FROM PYTHON SOURCE LINES 153-155

.. code-block:: default

    print(analysis.compute_quantile(0.8))

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    {'x_0': array([0.80120493]), 'x_1': array([0.72914209]), 'x_2': array([1.53880551]), 'x_3': array([1.57439174])}

.. GENERATED FROM PYTHON SOURCE LINES 156-159

Get quartile
~~~~~~~~~~~~

Here is the second quartile for the different variables:

.. GENERATED FROM PYTHON SOURCE LINES 159-161

.. code-block:: default

    print(analysis.compute_quartile(2))

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    {'x_0': array([0.50177124]), 'x_1': array([-0.1001173]), 'x_2': array([0.97829969]), 'x_3': array([0.67809549])}

.. GENERATED FROM PYTHON SOURCE LINES 162-165

Get percentile
~~~~~~~~~~~~~~

Here is the 50th percentile for the different variables:

.. GENERATED FROM PYTHON SOURCE LINES 165-167

.. code-block:: default

    print(analysis.compute_percentile(50))

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    {'x_0': array([0.50177124]), 'x_1': array([-0.1001173]), 'x_2': array([0.97829969]), 'x_3': array([0.67809549])}

.. GENERATED FROM PYTHON SOURCE LINES 168-171

Get median
~~~~~~~~~~

Here is the median for the different variables:

.. GENERATED FROM PYTHON SOURCE LINES 171-173

.. code-block:: default

    print(analysis.compute_median())

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    {'x_0': array([0.50177124]), 'x_1': array([-0.1001173]), 'x_2': array([0.97829969]), 'x_3': array([0.67809549])}
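The quartile, percentile and median calls above return the same values
because the second quartile, the 50th percentile and the median
are the same quantity,
computed here from the fitted distributions rather than from the raw samples.
As an illustration (not part of the generated example),
the median of x_0 can be recovered in closed form
from the bounds of the fitted uniform distribution reported in the log above,
and compared with the empirical median of the samples:

.. code-block:: python

    from numpy import median

    # Median of the uniform distribution fitted to x_0, i.e. (a + b) / 2,
    # using the bounds reported by the fitting step.
    a, b = 0.00271509, 1.00083
    print((a + b) / 2)  # about 0.5018, close to compute_median()["x_0"]

    # Empirical medians of the raw samples, for comparison.
    print(median(data, axis=0))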
.. GENERATED FROM PYTHON SOURCE LINES 174-178

Get tolerance interval
~~~~~~~~~~~~~~~~~~~~~~

Here is the two-sided tolerance interval
with a coverage level equal to 50% and a confidence level of 95%
for the different variables:

.. GENERATED FROM PYTHON SOURCE LINES 178-180

.. code-block:: default

    print(analysis.compute_tolerance_interval(0.5, 0.95))

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    {'x_0': (array([0.2522558]), array([0.75684261])), 'x_1': (array([-0.80205726]), array([0.60182265])), 'x_2': (array([0.50385052]), array([1.45274885])), 'x_3': (array([0.23960073]), array([1.3194909]))}

.. GENERATED FROM PYTHON SOURCE LINES 181-186

Get B-value
~~~~~~~~~~~

Here is the B-value for the different variables,
which is the lower bound of a left-sided tolerance interval
with a coverage level equal to 90% and a confidence level of 95%:

.. GENERATED FROM PYTHON SOURCE LINES 186-187

.. code-block:: default

    print(analysis.compute_b_value())

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    {'x_0': array([0.10253656]), 'x_1': array([-1.43545972]), 'x_2': array([0.07572662]), 'x_3': array([0.09277731])}

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.049 seconds)


.. _sphx_glr_download_examples_uncertainty_statistics_plot_param_stats.py:


.. only:: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_param_stats.py `


  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_param_stats.ipynb `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_