Empirical estimation of statistics#

In this example, we empirically estimate statistics of the range computed by the Mission discipline of Sobieski's SSBJ problem.

For simplicity, we use uniform distributions for the discipline inputs, based on the bounds of the design parameters.

from __future__ import annotations

from gemseo import configure_logger
from gemseo import create_discipline
from gemseo import sample_disciplines
from gemseo.problems.mdo.sobieski.core.design_space import SobieskiDesignSpace
from gemseo.uncertainty import create_statistics

configure_logger()
<RootLogger root (INFO)>

Create the dataset#

First of all, we create the dataset. To do so, we instantiate the discipline SobieskiMission of Sobieski's SSBJ problem, which is known to GEMSEO.

discipline = create_discipline("SobieskiMission")

Then, we load the design space of Sobieski's SSBJ problem by means of the class SobieskiDesignSpace and use DesignSpace.filter() to keep only the inputs of the discipline SobieskiMission.

parameter_space = SobieskiDesignSpace()
parameter_space.filter(discipline.io.input_grammar.names)
Sobieski design space:
Name Lower bound Value Upper bound Type
x_shared[0] 0.01 0.05 0.09 float
x_shared[1] 30000 45000 60000 float
x_shared[2] 1.4 1.6 1.8 float
x_shared[3] 2.5 5.5 8.5 float
x_shared[4] 40 55 70 float
x_shared[5] 500 1000 1500 float
y_14[0] 24850 50606.9741711 77100 float
y_14[1] -7700 7306.20262124 45000 float
y_24 0.44 4.15006276 11.13 float
y_34 0.44 1.10754577 1.98 float


Then, we sample the discipline over this design space by means of the sample_disciplines() function with a Monte Carlo algorithm and 100 samples.

dataset = sample_disciplines(
    [discipline],
    parameter_space,
    "y_4",
    formulation_name="DisciplinaryOpt",
    algo_name="OT_MONTE_CARLO",
    n_samples=100,
)
    INFO - 15:35:41: *** Start Sampling execution ***
    INFO - 15:35:41: Sampling
    INFO - 15:35:41:    Disciplines: SobieskiMission
    INFO - 15:35:41:    MDO formulation: DisciplinaryOpt
    INFO - 15:35:41: Running the algorithm OT_MONTE_CARLO:
    INFO - 15:35:41:      1%|          | 1/100 [00:00<00:00, 357.85 it/sec]
    INFO - 15:35:41:      2%|▏         | 2/100 [00:00<00:00, 638.45 it/sec]
    INFO - 15:35:41:      3%|▎         | 3/100 [00:00<00:00, 886.81 it/sec]
    INFO - 15:35:41:      4%|▍         | 4/100 [00:00<00:00, 1107.19 it/sec]
    INFO - 15:35:41:      5%|▌         | 5/100 [00:00<00:00, 1292.86 it/sec]
    INFO - 15:35:41:      6%|▌         | 6/100 [00:00<00:00, 1469.28 it/sec]
    INFO - 15:35:41:      7%|▋         | 7/100 [00:00<00:00, 1629.40 it/sec]
   ERROR - 15:35:41: Failed to evaluate function y_4
ValueError: math domain error
   ERROR - 15:35:41: Problem with evaluation of sample:[1.29698918e-02 4.61506629e+04 1.54257893e+00 7.81530848e+00
ValueError: math domain error
   ERROR - 15:35:41:  5.05577274e+01 1.02006624e+03 3.16585883e+04 3.74764805e+04
ValueError: math domain error
   ERROR - 15:35:41:  1.02715165e+01 8.69196356e-01] result is not taken into account in DOE.
ValueError: math domain error
    INFO - 15:35:41:      8%|▊         | 8/100 [00:00<00:00, 1549.07 it/sec]
    INFO - 15:35:41:      9%|▉         | 9/100 [00:00<00:00, 1658.19 it/sec]
    INFO - 15:35:41:     10%|█         | 10/100 [00:00<00:00, 1757.59 it/sec]
    INFO - 15:35:41:     11%|█         | 11/100 [00:00<00:00, 1854.92 it/sec]
   ERROR - 15:35:41: Failed to evaluate function y_4
ValueError: math domain error
   ERROR - 15:35:41: Problem with evaluation of sample:[5.48981321e-02 3.02492651e+04 1.78425506e+00 7.51304278e+00
ValueError: math domain error
   ERROR - 15:35:41:  5.91913274e+01 1.07102327e+03 3.89430460e+04 4.39972900e+04
ValueError: math domain error
   ERROR - 15:35:41:  5.23135403e+00 1.69750699e+00] result is not taken into account in DOE.
ValueError: math domain error
    INFO - 15:35:41:     12%|█▏        | 12/100 [00:00<00:00, 1793.40 it/sec]
    INFO - 15:35:41:     13%|█▎        | 13/100 [00:00<00:00, 1868.74 it/sec]
    INFO - 15:35:41:     14%|█▍        | 14/100 [00:00<00:00, 1944.96 it/sec]
    INFO - 15:35:41:     15%|█▌        | 15/100 [00:00<00:00, 2020.77 it/sec]
    INFO - 15:35:41:     16%|█▌        | 16/100 [00:00<00:00, 2089.58 it/sec]
    INFO - 15:35:41:     17%|█▋        | 17/100 [00:00<00:00, 2158.54 it/sec]
    INFO - 15:35:41:     18%|█▊        | 18/100 [00:00<00:00, 2226.67 it/sec]
    INFO - 15:35:41:     19%|█▉        | 19/100 [00:00<00:00, 2290.32 it/sec]
   ERROR - 15:35:41: Failed to evaluate function y_4
ValueError: math domain error
   ERROR - 15:35:41: Problem with evaluation of sample:[3.23734572e-02 5.75371340e+04 1.56588241e+00 3.92408088e+00
ValueError: math domain error
   ERROR - 15:35:41:  5.21270800e+01 6.95527982e+02 2.78544051e+04 2.90664780e+04
ValueError: math domain error
   ERROR - 15:35:41:  8.48303790e+00 1.07896745e+00] result is not taken into account in DOE.
ValueError: math domain error
    INFO - 15:35:41:     20%|██        | 20/100 [00:00<00:00, 2219.27 it/sec]
   ERROR - 15:35:41: Failed to evaluate function y_4
ValueError: math domain error
   ERROR - 15:35:41: Problem with evaluation of sample:[8.65821820e-02 5.67282712e+04 1.63166519e+00 8.05650222e+00
ValueError: math domain error
   ERROR - 15:35:41:  6.64935582e+01 7.09872962e+02 2.82276678e+04 3.83552559e+04
ValueError: math domain error
   ERROR - 15:35:41:  1.00916432e+01 1.92827068e+00] result is not taken into account in DOE.
ValueError: math domain error
    INFO - 15:35:41:     21%|██        | 21/100 [00:00<00:00, 2157.93 it/sec]
    INFO - 15:35:41:     22%|██▏       | 22/100 [00:00<00:00, 2202.26 it/sec]
    INFO - 15:35:41:     23%|██▎       | 23/100 [00:00<00:00, 2252.74 it/sec]
    INFO - 15:35:41:     24%|██▍       | 24/100 [00:00<00:00, 2303.61 it/sec]
    INFO - 15:35:41:     25%|██▌       | 25/100 [00:00<00:00, 2347.23 it/sec]
    INFO - 15:35:41:     26%|██▌       | 26/100 [00:00<00:00, 2392.38 it/sec]
    INFO - 15:35:41:     27%|██▋       | 27/100 [00:00<00:00, 2438.39 it/sec]
    INFO - 15:35:41:     28%|██▊       | 28/100 [00:00<00:00, 2483.15 it/sec]
    INFO - 15:35:41:     29%|██▉       | 29/100 [00:00<00:00, 2526.22 it/sec]
    INFO - 15:35:41:     30%|███       | 30/100 [00:00<00:00, 2563.08 it/sec]
   ERROR - 15:35:41: Failed to evaluate function y_4
ValueError: math domain error
   ERROR - 15:35:41: Problem with evaluation of sample:[8.73940398e-02 3.67026184e+04 1.56476942e+00 4.19812099e+00
ValueError: math domain error
   ERROR - 15:35:41:  6.17650588e+01 1.19302534e+03 3.41482711e+04 4.38203117e+04
ValueError: math domain error
   ERROR - 15:35:41:  1.00305508e+01 1.32482248e+00] result is not taken into account in DOE.
ValueError: math domain error
    INFO - 15:35:41:     31%|███       | 31/100 [00:00<00:00, 2498.10 it/sec]
    INFO - 15:35:41:     32%|███▏      | 32/100 [00:00<00:00, 2523.32 it/sec]
    INFO - 15:35:41:     33%|███▎      | 33/100 [00:00<00:00, 2553.44 it/sec]
    INFO - 15:35:41:     34%|███▍      | 34/100 [00:00<00:00, 2584.90 it/sec]
    INFO - 15:35:41:     35%|███▌      | 35/100 [00:00<00:00, 2619.19 it/sec]
    INFO - 15:35:41:     36%|███▌      | 36/100 [00:00<00:00, 2652.90 it/sec]
   ERROR - 15:35:41: Failed to evaluate function y_4
ValueError: math domain error
   ERROR - 15:35:41: Problem with evaluation of sample:[6.31287105e-02 4.23324465e+04 1.41438529e+00 4.38495848e+00
ValueError: math domain error
   ERROR - 15:35:41:  5.18016406e+01 9.81706178e+02 4.02684763e+04 4.04292612e+04
ValueError: math domain error
   ERROR - 15:35:41:  1.63495128e+00 1.58647382e+00] result is not taken into account in DOE.
ValueError: math domain error
    INFO - 15:35:41:     37%|███▋      | 37/100 [00:00<00:00, 2564.14 it/sec]
   ERROR - 15:35:41: Failed to evaluate function y_4
ValueError: math domain error
   ERROR - 15:35:41: Problem with evaluation of sample:[4.70074524e-02 4.40740007e+04 1.62275117e+00 4.39813380e+00
ValueError: math domain error
   ERROR - 15:35:41:  6.53898046e+01 1.27066268e+03 3.10026856e+04 3.93852349e+04
ValueError: math domain error
   ERROR - 15:35:41:  5.25707580e+00 4.69709094e-01] result is not taken into account in DOE.
ValueError: math domain error
    INFO - 15:35:41:     38%|███▊      | 38/100 [00:00<00:00, 2496.81 it/sec]
   ERROR - 15:35:41: Failed to evaluate function y_4
ValueError: math domain error
   ERROR - 15:35:41: Problem with evaluation of sample:[5.02116961e-02 4.11016864e+04 1.76501712e+00 7.39553905e+00
ValueError: math domain error
   ERROR - 15:35:41:  4.03455877e+01 8.42798741e+02 3.16404732e+04 3.77881168e+04
ValueError: math domain error
   ERROR - 15:35:41:  4.99873191e+00 5.33732657e-01] result is not taken into account in DOE.
ValueError: math domain error
    INFO - 15:35:41:     39%|███▉      | 39/100 [00:00<00:00, 2448.99 it/sec]
    INFO - 15:35:41:     40%|████      | 40/100 [00:00<00:00, 2474.73 it/sec]
    INFO - 15:35:41:     41%|████      | 41/100 [00:00<00:00, 2498.71 it/sec]
   ERROR - 15:35:41: Failed to evaluate function y_4
ValueError: math domain error
   ERROR - 15:35:41: Problem with evaluation of sample:[1.79955568e-02 4.13572866e+04 1.53514395e+00 6.06896708e+00
ValueError: math domain error
   ERROR - 15:35:41:  6.30366946e+01 5.07973536e+02 3.06338667e+04 3.54965135e+04
ValueError: math domain error
   ERROR - 15:35:41:  6.15365417e-01 1.38454066e+00] result is not taken into account in DOE.
ValueError: math domain error
    INFO - 15:35:41:     42%|████▏     | 42/100 [00:00<00:00, 2452.57 it/sec]
    INFO - 15:35:41:     43%|████▎     | 43/100 [00:00<00:00, 2477.44 it/sec]
    INFO - 15:35:41:     44%|████▍     | 44/100 [00:00<00:00, 2502.64 it/sec]
    INFO - 15:35:41:     45%|████▌     | 45/100 [00:00<00:00, 2521.32 it/sec]
    INFO - 15:35:41:     46%|████▌     | 46/100 [00:00<00:00, 2546.63 it/sec]
   ERROR - 15:35:41: Failed to evaluate function y_4
ValueError: math domain error
   ERROR - 15:35:41: Problem with evaluation of sample:[7.34156617e-02 4.29755065e+04 1.65638244e+00 4.73092799e+00
ValueError: math domain error
   ERROR - 15:35:41:  6.39497950e+01 7.54812818e+02 3.75942160e+04 4.10549422e+04
ValueError: math domain error
   ERROR - 15:35:41:  1.01970094e+01 1.93999914e+00] result is not taken into account in DOE.
ValueError: math domain error
    INFO - 15:35:41:     47%|████▋     | 47/100 [00:00<00:00, 2505.78 it/sec]
    INFO - 15:35:41:     48%|████▊     | 48/100 [00:00<00:00, 2528.69 it/sec]
   ERROR - 15:35:41: Failed to evaluate function y_4
ValueError: math domain error
   ERROR - 15:35:41: Problem with evaluation of sample:[8.69811400e-02 3.42276221e+04 1.48336149e+00 4.28449805e+00
ValueError: math domain error
   ERROR - 15:35:41:  4.74431990e+01 9.08368407e+02 3.06851402e+04 4.29931580e+04
ValueError: math domain error
   ERROR - 15:35:41:  9.77470946e+00 9.51964086e-01] result is not taken into account in DOE.
ValueError: math domain error
    INFO - 15:35:41:     49%|████▉     | 49/100 [00:00<00:00, 2490.56 it/sec]
    INFO - 15:35:41:     50%|█████     | 50/100 [00:00<00:00, 2510.99 it/sec]
   ERROR - 15:35:41: Failed to evaluate function y_4
ValueError: math domain error
   ERROR - 15:35:41: Problem with evaluation of sample:[8.44564380e-02 3.07957264e+04 1.54087352e+00 5.88383336e+00
ValueError: math domain error
   ERROR - 15:35:41:  4.21051139e+01 5.91254283e+02 3.08786610e+04 3.32313216e+04
ValueError: math domain error
   ERROR - 15:35:41:  4.95940225e+00 7.91759531e-01] result is not taken into account in DOE.
ValueError: math domain error
    INFO - 15:35:41:     51%|█████     | 51/100 [00:00<00:00, 2478.16 it/sec]
    INFO - 15:35:41:     52%|█████▏    | 52/100 [00:00<00:00, 2494.35 it/sec]
    INFO - 15:35:41:     53%|█████▎    | 53/100 [00:00<00:00, 2514.31 it/sec]
    INFO - 15:35:41:     54%|█████▍    | 54/100 [00:00<00:00, 2533.90 it/sec]
    INFO - 15:35:41:     55%|█████▌    | 55/100 [00:00<00:00, 2555.55 it/sec]
    INFO - 15:35:41:     56%|█████▌    | 56/100 [00:00<00:00, 2574.80 it/sec]
    INFO - 15:35:41:     57%|█████▋    | 57/100 [00:00<00:00, 2594.22 it/sec]
    INFO - 15:35:41:     58%|█████▊    | 58/100 [00:00<00:00, 2615.52 it/sec]
    INFO - 15:35:41:     59%|█████▉    | 59/100 [00:00<00:00, 2636.77 it/sec]
    INFO - 15:35:41:     60%|██████    | 60/100 [00:00<00:00, 2657.59 it/sec]
    INFO - 15:35:41:     61%|██████    | 61/100 [00:00<00:00, 2675.02 it/sec]
    INFO - 15:35:41:     62%|██████▏   | 62/100 [00:00<00:00, 2694.76 it/sec]
    INFO - 15:35:41:     63%|██████▎   | 63/100 [00:00<00:00, 2714.56 it/sec]
   ERROR - 15:35:41: Failed to evaluate function y_4
ValueError: math domain error
   ERROR - 15:35:41: Problem with evaluation of sample:[6.77585739e-02 3.79793238e+04 1.74373635e+00 3.43230716e+00
ValueError: math domain error
   ERROR - 15:35:41:  4.13701698e+01 1.31335579e+03 2.88266747e+04 3.74178761e+04
ValueError: math domain error
   ERROR - 15:35:41:  8.88078902e+00 1.83214824e+00] result is not taken into account in DOE.
ValueError: math domain error
    INFO - 15:35:41:     64%|██████▍   | 64/100 [00:00<00:00, 2675.63 it/sec]
    INFO - 15:35:41:     65%|██████▌   | 65/100 [00:00<00:00, 2690.04 it/sec]
    INFO - 15:35:41:     66%|██████▌   | 66/100 [00:00<00:00, 2705.45 it/sec]
    INFO - 15:35:41:     67%|██████▋   | 67/100 [00:00<00:00, 2716.99 it/sec]
    INFO - 15:35:41:     68%|██████▊   | 68/100 [00:00<00:00, 2732.42 it/sec]
    INFO - 15:35:41:     69%|██████▉   | 69/100 [00:00<00:00, 2748.51 it/sec]
    INFO - 15:35:41:     70%|███████   | 70/100 [00:00<00:00, 2765.57 it/sec]
    INFO - 15:35:41:     71%|███████   | 71/100 [00:00<00:00, 2782.59 it/sec]
    INFO - 15:35:41:     72%|███████▏  | 72/100 [00:00<00:00, 2797.14 it/sec]
   ERROR - 15:35:41: Failed to evaluate function y_4
ValueError: math domain error
   ERROR - 15:35:41: Problem with evaluation of sample:[1.63329215e-02 5.97324226e+04 1.56766182e+00 4.71404767e+00
ValueError: math domain error
   ERROR - 15:35:41:  4.44948320e+01 1.24758949e+03 4.39269694e+04 4.44436533e+04
ValueError: math domain error
   ERROR - 15:35:41:  6.30023047e+00 8.37747853e-01] result is not taken into account in DOE.
ValueError: math domain error
    INFO - 15:35:41:     73%|███████▎  | 73/100 [00:00<00:00, 2759.58 it/sec]
    INFO - 15:35:41:     74%|███████▍  | 74/100 [00:00<00:00, 2769.60 it/sec]
    INFO - 15:35:41:     75%|███████▌  | 75/100 [00:00<00:00, 2782.45 it/sec]
    INFO - 15:35:41:     76%|███████▌  | 76/100 [00:00<00:00, 2797.55 it/sec]
    INFO - 15:35:41:     77%|███████▋  | 77/100 [00:00<00:00, 2813.11 it/sec]
    INFO - 15:35:41:     78%|███████▊  | 78/100 [00:00<00:00, 2828.72 it/sec]
    INFO - 15:35:41:     79%|███████▉  | 79/100 [00:00<00:00, 2824.88 it/sec]
    INFO - 15:35:41:     80%|████████  | 80/100 [00:00<00:00, 2836.22 it/sec]
    INFO - 15:35:41:     81%|████████  | 81/100 [00:00<00:00, 2849.39 it/sec]
    INFO - 15:35:41:     82%|████████▏ | 82/100 [00:00<00:00, 2842.73 it/sec]
    INFO - 15:35:41:     83%|████████▎ | 83/100 [00:00<00:00, 2840.32 it/sec]
    INFO - 15:35:41:     84%|████████▍ | 84/100 [00:00<00:00, 2848.75 it/sec]
    INFO - 15:35:41:     85%|████████▌ | 85/100 [00:00<00:00, 2856.97 it/sec]
    INFO - 15:35:41:     86%|████████▌ | 86/100 [00:00<00:00, 2867.15 it/sec]
    INFO - 15:35:41: *** End Sampling execution (time: 0:00:00.031382) ***
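
The log above reports 14 evaluations that raised a ValueError and whose results were "not taken into account in DOE", which is why only 86 of the 100 requested samples end up in the dataset. In every failed sample printed in the log, the eighth input component is larger than the seventh (these would be y_14[1] and y_14[0] if the sample follows the order of the design space table); this is consistent with a function such as a logarithm receiving an invalid argument, although the source does not state the cause. The following plain-Python sketch, independent of GEMSEO, illustrates the general pattern of uniform sampling within bounds while skipping failed evaluations. The names toy_model and sample_uniform are hypothetical and only serve the illustration.

```python
import math
import random


def toy_model(x: float) -> float:
    """Hypothetical stand-in for a discipline; math.sqrt raises ValueError if x < 0."""
    return math.sqrt(x)


def sample_uniform(bounds, n_samples, seed=0):
    """Evaluate toy_model on uniform draws within bounds, skipping failures."""
    rng = random.Random(seed)
    lower, upper = bounds
    inputs, outputs = [], []
    n_failures = 0
    for _ in range(n_samples):
        x = rng.uniform(lower, upper)
        try:
            y = toy_model(x)
        except ValueError:
            # The failed sample is counted but not stored.
            n_failures += 1
            continue
        inputs.append(x)
        outputs.append(y)
    return inputs, outputs, n_failures


inputs, outputs, n_failures = sample_uniform((-1.0, 4.0), n_samples=100)
print(f"{len(outputs)} valid samples, {n_failures} failures")
```

Any statistic estimated afterwards is conditional on the evaluation having succeeded, so it describes the valid region of the input space rather than the full uniform distribution.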

Create an EmpiricalStatistics object for all variables#

In this second stage, we create an EmpiricalStatistics from the Dataset, which contains 86 samples because the 14 failed evaluations were discarded:

analysis = create_statistics(dataset, name="SobieskiMission")
analysis
SobieskiMission
  • n_samples: 86
  • n_variables: 5
  • variables: x_shared, y_14, y_24, y_34, y_4


and easily obtain statistics, such as the minimum values of the different variables over the dataset:

analysis.compute_minimum()
{'x_shared': array([1.13352258e-02, 3.01398316e+04, 1.40041295e+00, 2.55929483e+00,
       4.05869021e+01, 5.05494083e+02]), 'y_14': array([26794.72022057, -6556.01459554]), 'y_24': array([0.48388241]), 'y_34': array([0.44667594]), 'y_4': array([-1532.4108456])}
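
The output above maps each variable name to an array with one entry per component (six for x_shared, two for y_14, one for the others). As a minimal sketch of that structure in plain Python, independent of GEMSEO, the minimum can be taken column by column over the samples of each variable. The data and the function name componentwise_minimum are hypothetical.

```python
def componentwise_minimum(samples):
    """Return, for each variable, the minimum of each component over the samples."""
    return {
        name: [min(column) for column in zip(*rows)]
        for name, rows in samples.items()
    }


# Hypothetical dataset: three samples of a 2-component "x" and a 1-component "y".
samples = {
    "x": [[0.2, 3.0], [0.1, 5.0], [0.4, 4.0]],
    "y": [[7.0], [2.0], [9.0]],
}

minimum = componentwise_minimum(samples)
print(minimum)  # {'x': [0.1, 3.0], 'y': [2.0]}
```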

Create an EmpiricalStatistics object for the range#

We can also restrict the statistical analysis to the range variable only:

analysis = create_statistics(
    dataset, variable_names=["y_4"], name="SobieskiMission.range"
)
analysis
SobieskiMission.range
  • n_samples: 86
  • n_variables: 1
  • variables: y_4


Get minimum#

Here is the minimum value:

analysis.compute_minimum()
{'y_4': array([-1532.4108456])}

Get maximum#

Here is the maximum value:

analysis.compute_maximum()
{'y_4': array([62030.01439676])}

Get range#

Here is the range (difference between the maximum and minimum values):

analysis.compute_range()
{'y_4': array([63562.42524236])}
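
As a quick consistency check, the value above can be recovered from the minimum and maximum printed earlier. The snippet only reuses those printed numbers and does not call GEMSEO.

```python
import math

# Values copied from the outputs of compute_minimum() and compute_maximum().
minimum = -1532.4108456
maximum = 62030.01439676

value_range = maximum - minimum
print(value_range)
```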

Get mean#

Here is the mean value:

analysis.compute_mean()
{'y_4': array([4539.61303269])}

Get central moment#

Here is the second central moment:

analysis.compute_moment(2)
{'y_4': array([85150303.53546816])}

Get standard deviation#

Here is the standard deviation:

analysis.compute_standard_deviation()
{'y_4': array([9227.6922107])}

Get variance#

Here is the variance:

analysis.compute_variance()
{'y_4': array([85150303.53546816])}
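
The printed outputs show that compute_moment(2) and compute_variance() return the same value, and that the standard deviation is its square root. The sketch below illustrates these relations in plain Python on a small hypothetical sample. It assumes the central moment is normalized by the number of samples n; that is an assumption about the convention, not something the source states. The helper central_moment is a hypothetical name.

```python
import math
import statistics


def central_moment(data, order):
    """Central moment of the given order, normalized by n (assumed convention)."""
    mean = statistics.fmean(data)
    return sum((x - mean) ** order for x in data) / len(data)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample, mean 5

second_moment = central_moment(data, 2)
population_variance = statistics.pvariance(data)
standard_deviation = math.sqrt(second_moment)
print(second_moment, population_variance, standard_deviation)

# Same relation on the values printed in this example (copied from the outputs).
printed_std = 9227.6922107
printed_variance = 85150303.53546816
print(math.isclose(printed_std**2, printed_variance, rel_tol=1e-9))
```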

Get quantile#

Here is the quantile with level equal to 80%:

analysis.compute_quantile(0.8)
{'y_4': array([5261.95981236])}

Get probability#

Here are the probabilities of being respectively greater and lower than the default output value:

default_output = discipline.execute()
(
    analysis.compute_probability(default_output),
    analysis.compute_probability(default_output, greater=False),
)
({'y_4': array([0.76744186])}, {'y_4': array([0.23255814])})
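
An empirical probability is the fraction of samples on one side of the threshold, so the two values above sum to one; with 86 samples they correspond to 66 and 20 samples respectively. The following plain-Python sketch shows the idea. The function empirical_probability is a hypothetical name, and whether the comparison is strict or not is an assumption (it makes no difference when no sample equals the threshold).

```python
def empirical_probability(data, threshold, greater=True):
    """Fraction of samples above (or below) the threshold."""
    if greater:
        count = sum(1 for x in data if x >= threshold)
    else:
        count = sum(1 for x in data if x <= threshold)
    return count / len(data)


data = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical sample
p_greater = empirical_probability(data, 2.5)
p_lower = empirical_probability(data, 2.5, greater=False)
print(p_greater, p_lower)

# The printed probabilities match these fractions of the 86 samples.
print(round(66 / 86, 8), round(20 / 86, 8))
```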

Get quartile#

Here is the second quartile:

analysis.compute_quartile(2)
{'y_4': array([2117.20788623])}

Get percentile#

Here is the 50th percentile:

analysis.compute_percentile(50)
{'y_4': array([2117.20788623])}

Get median#

Here is the median:

analysis.compute_median()
{'y_4': array([2117.20788623])}
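
The second quartile, the 50th percentile and the median return the same value above because all three are the quantile of level 0.5. The sketch below checks this equivalence with Python's statistics module on a small hypothetical sample; the "inclusive" interpolation method is an assumption made for the illustration and is not necessarily the rule used by GEMSEO.

```python
import statistics

data = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]  # hypothetical sample

median = statistics.median(data)
# quantiles(n=4) returns the 3 quartiles; index 1 is the second one.
second_quartile = statistics.quantiles(data, n=4, method="inclusive")[1]
# quantiles(n=100) returns the 99 percentiles; index 49 is the 50th one.
fiftieth_percentile = statistics.quantiles(data, n=100, method="inclusive")[49]

print(median, second_quartile, fiftieth_percentile)
```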

Plot the distribution#

We can use a boxplot to visualize the data distribution:

analysis.plot_boxplot()
{'y_4': <gemseo.post.dataset.boxplot.Boxplot object at 0x7f5225bef0e0>}

We can also draw the empirical cumulative distribution function:

analysis.plot_cdf()
{'y_4': <gemseo.post.dataset.lines.Lines object at 0x7f5225bef1d0>}

or draw the empirical probability density function:

analysis.plot_pdf()
{'y_4': <gemseo.post.dataset.lines.Lines object at 0x7f5225e5edb0>}
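
For reference, the empirical cumulative distribution function evaluated at x is the fraction of samples lower than or equal to x. A minimal plain-Python sketch, independent of GEMSEO and of its plotting classes, could look as follows; empirical_cdf is a hypothetical name.

```python
from bisect import bisect_right


def empirical_cdf(data):
    """Return a function x -> fraction of samples lower than or equal to x."""
    ordered = sorted(data)
    n = len(ordered)

    def cdf(x):
        # bisect_right counts the sorted samples that are <= x.
        return bisect_right(ordered, x) / n

    return cdf


cdf = empirical_cdf([3.0, 1.0, 2.0, 2.0])  # hypothetical sample
print(cdf(0.5), cdf(2.0), cdf(10.0))  # 0.0 0.75 1.0
```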

Total running time of the script: (0 minutes 0.427 seconds)
