Optimal LHS vs LHS
Latin hypercube sampling (LHS) is a technique to generate a set of \(n\) points in dimension \(d\) with good space-filling properties. LHS is also the name of the resulting design of experiments (DOE).

from __future__ import annotations

import matplotlib.pyplot as plt
from numpy import linspace

from gemseo import compute_doe

We can use the "OT_LHS"
algorithm,
coming from OpenTURNS as indicated by the "OT_"
prefix,
to generate such a DOE,
say with 15 points in dimension 2:
n = 15
d = 2
samples = compute_doe(d, algo_name="OT_LHS", n_samples=n)
plt.plot(samples[:, 0], samples[:, 1], "o")
plt.xticks(linspace(0, 1, n + 1), minor=True)
plt.yticks(linspace(0, 1, n + 1), minor=True)
plt.grid(which="both")
plt.title("An LHS")
plt.show()

From a technical point of view, the range of each variable is divided into \(n\) equally probable intervals. Then, the \(n\) points are added sequentially so as to satisfy the Latin hypercube requirement: one and only one point per interval. When adding a point, an interval is chosen at random for each variable, conditionally on this requirement, and the point is then drawn uniformly within the resulting cell.
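
To make this construction concrete, here is a minimal NumPy sketch (not part of GEMSEO; the function lhs_sketch and its signature are ours), using the equivalent formulation where the interval assignment of each variable is a random permutation:

import numpy as np

def lhs_sketch(n, d, seed=None):
    """A hypothetical from-scratch LHS in [0, 1]^d, for illustration only."""
    rng = np.random.default_rng(seed)
    points = np.empty((n, d))
    for k in range(d):
        # Shuffle the n intervals of the k-th variable,
        # then draw one point uniformly inside each interval.
        intervals = rng.permutation(n)
        points[:, k] = (intervals + rng.uniform(size=n)) / n
    return points

Each column of the result contains exactly one point per interval, which is the Latin hypercube property.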
As the construction involves random draws, LHS is not a deterministic technique, and we can generate another LHS by changing the seed:
samples = compute_doe(d, algo_name="OT_LHS", n_samples=n, seed=123)
plt.plot(samples[:, 0], samples[:, 1], "o")
plt.xticks(linspace(0, 1, n + 1), minor=True)
plt.yticks(linspace(0, 1, n + 1), minor=True)
plt.grid(which="both")
plt.title("Another LHS")
plt.show()

These DOEs are different but have something in common: they cover the space well in some places and poorly in others, with points close to each other. For both DOEs, there is room for improvement.
To search for this improvement, one can use the "OT_OPT_LHS" algorithm with its annealing option disabled, to select the best LHS among 1000 Monte Carlo instances:
samples = compute_doe(d, algo_name="OT_OPT_LHS", n_samples=n, annealing=False)
plt.plot(samples[:, 0], samples[:, 1], "o")
plt.xticks(linspace(0, 1, n + 1), minor=True)
plt.yticks(linspace(0, 1, n + 1), minor=True)
plt.grid(which="both")
plt.title("An LHS optimized by Monte Carlo")
plt.show()

The result is a little better, but there is still room for improvement. Finally, we can use the "OT_OPT_LHS" algorithm with its default settings to get an LHS improved by simulated annealing, a global optimization algorithm:
samples = compute_doe(d, algo_name="OT_OPT_LHS", n_samples=n)
plt.plot(samples[:, 0], samples[:, 1], "o")
plt.xticks(linspace(0, 1, n + 1), minor=True)
plt.yticks(linspace(0, 1, n + 1), minor=True)
plt.grid(which="both")
plt.title("An LHS optimized by simulated annealing")
plt.show()

We see that this DOE covers the space much better.
Note

These DOEs are optimal according to the C2 discrepancy, which measures the distance between the empirical distribution of the points and the uniform distribution. "OT_OPT_LHS" has options to change this space-filling criterion (criterion), the number of Monte Carlo instances (n_replicates), and the temperature profile for simulated annealing (temperature). See OT_OPT_LHS for more information about the settings.
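
If SciPy is installed, we can evaluate this criterion ourselves: scipy.stats.qmc.discrepancy with method="CD" computes the centered L2 (C2) discrepancy. A small sketch (the variable names are ours), where we expect the optimized DOE to have the lower value:

from scipy.stats import qmc

plain_lhs = compute_doe(d, algo_name="OT_LHS", n_samples=n)
optimal_lhs = compute_doe(d, algo_name="OT_OPT_LHS", n_samples=n)
# A lower C2 discrepancy means a more uniform coverage of the unit square.
print(qmc.discrepancy(plain_lhs, method="CD"))
print(qmc.discrepancy(optimal_lhs, method="CD"))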
See also

This example uses the "OT_OPT_LHS" algorithm from OpenTURNS to create an optimal LHS. For the same purpose, we could also use the "LHS" algorithm from SciPy, with its optimization option set to "random-cd" or "lloyd". See LHS for more information about the settings.
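
For reference, a minimal sketch of that SciPy alternative (assuming a recent SciPy version where the optimization option of LatinHypercube is available):

from scipy.stats import qmc

sampler = qmc.LatinHypercube(d=2, optimization="random-cd", seed=1)
samples = sampler.random(n=15)  # 15 points in [0, 1)^2, optimized by random-cd

Here, "random-cd" applies random coordinate permutations that lower the centered discrepancy, i.e. the same C2 criterion as above.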