gemseo.post.dataset.andrews_curves module

Draw Andrews curves from a Dataset.

The AndrewsCurves class implements the Andrews plot, a.k.a. Andrews curves, which is a way to visualize \(n\) samples of a high-dimensional vector

\[x=(x_1,x_2,\ldots,x_d)\in\mathbb{R}^d\]

in a 2D coordinate system by projecting each sample

\[x^{(i)}=(x_1^{(i)},x_2^{(i)},\ldots,x_d^{(i)})\]

onto the vector

\[\left(\frac{1}{\sqrt{2}},\sin(t),\cos(t),\sin(2t),\cos(2t), \ldots\right)\]

which is composed of the first \(d\) elements of the Fourier basis. The resulting projection is the function

\[f_i(t)=\frac{x_1^{(i)}}{\sqrt{2}}+x_2^{(i)}\sin(t)+x_3^{(i)}\cos(t)+x_4^{(i)}\sin(2t)+x_5^{(i)}\cos(2t)+\ldots\]

Each curve \(t\mapsto f_i(t)\) is plotted over the interval \([-\pi,\pi]\), and structure in the data may be visible in these \(n\) Andrews curves.
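
As an illustration, the projection can be sketched in plain Python by reading it as the dot product between a sample and the Fourier basis vector evaluated at \(t\). This is a minimal sketch and not the GEMSEO implementation: the function names andrews_curve and andrews_curves and the n_points parameter are hypothetical and introduced here only for the example.

```python
import math


def andrews_curve(sample, t):
    """Evaluate the Andrews curve of one sample at the angle t.

    Hypothetical helper: dot product of the sample with
    (1/sqrt(2), sin(t), cos(t), sin(2t), cos(2t), ...).
    """
    value = sample[0] / math.sqrt(2)
    for j, x_j in enumerate(sample[1:], start=1):
        harmonic = (j + 1) // 2  # 1, 1, 2, 2, 3, 3, ...
        # Odd positions use a sine term, even positions a cosine term.
        if j % 2 == 1:
            basis = math.sin(harmonic * t)
        else:
            basis = math.cos(harmonic * t)
        value += x_j * basis
    return value


def andrews_curves(samples, n_points=101):
    """Return a regular grid over [-pi, pi] and one list of values per sample."""
    step = 2 * math.pi / (n_points - 1)
    grid = [-math.pi + k * step for k in range(n_points)]
    curves = [[andrews_curve(sample, t) for t in grid] for sample in samples]
    return grid, curves
```

Calling andrews_curves(samples) on a list of \(n\) samples returns the grid of \(t\) values and \(n\) lists of curve values, which any plotting library can then draw as lines against the grid.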

A variable name can be passed to the DatasetPlot.execute() method by means of the classifier keyword in order to color the curves according to the value of that variable. This is useful when the data are labeled.

class AndrewsCurves(dataset, classifier)

Bases: DatasetPlot

Andrews curves.

Parameters:
  • dataset (Dataset) -- The dataset containing the data to plot.

  • classifier (str) -- The name of the variable used to group the data.

Raises:

ValueError -- If the dataset is empty.