gemseo.post.dataset.parallel_coordinates module

Draw parallel coordinates from a Dataset.

The ParallelCoordinates class implements the parallel coordinates plot, a.k.a. cobweb plot, which is a way to visualize \(n\) samples of a high-dimensional vector

\[x=(x_1,x_2,\ldots,x_d)\in\mathbb{R}^d\]

in a 2D plot by representing each sample

\[x^{(i)}=(x_1^{(i)},x_2^{(i)},\ldots,x_d^{(i)})\]

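as a piecewise-linear curve whose nodes, read from left to right, take the values \(x_1^{(i)}\), \(x_2^{(i)}\), ... and \(x_d^{(i)}\): the \(j\)-th node lies on the vertical axis associated with the \(j\)-th variable, at height \(x_j^{(i)}\).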
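The construction above can be sketched in a few lines of plain Python. This is only an illustration of the geometry, not the code used by ParallelCoordinates: the helper name polyline_nodes, the 1-based axis positions and the sample values are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def polyline_nodes(sample: Sequence[float]) -> List[Tuple[int, float]]:
    """Return the nodes of the piecewise-linear curve for one sample.

    Hypothetical helper: node j sits on the j-th vertical axis
    (abscissa j, counted from 1) and its ordinate is the j-th
    component of the sample.
    """
    return [(j, value) for j, value in enumerate(sample, start=1)]


# Two made-up samples of a 3-dimensional vector (d = 3, n = 2).
samples = [
    (0.0, 1.5, -2.0),
    (1.0, 0.5, 3.0),
]

# One list of nodes per sample; joining consecutive nodes with
# straight segments gives one curve per sample.
curves = [polyline_nodes(sample) for sample in samples]
for nodes in curves:
    print(nodes)
```

Each sample therefore contributes one curve with exactly \(d\) nodes, so the plot contains \(n\) curves in total.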

The name of a variable must be passed through the classifier argument so that the curves are colored according to the value of that variable. This is useful when the data is labeled, or when looking for the samples whose classifier value lies within an interval specified by the lower and upper arguments (by default -inf and inf, respectively). In the latter case, the color scale has only two values: one for the samples inside the interval and one for the others.
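The two-color case can be illustrated with a minimal, library-independent sketch. The function classify, the sample data and the color names below are hypothetical, and the interval is assumed to be closed; whether ParallelCoordinates treats the bounds as inclusive is not stated here.

```python
from math import inf
from typing import Dict, List


def classify(
    samples: List[Dict[str, float]],
    classifier: str,
    lower: float = -inf,
    upper: float = inf,
) -> List[bool]:
    """Flag the samples whose classifier value lies in [lower, upper].

    Hypothetical helper; the closed interval is an assumption.
    """
    return [lower <= sample[classifier] <= upper for sample in samples]


# Made-up samples with two inputs and one output used as classifier.
samples = [
    {"x1": 0.2, "x2": 1.0, "y": -0.5},
    {"x1": 0.8, "x2": 0.1, "y": 0.3},
    {"x1": 0.5, "x2": 0.7, "y": 2.4},
]

# Keep the samples whose "y" value lies between 0 and 1.
flags = classify(samples, "y", lower=0.0, upper=1.0)

# Two-valued color scale: one color inside the interval, one outside.
colors = ["tab:blue" if flag else "tab:gray" for flag in flags]
print(flags)
print(colors)
```

With the default bounds every finite value falls inside the interval, so the two-color scale is only informative once lower or upper is set to a finite value.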

class ParallelCoordinates(dataset, classifier, lower=-inf, upper=inf, **kwargs)

Bases: DatasetPlot

Parallel coordinates.

Parameters:
  • dataset (Dataset) -- The dataset containing the data to plot.

  • classifier (str) -- The name of the variable used to group the data.

  • lower (float) --

    The lower bound of the cluster.

    By default it is set to -inf.

  • upper (float) --

    The upper bound of the cluster.

    By default it is set to inf.

  • **kwargs (Any) -- The options to pass to pandas.

Raises:

ValueError -- If the dataset is empty.