Transformer pipeline example

This example creates a pipeline of transformers, applies it to a dataset, and computes its Jacobian.

from __future__ import annotations

import matplotlib.pyplot as plt
from gemseo.api import configure_logger
from gemseo.mlearning.transform.pipeline import Pipeline
from gemseo.mlearning.transform.scaler.scaler import Scaler
from numpy import allclose
from numpy import linspace
from numpy import matmul
from numpy import sin

configure_logger()
<RootLogger root (INFO)>

Create dataset

x = linspace(0, 1, 100)
data = sin(10 * x) - 3 * x

Create transformer pipeline

We create a pipeline of two transformers: the first shifts the data and the second scales it. Both are instances of Scaler. A single scaler could achieve the same result, but we apply the two transformations separately here for illustrative purposes.

shift = Scaler(offset=5)
scale = Scaler(coefficient=0.5)
pipeline = Pipeline(transformers=[shift, scale])
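
The claim that a single scaler could replace the two-step pipeline can be checked with plain NumPy. The sketch below does not use GEMSEO: `affine` is a hypothetical stand-in, and it assumes that a scaler maps the data to `offset + coefficient * data`.

```python
import numpy as np


def affine(x, offset=0.0, coefficient=1.0):
    """Hypothetical stand-in for a scaler (assumed form: offset + coefficient * x)."""
    return offset + coefficient * x


x = np.linspace(0, 1, 100)
data = np.sin(10 * x) - 3 * x

# Two-step pipeline: shift by 5, then scale by 0.5.
shifted = affine(data, offset=5.0)
shifted_and_scaled = affine(shifted, coefficient=0.5)

# Equivalent single affine map: 0.5 * (x + 5) = 2.5 + 0.5 * x.
combined = affine(data, offset=2.5, coefficient=0.5)

same = np.allclose(shifted_and_scaled, combined)
print(same)
```

Under that assumption, the combined offset is the first offset multiplied by the second coefficient (5 * 0.5 = 2.5), and the combined coefficient is the product of the two coefficients (1 * 0.5 = 0.5).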

Transform data

To use the pipeline, we first have to fit it to the data.

pipeline.fit(data)

# Transform data using the pipeline
transformed_data = pipeline.transform(data)

# Transform data using individual components of the pipeline
only_shifted_data = shift.transform(data)
WARNING - 10:43:28: The Scaler.fit() function does nothing; the instance of Scaler uses the coefficient and offset passed at its initialization
WARNING - 10:43:28: The Scaler.fit() function does nothing; the instance of Scaler uses the coefficient and offset passed at its initialization

Plot data

plt.plot(x, data, label="Original data")
plt.plot(x, transformed_data, label="Shifted and scaled data")
plt.plot(x, only_shifted_data, label="Shifted but not scaled data")
plt.legend()
plt.show()
[Figure: original data, shifted and scaled data, and shifted but not scaled data plotted against x]

Compute Jacobian

jac = pipeline.compute_jacobian(data)
only_shift_jac = shift.compute_jacobian(data)
only_scale_jac = scale.compute_jacobian(only_shifted_data)

print(jac)
print(only_shift_jac)
print(only_scale_jac)
print(allclose(jac, matmul(only_scale_jac, only_shift_jac)))
Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/gemseo/checkouts/develop/doc_src/_examples/mlearning/transformer/plot_pipeline.py", line 83, in <module>
    jac = pipeline.compute_jacobian(data)
  File "/home/docs/checkouts/readthedocs.org/user_builds/gemseo/envs/develop/lib/python3.9/site-packages/gemseo/mlearning/transform/pipeline.py", line 136, in compute_jacobian
    jacobian = matmul(transformer.compute_jacobian(data), jacobian)
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 100 is different from 1)
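
The final check in the script relies on the chain rule: the Jacobian of the pipeline should equal the Jacobian of the scale step multiplied by the Jacobian of the shift step. The following is a minimal NumPy sketch of that identity, not GEMSEO's implementation. It assumes elementwise affine maps on a small hypothetical dimension `n`. The last part reproduces the same kind of `matmul` shape mismatch as in the traceback, using shapes chosen only for illustration and not taken from GEMSEO's internals.

```python
import numpy as np

n = 4  # small hypothetical dimension, chosen for readability
coefficient = 0.5

# Jacobian of x -> x + 5 is the identity.
shift_jac = np.eye(n)
# Jacobian of y -> 0.5 * y is 0.5 times the identity.
scale_jac = coefficient * np.eye(n)

# Chain rule: Jacobian of (scale after shift) = scale_jac @ shift_jac.
pipeline_jac = np.matmul(scale_jac, shift_jac)
chain_rule_holds = np.allclose(pipeline_jac, coefficient * np.eye(n))
print(chain_rule_holds)

# matmul requires the inner dimensions to agree: (n, k) @ (k, m).
# Illustrative shapes only: a (1, 1) matrix cannot multiply a (100, 100) one.
try:
    np.matmul(np.ones((1, 1)), np.eye(100))
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```

The sketch suggests that the product of Jacobians is only defined when every factor has a shape consistent with the dimension of the data, which is the condition the error message reports as violated.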
