base_resampler module
A base class for resampling and surrogate modeling.
- class gemseo.mlearning.resampling.base_resampler.BaseResampler(sample_indices, n_splits, seed=0)
Bases: object
A base class for resampling and surrogate modeling.
- Parameters:
- execute(model, return_models=False, input_data=None, stack_predictions=True, fit_transformers=True, store_sampling_result=False)
Apply the resampling technique to a machine learning model.
- Parameters:
model (BaseMLAlgo) – The machine learning model.
return_models (bool) – Whether the sub-models resulting from resampling are returned. By default it is set to False.
input_data (ndarray | None) – The input data for the prediction, if any.
stack_predictions (bool) – Whether the sub-predictions are stacked per sub-model (first the predictions of the first sub-model, then the predictions of the second sub-model, etc.). This argument is ignored when input_data is None. By default it is set to True.
fit_transformers (bool) – Whether to re-fit the transformers. By default it is set to True.
store_sampling_result (bool) – Whether to store the sampling results in the attribute resampling_results of the original model. By default it is set to False.
- Returns:
First the sub-models resulting from resampling if return_models is True, then the predictions, either per fold or stacked.
- Raises:
ValueError – When the model is neither a supervised algorithm nor a clustering one.
- Return type:
tuple[list[BaseMLAlgo], list[ndarray] | ndarray]
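The difference between per-fold and stacked predictions can be illustrated with a minimal sketch in plain Python. This is not GEMSEO code: the sub-models are stand-in callables rather than BaseMLAlgo objects, and the helpers predict_per_fold and stack are hypothetical names introduced only for this illustration.

```python
def predict_per_fold(sub_models, input_data):
    """Return one list of predictions per sub-model (hypothetical helper)."""
    return [[model(x) for x in input_data] for model in sub_models]


def stack(per_fold):
    """Concatenate the per-fold predictions, first sub-model first."""
    return [prediction for fold in per_fold for prediction in fold]


# Stand-ins for the sub-models obtained by resampling (assumption).
sub_models = [lambda x: x + 1.0, lambda x: 2.0 * x]
input_data = [0.0, 1.0, 2.0]

per_fold = predict_per_fold(sub_models, input_data)  # one list per sub-model
stacked = stack(per_fold)  # a single flat sequence
```

With stacking, the predictions of the first sub-model come first, followed by those of the second, which mirrors the ordering described for stack_predictions.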
- plot(file_path='', show=True, colors=('b', 'r'))
Plot the train-test splits.
- Parameters:
- Returns:
The visualization.
- Return type:
- name: str
The name of the resampler.
Use the class name by default.
- property sample_indices: NDArray[int]
The indices of the samples after shuffling.
- property seed: int
The seed to initialize the random generator.
- property splits: Splits
The train-test splits resulting from the splitting of the samples.
A train-test split is a partition whose first component contains the indices of the learning samples and the second one the indices of the test samples.
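To make the structure of a train-test split concrete, here is a minimal sketch in plain Python. It is not GEMSEO's implementation: it assumes a K-fold-like scheme in which each fold serves once as the test set, and the Split container and the make_splits function are hypothetical names chosen for illustration.

```python
import random
from typing import NamedTuple


class Split(NamedTuple):
    """Hypothetical container for one train-test split."""

    train: list  # indices of the learning samples
    test: list  # indices of the test samples


def make_splits(sample_indices, n_splits, seed=0):
    """Shuffle the indices with a seeded generator and build the splits."""
    indices = list(sample_indices)
    random.Random(seed).shuffle(indices)
    # Cut the shuffled indices into n_splits disjoint folds.
    folds = [indices[i::n_splits] for i in range(n_splits)]
    splits = []
    for i, test in enumerate(folds):
        # The learning set gathers every fold except the i-th one.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append(Split(train=train, test=test))
    return splits


splits = make_splits(range(10), n_splits=5, seed=0)
```

In this sketch, each split partitions the ten sample indices into learning and test indices, and reusing the same seed reproduces the same shuffling and therefore the same splits.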