quality_measure module
This module provides the base class to measure the quality of machine learning algorithms.
The concept of quality measure is implemented with the MLQualityMeasure
class.
Classes:
MLQualityMeasure
An abstract quality measure for machine learning algorithms.
MLQualityMeasureFactory
A factory of MLQualityMeasure.
Functions:
shuffle
Modify a sequence in-place by shuffling its contents.
- class gemseo.mlearning.qual_measure.quality_measure.MLQualityMeasure(algo)
Bases:
object
An abstract quality measure for machine learning algorithms.
- Parameters
algo (MLAlgo) – A machine learning algorithm.
- Return type
None
Attributes:
Methods:
evaluate
([method, samples])Evaluate the quality measure.
evaluate_bootstrap
([n_replicates, samples, ...])Evaluate the quality measure using the bootstrap technique.
evaluate_kfolds
([n_folds, samples, ...])Evaluate the quality measure using the k-folds technique.
evaluate_learn
([samples, multioutput])Evaluate the quality measure using the learning dataset.
evaluate_loo
([samples, multioutput])Evaluate the quality measure using the leave-one-out technique.
evaluate_test
(test_data[, samples, multioutput])Evaluate the quality measure using a test dataset.
is_better
(val1, val2)Compare the quality between two values.
- BOOTSTRAP = 'bootstrap'
- KFOLDS = 'kfolds'
- LEARN = 'learn'
- LOO = 'loo'
- SMALLER_IS_BETTER = True
- TEST = 'test'
- evaluate(method='learn', samples=None, **options)
Evaluate the quality measure.
- Parameters
method (str) –
The name of the method to evaluate the quality measure.
By default it is set to learn.
samples (Optional[Sequence[int]]) –
The indices of the learning samples. If None, use the whole learning dataset.
By default it is set to None.
**options (Optional[Union[Sequence[int], bool, int, gemseo.core.dataset.Dataset]]) – The options of the estimation method (e.g. 'test_data' for the test method, 'n_replicates' for the bootstrap one, ...).
- Returns
The value of the quality measure.
- Raises
ValueError – If the name of the method is unknown.
- Return type
Union[float, numpy.ndarray]
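The dispatch described for evaluate can be pictured with a minimal sketch. ToyQualityMeasure is a hypothetical stand-in, not the GEMSEO class, and its evaluate_* methods return placeholder values; only the method names, the constants and the ValueError come from the documentation above.

```python
class ToyQualityMeasure:
    """Hypothetical stand-in for a quality measure (not the GEMSEO class)."""

    LEARN = "learn"
    KFOLDS = "kfolds"

    def evaluate(self, method=LEARN, samples=None, **options):
        # Map each method name to the matching evaluate_* method.
        dispatch = {
            self.LEARN: self.evaluate_learn,
            self.KFOLDS: self.evaluate_kfolds,
        }
        if method not in dispatch:
            raise ValueError(f"The method '{method}' is unknown.")
        # The remaining options are forwarded to the estimation method.
        return dispatch[method](samples=samples, **options)

    def evaluate_learn(self, samples=None, multioutput=True):
        return 0.0  # placeholder value for illustration

    def evaluate_kfolds(self, n_folds=5, samples=None, multioutput=True):
        return float(n_folds)  # placeholder value for illustration


measure = ToyQualityMeasure()
learn_value = measure.evaluate()
kfolds_value = measure.evaluate("kfolds", n_folds=3)
```

With this layout, evaluate("kfolds", n_folds=3) is equivalent to calling evaluate_kfolds(n_folds=3) directly.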
- evaluate_bootstrap(n_replicates=100, samples=None, multioutput=True)
Evaluate the quality measure using the bootstrap technique.
- Parameters
n_replicates (int) –
The number of bootstrap replicates.
By default it is set to 100.
samples (Optional[Sequence[int]]) –
The indices of the learning samples. If None, use the whole learning dataset.
By default it is set to None.
multioutput (bool) –
If True, return the quality measure for each output component. Otherwise, average these measures.
By default it is set to True.
- Returns
The value of the quality measure.
- Return type
NoReturn
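As an illustration of the bootstrap technique in general, the sketch below resamples the data with replacement, fits a trivial mean predictor on the drawn samples and measures the squared error on the samples left out. The function bootstrap_mse, the mean predictor and the seed argument are assumptions made for this example; a concrete quality measure may proceed differently.

```python
import random


def bootstrap_mse(y, n_replicates=100, seed=0):
    """Average out-of-bag MSE of a mean predictor over bootstrap replicates."""
    rng = random.Random(seed)
    n = len(y)
    errors = []
    for _ in range(n_replicates):
        # Draw n indices with replacement: the in-bag (training) samples.
        in_bag = [rng.randrange(n) for _ in range(n)]
        # The samples never drawn form the out-of-bag (validation) set.
        out_of_bag = sorted(set(range(n)) - set(in_bag))
        if not out_of_bag:
            continue  # nothing left to validate on for this replicate
        prediction = sum(y[i] for i in in_bag) / n
        squared = [(y[i] - prediction) ** 2 for i in out_of_bag]
        errors.append(sum(squared) / len(squared))
    return sum(errors) / len(errors)


constant_error = bootstrap_mse([1.0] * 10)
varying_error = bootstrap_mse([float(i) for i in range(10)])
```

A constant output is predicted exactly by the mean, so its error is zero, while a varying output yields a positive error.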
- evaluate_kfolds(n_folds=5, samples=None, multioutput=True, randomize=False)
Evaluate the quality measure using the k-folds technique.
- Parameters
n_folds (int) –
The number of folds.
By default it is set to 5.
samples (Optional[Sequence[int]]) –
The indices of the learning samples. If None, use the whole learning dataset.
By default it is set to None.
multioutput (bool) –
If True, return the quality measure for each output component. Otherwise, average these measures.
By default it is set to True.
randomize (bool) –
Whether to shuffle the samples before dividing them in folds.
By default it is set to False.
- Returns
The value of the quality measure.
- Return type
NoReturn
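The role of n_folds and randomize can be sketched as follows. The helpers kfold_indices and kfold_splits, as well as the seed argument, are hypothetical names introduced for this example; they only show how sample indices may be divided into folds, optionally after shuffling.

```python
import random


def kfold_indices(n_samples, n_folds=5, randomize=False, seed=None):
    """Split the indices 0..n_samples-1 into n_folds nearly equal folds."""
    indices = list(range(n_samples))
    if randomize:
        # Shuffle the samples before dividing them into folds.
        random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds: sizes differ by at most one.
    base, extra = divmod(n_samples, n_folds)
    folds, start = [], 0
    for k in range(n_folds):
        stop = start + base + (1 if k < extra else 0)
        folds.append(indices[start:stop])
        start = stop
    return folds


def kfold_splits(n_samples, n_folds=5, randomize=False, seed=None):
    """Yield (train, test) index lists, each fold being the test set once."""
    folds = kfold_indices(n_samples, n_folds, randomize, seed)
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


folds = kfold_indices(10, n_folds=3)
splits = list(kfold_splits(10, n_folds=3))
```

Each sample is used exactly once for validation; the quality measure is then typically aggregated over the folds.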
- evaluate_learn(samples=None, multioutput=True)
Evaluate the quality measure using the learning dataset.
- Parameters
samples (Optional[Sequence[int]]) –
The indices of the learning samples. If None, use the whole learning dataset.
By default it is set to None.
multioutput (bool) –
Whether to return the quality measure for each output component. If not, average these measures.
By default it is set to True.
- Returns
The value of the quality measure.
- Return type
NoReturn
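The effect of the multioutput argument can be illustrated with a mean squared error, chosen here only as an example of a measure. The function below is a hypothetical helper working on plain lists of output vectors, not a GEMSEO function.

```python
def mean_squared_error(y_true, y_pred, multioutput=True):
    """Return the MSE per output component, or its average over components."""
    n_samples = len(y_true)
    n_outputs = len(y_true[0])
    per_output = [
        sum((t[j] - p[j]) ** 2 for t, p in zip(y_true, y_pred)) / n_samples
        for j in range(n_outputs)
    ]
    if multioutput:
        return per_output  # one value per output component
    return sum(per_output) / n_outputs  # a single averaged value


y_true = [[0.0, 0.0], [0.0, 0.0]]
y_pred = [[1.0, 2.0], [1.0, 2.0]]
per_component = mean_squared_error(y_true, y_pred, multioutput=True)
averaged = mean_squared_error(y_true, y_pred, multioutput=False)
```

This also explains why the return type of a quality measure may be either a float or an array.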
- evaluate_loo(samples=None, multioutput=True)
Evaluate the quality measure using the leave-one-out technique.
- Parameters
samples (Optional[Sequence[int]]) –
The indices of the learning samples. If None, use the whole learning dataset.
By default it is set to None.
multioutput (bool) –
If True, return the quality measure for each output component. Otherwise, average these measures.
By default it is set to True.
- Returns
The value of the quality measure.
- Return type
Union[float, numpy.ndarray]
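Leave-one-out is the special case of k-folds where the number of folds equals the number of samples. A minimal sketch, assuming a trivial mean predictor and a squared error (both chosen for illustration only, as is the name loo_mse):

```python
def loo_mse(y):
    """Leave-one-out MSE of a mean predictor on a list of scalar outputs."""
    n = len(y)
    total = sum(y)
    errors = []
    for i in range(n):
        # Fit on all samples but the i-th one, then predict the i-th one.
        prediction = (total - y[i]) / (n - 1)
        errors.append((y[i] - prediction) ** 2)
    return sum(errors) / n


loo_value = loo_mse([0.0, 2.0, 4.0])
```

For these three samples the held-out predictions are 3, 2 and 1, so the squared errors are 9, 0 and 9.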
- evaluate_test(test_data, samples=None, multioutput=True)
Evaluate the quality measure using a test dataset.
- Parameters
test_data (gemseo.core.dataset.Dataset) – The test dataset.
samples (Optional[Sequence[int]]) –
The indices of the learning samples. If None, use the whole learning dataset.
By default it is set to None.
multioutput (bool) –
If True, return the quality measure for each output component. Otherwise, average these measures.
By default it is set to True.
- Returns
The value of the quality measure.
- Return type
NoReturn
- classmethod is_better(val1, val2)
Compare the quality between two values.
This method returns True if the first one is better than the second one.
For most measures, a smaller value is “better” than a larger one (MSE etc.). But for some, like an R2-measure, higher values are better than smaller ones. This comparison method correctly handles this, regardless of the type of measure.
- Parameters
val1 (float) – The value of the first quality measure.
val2 (float) – The value of the second quality measure.
- Returns
Whether val1 is of better quality than val2.
- Return type
bool
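One way to obtain this behaviour is to let the comparison depend on the SMALLER_IS_BETTER class attribute. The two classes below are hypothetical and only sketch the idea; they are not the GEMSEO classes.

```python
class SmallerIsBetterMeasure:
    """Hypothetical error-like measure, e.g. a mean squared error."""

    SMALLER_IS_BETTER = True

    @classmethod
    def is_better(cls, val1, val2):
        # The direction of the comparison follows the class attribute.
        if cls.SMALLER_IS_BETTER:
            return val1 < val2
        return val1 > val2


class LargerIsBetterMeasure(SmallerIsBetterMeasure):
    """Hypothetical score-like measure, e.g. an R2 coefficient."""

    SMALLER_IS_BETTER = False


error_comparison = SmallerIsBetterMeasure.is_better(0.1, 0.5)
score_comparison = LargerIsBetterMeasure.is_better(0.9, 0.5)
```

Calling code can then compare two values without knowing which kind of measure it handles.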
- class gemseo.mlearning.qual_measure.quality_measure.MLQualityMeasureFactory(*args, **kwargs)
Bases:
gemseo.core.factory.Factory
A factory of MLQualityMeasure.
- Parameters
base_class – The base class to be considered.
module_names – The fully qualified modules names to be searched.
Attributes:
classes
Return the available classes.
Methods:
create
(class_name, **options)Return an instance of a class.
get_class
(name)Return a class from its name.
get_default_options_values
(name)Return the constructor kwargs default values of a class.
get_default_sub_options_values
(name, **options)Return the default values of the sub options of a class.
get_options_doc
(name)Return the constructor documentation of a class.
get_options_grammar
(name[, write_schema, ...])Return the options JSON grammar for a class.
get_sub_options_grammar
(name, **options)Return the JSONGrammar of the sub options of a class.
is_available
(name)Return whether a class can be instantiated.
update
()Search for the classes that can be instantiated.
- PLUGIN_ENTRY_POINT = 'gemseo_plugins'
- static cache_clear()
- property classes
Return the available classes.
- Returns
The sorted names of the available classes.
- create(class_name, **options)
Return an instance of a class.
- Parameters
class_name (str) – The name of the class.
**options (Any) – The arguments to be passed to the class constructor.
- Returns
The instance of the class.
- Raises
TypeError – If the class cannot be instantiated.
- Return type
Any
- get_class(name)
Return a class from its name.
- Parameters
name (str) – The name of the class.
- Returns
The class.
- Raises
ImportError – If the class is not available.
- Return type
Type[Any]
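The relationship between classes, is_available, get_class and create can be sketched with a toy factory. ToyFactory and its register method are hypothetical: this sketch relies on explicit registration, whereas the documented factory searches modules, plugins and environment variables for the available classes.

```python
class ToyFactory:
    """Hypothetical factory with explicit registration (not the GEMSEO one)."""

    def __init__(self, base_class):
        self._base_class = base_class
        self._classes = {}

    def register(self, cls):
        # Assumption: classes are registered by hand instead of discovered.
        if not issubclass(cls, self._base_class):
            raise TypeError(f"{cls.__name__} does not derive from the base class.")
        self._classes[cls.__name__] = cls

    @property
    def classes(self):
        return sorted(self._classes)

    def is_available(self, name):
        return name in self._classes

    def get_class(self, name):
        if not self.is_available(name):
            raise ImportError(f"The class {name} is not available.")
        return self._classes[name]

    def create(self, class_name, **options):
        # The options are passed to the class constructor.
        return self.get_class(class_name)(**options)


class Measure:
    def __init__(self, algo=None):
        self.algo = algo


class ToyError(Measure):
    pass


factory = ToyFactory(Measure)
factory.register(ToyError)
instance = factory.create("ToyError", algo="some_algo")
```

Here create simply looks the class up by name and instantiates it with the given options.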
- get_default_options_values(name)
Return the constructor kwargs default values of a class.
- Parameters
name (str) – The name of the class.
- Returns
The mapping from the argument names to their default values.
- Return type
Dict[str, Union[str, int, float, bool]]
- get_default_sub_options_values(name, **options)
Return the default values of the sub options of a class.
- Parameters
name (str) – The name of the class.
**options (str) – The options to be passed to the class required to deduce the sub options.
- Returns
The JSON grammar.
- Return type
- get_options_doc(name)
Return the constructor documentation of a class.
- Parameters
name (str) – The name of the class.
- Returns
The mapping from the argument names to their documentation.
- Return type
Dict[str, str]
- get_options_grammar(name, write_schema=False, schema_file=None)
Return the options JSON grammar for a class.
Attempt to generate a JSONGrammar from the arguments of the __init__ method of the class.
- Parameters
name (str) – The name of the class.
write_schema (bool) –
If True, write the JSON schema to a file.
By default it is set to False.
schema_file (Optional[str]) –
The path to the JSON schema file. If None, the file is saved in the current directory in a file named after the name of the class.
By default it is set to None.
- Returns
The JSON grammar.
- Return type
JSONGrammar
- get_sub_options_grammar(name, **options)
Return the JSONGrammar of the sub options of a class.
- Parameters
name (str) – The name of the class.
**options (str) – The options to be passed to the class required to deduce the sub options.
- Returns
The JSON grammar.
- Return type
JSONGrammar
- is_available(name)
Return whether a class can be instantiated.
- Parameters
name (str) – The name of the class.
- Returns
Whether the class can be instantiated.
- Return type
bool
- update()
Search for the classes that can be instantiated.
The search is done in the following order:
1. The fully qualified module names
2. The plugin packages
3. The packages from the environment variables
- Return type
None
- gemseo.mlearning.qual_measure.quality_measure.shuffle(x)
Modify a sequence in-place by shuffling its contents.
This function only shuffles the array along the first axis of a multi-dimensional array. The order of sub-arrays is changed but their contents remain the same.
Note
New code should use the shuffle method of a default_rng() instance instead; please see the random-quick-start.
- Parameters
x (ndarray or MutableSequence) – The array, list or mutable sequence to be shuffled.
- Returns
- Return type
None
See also
Generator.shuffle
which should be used for new code.
Examples
>>> import numpy as np
>>> arr = np.arange(10)
>>> np.random.shuffle(arr)
>>> arr
[1 7 5 2 9 4 3 6 0 8] # random
Multi-dimensional arrays are only shuffled along the first axis:
>>> arr = np.arange(9).reshape((3, 3))
>>> np.random.shuffle(arr)
>>> arr
array([[3, 4, 5], # random
       [6, 7, 8],
       [0, 1, 2]])
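The note above recommends the shuffle method of a default_rng() instance for new code. A minimal sketch of that usage, where the seed value 42 is an arbitrary choice made for reproducibility:

```python
import numpy as np

# A seeded Generator gives reproducible shuffles; 42 is an arbitrary seed.
rng = np.random.default_rng(42)

arr = np.arange(9).reshape((3, 3))
rng.shuffle(arr)  # in-place, along the first axis only

# The rows are reordered, but the content of each row is unchanged.
rows = sorted(tuple(row) for row in arr.tolist())
```

As with np.random.shuffle, only the order of the sub-arrays changes.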