CrossValidationReport#
- class skore.CrossValidationReport(estimator, X, y=None, cv_splitter=None, n_jobs=None)[source]#
Report for cross-validation results.
Upon initialization,
CrossValidationReport
will cloneestimator
according tocv_splitter
and fit the generated estimators. The fitting is done in parallel, and can be interrupted: the estimators that have been fitted can be accessed even if the full cross-validation process did not complete. In particular,KeyboardInterrupt
exceptions are swallowed and will only interrupt the cross-validation process, rather than the entire program.- Parameters:
- estimatorestimator object
Estimator to make the cross-validation report from.
- X{array-like, sparse matrix} of shape (n_samples, n_features)
The data to fit. Can be for example a list, or an array.
- yarray-like of shape (n_samples,) or (n_samples, n_outputs), default=None
The target variable to try to predict in the case of supervised learning.
- cv_splitterint, cross-validation generator or an iterable, default=5
Determines the cross-validation splitting strategy. Possible inputs for
cv_splitter
are:int, to specify the number of folds in a
(Stratified)KFold
,a scikit-learn CV splitter,
An iterable yielding (train, test) splits as arrays of indices.
For int/None inputs, if the estimator is a classifier and
y
is either binary or multiclass,StratifiedKFold
is used. In all other cases,KFold
is used. These splitters are instantiated withshuffle=False
so the splits will be the same across calls.Refer to scikit-learn’s User Guide for the various cross-validation strategies that can be used here.
- n_jobsint, default=None
Number of jobs to run in parallel. Training the estimator and computing the score are parallelized over the cross-validation splits. When accessing some methods of the
CrossValidationReport
, then_jobs
parameter is used to parallelize the computation.None
means 1 unless in ajoblib.parallel_backend
context.-1
means using all processors.
- Attributes:
- estimator_estimator object
The cloned or copied estimator.
- estimator_name_str
The name of the estimator.
- estimator_reports_list of EstimatorReport
The estimator reports for each split.
See also
skore.EstimatorReport
Report for a fitted estimator.
Examples
>>> from sklearn.datasets import make_classification >>> from sklearn.linear_model import LogisticRegression >>> X, y = make_classification(random_state=42) >>> estimator = LogisticRegression() >>> from skore import CrossValidationReport >>> report = CrossValidationReport(estimator, X=X, y=y, cv_splitter=2)
- cache_predictions(response_methods='auto', n_jobs=None)[source]#
Cache the predictions for sub-estimators reports.
- Parameters:
- response_methods{“auto”, “predict”, “predict_proba”, “decision_function”}, default=”auto
The methods to use to compute the predictions.
- n_jobsint, default=None
The number of jobs to run in parallel. If
None
, we use then_jobs
parameter when initializingCrossValidationReport
.
Examples
>>> from sklearn.datasets import load_breast_cancer >>> from sklearn.linear_model import LogisticRegression >>> from skore import CrossValidationReport >>> X, y = load_breast_cancer(return_X_y=True) >>> classifier = LogisticRegression(max_iter=10_000) >>> report = CrossValidationReport(classifier, X=X, y=y, cv_splitter=2) >>> report.cache_predictions() >>> report._cache {...}
- clear_cache()[source]#
Clear the cache.
Examples
>>> from sklearn.datasets import load_breast_cancer >>> from sklearn.linear_model import LogisticRegression >>> from skore import CrossValidationReport >>> X, y = load_breast_cancer(return_X_y=True) >>> classifier = LogisticRegression(max_iter=10_000) >>> report = CrossValidationReport(classifier, X=X, y=y, cv_splitter=2) >>> report.cache_predictions() >>> report.clear_cache() >>> report._cache {}