ModelPlotPy

class scikitplot.modelplotpy.ModelPlotPy(feature_data=[], label_data=[], dataset_labels=[], models=[], model_labels=[], ntiles=10, seed=0)

ModelPlotPy decile analysis.

Parameters:

feature_data : list of objects (n_datasets, )

Objects containing the X matrix for one or more datasets.

label_data : list of objects (n_datasets, )

Objects containing the y vector for one or more datasets.

dataset_labels : list of str (n_datasets, )

Names of the feature_data and label_data pairs.

models : list of objects (n_models, )

The scikit-learn model objects.

model_labels : list of str (n_models, )

Names of the scikit-learn models.

ntiles : int, default=10

The number of splits. Range is (2, inf]:

- 10 splits are called deciles
- 100 splits are called percentiles
- any other number of splits is called an ntile

seed : int, default=0

Seed that makes the splits reproducible.

Changed in version 0.3.9: Default changed from 999 to 0.

Raises:

ValueError: If the input does not match any element of the complete list of valid values.
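To make the ntiles parameter concrete, the following is a minimal sketch of how scores could be cut into near-equal groups, with ntile 1 holding the highest scores. The helper `assign_ntiles` is hypothetical and written in plain Python for illustration only; it is not part of scikitplot and does not claim to reproduce the library's internal binning.

```python
def assign_ntiles(scores, ntiles=10):
    """Rank scores from high to low and cut them into `ntiles` near-equal groups.

    Returns a list of ntile numbers (1 = highest scores) aligned with `scores`.
    Hypothetical helper for illustration, not the library's implementation.
    """
    n = len(scores)
    # Indices ordered by descending score; sorted() is stable, so ties keep input order.
    order = sorted(range(n), key=lambda i: -scores[i])
    result = [0] * n
    for rank, idx in enumerate(order):
        # Map rank 0..n-1 onto group 1..ntiles.
        result[idx] = rank * ntiles // n + 1
    return result


scores = [0.91, 0.15, 0.78, 0.42, 0.66, 0.08, 0.83, 0.30, 0.55, 0.97]
deciles = assign_ntiles(scores, ntiles=10)   # ten groups of one observation each
quintiles = assign_ntiles(scores, ntiles=5)  # five groups of two observations each
```

With ten observations, ntiles=10 puts one observation in each decile, while ntiles=5 puts two in each group.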

aggregate_over_ntiles()

Create eval_t_tot

Builds the pandas DataFrame eval_t_tot, which contains the aggregated output. The data is aggregated over the datasets (feature and label data pairs) and the list of models.

Parameters:

feature_data : list of objects (n_datasets, ): Objects containing the X matrix for one or more datasets.
label_data : list of objects (n_datasets, ): Objects containing the y vector for one or more datasets.
dataset_labels : list of str (n_datasets, ): Names of the feature_data and label_data pairs.
models : list of objects (n_models, ): The scikit-learn model objects.
model_labels : list of str (n_models, ): Names of the scikit-learn models.
ntiles : int, default=10: The number of splits. 10 splits are called deciles, 100 are called percentiles, and any other number is called an ntile. Range is (2, inf].
seed : int, default=0: Seed that makes the splits reproducible.

Changed in version 0.3.9: Default changed from 999 to 0.

Returns:

pandas.DataFrame: DataFrame with every combination of dataset, model, target value and ntile. It already contains almost all the information needed for model plotting.

Raises:

ValueError: If the input does not match any element of the complete list of valid values.
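As an illustration of the kind of aggregation described above, the sketch below counts observations and target-class hits per ntile and derives cumulative gains and cumulative lift. The function `summarize_ntiles`, its input format and its output keys are assumptions made for this example; the actual columns of eval_t_tot may differ.

```python
def summarize_ntiles(rows, ntiles):
    """Aggregate (ntile, is_target) pairs into per-ntile and cumulative figures.

    Hypothetical helper: assumes every ntile from 1 to `ntiles` has at least
    one row and that the target class occurs at least once.
    """
    totals = [0] * ntiles
    positives = [0] * ntiles
    for ntile, is_target in rows:
        totals[ntile - 1] += 1
        positives[ntile - 1] += int(is_target)

    all_total = sum(totals)
    all_pos = sum(positives)
    if all_pos == 0 or 0 in totals:
        raise ValueError("every ntile needs rows and the target class must occur")

    base_rate = all_pos / all_total  # overall share of the target class
    summary = []
    cum_total = 0
    cum_pos = 0
    for i in range(ntiles):
        cum_total += totals[i]
        cum_pos += positives[i]
        summary.append({
            "ntile": i + 1,
            "tot": totals[i],
            "pos": positives[i],
            # Share of all target-class rows captured up to this ntile.
            "cumgain": cum_pos / all_pos,
            # Target rate up to this ntile relative to the overall rate.
            "cumlift": (cum_pos / cum_total) / base_rate,
        })
    return summary


# Eight observations in four ntiles; the second element flags the target class.
rows = [(1, 1), (1, 1), (2, 1), (2, 0), (3, 0), (3, 0), (4, 0), (4, 1)]
summary = summarize_ntiles(rows, ntiles=4)
```

In this toy data the first ntile captures half of the target class, so its cumulative gain is 0.5 and its cumulative lift is 2.0.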

get_params(): Get parameters of the model plots object.

Added in version 0.3.9.

plotting_scope(scope='no_comparison', select_model_label=[], select_dataset_label=[], select_targetclass=[], select_smallest_targetclass=True)

Create plot_input

Builds the pandas DataFrame plot_input, a subset of scores_and_ntiles. Which subset is taken depends on which of the four evaluation types the user requests.

Changed in version 0.3.9: Parameters have been reorganized.

Parameters:

scope : {'no_comparison', 'compare_models', 'compare_datasets', 'compare_targetclasses'}, default='no_comparison'

The perspective from which the model plots are evaluated. There are four perspectives.

scope='no_comparison'
Shows a single plot from the viewpoint of:
- 1 dataset
- 1 model
- 1 target class
scope='compare_models'
Shows plots from the viewpoint of:
- 2 or more different models
- 1 dataset
- 1 target class
scope='compare_datasets'
Shows plots from the viewpoint of:
- 2 or more different datasets
- 1 model
- 1 target class
scope='compare_targetclasses'
Shows plots from the viewpoint of:
- 2 or more different target classes
- 1 dataset
- 1 model

select_model_label : list of str

List of one or more elements from the model_labels parameter.

select_dataset_label : list of str

List of one or more elements from the dataset_labels parameter.

select_targetclass : list of str

List of one or more elements from the label data.

select_smallest_targetclass : bool, default=True

Whether the plot should contain only the results of the smallest target class. If True, the target class is determined from the first dataset.

Returns:

pandas.DataFrame: Pandas dataframe, a subset of scores_and_ntiles, for all dataset, model and target value combinations for all ntiles. It contains all necessary information for model plotting.

Raises:

ValueError: If the wrong scope value is specified.

Return type:

pandas.DataFrame
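The four scopes described above each hold two of the three dimensions (model, dataset, target class) to a single value and let at most one vary. The following is a minimal sketch of that selection logic over a list of plain dictionaries. The helper `select_for_scope`, the `SCOPE_VARIES` table and the row layout are hypothetical; in particular, falling back to the first requested value for a fixed dimension is an assumption, not documented library behaviour.

```python
from itertools import product

# Which dimension varies under each scope; the other dimensions are held to one value.
SCOPE_VARIES = {
    "no_comparison": None,
    "compare_models": "model",
    "compare_datasets": "dataset",
    "compare_targetclasses": "targetclass",
}


def select_for_scope(rows, scope, models, datasets, targetclasses):
    """Keep only the rows that belong in a plot for the given scope."""
    if scope not in SCOPE_VARIES:
        raise ValueError(f"invalid scope: {scope!r}")
    varies = SCOPE_VARIES[scope]
    requested = {"model": models, "dataset": datasets, "targetclass": targetclasses}
    # Assumption: a fixed dimension uses the first requested value.
    selected = {
        dim: list(values) if dim == varies else list(values)[:1]
        for dim, values in requested.items()
    }
    return [row for row in rows if all(row[dim] in selected[dim] for dim in selected)]


models = ["rf", "mnl"]
datasets = ["train", "test"]
targetclasses = ["yes", "no"]
rows = [
    {"model": m, "dataset": d, "targetclass": t}
    for m, d, t in product(models, datasets, targetclasses)
]

single = select_for_scope(rows, "no_comparison", models, datasets, targetclasses)
by_model = select_for_scope(rows, "compare_models", models, datasets, targetclasses)
```

Here `single` holds one model, dataset and target class combination, while `by_model` holds one row per model for the same dataset and target class. An unknown scope raises ValueError, mirroring the Raises entry above.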

prepare_scores_and_ntiles()

Create eval_tot

Builds the pandas DataFrame eval_tot, which contains the actual and predicted values for each feature and label data pair (named by dataset_labels). It loops over the different models, named by model_labels.

Parameters:

feature_data : list of objects (n_datasets, ): Objects containing the X matrix for one or more datasets.
label_data : list of objects (n_datasets, ): Objects containing the y vector for one or more datasets.
dataset_labels : list of str (n_datasets, ): Names of the feature_data and label_data pairs.
models : list of objects (n_models, ): The scikit-learn model objects.
model_labels : list of str (n_models, ): Names of the scikit-learn models.
ntiles : int, default=10: The number of splits. 10 splits are called deciles, 100 are called percentiles, and any other number is called an ntile. Range is (2, inf].
seed : int, default=0: Seed that makes the splits reproducible.

Changed in version 0.3.9: Default changed from 999 to 0.

Returns:

scores_and_ntiles : pandas.DataFrame: DataFrame with all the given information; for each target class it contains a prediction and an ntile. For each ntile, a small value (based on the seed) is added and normalized to make the results reproducible.

Raises:

ValueError: If the input does not match any element of the complete list of valid values.
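One reading of the seed-based adjustment mentioned above is that a tiny seeded perturbation is added to the scores and the result is rescaled to the range 0 to 1, so that tied scores fall into ntiles in a reproducible way. The sketch below illustrates that idea in plain Python. The helper `jitter_and_rescale`, the noise scale of 1e-6 and the use of `random.Random` are assumptions for this example and are not taken from the library.

```python
import random


def jitter_and_rescale(scores, seed=0, scale=1e-6):
    """Add a tiny seeded perturbation to each score, then rescale to [0, 1].

    Hypothetical helper: the noise scale and generator are illustrative choices.
    """
    rng = random.Random(seed)  # private generator: same seed, same noise
    jittered = [s + rng.random() * scale for s in scores]
    lo, hi = min(jittered), max(jittered)
    if hi == lo:
        # Degenerate input (for example a single score): nothing to rescale.
        return [0.0 for _ in jittered]
    return [(v - lo) / (hi - lo) for v in jittered]


scores = [0.5, 0.5, 0.5, 0.9, 0.1]  # three tied scores
first = jitter_and_rescale(scores, seed=0)
second = jitter_and_rescale(scores, seed=0)
```

Calling the helper twice with the same seed yields identical values, and the three tied scores become distinct, so a subsequent split into ntiles no longer depends on arbitrary tie ordering.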

reset_params(): Reset all parameters to default values.

Added in version 0.3.9.

set_params(**params): Set parameters of the model plots object.

Added in version 0.3.9.

Gallery examples

Introduction to modelplotpy
