decile_table

scikitplot.kds.decile_table(y_true, y_probas, *args, pos_label=None, class_index=1, change_deciles=10, digits=3, labels=True, **kwargs)

Generates the decile table from labels and probabilities.

The decile table is created by first sorting the customers by their predicted probabilities in decreasing order, from highest (closest to one) to lowest (closest to zero). The customers are then split into equally sized segments, for example 10 decile groups that each contain 10% of the customer base.

Added in version 0.3.9.
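
The sorting and splitting described above can be sketched in plain Python. This is a minimal illustration, not scikitplot's implementation: the helper name sketch_decile_groups, the toy data, and the integer-boundary splitting rule are assumptions, and the library may break ties or size its groups differently.

```python
def sketch_decile_groups(y_true, y_score, n_groups=10):
    """Sort by score (descending) and split into near-equal groups.

    Hypothetical helper for illustration only; assumes binary 0/1 labels
    and at least n_groups samples.
    """
    # Pair each score with its label and sort from highest to lowest score.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    rows = []
    for g in range(n_groups):
        # Integer boundaries keep group sizes within one of each other.
        start = g * n // n_groups
        stop = (g + 1) * n // n_groups
        group = ranked[start:stop]
        scores = [score for score, _ in group]
        rows.append({
            "decile": g + 1,
            "prob_min": min(scores),
            "prob_max": max(scores),
            "cnt_cust": len(group),
            "cnt_resp": sum(label for _, label in group),
        })
    return rows


# Toy data: 20 samples with scores 0.05, 0.10, ..., 1.00;
# a sample is a responder (label 1) when its score exceeds 0.5.
scores = [round(0.05 * i, 2) for i in range(1, 21)]
labels = [1 if s > 0.5 else 0 for s in scores]

table = sketch_decile_groups(labels, scores, n_groups=10)
for row in table:
    print(row)
```

With this toy data every group holds two samples, and all responders land in the first five groups, which is the pattern a well-ranked model produces.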

Parameters:
y_true : array-like, shape (n_samples,)

Ground truth (correct/actual) target values.

y_probas : array-like, shape (n_samples, n_classes)

Prediction probabilities for each class returned by a classifier/algorithm.

pos_label : scalar, optional

The positive label for binary classification. If None, it defaults to classes[1].

Added in version 0.3.9.

class_index : int, optional

Index of the class for which to extract probabilities in the multi-class case. If None, returns all class probabilities in the 2D case. Ignored if y_probas is 1D.

Added in version 0.3.9.

change_deciles : int, optional, default=10

The number of partitions for creating the table. Defaults to 10 for deciles.

digits : int, optional, default=3

The decimal precision for the result.

Added in version 0.3.9.

labels : bool, optional, default=True

If True, prints a legend for the abbreviations of decile table column names.

**kwargs : dict, optional

Added in version 0.3.9.

Returns:
pandas.DataFrame

The dataframe dt (decile-table) with the deciles and related information.

See also

print_labels

A legend for the abbreviations of decile table column names.

References

[1] tensorbored/kds

Examples

>>> from sklearn.datasets import (
...     load_breast_cancer as data_2_classes,
... )
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.tree import DecisionTreeClassifier
>>> import scikitplot as skplt
>>> X, y = data_2_classes(return_X_y=True, as_frame=True)
>>> X_train, X_test, y_train, y_test = train_test_split(
...     X, y, test_size=0.5, random_state=0
... )
>>> clf = DecisionTreeClassifier(max_depth=1, random_state=0).fit(
...     X_train, y_train
... )
>>> y_prob = clf.predict_proba(X_test)
>>> skplt.kds.decile_table(
...     y_test, y_prob, class_index=1
... )
LABELS INFO:

 prob_min         : Minimum probability in a particular decile
 prob_max         : Minimum probability in a particular decile
 prob_avg         : Average probability in a particular decile
 cnt_events       : Count of events in a particular decile
 cnt_resp         : Count of responders in a particular decile
 cnt_non_resp     : Count of non-responders in a particular decile
 cnt_resp_rndm    : Count of responders if events assigned randomly in a particular decile
 cnt_resp_wiz     : Count of best possible responders in a particular decile
 resp_rate        : Response Rate in a particular decile [(cnt_resp/cnt_cust)*100]
 cum_events       : Cumulative sum of events decile-wise 
 cum_resp         : Cumulative sum of responders decile-wise 
 cum_resp_wiz     : Cumulative sum of best possible responders decile-wise 
 cum_non_resp     : Cumulative sum of non-responders decile-wise 
 cum_events_pct   : Cumulative sum of percentages of events decile-wise 
 cum_resp_pct     : Cumulative sum of percentages of responders decile-wise 
 cum_resp_pct_wiz : Cumulative sum of percentages of best possible responders decile-wise 
 cum_non_resp_pct : Cumulative sum of percentages of non-responders decile-wise 
 KS               : KS Statistic decile-wise 
 lift             : Cumuative Lift Value decile-wise
   decile  prob_min  prob_max  prob_avg  cnt_cust  cnt_resp  cnt_non_resp  cnt_resp_rndm  cnt_resp_wiz  resp_rate  cum_cust  cum_resp  cum_resp_wiz  cum_non_resp  cum_cust_pct  cum_resp_pct  cum_resp_pct_wiz  cum_non_resp_pct      KS   lift
0       1     0.923     0.923     0.923      29.0      29.0           0.0           18.4           NaN    100.000      29.0      29.0           NaN           0.0        10.175        15.761               NaN             0.000  15.761  1.549
1       2     0.923     0.923     0.923      28.0      25.0           3.0           18.4          29.0     89.286      57.0      54.0          29.0           3.0        20.000        29.348            15.761             2.970  26.378  1.467
2       3     0.923     0.923     0.923      29.0      26.0           3.0           18.4          28.0     89.655      86.0      80.0          57.0           6.0        30.175        43.478            30.978             5.941  37.537  1.441
3       4     0.923     0.923     0.923      28.0      24.0           4.0           18.4          29.0     85.714     114.0     104.0          86.0          10.0        40.000        56.522            46.739             9.901  46.621  1.413
4       5     0.923     0.923     0.923      29.0      28.0           1.0           18.4          28.0     96.552     143.0     132.0         114.0          11.0        50.175        71.739            61.957            10.891  60.848  1.430
5       6     0.923     0.923     0.923      28.0      26.0           2.0           18.4          29.0     92.857     171.0     158.0         143.0          13.0        60.000        85.870            77.717            12.871  72.999  1.431
6       7     0.049     0.923     0.833      29.0      19.0          10.0           18.4          28.0     65.517     200.0     177.0         171.0          23.0        70.175        96.196            92.935            22.772  73.424  1.371
7       8     0.049     0.049     0.049      28.0       6.0          22.0           18.4          13.0     21.429     228.0     183.0         184.0          45.0        80.000        99.457           100.000            44.554  54.903  1.243
8       9     0.049     0.049     0.049      29.0       1.0          28.0           18.4           0.0      3.448     257.0     184.0         184.0          73.0        90.175       100.000           100.000            72.277  27.723  1.109
9      10     0.049     0.049     0.049      28.0       0.0          28.0           18.4           0.0      0.000     285.0     184.0         184.0         101.0       100.000       100.000           100.000           100.000   0.000  1.000
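
The cumulative columns in the output above can be reproduced from the per-decile counts alone. The sketch below assumes that KS is the gap between the cumulative responder and non-responder percentages and that lift is the cumulative responder percentage divided by the cumulative customer percentage. These formulas are inferred from the printed values rather than taken from the library source, and the results agree with the table only up to rounding in the last digit.

```python
# Per-decile counts copied from the cnt_cust and cnt_resp columns above.
cnt_cust = [29, 28, 29, 28, 29, 28, 29, 28, 29, 28]
cnt_resp = [29, 25, 26, 24, 28, 26, 19, 6, 1, 0]

total_cust = sum(cnt_cust)
total_resp = sum(cnt_resp)
total_non_resp = total_cust - total_resp

rows = []
cum_cust = 0
cum_resp = 0
for decile, (n_cust, n_resp) in enumerate(zip(cnt_cust, cnt_resp), start=1):
    # Running totals from the top-scored decile downwards.
    cum_cust += n_cust
    cum_resp += n_resp
    cum_non_resp = cum_cust - cum_resp

    # Cumulative shares, expressed as percentages of each total.
    cum_cust_pct = 100 * cum_cust / total_cust
    cum_resp_pct = 100 * cum_resp / total_resp
    cum_non_resp_pct = 100 * cum_non_resp / total_non_resp

    rows.append({
        "decile": decile,
        # Assumed definition: separation between responders and non-responders.
        "KS": cum_resp_pct - cum_non_resp_pct,
        # Assumed definition: responders captured relative to random targeting.
        "lift": cum_resp_pct / cum_cust_pct,
    })

best = max(rows, key=lambda row: row["KS"])
print(f"max KS = {best['KS']:.2f} at decile {best['decile']}")
for row in rows:
    print(f"{row['decile']:>2}  KS={row['KS']:7.3f}  lift={row['lift']:.3f}")
```

The largest KS value falls in decile 7, matching the peak of the KS column in the table.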