plot_ks_statistic

scikitplot.kds.plot_ks_statistic(y_true, y_probas, *, pos_label=None, class_index=1, title='KS Statistic Plot', ax=None, fig=None, figsize=None, title_fontsize='large', text_fontsize='medium', digits=2, data=None, **kwargs)

Generate the KS statistic plot from labels and probabilities.

The Kolmogorov-Smirnov (KS) statistic measures how well a binary classifier separates the Responder class (Yes) from the Non-Responder class (No). It ranges from 0 to 1; the higher the value, the better the model separates the two classes.
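To make the definition above concrete, the following is a minimal sketch of the textbook two-sample KS statistic: the largest gap between the empirical score distributions of the two classes. The helper name `ks_statistic` is hypothetical and not part of scikitplot, and the value this function plots may be computed differently (for example from binned data), so treat this only as an illustration of the idea.

```python
def ks_statistic(y_true, scores):
    """Largest gap between the empirical score CDFs of the two classes.

    Hypothetical helper for illustration; assumes labels are 0/1 and
    `scores` holds the predicted probability of class 1.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y != 1]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    best = 0.0
    # Evaluate both CDFs at every distinct score (simple, not optimized).
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
ks = ks_statistic(y, p)
ks_perfect = ks_statistic([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
```

A value of 0 means the two score distributions coincide, while a value of 1 means some threshold separates the classes perfectly, as in the `ks_perfect` case.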

Parameters:
y_true : array-like, shape (n_samples,)

Ground truth (correct) target values.

y_probas : array-like, shape (n_samples, n_classes)

Prediction probabilities for each class returned by a classifier.

class_index : int, optional, default=1

Index of the class of interest for multi-class classification. Ignored for binary classification.

title : str, optional, default='KS Statistic Plot'

Title of the generated plot.

ax : list of matplotlib.axes.Axes, optional, default=None

The axes to draw the plot on, for example the result of fig.add_subplot(1, 1, 1) or plt.gca(). If None is passed, the current axes are used (or created if required).

fig : matplotlib.pyplot.figure, optional, default=None

The figure to draw the plot on. If None is passed, the current figure is used (or created if required).

Added in version 0.3.9.

figsize : tuple of 2 ints, optional

Tuple denoting figure size of the plot e.g. (6, 6). Defaults to None.

title_fontsize : str or int, optional

Matplotlib-style fontsizes. Use e.g. “small”, “medium”, “large” or integer-values. Defaults to “large”.

text_fontsize : str or int, optional

Matplotlib-style fontsizes. Use e.g. “small”, “medium”, “large” or integer-values. Defaults to “medium”.

digits : int, optional

Number of digits for formatting output floating point values. Use e.g. 2 or 4. Defaults to 2.

Added in version 0.3.9.

Returns:
matplotlib.axes.Axes

The axes on which the plot was drawn.

Examples

>>> from sklearn.datasets import load_breast_cancer as data_2_classes
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.linear_model import LogisticRegression
>>> import scikitplot as skplt
>>> X, y = data_2_classes(return_X_y=True, as_frame=False)
>>> X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.5, random_state=0)
>>> model = LogisticRegression(max_iter=int(1e5), random_state=0).fit(X_train, y_train)
>>> y_probas = model.predict_proba(X_val)
>>> skplt.kds.plot_ks_statistic(
...     y_val, y_probas, class_index=1,
... );

Figure: KS Statistic Plot