.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/snsx/plot_kdsplot_script.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_snsx_plot_kdsplot_script.py>`
        to download the full example code or to run this example in your browser via JupyterLite or Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_snsx_plot_kdsplot_script.py:

plot_kdsplot_script with examples
==========================================

An example showing the :py:func:`~scikitplot.snsx.kdsplot` function
used with a scikit-learn classifier.

.. GENERATED FROM PYTHON SOURCE LINES 8-12

.. code-block:: Python
    :lineno-start: 9

    # Authors: The scikit-plots developers
    # SPDX-License-Identifier: BSD-3-Clause

.. GENERATED FROM PYTHON SOURCE LINES 13-14

Import scikit-plots

.. GENERATED FROM PYTHON SOURCE LINES 14-17

.. code-block:: Python
    :lineno-start: 14

    import scikitplot.snsx as sp

.. GENERATED FROM PYTHON SOURCE LINES 18-30

.. code-block:: Python
    :lineno-start: 18

    import matplotlib.pyplot as plt
    import numpy as np; np.random.seed(0)  # reproducibility
    import pandas as pd

    from sklearn.datasets import (
        load_breast_cancer as data_2_classes,
        # load_iris as data_3_classes,
    )
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

.. GENERATED FROM PYTHON SOURCE LINES 31-33

Load the data. The three-class alternative, which requires the commented-out
``load_iris`` import, is ``X, y = data_3_classes(return_X_y=True, as_frame=False)``.

.. GENERATED FROM PYTHON SOURCE LINES 33-37

.. code-block:: Python
    :lineno-start: 33

    X, y = data_2_classes(return_X_y=True, as_frame=False)
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.5, random_state=0)
    np.unique(y)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    array([0, 1])

.. GENERATED FROM PYTHON SOURCE LINES 38-39

Create and fit a LogisticRegression model

.. GENERATED FROM PYTHON SOURCE LINES 39-56

.. code-block:: Python
    :lineno-start: 39

    model = (
        LogisticRegression(max_iter=int(1e5), random_state=0)
        .fit(X_train, y_train)
    )
    # Perform predictions
    y_val_prob = model.predict_proba(X_val)
    # Create a DataFrame with predictions
    df = pd.DataFrame({
        "y_true": y_val==1,  # binary target: True for class 1
        "y_score": y_val_prob[:, 1],  # predicted probability of class 1
        # "y_true": np.random.normal(0.5, 0.1, 100).round(),
        # "y_score": np.random.normal(0.5, 0.15, 100),
        # "hue": np.random.normal(0.5, 0.4, 100).round(),
    })

.. GENERATED FROM PYTHON SOURCE LINES 57-73

.. code-block:: Python
    :lineno-start: 57

    p = sp.kdsplot(
        df,
        x="y_true",
        y="y_score",
        kind="df",
        n_deciles=10,
        round_digits=4,
        verbose=True,
    )
    p
    # p.columns.tolist()
    # p[["decile", "cnt_resp", "cnt_resp_wiz", "cum_resp_pct", "cum_resp_wiz_pct"]]
    p.iloc[:, range(9, 23)]
    # p.iloc[:, [11, 12, 12, 14]]

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    {
        "decile": "Meaning: Ranked group (1 = highest predicted probability). Critical: Ensure sorted descending by model score. Fatal if top deciles don't capture positives.Formula: rank by model score into k quantiles (e.g., 10 deciles). ",
        "prob_min": "Meaning: Lowest predicted probability in the decile. Critical: Signals model calibration. Fatal if too close to prob_max (poor ranking).Formula: min(score in decile). ",
        "prob_max": "Meaning: Highest predicted probability in the decile. Critical: Checks separation. Fatal if overlaps lower deciles (poor discrimination).Formula: max(score in decile). ",
        "prob_avg": "Meaning: Average predicted probability in the decile. Critical: Useful for calibration curves; should decrease monotonically across deciles.Formula: mean(score in decile). ",
        "cnt_resp_total": "Meaning: Total samples in the decile. Critical: Denominator for rate_resp and cumulative % calculations. Fatal if deciles uneven.Formula: count(samples in decile). ",
        "cnt_resp": "Meaning: Actual responders in the decile (how many responders we captured). Critical: Should never exceed cnt_resp_wiz. Flat counts across deciles indicate useless model.Formula: sum(y_true=1 in decile). ",
        "cnt_resp_non": "Meaning: Non-responders in the decile. Critical: Used for KS/statistics. Too high in top deciles is a warning.Formula: cnt_resp_total - cnt_resp. ",
        "cnt_resp_rndm": "Meaning: Expected responders if randomly assigned. Critical: Baseline for comparison. Fatal if model only slightly above random.Formula: cnt_resp_total * (total_responders / total_samples). ",
        "cnt_resp_wiz": "Meaning: Ideal responders if model were perfect. Critical: Must be ≥ cnt_resp. Fatal if NaN or actual far below.Formula: allocate top responders directly into highest deciles. ",
        "rate_resp": "Meaning: Per-decile response rate (alias to decile_wise_response, decile_wise_gain). Critical: Measures decile quality. Early deciles should outperform later ones.Formula: rate_resp = decile_wise_response = cnt_resp / cnt_resp_total. ",
        "cum_resp_total": "Meaning: Cumulative total samples. Critical: Tracks population coverage.Formula: Σ cnt_resp_total(≤ current decile). ",
        "cum_resp_total_pct": "Meaning: % cumulative population. Critical: X-axis for lift/gain curves; check decile balance.Formula: cum_resp_total / total_samples * 100. ",
        "cum_resp": "Meaning: Cumulative responders (alias to cumulative_gain) up to this decile so ML evaluation (how much `gain` vs random baseline). Critical: Should increase; max = total responders. Flat curve = weak model.Formula: cumulative_gain = cumulative_response = Σ cnt_resp(≤ current decile) = cum_resp_pct vs cum_resp_total_pct. ",
        "cum_resp_pct": "Meaning: % cumulative responders = cum_resp / total_responders * 100. Critical: Wizard curve should be ≥ model; used in lift/gain charts.Formula: cum_resp / total_responders * 100. ",
        "cum_resp_non": "Meaning: Cumulative non-responders. Critical: Used in KS statistic; early dominance is bad.Formula: Σ cnt_resp_non(≤ current decile). ",
        "cum_resp_non_pct": "Meaning: % cumulative non-responders. Critical: Should differ from cum_resp_pct; almost equal = model fails.Formula: cum_resp_non / total_nonresponders * 100. ",
        "cum_resp_rndm": "Meaning: Cumulative expected responders if randomly assigned. Critical: Baseline for cumulative lift. Fatal if model ≈ random curve.Formula: Σ cnt_resp_rndm(≤ current decile). ",
        "cum_resp_rndm_pct": "Meaning: % cumulative random responders = cum_resp_rndm / total_responders * 100. Critical: Random baseline curve (diagonal). Always linear from (0,0) to (100,100). Fatal if model curve is near or below it.Formula: cum_resp_rndm / total_responders * 100. ",
        "cum_resp_wiz": "Meaning: Cumulative ideal responders. Critical: Should always ≥ model; never NaN.Formula: Σ cnt_resp_wiz(≤ current decile). ",
        "cum_resp_wiz_pct": "Meaning: % cumulative ideal responders. Critical: Wizard benchmark for lift/gain curves; gaps indicate model weakness.Formula: cum_resp_wiz / total_responders * 100. ",
        "KS": "Meaning: KS Kolmogorov-Smirnov statistic. Range: 0-100 (percent scale) or 0-1 (fractional scale). Interpretation: - <20 → Poor discrimination (model barely better than random). - 20-40 → Fair. - 40-60 → Good. - ≥60 → Excellent. - ≥70 → Suspiciously high; likely overfitting or data leakage unless justified by very strong signal. Critical: Report max KS and check across train/validation/test. Fatal if KS is too low (<0.2) or unrealistically high (≥0.7 without strong justification).Formula: KS = max(cum_resp_pct - cum_resp_non_pct). ",
        "cumulative_lift": "Meaning: Cumulative lift = cum_resp_pct / cum_resp_total_pct. Critical: Shows model gain over random. Always cumulative. Fatal if <1 or <2 in top decile.Formula: Lift@k = cum_resp_pct / cum_resp_total_pct. ",
        "decile_wise_lift": "Meaning: Decile-wise lift = cnt_resp / cnt_resp_rndm. Critical: Measures decile-level improvement vs random. Fatal if <1.Formula: cnt_resp / cnt_resp_rndm. "
    }

.. raw:: html

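    <table border="1" class="dataframe">
      <thead>
        <tr><th></th><th>rate_resp</th><th>cum_resp_total</th><th>cum_resp_total_pct</th><th>cum_resp</th><th>cum_resp_pct</th><th>cum_resp_non</th><th>cum_resp_non_pct</th><th>cum_resp_rndm</th><th>cum_resp_rndm_pct</th><th>cum_resp_wiz</th><th>cum_resp_wiz_pct</th><th>KS</th><th>cumulative_lift</th><th>decile_wise_lift</th></tr>
      </thead>
      <tbody>
        <tr><th>0</th><td>100.0000</td><td>29.0</td><td>10.1754</td><td>29.0</td><td>15.7609</td><td>0.0</td><td>0.0000</td><td>18.4</td><td>10.0</td><td>29</td><td>15.7609</td><td>15.7609</td><td>1.5489</td><td>1.5761</td></tr>
        <tr><th>1</th><td>100.0000</td><td>57.0</td><td>20.0000</td><td>57.0</td><td>30.9783</td><td>0.0</td><td>0.0000</td><td>36.8</td><td>20.0</td><td>57</td><td>30.9783</td><td>30.9783</td><td>1.5489</td><td>1.5217</td></tr>
        <tr><th>2</th><td>100.0000</td><td>86.0</td><td>30.1754</td><td>86.0</td><td>46.7391</td><td>0.0</td><td>0.0000</td><td>55.2</td><td>30.0</td><td>86</td><td>46.7391</td><td>46.7391</td><td>1.5489</td><td>1.5761</td></tr>
        <tr><th>3</th><td>100.0000</td><td>114.0</td><td>40.0000</td><td>114.0</td><td>61.9565</td><td>0.0</td><td>0.0000</td><td>73.6</td><td>40.0</td><td>114</td><td>61.9565</td><td>61.9565</td><td>1.5489</td><td>1.5217</td></tr>
        <tr><th>4</th><td>100.0000</td><td>143.0</td><td>50.1754</td><td>143.0</td><td>77.7174</td><td>0.0</td><td>0.0000</td><td>92.0</td><td>50.0</td><td>143</td><td>77.7174</td><td>77.7174</td><td>1.5489</td><td>1.5761</td></tr>
        <tr><th>5</th><td>89.2857</td><td>171.0</td><td>60.0000</td><td>168.0</td><td>91.3043</td><td>3.0</td><td>2.9703</td><td>110.4</td><td>60.0</td><td>171</td><td>92.9348</td><td>88.3341</td><td>1.5217</td><td>1.3587</td></tr>
        <tr><th>6</th><td>55.1724</td><td>200.0</td><td>70.1754</td><td>184.0</td><td>100.0000</td><td>16.0</td><td>15.8416</td><td>128.8</td><td>70.0</td><td>184</td><td>100.0000</td><td>84.1584</td><td>1.4250</td><td>0.8696</td></tr>
        <tr><th>7</th><td>0.0000</td><td>228.0</td><td>80.0000</td><td>184.0</td><td>100.0000</td><td>44.0</td><td>43.5644</td><td>147.2</td><td>80.0</td><td>184</td><td>100.0000</td><td>56.4356</td><td>1.2500</td><td>0.0000</td></tr>
        <tr><th>8</th><td>0.0000</td><td>257.0</td><td>90.1754</td><td>184.0</td><td>100.0000</td><td>73.0</td><td>72.2772</td><td>165.6</td><td>90.0</td><td>184</td><td>100.0000</td><td>27.7228</td><td>1.1089</td><td>0.0000</td></tr>
        <tr><th>9</th><td>0.0000</td><td>285.0</td><td>100.0000</td><td>184.0</td><td>100.0000</td><td>101.0</td><td>100.0000</td><td>184.0</td><td>100.0</td><td>184</td><td>100.0000</td><td>0.0000</td><td>1.0000</td><td>0.0000</td></tr>
      </tbody>
    </table>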

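The lift columns in the table can be recomputed by hand from the formulas
printed in the verbose output. The following is a minimal sketch, assuming
roughly equal-sized bins assigned by rank after sorting on the score.
``decile_lift_table`` is a hypothetical helper written for illustration and is
not part of ``scikitplot``; its binning and rounding may differ from what
:py:func:`~scikitplot.snsx.kdsplot` does internally.

.. code-block:: Python

    import numpy as np
    import pandas as pd


    def decile_lift_table(y_true, y_score, n_deciles=10):
        """Hypothetical helper: per-bin counts, cumulative gain, and lift."""
        data = pd.DataFrame({
            "y_true": np.asarray(y_true, dtype=float),
            "y_score": np.asarray(y_score, dtype=float),
        })
        # Sort descending by score so that bin 1 holds the highest scores.
        data = data.sort_values("y_score", ascending=False, kind="mergesort")
        data = data.reset_index(drop=True)
        # Assign roughly equal-sized bins by rank position (assumption).
        data["decile"] = np.arange(len(data)) * n_deciles // len(data) + 1

        grouped = data.groupby("decile")["y_true"]
        table = pd.DataFrame({
            "cnt_resp_total": grouped.count(),
            "cnt_resp": grouped.sum(),
        })
        total_samples = table["cnt_resp_total"].sum()
        total_responders = table["cnt_resp"].sum()

        # Formulas as printed in the verbose output.
        table["cnt_resp_rndm"] = (
            table["cnt_resp_total"] * total_responders / total_samples
        )
        table["cum_resp_total_pct"] = (
            table["cnt_resp_total"].cumsum() / total_samples * 100
        )
        table["cum_resp_pct"] = table["cnt_resp"].cumsum() / total_responders * 100
        table["cumulative_lift"] = table["cum_resp_pct"] / table["cum_resp_total_pct"]
        table["decile_wise_lift"] = table["cnt_resp"] / table["cnt_resp_rndm"]
        return table


    # Toy data (made up for illustration): ten samples split into five bins.
    toy_true = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
    toy_score = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
    print(decile_lift_table(toy_true, toy_score, n_deciles=5))

A cumulative lift of 1 in the last bin is expected by construction, since the
whole population captures all responders.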

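The ``KS`` column is the gap between the cumulative responder and
non-responder percentages, following the printed formula
``KS = max(cum_resp_pct - cum_resp_non_pct)``. The sketch below illustrates
that calculation on the percent scale, assuming rank-based bins of roughly
equal size. ``ks_by_decile`` is a hypothetical name used only here and is not
a ``scikitplot`` function.

.. code-block:: Python

    import numpy as np
    import pandas as pd


    def ks_by_decile(y_true, y_score, n_deciles=10):
        """Hypothetical helper: per-bin KS gap and its maximum (percent scale)."""
        y_true = np.asarray(y_true, dtype=float)
        y_score = np.asarray(y_score, dtype=float)
        # Order samples from highest to lowest score.
        order = np.argsort(-y_score, kind="stable")
        sorted_true = y_true[order]
        # Assign roughly equal-sized bins by rank position (assumption).
        decile = np.arange(len(sorted_true)) * n_deciles // len(sorted_true) + 1

        frame = pd.DataFrame({
            "decile": decile,
            "resp": sorted_true,        # 1 for responders
            "non": 1.0 - sorted_true,   # 1 for non-responders
        })
        counts = frame.groupby("decile")[["resp", "non"]].sum()
        cum_resp_pct = counts["resp"].cumsum() / counts["resp"].sum() * 100
        cum_resp_non_pct = counts["non"].cumsum() / counts["non"].sum() * 100
        ks = cum_resp_pct - cum_resp_non_pct
        return ks, ks.max()


    # Toy data (made up for illustration): ten samples split into five bins.
    toy_true = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
    toy_score = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
    ks_curve, ks_max = ks_by_decile(toy_true, toy_score, n_deciles=5)
    print(ks_curve)
    print(ks_max)

The gap always returns to zero in the last bin, because both cumulative
percentages reach 100 there.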
.. GENERATED FROM PYTHON SOURCE LINES 74-76

.. code-block:: Python
    :lineno-start: 74

    p = sp.kdsplot(df, x="y_true", y="y_score", kind="cumulative_lift", n_deciles=10)

.. image-sg:: /auto_examples/snsx/images/sphx_glr_plot_kdsplot_script_001.png
   :alt: Cumulative Lift Curve
   :srcset: /auto_examples/snsx/images/sphx_glr_plot_kdsplot_script_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 77-79

.. code-block:: Python
    :lineno-start: 77

    p = sp.kdsplot(df, x="y_true", y="y_score", kind="decile_wise_lift", n_deciles=10)

.. image-sg:: /auto_examples/snsx/images/sphx_glr_plot_kdsplot_script_002.png
   :alt: Decile-wise Lift Curve
   :srcset: /auto_examples/snsx/images/sphx_glr_plot_kdsplot_script_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 80-82

.. code-block:: Python
    :lineno-start: 80

    p = sp.kdsplot(df, x="y_true", y="y_score", kind="cumulative_gain", n_deciles=10)

.. image-sg:: /auto_examples/snsx/images/sphx_glr_plot_kdsplot_script_003.png
   :alt: Cumulative Gain Curve
   :srcset: /auto_examples/snsx/images/sphx_glr_plot_kdsplot_script_003.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 83-85

.. code-block:: Python
    :lineno-start: 83

    p = sp.kdsplot(df, x="y_true", y="y_score", kind="cumulative_response", n_deciles=10)

.. image-sg:: /auto_examples/snsx/images/sphx_glr_plot_kdsplot_script_004.png
   :alt: Cumulative Response Curve
   :srcset: /auto_examples/snsx/images/sphx_glr_plot_kdsplot_script_004.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 86-88

.. code-block:: Python
    :lineno-start: 86

    p = sp.kdsplot(df, x="y_true", y="y_score", kind="decile_wise_gain", n_deciles=10)

.. image-sg:: /auto_examples/snsx/images/sphx_glr_plot_kdsplot_script_005.png
   :alt: Decile-wise Gain/Response Curve
   :srcset: /auto_examples/snsx/images/sphx_glr_plot_kdsplot_script_005.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 89-91

.. code-block:: Python
    :lineno-start: 89

    p = sp.kdsplot(df, x="y_true", y="y_score", kind="ks_statistic", n_deciles=10)

.. image-sg:: /auto_examples/snsx/images/sphx_glr_plot_kdsplot_script_006.png
   :alt: KS Statistic Curve
   :srcset: /auto_examples/snsx/images/sphx_glr_plot_kdsplot_script_006.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 92-104

.. code-block:: Python
    :lineno-start: 92

    fig, ax = plt.subplots(figsize=(10, 10))
    p = sp.kdsplot(
        df,
        x="y_true",
        y="y_score",
        kind="report",
        n_deciles=10,
        round_digits=6,
        verbose=True,
    )

.. image-sg:: /auto_examples/snsx/images/sphx_glr_plot_kdsplot_script_007.png
   :alt: Cumulative Lift Curve, Decile-wise Lift Curve, Cumulative Gain Curve, KS Statistic Curve
   :srcset: /auto_examples/snsx/images/sphx_glr_plot_kdsplot_script_007.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    {
        "decile": "Meaning: Ranked group (1 = highest predicted probability). Critical: Ensure sorted descending by model score. Fatal if top deciles don't capture positives.Formula: rank by model score into k quantiles (e.g., 10 deciles). ",
        "prob_min": "Meaning: Lowest predicted probability in the decile. Critical: Signals model calibration. Fatal if too close to prob_max (poor ranking).Formula: min(score in decile). ",
        "prob_max": "Meaning: Highest predicted probability in the decile. Critical: Checks separation. Fatal if overlaps lower deciles (poor discrimination).Formula: max(score in decile). ",
        "prob_avg": "Meaning: Average predicted probability in the decile. Critical: Useful for calibration curves; should decrease monotonically across deciles.Formula: mean(score in decile). ",
        "cnt_resp_total": "Meaning: Total samples in the decile. Critical: Denominator for rate_resp and cumulative % calculations. Fatal if deciles uneven.Formula: count(samples in decile). ",
        "cnt_resp": "Meaning: Actual responders in the decile (how many responders we captured). Critical: Should never exceed cnt_resp_wiz. Flat counts across deciles indicate useless model.Formula: sum(y_true=1 in decile). ",
        "cnt_resp_non": "Meaning: Non-responders in the decile. Critical: Used for KS/statistics. Too high in top deciles is a warning.Formula: cnt_resp_total - cnt_resp. ",
        "cnt_resp_rndm": "Meaning: Expected responders if randomly assigned. Critical: Baseline for comparison. Fatal if model only slightly above random.Formula: cnt_resp_total * (total_responders / total_samples). ",
        "cnt_resp_wiz": "Meaning: Ideal responders if model were perfect. Critical: Must be ≥ cnt_resp. Fatal if NaN or actual far below.Formula: allocate top responders directly into highest deciles. ",
        "rate_resp": "Meaning: Per-decile response rate (alias to decile_wise_response, decile_wise_gain). Critical: Measures decile quality. Early deciles should outperform later ones.Formula: rate_resp = decile_wise_response = cnt_resp / cnt_resp_total. ",
        "cum_resp_total": "Meaning: Cumulative total samples. Critical: Tracks population coverage.Formula: Σ cnt_resp_total(≤ current decile). ",
        "cum_resp_total_pct": "Meaning: % cumulative population. Critical: X-axis for lift/gain curves; check decile balance.Formula: cum_resp_total / total_samples * 100. ",
        "cum_resp": "Meaning: Cumulative responders (alias to cumulative_gain) up to this decile so ML evaluation (how much `gain` vs random baseline). Critical: Should increase; max = total responders. Flat curve = weak model.Formula: cumulative_gain = cumulative_response = Σ cnt_resp(≤ current decile) = cum_resp_pct vs cum_resp_total_pct. ",
        "cum_resp_pct": "Meaning: % cumulative responders = cum_resp / total_responders * 100. Critical: Wizard curve should be ≥ model; used in lift/gain charts.Formula: cum_resp / total_responders * 100. ",
        "cum_resp_non": "Meaning: Cumulative non-responders. Critical: Used in KS statistic; early dominance is bad.Formula: Σ cnt_resp_non(≤ current decile). ",
        "cum_resp_non_pct": "Meaning: % cumulative non-responders. Critical: Should differ from cum_resp_pct; almost equal = model fails.Formula: cum_resp_non / total_nonresponders * 100. ",
        "cum_resp_rndm": "Meaning: Cumulative expected responders if randomly assigned. Critical: Baseline for cumulative lift. Fatal if model ≈ random curve.Formula: Σ cnt_resp_rndm(≤ current decile). ",
        "cum_resp_rndm_pct": "Meaning: % cumulative random responders = cum_resp_rndm / total_responders * 100. Critical: Random baseline curve (diagonal). Always linear from (0,0) to (100,100). Fatal if model curve is near or below it.Formula: cum_resp_rndm / total_responders * 100. ",
        "cum_resp_wiz": "Meaning: Cumulative ideal responders. Critical: Should always ≥ model; never NaN.Formula: Σ cnt_resp_wiz(≤ current decile). ",
        "cum_resp_wiz_pct": "Meaning: % cumulative ideal responders. Critical: Wizard benchmark for lift/gain curves; gaps indicate model weakness.Formula: cum_resp_wiz / total_responders * 100. ",
        "KS": "Meaning: KS Kolmogorov-Smirnov statistic. Range: 0-100 (percent scale) or 0-1 (fractional scale). Interpretation: - <20 → Poor discrimination (model barely better than random). - 20-40 → Fair. - 40-60 → Good. - ≥60 → Excellent. - ≥70 → Suspiciously high; likely overfitting or data leakage unless justified by very strong signal. Critical: Report max KS and check across train/validation/test. Fatal if KS is too low (<0.2) or unrealistically high (≥0.7 without strong justification).Formula: KS = max(cum_resp_pct - cum_resp_non_pct). ",
        "cumulative_lift": "Meaning: Cumulative lift = cum_resp_pct / cum_resp_total_pct. Critical: Shows model gain over random. Always cumulative. Fatal if <1 or <2 in top decile.Formula: Lift@k = cum_resp_pct / cum_resp_total_pct. ",
        "decile_wise_lift": "Meaning: Decile-wise lift = cnt_resp / cnt_resp_rndm. Critical: Measures decile-level improvement vs random. Fatal if <1.Formula: cnt_resp / cnt_resp_rndm. "
    }
       decile  prob_min  prob_max  ...         KS  cumulative_lift  decile_wise_lift
    0       1  0.999898  0.999998  ...  15.760870         1.548913          1.576087
    1       2  0.999395  0.999897  ...  30.978261         1.548913          1.521739
    2       3  0.997622  0.999376  ...  46.739130         1.548913          1.576087
    3       4  0.992830  0.997497  ...  61.956522         1.548913          1.521739
    4       5  0.959560  0.992494  ...  77.717391         1.548913          1.576087
    5       6  0.771810  0.955756  ...  88.334051         1.521739          1.358696
    6       7  0.065823  0.769488  ...  84.158416         1.425000          0.869565
    7       8  0.000369  0.048060  ...  56.435644         1.250000          0.000000
    8       9  0.000001  0.000349  ...  27.722772         1.108949          0.000000
    9      10  0.000000  0.000001  ...   0.000000         1.000000          0.000000

    [10 rows x 23 columns]

.. GENERATED FROM PYTHON SOURCE LINES 105-113

.. tags::

   model-type: classification
   model-workflow: model evaluation
   plot-type: line
   plot-type: cum-gain curve
   level: beginner
   purpose: showcase

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 1.323 seconds)

.. _sphx_glr_download_auto_examples_snsx_plot_kdsplot_script.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-plots/scikit-plots/maintenance/0.4.X?urlpath=lab/tree/notebooks/auto_examples/snsx/plot_kdsplot_script.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: lite-badge

      .. image:: images/jupyterlite_badge_logo.svg
        :target: ../../lite/lab/index.html?path=auto_examples/snsx/plot_kdsplot_script.ipynb
        :alt: Launch JupyterLite
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_kdsplot_script.ipynb <plot_kdsplot_script.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_kdsplot_script.py <plot_kdsplot_script.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_kdsplot_script.zip <plot_kdsplot_script.zip>`

.. include:: plot_kdsplot_script.recommendations

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_