plot_silhouette

scikitplot.api.metrics.plot_silhouette(X, cluster_labels, *, metric='euclidean', copy=True, title='Silhouette Analysis', title_fontsize='large', text_fontsize='medium', cmap=None, digits=4, **kwargs)

Plot a silhouette analysis of the provided clusters.

Silhouette analysis is a method of interpreting and validating the consistency within clusters of data. It measures how similar an object is to its own cluster compared to other clusters.
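The measure described above is commonly defined per sample as s = (b - a) / max(a, b), where a is the mean distance to the other members of the sample's own cluster and b is the smallest mean distance to the members of any other cluster. The following is a minimal pure-Python sketch of that definition, not the code this function runs. The helper name `silhouette_values` is hypothetical, Euclidean distance is assumed, and the input is assumed to contain at least two clusters with no duplicate points.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def silhouette_values(points, labels):
    """Return one silhouette coefficient per point (illustrative sketch only)."""
    # Group sample indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Common convention: a point alone in its cluster scores 0.
            values.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        values.append((b - a) / max(a, b))
    return values


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]

# Two compact, well-separated clusters: every coefficient is close to 1.
well_separated = silhouette_values(points, [0, 0, 1, 1])

# The same points with mixed-up labels: every coefficient is negative.
mislabeled = silhouette_values(points, [0, 1, 0, 1])
```

Values near 1 indicate a sample that sits well inside its cluster, values near 0 a sample on the boundary between clusters, and negative values a sample that is closer on average to another cluster than to its own.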

Parameters:
X : array-like, shape (n_samples, n_features)

Data to cluster, where n_samples is the number of samples and n_features is the number of features.

cluster_labels : array-like, shape (n_samples,)

Cluster label for each sample.

metric : str or callable, optional, default='euclidean'

The metric to use when calculating distance between instances in a feature array. If metric is a string, it must be one of the options allowed by sklearn.metrics.pairwise.pairwise_distances. If X is the distance array itself, use "precomputed" as the metric.

copy : bool, optional, default=True

Determines whether fit is used on clf or on a copy of clf.

title : str, optional, default='Silhouette Analysis'

Title of the generated plot.

title_fontsize : str or int, optional, default='large'

Font size for the plot title.

text_fontsize : str or int, optional, default='medium'

Font size for the text in the plot.

cmap : None, str or matplotlib.colors.Colormap, optional, default=None

Colormap used for plotting. Options include 'viridis', 'PiYG', 'plasma', 'inferno', 'nipy_spectral', etc. See the Matplotlib colormap documentation for available choices.

digits : int, optional, default=4

Number of digits used when formatting floating-point values in the output.

Added in version 0.3.9.

**kwargs : dict

Generic keyword arguments.
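The two ways of supplying distances described under metric can be compared directly with scikit-learn. The sketch below assumes that plot_silhouette passes X and metric on to scikit-learn's silhouette functions, as the reference to silhouette_score suggests. It calls sklearn.metrics.silhouette_samples rather than plot_silhouette itself, and the toy data and variable names are illustrative.

```python
import numpy as np
from sklearn.metrics import pairwise_distances, silhouette_samples

# Toy data: two groups of three points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
cluster_labels = np.array([0, 0, 0, 1, 1, 1])

# Option 1: pass the feature array and name the metric.
from_features = silhouette_samples(X, cluster_labels, metric="manhattan")

# Option 2: compute the (n_samples, n_samples) distance matrix once
# and pass it in place of X with metric="precomputed".
D = pairwise_distances(X, metric="manhattan")
from_distances = silhouette_samples(D, cluster_labels, metric="precomputed")

# Both routes should yield the same per-sample silhouette values.
same = np.allclose(from_features, from_distances)
```

Precomputing the matrix is mainly useful when the distances are expensive to compute or are reused across several cluster labelings of the same data.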

Returns:
ax : matplotlib.axes.Axes

The axes on which the plot was drawn.

References

* "scikit-learn silhouette_score".

Other Parameters:
ax : matplotlib.axes.Axes, optional, default=None

The axes to plot the figure on. If None is passed, the current axes will be used (or generated if required).

fig : matplotlib.pyplot.figure, optional, default=None

The figure to draw the plot on. If None is passed, the current figure will be used (or generated if required).

figsize : tuple, optional, default=None

Figure size as (width, height) in inches, e.g. (12, 5).

nrows : int, optional, default=1

Number of rows in the subplot grid.

ncols : int, optional, default=1

Number of columns in the subplot grid.

plot_style : str, optional, default=None

Matplotlib style to apply to the plot. Check available styles with "plt.style.available". Examples include: ['ggplot', 'seaborn', 'bmh', 'classic', 'dark_background', 'fivethirtyeight', 'grayscale', 'seaborn-bright', 'seaborn-colorblind', 'seaborn-dark', 'seaborn-dark-palette', 'tableau-colorblind10', 'fast'].

Added in version 0.4.0.

show_fig : bool, default=True

Show the plot.

save_fig : bool, default=False

Save the plot.

save_fig_filename : str, optional, default=''

Path and file type to save the plot to. If nothing is specified, the plot is saved as a PNG inside result_images under the current working directory, in a file named after the plotting function (func.__name__).

verbose : bool, optional

If True, prints debugging information.

Examples

>>> from sklearn.cluster import KMeans
>>> from sklearn.datasets import load_iris as data_3_classes
>>> import scikitplot as skplt
>>> X, y = data_3_classes(return_X_y=True, as_frame=False)
>>> kmeans = KMeans(n_clusters=3, random_state=0)
>>> cluster_labels = kmeans.fit_predict(X)
>>> skplt.metrics.plot_silhouette(
...     X,
...     cluster_labels,
... );

[Figure: Silhouette Plot]