plot_silhouette
- scikitplot.api.metrics.plot_silhouette(X, cluster_labels, *, metric='euclidean', copy=True, title='Silhouette Analysis', ax=None, fig=None, figsize=None, title_fontsize='large', text_fontsize='medium', cmap=None, digits=4, **kwargs)
Plots a silhouette analysis of the provided clusters.
Silhouette analysis is a method of interpreting and validating the consistency within clusters of data. It measures how similar an object is to its own cluster compared to other clusters.
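As a rough illustration of what "similar to its own cluster compared to other clusters" means, the silhouette coefficient of a sample is commonly defined as (b - a) / max(a, b), where a is the mean distance to the other members of its own cluster and b is the mean distance to the nearest other cluster. The pure-Python sketch below assumes Euclidean distance and a tiny made-up dataset. The helper name silhouette_values is hypothetical and is not part of scikitplot.

```python
import math


def silhouette_values(points, labels):
    """Silhouette coefficient of each point, using Euclidean distance.

    Illustrative helper (not part of scikitplot); assumes at least two clusters.
    """
    clusters = set(labels)
    values = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # Collect distances from p to every other point, grouped by cluster.
        dists = {c: [] for c in clusters}
        for j, (q, lab) in enumerate(zip(points, labels)):
            if j != i:
                dists[lab].append(math.dist(p, q))
        if not dists[own]:
            # p is alone in its cluster; the usual convention is s = 0.
            values.append(0.0)
            continue
        a = sum(dists[own]) / len(dists[own])  # mean intra-cluster distance
        # Mean distance to the nearest other cluster.
        b = min(sum(d) / len(d) for c, d in dists.items() if c != own)
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Made-up toy data: two tight, well-separated groups.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
labels = [0, 0, 1, 1]
scores = silhouette_values(points, labels)
mean_score = sum(scores) / len(scores)
print(scores, mean_score)
```

Values close to 1 indicate a sample that sits well inside its cluster, values near 0 indicate a sample on the boundary between clusters, and negative values suggest the sample may have been assigned to the wrong cluster.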
- Parameters:
- X : array-like, shape (n_samples, n_features)
Data to cluster, where n_samples is the number of samples and n_features is the number of features.
- cluster_labels : array-like, shape (n_samples,)
Cluster label for each sample.
- metric : str or callable, optional, default='euclidean'
The metric to use when calculating distance between instances in a feature array. If metric is a string, it must be one of the options allowed by sklearn.metrics.pairwise.pairwise_distances. If X is the distance array itself, use "precomputed" as the metric.
- copy : bool, optional, default=True
Determines whether fit is used on clf or on a copy of clf.
- title : str, optional, default='Silhouette Analysis'
Title of the generated plot.
- ax : list of matplotlib.axes.Axes, optional, default=None
The axes to plot the figure on. If None is passed, the current axes will be used (or generated if required). Accepts axes such as fig.add_subplot(1, 1, 1) or plt.gca().
- fig : matplotlib.pyplot.figure, optional, default=None
The figure to plot the Visualizer on. If None is passed, the current figure will be used (or generated if required).
Added in version 0.3.9.
- figsize : tuple of int, optional, default=None
Size of the figure (width, height) in inches.
- title_fontsize : str or int, optional, default='large'
Font size for the plot title.
- text_fontsize : str or int, optional, default='medium'
Font size for the text in the plot.
- cmap : None, str or matplotlib.colors.Colormap, optional, default=None
Colormap used for plotting. Options include 'viridis', 'PiYG', 'plasma', 'inferno', 'nipy_spectral', etc. See the Matplotlib colormap documentation for available choices:
https://matplotlib.org/stable/users/explain/colors/index.html
plt.colormaps()
plt.get_cmap()  # None == 'viridis'
- digits : int, optional, default=4
Number of digits for formatting output floating point values.
Added in version 0.3.9.
- Returns:
- matplotlib.axes.Axes
The axes on which the plot was drawn.
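The "precomputed" option of the metric parameter can be illustrated with scikit-learn's own silhouette functions, which the reference below points to. This sketch assumes that plot_silhouette forwards X and metric to those functions; the source does not confirm this. It uses made-up toy data to check that a precomputed distance matrix gives the same per-sample silhouette values as passing the features together with the metric name.

```python
import numpy as np
from sklearn.metrics import pairwise_distances, silhouette_samples, silhouette_score

# Made-up toy data: two well-separated groups of three points each.
X = np.array(
    [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
)
cluster_labels = np.array([0, 0, 0, 1, 1, 1])

# Option 1: pass the feature array and name the metric.
s_features = silhouette_samples(X, cluster_labels, metric="manhattan")

# Option 2: build the (n_samples, n_samples) distance matrix once
# and declare it as precomputed.
D = pairwise_distances(X, metric="manhattan")
s_precomputed = silhouette_samples(D, cluster_labels, metric="precomputed")

# Both routes should agree sample by sample.
same = np.allclose(s_features, s_precomputed)
mean_score = silhouette_score(D, cluster_labels, metric="precomputed")
print(same, mean_score)
```

Under the same assumption, D would be passed in place of X together with metric='precomputed' when calling plot_silhouette. This can be useful when the distance matrix is expensive to compute and is reused across several plots.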
References
- "scikit-learn silhouette_score".
Examples
>>> from sklearn.cluster import KMeans
>>> from sklearn.datasets import load_iris as data_3_classes
>>> import scikitplot as skplt
>>> X, y = data_3_classes(return_X_y=True, as_frame=False)
>>> kmeans = KMeans(n_clusters=3, random_state=0)
>>> cluster_labels = kmeans.fit_predict(X)
>>> skplt.metrics.plot_silhouette(
...     X,
...     cluster_labels,
... );