plot_silhouette
- scikitplot.api.metrics.plot_silhouette(X, cluster_labels, *, metric='euclidean', copy=True, title='Silhouette Analysis', title_fontsize='large', text_fontsize='medium', cmap=None, digits=4, **kwargs)
Plot a silhouette analysis of the provided clusters.
Silhouette analysis is a method of interpreting and validating the consistency within clusters of data. It measures how similar an object is to its own cluster compared to other clusters.
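To make "similar to its own cluster compared to other clusters" concrete, here is a minimal pure-Python sketch of the standard silhouette coefficient, s = (b - a) / max(a, b), where a is the mean distance from a point to the other members of its cluster and b is the mean distance to the nearest other cluster. The helper name `silhouette_values` and the toy data are illustrative assumptions, not part of scikitplot, and the sketch fixes the metric to Euclidean distance.

```python
import math


def silhouette_values(points, labels):
    """Return one silhouette coefficient per point (Euclidean distance).

    Hypothetical helper for illustration only; not part of scikitplot.
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette analysis needs at least two clusters")

    def mean_dist(i, members):
        # Mean distance from point i to the given members, excluding i itself.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    values = []
    for i, label in enumerate(labels):
        if len(clusters[label]) == 1:
            values.append(0.0)  # usual convention for singleton clusters
            continue
        a = mean_dist(i, clusters[label])  # cohesion: own cluster
        b = min(mean_dist(i, members)      # separation: nearest other cluster
                for other, members in clusters.items() if other != label)
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Two tight, well-separated clusters (toy data).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
labels = [0, 0, 1, 1]
values = silhouette_values(points, labels)
mean_score = sum(values) / len(values)
```

Values close to 1 indicate a point that sits well inside its cluster, values near 0 a point on the boundary between clusters, and negative values a point that is, on average, closer to another cluster than to its own.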
- Parameters:
- X: array-like, shape (n_samples, n_features)
Data to cluster, where n_samples is the number of samples and n_features is the number of features.
- cluster_labels: array-like, shape (n_samples,)
Cluster label for each sample.
- metric: str or callable, optional, default='euclidean'
The metric to use when calculating distance between instances in a feature array. If metric is a string, it must be one of the options allowed by sklearn.metrics.pairwise.pairwise_distances. If X is the distance array itself, use "precomputed" as the metric.
- copy: bool, optional, default=True
Determines whether fit is used on clf or on a copy of clf.
- title: str, optional, default='Silhouette Analysis'
Title of the generated plot.
- title_fontsize: str or int, optional, default='large'
Font size for the plot title.
- text_fontsize: str or int, optional, default='medium'
Font size for the text in the plot.
- cmap: None, str or matplotlib.colors.Colormap, optional, default=None
Colormap used for plotting. Options include 'viridis', 'PiYG', 'plasma', 'inferno', 'nipy_spectral', etc. See the Matplotlib colormap documentation for available choices:
https://matplotlib.org/stable/users/explain/colors/index.html
plt.colormaps()
plt.get_cmap()  # None == 'viridis'
- digits: int, optional, default=4
Number of digits for formatting output floating point values.
Added in version 0.3.9.
- **kwargs: dict
Generic keyword arguments.
- Returns:
- ax: matplotlib.axes.Axes
The axes on which the plot was drawn.
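The two ways of supplying distances described under `metric` can be compared directly with scikit-learn. The sketch below uses only scikit-learn functions (`silhouette_samples`, `silhouette_score`, `pairwise_distances`); it assumes, without this page confirming it, that `plot_silhouette` forwards `X` and `metric` to the same routines. The choice of the Manhattan metric and of the Iris data is purely illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import pairwise_distances, silhouette_samples, silhouette_score

X, _ = load_iris(return_X_y=True)
cluster_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Option 1: pass the feature array together with a named metric.
from_features = silhouette_samples(X, cluster_labels, metric="manhattan")

# Option 2: precompute the (n_samples, n_samples) distance matrix and
# declare it with metric="precomputed".
D = pairwise_distances(X, metric="manhattan")
from_distances = silhouette_samples(D, cluster_labels, metric="precomputed")

# Both routes should give the same per-sample coefficients.
same = np.allclose(from_features, from_distances)

# The mean of the per-sample values is the overall silhouette score.
mean_score = silhouette_score(D, cluster_labels, metric="precomputed")
```

Precomputing is mainly useful when the distance matrix is expensive to build and will be reused, or when the distances come from a source other than a feature array.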
References
- "scikit-learn silhouette_score".
- Other Parameters:
- ax: matplotlib.axes.Axes, optional, default=None
The axes to plot the figure on. If None is passed, the current axes will be used (or generated if required).
- fig: matplotlib.pyplot.figure, optional, default=None
The figure to plot the Visualizer on. If None is passed, the current figure will be used (or generated if required).
- figsize: tuple, optional, default=None
Figure size as (width, height) in inches, e.g. (12, 5).
- nrows: int, optional, default=1
Number of rows in the subplot grid.
- ncols: int, optional, default=1
Number of columns in the subplot grid.
- plot_style: str, optional, default=None
Check available styles with "plt.style.available". Examples include: ['ggplot', 'seaborn', 'bmh', 'classic', 'dark_background', 'fivethirtyeight', 'grayscale', 'seaborn-bright', 'seaborn-colorblind', 'seaborn-dark', 'seaborn-dark-palette', 'tableau-colorblind10', 'fast'].
Added in version 0.4.0.
- show_fig: bool, default=True
Show the plot.
- save_fig: bool, default=False
Save the plot.
- save_fig_filename: str, optional, default=''
Path and file type to save the plot to. If nothing is specified, the plot is saved as a PNG inside result_images under the current working directory. The file name defaults to the name of the function used (func.__name__).
- verbose: bool, optional
If True, prints debugging information.
Examples
>>> from sklearn.cluster import KMeans
>>> from sklearn.datasets import load_iris as data_3_classes
>>> import scikitplot as skplt
>>> X, y = data_3_classes(return_X_y=True, as_frame=False)
>>> kmeans = KMeans(n_clusters=3, random_state=0)
>>> cluster_labels = kmeans.fit_predict(X)
>>> skplt.metrics.plot_silhouette(
...     X,
...     cluster_labels,
... );