scott_bin_width

scikitplot._astropy.stats.scott_bin_width(data, return_bins=False)

Return the optimal histogram bin width using Scott’s rule.

Scott’s rule is a normal reference rule: it minimizes the integrated mean squared error in the bin approximation under the assumption that the data is approximately Gaussian.

Parameters:
data : array-like, ndim=1

observed (one-dimensional) data

return_bins : bool, optional

if True, then return the bin edges

Returns:
width : float

optimal bin width using Scott’s rule

bins : ndarray

bin edges: returned if return_bins is True

Parameters:
  • data (ArrayLike)

  • return_bins (bool | None)

Return type:

float | tuple[float, NDArray]

Notes

The optimal bin width is

\[\Delta_b = \frac{3.5\sigma}{n^{1/3}}\]

where \(\sigma\) is the standard deviation of the data, and \(n\) is the number of data points [1].
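
As a rough illustration of the formula above, the following standard-library sketch computes the width by hand. The helper names `scott_width_sketch` and `edges_sketch` are hypothetical and are not part of the library. Two choices here are assumptions that the text above does not confirm: \(\sigma\) is taken as the population standard deviation (dividing by \(n\)), and the bin edges start at the minimum of the data and step by the width until the maximum is covered.

```python
import math
import statistics


def scott_width_sketch(data):
    """Hypothetical helper: Scott's rule width, 3.5 * sigma / n**(1/3)."""
    values = [float(x) for x in data]
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    # Assumption: sigma is the population standard deviation (divide by n).
    sigma = statistics.pstdev(values)
    return 3.5 * sigma / n ** (1.0 / 3.0)


def edges_sketch(data, width):
    """Hypothetical helper: equal-width edges from min(data) covering max(data)."""
    if width <= 0:
        raise ValueError("width must be positive")
    lo, hi = min(data), max(data)
    # Assumption: use as many whole bins as needed to reach the maximum.
    n_bins = max(1, math.ceil((hi - lo) / width))
    return [lo + i * width for i in range(n_bins + 1)]


sample = [1, 2, 3, 4, 5, 6, 7, 8]
width = scott_width_sketch(sample)
edges = edges_sketch(sample, width)
print(width)
print(edges)
```

Because the width shrinks only as \(n^{-1/3}\), multiplying the sample size by eight halves the bin width for the same \(\sigma\).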

References

[1]

Scott, David W. (1979). “On optimal and data-based histograms”. Biometrika 66 (3): 605-610.