scikitplot.stats#
Elegant statistical tools for intuitive and insightful data visualization and interpretation.
The stats module offers a wide range of probability distributions, summary and frequency statistics, correlation functions, statistical tests, masked statistics, and additional tools.
User guide. See the Stats (experimental) section for further details.
Astrostatistics: Bayesian Blocks for Time Series Analysis#
Bayesian blocks fitness for binned or unbinned events. |
|
Base class for Bayesian blocks fitness functions. |
|
Bayesian blocks fitness for point measures. |
|
Bayesian blocks fitness for regular events. |
|
Compute optimal segmentation of data with Scargle's Bayesian Blocks. |
Astrostatistics Tools#
This module contains simple statistical algorithms, each implemented as a single Python function (or family of functions).
This module should generally not be used directly. Everything in __all__ is imported into astropy.stats, and hence that package should be used for access.
Binomial proportion and confidence interval in bins of a continuous variable |
|
Binomial proportion confidence interval given k successes, n trials. |
|
Performs bootstrap resampling on numpy arrays. |
|
Construct a callable piecewise-linear CDF from a pair of arrays. |
|
Fold the weighted intervals to the interval (0,1). |
|
Convert a string or number to a floating point number, if possible. |
|
Convert a string or number to a floating point number, if possible. |
|
Histogram of a piecewise-constant weight function. |
|
Compute the length of overlap of two intervals. |
|
Compute the Kuiper statistic. |
|
Compute the false positive probability for the Kuiper statistic. |
|
Compute the Kuiper statistic to compare two samples. |
|
Calculate a robust standard deviation using the median absolute deviation (MAD). |
|
Calculate the median absolute deviation (MAD). |
|
Poisson parameter confidence interval given observed counts. |
|
Computes the signal-to-noise ratio for a source being observed in the optical/IR using a CCD. |
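As an illustration of the robust-scatter entries in the table above, here is a minimal sketch of the median absolute deviation (MAD) and the MAD-based standard deviation, written with the Python standard library only. The helper names are placeholders chosen for this example and are not claimed to match this module's signatures. The scaling factor assumes normally distributed data.

```python
import statistics

# Consistency factor 1 / Phi^{-1}(3/4), roughly 1.4826. Multiplying the MAD by
# it gives an estimate of the standard deviation for normally distributed data.
MAD_TO_STD = 1.0 / statistics.NormalDist().inv_cdf(0.75)


def median_absolute_deviation(values):
    """Median of the absolute deviations from the sample median."""
    center = statistics.median(values)
    return statistics.median([abs(v - center) for v in values])


def mad_std(values):
    """Robust standard deviation estimate based on the MAD."""
    return MAD_TO_STD * median_absolute_deviation(values)


# One gross outlier inflates the ordinary standard deviation but barely
# affects the MAD-based estimate.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
robust = mad_std(sample)
classical = statistics.stdev(sample)
```

The comparison between `robust` and `classical` shows why a MAD-based estimate is often preferred when a few extreme values are present.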
Astrostatistics: Selecting the bin width of histograms#
Calculate histogram bin edges like |
|
Return the optimal histogram bin width using the Freedman-Diaconis rule. |
|
Enhanced histogram function, providing adaptive binnings. |
|
Return the optimal histogram bin width using Knuth's rule. |
|
Return the optimal histogram bin width using Scott's rule. |
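The Freedman-Diaconis and Scott rules listed above can be sketched in a few lines. This is an illustrative standard-library version, not this module's implementation: the function names are placeholders, and the choices of the "inclusive" quantile method and the population standard deviation are assumptions made for the example.

```python
import statistics


def freedman_diaconis_width(values):
    """Bin width 2 * IQR / n^(1/3)."""
    n = len(values)
    # Quartiles with linear interpolation between order statistics.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return 2.0 * (q3 - q1) / n ** (1.0 / 3.0)


def scott_width(values):
    """Bin width 3.5 * sigma / n^(1/3)."""
    n = len(values)
    sigma = statistics.pstdev(values)  # population standard deviation (assumed)
    return 3.5 * sigma / n ** (1.0 / 3.0)


data = [1, 2, 3, 4, 5, 6, 7, 8]
fd_width = freedman_diaconis_width(data)
sc_width = scott_width(data)
```

Both rules shrink the bin width as the sample grows, at the rate n^(-1/3). The Freedman-Diaconis rule uses the interquartile range and is therefore less sensitive to outliers than Scott's rule.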
Astrostatistics: Model Selection#
This module contains simple functions for model selection.
Computes the Akaike Information Criterion (AIC). |
|
Computes the Akaike Information Criterion assuming that the observations are Gaussian distributed. |
|
Computes the Bayesian Information Criterion (BIC) given the log of the likelihood function evaluated at the estimated (or analytically derived) parameters, the number of parameters, and the number of samples. |
|
Computes the Bayesian Information Criterion (BIC) assuming that the observations come from a Gaussian distribution. |
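For reference, the textbook forms of these criteria are short enough to write out. The sketch below uses hypothetical function names and plain Python. The library functions may apply additional corrections (for example for small samples) that are deliberately left out here.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def bic_least_squares(ssr, n_params, n_samples):
    """BIC for Gaussian errors, up to an additive constant.

    ssr is the sum of squared residuals of the fitted model.
    """
    return n_samples * math.log(ssr / n_samples) + n_params * math.log(n_samples)


# Lower values indicate the preferred model. BIC penalizes extra parameters
# more heavily than AIC once ln(n) exceeds 2, that is, for n of 8 or more.
aic_value = aic(log_likelihood=-10.0, n_params=3)
bic_value = bic(log_likelihood=-10.0, n_params=3, n_samples=100)
```

Only differences between models fitted to the same data are meaningful. The absolute value of either criterion carries no interpretation on its own.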
Discrete Distributions Tools#
Tweedie Distribution Module#
This module implements the Tweedie distribution, a member of the exponential dispersion model (EDM) family, using SciPy’s rv_continuous class.
It is especially useful for modeling claim amounts in the insurance industry, where data often exhibit a mixture of zeroes and positive continuous values.
The primary focus of this package is the compound-Poisson behavior of the Tweedie distribution, particularly in the range 1 < p < 2. However, it supports calculations for all valid values of the shape parameter p.
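To make the compound-Poisson behavior concrete, the following sketch draws Tweedie variates for 1 < p < 2 as a Poisson-distributed number of gamma-distributed amounts. It is a standard-library illustration under stated assumptions, not this module's code: the function names are hypothetical, the mean/dispersion parameterization (mu, phi) is the usual one from the exponential dispersion literature, and the simple Poisson sampler is only suitable for small rates.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson variate with Knuth's multiplication method (small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_tweedie_cpg(mu, phi, p, rng):
    """Draw one compound Poisson-gamma variate with mean mu and variance phi * mu**p."""
    if not 1.0 < p < 2.0:
        raise ValueError("the compound Poisson-gamma form requires 1 < p < 2")
    rate = mu ** (2.0 - p) / (phi * (2.0 - p))  # expected number of events
    shape = (2.0 - p) / (p - 1.0)               # gamma shape of each amount
    scale = phi * (p - 1.0) * mu ** (p - 1.0)   # gamma scale of each amount
    n_events = sample_poisson(rate, rng)
    # Zero events gives an exact zero; otherwise a positive continuous total.
    return math.fsum(rng.gammavariate(shape, scale) for _ in range(n_events))


rng = random.Random(0)
draws = [sample_tweedie_cpg(mu=2.0, phi=1.0, p=1.5, rng=rng) for _ in range(20000)]
zero_fraction = sum(d == 0.0 for d in draws) / len(draws)
```

The probability of an exact zero equals exp(-rate), which is why this range of p suits data such as insurance claims, where many records have no claim and the rest are positive amounts.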
Notes
The probability density function (PDF) of the Tweedie distribution cannot be expressed in closed form for most values of p. However, approximations and numerical methods are employed to compute the PDF for practical purposes.
The Tweedie distribution family includes several well-known distributions, selected by the value of the shape parameter p:
- p = 0: Normal distribution
- p = 1: Poisson distribution
- 1 < p < 2: Compound Poisson-Gamma distribution
- p = 2: Gamma distribution
- 2 < p < 3: Positive stable distributions
- p = 3: Inverse Gaussian distribution
- p > 3: Positive stable distributions
The Tweedie distribution is undefined for values of p in the range (0, 1).
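The mapping from p to a named family can be written as a small lookup. The sketch below is illustrative only: both function names are hypothetical, and the power variance relation Var(Y) = phi * mu^p is the standard defining property of the Tweedie family rather than something stated on this page.

```python
def tweedie_family(p):
    """Return the name of the Tweedie family member for shape parameter p."""
    if 0.0 < p < 1.0:
        raise ValueError("no Tweedie distribution exists for 0 < p < 1")
    if p < 0.0:
        # Negative p is outside the cases enumerated in this section.
        return "not covered by the list above"
    if p == 0.0:
        return "Normal"
    if p == 1.0:
        return "Poisson"
    if p < 2.0:
        return "Compound Poisson-Gamma"
    if p == 2.0:
        return "Gamma"
    if p == 3.0:
        return "Inverse Gaussian"
    return "Positive stable"  # 2 < p < 3 or p > 3


def tweedie_variance(mu, phi, p):
    """Power variance function: Var(Y) = phi * mu**p."""
    return phi * mu ** p


family = tweedie_family(1.5)
variance = tweedie_variance(mu=2.0, phi=1.0, p=2.0)
```

Raising an error for 0 < p < 1 mirrors the statement above that the distribution is undefined in that range.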
References
- [1] Jørgensen, B. (1987). “Exponential dispersion models”. Journal of the Royal Statistical Society, Series B. 49 (2): 127–162.
- [2] Tweedie, M. C. K. (1984). “An index which distinguishes between some important exponential families”. In Statistics: Applications and New Directions. Proceedings of the Indian Statistical Institute Golden Jubilee International Conference.
- [3] [YouTube] Statistical Methods Series: Zero-Inflated GLM and GLMM.
- [4] [Google]
A Tweedie continuous random variable inherited |
|
An instance of |