8. Statistics (scipy.stats)
1. Random Variables
Continuous and discrete random variables
Many continuous random variables (RVs) and discrete random variables have been implemented using these classes.
2. Main methods
rvs: random variates
pdf: probability density function
cdf: cumulative distribution function
sf: survival function (1 - CDF)
ppf: percent point function (inverse of CDF)
isf: inverse survival function (inverse of SF)
stats: return mean, variance, (Fisher's) skew, or (Fisher's) kurtosis
moment: non-central moments of the distribution
(1) Normal distribution: norm
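A minimal sketch of the main methods applied to the normal distribution (the evaluation points are illustrative):

```python
from scipy import stats

print(stats.norm.cdf(0.0))        # P(X <= 0) for a standard normal: 0.5
print(stats.norm.ppf(0.5))        # percent point function, inverse of the CDF: 0.0
print(stats.norm.sf(0.0))         # survival function, 1 - CDF: 0.5
samples = stats.norm.rvs(size=3)  # three random variates
print(samples.shape)              # (3,)
```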
(2) Shifting and scaling
All continuous distributions take loc and scale as keyword parameters to adjust the location and scale of the distribution.
For the standard normal distribution, the location is the mean and the scale is the standard deviation.
In many cases the standardized distribution for a random variable X is obtained through the transformation (X - loc) / scale. The default values are loc = 0 and scale = 1.
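A short sketch of loc/scale in action (the values 3 and 2 are illustrative):

```python
from scipy import stats

# Normal distribution shifted to mean 3 and scaled to standard deviation 2;
# stats() returns the first moments, here mean and variance.
mean, var = stats.norm.stats(loc=3, scale=2, moments='mv')
print(mean, var)  # 3.0 4.0
```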
(3) Uniform distribution: uniform
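For uniform, loc and scale define the support [loc, loc + scale]; a small sketch with assumed endpoints 1 and 5:

```python
from scipy import stats

# Uniform distribution on [1, 5]: loc = 1, scale = 4.
u = stats.uniform(loc=1, scale=4)
print(u.cdf([0, 1, 3, 5, 6]))  # [0.  0.  0.5 1.  1. ]
```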
(4) Shape parameters
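Some distributions require extra shape parameters beyond loc and scale; a sketch using gamma, whose single shape parameter is named a (the values 2 and 3 are illustrative):

```python
from scipy import stats

# numargs and shapes describe the required shape parameters.
print(stats.gamma.numargs)  # 1
print(stats.gamma.shapes)   # 'a'

# The mean of a gamma distribution is a * scale.
mean = stats.gamma.mean(a=2, scale=3)
print(mean)  # 6.0
```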
(5) Freezing a distribution
Passing the loc and scale keywords time and again can become quite annoying; freezing the distribution fixes these parameters once.
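A sketch of a frozen distribution (loc = 3, scale = 2 are illustrative):

```python
from scipy import stats

# Freeze a normal RV with the given loc and scale; subsequent method
# calls no longer need the keywords.
frozen = stats.norm(loc=3, scale=2)
print(frozen.mean(), frozen.std())  # 3.0 2.0
print(frozen.cdf(3))                # 0.5 at the mean
```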
(6) Broadcasting
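The distribution methods broadcast their arguments like NumPy ufuncs; a sketch computing critical values of the t distribution for several tail probabilities and degrees of freedom at once (the particular values are illustrative):

```python
from scipy import stats

# Columns: upper-tail probabilities; rows: degrees of freedom 10 and 11.
# Broadcasting a (3,) array against a (2, 1) array yields a (2, 3) result.
crit = stats.t.isf([0.1, 0.05, 0.01], [[10], [11]])
print(crit.shape)  # (2, 3)
```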
3. Specific points for discrete distributions
The pdf is replaced by the probability mass function pmf.
(1) Hypergeometric distribution: hypergeom
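A sketch of the hypergeometric pmf, with assumed numbers: a population of 20, of which 7 are of the type of interest, and a sample of 12 drawn without replacement:

```python
from scipy import stats

M, n, N = 20, 7, 12          # population size, successes, sample size
rv = stats.hypergeom(M, n, N)
print(rv.pmf(4))             # probability of exactly 4 successes in the sample
print(rv.pmf(range(8)).sum())  # pmf sums to 1 over the support 0..7
```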
4. Building your own distributions
(1) Making a continuous distribution (rv_continuous)
(2) Making a discrete distribution (rv_discrete)
"You can construct an arbitrary discrete RV where P{X = xk} = pk by passing to the rv_discrete initialization method (through the values= keyword) a tuple of sequences (xk, pk) which describes only those values of X (xk) that occur with nonzero probability (pk)."
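A minimal sketch of the quoted mechanism, with illustrative values xk and probabilities pk:

```python
from scipy import stats

xk = (1, 2, 3)
pk = (0.2, 0.5, 0.3)  # must sum to 1
custom = stats.rv_discrete(name='custom', values=(xk, pk))

print(custom.pmf(2))   # 0.5
print(custom.cdf(2))   # 0.2 + 0.5 = 0.7
print(custom.mean())   # 1*0.2 + 2*0.5 + 3*0.3 = 2.1
```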
5. Comparing two samples
We want to test whether these samples have the same statistical properties.
(1) Compare means
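A sketch of comparing means with a two-sample t-test; the sample sizes, distribution parameters, and seed are assumptions for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = stats.norm.rvs(loc=5, scale=10, size=500, random_state=rng)
b = stats.norm.rvs(loc=5, scale=10, size=500, random_state=rng)

# Null hypothesis: both samples have the same mean.
t_stat, p_value = stats.ttest_ind(a, b)
print(p_value)  # drawn from identical distributions, so typically large
```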
(2) Kolmogorov-Smirnov test (ks_2samp): rejecting the null hypothesis
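A sketch of the two-sample Kolmogorov-Smirnov test; the second sample is deliberately shifted so the null hypothesis (same distribution) is rejected. Sizes and seed are illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = stats.norm.rvs(loc=0, scale=1, size=300, random_state=rng)
b = stats.norm.rvs(loc=2, scale=1, size=300, random_state=rng)  # shifted by 2

stat, p_value = stats.ks_2samp(a, b)
print(p_value)  # very small: reject the hypothesis of a common distribution
```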
6. Kernel Density Estimation
The most well-known tool for this is the histogram. Kernel density estimation (KDE) is a more efficient tool for estimating the probability density. The gaussian_kde estimator can be used to estimate the PDF of univariate as well as multivariate data. It works best if the data is unimodal (a single peak).
(1) Univariate estimation
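A sketch of univariate KDE on simulated standard-normal data (sample size and seed are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
data = stats.norm.rvs(size=1000, random_state=rng)

# Fit a Gaussian kernel density estimate and evaluate it on a grid.
kde = stats.gaussian_kde(data)
x = np.linspace(-3, 3, 7)
print(kde(x))  # estimated density, highest near 0 for this unimodal data
```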
(2) A Student's t distribution with 5 degrees of freedom
(3) Bimodal distribution (two peaks): not well estimated
(4) Multivariate estimation
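gaussian_kde also handles multivariate data; a sketch with assumed 2-D data, where the input shape is (n_dimensions, n_samples):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
xy = rng.standard_normal((2, 500))  # 2 dimensions, 500 samples

kde = stats.gaussian_kde(xy)
# Evaluation points also use shape (n_dimensions, n_points).
density = kde(np.array([[0.0], [0.0]]))  # density at the origin
print(density)
```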