SciPy Module 1: SciPy Tutorial 5, Statistics

Source: Internet
Author: User

8. Statistics (scipy.stats)

1. Random variables

Continuous random variables and discrete random variables

Many continuous random variables (RVs) and discrete random variables have been implemented using two general classes, rv_continuous and rv_discrete.





2. Main methods

rvs: random variates
pdf: probability density function
cdf: cumulative distribution function
sf: survival function (1 - cdf)
ppf: percent point function (inverse of cdf)
isf: inverse survival function (inverse of sf)
stats: return mean, variance, (Fisher's) skew, or (Fisher's) kurtosis
moment: non-central moments of the distribution

(1) Normal distribution (norm)
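The original code for this step did not survive, so the following is a minimal sketch of how the main methods might be called on norm. The variable names and the seed are illustrative choices, not part of the source.

```python
import numpy as np
from scipy.stats import norm

# CDF of the standard normal at 0: half of the mass lies below the mean.
cdf_at_zero = norm.cdf(0)

# cdf accepts arrays and evaluates element-wise.
cdf_values = norm.cdf(np.array([-1.0, 0.0, 1.0]))

# Mean and variance of the standard normal.
mean, var = norm.mean(), norm.var()

# ppf is the inverse of cdf, so ppf(0.5) is the median.
median = norm.ppf(0.5)

# Draw 5 random variates; the seeded generator makes the draw reproducible.
rng = np.random.default_rng(0)
sample = norm.rvs(size=5, random_state=rng)

print(cdf_at_zero, cdf_values, mean, var, median)
print(sample)
```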


(2) Shifting and scaling

All continuous distributions take loc and scale as keyword parameters to adjust the location and scale of the distribution.

For the normal distribution, the location is the mean and the scale is the standard deviation.


In many cases, the standardized distribution for a random variable X is obtained through the transformation (X - loc) / scale. The default values are loc = 0 and scale = 1.
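As a small sketch of the transformation described above, the snippet below checks that evaluating a shifted and scaled normal at x gives the same result as evaluating the standard normal at (x - loc) / scale. The values loc = 3, scale = 4 and x = 5 are arbitrary illustrations.

```python
from scipy.stats import norm

loc, scale = 3.0, 4.0

# Mean and variance of a normal with loc=3, scale=4: mean = loc, variance = scale**2.
mean, var = norm.stats(loc=loc, scale=scale, moments="mv")

x = 5.0
# CDF of the shifted and scaled distribution ...
shifted = norm.cdf(x, loc=loc, scale=scale)
# ... equals the standard normal CDF at the standardized point.
standardized = norm.cdf((x - loc) / scale)

print(float(mean), float(var), shifted, standardized)
```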

(3) Uniform distribution (uniform)
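For uniform, loc and scale define the interval [loc, loc + scale]. The sketch below, with illustrative values loc = 1 and scale = 4, evaluates the CDF of the uniform distribution on [1, 5].

```python
import numpy as np
from scipy.stats import uniform

x = np.array([0, 1, 2, 3, 4, 5])

# Uniform on [loc, loc + scale] = [1, 5]; the CDF rises linearly from 0 to 1.
cdf_values = uniform.cdf(x, loc=1, scale=4)

print(cdf_values)  # [0.   0.   0.25 0.5  0.75 1.  ]
```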


(4) Shape parameters
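Some distributions need extra shape parameters in addition to loc and scale. As an assumed example (the source does not say which distribution it used), the gamma distribution takes one shape parameter, a:

```python
from scipy.stats import gamma

# Number and names of the shape parameters.
print(gamma.numargs)  # 1
print(gamma.shapes)   # a

# With shape a and the default scale of 1, mean = a and variance = a.
mean, var = gamma.stats(a=2.0, moments="mv")

# With scale = 0.5, mean = a * scale and variance = a * scale**2.
mean_scaled, var_scaled = gamma.stats(a=2.0, scale=0.5, moments="mv")

print(float(mean), float(var), float(mean_scaled), float(var_scaled))
```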



(5) Freezing a distribution

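Passing the loc and scale keywords time and again can become quite annoying. Freezing a distribution fixes these parameters once, so the frozen object can be reused without repeating them.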
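A minimal sketch of freezing, assuming a gamma distribution with shape 1 and scale 2 (illustrative values): calling the distribution object with its parameters returns a frozen RV whose methods no longer need them.

```python
from scipy.stats import gamma

# Freeze the shape (a=1) and scale once.
rv = gamma(1.0, scale=2.0)

# The frozen object carries its parameters with it.
mean, var = rv.mean(), rv.var()

# Same result as passing the parameters explicitly on every call.
frozen_cdf = rv.cdf(1.0)
explicit_cdf = gamma.cdf(1.0, 1.0, scale=2.0)

print(mean, var, frozen_cdf, explicit_cdf)
```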


(6) Broadcasting
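The basic methods follow NumPy broadcasting rules. As a sketch with illustrative inputs, the snippet below computes upper-tail critical values of the t distribution for three tail probabilities and two degrees-of-freedom values in one call.

```python
import numpy as np
from scipy.stats import t

tail_probs = [0.1, 0.05, 0.01]  # shape (3,)
dofs = [[10], [11]]             # shape (2, 1)

# Shapes (3,) and (2, 1) broadcast to (2, 3):
# one row per degrees-of-freedom value, one column per tail probability.
critical = t.isf(tail_probs, dofs)

print(critical.shape)
print(critical)
```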


3. Specific points for discrete distributions

For discrete distributions, the pdf is replaced by the probability mass function pmf.

(1) Hypergeometric distribution (hypergeom)
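A sketch of the hypergeometric distribution, assuming a population of M = 20 objects of which n = 7 are successes, with N = 12 drawn without replacement (illustrative numbers):

```python
import numpy as np
from scipy.stats import hypergeom

M, n, N = 20, 7, 12  # population size, successes in population, number of draws

# The number of successes in the draw can range from 0 to 7 here.
k = np.arange(0, 8)
pmf = hypergeom.pmf(k, M, n, N)

# The pmf sums to 1 over the support, and the mean is N * n / M.
total = pmf.sum()
mean = np.sum(k * pmf)

# cdf and ppf are inverses of each other on points of the support.
k_even = np.arange(4) * 2  # 0, 2, 4, 6
recovered = hypergeom.ppf(hypergeom.cdf(k_even, M, n, N), M, n, N)

print(total, mean, recovered)
```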


4. Building your own distributions

(1) Making a continuous distribution (rv_continuous)
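Continuous distributions are built by subclassing rv_continuous and defining at least _pdf or _cdf. The class below is a hypothetical example (not from the source) with density f(x) = 2x on [0, 1]; the remaining methods are derived numerically by the base class.

```python
import numpy as np
from scipy.stats import rv_continuous


class linear_density_gen(rv_continuous):
    """Hypothetical distribution with density f(x) = 2x on [0, 1]."""

    def _pdf(self, x):
        return 2.0 * x

    def _cdf(self, x):
        return x ** 2


# a and b set the lower and upper bounds of the support.
linear_density = linear_density_gen(a=0.0, b=1.0, name="linear_density")

cdf_half = linear_density.cdf(0.5)  # 0.5 ** 2 = 0.25
median = linear_density.ppf(0.5)    # numerical inversion of the CDF, sqrt(0.5)
mean = linear_density.mean()        # numerical integration, 2/3

print(cdf_half, median, mean)
```

Supplying _cdf in addition to _pdf is optional, but it avoids numerical integration every time the CDF is needed.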


(2) Making a discrete distribution (rv_discrete)

"You can construct an arbitrary discrete RV where P{X=xk} = pk by passing to the rv_discrete initialization method (through the values= keyword) a tuple of sequences (xk, pk) which describes only those values of X (xk) that occur with nonzero probability (pk)."

5. Comparing two samples

Given two samples, we want to test whether they have the same statistical properties.

(1) Comparing means
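One way to compare means is the two-sample t-test, ttest_ind. The sketch below uses synthetic normal samples with assumed parameters (size 500, scale 10); the sample names and the seed are illustrative.

```python
import numpy as np
from scipy.stats import norm, ttest_ind

rng = np.random.default_rng(12345)

# Two samples with the same mean, and a third with a clearly different mean.
rvs1 = norm.rvs(loc=5, scale=10, size=500, random_state=rng)
rvs2 = norm.rvs(loc=5, scale=10, size=500, random_state=rng)
rvs3 = norm.rvs(loc=10, scale=10, size=500, random_state=rng)

# Null hypothesis: both samples have the same mean.
same = ttest_ind(rvs1, rvs2)
shifted = ttest_ind(rvs1, rvs3)

print("same mean:      statistic=%.3f, pvalue=%.3g" % (same.statistic, same.pvalue))
print("different mean: statistic=%.3f, pvalue=%.3g" % (shifted.statistic, shifted.pvalue))
```

A small p-value for the shifted pair indicates that the hypothesis of equal means can be rejected; for the pair drawn from the same distribution the p-value is usually not small.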


(2) Kolmogorov-Smirnov test (ks_2samp), rejecting the null hypothesis
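The two-sample Kolmogorov-Smirnov test compares whole distributions rather than only means. This sketch uses synthetic samples with assumed parameters; the null hypothesis is that both samples come from the same distribution.

```python
import numpy as np
from scipy.stats import norm, ks_2samp

rng = np.random.default_rng(2024)

# Reference sample, a sample from the same distribution, and a shifted sample.
base = norm.rvs(loc=5, scale=10, size=500, random_state=rng)
same_dist = norm.rvs(loc=5, scale=10, size=500, random_state=rng)
shifted = norm.rvs(loc=15, scale=10, size=500, random_state=rng)

res_same = ks_2samp(base, same_dist)
res_shifted = ks_2samp(base, shifted)

print("same distribution: D=%.3f, pvalue=%.3g" % (res_same.statistic, res_same.pvalue))
print("shifted location:  D=%.3f, pvalue=%.3g" % (res_shifted.statistic, res_shifted.pvalue))
```

With the location shifted by one standard deviation, the p-value is far below common significance levels, so the null hypothesis is rejected for that pair.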


6. Kernel density estimation

A common task is to estimate the probability density function (PDF) of a random variable from a sample. The most well-known tool to do this is the histogram. Kernel density estimation (KDE) is a more efficient tool for estimating the probability density. The gaussian_kde estimator can be used to estimate the PDF of univariate as well as multivariate data. It works best if the data is unimodal (has a single peak).

(1) Univariate estimation
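A minimal univariate sketch, assuming a tiny illustrative data set of five points. gaussian_kde chooses its bandwidth with Scott's rule by default; Silverman's rule can be selected with bw_method.

```python
import numpy as np
from scipy.stats import gaussian_kde

x1 = np.array([-7, -5, 1, 4, 5], dtype=float)

kde_scott = gaussian_kde(x1)                             # default: Scott's rule
kde_silverman = gaussian_kde(x1, bw_method="silverman")

# Evaluate the estimated densities on a grid.
grid = np.linspace(-20, 20, 401)
density_scott = kde_scott(grid)
density_silverman = kde_silverman(grid)

# A density estimate should integrate to 1 over the whole real line.
area = kde_scott.integrate_box_1d(-np.inf, np.inf)

print(kde_scott.factor, kde_silverman.factor, area)
```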



(2) A Student's t distribution with 5 degrees of freedom
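For a unimodal sample the estimate can be compared with the true density. The sketch below draws an assumed 1000 variates from a t distribution with 5 degrees of freedom and measures the largest pointwise error of the KDE on a grid.

```python
import numpy as np
from scipy.stats import t, gaussian_kde

rng = np.random.default_rng(7)

# Sample from a Student's t distribution with 5 degrees of freedom.
sample = t.rvs(5, size=1000, random_state=rng)

kde = gaussian_kde(sample)

# Compare the estimate with the true PDF on a grid.
grid = np.linspace(-5, 5, 201)
true_pdf = t.pdf(grid, 5)
max_error = np.max(np.abs(kde(grid) - true_pdf))

print("largest absolute error on the grid:", max_error)
```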



(3) Bimodal distribution: not well estimated
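To illustrate why a bimodal sample is harder, the sketch below mixes a wide and a narrow normal component (assumed parameters) and evaluates the KDE at the narrow peak. A single bandwidth fitted to the whole sample smooths the narrow peak away; a smaller, hand-picked bandwidth factor (0.1, an arbitrary choice) recovers more of it.

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

rng = np.random.default_rng(3)

# Mixture: 175 points from N(-2, 1) and 50 points from N(2, 0.2).
loc1, scale1, size1 = -2.0, 1.0, 175
loc2, scale2, size2 = 2.0, 0.2, 50
x2 = np.concatenate([
    rng.normal(loc=loc1, scale=scale1, size=size1),
    rng.normal(loc=loc2, scale=scale2, size=size2),
])

kde_default = gaussian_kde(x2)                # Scott's rule
kde_narrow = gaussian_kde(x2, bw_method=0.1)  # smaller bandwidth factor

# True mixture density at the narrow peak.
w1 = size1 / (size1 + size2)
w2 = size2 / (size1 + size2)
true_at_peak = w1 * norm.pdf(loc2, loc1, scale1) + w2 * norm.pdf(loc2, loc2, scale2)

default_at_peak = kde_default(loc2)[0]
narrow_at_peak = kde_narrow(loc2)[0]

print(true_at_peak, default_at_peak, narrow_at_peak)
```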



(4) Multivariate estimation
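gaussian_kde expects multivariate data as an array of shape (dimensions, points). The sketch below builds two correlated measurements with a hypothetical helper, measure, and evaluates the two-dimensional estimate on a grid.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(11)


def measure(n):
    """Hypothetical helper: return two correlated measurement series."""
    m1 = rng.normal(size=n)
    m2 = rng.normal(scale=0.5, size=n)
    return m1 + m2, m1 - m2


m1, m2 = measure(2000)

# Stack into shape (2, n): one row per dimension.
values = np.vstack([m1, m2])
kernel = gaussian_kde(values)

# Evaluate on a 50 x 50 grid; positions also has shape (2, number_of_points).
X, Y = np.mgrid[-4:4:50j, -4:4:50j]
positions = np.vstack([X.ravel(), Y.ravel()])
Z = np.reshape(kernel(positions), X.shape)

print(kernel.d, kernel.n, Z.shape, Z.max())
```

The array Z can then be passed to a plotting routine such as an image or contour plot.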




