A brief introduction to several statistical distributions in R

Last Update: 2016-11-14 Source: Internet

Author: User

There are many statistical distributions, and most of them are available in R. Because our own knowledge is limited, we have picked a few common and important ones and briefly describe each one's definition, formula, and representation in R.

Here's a list of distributions:

rnorm(n, mean=0, sd=1) Gaussian (normal) distribution
rexp(n, rate=1) exponential distribution
rgamma(n, shape, scale=1) gamma distribution
rpois(n, lambda) Poisson distribution
rweibull(n, shape, scale=1) Weibull distribution
rcauchy(n, location=0, scale=1) Cauchy distribution
rbeta(n, shape1, shape2) beta distribution
rt(n, df) t distribution
rf(n, df1, df2) F distribution
rchisq(n, df) χ2 (chi-square) distribution
rbinom(n, size, prob) binomial distribution
rgeom(n, prob) geometric distribution
rhyper(nn, m, n, k) hypergeometric distribution
rlogis(n, location=0, scale=1) logistic distribution
rlnorm(n, meanlog=0, sdlog=1) log-normal distribution
rnbinom(n, size, prob) negative binomial distribution
runif(n, min=0, max=1) uniform distribution
rwilcox(nn, m, n), rsignrank(nn, n) Wilcoxon distributions
Note that the functions above follow a pattern: every name starts with r, which generates random numbers from the distribution. To get the probability density, replace r with d.

To get the cumulative probability (the distribution function), replace r with p.

To get quantiles, replace r with q.
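As a small illustration of this naming pattern, the sketch below applies the four prefixes to the normal distribution. The evaluation points (0, 1.96, 0.975) and the seed are arbitrary choices made for this example, not values from the original article.

```r
# The four prefixes applied to the normal distribution.
set.seed(1)                 # arbitrary seed, only for reproducibility
draws <- rnorm(5)           # r: five random draws from N(0, 1)
dens  <- dnorm(0)           # d: density at 0, equal to 1 / sqrt(2 * pi)
cum   <- pnorm(1.96)        # p: cumulative probability P(X <= 1.96)
quant <- qnorm(0.975)       # q: the 0.975 quantile

# p and q are inverse functions of each other
round_trip <- pnorm(qnorm(0.975))

print(draws)
print(c(dens = dens, cum = cum, quant = quant, round_trip = round_trip))
```

The same substitution works for the other functions in the list, for example dbinom, pbinom, and qbinom for the binomial distribution.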

Binomial distribution:

The binomial distribution arises from repeating an independent Bernoulli trial n times. Each trial has only two possible outcomes, which are mutually exclusive. The trials are independent of one another, and the probability that the event occurs stays the same in every trial. Such a series of trials is called an n-fold Bernoulli experiment. When the number of trials is 1, the binomial distribution reduces to the 0-1 (Bernoulli) distribution.

Formula: P(ξ=k) = C(n,k) * p^k * (1-p)^(n-k)

where p is the probability of success, n is the number of independent trials, k is the number of successes in the n trials, and C(n,k) is the binomial coefficient.

Expectation: Eξ = np

Variance: Dξ = np(1-p)
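The formula, expectation, and variance above can be checked numerically. The following is a minimal sketch; the values n = 10 and p = 0.4 and all variable names are illustrative assumptions, not taken from the original article.

```r
# Binomial probabilities computed directly from the formula.
n_trials  <- 10           # n: number of trials (illustrative value)
p_success <- 0.4          # p: probability of success (illustrative value)
successes <- 0:n_trials   # k: every possible number of successes

by_formula <- choose(n_trials, successes) *
  p_success^successes * (1 - p_success)^(n_trials - successes)
by_dbinom  <- dbinom(successes, size = n_trials, prob = p_success)

# Expectation and variance computed from the probabilities themselves
mean_from_pmf <- sum(successes * by_formula)                      # compare with n * p
var_from_pmf  <- sum((successes - mean_from_pmf)^2 * by_formula)  # compare with n * p * (1 - p)

print(all.equal(by_formula, by_dbinom))
print(c(mean = mean_from_pmf, np = n_trials * p_success))
print(c(variance = var_from_pmf, npq = n_trials * p_success * (1 - p_success)))
```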

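The binomial distribution in R:

p <- 0.4                # probability of success
k <- 200                # number of trials per draw (the size argument)
n <- 10000              # number of random draws
x <- rbinom(n, k, p)
hist(x)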

To standardize:

mean <- k * p                  # theoretical mean
var <- k * p * (1 - p)         # theoretical variance
z <- (x - mean) / sqrt(var)    # standardized values
hist(z)

Plot the density:

mean <- k * p
var <- k * p * (1 - p)
z <- (x - mean) / sqrt(var)
d <- density(z)                # kernel density estimate of the standardized values
plot(d)

Normal distribution:

The normal curve is low at both ends, high in the middle, and symmetric about its center. Because of this shape it is often called the bell curve.

If the random variable X follows a normal distribution with expectation μ and variance σ^2, this is written X ~ N(μ, σ^2).

When μ = 0 and σ = 1, the normal distribution is the standard normal distribution.

The normal distribution in R:

# mean and var are the values computed in the binomial example above
x <- rnorm(k, mean = mean, sd = sqrt(var))
hist(x)
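The link between N(μ, σ^2) and the standard normal distribution can be sketched with pnorm: a probability computed on the original scale equals the probability of the standardized value under N(0, 1). The numbers below (μ = 80, σ = 20, evaluation point 100) are hypothetical values chosen only for this example.

```r
# Standardizing a normal variable: z = (x - mu) / sigma
mu    <- 80     # illustrative expectation
sigma <- 20     # illustrative standard deviation
x0    <- 100    # point at which the cumulative probability is evaluated

z0 <- (x0 - mu) / sigma                         # standardized value
direct       <- pnorm(x0, mean = mu, sd = sigma) # P(X <= x0) under N(mu, sigma^2)
standardized <- pnorm(z0)                        # P(Z <= z0) under N(0, 1)

# Probability of falling within one standard deviation of the mean
within_one_sd <- pnorm(1) - pnorm(-1)

print(c(direct = direct, standardized = standardized))
print(within_one_sd)
```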

Poisson distribution:

The Poisson distribution is a discrete probability distribution commonly encountered in statistics and probability theory. It was published by the French mathematician Siméon-Denis Poisson in 1838.

Probability function of the Poisson distribution: P(X=k) = λ^k * e^(-λ) / k!, for k = 0, 1, 2, ...

The parameter λ of the Poisson distribution is the average number of occurrences of a random event per unit time (or unit area). The Poisson distribution is suitable for describing the number of random events that occur per unit time.

The Poisson distribution in R:

# k is the number of draws; it keeps the value set earlier
par(mfrow = c(2, 2), mar = c(3, 4, 1, 1))
lambda <- 0.5
x <- rpois(k, lambda)
hist(x)
lambda <- 1
x <- rpois(k, lambda)
hist(x)
lambda <- 5
x <- rpois(k, lambda)
hist(x)
lambda <- 10
x <- rpois(k, lambda)
hist(x)
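As a hedged check of the role of λ, the sketch below evaluates the standard Poisson probability function P(X=k) = λ^k e^(-λ) / k! directly and compares it with dpois. The rate 5 and the truncation of the support at 50 are assumptions made for this example; the probability mass beyond 50 is negligible for this rate.

```r
# Poisson probabilities from the formula lambda^k * exp(-lambda) / k!
lam    <- 5        # illustrative rate
counts <- 0:50     # truncated support (assumption: the tail beyond 50 is negligible)

by_formula <- lam^counts * exp(-lam) / factorial(counts)
by_dpois   <- dpois(counts, lam)

# For a Poisson distribution both the mean and the variance equal lambda
pois_mean <- sum(counts * by_formula)
pois_var  <- sum((counts - pois_mean)^2 * by_formula)

print(all.equal(by_formula, by_dpois))
print(c(mean = pois_mean, variance = pois_var))
```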

Binomial distribution and Poisson distribution:

When n of the binomial distribution is large and p is very small, the Poisson distribution can be used to approximate the binomial distribution, with λ = np. Usually, when n ≥ 10 and p ≤ 0.1, the Poisson formula can be used for the approximate calculation.

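par(mfrow = c(3, 3), mar = c(3, 4, 1, 1))
k <- 10000                   # number of random draws per histogram
p <- c(0.5, 0.05, 0.005)     # success probabilities
n <- c(10, 100, 1000)        # numbers of trials
for (i in p) {
  for (j in n) {
    x <- rbinom(k, j, i)
    hist(x)
  }
}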
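The approximation can also be checked numerically rather than by eye. The sketch below compares exact binomial probabilities with Poisson probabilities for λ = np in one favorable case (n = 1000, p = 0.005) and one unfavorable case (n = 10, p = 0.5); both cases reuse values from the grid above, while the variable names and the range of counts are assumptions for this example.

```r
# Favorable case: large n, small p
n_trials <- 1000
p_small  <- 0.005
lam      <- n_trials * p_small            # lambda = np
counts   <- 0:20

binom_prob <- dbinom(counts, n_trials, p_small)
pois_prob  <- dpois(counts, lam)
max_diff_good <- max(abs(binom_prob - pois_prob))

# Unfavorable case: same lambda = 5, but n is small and p is large
max_diff_bad <- max(abs(dbinom(0:10, 10, 0.5) - dpois(0:10, 10 * 0.5)))

print(c(good = max_diff_good, bad = max_diff_bad))
```

The largest pointwise difference is much smaller in the first case, which matches the condition that n be large and p be small.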

Chi-Square Distribution:

If n random variables ξ1, ξ2, ..., ξn are independent and each follows the standard normal distribution (that is, they are independent and identically distributed standard normal variables), then the sum of their squares is a new random variable, and its distribution is called the chi-square distribution.

The chi-square distribution is thus a new distribution constructed from the normal distribution. When the degrees of freedom n is large, the chi-square distribution is approximately normal.

Chi-square distribution in R:

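k <- 10000                   # number of random draws
par(mfrow = c(2, 2), mar = c(3, 4, 1, 1))
x <- rchisq(k, 2)            # 2 degrees of freedom
d <- density(x)
plot(d)
x <- rchisq(k, 5)            # 5 degrees of freedom
d <- density(x)
plot(d)
x <- rchisq(k, 100)          # 100 degrees of freedom
d <- density(x)
plot(d)
x <- rchisq(k, 1000)         # 1000 degrees of freedom
d <- density(x)
plot(d)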
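The definition of the chi-square distribution as a sum of squared standard normal variables can be sketched by simulation. The degrees of freedom (5), the number of simulations, and the seed below are arbitrary choices for this example. The comparison uses the known facts that a chi-square variable with n degrees of freedom has mean n and variance 2n.

```r
# Build chi-square draws from squared standard normal draws.
set.seed(42)                  # arbitrary seed for reproducibility
df_n   <- 5                   # degrees of freedom (illustrative)
n_sims <- 10000               # number of simulated sums

z      <- matrix(rnorm(n_sims * df_n), nrow = n_sims, ncol = df_n)
sum_sq <- rowSums(z^2)        # each row: sum of df_n squared N(0, 1) values

sim_mean <- mean(sum_sq)      # theoretical value: df_n
sim_var  <- var(sum_sq)       # theoretical value: 2 * df_n

# Share of simulated values below the theoretical median from qchisq
below_median <- mean(sum_sq <= qchisq(0.5, df_n))

print(c(sim_mean = sim_mean, sim_var = sim_var, below_median = below_median))
```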

F Distribution:

The F distribution is defined as follows. Let X and Y be two independent random variables, where X follows a chi-square distribution with k1 degrees of freedom and Y follows a chi-square distribution with k2 degrees of freedom. Divide each variable by its degrees of freedom and take the ratio: F = (X/k1) / (Y/k2). The distribution of this statistic is the F distribution with first degree of freedom k1 and second degree of freedom k2.

k <- 10000                   # number of random draws
par(mfrow = c(2, 2), mar = c(3, 4, 1, 1))
x <- rf(k, 1, 100)
hist(x)
x <- rf(k, 1, 10000)
hist(x)
x <- rf(k, 10, 10000)
hist(x)
x <- rf(k, 10000, 10000)
hist(x)
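The construction of the F distribution from two chi-square variables can likewise be sketched by simulation. The degrees of freedom (10 and 20), the sample size, and the seed are assumptions chosen for this example. The simulated quantiles are compared with the theoretical ones from qf.

```r
# Build F draws as a ratio of scaled chi-square draws.
set.seed(42)                  # arbitrary seed for reproducibility
k1     <- 10                  # first degree of freedom (illustrative)
k2     <- 20                  # second degree of freedom (illustrative)
n_sims <- 10000

x_chi   <- rchisq(n_sims, k1)
y_chi   <- rchisq(n_sims, k2)
f_ratio <- (x_chi / k1) / (y_chi / k2)

probs  <- c(0.25, 0.5, 0.75)
sim_q  <- quantile(f_ratio, probs)   # quantiles of the simulated ratio
theo_q <- qf(probs, k1, k2)          # quantiles of the F(k1, k2) distribution

print(rbind(simulated = sim_q, theoretical = theo_q))
```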

t distribution:

The shape of the t distribution curve depends on n (more precisely, on the degrees of freedom v). Compared with the standard normal curve, the smaller v is, the flatter the t curve: it is lower in the middle and higher in the tails. The larger v is, the closer the t curve is to the normal curve, and as v → ∞ the t curve becomes the standard normal curve.

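k <- 10000                   # number of random draws
par(mfrow = c(2, 2), mar = c(3, 4, 1, 1))
x <- rt(k, 2)                # 2 degrees of freedom
hist(x)
x <- rt(k, 5)                # 5 degrees of freedom
hist(x)
x <- rt(k, 10)               # 10 degrees of freedom
hist(x)
x <- rt(k, 100)              # 100 degrees of freedom
hist(x)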
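The statement about the middle and the tails of the t curve can be made concrete without simulation. The sketch below, which reuses the degrees of freedom from the histograms (2, 5, 10, 100), compares the height at the center, a tail probability, and the 0.975 quantile with the standard normal values. The evaluation points 0, -3, and 0.975 are arbitrary choices for this example.

```r
# Compare the t distribution with the standard normal for growing df.
dfs <- c(2, 5, 10, 100)

peak      <- dt(0, dfs)        # height of the curve at the center
tail_prob <- pt(-3, dfs)       # probability of falling below -3
q975      <- qt(0.975, dfs)    # 0.975 quantile

comparison <- data.frame(df = dfs, peak = peak, tail_prob = tail_prob, q975 = q975)
normal_ref <- c(peak = dnorm(0), tail_prob = pnorm(-3), q975 = qnorm(0.975))

print(comparison)
print(normal_ref)
```

As the degrees of freedom grow, the peak rises toward the normal value while the tail probability and the quantile shrink toward theirs.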

Plots relating the distributions:

# Average x in groups of n values and return the group means
i2mean <- function(x, n = 10) {
  k <- length(x)
  nobs <- k / n
  xm <- matrix(x, nobs, n)
  y <- rowMeans(xm)
  return(y)
}

par(mfrow = c(5, 1), mar = c(3, 4, 1, 1))

# Binomial
p <- 0.05
n <- 100
k <- 10000
x <- i2mean(rbinom(k, n, p))
d <- density(x)
plot(d, main = "Binomial")

# Poisson
lambda <- 10
x <- i2mean(rpois(k, lambda))
d <- density(x)
plot(d, main = "Poisson")

# Chi-square
x <- i2mean(rchisq(k, 5))
d <- density(x)
plot(d, main = "Chi-Square")

# F
x <- i2mean(rf(k, 10, 10000))
d <- density(x)
plot(d, main = "F Dist")

# t
x <- i2mean(rt(k, 5))
d <- density(x)
plot(d, main = "T Dist")

From: Chopping wood and asking the woodcutter
