R Language Data Analysis series four

Last Update:2015-03-31 Source: Internet

Author: User

Tags ggplot


R Language Data Analysis series four --by Comaple.zhang

Statistical analysis cannot do without random variables. A random variable is a mathematical model that mathematicians build to better fit real-world data. With such models we can even predict how many users will visit a website over the next few days, the future trend of a stock, and so on. In this section we explore the following common probability distributions, as well as flow control statements.

Common distributions include the normal (Gaussian) distribution, the exponential distribution, the beta distribution, and the gamma distribution.

Normal distribution

If a random variable X follows a normal distribution with mathematical expectation μ and variance σ^2, we write X ~ N(μ, σ^2). The expected value μ determines the location of the probability density curve, and the standard deviation σ determines its spread. Because the curve is bell-shaped, it is often called the bell curve. The standard normal distribution is the normal distribution with μ = 0 and σ = 1.

par(mgp = c(0.6, 0.6, 0))
x <- seq(-5, 5, length.out = 100)
y <- dnorm(x, 0, 1)  # standard normal density
plot(x, y, xlim = c(-4, 4), col = 'red', ylim = c(0, 0.8), type = 'l',
     ylab = 'density', xlab = 'x', main = "The Normal density Distribution")
lines(x, dnorm(x, 0, 2), col = "blue")     # larger sigma: flatter curve
lines(x, dnorm(x, -2, 1), col = "orange")  # mu = -2: curve shifted left
lines(x, dnorm(x, 0, 0.5), col = "green")  # smaller sigma: taller, narrower curve
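As a rough check on the roles of μ and σ described above, the following sketch evaluates the normal density directly from its formula and compares it with dnorm. The helper name normal_pdf is made up for this illustration and is not part of the original article.

```r
# Hypothetical helper (not from the article): the normal density written out as
# f(x) = exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi)).
normal_pdf <- function(x, mu = 0, sigma = 1) {
  exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))
}

x <- seq(-5, 5, length.out = 100)

# The hand-written formula should agree with R's built-in dnorm().
agree <- isTRUE(all.equal(normal_pdf(x, -2, 1), dnorm(x, -2, 1)))
print(agree)

# mu shifts the peak; sigma controls its height and width.
peak_narrow <- dnorm(0, mean = 0, sd = 0.5)  # tall, narrow curve
peak_wide   <- dnorm(0, mean = 0, sd = 2)    # low, wide curve
print(c(peak_narrow, peak_wide))
```

A smaller σ concentrates the probability mass around μ, which is why the green curve in the plot is the tallest.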

Exponential distribution

The lifetimes of many electronic products generally follow an exponential distribution, and the lifetime distributions of some systems can also be approximated by one. It is one of the most commonly used distributions in reliability research. When a product fails only through random (accidental) failures, its lifetime follows an exponential distribution. For example, if a component is known to have already worked for s hours, the conditional probability that it works for a further t hours equals the probability that a new component works for t hours. This is the memoryless property of the exponential distribution, and it is widely applied in reliability research.

x <- seq(-1, 2, length.out = 100)
y <- dexp(x, 0.5)  # rate = 0.5
plot(x, y, col = "red", xlim = c(0, 2), ylim = c(0, 5), type = 'l',
     xaxs = "i", yaxs = "i", ylab = 'density', xlab = 'x',
     main = "The exponential density Distribution")
lines(x, dexp(x, 1), col = "green")
lines(x, dexp(x, 2), col = "blue")
lines(x, dexp(x, 5), col = "orange")
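The memoryless property described above can be checked numerically. The sketch below uses pexp with arbitrary illustrative values for the rate, s, and t; none of these numbers come from the original article.

```r
# Illustrative values only: rate, s and t are arbitrary choices for this sketch.
rate <- 0.5   # failure rate of the exponential lifetime
s <- 3        # hours already survived
t <- 2        # additional hours of interest

# P(T > t): survival probability for a brand-new component.
p_fresh <- pexp(t, rate, lower.tail = FALSE)

# P(T > s + t | T > s) = P(T > s + t) / P(T > s).
p_conditional <- pexp(s + t, rate, lower.tail = FALSE) /
  pexp(s, rate, lower.tail = FALSE)

print(c(p_fresh, p_conditional))
```

Both probabilities should come out equal, whatever value of s is chosen.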

Gamma distribution

Gamma function: $\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\,dt$, for $x > 0$.

The gamma function generalizes the factorial to the real numbers: for a positive integer n, Γ(n) = (n − 1)!.
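The link between the gamma function and the factorial can be seen with R's built-in gamma and factorial functions. This is a small sketch; the range 1:6 is an arbitrary choice.

```r
n <- 1:6

# For positive integers n, gamma(n) equals (n - 1)!.
gamma_values <- gamma(n)
factorial_values <- factorial(n - 1)
print(rbind(gamma_values, factorial_values))

# Unlike the factorial, gamma() also accepts non-integer arguments;
# for example gamma(0.5) equals sqrt(pi).
half <- gamma(0.5)
print(half)
```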

Probability density function of the gamma distribution:
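The formula itself did not survive in this copy of the article. Assuming the shape-rate parameterization, which is how R reads the positional arguments of dgamma(x, shape, rate) used in the code that follows, the density can be written as below. The symbols α (shape) and β (rate) are chosen here for illustration.

```latex
f(x;\alpha,\beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x},
\qquad x > 0,\ \alpha > 0,\ \beta > 0
```

With α = 1 this reduces to the exponential density with rate β.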

x <- seq(0, 10, length.out = 100)
y <- dgamma(x, 1, 2)  # shape = 1, rate = 2
plot(x, y, col = "red", xlim = c(0, 10), ylim = c(0, 2), type = 'l',
     xaxs = "i", yaxs = "i", ylab = 'density', xlab = '',
     main = "The Gamma density Distribution")
lines(x, dgamma(x, 2, 2), col = "green")
lines(x, dgamma(x, 3, 2), col = "blue")
lines(x, dgamma(x, 5, 1), col = "orange")
lines(x, dgamma(x, 9, 1), col = "black")

Beta distribution

An important use of the beta distribution is as the conjugate prior of the Bernoulli and binomial distributions, which has important applications in machine learning and mathematical statistics.

The distribution has two parameters, α and β (α, β > 0).

x <- seq(-5, 5, length.out = 10000)
y <- dbeta(x, 0.5, 0.5)  # the density is zero outside [0, 1]
plot(x, y, col = "red", xlim = c(0, 1), ylim = c(0, 6), type = 'l',
     xaxs = "i", yaxs = "i", ylab = 'density', xlab = '',
     main = "The Beta density Distribution")
lines(x, dbeta(x, 5, 1), col = "green")
lines(x, dbeta(x, 1, 3), col = "blue")
lines(x, dbeta(x, 2, 2), col = "orange")
lines(x, dbeta(x, 2, 5), col = "black")
legend("top", legend = paste("a=", c(.5, 5, 1, 2, 2), "b=", c(.5, 1, 3, 2, 5)),
       lwd = 1, col = c("red", "green", "blue", "orange", "black"))
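To make the conjugate-prior remark concrete, here is a minimal sketch of a beta-binomial update. The prior Beta(2, 2) and the data (7 successes in 10 trials) are assumed values for illustration only, not taken from the article.

```r
# Assumed example values (not from the article): a Beta(2, 2) prior and
# 7 successes observed in 10 Bernoulli trials.
a <- 2; b <- 2
k <- 7; n <- 10

# Conjugacy: the posterior is again a beta distribution, Beta(a + k, b + n - k).
post_a <- a + k
post_b <- b + n - k

# Numerical check on a grid: prior * likelihood, normalised, should match dbeta().
p <- seq(0.001, 0.999, length.out = 999)
unnormalised <- dbeta(p, a, b) * dbinom(k, n, p)
step <- p[2] - p[1]
numeric_posterior <- unnormalised / (sum(unnormalised) * step)
max_diff <- max(abs(numeric_posterior - dbeta(p, post_a, post_b)))
print(c(post_a, post_b, max_diff))
```

Because the posterior stays in the beta family, updating on new data only requires adding the observed successes and failures to the two parameters.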

Flow control statements

Branch statements

if / else:

a <- 5
if (a > 10) {
  print('a>10')
} else if (a < 10) {
  print('a<10')
} else {
  print('a=10')
}

switch branch statement:

case <- 4
switch(case, 'low anomaly', 'low', 'normal', 'high', 'high anomaly')

With case equal to 4, switch returns the fourth alternative:

[1] "high"
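Besides a numeric index, switch also accepts a character selector that is matched against argument names. The sketch below shows that form with a fallback value; the function describe_level and its labels are hypothetical and exist only for this illustration.

```r
# Hypothetical helper for illustration: map a level name to a description.
describe_level <- function(level) {
  switch(level,
         low    = "below the normal range",
         normal = "within the normal range",
         high   = "above the normal range",
         "unknown level")  # unnamed last argument acts as the default
}

print(describe_level("high"))
print(describe_level("other"))

# With a numeric selector, an out-of-range index returns NULL (invisibly).
out_of_range <- switch(6, 'low anomaly', 'low', 'normal', 'high', 'high anomaly')
print(is.null(out_of_range))
```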

for loop:

web.pv <- c(sample(100:5000, 30))  # 30 random daily page-view counts
web.day <- seq(as.Date('2015-01-01'), by = 1, length.out = 30)
web.data <- data.frame(web.day, web.pv)
for (item in web.data$web.pv) {
  # look up the date that belongs to this page-view count and print both
  print(paste(web.data$web.day[which(web.data$web.pv == item)], item))
}
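The loop above finds each date by searching for the matching value. A possible alternative, sketched here under the same variable names, loops over row indices instead, so each count stays paired with its date without a lookup. The seed and the lines_out vector are additions for this illustration.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible
web.pv <- sample(100:5000, 30)
web.day <- seq(as.Date('2015-01-01'), by = 1, length.out = 30)
web.data <- data.frame(web.day, web.pv)

# Loop over row indices so each page-view count stays paired with its date.
lines_out <- character(nrow(web.data))
for (i in seq_len(nrow(web.data))) {
  lines_out[i] <- paste(web.data$web.day[i], web.data$web.pv[i])
}
print(head(lines_out, 3))
```

Using seq_len(nrow(web.data)) also behaves sensibly for an empty data frame, where the loop body simply never runs.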

while loop:

i <- 1  # R vectors are 1-indexed
while (i <= length(web.pv)) {
  print(web.pv[i])
  i <- i + 1
}

Functions

Define a function for the expression y = a * x + b, and then plot its graph:

library(ggplot2)
demo.fun1 <- function(x, a, b) {
  return(a * x + b)
}
a = 3
b = 7
x <- seq(-5, 5, by = 0.01)  # grid of x values
y <- demo.fun1(x, a, b)
df <- data.frame(x, y)
g <- ggplot(df, aes(x, y))
g <- g + geom_line(col = 'red')  # linear equation curve
g <- g + geom_hline(yintercept = 0) + geom_vline(xintercept = 0)  # draw the coordinate axes
g <- g + ggtitle(paste('y=', a, '* x+', b))  # add title
g

Define a cubic polynomial function:

library(ggplot2)
demo.fun3 <- function(x, a, b, c, d) {
  return(a * x^3 + b * x^2 + c * x + d)
}
a = 1
b = 5
c = 6
d = -10
x <- seq(-5, 5, by = 0.01)
y <- demo.fun3(x, a, b, c, d)
df <- data.frame(x, y)
g <- ggplot(df, aes(x, y))
g <- g + geom_line(col = 'green')  # cubic curve
g <- g + geom_hline(yintercept = 0) + geom_vline(xintercept = 0)  # draw the coordinate axes
g <- g + ggtitle(paste('y=', a, '*x^3 +', b, '*x^2 +', c, '* x +', d))  # add title
g
