R Language Learning Notes

Source: Internet
Author: User

http://www.cnblogs.com/wentingtu/archive/2012/03/03/2377965.html

Summary of the R language resources available on the Internet

http://www.douban.com/note/262946592/?type=like

R language: machine learning packages

http://blog.sciencenet.cn/blog-634847-497887.html

[Repost] R basics: the R language and its learning resources

http://www.biosino.org/R/R-doc/onepage/R-data_cn.html#Importing-from-other-statistical-systems

Import and export of R data

http://www.biostatistic.net/thread-40320-1-1.html

read.dbf() (foreign package)

ls() lists the variables in the current workspace.

help(functionname), ?functionname, args(functionname), and example(functionname) show a function's documentation, arguments, and examples.

Use object.size() to see how much memory a variable occupies.

memory.size() reports the memory currently used by the R session.

memory.limit() reports the maximum memory the session is allowed to use. If the current limit is too low, raise it with memory.limit(newlimit). Both functions are Windows-only, and since R 4.2.0 they are stubs that no longer report or change anything.

Note that 32-bit R cannot use more than 4 GB in a single process; if you hit that cap, consider a 64-bit build. Get into the habit of cleaning up large intermediate variables that are no longer needed: rm(object) deletes a variable, and a following gc() runs garbage collection immediately. R also collects garbage automatically when it needs memory, but calling gc() right after removing a large object may prompt R to return the memory to the operating system sooner.
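The clean-up habit described above can be sketched as follows. This is a minimal illustration, not a benchmark: the object names big and small are made up for the example, and it deliberately avoids the Windows-only memory functions.

```r
# Hypothetical objects: one large vector and one small one.
big <- numeric(1e6)      # one million doubles, roughly 8 MB
small <- 1:10

ls()                                    # names of the objects in the workspace
print(object.size(big), units = "MB")   # memory used by a single object

# Size in bytes of every object in the workspace, largest first.
sizes <- sapply(ls(), function(nm) object.size(get(nm)))
sort(sizes, decreasing = TRUE)

rm(big)            # delete the large object
invisible(gc())    # trigger a garbage collection right away
exists("big")      # FALSE: the variable is gone
```

Sorting the sizes first makes it easy to see which intermediate objects are worth removing.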

Reading hundreds of DBF files with read.dbf() from the foreign package did not work.

http://blog.sina.com.cn/s/blog_62b37bfe0101f4h0.html

Accessing a column of data

To remove rows or columns, use subscripts or the subset() function; both work on vectors, matrices, and data frames. Subscript deletion usually drops whole rows or whole columns. Taking a data frame as the example, a single negative index with no comma drops columns.

> x <- data.frame(matrix(1:30, nrow=5, byrow=TRUE))

> new.x1 <- x[-c(1,4)]     # remove the first and fourth columns

> new.x1 <- x[-c(1,4),]    # remove the first and fourth rows

> new.x1 <- x[,-c(1,4)]    # remove the first and fourth columns

The subset() function is a more flexible way to select data: it returns the subset of a vector, matrix, or data frame that meets a condition.

The three forms of subset():

subset(x, subset, ...)

subset(x, subset, select, drop = FALSE, ...)    # for matrices

subset(x, subset, select, drop = FALSE, ...)    # for data frames

x is the object; subset is a logical expression indicating the elements or rows to keep, with missing values treated as FALSE.

select is an expression indicating which columns of x to keep.

> x <- data.frame(matrix(1:30, nrow=5, byrow=TRUE))

> rownames(x) <- c("one", "two", "three", "four", "five")

> colnames(x) <- c("a", "b", "c", "d", "e", "f")

> x

> new <- subset(x, a>=14, select=a:f)

> new    # the rows where a >= 14, columns a to f
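To make the difference between the indexing forms concrete, the sketch below rebuilds the same 5 x 6 data frame and checks the shape of each result. The narrower selection a:c is chosen only for illustration.

```r
x <- data.frame(matrix(1:30, nrow = 5, byrow = TRUE))
colnames(x) <- c("a", "b", "c", "d", "e", "f")

dim(x[-c(1, 4)])     # 5 4: a single index drops columns
dim(x[-c(1, 4), ])   # 3 6: an index before the comma drops rows
dim(x[, -c(1, 4)])   # 5 4: an index after the comma drops columns

# Rows where a >= 14, keeping only columns a to c.
new <- subset(x, a >= 14, select = a:c)
new
```

Column a holds 1, 7, 13, 19, 25, so only the last two rows satisfy a >= 14.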

http://my.oschina.net/zarger/blog/102818

Accessing a column of data

1.1 The str() function shows the properties of each variable in a data frame:

> str(Squid)

1.2 The data argument of a function: the best way to access variables in a data frame

> M1 <- lm(GSI ~ factor(Location) + factor(Year), data = Squid)

1.3 The $ symbol: another way to access a variable

> Squid$GSI

or

> Squid[,6]

1.4 The attach() function adds a data frame to R's search path, after which GSI can be viewed directly by typing GSI

> attach(Squid)
> GSI

Filtering data

data2 <- data[data$v6 == 2, ]

y <- x[(x$timeindex > ) & (x$timeindex < 5000), ]

x3 <- x2[x2$date < as.Date("2015/1/5"), ]

x2 <- x1[order(x1[, 3]), ]
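The filtering and sorting patterns above can be tried on a small made-up data frame. Everything in this sketch is hypothetical: the column names, the values, and in particular the lower bound of 1000, which is chosen only so the range filter has something to do.

```r
# Hypothetical data for illustration only.
x <- data.frame(
  timeindex = c(100, 2500, 4800, 6000),
  date = as.Date(c("2015-01-03", "2015-01-07", "2015-01-02", "2015-01-04")),
  value = c(3.2, 1.5, 2.7, 0.9)
)

# Keep rows whose timeindex lies strictly between two bounds.
in_range <- x[x$timeindex > 1000 & x$timeindex < 5000, ]

# Keep rows dated before 5 January 2015.
early <- x[x$date < as.Date("2015/1/5"), ]

# Sort all rows by the third column (value), ascending.
by_value <- x[order(x[, 3]), ]
```

The trailing comma inside the brackets matters: it tells R that the logical vector selects rows and that all columns are kept.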

Data filtering in R

http://blog.163.com/xiaoji0106@126/blog/static/136134661201392532614464/

Referencing data frame columns:
d[[n]], d[,n], d[["name"]], d[,"name"], and d$name return one column as a vector.
d[n] and d["name"] return a data frame with one column.
d[c(m,n,...)], d[,c(m,n,...)], and d[,c("name1","name2",...)] return a data frame with several columns.

Other tips: a negative index excludes columns.
You can use grep() to search the variable names. For example, mydata[grep("^q", names(mydata))] selects the columns whose names begin with "q".

Referencing data frame rows:

d[n,] returns a data frame consisting of one row.
d[c(m,n,...),] returns a data frame consisting of several rows.
head() returns a data frame with the first 6 rows.
tail() returns a data frame with the last 6 rows.
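A short sketch of the indexing rules listed above, using a made-up data frame d with columns id, q1, and q2 (all names are hypothetical). Checking class() shows which forms return a vector and which return a data frame.

```r
d <- data.frame(id = 1:8, q1 = 11:18, q2 = 21:28)

class(d[[2]])       # "integer": one column as a vector
class(d[, "q1"])    # "integer": one column as a vector
class(d$q1)         # "integer": one column as a vector
class(d["q1"])      # "data.frame": a one-column data frame

names(d[-1])                      # negative index drops the first column
q_cols <- d[grep("^q", names(d))] # columns whose names start with "q"

rows <- d[c(1, 3), ]   # rows 1 and 3 as a data frame
nrow(head(d))          # 6
nrow(tail(d))          # 6
```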

http://www.cnblogs.com/youxilua/archive/2012/01/12/2320455.html More advanced

http://www.360doc.com/content/13/1221/22/7440765_339121786.shtml Basic to advanced

http://blog.sina.com.cn/s/blog_5de124240101q5vw.html R graphics, step by step

http://blog.sina.com.cn/s/blog_6cfc336b01018wcg.html

http://www.cnblogs.com/holbrook/archive/2013/05/13/3075777.html

Drawing

Output Data

(1) R for Beginners (Chinese version).pdf

write.table(x1, file="E:\\siemens\\trafficdata\\result().csv", append=FALSE, quote=FALSE, sep=",", eol="\n", na="NULL", dec=".", row.names=FALSE, col.names=TRUE)

(2)

cat()
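Both output functions can be tried without touching a real path by writing to a temporary file. In this sketch the data frame x1 and its columns are invented for illustration; the write.table() arguments mirror the call above.

```r
# Hypothetical data with one missing value.
x1 <- data.frame(id = 1:3, speed = c(42.5, NA, 38.1))
out <- tempfile(fileext = ".csv")

write.table(x1, file = out, append = FALSE, quote = FALSE, sep = ",",
            eol = "\n", na = "NULL", dec = ".",
            row.names = FALSE, col.names = TRUE)

written <- readLines(out)
written   # "id,speed" "1,42.5" "2,NULL" "3,38.1"

# cat() writes free-form text; append = TRUE adds it to the same file.
cat("rows written:", nrow(x1), "\n", file = out, append = TRUE)
```

Note how na = "NULL" controls the text written for the missing value.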

http://blog.sina.com.cn/s/blog_5de124240101pwyv.html

Plot parameters

http://blog.sina.com.cn/s/blog_6a02b6330101abn5.html

Plot parameters: several plots on one page

par(mfrow=c(2,3)) arranges the plots on one page in 2 rows and 3 columns. mfrow fills the grid by row, mfcol by column.
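A minimal sketch of a 2 x 3 grid of plots. The data are random and the panel titles are made up. The pdf(NULL) line opens a device that writes no file, so the example runs anywhere; drop it (and the dev.off() call) to see the plots in an interactive session.

```r
pdf(NULL)                        # graphics device that produces no file
old <- par(mfrow = c(2, 3))      # 2 rows, 3 columns, filled row by row
for (i in 1:6) {
  plot(rnorm(20), main = paste("Panel", i))
}
grid_shape <- par("mfrow")       # the layout currently in effect
par(old)                         # restore the previous settings
invisible(dev.off())
```

Saving the value returned by par() and restoring it afterwards keeps later plots from inheriting the grid.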

http://www.biostatistic.net/thread-94936-1-1.html

R graphics: iplots, an interactive graphics package

http://www.cnblogs.com/speeding/p/4060500.html

# date conversion

fitbit$date <- as.Date(fitbit$date, "%Y年%m月%d日")

strptime("", format)

typeof(Sys.Date())
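The date functions above can be checked on literal strings. The sample dates and format strings below are made up for illustration; the point is that the format argument describes how the input text is laid out.

```r
# Two spellings of the same day, parsed with matching format strings.
d1 <- as.Date("2014-10-05", format = "%Y-%m-%d")
d2 <- as.Date("05/10/2014", format = "%d/%m/%Y")
d1 == d2             # TRUE

# strptime() also parses the time of day and returns a POSIXlt object.
stamp <- strptime("2014-10-05 08:30:00",
                  format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
class(stamp)         # "POSIXlt" "POSIXt"
format(stamp, "%H:%M")

typeof(Sys.Date())   # "double": dates are stored as day counts
class(Sys.Date())    # "Date"
```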

http://www.biostatistic.net/thread-7035-1-1.html

R language: as.POSIXlt() help document (Chinese and English), date-time conversion functions

http://www.cnblogs.com/speeding/p/4159264.html

Fixing an rJava package that fails to load in R

Run one of the following commands in R, using the path of your own Java installation:

Sys.setenv(JAVA_HOME='C:/Program Files (x86)/java/jdk1.7.0_55/jre')

Sys.setenv(JAVA_HOME='D:/programfiles/java/jdk1.7.0_40/jre')

or add the command to the profile R\R-3.1.2\etc\Rprofile.site

32-bit R requires 32-bit Java.

http://blog.fens.me/r-rjava-java/

http://www.haodaima.net/art/2522754

Calling R from Java

1. Install rJava with install.packages("rJava"). The installed rJava/jri/ directory then contains 3 JRI files.

2. System environment variables: add the following to PATH

...\library\rJava\jri

...R\win-library\3.1\rJava\jri\x64

...\R\R-3.1.2\bin\x64

3. Import the 3 JRI files into the Java project.

4. In the Eclipse run configuration, add the VM argument: -Djava.library.path="C:\Users\zhangjiajie\Documents\R\win-library\3.1\rJava\jri\x64"

rJava configuration for Tomcat

1) Put the three jar files in the lib directory;

2) Put \rJava\jri\i386\jri.dll in the \tomcat8.0\bin\ directory.

Fitting

Linear regression

m <- lm(y ~ x1 + x2 + x3, data = dataframe)   # m stores the fitted regression model

anova(m)       # analysis of variance table

coef(m)        # model coefficients

confint(m)     # confidence intervals of the regression coefficients

deviance(m)    # residual sum of squares

effects(m)     # vector of orthogonal effects

fitted(m)      # vector of fitted y values

residuals(m)   # model residuals

resid(m)       # model residuals (alias of residuals)

summary(m)     # R-squared, F statistic, and other summary statistics

vcov(m)        # variance-covariance matrix of the coefficient estimates
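The extractor functions can be tried on simulated data. In the sketch below the data frame, its coefficients, and the noise level are all invented for illustration; x3 has no real effect on y.

```r
set.seed(1)
dataframe <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))
dataframe$y <- 1 + 2 * dataframe$x1 - dataframe$x2 + rnorm(50, sd = 0.5)

m <- lm(y ~ x1 + x2 + x3, data = dataframe)

coef(m)                 # intercept plus one coefficient per predictor
confint(m)              # 95% confidence intervals by default
summary(m)$r.squared    # R-squared taken from the summary object
dim(vcov(m))            # 4 4

# deviance() is the residual sum of squares.
all.equal(deviance(m), sum(residuals(m)^2))

# Fitted values plus residuals reproduce the observed response.
all.equal(unname(fitted(m) + resid(m)), dataframe$y)
```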

Linear regression with interaction terms

lm(y ~ x1*x2) fits y = a*x1 + b*x2 + c*x1*x2 + d

lm(y ~ x1*x2*x3) fits

y = a*x1 + b*x2 + c*x3 + d*x1*x2 + e*x1*x3 + f*x2*x3 + g*x1*x2*x3 + h

lm(y ~ x1 + x2 + x3 + x1:x2:x3) fits

y = a*x1 + b*x2 + c*x3 + d*x1*x2*x3 + e
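One way to see which terms a formula expands to is to inspect the column names of its design matrix. The small data frame here is made up; only the formulas matter.

```r
d <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(2, 1, 4, 3), x3 = c(1, 3, 2, 4))

two_way   <- colnames(model.matrix(~ x1 * x2, d))
three_way <- colnames(model.matrix(~ x1 * x2 * x3, d))
only_3way <- colnames(model.matrix(~ x1 + x2 + x3 + x1:x2:x3, d))

two_way     # "(Intercept)" "x1" "x2" "x1:x2"
three_way   # adds "x3", "x1:x3", "x2:x3" and "x1:x2:x3"
only_3way   # "(Intercept)" "x1" "x2" "x3" "x1:x2:x3"
```

The * operator expands to all main effects and all interactions, while : adds just the one named interaction.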

step() performs stepwise regression: backward elimination removes variables that do not contribute, and forward selection adds new variables.

lm(y ~ x1, subset = 1:100) uses only the first 100 observations for the regression

lm(y ~ I(x1 + x2)) regresses y on the sum (x1 + x2)

lm(y ~ poly(x, 3, raw = TRUE)) fits a cubic polynomial regression of y on x

lm(log(y) ~ x1) regresses log(y) on x1

system.time(r_expression)   # running time of an R expression
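The formula variants above, sketched on simulated data. The data frame d, the pure-noise predictor, and the true coefficients are assumptions made for the example. Whether step() drops a given variable depends on the data, so no particular outcome is claimed for the noise column.

```r
set.seed(42)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200), noise = rnorm(200))
d$y <- 3 + 2 * d$x1 + 2 * d$x2 + rnorm(200)

# Backward stepwise selection starting from the full model.
full <- lm(y ~ x1 + x2 + noise, data = d)
reduced <- step(full, direction = "backward", trace = 0)
formula(reduced)

# I() protects the sum, so a single coefficient is fitted for x1 + x2.
sum_fit <- lm(y ~ I(x1 + x2), data = d)
names(coef(sum_fit))      # "(Intercept)" "I(x1 + x2)"

# Cubic polynomial: intercept plus three powers of x1.
cubic <- lm(y ~ poly(x1, 3, raw = TRUE), data = d)
length(coef(cubic))       # 4

# Fit on the first 100 rows only.
first100 <- lm(y ~ x1, data = d, subset = 1:100)
nobs(first100)            # 100

system.time(lm(y ~ x1 + x2, data = d))   # user, system, and elapsed time
```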

Calculating correlation coefficients

http://www.biostatistic.net/thread-7287-1-1.html by

by(data, INDICES, FUN, ..., simplify = TRUE)

http://www.biostatistic.net/thread-7388-1-1.html cor

cor(data)

by(data, data$x, cor)
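As an illustration, the built-in iris data set (chosen here only because it ships with R) has four numeric columns and one grouping factor, which fits the by(data, data$x, cor) pattern.

```r
# Correlation matrix of the four numeric columns over all rows.
overall <- cor(iris[, 1:4])
dim(overall)                      # 4 4

# The same correlation matrix computed separately for each species.
by_species <- by(iris[, 1:4], iris$Species, cor)
names(by_species)                 # "setosa" "versicolor" "virginica"
by_species[["setosa"]]            # 4 x 4 matrix for one group
```

Only the numeric columns are passed in, because cor() stops with an error on a factor column.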

http://www.klshu.com/tag/r%E8%AF%AD%E8%A8%80/page/2

Happy Tree: R language

http://www.klshu.com/1798.html

http://www.klshu.com/1077.html Decision tree classification with R

http://www.klshu.com/1555.html

ggplot

http://www.klshu.com/1591.html Comparison of plot and ggplot2 graphics

http://www.klshu.com/1667.html

SVM: pattern recognition, classification, regression analysis

http://www.klshu.com/1430.html

The apply family of functions

http://www.klshu.com/1202.html

http://www.klshu.com/1175.html

http://www.klshu.com/1188.html

http://www.klshu.com/1107.html

Association rules

http://www.klshu.com/1185.html

Common functions

http://www.klshu.com/1144.html

http://www.klshu.com/1147.html Quickly building interactive web applications in R with the shiny package

http://www.klshu.com/1073.html

A collection of data mining functions

http://www.klshu.com/1071.html Finding the number of clusters for K-means

http://www.klshu.com/238.html

http://www.klshu.com/693.html

http://www.klshu.com/719.html

http://www.klshu.com/107.html Basics

Drawing

http://www.klshu.com/25.html

Three-dimensional drawing

http://www.klshu.com/21.html

R code style guide

http://www.mamicode.com/info-detail-374357.html

R Language and Data Analysis VI: a brief introduction to time series

http://www.empowerstats.com/cn/download.html

Download Easy Empower Statistics

http://yanping.me/shiny-tutorial/

Chinese tutorial: Building Shiny applications with R

http://book.51cto.com/art/201408/449427.htm

xts

http://blog.csdn.net/desilting/article/details/39013825

ARIMA(p,d,q) model, ACF, PACF

http://www.biostatistic.net/thread-6683-1-1.html

AIC (Akaike information criterion)

BIC (Bayesian information criterion)

http://www.biostatistic.net/thread-40266-1-1.html

forecast package: auto.arima() returns the best ARIMA model according to the AIC, AICc, or BIC value

diff(x, lag=n)   # lagged differences; lag sets how many positions apart the subtracted elements are (default 1)

x <- c(1,5,23,29)

diff(x, lag=n) is equivalent to c(x[1+n]-x[1], x[2+n]-x[2], x[3+n]-x[3], ...)
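Worked through on the vector above, with the hand-written index form shown next to diff() for comparison (the variable names are chosen for this example):

```r
x <- c(1, 5, 23, 29)

lag1 <- diff(x)             # 4 18 6: each element minus its predecessor
lag2 <- diff(x, lag = 2)    # 22 24: each element minus the one two places back

# The same lag-2 result written out with explicit indexing.
n <- 2
manual <- x[(1 + n):length(x)] - x[1:(length(x) - n)]
identical(lag2, manual)     # TRUE
```

A lag of n shortens the result by n elements, which is why lag2 has only two values.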

http://blog.163.com/zzz216@yeah/blog/static/16255468420147179438149/

dplyr data processing package

http://ju.outofmemory.cn/entry/84555

Processing time data with the lubridate package

Regression

http://www.cnblogs.com/luosha/archive/2012/06/30/2571542.html

Linear regression with lm() and prediction

http://www.douban.com/note/298285612/

Analysis of lm() results

http://www.biostatistic.net/thread-7433-1-1.html

Linear regression with lm()

summary() displays the results returned by lm().

Residuals: summary information about the residuals, namely the minimum, the maximum, and the quartiles. Coefficients: the most important part, the estimates of the intercept c and the slope b. Estimate holds the estimated values of b and c, and Std. Error is the standard error of each estimate, sd(b) and sd(c). The remaining two columns are hypothesis tests on the regression parameters: t value is the t statistic for b and c, and Pr(>|t|) is the p-value, which is compared with the significance level to decide whether to reject the null hypothesis.

Finally, the three asterisks (***) next to a coefficient indicate that x is significant at the 0.001 level.
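The pieces of the summary described above can also be pulled out programmatically. The sketch below uses simulated data with a made-up intercept of 2 and slope of 0.5, so the slope is strongly significant by construction.

```r
set.seed(7)
x <- 1:30
y <- 2 + 0.5 * x + rnorm(30)

fit <- lm(y ~ x)
s <- summary(fit)

quantile(residuals(fit))   # minimum, quartiles, and maximum of the residuals

tab <- coef(s)             # the Coefficients table as a matrix
colnames(tab)              # "Estimate" "Std. Error" "t value" "Pr(>|t|)"

# The t value is the estimate divided by its standard error.
all.equal(tab[, "t value"], tab[, "Estimate"] / tab[, "Std. Error"])

# A p-value below 0.001 is what summary() prints as three asterisks.
tab["x", "Pr(>|t|)"] < 0.001
```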
