The R language has a great many details to learn. The only way through is to understand and apply them one at a time; I believe that steady accumulation eventually leads to proficiency. The following are my notes from studying "Statistical Modeling and R Software".
1, The data frame is a data structure in the R language. It can hold several data types at once: each column is a variable, and each row is an observation record. The data frame is one of the most commonly used data structures in R, and it is a special kind of list object.
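To make the "special kind of list" point concrete, here is a minimal sketch. The data frame df and its values are made up for illustration and are not taken from the book; the checks only use base R functions.

```r
# Hypothetical example data; 'df' and its values are illustrative only.
df <- data.frame(
  name   = c("a", "b", "c"),
  height = c(167.5, 156.3, 177.3),
  stringsAsFactors = FALSE
)

is.list(df)        # a data frame is stored as a list of columns
length(df)         # number of list elements, i.e. columns
nrow(df)           # number of observation records, i.e. rows
sapply(df, class)  # each column keeps its own data type
```

Because each column is a list element, columns of different types can sit side by side, which a matrix cannot do.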
2. Initialize a data frame
> mydataframe=data.frame(
+ name=c("Zhang San", "John Doe", "Harry", "Zhao Liu", "Ding"),
+ sex=c("f", "f", "m", "m", "m"),
+ age=c(16, 17, 18, 16, 19),
+ height=c(167.5, 156.3, 177.3, 167.5, 170.0),
+ weight=c(55.0, 60.0, 63.0, 53.0, 69.5)
+ );
> mydataframe
       name sex age height weight
1 Zhang San   f  16  167.5   55.0
2  John Doe   f  17  156.3   60.0
3     Harry   m  18  177.3   63.0
4  Zhao Liu   m  16  167.5   53.0
5      Ding   m  19  170.0   69.5
3, List data can be converted into a data frame
> mylist<-list(
+ name=c("Zhang San", "John Doe", "Harry", "Zhao Liu", "Ding"),
+ sex=c("f", "f", "m", "m", "m"),
+ age=c(16, 17, 18, 16, 19),
+ height=c(167.5, 156.3, 177.3, 167.5, 170.0),
+ weight=c(55.0, 60.0, 63.0, 53.0, 69.5)
+ );
> mylist
$name
[1] "Zhang San" "John Doe"  "Harry"     "Zhao Liu"  "Ding"
$sex
[1] "f" "f" "m" "m" "m"
$age
[1] 16 17 18 16 19
$height
[1] 167.5 156.3 177.3 167.5 170.0
$weight
[1] 55.0 60.0 63.0 53.0 69.5
> mylist=as.data.frame(mylist)
> mylist
       name sex age height weight
1 Zhang San   f  16  167.5   55.0
2  John Doe   f  17  156.3   60.0
3     Harry   m  18  177.3   63.0
4  Zhao Liu   m  16  167.5   53.0
5      Ding   m  19  170.0   69.5
4, A matrix can be converted into a data frame. If the matrix already has column names, they become the variable names of the data frame; if it has none, R automatically assigns a variable name to each column: V1, V2, V3, ...
> x=array(1:12,c(3,4))
> x
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> x=as.data.frame(x)
> x
  V1 V2 V3 V4
1  1  4  7 10
2  2  5  8 11
3  3  6  9 12
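The transcript above only covers a matrix without column names. As a minimal sketch of the other case, the snippet below gives a matrix hypothetical column names (a to d, chosen purely for illustration) and compares the resulting variable names with those of an unnamed matrix.

```r
# Hypothetical matrix with column names 'a'..'d' (names chosen for illustration).
m <- matrix(1:12, nrow = 3, ncol = 4)
colnames(m) <- c("a", "b", "c", "d")

named_df <- as.data.frame(m)
names(named_df)    # the matrix column names are kept as variable names

# Without column names, R falls back to V1, V2, ...
unnamed_df <- as.data.frame(matrix(1:12, nrow = 3, ncol = 4))
names(unnamed_df)
```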
5. Referencing elements of a data frame
(1) Referencing by subscript
> mydataframe[1:4,3:5]
  age height weight
1  16  167.5     55
2  17  156.3     60
3  18  177.3     63
4  16  167.5     53
Description: this shows columns 3 through 5 for rows 1 through 4.
(2) Referencing by name
> mydataframe[["weight"]]
[1] 55.0 60.0 63.0 53.0 69.5
> mydataframe[["height"]]
[1] 167.5 156.3 177.3 167.5 170.0
> mydataframe$height
[1] 167.5 156.3 177.3 167.5 170.0
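The three reference forms differ in what they return. The sketch below is a hedged illustration on a small hypothetical data frame df that reuses only the height and weight values from these notes: single brackets keep the data frame structure, while double brackets and $ extract the column itself as a vector.

```r
# Hypothetical data frame reusing the height and weight values from the notes.
df <- data.frame(
  height = c(167.5, 156.3, 177.3, 167.5, 170.0),
  weight = c(55.0, 60.0, 63.0, 53.0, 69.5)
)

one_col_df <- df["height"]    # single brackets: still a data frame
vec_a      <- df[["height"]]  # double brackets: the column as a vector
vec_b      <- df$height       # $ extracts the same column by name

class(one_col_df)
class(vec_a)
identical(vec_a, vec_b)
```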
(3) The names(), rownames() and colnames() functions of a data frame
> names(mydataframe)
[1] "name"   "sex"    "age"    "height" "weight"
> mydataframe
       name sex age height weight
1 Zhang San   f  16  167.5   55.0
2  John Doe   f  17  156.3   60.0
3     Harry   m  18  177.3   63.0
4  Zhao Liu   m  16  167.5   53.0
5      Ding   m  19  170.0   69.5
> rownames(mydataframe)=c("first row", "second row", "third row", "fourth row", "fifth row")
> mydataframe
                name sex age height weight
first row  Zhang San   f  16  167.5   55.0
second row  John Doe   f  17  156.3   60.0
third row      Harry   m  18  177.3   63.0
fourth row  Zhao Liu   m  16  167.5   53.0
fifth row       Ding   m  19  170.0   69.5
> colnames(mydataframe)=c("first column", "second column", "third column", "fourth column", "fifth column")
> mydataframe
           first column second column third column fourth column fifth column
first row     Zhang San             f           16         167.5         55.0
second row     John Doe             f           17         156.3         60.0
third row         Harry             m           18         177.3         63.0
fourth row     Zhao Liu             m           16         167.5         53.0
fifth row          Ding             m           19         170.0         69.5
6, The attach function. The main purpose of a data frame is to hold the data for statistical modeling: R's statistical modeling functions expect a data frame as input, and in many respects a data frame can be treated like a matrix. To use a variable stored in a data frame you can write dataframe$variable, but repeating this is cumbersome. R provides the attach() function, which places the data frame on the search path so that its variables can be called directly by name.
(1) Using the attach() function to make the variables of a data frame available by name
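> attach(mydataframe)
> r=height/weight
Error: object 'height' not found
> r='fourth column'/'fifth column'
Error in "fourth column"/"fifth column" : non-numeric argument to binary operator
> mydataframe=mylist
> attach(mydataframe)
> r=height/weight
> r
[1] 3.045455 2.605000 2.814286 3.160377 2.446043
Description: the columns were renamed in (3) of section 5, so height and weight no longer exist, and the quoted names are only character strings, which cannot be divided. Restoring the original data frame from mylist and attaching it again makes height and weight available.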
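The second error above comes from quoting: quotes create character strings rather than referring to variables. As a minimal sketch, assuming a hypothetical data frame df whose columns carry the same space-containing names, backticks refer to the attached variables, and detach() removes the data frame from the search path afterwards.

```r
# Hypothetical data frame whose column names contain spaces,
# mirroring the renamed columns in the notes.
df <- data.frame(
  height = c(167.5, 156.3, 177.3, 167.5, 170.0),
  weight = c(55.0, 60.0, 63.0, 53.0, 69.5)
)
colnames(df) <- c("fourth column", "fifth column")

attach(df)
# Backticks refer to the variable; quotes would only create character strings.
r <- `fourth column` / `fifth column`
detach(df)   # remove df from the search path again

r
```

Detaching once the work is done keeps the search path from filling up with old copies of the data.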
(2) Add a new variable to the data frame
> mydataframe
       name sex age height weight
1 Zhang San   f  16  167.5   55.0
2  John Doe   f  17  156.3   60.0
3     Harry   m  18  177.3   63.0
4  Zhao Liu   m  16  167.5   53.0
5      Ding   m  19  170.0   69.5
> mydataframe$myr=height/weight
> mydataframe
       name sex age height weight      myr
1 Zhang San   f  16  167.5   55.0 3.045455
2  John Doe   f  17  156.3   60.0 2.605000
3     Harry   m  18  177.3   63.0 2.814286
4  Zhao Liu   m  16  167.5   53.0 3.160377
5      Ding   m  19  170.0   69.5 2.446043
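One detail to keep in mind: attach() works on a copy of the data frame, so a column added afterwards is stored in the data frame but is not visible through the attached copy. The sketch below illustrates this on a hypothetical data frame df that reuses the height and weight values from these notes; the helper variable attached_names is introduced only for the demonstration.

```r
# Hypothetical data frame reusing the height and weight values from the notes.
df <- data.frame(
  height = c(167.5, 156.3, 177.3, 167.5, 170.0),
  weight = c(55.0, 60.0, 63.0, 53.0, 69.5)
)

attach(df)
df$myr <- height / weight   # height and weight are found via the search path

# List what the attached copy contains: the new column is not among them.
attached_names <- ls(pos = match("df", search()))
detach(df)

attached_names
names(df)
```

To use the new column by bare name as well, the data frame would have to be detached and attached again.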