Using data frames (data.frame) in R

Source: Internet
Author: User

R has so many features that the only way through is to learn and apply them one at a time; I trust that the accumulated practice will eventually add up to proficiency. Below are my notes from studying "Statistical Modeling and R Software".
1. A data frame is one of R's data structures. Its columns can hold different data types; each column is a variable and each row is an observation record. Data frames are used very often in R, and a data frame is a special kind of list object.
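
To make the "special kind of list" point concrete, here is a minimal sketch. The data frame `df` and its values are made up for illustration and do not come from the notes; `stringsAsFactors = FALSE` is set explicitly so the character column behaves the same across R versions.

```r
# Hypothetical example data, not from the original notes
df <- data.frame(
  id = 1:3,
  label = c("a", "b", "c"),
  score = c(1.5, 2.5, 3.5),
  stringsAsFactors = FALSE
)

is.data.frame(df)   # TRUE
is.list(df)         # TRUE: a data frame is a list of equal-length columns
length(df)          # 3: the list length is the number of columns
nrow(df)            # 3: the number of observation records

# Each column (variable) keeps its own type
col_classes <- sapply(df, class)
print(col_classes)
```

Because each column is a separate list element, one data frame can mix integer, character and numeric variables, which a matrix cannot do.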
2. Initializing a data frame

> mydataframe = data.frame(
+   name = c("Zhang San", "John Doe", "Harry", "Zhao Liu", "ding"),
+   sex = c("F", "F", "M", "M", "M"),
+   age = c(16, 17, 18, 16, 19),
+   height = c(167.5, 156.3, 177.3, 167.5, 170.0),
+   weight = c(55.0, 60.0, 63.0, 53.0, 69.5)
+ )
> mydataframe
       name sex age height weight
1 Zhang San   F  16  167.5   55.0
2  John Doe   F  17  156.3   60.0
3     Harry   M  18  177.3   63.0
4  Zhao Liu   M  16  167.5   53.0
5      ding   M  19  170.0   69.5


3. A list can be converted into a data frame

> mylist <- list(
+   name = c("Zhang San", "John Doe", "Harry", "Zhao Liu", "ding"),
+   sex = c("F", "F", "M", "M", "M"),
+   age = c(16, 17, 18, 16, 19),
+   height = c(167.5, 156.3, 177.3, 167.5, 170.0),
+   weight = c(55.0, 60.0, 63.0, 53.0, 69.5)
+ )
> mylist
$name
[1] "Zhang San" "John Doe"  "Harry"     "Zhao Liu"  "ding"

$sex
[1] "F" "F" "M" "M" "M"

$age
[1] 16 17 18 16 19

$height
[1] 167.5 156.3 177.3 167.5 170.0

$weight
[1] 55.0 60.0 63.0 53.0 69.5

> mylist = as.data.frame(mylist)
> mylist
       name sex age height weight
1 Zhang San   F  16  167.5   55.0
2  John Doe   F  17  156.3   60.0
3     Harry   M  18  177.3   63.0
4  Zhao Liu   M  16  167.5   53.0
5      ding   M  19  170.0   69.5
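
The conversion works here because every component of the list has the same length. The sketch below uses two made-up lists (`ok_list` and `bad_list` are hypothetical names) to show what happens when the lengths cannot be reconciled; the error is captured with `tryCatch()` so the script keeps running.

```r
# Hypothetical lists for illustration, not from the original notes
ok_list  <- list(x = 1:3, y = c("a", "b", "c"))
bad_list <- list(x = 1:3, y = c("a", "b"))   # lengths 3 and 2 do not match

ok_df <- as.data.frame(ok_list)
dim(ok_df)   # 3 rows, 2 columns

# The conversion fails when the components have incompatible lengths;
# capture the error message instead of stopping the script
result <- tryCatch(
  as.data.frame(bad_list),
  error = function(e) conditionMessage(e)
)
print(result)
```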

4. A matrix can be converted into a data frame. If the matrix has column names, they become the variable names of the data frame; if it has none, R automatically names the columns V1, V2, V3, ...

> x = array(1:12, c(3, 4))
> x
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> x = as.data.frame(x)
> x
  V1 V2 V3 V4
1  1  4  7 10
2  2  5  8 11
3  3  6  9 12
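
The transcript above only covers the case without column names. As a small sketch of the other case, the snippet below gives a matrix column names before converting it; the matrix `m` and the names `a`, `b`, `c` are made up for illustration.

```r
# A matrix without column names gets V1, V2, ... after conversion
m <- matrix(1:6, nrow = 2)
default_names <- names(as.data.frame(m))
print(default_names)   # "V1" "V2" "V3"

# With column names set (hypothetical names), they are kept as variable names
colnames(m) <- c("a", "b", "c")
named_df <- as.data.frame(m)
print(names(named_df))   # "a" "b" "c"
```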


5. Referencing a data frame
(1) Referencing by subscript

> mydataframe[1:4, 3:5]
  age height weight
1  16  167.5     55
2  17  156.3     60
3  18  177.3     63
4  16  167.5     53


Note: this shows columns 3 through 5 for rows 1 through 4.
(2) Referencing by name

> mydataframe[["weight"]]
[1] 55.0 60.0 63.0 53.0 69.5
> mydataframe[["height"]]
[1] 167.5 156.3 177.3 167.5 170.0
> mydataframe$height
[1] 167.5 156.3 177.3 167.5 170.0
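
The difference between the reference styles is easier to see side by side. The sketch below uses a two-row excerpt of the height and weight values; the object names (`df`, `one_col_df`, `as_vector`, `also_vec`, `tall`) are hypothetical.

```r
# Two-row excerpt for illustration
df <- data.frame(height = c(167.5, 156.3), weight = c(55.0, 60.0))

one_col_df <- df["height"]     # single brackets return a one-column data frame
as_vector  <- df[["height"]]   # double brackets extract the column itself
also_vec   <- df$height        # $ extracts the column by its literal name

class(one_col_df)                # "data.frame"
class(as_vector)                 # "numeric"
identical(as_vector, also_vec)   # TRUE

# Subscripts also accept a logical condition on the rows
tall <- df[df$height > 160, ]
print(tall)
```

Use `[[ ]]` or `$` when a plain vector is needed for arithmetic, and `[ ]` when the result should stay a data frame.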


(3) The names() function of a data frame

> names(mydataframe)
[1] "name"   "sex"    "age"    "height" "weight"
> mydataframe
       name sex age height weight
1 Zhang San   F  16  167.5   55.0
2  John Doe   F  17  156.3   60.0
3     Harry   M  18  177.3   63.0
4  Zhao Liu   M  16  167.5   53.0
5      ding   M  19  170.0   69.5
> rownames(mydataframe) = c("first row", "second row", "third row", "fourth row", "fifth row")
> mydataframe
                name sex age height weight
first row  Zhang San   F  16  167.5   55.0
second row  John Doe   F  17  156.3   60.0
third row      Harry   M  18  177.3   63.0
fourth row  Zhao Liu   M  16  167.5   53.0
fifth row       ding   M  19  170.0   69.5
> colnames(mydataframe) = c("first column", "second column", "third column", "fourth column", "fifth column")
> mydataframe
           first column second column third column fourth column fifth column
first row     Zhang San             F           16         167.5         55.0
second row     John Doe             F           17         156.3         60.0
third row         Harry             M           18         177.3         63.0
fourth row     Zhao Liu             M           16         167.5         53.0
fifth row          ding             M           19         170.0         69.5
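
The same functions can change a single name instead of replacing all of them at once. A minimal sketch, assuming a small made-up data frame whose abbreviated column names `ht` and `wt` and row labels `r1`, `r2` are purely illustrative:

```r
# Hypothetical data frame with abbreviated column names
df <- data.frame(ht = c(167.5, 156.3), wt = c(55.0, 60.0))

names(df)[names(df) == "wt"] <- "weight"   # rename one column by its old name
names(df)[1] <- "height"                   # rename one column by position

rownames(df) <- c("r1", "r2")              # custom row names
picked <- df["r2", "height"]               # row names work as subscripts
print(picked)                              # 156.3

rownames(df) <- NULL                       # reset to the default 1, 2, ...
print(df)
```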


6. The attach() function. The main purpose of a data frame is to hold the data for statistical modeling: R's modeling functions take a data frame as input, and a data frame can also be treated much like a matrix. A variable in a data frame can be accessed as dataframe$variable, but typing that repeatedly is cumbersome. R provides the attach() function, which adds the data frame to the search path so that its variables can be referenced directly by name.
(1) Using attach() to attach a data frame

> attach(mydataframe)
> r = height / weight
Error: object 'height' not found
> r = 'fourth column' / 'fifth column'
Error in "fourth column"/"fifth column" : non-numeric argument to binary operator
> mydataframe = mylist
> attach(mydataframe)
> r = height / weight
> r
[1] 3.045455 2.605000 2.814286 3.160377 2.446043
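
The two errors in the transcript have different causes. The first occurs because the columns had been renamed, so no variable called height existed any more. The second occurs because a name in quotes is just a character string, and strings cannot be divided. The sketch below shows two ways around this that do not rely on attach(): backticks for column names containing spaces, and with() to evaluate an expression inside a data frame. The object names (`df`, `r1`, `r2`, `r3`) are hypothetical and the data is a two-row excerpt.

```r
# Two-row excerpt for illustration
df <- data.frame(height = c(167.5, 156.3), weight = c(55.0, 60.0))

# with() evaluates the expression using the data frame's columns,
# without adding anything to the search path
r1 <- with(df, height / weight)

# After renaming, the column names contain spaces
colnames(df) <- c("fourth column", "fifth column")

# Backticks refer to the columns; quotes would only create strings
r2 <- with(df, `fourth column` / `fifth column`)

# Subscripting by name works with any column name
r3 <- df[["fourth column"]] / df[["fifth column"]]

print(r1)
all.equal(r1, r2)   # TRUE
```

with() keeps the shorter syntax of attach() while limiting it to a single expression, so there is nothing to detach afterwards.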


(2) Add a new variable to the data frame

> mydataframe
       name sex age height weight
1 Zhang San   F  16  167.5   55.0
2  John Doe   F  17  156.3   60.0
3     Harry   M  18  177.3   63.0
4  Zhao Liu   M  16  167.5   53.0
5      ding   M  19  170.0   69.5
> mydataframe$myr = height / weight
> mydataframe
       name sex age height weight      myr
1 Zhang San   F  16  167.5   55.0 3.045455
2  John Doe   F  17  156.3   60.0 2.605000
3     Harry   M  18  177.3   63.0 2.814286
4  Zhao Liu   M  16  167.5   53.0 3.160377
5      ding   M  19  170.0   69.5 2.446043
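
In the transcript, height and weight on the right-hand side are found through the attached data frame. As a hedged alternative that does not depend on attach(), the sketch below adds the same kind of ratio column in two ways. The names `df`, `df2` and `ratio` are hypothetical and the data is a two-row excerpt.

```r
# Two-row excerpt for illustration
df <- data.frame(height = c(167.5, 156.3), weight = c(55.0, 60.0))

# Direct assignment with explicit references to the columns
df$myr <- df$height / df$weight

# transform() returns a modified copy and looks column names up
# inside the data frame itself
df2 <- transform(df, ratio = height / weight)

print(df2)
```

Note that transform() leaves `df` unchanged, so its result has to be assigned to keep the new column.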
