R Language Data Structures

Source: Internet
Author: User

5. Data Structures

5.1 Introduction to Data Structures

(1) Vector

All elements of a vector must have the same type (mode).

(2) List

A list can be heterogeneous: its elements may have different modes.

An element can be indexed by position: lst[[2]]

Extract a sub-list: lst[c(2, 5)]

List elements can have names: lst[["Moe"]] or lst$Moe

Lists are similar to dictionaries, hash tables, and so on.
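The indexing forms above can be sketched as follows. The list contents are hypothetical and chosen only for illustration; the point is that double brackets return the element itself, while single brackets always return a list.

```r
# Hypothetical list, used only for illustration
lst <- list(Moe = 1, Larry = "two", Curly = c(3, 4))

second  <- lst[[2]]       # the element itself: the string "two"
sublist <- lst[c(1, 3)]   # still a list, holding Moe and Curly
by_name <- lst[["Moe"]]   # same value as lst$Moe

class(second)    # "character"
class(sublist)   # "list"
```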

(3) Mode: physical type

> mode(3.1415)  # result is "numeric"

Each object in R has a mode that indicates how the object is stored in memory:

Object                        Example                                       Mode
Number                        3.14                                          numeric
Vector of numbers             c(2.7, 3.14)                                  numeric
Character string              "Moe"                                         character
Vector of character strings   c("Moe", "Larry")                             character
Factor                        factor(c("NY", "CA", "IL"))                   numeric
List                          list("Moe", "Larry")                          list
Data frame                    data.frame(x = 1:3, y = c("NY", "CA", "IL"))  list
Function                      print                                         function

(4) Class: abstract type

> d <- as.Date("2010-03-10")

> class(d)  # result is "Date"

Each object in R also has a class, which defines its abstract type.
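The difference between mode and class is easiest to see side by side. A minimal sketch, using small illustrative objects that are not part of the original notes:

```r
d  <- as.Date("2010-03-10")
f  <- factor(c("NY", "CA", "IL"))
df <- data.frame(x = 1:3)

# mode describes the storage; class describes how R treats the object
mode(d);  class(d)    # "numeric", "Date"
mode(f);  class(f)    # "numeric", "factor"
mode(df); class(df)   # "list",    "data.frame"
```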

(5) Scalar (constant)

A scalar is simply a vector with exactly one element.

(6) Matrix

A matrix in R is just a vector that has dimensions.

A vector's dim attribute is initially NULL; assigning to it turns the vector into a matrix:

> A <- 1:6

> dim(A) <- c(2, 3)  # A becomes a 2 x 3 matrix

(7) Array

A matrix is a two-dimensional vector; an array generalizes this to any number of dimensions.
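The idea that matrices and arrays are ordinary vectors carrying a dim attribute can be checked directly. A small sketch (the variable B is introduced here only for illustration):

```r
A <- 1:6
is.null(dim(A))        # TRUE: a plain vector has no dim attribute
dim(A) <- c(2, 3)      # same data, now viewed as 2 rows x 3 columns
is.matrix(A)           # TRUE
A[2, 3]                # 6, because R fills matrices column by column

B <- 1:24
dim(B) <- c(2, 3, 4)   # three dimensions make an array
is.array(B)            # TRUE
B[1, 1, 2]             # 7
```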

(8) Factor

R records the unique values in a vector; each unique value is called a level of the associated factor (see 5.5).

Factors have two key uses: categorical variables and grouping.

(9) Data frame

Designed to model datasets, much like a dataset in SAS or SPSS.

5.2 Adding data to vectors

> v <- c(1, 2, 3)

> v <- c(v, 4)  # append 4 to the original vector: 1, 2, 3, 4

> w <- c(5, 6, 7, 8)

> v <- c(v, w)  # combine v and w

5.3 Inserting data into a vector

> append(vec, newvalues, after = n)  # insert newvalues after the nth element of vec
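A short sketch of append with concrete values. The numbers are arbitrary; after = 0 is the documented way to insert at the front.

```r
vec <- 1:10

mid   <- append(vec, 99, after = 5)   # 1 2 3 4 5 99 6 7 8 9 10
front <- append(vec, 99, after = 0)   # 99 goes before the first element

length(mid)   # 11: append returns a new vector, vec itself is unchanged
```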

5.4 Understanding the Recycling Rule

When an operation combines two vectors of unequal length, R processes them element by element. When the shorter vector runs out of elements while the longer vector still has unprocessed ones, R returns to the start of the shorter vector and reuses its elements.
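The rule can be seen in a simple addition. This is a minimal sketch with made-up values; when the longer length is not a multiple of the shorter one, R still recycles but issues a warning, which is suppressed here only to keep the example quiet.

```r
long  <- 1:6
short <- c(10, 20)

long + short   # 11 22 13 24 15 26: short is reused as 10 20 10 20 10 20

# Lengths 5 and 2: recycling still happens, but R warns about the mismatch
uneven <- suppressWarnings(1:5 + c(10, 20))
uneven         # 11 22 13 24 15
```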

5.5 Building a Factor

A factor represents a categorical variable; each possible value of the variable is called a level.

> f <- factor(v)
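As a rough illustration, assume v holds a few state codes (hypothetical data). The factor stores each distinct value once as a level and keeps integer codes that point into the levels.

```r
v <- c("NY", "CA", "NY", "IL", "CA")   # hypothetical categorical data
f <- factor(v)

levels(f)       # "CA" "IL" "NY": the distinct values, sorted
as.integer(f)   # 3 1 3 2 1: position of each element in levels(f)
table(f)        # counts per level, a typical grouping use
```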

5.6 Creating a list

> lst <- list(0.5, 0.8, 0.3)

> lst <- list(mid = 0.5, right = 0.8, left = 0.3)

> lst[[2]]

> lst[["mid"]]  # the element named mid; same as lst$mid

> lst["mid"]  # a one-element sub-list containing mid

5.7 Removing elements from the list

> lst[["mid"]] <- NULL  # remove the element named mid

5.8 Converting a list to a vector

> v <- unlist(lst)

5.9 Removing NULL elements from a list

> lst[sapply(lst, is.null)] <- NULL
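A small sketch of what this idiom does, using an illustrative three-element list. sapply builds a logical vector marking the NULL entries, and assigning NULL to those positions deletes them.

```r
lst <- list("Moe", NULL, "Curly")   # illustrative list with one NULL entry
length(lst)                         # 3

is_null <- sapply(lst, is.null)     # FALSE TRUE FALSE
lst[is_null] <- NULL                # assigning NULL removes the selected elements
length(lst)                         # 2
```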

5.10 Removing list elements by condition

> lst[lst < 0] <- NULL  # remove elements less than 0

> lst[is.na(lst)] <- NULL  # remove elements whose value is NA

> lst[abs(unlist(lst)) < 1]  # select elements whose absolute value is less than 1
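The following sketch runs these conditions on a hypothetical named list. It assumes every element is a single number or NA; comparisons such as lst < 0 rely on R being able to coerce the list to a numeric vector, so lists with longer or non-numeric elements would need sapply instead.

```r
# Hypothetical list of single values (assumption: one number or NA per element)
lst <- list(a = -2, b = 0.5, c = NA, d = 3)

lst[is.na(lst)] <- NULL   # drops c; done first so later tests see no NA
lst[lst < 0]    <- NULL   # drops a
names(lst)                # "b" "d"

small <- lst[abs(unlist(lst)) < 1]   # selection, not removal: keeps only b
names(small)                         # "b"
```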

5.11 Matrix Initialization

> mat <- matrix(vec, 2, 3)  # build a 2 x 3 matrix from the data in vec

> dim(vec) <- c(2, 3)  # method 2: set the dimensions of vec directly

5.12 Matrix Operations

> t(A)  # transpose of matrix A

> solve(A)  # inverse of matrix A

> A %*% B  # matrix product of A and B

> diag(n)  # n x n identity matrix
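These operations can be checked against each other: a matrix times its inverse should give the identity. A minimal sketch with an arbitrary invertible 2 x 2 matrix:

```r
# Arbitrary invertible matrix, filled column by column:
#      [,1] [,2]
# [1,]    2    1
# [2,]    0    3
A <- matrix(c(2, 0, 1, 3), nrow = 2)

A_inv <- solve(A)                   # inverse of A
all.equal(A %*% A_inv, diag(2))     # TRUE, up to floating-point tolerance
t(A)[2, 1]                          # 1: the transpose swaps rows and columns
```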

5.13 Assigning descriptive names to the rows and columns of a matrix

> rownames(mat) <- c("rowname_1", "rowname_2", ..., "rowname_n")

> colnames(mat) <- c("colname_1", "colname_2", ..., "colname_n")

5.14 Selecting a row or column from a matrix

> vec <- mat[1, ]  # result is a vector

> vec <- mat[, 2, drop = FALSE]  # result is a one-column matrix
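The effect of drop = FALSE shows up in the dimensions of the result. A short sketch on a small illustrative matrix:

```r
mat <- matrix(1:6, nrow = 2)       # 2 x 3 matrix: columns (1,2), (3,4), (5,6)

row1 <- mat[1, ]                   # dimensions are dropped: plain vector 1 3 5
col2 <- mat[, 2, drop = FALSE]     # dimensions are kept: 2 x 1 matrix

dim(row1)   # NULL
dim(col2)   # 2 1
```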

5.15 Initializing a data frame from column data

> dfrm <- data.frame(v1, v2, v3, f1, f2)  # build a data frame from vectors and factors

> lst <- list(v1, v2, v3)

> dfrm <- as.data.frame(lst)  # method 2: convert a list of columns

5.16 Initializing a data frame from row data

When each row mixes data of different modes, such as numbers and character strings, a row cannot be stored in a vector. Typically, each row is stored as a one-row data frame, the rows are collected in a list, and rbind and do.call combine them into one large data frame.

> obs <- list(data.frame(vc1 = 1, f1 = 0), data.frame(vc1 = 2, f1 = 1))

> dfrm <- rbind(obs[[1]], obs[[2]])  # combine the first two rows into a data frame

> dfrm <- do.call(rbind, obs)  # combine all rows into one data frame

When obs is a list of lists rather than a list of data frames, first call Map to convert each row into a data frame, then use do.call:

> dfrm <- do.call(rbind, Map(as.data.frame, obs))
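A minimal sketch of the list-of-lists case. The field names city and pop are illustrative assumptions; any rows with matching names would work the same way.

```r
# Each row is a plain list with mixed modes (hypothetical data)
obs <- list(
  list(city = "Nanjing", pop = 5421),
  list(city = "Beijing", pop = 5552)
)

rows <- Map(as.data.frame, obs)    # a list of one-row data frames
dfrm <- do.call(rbind, rows)       # equivalent to rbind(rows[[1]], rows[[2]])

nrow(dfrm)    # 2
names(dfrm)   # "city" "pop"
```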

5.17 Adding rows to a data frame

Each new row must itself be a one-row data frame.

> suburbs <- rbind(suburbs,

+ data.frame(city = "Nanjing", county = "Kane", pop = 5421),

+ data.frame(city = "Beijing", county = "Jane", pop = 5552))  # add two rows at once

5.18 Preallocating a data frame

When the data volume is large, R's memory manager performs poorly if you build a data frame by adding rows one at a time. If you know how many rows you will need, you can allocate the space beforehand.

> n <- 100000

> dfrm <- data.frame(colname1 = numeric(n), colname2 = character(n), ...)
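One way to use a preallocated data frame is to fill it in place by index. This is a sketch under assumed column names (id, label, score) and a small n; stringsAsFactors = FALSE is passed explicitly so the character column stays character in older R versions too.

```r
n <- 1000   # kept small for illustration

# Allocate all rows up front; columns start as 0 or ""
dfrm <- data.frame(id    = integer(n),
                   label = character(n),
                   score = numeric(n),
                   stringsAsFactors = FALSE)

# Fill existing rows instead of growing the data frame with rbind
for (i in seq_len(n)) {
  dfrm$id[i]    <- i
  dfrm$label[i] <- paste0("row", i)
  dfrm$score[i] <- i / 2
}

nrow(dfrm)   # still 1000: no rows were added
```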

5.19 Selecting columns of a data frame

> dfrm[[n]]  # returns column n as a vector

> dfrm[n]  # returns a data frame containing only column n

> dfrm[c(n1, n2, n4)]  # returns a data frame containing columns n1, n2 and n4

> dfrm[, n]  # returns column n as a vector

> dfrm[, c(n1, n3)]  # returns a data frame containing columns n1 and n3

> dfrm[["name"]]  # returns the column called name

> dfrm$name  # same, using $ notation

> subset(dfrm, select = c(colname1, colname2))  # select columns by name

> subset(dfrm, select = c(colname1, colname2), subset = (colname1 > 0))  # keep only the rows that meet the condition, and only those two columns
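The selection forms differ mainly in whether they return a vector or a data frame. A short sketch on a hypothetical three-column data frame (column names a, b, flag are illustrative):

```r
dfrm <- data.frame(a    = c(-1, 2, 3),
                   b    = c("x", "y", "z"),
                   flag = c(TRUE, FALSE, TRUE))

v  <- dfrm[[1]]    # vector: the contents of column a
d1 <- dfrm[1]      # data frame with the single column a
d2 <- dfrm[, 1]    # vector again, same as dfrm[[1]]

# Rows where a > 0, keeping only columns a and flag
s <- subset(dfrm, subset = a > 0, select = c(a, flag))
dim(s)             # 2 2
```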

5.20 Changing the column names of a data frame

> colnames(dfrm) <- c("before", "treatment", "after")

5.21 Editing a data frame

> temp <- edit(dfrm)  # open dfrm in an editor; the edited copy is returned

> dfrm <- temp  # overwrite the original with the edited copy

> fix(dfrm)  # edit in place: the original data frame is overwritten directly

5.22 Removing rows that contain NA from a data frame

> clean_dfrm <- na.omit(dfrm)

5.23 Removing columns from a data frame

> subset(dfrm, select = -colname2)

5.24 Combining two data frames

When the two data frames have different columns (and the same number of rows), combine them side by side with cbind:

> all.cols <- cbind(dfrm1, dfrm2)  # horizontal merge: columns are placed side by side

When the two data frames have the same columns, stack them with rbind:

> all.rows <- rbind(dfrm1, dfrm2)  # vertical merge: rows are stacked

To combine data frames on a common column, similar to an SQL join, use merge:

> m <- merge(dfrm1, dfrm2, by = "name")
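A small sketch of merge on hypothetical data. By default only rows whose key appears in both data frames are kept (like an inner join), and the result is sorted by the key column; passing all = TRUE keeps unmatched rows and fills the gaps with NA.

```r
# Hypothetical data frames sharing the key column "name"
dfrm1 <- data.frame(name = c("Moe", "Larry", "Curly"), age = c(40, 38, 35))
dfrm2 <- data.frame(name = c("Larry", "Moe", "Shemp"), city = c("NY", "CA", "IL"))

m <- merge(dfrm1, dfrm2, by = "name")   # only Larry and Moe appear in both
nrow(m)                                 # 2

m_all <- merge(dfrm1, dfrm2, by = "name", all = TRUE)   # keep unmatched rows too
nrow(m_all)                                             # 4
```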

5.25 More convenient access to data frame contents

To use a column of a data frame you normally write dfrm$colname1. The following commands let you omit the dfrm prefix:

> with(dfrm, expr)  # expr can refer to colname1 directly

> attach(dfrm)  # subsequent expressions can refer to colname1 directly
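A brief sketch comparing the two spellings, using illustrative columns before and after. with evaluates the expression with the data frame's columns in scope and leaves the search path untouched.

```r
dfrm <- data.frame(before = c(10, 12, 9), after = c(12, 15, 9))   # illustrative data

gain1 <- mean(dfrm$after - dfrm$before)     # explicit dfrm$ prefix
gain2 <- with(dfrm, mean(after - before))   # same computation, no prefix

all.equal(gain1, gain2)   # TRUE
```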

5.26 Conversions between basic data types

> as.character(x)  # character

> as.complex(x)  # complex

> as.numeric(x)  # numeric; as.double(x) is equivalent

> as.integer(x)  # integer

> as.logical(x)  # logical
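A few concrete conversions, as a rough guide to what these functions return. The inputs are arbitrary; a value that cannot be converted becomes NA with a warning, which is suppressed here only to keep the example quiet.

```r
as.numeric("3.14")    # 3.14
as.integer(3.99)      # 3: the fractional part is dropped, not rounded
as.logical(0:2)       # FALSE TRUE TRUE: zero is FALSE, nonzero is TRUE
as.character(42)      # "42"

bad <- suppressWarnings(as.numeric("abc"))   # not a number, so the result is NA
is.na(bad)                                   # TRUE
```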

5.27 Conversions between structured data types

Not every conversion is possible, so be careful.

> as.data.frame(x)

> as.list(x)

> as.matrix(x)

> as.vector(x)
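A short sketch of why care is needed, using small illustrative objects. Converting a data frame with mixed column types to a matrix forces every cell to a common type, which here means character.

```r
m <- matrix(1:4, nrow = 2)
as.vector(m)            # 1 2 3 4: the dimensions are dropped

as.list(1:3)            # a list with three one-element entries

dfrm <- data.frame(x = 1:2, y = c("a", "b"))   # mixed numeric and character columns
cm   <- as.matrix(dfrm)
mode(cm)                # "character": the numbers were converted to strings
```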
