R in Action reading notes (4): Basic data management

Source: Internet
Author: User

4.2 Creating a new variable

Several arithmetic operators:

^ or **: exponentiation

x %% y: modulus (the remainder of x divided by y)

x %/% y: integer division (the quotient with the remainder discarded)
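
A quick sketch of the three operators on small numbers (the values of x and y are chosen only for illustration):

```r
x <- 5
y <- 2

power_a  <- x^y      # exponentiation: 25
power_b  <- x ** y   # ** is parsed as ^, so this is also 25
rest     <- x %% y   # modulus: 5 = 2 * 2 + 1, so the remainder is 1
quotient <- x %/% y  # integer division: 2
```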

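4.3 Recoding variables

with(): evaluates an expression using the columns of a data frame; the data frame itself is not changed

within(): like with(), but returns a modified copy of the data frame, so it can be used to add or change variables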
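
A minimal sketch of the difference between the two functions. The data frame, the age cutoffs and the category labels are assumptions made up for illustration:

```r
# Hypothetical data, for illustration only
leadership <- data.frame(age = c(32, 45, 25, 80, 60))

# with() only evaluates an expression; leadership is left untouched
mean_age <- with(leadership, mean(age))

# within() returns a modified copy, so the result must be assigned back
leadership <- within(leadership, {
  agecat <- rep(NA_character_, length(age))
  agecat[age > 75] <- "Elder"
  agecat[age >= 55 & age <= 75] <- "Middle Aged"
  agecat[age < 55] <- "Young"
})
```

After this runs, leadership has a new agecat column, while mean_age is an ordinary number.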

4.4 Variable renaming

The reshape package has a rename() function: rename(df, c(manager = "managerID", date = "testDate"))

Or, in base R:

names(df)[2] <- "newName"
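
A small base R sketch of renaming by position and by current name, so that no extra package is needed. The data frame and its column names are hypothetical:

```r
# Hypothetical data, for illustration only
df <- data.frame(managerID = 1:3,
                 testDate  = c("10/24/08", "10/28/08", "10/01/08"))

# Rename by position: the second column becomes "date"
names(df)[2] <- "date"

# Rename by current name: safer when column positions may change
names(df)[names(df) == "managerID"] <- "manager"
```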

4.5 Missing values

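is.na(): tests for missing values; returns TRUE where a value is missing and FALSE otherwise

Most numeric functions accept the na.rm = TRUE option, which drops missing values before computing, for example

y <- sum(x, na.rm = TRUE)

Removing rows that contain missing values

newdf <- na.omit(df)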
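
The three tools can be sketched together on toy data (the vector and data frame below are made up for illustration):

```r
x <- c(1, 2, NA, 4)

missing_flags <- is.na(x)           # FALSE FALSE TRUE FALSE
raw_total     <- sum(x)             # NA, because one value is missing
total         <- sum(x, na.rm = TRUE)  # 7

# na.omit() drops every row that has at least one NA
df <- data.frame(a = c(1, NA, 3), b = c("u", "v", NA))
newdf <- na.omit(df)                # only the first row is complete
```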

4.6 Date values

as.Date(x, "input_format"): the codes for the input_format argument are hard to remember; the default date format is yyyy-mm-dd

%d: day of the month as a number (01~31)

%a: abbreviated weekday name

%A: unabbreviated weekday name

%m: month as a number (01~12)

%b: abbreviated month name

%B: unabbreviated month name

%y: two-digit year

%Y: four-digit year
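
A short sketch of how these codes are used when reading dates that are not in the default format. The date strings are made up; only numeric codes are used here because month and weekday names depend on the locale:

```r
# Strings in month/day/year order need an explicit input format
strDates <- c("01/05/1965", "08/16/1975")
dates <- as.Date(strDates, format = "%m/%d/%Y")

# The same codes work for output
years <- format(dates, "%Y")

# Strings already in yyyy-mm-dd form need no format at all
iso <- as.Date("2007-06-22")
```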

Sys.Date(): current date

date(): returns the current date and time as a character string. A more distinctive name would have helped; date and Date are easy to confuse

format() can extract individual parts of a date.

today <- Sys.Date()
format(today, format = "%B %d %Y")
format(today, format = "%A")

Dates can be subtracted; the result is the difference in days

startdate <- as.Date("2004-02-13")
enddate <- as.Date("2009-06-22")
days <- enddate - startdate
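
As a sketch of what the subtraction returns, the result is a difftime object; difftime() can also express the same interval in other units. The variable names n_days and weeks are introduced here for illustration:

```r
startdate <- as.Date("2004-02-13")
enddate   <- as.Date("2009-06-22")

days   <- enddate - startdate          # a difftime measured in days
n_days <- as.numeric(days)             # plain number: 1956

# The same interval expressed in weeks
weeks <- difftime(enddate, startdate, units = "weeks")
```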

format() also works on any other date, for example to find the weekday of a given date

today <- Sys.Date()
format(today, format = "%B %d %Y")
dob <- as.Date("1956-10-10")
format(dob, format = "%A")

4.6.1 Converting a date to a character variable

as.character(): converts a date value to a character string

4.7 Type Conversions

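Each is.* function tests for a type, and the matching as.* function converts to it:

is.numeric() / as.numeric()

is.character() / as.character()

is.vector() / as.vector()

is.data.frame() / as.data.frame()

is.factor() / as.factor()

is.logical() / as.logical()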
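
A brief sketch of testing and converting types; the values are made up for illustration:

```r
a <- c(1, 2, 3)
a_is_numeric <- is.numeric(a)      # TRUE

b <- as.character(a)               # "1" "2" "3"
b_is_character <- is.character(b)  # TRUE

f <- as.factor(c("x", "y", "x"))   # character vector turned into a factor

# A conversion that cannot succeed yields NA (with a warning, silenced here)
n <- suppressWarnings(as.numeric("abc"))
```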

4.8 Sorting Data

order()

newdata <- leadership[order(leadership$age), ]

This sorts by age in ascending order; putting a minus sign in front of the variable sorts in descending order.

newdata <- leadership[order(leadership$gender, -leadership$age), ]

This sorts by gender ascending, then by age descending.
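
A runnable sketch of both sorts on a small, made-up leadership data frame (the values are assumptions for illustration):

```r
# Hypothetical data, for illustration only
leadership <- data.frame(
  gender = c("M", "F", "F", "M", "F"),
  age    = c(32, 45, 25, 39, 99),
  stringsAsFactors = FALSE
)

# Ascending by age
by_age <- leadership[order(leadership$age), ]

# Gender ascending, then age descending within each gender;
# with() avoids repeating the data frame name inside order()
by_gender_age <- with(leadership, leadership[order(gender, -age), ])
```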

4.9 Merging of datasets

4.9.1 Adding columns

To merge two data frames horizontally by one or more key variables, use merge()

newdf <- merge(dfA, dfB, by = "ID")

newdf <- merge(dfA, dfB, by = c("ID", "Country"))

If no key is needed for the join (the rows are already in the same order), use cbind()
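
A sketch contrasting the two approaches on toy data (dfA, dfB and their contents are hypothetical). By default merge() keeps only the keys present in both data frames, while cbind() simply glues columns together row by row:

```r
# Hypothetical data, for illustration only
dfA <- data.frame(ID = c(1, 2, 3), score = c(90, 85, 70))
dfB <- data.frame(ID = c(3, 1, 4), country = c("UK", "US", "FR"),
                  stringsAsFactors = FALSE)

# Keyed join: only IDs 1 and 3 appear in both, and the result is sorted by ID
joined <- merge(dfA, dfB, by = "ID")

# No key: rows are paired by position, so they must already line up
side_by_side <- cbind(dfA, flag = c(TRUE, FALSE, TRUE))
```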

4.9.2 Adding rows

rbind(): the two data frames must have the same variables, though not necessarily in the same order
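
A minimal sketch of stacking two data frames whose columns are in a different order (the data is made up for illustration):

```r
# Hypothetical data, for illustration only
dfA <- data.frame(ID = 1:2, q1 = c(5, 3))
dfB <- data.frame(q1 = c(4, 2), ID = 3:4)   # same variables, different order

# For data frames, rbind() matches columns by name, not by position
total <- rbind(dfA, dfB)
```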

4.10 Subset of data sets

4.10.1 Selecting variables

Select columns by position

data <- df[, c(6:10)]

or select by name

myvars <- c("q1", "q2", "q3", "q4", "q5")
newdata <- leadership[myvars]

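4.10.2 Dropping variables

myvars <- names(leadership) %in% c("q3", "q4")
newdata <- leadership[!myvars]

The first line gives a logical vector that is TRUE for the names matching q3 or q4; negating it with ! keeps every other column, so q3 and q4 are removed.

Or, by position

newdata <- leadership[c(-7, -8)]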
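
Both ways of dropping columns can be sketched on a small, made-up data frame. Because this toy frame has only five columns, q3 and q4 sit at positions 4 and 5 rather than 7 and 8:

```r
# Hypothetical data, for illustration only
leadership <- data.frame(id = 1:2, q1 = c(5, 3), q2 = c(4, 5),
                         q3 = c(5, 2), q4 = c(5, 5))

# Logical vector: TRUE where the column name is q3 or q4
myvars <- names(leadership) %in% c("q3", "q4")

by_name <- leadership[!myvars]      # keep everything that is not q3/q4
by_pos  <- leadership[c(-4, -5)]    # negative indices drop by position
```

Dropping by name is usually more robust, since it keeps working if columns are added or reordered.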

4.10.3 Selecting observations

which(): returns the positions of the TRUE elements of a logical vector, which can then be used as row indices
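
A sketch of selecting rows with which(). The data and the condition (men older than 30) are assumptions chosen for illustration:

```r
# Hypothetical data, for illustration only
leadership <- data.frame(
  gender = c("M", "F", "F", "M"),
  age    = c(32, 45, 25, 39),
  stringsAsFactors = FALSE
)

# Positions of the rows where the condition is TRUE
rows <- which(leadership$gender == "M" & leadership$age > 30)

newdata <- leadership[rows, ]
```

One practical difference from indexing with the logical vector directly is that which() skips NA results instead of producing NA rows.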

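4.10.4 subset()

subset() selects rows with a logical condition and columns with select (the numeric cutoffs in the conditions are elided here)

newdata <- subset(leadership, age >= ... | age < ..., select = c(q1, q2, q3, q4))
newdata <- subset(leadership, gender == "M" & age > ..., select = gender:q4)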
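
A runnable sketch of both calls. The data frame and the age cutoffs (35, 26 and 30) are assumptions picked for illustration, not values from the notes. Note that equality is tested with ==, not =:

```r
# Hypothetical data, for illustration only
leadership <- data.frame(
  gender = c("M", "F", "F", "M", "F"),
  age    = c(32, 45, 25, 39, 99),
  q1 = c(5, 3, 3, 3, 2), q2 = c(4, 5, 5, 3, 2),
  q3 = c(5, 2, 5, 4, 1), q4 = c(5, 5, 5, 4, 2),
  stringsAsFactors = FALSE
)

# Rows matching either age condition, keeping only q1 to q4 (cutoffs assumed)
by_age <- subset(leadership, age >= 35 | age < 26,
                 select = c(q1, q2, q3, q4))

# Men above an assumed cutoff, keeping the columns from gender through q4
men <- subset(leadership, gender == "M" & age > 30, select = gender:q4)
```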

4.10.5 Random Sampling

sample()

mysample <- df[sample(1:nrow(df), 3, replace = FALSE), ]

This draws 3 rows at random without replacement.
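
A sketch of row sampling on a made-up data frame. The seed value is arbitrary and is only there to make the draw repeatable:

```r
set.seed(1234)  # arbitrary seed, for reproducibility only

# Hypothetical data, for illustration only
df <- data.frame(id = 1:10, value = (1:10)^2)

# Draw 3 distinct row numbers, then use them as a row index
idx <- sample(1:nrow(df), 3, replace = FALSE)
mysample <- df[idx, ]
```

The trailing comma in df[idx, ] matters: without it, idx would select columns instead of rows.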

4.11 Manipulating data frames with SQL

library(sqldf)
newdf <- sqldf("select * from mtcars where carb=1 order by mpg", row.names = TRUE)
newdf <- sqldf("select avg(mpg) as avg_mpg, avg(disp) as avg_disp, gear from mtcars where cyl in (4, 6) group by gear")
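
For comparison, a sketch of how the same two queries might be written in base R, using the techniques from the earlier sections plus aggregate(). This is an illustrative equivalent, not what sqldf does internally, and the variable names are made up:

```r
# select * from mtcars where carb=1 order by mpg
carb1 <- mtcars[mtcars$carb == 1, ]
carb1 <- carb1[order(carb1$mpg), ]

# select avg(mpg), avg(disp), gear from mtcars where cyl in (4, 6) group by gear
small <- mtcars[mtcars$cyl %in% c(4, 6), ]
avgs  <- aggregate(cbind(mpg, disp) ~ gear, data = small, FUN = mean)
names(avgs) <- c("gear", "avg_mpg", "avg_disp")
```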
