4.2 Creating a new variable
Common arithmetic operators:
^ or **: exponentiation
x %% y: modulus (the remainder of x divided by y, e.g. 5 %% 2 is 1)
x %/% y: integer division (e.g. 5 %/% 2 is 2)
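As a minimal sketch of how these operators are used to create new variables: the data frame mydata and its columns x1 and x2 are made up for illustration, and transform() is one of several base R ways to add columns.

```r
# Hypothetical data frame, used only for illustration.
mydata <- data.frame(x1 = c(2, 2, 6, 4), x2 = c(3, 4, 2, 8))

# transform() returns a copy of the data frame with the new columns added.
mydata <- transform(mydata,
                    sumx  = x1 + x2,
                    meanx = (x1 + x2) / 2)

power     <- 2^3      # exponentiation, same as 2**3
remainder <- 5 %% 2   # modulus: remainder of 5 divided by 2
quotient  <- 5 %/% 2  # integer division
```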
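4.3 Recoding variables
with(): evaluates an expression using the columns of a data frame, but does not modify the data frame
within(): like with(), but returns a modified copy of the data frame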
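A small sketch of recoding with within(). The leadership data, the agecat variable and the age cutoffs (55 and 75) are assumptions chosen for illustration, not values taken from these notes.

```r
# Hypothetical data, used only for illustration.
leadership <- data.frame(manager = 1:5, age = c(32, 45, 25, 60, 80))

# within() evaluates the assignments inside the data frame and returns
# a modified copy, so the result has to be assigned back.
leadership <- within(leadership, {
  agecat <- NA
  agecat[age > 75]              <- "Elder"
  agecat[age >= 55 & age <= 75] <- "Middle Aged"
  agecat[age < 55]              <- "Young"
})
```

The same assignments inside with() would be discarded, because with() only evaluates the expression and returns its value.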
4.4 Renaming variables
The reshape package provides rename(), which renames variables by name:
rename(df, c(manager = "managerID", date = "testDate"))
Or, in base R, by position:
names(df)[2] <- "newname"
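Renaming by position breaks silently if the column order changes. A hedged base R alternative is to match on the old name; the data frame and column names below are illustrative assumptions.

```r
# Hypothetical data frame, used only for illustration.
df <- data.frame(manager = 1:3,
                 date = c("10/24/08", "10/28/08", "10/01/08"))

# Rename by matching the old name instead of relying on column position.
names(df)[names(df) == "manager"] <- "managerID"
names(df)[names(df) == "date"]    <- "testDate"
```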
4.5 Missing values
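is.na(): tests for missing values; returns TRUE where a value is missing and FALSE otherwise
Most numeric functions accept the na.rm=TRUE option, which drops missing values before computing, for example:
y <- sum(x, na.rm=TRUE)
Removing rows that contain missing values:
newdf <- na.omit(df)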
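The following sketch puts the three tools side by side on made-up data; the vector x and the data frame df are assumptions for illustration.

```r
# Hypothetical vector with one missing value.
x <- c(1, 2, NA, 4)

missing_flags <- is.na(x)            # TRUE only at the missing position
total_raw     <- sum(x)              # NA, because one element is missing
total         <- sum(x, na.rm = TRUE)  # missing value dropped before summing

# na.omit() drops every row that contains at least one NA.
df    <- data.frame(id = 1:4, score = x)
newdf <- na.omit(df)
```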
4.6 Date values
as.Date(x, "input_format"): converts character strings to dates. The format codes are hard to remember; the default input format is yyyy-mm-dd
%d: day of the month as a number (01~31)
%a: abbreviated weekday name
%A: unabbreviated weekday name
%m: month as a number (01~12)
%b: abbreviated month name
%B: unabbreviated month name
%y: two-digit year
%Y: four-digit year
Sys.Date(): current date
date(): returns the current date and time as a character string. Why not pick a better name? date() and Date are far too easy to confuse
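A short sketch of parsing with the format codes listed above. The input strings are made up for illustration; only numeric codes are used so the result does not depend on the locale.

```r
# The default yyyy-mm-dd layout needs no format string.
d1 <- as.Date("2007-06-22")

# Other layouts need an explicit format built from the codes.
strDates <- c("01/05/1965", "08/16/1975")
d2 <- as.Date(strDates, format = "%m/%d/%Y")

# %y reads a two-digit year.
d3 <- as.Date("13.02.04", format = "%d.%m.%y")
```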
Use format() to output a date in a given format or to extract parts of it:
today <- Sys.Date()
format(today, format = "%B %d %Y")
format(today, format = "%A")
Dates can be subtracted:
startdate <- as.Date("2004-02-13")
enddate <- as.Date("2009-06-22")
days <- enddate - startdate
format() also works on any other date, for example to get the weekday of a date of birth:
dob <- as.Date("1956-10-10")
format(dob, format = "%A")
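A sketch of date arithmetic and extraction using the two dates from the notes. difftime() is not mentioned in the notes; it is a base R function added here as an assumption about what is useful, because it lets you pick the unit of the difference.

```r
startdate <- as.Date("2004-02-13")
enddate   <- as.Date("2009-06-22")

# Subtracting two Date objects gives a difftime measured in days.
days <- enddate - startdate

# difftime() computes the same difference in a chosen unit.
weeks <- difftime(enddate, startdate, units = "weeks")

# Numeric format codes give the same text in every locale.
year_part <- format(startdate, format = "%Y")
month_day <- format(startdate, format = "%m/%d")

# Name-based codes such as %A and %B depend on the current locale.
weekday <- format(startdate, format = "%A")
```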
4.6.1 Converting a date to a character variable
as.character(): converts a Date object to a character string
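A minimal sketch of the conversion, with made-up dates. After the conversion the values are plain strings, so date arithmetic no longer applies to them.

```r
dates <- as.Date(c("2004-02-13", "2009-06-22"))

# Date objects become plain character strings in yyyy-mm-dd form.
strDates <- as.character(dates)
```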
4.7 Type Conversions
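Each is.xxx() function tests for a type and returns TRUE or FALSE; the matching as.xxx() function converts to that type:
is.numeric() / as.numeric()
is.character() / as.character()
is.vector() / as.vector()
is.data.frame() / as.data.frame()
is.factor() / as.factor()
is.logical() / as.logical()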
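A brief sketch of testing and converting types; all values are made up for illustration.

```r
a <- c(1, 2, 3)
a_is_numeric <- is.numeric(a)       # test the type

b <- as.character(a)                # numbers become strings
n <- as.numeric("3.14")             # a numeric string becomes a number
f <- as.factor(c("M", "F", "M"))    # strings become a factor

# A string that is not a number converts to NA (with a warning).
bad <- suppressWarnings(as.numeric("abc"))
```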
4.8 Sorting Data
order(): sorts in ascending order by default; put a minus sign in front of a variable to sort it in descending order
newdata <- leadership[order(leadership$age),]   # ascending by age
newdata <- leadership[order(leadership$gender, -leadership$age),]   # gender ascending, age descending
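The sorting calls above can be tried on a small made-up leadership data frame; the values are assumptions for illustration.

```r
# Hypothetical data, used only for illustration.
leadership <- data.frame(
  gender = c("M", "F", "F", "M", "F"),
  age    = c(32, 45, 25, 39, 99)
)

# order() returns the row indices that put the data in sorted order.
by_age        <- leadership[order(leadership$age), ]
by_age_desc   <- leadership[order(-leadership$age), ]
by_gender_age <- leadership[order(leadership$gender, -leadership$age), ]
```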
4.9 Merging datasets
4.9.1 Adding columns
To merge two data frames horizontally on one or more key variables, use merge():
newdf <- merge(dfA, dfB, by = "ID")
newdf <- merge(dfA, dfB, by = c("ID", "country"))
If no key is needed for the join, use cbind(); both objects must have the same number of rows, in the same order
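A hedged sketch of the difference between merge() and cbind(). The data frames dfA and dfB and their columns are made up; the all = TRUE argument is not in the notes and is shown only to contrast the default inner join with an outer join.

```r
# Hypothetical data frames sharing the keys ID and country.
dfA <- data.frame(ID = c(1, 2, 3), country = c("US", "UK", "US"),
                  q1 = c(5, 3, 4))
dfB <- data.frame(ID = c(1, 2, 4), country = c("US", "UK", "UK"),
                  q2 = c(4, 5, 2))

# Default merge keeps only rows whose keys appear in both data frames.
inner <- merge(dfA, dfB, by = c("ID", "country"))

# all = TRUE keeps unmatched rows from both sides, filling gaps with NA.
outer <- merge(dfA, dfB, by = c("ID", "country"), all = TRUE)

# cbind() pairs rows purely by position, with no key matching.
side_by_side <- cbind(dfA, flag = c(TRUE, FALSE, TRUE))
```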
4.9.2 Adding rows
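rbind(): joins two data frames vertically; both must have the same variables, though not necessarily in the same order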
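A minimal sketch of adding rows, using made-up data frames whose columns are deliberately in a different order.

```r
# Hypothetical data frames with the same variables in different order.
dfA <- data.frame(ID = 1:2, q1 = c(5, 3))
dfB <- data.frame(q1 = c(4, 2), ID = 3:4)

# For data frames, rbind() matches columns by name, not by position.
total <- rbind(dfA, dfB)
```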
4.10 Subsetting datasets
4.10.1 Selecting variables
Select columns by position:
newdata <- df[, c(6:10)]
or by name:
myvars <- c("q1", "q2", "q3", "q4", "q5")
newdata <- leadership[myvars]
4.10.2 Excluding (dropping) variables
myvars <- names(leadership) %in% c("q3", "q4")   # logical vector: TRUE for the names that match
newdata <- leadership[!myvars]   # negate it, so q3 and q4 are dropped
Or, by position:
newdata <- leadership[c(-7, -8)]
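A small sketch covering both keeping and dropping columns. The data frame below is a made-up, narrower version of leadership, so q3 and q4 sit in positions 4 and 5 rather than 7 and 8.

```r
# Hypothetical data, used only for illustration.
leadership <- data.frame(manager = 1:3,
                         q1 = c(5, 3, 3), q2 = c(4, 5, 5),
                         q3 = c(5, 2, 5), q4 = c(5, 5, 5))

# Keep columns by name.
kept <- leadership[c("q1", "q2")]

# Drop columns by name: build a logical mask, then negate it.
drop_flags <- names(leadership) %in% c("q3", "q4")
dropped    <- leadership[!drop_flags]

# Drop the same columns by position with negative indices.
dropped_by_pos <- leadership[c(-4, -5)]
```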
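4.10.3 Selecting observations
Select rows with a logical condition or with the which() function, which returns the indices of the TRUE elements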
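A sketch of why which() can be preferable to a bare logical condition when the data contain missing values. The data and the age cutoff of 30 are assumptions for illustration.

```r
# Hypothetical data with one missing age.
leadership <- data.frame(gender = c("M", "F", "F", "M"),
                         age    = c(32, 45, 25, NA))

cond <- leadership$gender == "M" & leadership$age > 30

# which() keeps only the positions that are TRUE and ignores NA.
rows    <- which(cond)
newdata <- leadership[rows, ]

# Indexing with the raw condition turns the NA into an all-NA row.
with_na_row <- leadership[cond, ]
```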
4.10.4 The subset() function
newdata <- subset(leadership, age >= | age < , select = c(q1, q2, q3, q4))
newdata <- subset(leadership, gender == "M" & age > , select = gender:q4)
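A runnable sketch of the two subset() calls. The numeric age cutoffs are missing from the notes, so the values 35, 24 and 25 below are assumptions chosen for illustration, as is the data frame itself.

```r
# Hypothetical data, used only for illustration.
leadership <- data.frame(
  gender = c("M", "F", "F", "M", "F"),
  age    = c(32, 45, 25, 39, 99),
  q1 = c(5, 3, 3, 3, 2), q2 = c(4, 5, 5, 3, 2),
  q3 = c(5, 2, 5, 4, 1), q4 = c(5, 5, 5, NA, 2)
)

# Rows where age is at least 35 or below 24 (assumed cutoffs); keep q1 to q4.
older_or_younger <- subset(leadership, age >= 35 | age < 24,
                           select = c(q1, q2, q3, q4))

# Men older than 25 (assumed cutoff); gender:q4 selects a range of columns.
men_over_25 <- subset(leadership, gender == "M" & age > 25,
                      select = gender:q4)
```

Note the double equals sign in gender == "M": a single = inside subset() would be read as a named argument, not a comparison.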
4.10.5 Random Sampling
sample(): draws a random sample of row indices
mysample <- df[sample(1:nrow(df), 3, replace = FALSE),]   # sampling without replacement
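A minimal sketch of drawing three rows without replacement. The data frame is made up, and set.seed() is added (it is not in the notes) only so that repeated runs give the same sample.

```r
# Fix the random number generator so the sample is reproducible.
set.seed(1234)

# Hypothetical data, used only for illustration.
df <- data.frame(id = 1:10, score = (1:10) * 10)

# Draw 3 distinct row indices, then use them to pick the rows.
mysample <- df[sample(1:nrow(df), 3, replace = FALSE), ]
```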
4.11 Manipulating data frames with SQL
library(sqldf)
newdf <- sqldf("select * from mtcars where carb=1 order by mpg", row.names = TRUE)
newdf <- sqldf("select avg(mpg) as avg_mpg, avg(disp) as avg_disp, gear from mtcars where cyl in (4, 6) group by gear")
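For comparison, here is a hedged base R sketch of what the two SQL queries compute. This is not how sqldf works internally; it swaps in subsetting, order() and aggregate() on the built-in mtcars data to make the logic of each query explicit.

```r
# Query 1: select * from mtcars where carb=1 order by mpg
q1 <- mtcars[mtcars$carb == 1, ]
q1 <- q1[order(q1$mpg), ]

# Query 2: average mpg and disp per gear, for cars with 4 or 6 cylinders
cars46 <- mtcars[mtcars$cyl %in% c(4, 6), ]
q2 <- aggregate(cars46[c("mpg", "disp")],
                by = list(gear = cars46$gear),
                FUN = mean)
names(q2)[names(q2) == "mpg"]  <- "avg_mpg"
names(q2)[names(q2) == "disp"] <- "avg_disp"
```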
R in Action reading notes (4): Basic data management