Regular expressions in R are mainly used to process text data, for tasks such as searching and replacing.
First, here are some functions that are commonly used when working with text:
Split a string: strsplit()
Concatenate strings: paste(), paste0()
String length: nchar() counts the characters in each string; length() counts the elements of the vector, not the characters
Extract a substring: substr(), substring()
Test for a match: grep(p, x) returns the indices (positions) of the matching elements, where p is a regular expression
grepl(p, x) returns a logical value, TRUE or FALSE, for each element
For example:
s <- c("123abc\\456", "abc123edf")
grep("123", s)   # indices of the elements that match "123"
grepl("xcd", s)  # whether each element matches "xcd"; returns logical values
> grep("123", s)
[1] 1 2
> grepl("xcd", s)
[1] FALSE FALSE
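The other helpers in the list above can be sketched the same way. This is only an illustration; the sample string and the variable names are made up for this example and do not come from the original post.

```r
# Hypothetical sample string, chosen only for illustration
x <- "alpha,beta,gamma"

parts  <- strsplit(x, ",")[[1]]         # strsplit() returns a list; take its first element
joined <- paste(parts, collapse = "-")  # join the pieces back together with "-"
pair   <- paste0("id", 1:2)             # paste0() concatenates with no separator

n_chars <- nchar(x)        # number of characters in the string
n_elems <- length(x)       # number of elements in the vector, here 1
first5  <- substr(x, 1, 5) # characters 1 through 5
```

Note the difference between n_chars and n_elems: nchar() looks inside the string, while length() only counts vector elements.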
Substitution: sub(p, replacement, x) replaces only the first match in each string
gsub(p, replacement, x) replaces every match
For example:
s1 <- c("123edf123")
sub("123", "sss", s1)   # replace the first "123" in s1 with "sss"
gsub("123", "sss", s1)  # replace every "123" that is found
> sub("123", "sss", s1)
[1] "sssedf123"
> gsub("123", "sss", s1)
[1] "sssedfsss"
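Both functions are vectorized, so "first match" means the first match within each element, not within the whole vector. A small sketch, using a made-up vector v:

```r
# Hypothetical input vector, for illustration only
v <- c("a-b-c", "no dashes", "x-y")

first_only  <- sub("-", "+", v)   # replaces the first "-" in each element
all_matches <- gsub("-", "+", v)  # replaces every "-" in each element
```

Elements with no match, such as "no dashes", are returned unchanged by both functions.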
____________________________________________________________________
The rest of this post describes regular expression syntax in R.
Each construct is illustrated directly with an example:
1. \d: matches any single digit
S3 <- c("123abc\\456", "abc123\\def123", "")
grepl("\\d", S3)  # does each element contain a digit?
[1]  TRUE  TRUE FALSE
# Equivalent to: grepl("[0-9]", S3)
2. \D: matches any single non-digit character
grepl("\\D", S3)  # does each element contain a non-digit?
# Equivalent to: grepl("[^0-9]", S3); inside brackets, a leading ^ negates the class
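As a rough sketch of how \d and \D complement each other, the example below tests a made-up vector (codes is a hypothetical name) and then uses \D with gsub() to keep only the digits:

```r
# Hypothetical input, for illustration only
codes <- c("abc123", "2024", "none")

has_digit     <- grepl("\\d", codes)    # at least one digit somewhere
has_non_digit <- grepl("\\D", codes)    # at least one non-digit somewhere
digits_only   <- gsub("\\D", "", codes) # delete every non-digit character
```

An element such as "abc123" is TRUE for both tests, because grepl() only asks whether a match exists anywhere in the string.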
3. \w: matches any single digit, letter, or underscore
grepl("\\w", S3)  # does each element contain a digit, letter, or underscore?
# For ASCII text, equivalent to: grepl("[a-zA-Z0-9_]", S3)
4. \W: matches any single character that is not a digit, letter, or underscore
grepl("\\W", S3)
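A minimal sketch of \w versus \W on a made-up vector (words is a hypothetical name). It assumes plain ASCII input; behavior for accented letters can depend on the locale.

```r
# Hypothetical input, for illustration only
words <- c("snake_case", "two words", "!!!")

has_word_char <- grepl("\\w", words)    # any letter, digit, or underscore?
has_non_word  <- grepl("\\W", words)    # any other character (space, punctuation)?
stripped      <- gsub("\\W", "", words) # drop everything that is not a word character
```

The underscore counts as a word character, which is why "snake_case" contains no \W match.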
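5. \\: matches a literal backslash (the backslash escapes itself)
grepl("\\\\", S3)  # does each element contain a backslash? The regex \\ is written "\\\\" as an R string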
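The four backslashes come from two layers of escaping: R string syntax halves them once, and the regex engine halves them again. The sketch below, using a made-up path value, shows this, along with fixed = TRUE as one way to skip the regex layer:

```r
# Hypothetical string; in memory it contains exactly ONE backslash
path <- "C:\\temp"

n <- nchar(path)  # counts C, :, \, t, e, m, p

has_backslash       <- grepl("\\\\", path)             # regex \\ matches one literal backslash
has_backslash_fixed <- grepl("\\", path, fixed = TRUE) # fixed = TRUE: pattern is a literal string

# The same idea escapes other special characters, such as the dot
has_dot <- grepl("\\.", c("file.txt", "filetxt"))
```

Printing the string with cat(path) is a quick way to see what it really contains.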
6. .: matches any single character
grepl(".", S3)
7. |: alternation (or)
grepl("56|ab", s)  # does each element contain "56" or "ab"?
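A short sketch of the dot and of alternation, on a made-up vector named animals:

```r
# Hypothetical input, for illustration only
animals <- c("cat", "cot", "ct", "dog")

dot_match   <- grepl("c.t", animals)     # "c", then exactly one character, then "t"
alternation <- grepl("cat|dog", animals) # contains "cat" or contains "dog"
```

The dot needs a character to consume, so "ct" does not match "c.t".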
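8. ^: anchors the match at the start of the string
grepl("^a", S3)  # does each element start with "a"?
9. $: anchors the match at the end of the string
grepl("6$", S3)  # does each element end with "6"?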
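Combining both anchors forces the pattern to cover the whole string. A minimal sketch with made-up file names (files is a hypothetical vector):

```r
# Hypothetical file names, for illustration only
files <- c("report.csv", "csv_notes.txt", "data.csv.bak")

starts_with_csv <- grepl("^csv", files)           # "csv" at the very beginning
ends_with_csv   <- grepl("csv$", files)           # "csv" at the very end
exact_match     <- grepl("^report\\.csv$", files) # the entire string is "report.csv"
```

Without the anchors, "csv" would match all three names, since each contains it somewhere.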
10. (): groups part of a pattern
grepl("abc(.+)456", S3)  # is there at least one (+) arbitrary character (.) between "abc" and "456"?
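Parentheses also control what a quantifier or an alternation applies to. The sketch below uses made-up inputs to compare a grouped and an ungrouped pattern:

```r
# Quantifier applied to a group versus to a single character
grouped   <- grepl("^(ab)+$", c("ab", "abab", "abbb")) # one or more repetitions of "ab"
ungrouped <- grepl("^ab+$",   c("ab", "abab", "abbb")) # "a" followed by one or more "b"

# Alternation limited to the group
spelling <- grepl("gr(a|e)y", c("gray", "grey", "gry"))
```

In the ungrouped pattern the + binds only to the preceding "b", which is why "abab" no longer matches.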
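11. []: character class, matches any one of the listed characters
grepl("[123,456]", S3)  # does each element contain any one of 1, 2, 3, 4, 5, 6 or a comma? Not the same as grepl("123|456", S3), which looks for the whole sequences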
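The difference between a character class and an alternation is easy to check on a small made-up vector (samples is a hypothetical name):

```r
# Hypothetical input, for illustration only
samples <- c("1", "123", "4,5", "abc")

in_class    <- grepl("[123,456]", samples) # any single character from the set
alternation <- grepl("123|456", samples)   # the full sequence "123" or "456"
negated     <- grepl("[^0-9]", samples)    # any character that is not a digit
```

Inside the brackets the comma is just another member of the set; it does not separate alternatives.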
12. {}: repetition count
grepl("[def]{2}", S3)  # do two consecutive characters from the set d, e, f appear?
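A small sketch of counted repetition, on a made-up vector named letters_in. The {n,m} range form is standard regex syntax that the text above does not cover, so treat it as an extra.

```r
# Hypothetical input, for illustration only
letters_in <- c("d", "de", "fed", "dxe")

two_in_a_row <- grepl("[def]{2}", letters_in)      # two adjacent characters from the set
exactly_two  <- grepl("^[def]{2}$", letters_in)    # the whole string is exactly two such characters
two_or_three <- grepl("^[def]{2,3}$", letters_in)  # {n,m}: between two and three of them
```

"dxe" fails the first test because its d and e are not adjacent.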
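13. *: zero or more repetitions
grepl(".*", S3)  # any character zero or more times; always TRUE, even for the empty string
14. +: one or more repetitions
grepl(".+", S3)  # any character at least once; FALSE for the empty string
15. ?: zero or one repetition
grepl("[456]?", S3)  # TRUE whether the class matches zero times or once, so always TRUE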
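The three quantifiers differ mainly in how they treat "nothing". A minimal sketch on made-up inputs, including an empty string and a single space:

```r
# Hypothetical input: an empty string, a single space, and ordinary text
inputs <- c("", " ", "abc")

star <- grepl(".*", inputs)  # zero or more characters: matches everything
plus <- grepl(".+", inputs)  # one or more characters: fails only on the empty string

# ? makes the preceding item optional
optional <- grepl("^colou?r$", c("color", "colour", "colouur"))
```

A space is an ordinary character for the dot, so " " matches ".+" while "" does not.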