Summary:
(1)
. matches any character except a line feed
\w matches a letter, digit, or underscore
\s matches any whitespace character
\d matches a digit
\b matches the start or end of a word
^ matches the start of the string
$ matches the end of the string
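To make the table above concrete, here is a small Java sketch. The class name MetacharDemo, the helper findAll, and the sample text are made up for illustration; only the java.util.regex calls are standard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MetacharDemo {
    // Hypothetical helper: collects every match of regex in text, in order.
    static List<String> findAll(String regex, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "room 42, floor 7";
        System.out.println(findAll("\\d", text));            // [4, 2, 7]
        System.out.println(findAll("\\bfloor\\b", text));    // [floor]
        System.out.println(findAll("^\\w\\w\\w\\w", text));  // [room]
        System.out.println(findAll("\\s.$", text));          // [ 7]
    }
}
```

Note that each backslash is doubled because the regex sits inside a Java string literal.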
(2)
If you want to find a metacharacter itself, for example . or *, there is a problem: you cannot write it directly, because it will be interpreted with its special meaning.
In that case you have to use \ to remove the special meaning of the character. So you should write \. and \*. And of course, to find \ itself, you have to write \\.
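A short Java sketch of the escaping rule above. The class and method names are hypothetical. Keep in mind that Java adds a second layer: the regex \. is written "\\." in source code, and the regex \\ is written "\\\\".

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Regex \. (Java literal "\\.") looks for a literal dot.
    static boolean containsDot(String text) {
        return Pattern.compile("\\.").matcher(text).find();
    }

    // Regex \\ (Java literal "\\\\") looks for a literal backslash.
    static boolean containsBackslash(String text) {
        return Pattern.compile("\\\\").matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsDot("deerchao.net"));       // true
        System.out.println(containsDot("deerchao"));           // false
        System.out.println(containsBackslash("C:\\Windows"));  // true
        System.out.println("axb".matches("a.b"));              // true: unescaped . is a wildcard
        System.out.println("axb".matches("a\\.b"));            // false: escaped . is a literal dot
        System.out.println("a*b".matches("a\\*b"));            // true: escaped * is a literal star
    }
}
```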
(3)
* repeats zero or more times
+ repeats one or more times
? repeats zero times or once
{n} repeats exactly n times
{n,} repeats n or more times
{n,m} repeats n to m times
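The quantifiers above can be checked with a few whole-string matches. This is only a sketch; QuantifierDemo and fullMatch are names invented here, and Pattern.matches requires the entire input to match.

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: true only if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("ab*c", "ac"));         // true:  * allows zero b
        System.out.println(fullMatch("ab+c", "ac"));         // false: + needs at least one b
        System.out.println(fullMatch("ab?c", "abbc"));       // false: ? allows at most one b
        System.out.println(fullMatch("\\d{3}", "123"));      // true:  exactly three digits
        System.out.println(fullMatch("\\d{2,}", "1"));       // false: fewer than two digits
        System.out.println(fullMatch("\\d{2,4}", "12345"));  // false: more than four digits
    }
}
```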
(4)
\W matches any character that is not a letter, digit, or underscore
\S matches any character that is not whitespace
\D matches any character that is not a digit
\B matches a position that is not the start or end of a word
[^x] matches any character other than x
[^aeiou] matches any character other than the letters a, e, i, o, u
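One way to see the negated forms at work is to replace whatever they match. The sketch below uses String.replaceAll; the class name, method names, and sample strings are assumptions for illustration.

```java
class NegationDemo {
    // \D matches every non-digit, so removing those leaves only the digits.
    static String keepDigits(String s) {
        return s.replaceAll("\\D", "");
    }

    // [^aeiou] matches everything except these five letters.
    static String keepVowels(String s) {
        return s.replaceAll("[^aeiou]", "");
    }

    // \B matches positions inside a word, so this puts a dash between letters.
    static String dashInsideWords(String s) {
        return s.replaceAll("\\B", "-");
    }

    public static void main(String[] args) {
        System.out.println(keepDigits("tel: 010-1234"));     // 0101234
        System.out.println(keepVowels("hello world"));       // eoo
        System.out.println(dashInsideWords("cat"));          // c-a-t
        System.out.println("a_b c!".replaceAll("\\W", ""));  // a_bc
        System.out.println("a b".replaceAll("\\S", "x"));    // x x
    }
}
```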
(5)
(exp) matches exp and captures the text into an automatically numbered group
(?<name>exp) matches exp and captures the text into a group named name
(?:exp) matches exp without capturing the matched text
Zero-width assertions (position matching)
(?=exp) matches the position just before exp
(?<=exp) matches the position just after exp
(?!exp) matches a position that is not followed by exp
(?<!exp) matches a position that is not preceded by exp
Comments
(?#comment) has no effect on how the regular expression is processed; it only provides a comment for the reader
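Here is a rough Java sketch of a named group and three of the assertions from the list above. All class names, method names, and sample inputs are invented for the example; named groups need Java 7 or later.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Named group: (?<year>...) is read back with group("year").
    static String yearOf(String date) {
        Matcher m = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})").matcher(date);
        return m.find() ? m.group("year") : null;
    }

    // Lookahead (?=ing\b): the part of a word that sits just before a final "ing".
    static String stemBeforeIng(String text) {
        Matcher m = Pattern.compile("\\b\\w+(?=ing\\b)").matcher(text);
        return m.find() ? m.group() : null;
    }

    // Lookbehind (?<=\$): digits that come directly after a dollar sign.
    static String amountAfterDollar(String text) {
        Matcher m = Pattern.compile("(?<=\\$)\\d+").matcher(text);
        return m.find() ? m.group() : null;
    }

    // Negative lookahead q(?!u): a q that is not followed by u.
    static boolean hasQWithoutU(String text) {
        return Pattern.compile("q(?!u)").matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(yearOf("released 2008-03"));      // 2008
        System.out.println(stemBeforeIng("I am singing"));   // sing
        System.out.println(amountAfterDollar("cost: $42"));  // 42
        System.out.println(hasQWithoutU("Iraq"));            // true
        System.out.println(hasQWithoutU("queen"));           // false
    }
}
```

One caveat for Java readers: the Pattern documentation lists the embedded comment syntax (?#comment) among the Perl constructs it does not support. The nearest Java equivalent is the Pattern.COMMENTS flag, which allows # comments inside the pattern.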
(6)
*? repeats any number of times, but as few as possible
+? repeats one or more times, but as few as possible
?? repeats zero times or once, but as few as possible
{n,m}? repeats n to m times, but as few as possible
{n,}? repeats n or more times, but as few as possible
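The lazy forms are easiest to understand next to their greedy counterparts. A minimal sketch, with LazyDemo and firstMatch as made-up names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyDemo {
    // Hypothetical helper: returns the first match of regex in text, or null.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("a.*b", "aabab"));       // aabab (greedy)
        System.out.println(firstMatch("a.*?b", "aabab"));      // aab   (lazy)
        System.out.println(firstMatch("\\d+", "12345"));       // 12345
        System.out.println(firstMatch("\\d+?", "12345"));      // 1
        System.out.println(firstMatch("ab?", "abb"));          // ab
        System.out.println(firstMatch("ab??", "abb"));         // a
        System.out.println(firstMatch("\\d{2,4}", "12345"));   // 1234
        System.out.println(firstMatch("\\d{2,4}?", "12345"));  // 12
        System.out.println(firstMatch("\\d{2,}?", "12345"));   // 12
    }
}
```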
An example:
In Java, use a regular expression to extract the part of a string that comes before the first ASCII left parenthesis. For example, for Beijing(Haidian District)(Xicheng District) the extracted result is Beijing. The regular expression is ( )
A. ".*?(?=\\()"
B. ".*?(?=\()"
C. ".*(?=\\()"
D. ".*(?=\()"
Code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Staticstuff {
    public static void main(String[] args) {
        String text = "Beijing(Haidian)(Chaoyang District)(Xicheng area)";
        // Option A: lazy .*? followed by a lookahead for a literal "("
        Pattern pattern = Pattern.compile(".*?(?=\\()");
        Matcher matcher = pattern.matcher(text);
        if (matcher.find()) {
            System.out.println(matcher.group(0));
        }
    }
}
B and D do not compile: Invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ ). Inside a Java string literal the regex \( has to be written as \\(.
The result of running C is: Beijing(Haidian)(Chaoyang District)
The result of running A is: Beijing
The difference between A and C is:
The ? after .* makes the match non-greedy (lazy), so it stops at the shortest possible match.
(?=expression) is a lookahead; (?=\\() matches the position just before a left parenthesis.
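To check the claims about A and C side by side, something like the following can be run. It is a sketch: the class name, the helper firstMatch, and the sample texts (written without spaces before the parentheses) are assumptions made here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadCompare {
    // Hypothetical helper: returns the first match of regex in text, or null.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "Beijing(Haidian)(Chaoyang District)(Xicheng area)";
        // Option A: lazy, stops before the first "(".
        System.out.println(firstMatch(".*?(?=\\()", text));  // Beijing
        // Option C: greedy, runs up to the last "(".
        System.out.println(firstMatch(".*(?=\\()", text));   // Beijing(Haidian)(Chaoyang District)
        // No "(" anywhere: the lookahead never succeeds, so there is no match.
        System.out.println(firstMatch(".*?(?=\\()", "Beijing"));  // null
    }
}
```

If the text has a space before the parenthesis, as in "Beijing (Haidian)", option A returns "Beijing " with the trailing space, because the lookahead only excludes the parenthesis itself.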
1. What are greedy and non-greedy matching in regular expressions
For example: String str = "abcaxc";
Pattern p = Pattern.compile("ab.*c");
Greedy matching: a regular expression tends to match the longest possible text, which is called greedy matching. Matching str with the pattern p above gives abcaxc (ab.*c). Non-greedy matching: match as few characters as possible and stop as soon as a match is found. Matching str with the lazy form of the pattern gives abc (ab.*?c).
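The abcaxc example can be tried directly. In this sketch the class name and helper are hypothetical, and the patterns are written with .* so that "any characters" is really what the regex means; a bare b* would only mean "zero or more b".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Hypothetical helper: returns the first match of regex in text, or null.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String str = "abcaxc";
        System.out.println(firstMatch("ab.*c", str));   // abcaxc (greedy: up to the last c)
        System.out.println(firstMatch("ab.*?c", str));  // abc    (lazy: up to the first c)
        System.out.println(firstMatch("ab*c", str));    // abc    (b* is just zero or more b)
    }
}
```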
2. How to distinguish the two modes in programming
The default is greedy mode; adding a question mark directly after a quantifier makes it non-greedy.
Quantifiers: {m,n}: m to n times
*: any number of times
+: one or more times
?: zero times or once
In this question, . represents any character other than \n
* matches 0 to infinitely many times
+ matches 1 to infinitely many times
(?=expression) is a lookahead; (?=\\() matches the position just before a left parenthesis
Lazy mode regex:
src = ".*?(?=\\()"
Result: Beijing
The match ends as soon as it reaches the first left parenthesis and does not continue any further, because it is lazy.
Reprinted from: Http://deerchao.net/tutorials/regex/regex-1.htm