During development you will inevitably need to match, find, replace, and validate strings, and these tasks are sometimes quite complex. Solving them with hand-written code alone often wastes a programmer's time and energy, so learning and using regular expressions is the main way to resolve this tension.
One: What is a regular expression
1. Definition: A regular expression is a specification for pattern matching and substitution. It is a text pattern made up of ordinary characters (such as the letters a through z) and special characters (metacharacters), and it describes one or more strings to match when searching a body of text. The regular expression acts as a template that is matched against the string being searched.
2. Uses:
String matching (character matching)
String lookup
String substitution
String splitting
For example:
Extract email addresses from a web page
Check whether an IP address is valid
Extract links from a web page
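The four uses listed above can be sketched in a few lines. This is only a minimal illustration: the class name RegexUses, the helper names, and the sample strings are made up for this example, and the email and IP patterns are deliberately loose (the IP check does not verify the 0-255 range).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexUses {
    // Deliberately loose email pattern, for illustration only.
    private static final Pattern EMAIL = Pattern.compile("[\\w.-]+@[\\w.-]+\\.\\w+");

    // Lookup: collect every substring of the text that looks like an email address.
    static List<String> findEmails(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // Matching: four dot-separated groups of 1 to 3 digits (the 0-255 range is not checked).
    static boolean looksLikeIp(String s) {
        return s.matches("\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}");
    }

    public static void main(String[] args) {
        System.out.println(findEmails("Write to bob@example.com or amy@test.org today"));
        System.out.println(looksLikeIp("192.168.0.1"));
        System.out.println(looksLikeIp("192.168.0.aaa"));
        // Substitution: mask every digit.
        System.out.println("tel 555-0199".replaceAll("\\d", "*"));
        // Splitting: cut on commas with optional surrounding whitespace.
        System.out.println(Arrays.toString("a, b ,c".split("\\s*,\\s*")));
    }
}
```

Extracting links from a page would follow the same find() loop as findEmails, with a different pattern.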
3. Classes that handle regular expressions in Java:
java.lang.String
java.util.regex.Pattern: the pattern class. It represents the pattern that strings are matched against; because the pattern is compiled ahead of time, reusing it is much more efficient.
java.util.regex.Matcher: the matcher class. It holds the results of matching a pattern against a string, and there may be many results.
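To illustrate why a compiled Pattern pays off, here is a small sketch that compiles the pattern once and reuses it for many inputs. The class name PrecompiledPattern and the method countMatching are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrecompiledPattern {
    // Compiled once and reused; String.matches would compile the pattern again on every call.
    private static final Pattern THREE_LOWER = Pattern.compile("[a-z]{3}");

    // Counts how many candidates consist of exactly three lowercase letters.
    static int countMatching(String[] candidates) {
        int count = 0;
        for (String candidate : candidates) {
            Matcher m = THREE_LOWER.matcher(candidate);
            if (m.matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] candidates = {"abc", "abcd", "xyz", "a1c"};
        System.out.println(countMatching(candidates));
    }
}
```

For a one-off check, "abc".matches("[a-z]{3}") is shorter; the compiled form is worth it when the same pattern is applied repeatedly.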
4. The following small program introduces simple use of regular expressions:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        // matches() tests whether the whole string matches the expression; "." stands for any single character
        p("abc".matches("..."));
        // Replace every digit in "a2389a" with *; \d stands for a digit 0-9
        p("a2389a".replaceAll("\\d", "*"));
        // Compile a pattern for any string of three letters a-z; compiling ahead of time speeds up matching
        Pattern p = Pattern.compile("[a-z]{3}");
        // Run the match and keep the result in a Matcher object
        Matcher m = p.matcher("abc");
        p(m.matches());
        // The three lines above can be replaced by this single line
        p("abc".matches("[a-z]{3}"));
    }

    public static void p(Object o) {
        System.out.println(o);
    }
}
Here is the printed result:
true
a****a
true
true
The following experiments illustrate the matching rules of regular expressions. These are the greedy quantifiers:
.        any character
a?       a, once or not at all
a*       a, zero or more times
a+       a, one or more times
a{n}     a, exactly n times
a{n,}    a, at least n times
a{n,m}   a, at least n times but not more than m times
// A first look at . * + ?
p("a".matches("."));      // true
p("aa".matches("aa"));    // true
p("aaaa".matches("a*"));  // true
p("aaaa".matches("a+"));  // true
p("".matches("a*"));      // true
p("aaaa".matches("a?"));  // false
p("".matches("a?"));      // true
p("a".matches("a"));      // true
p("1232435463685899".matches("\\d{3,100}"));  // true
p("192.168.0.aaa".matches("\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}"));  // false
p("192".matches("[0-2][0-9][0-9]"));  // true
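Because matches() only reports true or false for the whole string, it does not show how much each greedy quantifier grabs. The sketch below uses find() to print the first substring each quantifier takes from "aaaa"; the class name GreedyQuantifiers and the helper firstMatch are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyQuantifiers {
    // Returns the first substring of input that the regex matches, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "aaaa";
        String[] quantifiers = {"a?", "a*", "a+", "a{2}", "a{2,}", "a{2,3}"};
        for (String regex : quantifiers) {
            // Each greedy quantifier takes as many characters as its upper bound allows.
            System.out.println(regex + " -> " + firstMatch(regex, input));
        }
    }
}
```

On "aaaa" this prints a, aaaa, aaaa, aa, aaaa, and aaa in that order: each quantifier consumes as much as it is allowed to.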
[abc]            a, b, or c (simple class)
[^abc]           any character except a, b, or c (negation)
[a-zA-Z]         a through z or A through Z, inclusive (range)
[a-d[m-p]]       a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]     d, e, or f (intersection)
[a-z&&[^bc]]     a through z, except b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]    a through z, but not m through p: [a-lq-z] (subtraction)
// Ranges
p("a".matches("[abc]"));         // true
p("a".matches("[^abc]"));        // false
p("A".matches("[a-zA-Z]"));      // true
p("A".matches("[a-z]|[A-Z]"));   // true
p("A".matches("[a-z[A-Z]]"));    // true
p("R".matches("[A-Z&&[RFG]]"));  // true
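To see union, intersection, and subtraction at work on more than one character, the following sketch filters a string through a character class. The class name CharacterClasses, the helper keepOnly, and the sample text are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharacterClasses {
    // Keeps only the characters of input that belong to the given character class.
    static String keepOnly(String charClass, String input) {
        StringBuilder kept = new StringBuilder();
        Matcher m = Pattern.compile(charClass).matcher(input);
        while (m.find()) {
            kept.append(m.group());
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        String text = "hello world";
        System.out.println(keepOnly("[a-d[m-p]]", text));     // union
        System.out.println(keepOnly("[a-z&&[def]]", text));   // intersection
        System.out.println(keepOnly("[a-z&&[^m-p]]", text));  // subtraction
    }
}
```

For "hello world" the three calls yield ood, ed, and hellwrld.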
\d    a digit: [0-9]
\D    a non-digit: [^0-9]
\s    a whitespace character: [ \t\n\x0B\f\r]
\S    a non-whitespace character: [^\s]
\w    a word character: [a-zA-Z_0-9]
\W    a non-word character: [^\w]
// Getting to know \s \w \d \
p(" \n\r\t".matches("\\s{4}"));  // true
p(" ".matches("\\S"));           // false
p("a_8".matches("\\w{3}"));      // true
p("abc888&^%".matches("[a-z]{1,3}\\d+[&^#%]+"));  // true
p("\\".matches("\\\\"));         // true
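The predefined classes are most useful inside replaceAll and split. Here is a small sketch; the class name PredefinedClasses, its helper names, and the sample strings are invented for illustration. Note that each regex backslash is written twice in a Java string literal, so \D becomes "\\D".

```java
import java.util.Arrays;

class PredefinedClasses {
    // \D matches any non-digit, so deleting every \D leaves only the digits.
    static String digitsOnly(String s) {
        return s.replaceAll("\\D", "");
    }

    // \s+ matches a run of whitespace (spaces, tabs, newlines).
    static String[] words(String line) {
        return line.trim().split("\\s+");
    }

    // \w+ matches one or more letters, digits, or underscores.
    static boolean isWordChars(String s) {
        return s.matches("\\w+");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("Tel: (555) 010-9999"));
        System.out.println(Arrays.toString(words("  a_8   b2\tc ")));
        System.out.println(isWordChars("a_8"));
        System.out.println(isWordChars("a-8"));
    }
}
```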
Boundary matchers
^     the beginning of a line
$     the end of a line
\b    a word boundary
\B    a non-word boundary
\A    the beginning of the input
\G    the end of the previous match
\Z    the end of the input, except for the final terminator (if any)
\z    the end of the input
// Boundary matching
p("hello sir".matches("^h.*"));                 // true
p("hello sir".matches(".*ir$"));                // true
p("hello sir".matches("^h[a-z]{1,3}o\\b.*"));   // true
p("hellosir".matches("^h[a-z]{1,3}o\\b.*"));    // false
// Blank line: zero or more whitespace characters other than newline at the start, ending with a newline
p(" \n".matches("^[\\s&&[^\\n]]*\\n$"));        // true
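Two boundary matchers that the experiments above do not exercise are \Z and \z, and \b is also handy for whole-word searches. The sketch below is only an illustration; the class name BoundaryMatchers and the helpers finds and containsWord are made up here.

```java
import java.util.regex.Pattern;

class BoundaryMatchers {
    // True if the regex matches anywhere in the input.
    static boolean finds(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // True if word occurs in text as a whole word; Pattern.quote treats word as literal text.
    static boolean containsWord(String text, String word) {
        return finds("\\b" + Pattern.quote(word) + "\\b", text);
    }

    public static void main(String[] args) {
        System.out.println(containsWord("hello sir", "sir"));  // boundary before "sir"
        System.out.println(containsWord("hellosir", "sir"));   // no boundary between "o" and "s"
        // \Z tolerates one final line terminator, \z does not.
        System.out.println(finds("sir\\Z", "hello sir\n"));
        System.out.println(finds("sir\\z", "hello sir\n"));
    }
}
```

The four lines print true, false, true, false.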
Method summary
matches(): matches against the entire string
find(): finds the next matching substring
lookingAt(): always starts matching at the beginning of the string, but does not need to match all of it
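The difference between the three methods is easiest to see when each one runs on its own fresh Matcher, so that no state carries over from one call to the next. This is a minimal sketch; the class name MatchMethods and the helper compare are hypothetical.

```java
import java.util.regex.Pattern;

class MatchMethods {
    // Runs the three methods on separate Matcher objects and reports each result.
    static String compare(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);
        // A fresh Matcher per call keeps the three results independent of each other.
        boolean whole = pattern.matcher(input).matches();
        boolean prefix = pattern.matcher(input).lookingAt();
        boolean anywhere = pattern.matcher(input).find();
        return "matches=" + whole + " lookingAt=" + prefix + " find=" + anywhere;
    }

    public static void main(String[] args) {
        System.out.println(compare("\\d{3,5}", "123-34345"));
        System.out.println(compare("\\d{3,5}", "ab-34345"));
        System.out.println(compare("\\d{3,5}", "12345"));
    }
}
```

For "123-34345" only matches() fails, for "ab-34345" only find() succeeds, and for "12345" all three succeed.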
// Email
p("asdsfdfagf@adsdsfd.com".matches("[\\w[.-]]+@[\\w[.-]]+\\.[\\w]+"));  // true
// matches(), find(), lookingAt()
Pattern p = Pattern.compile("\\d{3,5}");
Matcher m = p.matcher("123-34345-234-00");
// matches() runs the regex engine against the entire "123-34345-234-00"; it stops at the first "-", which does not match,
// but it does not give back the characters it has already consumed
p(m.matches());
// reset() gives back the consumed characters
m.reset();
// 1: If p(m.matches()) runs first without m.reset(), find() starts searching from "...34345-234-00":
//    the 1st and 2nd calls find "34345" and "234", and the last 2 calls find nothing and return false
// 2: If both p(m.matches()) and m.reset() run first, find() starts searching from "123-34345-234-00":
//    the results are true, true, true, false
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
// When find() has found nothing, start() and end() throw java.lang.IllegalStateException
// p(m.start() + "---" + m.end());
p(m.lookingAt());
p(m.lookingAt());
p(m.lookingAt());
p(m.lookingAt());
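A common way to avoid the IllegalStateException is to call start() and end() only inside a while (m.find()) loop. The sketch below collects every match position that way; the class name FindSpans and the helper spans are invented for this example, and a fresh Matcher is used so that no earlier matches() call affects the search.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindSpans {
    // Returns "start---end" for every match of regex in input, in order.
    static List<String> spans(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        // start() and end() are only called after find() returned true,
        // so they cannot throw IllegalStateException here.
        while (m.find()) {
            result.add(m.start() + "---" + m.end());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(spans("\\d{3,5}", "123-34345-234-00"));
    }
}
```

This prints [0---3, 4---9, 10---13]; the trailing "00" has only two digits and is not matched.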
String substitution: the following approach is very flexible.
// String substitution
// Pattern.CASE_INSENSITIVE makes matching ignore case
Pattern p = Pattern.compile("java", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("Java Java Java ilovejava Youhatejava adsdsfd");
// Buffer that accumulates the result
StringBuffer buf = new StringBuffer();
// Counts the matches to tell odd from even
int i = 0;
while (m.find()) {
    i++;
    if (i % 2 == 0) {
        m.appendReplacement(buf, "Java");
    } else {
        m.appendReplacement(buf, "JAVA");
    }
}
// Without this call, the trailing " adsdsfd" would be dropped
m.appendTail(buf);
p(buf);
Printed result:
JAVA Java JAVA iloveJava YouhateJAVA adsdsfd
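The same appendReplacement / appendTail loop can be wrapped in a reusable method. The following is only a sketch under assumptions: the class name AlternatingReplace and the method alternateCase are made up, and the odd/even rule (upper case for odd matches, lower case for even ones) is just one possible choice. Pattern.quote and Matcher.quoteReplacement are used so that special characters in the word are treated literally.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternatingReplace {
    // Replaces odd-numbered occurrences of word with upper case and even-numbered ones
    // with lower case, ignoring the case of the occurrences in the input.
    static String alternateCase(String input, String word) {
        Pattern p = Pattern.compile(Pattern.quote(word), Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(input);
        StringBuffer buf = new StringBuffer();
        int i = 0;
        while (m.find()) {
            i++;
            String replacement = (i % 2 == 0)
                    ? word.toLowerCase(Locale.ROOT)
                    : word.toUpperCase(Locale.ROOT);
            // quoteReplacement keeps "$" and "\" in the replacement from being interpreted.
            m.appendReplacement(buf, Matcher.quoteReplacement(replacement));
        }
        // Copy whatever follows the last match.
        m.appendTail(buf);
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(alternateCase("java Java JAVA IloveJava adsdsfd", "java"));
    }
}
```

For the sample input this prints JAVA java JAVA Ilovejava adsdsfd.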
Groups
// Groups are delimited with ()
Pattern p = Pattern.compile("(\\d{3,5})([a-z]{2})");
String s = "123aa-34345bb-234cc-00";
Matcher m = p.matcher(s);
p(m.groupCount());  // 2 groups
while (m.find()) {
    p(m.group());   // digits and letters together
    p(m.group(1));  // digits only
    p(m.group(2));  // letters only
}
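Groups become more useful once the captured pieces are collected instead of printed. Here is a minimal sketch that reuses the pattern above and returns each match as a "digits:letters" pair; the class name GroupPairs, the method pairs, and the ":" separator are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupPairs {
    // Group 1 captures 3 to 5 digits, group 2 captures the 2 letters that follow.
    private static final Pattern PAIR = Pattern.compile("(\\d{3,5})([a-z]{2})");

    // Returns one "digits:letters" entry per match found in input.
    static List<String> pairs(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            result.add(m.group(1) + ":" + m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(pairs("123aa-34345bb-234cc-00"));
    }
}
```

This prints [123:aa, 34345:bb, 234:cc]; the trailing "00" matches neither group and is skipped.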
Two: Simple use of regular expressions
Java Regular Expressions use the