1. Definition: A regular expression is a text pattern used for matching and substituting strings. It consists of ordinary characters (such as the letters a through z) and special characters (metacharacters), and it describes one or more strings to match when searching a body of text. The regular expression acts as a template that the searched string is compared against.
2. Uses:
- String matching (character matching)
- String lookup (searching)
- String substitution
- String splitting
For example:
- Pulling email addresses out of a webpage
- Checking whether an IP address is valid
- Pulling links out of a webpage
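The uses listed above can be sketched in a few lines of Java. This is only an illustration: the class name RegexUses, the helper methods, and both patterns are assumptions made for this example, and the patterns are deliberately simple rather than complete validators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexUses {
    // Hypothetical, simplified patterns for illustration only.
    static final Pattern EMAIL = Pattern.compile("[\\w.-]+@[\\w.-]+\\.\\w+");
    static final Pattern IPV4 = Pattern.compile(
            "(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)(\\.(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)){3}");

    // Lookup: collect every substring that looks like an email address.
    static List<String> findEmails(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // Match: the whole string must be four dot-separated parts, each in 0-255.
    static boolean isIpv4(String s) {
        return IPV4.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(findEmails("Contact alice@example.com or bob@example.org."));
        System.out.println(isIpv4("192.168.0.1"));
        System.out.println(isIpv4("192.168.0.256"));
        // Substitution: replace each run of digits with "#".
        System.out.println("a2389a".replaceAll("\\d+", "#"));
        // Splitting: cut the string at commas with optional surrounding spaces.
        System.out.println(Arrays.toString("a , b,c".split("\\s*,\\s*")));
    }
}
```

Compiling each pattern once into a static field follows the same idea as the Pattern class described in the next section: the compiled pattern can be reused for many inputs.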
3. Classes for working with regular expressions in Java:
- java.lang.String
- java.util.regex.Pattern: the pattern class. It represents the pattern that strings are matched against. The pattern is compiled in advance, which makes repeated matching much more efficient.
- java.util.regex.Matcher: the matcher class. It holds the result of matching a pattern against a string, and one string can produce many results.
4. A small program that briefly introduces regular expressions:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        // matches() checks whether the whole string matches the expression; "." stands for any one character
        p("abc".matches("..."));
        // Replace every digit in "a2389a" with "*"; \d stands for a digit 0-9
        p("a2389a".replaceAll("\\d", "*"));
        // Compile the pattern "three letters from a to z" in advance, which speeds up matching
        Pattern p = Pattern.compile("[a-z]{3}");
        // Match, and keep the result in a Matcher object
        Matcher m = p.matcher("abc");
        p(m.matches());
        // The three lines above can be replaced by the following single line
        p("abc".matches("[a-z]{3}"));
    }

    public static void p(Object o) {
        System.out.println(o);
    }
}
Here is the printed output:
true
a****a
true
true
The following experiments illustrate the matching rules of regular expressions. The quantifiers below are the greedy ones:
.       any character
a?      a, zero or one time
a*      a, zero or more times
a+      a, one or more times
a{n}    a, exactly n times
a{n,}   a, at least n times
a{n,m}  a, at least n times, but not more than m times
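To make "greedy" concrete, the sketch below compares each greedy quantifier with its reluctant form, which java.util.regex writes by appending ? to the quantifier. The class name GreedyVsReluctant and the helper firstMatch are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    // Returns the first substring of input matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "aaaa";
        System.out.println(firstMatch("a+", input));      // aaaa (greedy: takes as many as possible)
        System.out.println(firstMatch("a+?", input));     // a    (reluctant: takes as few as possible)
        System.out.println(firstMatch("a{2,3}", input));  // aaa
        System.out.println(firstMatch("a{2,3}?", input)); // aa
    }
}
```

With matches() the difference disappears, because the whole string has to be consumed either way; it only shows up when searching for substrings with find().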
// A first look at . * + ?
p("a".matches("."));       // true
p("aa".matches("aa"));     // true
p("aaaa".matches("a*"));   // true
p("aaaa".matches("a+"));   // true
p("".matches("a*"));       // true
p("aaaa".matches("a?"));   // false
p("".matches("a?"));       // true
p("a".matches("a?"));      // true
p("1232435463685899".matches("\\d{3,100}"));  // true
p("192.168.0.aaa".matches("\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}"));  // false
p("192".matches("[0-2][0-9][0-9]"));  // true
[abc]           a, b, or c (simple class)
[^abc]          any character except a, b, or c (negation)
[a-zA-Z]        a through z or A through Z, inclusive (range)
[a-d[m-p]]      a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]    d, e, or f (intersection)
[a-z&&[^bc]]    a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]   a through z, but not m through p: [a-lq-z] (subtraction)
// Ranges
p("a".matches("[abc]"));          // true
p("a".matches("[^abc]"));         // false
p("A".matches("[a-zA-Z]"));       // true
p("A".matches("[a-z]|[A-Z]"));    // true
p("A".matches("[a-z[A-Z]]"));     // true
p("R".matches("[A-Z&&[RFG]]"));   // true
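Subtraction is the least obvious of the class forms above, so here is one more small sketch. The class CharClassDemo and the method allConsonants are hypothetical names chosen for this example; for simplicity, "consonant" here means any lowercase letter that is not one of a, e, i, o, u.

```java
class CharClassDemo {
    // True if s is non-empty and consists only of lowercase letters other than a, e, i, o, u.
    static boolean allConsonants(String s) {
        return s.matches("[a-z&&[^aeiou]]+");
    }

    public static void main(String[] args) {
        System.out.println(allConsonants("rhythm")); // true
        System.out.println(allConsonants("regex"));  // false: contains e
        System.out.println(allConsonants(""));       // false: + needs at least one character
    }
}
```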
\d  a digit: [0-9]
\D  a non-digit: [^0-9]
\s  a whitespace character: [ \t\n\x0B\f\r]
\S  a non-whitespace character: [^\s]
\w  a word character: [a-zA-Z_0-9]
\W  a non-word character: [^\w]
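// Getting to know \s \w \d
p(" \n\r\t".matches("\\s{4}"));   // true: four whitespace characters
p(" ".matches("\\S"));            // false
p("a_8".matches("\\w{3}"));       // true
// Note: the count must be written in braces. "\\s(4)" means one whitespace
// character followed by a literal 4, so it would return false here.
p("abc888&^%".matches("[a-z]{1,3}\\d+[&^#%]+"));  // true
p("\\".matches("\\\\"));          // true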
Boundary matchers
^   the beginning of a line
$   the end of a line
\b  a word boundary
\B  a non-word boundary
\A  the beginning of the input
\G  the end of the previous match
\Z  the end of the input, except for the final terminator (if any)
\z  the end of the input
// Boundary matching
p("hello sir".matches("^h.*"));                  // true
p("hello sir".matches(".*ir$"));                 // true
p("hello sir".matches("^h[a-z]{1,3}o\\b.*"));    // true
p("hellosir".matches("^h[a-z]{1,3}o\\b.*"));     // false
// Blank line: zero or more whitespace characters other than newline, followed by a newline
p(" \n".matches("^[\\s&&[^\\n]]*\\n$"));         // true
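Word boundaries are most useful when searching inside a longer text. The sketch below counts whole-word occurrences only; WordBoundaryDemo and countWord are hypothetical names for this example. Pattern.quote is used so that the word is treated as literal text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // Counts occurrences of word in text that are delimited by word boundaries.
    static int countWord(String word, String text) {
        Matcher m = Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // The "sir" inside "hellosir" is not counted: there is no boundary between o and s.
        System.out.println(countWord("sir", "hello sir, yes sir, hellosir")); // 2
    }
}
```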
Method overview
matches(): matches against the entire string
find(): finds the next matching substring
lookingAt(): always matches from the beginning of the string, but does not need to consume all of it
// Email (user@example.com is a placeholder address)
p("user@example.com".matches("[\\w[.-]]+@[\\w[.-]]+\\.[\\w]+"));  // true

// matches(), find(), lookingAt()
Pattern p = Pattern.compile("\\d{3,5}");
Matcher m = p.matcher("123-34345-234-00");
// matches() tries to match the entire "123-34345-234-00" and stops at the first "-",
// where the match fails, but it does not give back the characters it has consumed
p(m.matches());
// reset() gives the consumed characters back
m.reset();
// 1: With only p(m.matches()) in front, find() starts searching from "...34345-234-00",
//    so the 1st and 2nd calls find "34345" and "234", and the next two return false
// 2: With p(m.matches()) and m.reset() in front, find() starts searching from "123-34345-234-00",
//    so the results are true, true, true, false
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
// If nothing was found, start() and end() throw java.lang.IllegalStateException
// p(m.start() + "---" + m.end());
p(m.lookingAt());
p(m.lookingAt());
p(m.lookingAt());
p(m.lookingAt());
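The three methods can also be compared in a self-contained program. This is a minimal sketch with made-up names (MatchMethodsDemo, findAll); it gives find() a fresh Matcher, so the result does not depend on what an earlier matches() call consumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchMethodsDemo {
    // Returns the "start-end" positions of every substring that find() locates.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> spans = new ArrayList<>();
        Matcher m = pattern.matcher(input); // fresh matcher: the search starts at index 0
        while (m.find()) {
            spans.add(m.start() + "-" + m.end());
        }
        return spans;
    }

    public static void main(String[] args) {
        Pattern digits = Pattern.compile("\\d{3,5}");
        String input = "123-34345-234-00";
        Matcher m = digits.matcher(input);
        System.out.println(m.matches());            // false: the whole string is not 3 to 5 digits
        System.out.println(m.lookingAt());          // true: the prefix "123" matches
        System.out.println(findAll(digits, input)); // [0-3, 4-9, 10-13]
    }
}
```

The final "00" is not reported because it has only two digits.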
String substitution: the following approach makes string substitution very flexible
// String substitution
// Pattern.CASE_INSENSITIVE makes the match case-insensitive
Pattern p = Pattern.compile("java", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("java Java JAVA ilovejava youhateJava adsdsfd");
// Buffer that collects the resulting string
StringBuffer buf = new StringBuffer();
// Counter used to tell odd matches from even ones
int i = 0;
while (m.find()) {
    i++;
    if (i % 2 == 0) {
        m.appendReplacement(buf, "java");
    } else {
        m.appendReplacement(buf, "JAVA");
    }
}
// Without this call, the trailing text " adsdsfd" would be lost
m.appendTail(buf);
p(buf);
Printed result:
JAVA java JAVA ilovejava youhateJAVA adsdsfd
Grouping
// Groups are formed with ()
Pattern p = Pattern.compile("(\\d{3,5})([a-z]{2})");
String s = "123aa-34345bb-234cc-00";
Matcher m = p.matcher(s);
p(m.groupCount());  // 2 groups
while (m.find()) {
    p(m.group());       // digits and letters
    // p(m.group(1));   // digits only
    // p(m.group(2));   // letters only
}
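Groups can be read one by one, and they can also be referenced in a replacement string as $1, $2, and so on. The sketch below reuses the pattern from the grouping example; the class GroupDemo and the methods pairs and swap are illustrative names, not part of the original program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    static final Pattern ITEM = Pattern.compile("(\\d{3,5})([a-z]{2})");

    // Returns "digits:letters" for every match, built from group(1) and group(2).
    static List<String> pairs(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = ITEM.matcher(s);
        while (m.find()) {
            out.add(m.group(1) + ":" + m.group(2));
        }
        return out;
    }

    // Swaps the two groups in every match by referring to them as $2 and $1.
    static String swap(String s) {
        return ITEM.matcher(s).replaceAll("$2$1");
    }

    public static void main(String[] args) {
        String s = "123aa-34345bb-234cc-00";
        System.out.println(pairs(s)); // [123:aa, 34345:bb, 234:cc]
        System.out.println(swap(s));  // aa123-bb34345-cc234-00
    }
}
```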