Recently I have needed regular expressions in many places, but I have never been very familiar with them. Today I am writing down what I know about regular expressions, partly to summarize what I have learned, and partly so that I can review it quickly when I forget it later.
In Java, to take advantage of regular expressions, you must first understand two classes. These are the two underlying classes:
One, Pattern
API Introduction:
A compiled representation of a regular expression.
A regular expression, specified as a string, must first be compiled into an instance of this class. The resulting pattern can then be used to create a Matcher object that can match arbitrary character sequences against the regular expression. All of the state involved in performing a match resides in the matcher, so many matchers can share the same pattern.
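As a small sketch of the point that many matchers can share one pattern, the example below compiles a pattern once and creates a separate Matcher for each input. The class name SharedPatternDemo, the constant THREE_LOWER, the helper isThreeLower and the sample inputs are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SharedPatternDemo {
    // Compiled once; a Pattern is immutable and can be reused freely.
    static final Pattern THREE_LOWER = Pattern.compile("[a-z]{3}");

    // Each call creates its own Matcher, which holds the state of that one match.
    static boolean isThreeLower(String input) {
        Matcher m = THREE_LOWER.matcher(input);
        return m.matches();
    }

    public static void main(String[] args) {
        System.out.println(isThreeLower("asd"));  // true
        System.out.println(isThreeLower("ASD"));  // false: upper case is outside [a-z]
        System.out.println(isThreeLower("asdf")); // false: four characters, not three
    }
}
```

Compiling is the expensive step, so keeping the Pattern in a constant and creating a cheap Matcher per input is usually the better choice when the same expression is used repeatedly.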
Two, Matcher
API Introduction:
A matcher is created from a pattern by invoking the pattern's matcher method. Once created, a matcher can be used to perform three different kinds of match operations:
The matches method attempts to match the entire input sequence against the pattern.
The lookingAt method attempts to match the input sequence, starting at the beginning, against the pattern.
The find method scans the input sequence looking for the next subsequence that matches the pattern.
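To make the difference between the three match operations concrete, here is a minimal sketch that runs matches, lookingAt and find against the same pattern. The class MatchKindsDemo and its helper names are hypothetical; only the Pattern and Matcher calls come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatchKindsDemo {
    static final Pattern CAT = Pattern.compile("cat");

    // matches(): the whole input must match the pattern.
    static boolean wholeInput(String s) {
        return CAT.matcher(s).matches();
    }

    // lookingAt(): the input must start with a match, the rest may be anything.
    static boolean prefix(String s) {
        return CAT.matcher(s).lookingAt();
    }

    // find(): scans forward and reports each matching subsequence in turn.
    static int countOccurrences(String s) {
        Matcher m = CAT.matcher(s);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"cat", "cats", "one cat, two cats"}) {
            System.out.println(s + ": matches=" + wholeInput(s)
                    + " lookingAt=" + prefix(s)
                    + " find count=" + countOccurrences(s));
        }
    }
}
```

For "cats", matches fails because of the trailing "s", while lookingAt succeeds because the input begins with "cat".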
Applying regular expressions:
Create a String that holds the regular expression, compile it into a Pattern, and then create a Matcher for the input you want to test:
String regular = "[a-z]{3}";           // a string of exactly three characters from a to z
Pattern p = Pattern.compile(regular);  // compile the expression into the corresponding pattern
Matcher m = p.matcher("asd");          // create a matcher for the string "asd"; the match state is stored in the returned Matcher object
With the resulting Matcher object, a series of operations can be performed.
Code examples:
1. Basic use of the Matcher class

package regularexpression;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegularExpression {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("cat");
        Matcher m = p.matcher("One cat, cats in the Yard");
        // matches() returns a boolean that tells whether the entire string matches
        pr("matches(): " + m.matches());
        // start the search from the beginning of the input again
        m.reset();
        // find() looks for the next matching substring and returns false at the end of the string
        while (m.find()) {
            // group() returns the substring that was found
            pr("group(): " + m.group());
            // start() and end() return the indexes of that substring within the whole string
            pr("start-end: " + m.start() + "-" + m.end());
        }
    }

    public static void pr(String str) {
        System.out.println(str);
    }
}
2. Advanced use: string replacement

Pattern p = Pattern.compile("cat");
Matcher m = p.matcher("One cat, cats in the Yard");
pr(m.replaceAll("dog")); // prints: One dog, dogs in the Yard
replaceAll(String) is simple but not flexible, because it replaces every match with the same string. If you want a different replacement for each match, that is hard to do with replaceAll, so there is a more flexible way to perform the replacement:
The two methods appendReplacement() and appendTail() together implement flexible string replacement.
Pattern p = Pattern.compile("cat");
Matcher m = p.matcher("One cat, cats in the Yard");
int index = 0;
StringBuffer sb = new StringBuffer();
while (m.find()) {
    if (index == 0) {
        // the first match becomes "dog"
        m.appendReplacement(sb, "dog");
        index++;
    } else {
        // every later match becomes "duck"
        m.appendReplacement(sb, "duck");
    }
}
m.appendTail(sb);   // append the rest of the input after the last match to sb
pr(sb.toString());  // prints: One dog, ducks in the Yard
This makes flexible replacement possible, which is convenient and powerful.
3. Finally, here is a code statistics tool I wrote myself. It counts code lines, blank lines, and comment lines (only // comments are handled; I was too lazy to handle /* */ comments).
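The same appendReplacement() and appendTail() loop can be wrapped in a small helper that takes one replacement per match. This is only a sketch: the class PerMatchReplaceDemo and the method replaceEach are hypothetical names, and the rule "reuse the last replacement once the list runs out" is my own assumption, chosen to mirror the dog/duck example. It also assumes at least one replacement is passed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PerMatchReplaceDemo {
    // Replaces the i-th match of regex with replacements[i];
    // when there are more matches than replacements, the last one is reused.
    static String replaceEach(String input, String regex, String... replacements) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        int index = 0;
        while (m.find()) {
            String r = replacements[Math.min(index, replacements.length - 1)];
            // quoteReplacement keeps '$' and '\' in r from being read as group references
            m.appendReplacement(sb, Matcher.quoteReplacement(r));
            index++;
        }
        // copy whatever follows the last match
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceEach("One cat, cats in the Yard", "cat", "dog", "duck"));
    }
}
```

The call to Matcher.quoteReplacement matters because appendReplacement treats "$1" and similar text in the replacement as a reference to a capturing group; quoting makes the replacement literal.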
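Codecount.java

package codecount;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Codecount {
    // a comment line: optional tabs followed by "//"
    public static final String regulars_annotation = "^[\\t]*[/]{2}.*";
    // a blank line: nothing but tabs
    public static String regulars_blank = "[\\t]*";
    // a code line: optional tabs, text without '/', and at most one trailing '/'
    public static String regulars_code = "[\\t]*[^/]+[/]?";

    // returns true when the whole line str matches regex
    public static boolean judge(String str, String regex) {
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(str);
        return m.matches();
    }
}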
Test.java

package codecount;

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class Test {
    public static void main(String[] args) {
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(
                new FileReader("C:\\users\\java\\desktop\\code.java"))) {
            String str;
            int blank = 0;
            int code = 0;
            int annotation = 0;
            while (null != (str = br.readLine())) {
                // else-if keeps a line that fits two patterns (a tab-only line) from being counted twice
                if (Codecount.judge(str, Codecount.regulars_annotation)) {
                    annotation++;
                } else if (Codecount.judge(str, Codecount.regulars_blank)) {
                    blank++;
                } else if (Codecount.judge(str, Codecount.regulars_code)) {
                    code++;
                    System.out.println(str);
                }
            }
            System.out.println("annotation=" + annotation + " line.");
            System.out.println("blank=" + blank + " line.");
            System.out.println("code=" + code + " line.");
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
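To see how the three expressions behave without reading a file, the sketch below applies them to a few in-memory lines. The expressions are copied from Codecount; the class LineKindDemo, the method classify, the category "other" and the sample lines are hypothetical additions for illustration only.

```java
import java.util.regex.Pattern;

final class LineKindDemo {
    // Same expressions as in Codecount, copied so the sketch is self-contained.
    static final Pattern ANNOTATION = Pattern.compile("^[\\t]*[/]{2}.*");
    static final Pattern BLANK = Pattern.compile("[\\t]*");
    static final Pattern CODE = Pattern.compile("[\\t]*[^/]+[/]?");

    // Returns the first category whose pattern matches the whole line, or "other".
    static String classify(String line) {
        if (ANNOTATION.matcher(line).matches()) return "annotation";
        if (BLANK.matcher(line).matches()) return "blank";
        if (CODE.matcher(line).matches()) return "code";
        return "other";
    }

    public static void main(String[] args) {
        String[] samples = {"// a comment", "", "\t", "int x = 1;", "int y = a / b;"};
        for (String s : samples) {
            System.out.println("[" + s + "] -> " + classify(s));
        }
    }
}
```

One thing this makes visible: because the code expression allows a '/' only as the very last character, a line such as "int y = a / b;" matches none of the three patterns and is not counted at all.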
Attached are the rules for regular expressions:
Characters
x	The character x
\\	The backslash character
\0n	The character with octal value 0n (0 <= n <= 7)
\0nn	The character with octal value 0nn (0 <= n <= 7)
\0mnn	The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7)
\xhh	The character with hexadecimal value 0xhh
\uhhhh	The character with hexadecimal value 0xhhhh
\t	The tab character ('\u0009')
\n	The newline (line feed) character ('\u000A')
\r	The carriage-return character ('\u000D')
\f	The form-feed character ('\u000C')
\a	The alert (bell) character ('\u0007')
\e	The escape character ('\u001B')
\cx	The control character corresponding to x
Character classes
[abc]	a, b, or c (simple class)
[^abc]	Any character except a, b, or c (negation)
[a-zA-Z]	a through z or A through Z, inclusive (range)
[a-d[m-p]]	a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]	d, e, or f (intersection)
[a-z&&[^bc]]	a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]	a through z, and not m through p: [a-lq-z] (subtraction)
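The union, intersection and subtraction forms are easier to trust after trying them on single characters. The following is a minimal sketch; the class CharClassDemo and its helper in are hypothetical, and Pattern.matches is the static shortcut that compiles the expression and tests the whole input in one call.

```java
import java.util.regex.Pattern;

final class CharClassDemo {
    // True when the single-character string s belongs to the character class.
    static boolean in(String charClass, String s) {
        return Pattern.matches(charClass, s);
    }

    public static void main(String[] args) {
        System.out.println(in("[^abc]", "d"));        // negation: d is allowed
        System.out.println(in("[a-d[m-p]]", "n"));    // union: n lies in m-p
        System.out.println(in("[a-z&&[def]]", "e"));  // intersection: e is in both sets
        System.out.println(in("[a-z&&[^bc]]", "b"));  // subtraction: b was removed
    }
}
```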
Predefined character classes
.	Any character (may or may not match line terminators)
\d	A digit: [0-9]
\D	A non-digit: [^0-9]
\s	A whitespace character: [ \t\n\x0B\f\r]
\S	A non-whitespace character: [^\s]
\w	A word character: [a-zA-Z_0-9]
\W	A non-word character: [^\w]
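Two details of the predefined classes are worth a quick check: inside a Java string literal the backslash itself must be escaped, so \d is written "\\d", and the dot only matches a line terminator when the DOTALL flag is set. The sketch below illustrates both; the class PredefinedDemo, its method names and the sample inputs are made up for this example.

```java
import java.util.regex.Pattern;

final class PredefinedDemo {
    // Whole-input test using the static shortcut Pattern.matches.
    static boolean whole(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    // Does "." match a single newline, with or without DOTALL?
    static boolean dotMatchesNewline(boolean dotAll) {
        Pattern p = dotAll ? Pattern.compile(".", Pattern.DOTALL) : Pattern.compile(".");
        return p.matcher("\n").matches();
    }

    public static void main(String[] args) {
        System.out.println(whole("\\d+", "2024"));     // digits only
        System.out.println(whole("\\w+", "user_01"));  // letters, digits and underscore
        System.out.println(whole("\\w+", "user-01"));  // '-' is not a word character
        System.out.println(whole("\\S+", "a b"));      // the space breaks the match
        System.out.println(dotMatchesNewline(false));
        System.out.println(dotMatchesNewline(true));
    }
}
```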
Greedy quantifiers
X?	X, once or not at all
X*	X, zero or more times
X+	X, one or more times
X{n}	X, exactly n times
X{n,}	X, at least n times
X{n,m}	X, at least n but not more than m times
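"Greedy" means that each quantifier takes as many characters as it can while still letting the overall match succeed. A short sketch of that behavior follows; the class GreedyDemo, the helper firstMatch and the sample strings are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyDemo {
    // Returns the first substring of input that matches regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // .* runs to the last 'b', not the first one
        System.out.println(firstMatch("a.*b", "aXbXb"));
        // {2,3} takes three 'a' characters when three are available
        System.out.println(firstMatch("a{2,3}", "aaaa"));
        // {2,} skips the single digit and takes the first run of two or more
        System.out.println(firstMatch("\\d{2,}", "7 42 123"));
        // ? makes the 'u' optional
        System.out.println(firstMatch("colou?r", "color"));
    }
}
```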
Boundary matchers
^	The beginning of a line
$	The end of a line
\b	A word boundary
\B	A non-word boundary
\A	The beginning of the input
\G	The end of the previous match
\Z	The end of the input but for the final terminator, if any
\z	The end of the input
\t	Tab
\n	Line feed
\r	Carriage return
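The boundary matchers listed above match positions rather than characters, which is easiest to see by counting matches. This is a minimal sketch with a hypothetical class BoundaryDemo and helper count; note that ^ and $ only match at every line when the MULTILINE flag is set, otherwise they match at the start and end of the whole input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BoundaryDemo {
    // Counts how many times p is found in input.
    static int count(Pattern p, String input) {
        Matcher m = p.matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "One cat, cats in the Yard";
        System.out.println(count(Pattern.compile("cat"), text));        // also finds the "cat" in "cats"
        System.out.println(count(Pattern.compile("\\bcat\\b"), text));  // whole word only

        String lines = "one\ntwo";
        System.out.println(count(Pattern.compile("^\\w+"), lines));                     // start of input only
        System.out.println(count(Pattern.compile("^\\w+", Pattern.MULTILINE), lines));  // start of every line

        String trailing = "the end\n";
        System.out.println(count(Pattern.compile("end\\Z"), trailing));  // \Z ignores the final terminator
        System.out.println(count(Pattern.compile("end\\z"), trailing));  // \z requires the very end
    }
}
```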