Reposted from: http://www.itzhai.com/java-notes-regex-matches-and-lookingat.html#read-more
1. Basic syntax
2. String's built-in regular expression functions
2.1. Regular expression tools that come with the String class
2.1.1. The split method
2.1.2. String substitution with replaceFirst and replaceAll
3. Creating regular expressions
3.1. Pattern and Matcher
3.1.2. Using matches() and lookingAt()
3.2. Groups
3.3. Pattern flags
3.4. The split() method
3.5. Replace operations
3.6. The reset() method
4. Regular expressions and Java I/O
The main content of this article:
- the most basic regular expression syntax;
- the regular expression functions built into String; common methods: split(), replaceFirst(), replaceAll();
- creating regular expressions in Java; related classes: Pattern, Matcher, and groups;
- pattern flags: using Pattern.compile(String regex, int flags) to specify the matching mode of a regular expression;
- the split method in Pattern, which is similar to split in String;
- Matcher replace operations; common methods: replaceAll(String replacement), replaceFirst(String replacement), appendReplacement(StringBuffer sb, String replacement), appendTail(StringBuffer sb);
- using the reset() method of Matcher;
- using regular expressions in Java I/O.
Regular expressions were integrated long ago into UNIX tools such as sed and awk, and into programming languages such as Python and Perl. String handling in Java, by contrast, was centered mainly on the String, StringBuffer, and StringTokenizer classes, which offer only simple and limited functionality compared with regular expressions.
1. Basic syntax:
For a detailed syntax of regular expressions in Java, refer to the JDK documentation.
In other languages, \\ means "insert a plain literal backslash into the regular expression". In a Java string literal, \\ means "insert a regular expression backslash", so the character after it has a special meaning.
\\d represents a single digit
\\\\ represents a plain literal backslash
\n and \t (line break and tab) need only a single backslash
-?\\d+ represents an optional minus sign followed by one or more digits
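The escaping rules above can be checked with a few lines of code. This is only a minimal sketch; the class name EscapeDemo and the helpers isInteger and hasBackslash are invented for illustration and are not part of the original article.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // Hypothetical helper: true if s is an integer with an optional minus sign.
    static boolean isInteger(String s) {
        // The Java literal "-?\\d+" is the regular expression -?\d+
        return s.matches("-?\\d+");
    }

    // Hypothetical helper: true if s contains a literal backslash.
    static boolean hasBackslash(String s) {
        // The Java literal "\\\\" is the regular expression \\ (one literal backslash)
        return Pattern.compile("\\\\").matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isInteger("-42"));          // true
        System.out.println(isInteger("4x2"));          // false
        System.out.println(hasBackslash("C:\\tmp"));   // true
        // A tab needs only a single backslash in the Java literal.
        System.out.println("a\tb".matches("a\tb"));    // true
    }
}
```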
2. String's built-in regular expression functions:
    String str = "-123";
    // an integer that may carry a minus or plus sign
    boolean isNum = str.matches("(-|\\+)?\\d+");
    System.out.println(isNum); // output: true
2.1. Regular expression tools that come with the String class:
2.1.1. The split method:
This is a very useful regular expression tool provided by String: it splits the string around the matches of a given regular expression.
    import java.util.Arrays;

    // split method demo
    String content = "Hello, this was an iPad.";
    // split the string on runs of non-word characters
    String[] items = content.split("\\W+");
    System.out.println(Arrays.toString(items)); // [Hello, this, was, an, iPad]
Note that the comma is removed as well, because it is treated as part of the delimiter.
    // the overloaded version of split lets you limit the number of pieces
    items = content.split("\\W+", 3);
    System.out.println(Arrays.toString(items)); // [Hello, this, was an iPad.]
2.1.2. String substitution with replaceFirst and replaceAll:
    // replace parts of a string with replaceFirst and replaceAll
    String str = "You make me cry, make me smile.";
    System.out.println(str.replaceFirst("m\\w+", "music")); // You music me cry, make me smile.
    System.out.println(str.replaceAll("make|me", "music")); // You music music cry, music music smile.
3. Creating regular expressions:
3.1. Pattern and Matcher:
To use regular expressions more effectively, it is strongly recommended that you read the java.util.regex.Pattern page of the JDK documentation.
Here is an example of creating a regular expression in Java:
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String content = "one step tototoo far.";
    // compile the regular expression into a Pattern object
    Pattern p = Pattern.compile("(to){2,}");
    // search the string through the Pattern's matcher() method, which creates a Matcher object
    Matcher m = p.matcher(content);
    while (m.find()) {
        System.out.println("Match \"" + m.group() + "\" at position " + m.start() + "-" + (m.end() - 1));
        // output: Match "tototo" at position 9-14
    }
Using the various methods on Matcher, you can test whether different kinds of match succeed:
boolean matches():
Attempts to match the entire region against the pattern.
If the match succeeds, more information can be obtained through the start, end, and group methods.
Returns:
true if and only if the entire region sequence matches this matcher's pattern.
boolean lookingAt():
Attempts to match the input sequence, starting at the beginning of the region, against the pattern.
Like the matches method, this method always starts at the beginning of the region; unlike it, it does not require the entire region to be matched.
If the match succeeds, more information can be obtained through the start, end, and group methods.
Returns:
true if and only if a prefix of the input sequence matches this matcher's pattern.
boolean find():
Attempts to find the next subsequence of the input sequence that matches the pattern.
This method starts at the beginning of this matcher's region or, if the previous invocation of the method succeeded and the matcher has not been reset since then, at the first character not matched by the previous match.
If the match succeeds, more information can be obtained through the start, end, and group methods.
Returns:
true if and only if a subsequence of the input sequence matches this matcher's pattern.
boolean find(int start):
Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index.
If the match succeeds, more information can be obtained through the start, end, and group methods, and subsequent invocations of find() start at the first character not matched by this match.
Returns:
true if and only if a subsequence of the input sequence starting at the given index matches this matcher's pattern.
Throws:
IndexOutOfBoundsException - if start is less than zero or greater than the length of the input sequence.
Here is an example of using boolean find(int start):
    // boolean find(int start)
    String content = "Wings you are the hero~";
    Pattern p = Pattern.compile("\\w+");
    Matcher m = p.matcher(content);
    int i = 0;
    // each call resets the matcher and searches again from index i
    while (m.find(i)) {
        System.out.print(m.group() + " ");
        i++;
    }
The output is:
Wings ings ngs gs s you you ou u are are re e the the he e hero hero ero ro o
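The difference between matches(), lookingAt(), and find() described above is easiest to see by running all three against the same input. The following is a minimal sketch; the class name MatchKindsDemo and the helper describe are invented for illustration and do not come from the original article.

```java
import java.util.regex.Pattern;

public class MatchKindsDemo {
    // Hypothetical helper: reports how the three methods judge the same input.
    // A fresh Matcher is created for each call so that no state is shared.
    static String describe(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        boolean whole = p.matcher(input).matches();     // entire input must match
        boolean prefix = p.matcher(input).lookingAt();  // a prefix must match
        boolean anywhere = p.matcher(input).find();     // any subsequence may match
        return "matches=" + whole + " lookingAt=" + prefix + " find=" + anywhere;
    }

    public static void main(String[] args) {
        System.out.println(describe("\\d+", "123"));    // matches=true lookingAt=true find=true
        System.out.println(describe("\\d+", "123abc")); // matches=false lookingAt=true find=true
        System.out.println(describe("\\d+", "abc123")); // matches=false lookingAt=false find=true
    }
}
```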
... (For the sections omitted here, please see the original article.)
3.5. Replace operations:
Matcher provides the following main methods:
replaceAll(String replacement)
- Replaces every subsequence of the input sequence that matches the pattern with the given replacement string.
replaceFirst(String replacement)
- Replaces the first subsequence of the input sequence that matches the pattern with the given replacement string.
appendReplacement(StringBuffer sb, String replacement)
- Implements a non-terminal append-and-replace step. This method performs the following actions:
  - It reads characters from the input sequence, starting at the append position, and appends them to the given string buffer. It stops after reading the last character before the match, that is, the character at index start() - 1.
  - It appends the given replacement string to the string buffer.
  - It sets the append position of this matcher to the index of the last character matched plus one, that is, to end().
appendTail(StringBuffer sb)
- Implements a terminal append-and-replace step. This method reads characters from the input sequence, starting at the append position, and appends them to the given string buffer. It is meant to be called after one or more calls to appendReplacement, in order to copy the remainder of the input sequence.
Here is a program that demonstrates these methods and, along the way, gives more practice with regular expressions:
    String content = "/*! Long    ago, there is a man called Jack,\n"
            + "    he has one boat.!*/";
    // Pattern.DOTALL: in this mode, . matches any character, including line breaks
    Pattern p = Pattern.compile("/\\*!(.*)!\\*/", Pattern.DOTALL);
    Matcher m = p.matcher(content);
    if (m.find()) {
        // keep only the content between /*! and !*/
        content = m.group(1);
    }
    // collapse runs of two or more spaces into a single space
    content = content.replaceAll(" {2,}", " ");
    // turn on multiline mode and delete the spaces at the start of each line; + means one or more
    content = content.replaceAll("(?m)^ +", "");
    // replace the first vowel in the string with VOWEL
    content = content.replaceFirst("[aeiou]", "VOWEL");
    // the following code replaces every lowercase vowel in the string with its uppercase form
    Pattern p1 = Pattern.compile("[aeiou]");
    Matcher m1 = p1.matcher(content);
    StringBuffer sb = new StringBuffer();
    while (m1.find()) {
        // non-terminal append-and-replace step
        m1.appendReplacement(sb, m1.group().toUpperCase());
    }
    // terminal append-and-replace step
    m1.appendTail(sb);
    System.out.println(sb);
The output is:
LVOWELng AgO, thErE Is A mAn cAllEd JAck,
hE hAs OnE bOAt.
Note that each of the two replaceAll() substitutions above is used only once, so rather than compiling a Pattern it is simpler to call the String replaceAll() method directly, and the overhead is relatively small.
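To make the trade-off concrete, here is a minimal sketch of the two styles side by side. The class name ReplaceAllDemo, the constant MULTI_SPACE, and both helper methods are invented for illustration. According to the JDK documentation, str.replaceAll(regex, repl) gives the same result as Pattern.compile(regex).matcher(str).replaceAll(repl), so keeping a compiled Pattern is mainly worthwhile when the same expression is applied many times.

```java
import java.util.regex.Pattern;

public class ReplaceAllDemo {
    // Compiled once and reused across calls.
    private static final Pattern MULTI_SPACE = Pattern.compile(" {2,}");

    // One-off style: the regular expression is handled inside String.replaceAll.
    static String collapseOnce(String s) {
        return s.replaceAll(" {2,}", " ");
    }

    // Reusable style: the precompiled Pattern is applied to each input.
    static String collapseReusable(String s) {
        return MULTI_SPACE.matcher(s).replaceAll(" ");
    }

    public static void main(String[] args) {
        String text = "Long    ago,  there   is a man";
        System.out.println(collapseOnce(text));     // Long ago, there is a man
        System.out.println(collapseReusable(text)); // Long ago, there is a man
    }
}
```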