Java Regular Expressions: A Learning Summary with Some Small Examples

Since Java 1.4, the Java core API has included the java.util.regex package, a valuable foundation for many kinds of text processing, such as matching, searching, extracting, and analyzing structured content.

java.util.regex is a library package that matches strings against patterns written as regular expressions. Its two main classes are Pattern and Matcher.
Pattern is a compiled representation of a regular expression. In Java, it is easy to determine whether a string matches a pattern by calling the appropriate methods of the Pattern class. A pattern can be as simple as a literal string to match, or it can be complex, using grouping and character classes such as whitespace, digits, letters, or control characters. Because Java strings are based on Unicode, regular expressions also work in internationalized applications.

A brief introduction to the methods of the Pattern class
Method    Description
static Pattern compile(String regex, int flags)    Compiles the pattern. regex is the regular expression and flags selects the matching mode (for example, Pattern.CASE_INSENSITIVE makes matching case-insensitive).
Matcher matcher(CharSequence input)    Creates a Matcher for the given input, that is, the string to be processed.
static boolean matches(String regex, CharSequence input)    A shortcut that compiles regex and checks whether the whole input matches it.
String[] split(CharSequence input, int limit)    Splits input around matches of the pattern. A positive limit caps the number of pieces returned.
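
As a rough illustration of the Pattern methods listed above, the following sketch calls each of them once. The class name PatternMethodsDemo, its helper methods, and the sample strings are made up for this illustration and are not part of the original article.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; its name and the sample strings are illustrative only.
class PatternMethodsDemo {

    // compile(regex, flags) + matcher(input): case-insensitive whole-string match.
    static boolean matchesIgnoringCase(String regex, String input) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(input);
        return matcher.matches();
    }

    // split(input, limit): a positive limit caps the number of pieces returned.
    static String[] splitWithLimit(String regex, String input, int limit) {
        return Pattern.compile(regex).split(input, limit);
    }

    public static void main(String[] args) {
        System.out.println(matchesIgnoringCase("java", "JAVA"));   // true
        // The static shortcut takes no flags, so this call is case-sensitive.
        System.out.println(Pattern.matches("java", "JAVA"));       // false
        System.out.println(Arrays.toString(splitWithLimit(",", "a,b,c,d", 2))); // [a, b,c,d]
    }
}
```

With a limit of 2 the pattern is applied only once, so everything after the first comma stays in the second piece.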


Matcher: a Matcher object is a state machine that checks a string for matches against a Pattern object. First, a Pattern instance is created by compiling a regular expression whose syntax is similar to Perl's; then a Matcher instance matches the string under the control of that Pattern instance.

A brief introduction to the methods of the Matcher class
Method    Description
boolean matches()    Attempts to match the entire input string against the pattern.
boolean lookingAt()    Attempts to match the pattern at the beginning of the input string; the match need not cover the whole input.
boolean find(int start)    Resets the matcher and looks for the next match starting at index start.
int groupCount()    Returns the number of capturing groups in the pattern.
String replaceAll(String replacement)    Replaces every matching part with the given replacement.
String replaceFirst(String replacement)    Replaces the first matching part with the given replacement.
Matcher appendReplacement(StringBuffer sb, String replacement)    Appends to sb the input text between the previous match and the current one, followed by replacement in place of the current match.
StringBuffer appendTail(StringBuffer sb)    Appends to sb the rest of the input sequence after the last match.
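
The difference between matches(), lookingAt(), and find() is easiest to see side by side. The sketch below is only an illustration: the class MatcherMethodsDemo and its helper names are hypothetical, and the inputs are invented.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing the three matching methods of Matcher.
class MatcherMethodsDemo {

    // matches(): the whole input must match the pattern.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // lookingAt(): the match must start at the beginning, but may stop early.
    static boolean matchesPrefix(String regex, String input) {
        return Pattern.compile(regex).matcher(input).lookingAt();
    }

    // find(): the match may start anywhere in the input.
    static boolean matchesAnywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String digits = "\\d+";
        System.out.println(matchesWhole(digits, "123abc"));     // false
        System.out.println(matchesPrefix(digits, "123abc"));    // true
        System.out.println(matchesPrefix(digits, "abc123"));    // false
        System.out.println(matchesAnywhere(digits, "abc123"));  // true
        // groupCount() counts the capturing groups in the pattern itself.
        System.out.println(Pattern.compile("(a)(b(c))").matcher("").groupCount()); // 3
    }
}
```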
Common metacharacters in regular expressions:

For a simple string comparison, regular expressions offer no advantage. The real strength of regular expressions lies in more complex patterns built from character classes and quantifiers (*, +, ?).
Character classes include:

\d A digit
\D A non-digit
\w A word character (0-9, a-z, A-Z, and the underscore)
\W A non-word character
\s A whitespace character (for example space, line feed, carriage return, tab)
\S A non-whitespace character
[] A custom character class made up of the characters listed inside the square brackets
. Any single character (by default, except line terminators)

The following characters (quantifiers) control how many times the preceding subpattern may match:
? Repeat the preceding subpattern zero or one time
* Repeat the preceding subpattern zero or more times
+ Repeat the preceding subpattern one or more times
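
To make the character classes above concrete, here is a small sketch that counts how often each class matches inside one sample string. The class CharacterClassDemo, the count helper, and the sample input are assumptions made for this illustration. Note that in Java source code each backslash must be doubled inside a string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the sample input is invented for illustration.
class CharacterClassDemo {

    // Counts how many separate matches of regex occur in input.
    static int count(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int n = 0;
        while (matcher.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String input = "id_7 = 42;";
        System.out.println(count("\\d", input));   // 3  (7, 4, 2)
        System.out.println(count("\\w", input));   // 6  (i, d, _, 7, 4, 2)
        System.out.println(count("\\s", input));   // 2  (the two spaces)
        System.out.println(count("[=;]", input));  // 2  (custom class: = and ;)
        System.out.println(count(".", input));     // 10 (every character)
    }
}
```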


Here are the examples:

Example One:
In its simplest form, a regular expression matches a given string exactly: the pattern is identical to the text to be matched. The static Pattern.matches method checks whether a string matches a given pattern. For example:
The code is as follows:

// The snippets in this article assume these imports:
// import java.util.regex.Matcher;
// import java.util.regex.Pattern;
String data = "Java";
boolean result = Pattern.matches("Java", data);

Example Two:
The code is as follows:

String[] dataArr = {"moon", "mon", "mooon", "mono"};
for (String str : dataArr) {
    String patternStr = "m(o+)n";
    boolean result = Pattern.matches(patternStr, str);
    if (result) {
        System.out.println("Matching string " + str + " against pattern " + patternStr + " succeeded");
    } else {
        System.out.println("Matching string " + str + " against pattern " + patternStr + " failed");
    }
}

The pattern is "m(o+)n", which means that the o between m and n may be repeated one or more times. So moon, mon, and mooon match, while mono has an extra o after the n and does not match.

Note:
+ means one or more times; ? means zero or one time; * means zero or more times.
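
The three quantifiers from the note can be compared on the same inputs. The following is a minimal sketch; the class QuantifierDemo, the check helper, and the test words are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class comparing ?, * and + on the same words.
class QuantifierDemo {

    // Returns one result per input: does the whole input match regex?
    static boolean[] check(String regex, String... inputs) {
        boolean[] results = new boolean[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            results[i] = Pattern.matches(regex, inputs[i]);
        }
        return results;
    }

    public static void main(String[] args) {
        String[] words = {"mn", "mon", "moon"};
        for (String regex : new String[] {"mo?n", "mo*n", "mo+n"}) {
            System.out.println(regex + " -> " + Arrays.toString(check(regex, words)));
        }
        // mo?n -> [true, true, false]
        // mo*n -> [true, true, true]
        // mo+n -> [false, true, true]
    }
}
```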
Example Three:
The code is as follows:

String[] dataArr = {"ban", "ben", "bin", "bon", "bun", "byn", "baen"};
for (String str : dataArr) {
    String patternStr = "b[aeiou]n";
    boolean result = Pattern.matches(patternStr, str);
    if (result) {
        System.out.println("Matching string " + str + " against pattern " + patternStr + " succeeded");
    } else {
        System.out.println("Matching string " + str + " against pattern " + patternStr + " failed");
    }
}

Note: square brackets match exactly one character. The pattern "b[aeiou]n" matches only strings that begin with b, end with n, and have exactly one of a, e, i, o, u in between, so the first five elements of the array match and the last two do not.
Square brackets [] mean that only one of the characters listed inside them can be matched.
Example Four:
The code is as follows:

String[] dataArr = {"been", "bean", "boon", "buin", "bynn"};
for (String str : dataArr) {
    String patternStr = "b(ee|ea|oo)n";
    boolean result = Pattern.matches(patternStr, str);
    if (result) {
        System.out.println("Matching string " + str + " against pattern " + patternStr + " succeeded");
    } else {
        System.out.println("Matching string " + str + " against pattern " + patternStr + " failed");
    }
}

To match alternatives that are longer than one character, [] cannot be used; use () together with | instead. () denotes a group and | means "or", so the pattern b(ee|ea|oo)n matches been, bean, and boon.
So the first three elements match and the last two do not.
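
Because () also captures what it matched, the alternative that was actually chosen can be read back with Matcher.group(1). The sketch below illustrates this under assumed names (AlternationDemo and matchedAlternative are hypothetical).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: reports which alternative inside the group matched.
class AlternationDemo {

    private static final Pattern WORD = Pattern.compile("b(ee|ea|oo)n");

    // Returns the alternative that matched, or null if the word does not match.
    static String matchedAlternative(String word) {
        Matcher matcher = WORD.matcher(word);
        return matcher.matches() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(matchedAlternative("bean")); // ea
        System.out.println(matchedAlternative("boon")); // oo
        System.out.println(matchedAlternative("buin")); // null
    }
}
```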
Example Five:
The code is as follows:

String[] dataArr = {"1", "10", "101", "1010", "100+"};
for (String str : dataArr) {
    // In a Java string literal the backslash must be escaped, hence "\\d+"
    String patternStr = "\\d+";
    boolean result = Pattern.matches(patternStr, str);
    if (result) {
        System.out.println("Matching string " + str + " against pattern " + patternStr + " succeeded");
    } else {
        System.out.println("Matching string " + str + " against pattern " + patternStr + " failed");
    }
}

Note: as seen earlier, \d stands for a digit and + means one or more times, so the pattern \d+ means one or more digits.
So the first four elements match; the last one does not, because the + sign is not a digit.
Example Six:
The code is as follows:

String[] dataArr = {"a100", "b20", "c30", "df10000", "gh0t"};
for (String str : dataArr) {
    String patternStr = "\\w+\\d+";
    boolean result = Pattern.matches(patternStr, str);
    if (result) {
        System.out.println("Matching string " + str + " against pattern " + patternStr + " succeeded");
    } else {
        System.out.println("Matching string " + str + " against pattern " + patternStr + " failed");
    }
}

The pattern \w+\d+ describes a string that starts with one or more word characters and ends with one or more digits. So the first four elements match; the last one does not, because a word character (t) follows the digit and the string does not end with a digit.
Example Seven:
The code is as follows:

String str = "salary,title;age gender";
String[] dataArr = str.split("[,\\s;]");
for (String strTmp : dataArr) {
    System.out.println(strTmp);
}

The split method of the String class accepts a regular expression. In the example above, the pattern matches a comma, a single whitespace character, or a semicolon, and split treats any one of them as a separator, splitting the string into an array of strings.
Example Eight:
The code is as follows:

String str = "December 11, 2007";
// One or more commas or whitespace characters act as the separator
Pattern p = Pattern.compile("[,\\s]+");
String[] dataArr = p.split(str);
for (String strTmp : dataArr) {
    System.out.println(strTmp);
}

Pattern is a compiled representation of a regular expression, and its split method splits a string efficiently.
Note how this differs from String.split(): here the regular expression is compiled into a Pattern first, and split is called on the Pattern with the string as its argument.
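
One practical reason to prefer Pattern.split is that the compiled pattern can be kept and reused for many inputs. The following sketch assumes a hypothetical class SplitReuseDemo with a constant SEPARATOR; the sample lines are invented.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class: one compiled Pattern reused for several lines.
class SplitReuseDemo {

    // Compiled once; matches one or more commas, semicolons, or whitespace characters.
    private static final Pattern SEPARATOR = Pattern.compile("[,;\\s]+");

    static String[] fields(String line) {
        return SEPARATOR.split(line);
    }

    public static void main(String[] args) {
        String[] lines = {"a, b;c", "x y,z"};
        for (String line : lines) {
            System.out.println(Arrays.toString(fields(line)));
        }
        // [a, b, c]
        // [x, y, z]

        // String.split produces the same pieces, but takes the regex as a string on every call.
        System.out.println(Arrays.toString("a, b;c".split("[,;\\s]+"))); // [a, b, c]
    }
}
```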
Example Nine:
The code is as follows:

String str = "10yuan 1000renminbi 10000yuan 100000RMB";
str = str.replaceAll("(\\d+)(yuan|renminbi|RMB)", "¥$1");
System.out.println(str);

In this example, the pattern "(\d+)(yuan|renminbi|RMB)" has two groups: the first group, \d+, matches one or more digits, and the second group matches any one of yuan, renminbi, or RMB. In the replacement, $1 keeps the text matched by the first group unchanged, and the rest of the match is replaced by ¥.
After the replacement, str is ¥10 ¥1000 ¥10000 ¥100000
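
Group references such as $1 can also reorder the captured parts. The sketch below is an illustration only: the class GroupReferenceDemo, its method names, and the sample sentences are assumptions. It also shows Matcher.quoteReplacement, which is one way to put a literal dollar sign into a replacement string.

```java
import java.util.regex.Matcher;

// Hypothetical demo class for group references in replacement strings.
class GroupReferenceDemo {

    // Reorders a yyyy-MM-dd date into dd/MM/yyyy using $3, $2 and $1.
    static String reorderDate(String text) {
        return text.replaceAll("(\\d{4})-(\\d{2})-(\\d{2})", "$3/$2/$1");
    }

    // A literal $ in a replacement must be quoted, otherwise it is read as a group reference.
    static String toDollarSign(String text) {
        return text.replaceAll("(\\d+) dollars", Matcher.quoteReplacement("$") + "$1");
    }

    public static void main(String[] args) {
        System.out.println(reorderDate("Released on 2007-12-11."));  // Released on 11/12/2007.
        System.out.println(toDollarSign("It costs 15 dollars"));     // It costs $15
    }
}
```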
Example Ten:
The code is as follows:

Pattern p = Pattern.compile("m(o+)n", Pattern.CASE_INSENSITIVE);
// Create a Matcher object with the matcher() method of the Pattern class
Matcher m = p.matcher("Moon Mooon Mon mooooon mooon");
StringBuffer sb = new StringBuffer();
// Find the first match with the find() method
boolean result = m.find();
// Loop over the matches, replacing each one and appending the result to sb
while (result) {
    m.appendReplacement(sb, "Moon");
    result = m.find();
}
// Finally, call appendTail() to append the text after the last match to sb
m.appendTail(sb);
System.out.println("After the replacement, the content is " + sb.toString());
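
The find / appendReplacement / appendTail loop is most useful when the replacement has to be computed from each match, which replaceAll with a fixed string cannot do. The following sketch doubles every number in a sentence; the class ComputedReplacementDemo and the method doubleNumbers are hypothetical names for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: each match is replaced by a value computed from it.
class ComputedReplacementDemo {

    static String doubleNumbers(String text) {
        Matcher matcher = Pattern.compile("\\d+").matcher(text);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            int doubled = Integer.parseInt(matcher.group()) * 2;
            // Copies the text before this match, then the computed replacement.
            matcher.appendReplacement(sb, String.valueOf(doubled));
        }
        // Copies whatever follows the last match.
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("3 apples and 10 pears")); // 6 apples and 20 pears
    }
}
```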

Example 11:
Besides + for one or more times, * for zero or more times, and ? for zero or one time, you can use {} to specify the number of occurrences exactly: X{2,5} means that X appears at least 2 and at most 5 times; X{2,} means that X appears at least 2 times, with no upper limit; X{5} means that X appears exactly 5 times.
For example:
The code is as follows:

String[] dataArr = {"google", "gooogle", "gooooogle", "goooooogle", "ggle"};
for (String str : dataArr) {
    String patternStr = "g(o{2,5})gle";
    boolean result = Pattern.matches(patternStr, str);
    if (result) {
        System.out.println("Matching string " + str + " against pattern " + patternStr + " succeeded");
    } else {
        System.out.println("Matching string " + str + " against pattern " + patternStr + " failed");
    }
}
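
The three brace forms can be checked against words with a known number of o characters. This is a minimal sketch; the class BraceQuantifierDemo and its helpers word and matches are made up for the illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for the {n}, {n,} and {n,m} quantifiers.
class BraceQuantifierDemo {

    // Builds "g", then n times "o", then "gle"; for example word(2) is "google".
    static String word(int n) {
        StringBuilder sb = new StringBuilder("g");
        for (int i = 0; i < n; i++) {
            sb.append('o');
        }
        return sb.append("gle").toString();
    }

    // Does the word with n o's match the given pattern as a whole?
    static boolean matches(String regex, int n) {
        return Pattern.matches(regex, word(n));
    }

    public static void main(String[] args) {
        System.out.println(matches("go{2}gle", 2));    // true:  exactly 2
        System.out.println(matches("go{2}gle", 3));    // false: 3 is not exactly 2
        System.out.println(matches("go{2,}gle", 6));   // true:  at least 2
        System.out.println(matches("go{2,}gle", 1));   // false: fewer than 2
        System.out.println(matches("go{2,5}gle", 5));  // true:  between 2 and 5
        System.out.println(matches("go{2,5}gle", 6));  // false: more than 5
    }
}
```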

Example 12:
A hyphen (-) inside square brackets means a range, "from ... to ..."; for example, [a-e] is equivalent to [abcde].
The code is as follows:

String[] dataArr = {"Tan", "Tbn", "Tcn", "Ton", "Twn"};
for (String str : dataArr) {
    String regex = "T[a-c]n";
    boolean result = Pattern.matches(regex, str);
    if (result) {
        System.out.println("Matching string " + str + " against pattern " + regex + " succeeded");
    } else {
        System.out.println("Matching string " + str + " against pattern " + regex + " failed");
    }
}

Example 13: Case-insensitive matching.
Regular expressions are case-sensitive by default; with Pattern.CASE_INSENSITIVE, case is ignored.
The code is as follows:

String patternStr = "ab";
Pattern pattern = Pattern.compile(patternStr, Pattern.CASE_INSENSITIVE);
String[] dataArr = {"ab", "Ab", "AB"};
for (String str : dataArr) {
    Matcher matcher = pattern.matcher(str);
    if (matcher.find()) {
        System.out.println("Matching string " + str + " against pattern " + patternStr + " succeeded");
    }
}
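
Two details of case-insensitive matching are worth seeing in code: find() succeeds on a substring while matches() needs the whole input, and the embedded flag (?i) is an alternative to passing Pattern.CASE_INSENSITIVE. The sketch below uses hypothetical names (CaseInsensitiveDemo, containsAb, isExactlyAb) and invented inputs.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for case-insensitive matching.
class CaseInsensitiveDemo {

    private static final Pattern AB = Pattern.compile("ab", Pattern.CASE_INSENSITIVE);

    // find(): true if "ab" occurs anywhere in s, in any letter case.
    static boolean containsAb(String s) {
        return AB.matcher(s).find();
    }

    // matches(): true only if all of s is "ab", in any letter case.
    static boolean isExactlyAb(String s) {
        return AB.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(containsAb("xABy"));   // true
        System.out.println(isExactlyAb("xABy"));  // false
        System.out.println(isExactlyAb("Ab"));    // true
        // The embedded flag (?i) switches on case-insensitive matching inside the pattern text.
        System.out.println(Pattern.matches("(?i)ab", "aB")); // true
        System.out.println(Pattern.matches("ab", "aB"));     // false
    }
}
```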

Example 14: Use a regular expression to split a string.
The code is as follows:

// Note: put the more complex alternatives first; otherwise the simple one would match first.
String input = "job=GM salary=50000 , name=ProfessionalManager ; sex=male age=45 ";
String patternStr = "(\\s*,\\s*)|(\\s*;\\s*)|(\\s+)";
Pattern pattern = Pattern.compile(patternStr);
String[] dataArr = pattern.split(input);
for (String str : dataArr) {
    System.out.println(str);
}

Example 15: Parse text with a regular expression; the back-reference \1 refers to group 1, the text matched by the first pair of parentheses.
The code is as follows:

String regex = "<(\\w+)>(\\w+)</\\1>";
Pattern pattern = Pattern.compile(regex);
String input = "<name>Bill</name><salary>50000</salary><title>GM</title>";
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    // group(2) is the text between the opening and closing tags
    System.out.println(matcher.group(2));
}
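
The effect of the back-reference \1 shows up when an opening tag and a closing tag do not agree: that element is simply not matched. Here is a minimal sketch under assumed names (BackreferenceDemo and pairs are hypothetical, and the input is invented).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: \1 forces the closing tag to repeat the opening tag name.
class BackreferenceDemo {

    private static final Pattern ELEMENT = Pattern.compile("<(\\w+)>(\\w+)</\\1>");

    // Returns "tag=text" for every element whose opening and closing tags agree.
    static List<String> pairs(String input) {
        List<String> result = new ArrayList<String>();
        Matcher matcher = ELEMENT.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1) + "=" + matcher.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        // The second element opens with <salary> but closes with </title>, so it is skipped.
        System.out.println(pairs("<name>Bill</name><salary>50000</title>")); // [name=Bill]
    }
}
```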

Example 16: Convert to upper case the words in a string that mix letters and digits.
The code is as follows:

String regex = "([a-zA-Z]+[0-9]+)";
Pattern pattern = Pattern.compile(regex);
String input = "age45 salary500000 50000 title";
Matcher matcher = pattern.matcher(input);
StringBuffer sb = new StringBuffer();
while (matcher.find()) {
    String replacement = matcher.group(1).toUpperCase();
    matcher.appendReplacement(sb, replacement);
}
matcher.appendTail(sb);
System.out.println("The replaced string is " + sb.toString());