Topic: use of regular expressions in Java

Last Update:2018-12-03 Source: Internet

Author: User

In Java, checking whether a string contains a given character or substring, splitting a string, or replacing or deleting characters in it is usually implemented with a combination of if-else statements and for loops, as follows:

Java code

public class Test {
    public static void main(String[] args) {
        String str = "@Shang Hai Hong Qiao Fei Ji Chang";
        boolean rs = false;
        // Scan the string for the character 'a' or 'f'
        for (int i = 0; i < str.length(); i++) {
            char z = str.charAt(i);
            if ('a' == z || 'f' == z) {
                rs = true;
                break;
            }
        }
        System.out.println(rs);
    }
}

This approach is simple and intuitive, but it does not scale to more complicated tasks: the amount of code grows quickly and becomes hard to maintain.

In such cases, regular expressions can implement the same functions with code that is easier to maintain. The following sections describe several common uses of regular expressions on strings in Java, all based on the java.util.regex package:

1. Find a character or substring in a string

Java code

String s = "@Shang Hai Hong Qiao Fei Ji Chang";
String regex = "a|f"; // matches a or f
Pattern pat = Pattern.compile(regex);
Matcher mat = pat.matcher(s);
boolean rs = mat.find();

If regex matches anywhere in s, rs is true; otherwise it is false.

To ignore case during the search, write Pattern pat = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
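As a minimal sketch of the search described above, the following wraps find() in a helper and shows the effect of the case-insensitive flag. The class name FindDemo and the method contains are illustrative names, not part of the original article.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindDemo {
    // Returns true if the regex matches anywhere in the input.
    static boolean contains(String input, String regex, boolean ignoreCase) {
        int flags = ignoreCase ? Pattern.CASE_INSENSITIVE : 0;
        Matcher mat = Pattern.compile(regex, flags).matcher(input);
        return mat.find();
    }

    public static void main(String[] args) {
        String s = "@Shang Hai Hong Qiao Fei Ji Chang";
        System.out.println(contains(s, "a|f", false)); // true: "Shang" contains 'a'
        System.out.println(contains(s, "q", false));   // false: only an uppercase 'Q' occurs
        System.out.println(contains(s, "q", true));    // true: 'Q' matches when case is ignored
    }
}
```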

2. Extract a string from a file path

Java code

String regex = ".+\\\\(.+)$"; // captures everything after the last backslash
String s = "c:\\test.txt";
Pattern pat = Pattern.compile(regex);
Matcher mat = pat.matcher(s);
// group() may only be called after a successful find()
if (mat.find()) {
    for (int i = 1; i <= mat.groupCount(); i++) {
        System.out.println(mat.group(i));
    }
}

The output is test.txt. Each extracted string is available as mat.group(i), where the largest valid value of i is mat.groupCount().
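The extraction can be sketched as a small helper. Because the leading .+ is greedy, the captured group is whatever follows the last backslash, so nested directories also work. GroupDemo and fileName are hypothetical names chosen for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // In Java source, "\\\\" is the regex \\ which matches one literal backslash.
    private static final Pattern LAST_SEGMENT = Pattern.compile(".+\\\\(.+)$");

    // Returns the text after the last backslash, or null if the path has none.
    static String fileName(String path) {
        Matcher mat = LAST_SEGMENT.matcher(path);
        return mat.find() ? mat.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(fileName("c:\\test.txt"));         // test.txt
        System.out.println(fileName("c:\\dir\\sub\\a.log"));  // a.log
        System.out.println(fileName("test.txt"));             // null
    }
}
```

Calling group(1) without a successful find() throws IllegalStateException, which is why the helper checks the result of find() first.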

3. Split a string

Java code

String regex = ":";
Pattern pat = Pattern.compile(regex);
String[] rs = pat.split("aa:bb:cc");

After execution, rs is {"aa", "bb", "cc"}.

For a simple split like the one above, the following shorter form is generally used:

Java code

String s = "aa:bb:cc";
String[] rs = s.split(":");
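One point worth illustrating is that String.split also interprets its argument as a regular expression, so a delimiter such as "." must be escaped or quoted. The sketch below uses Pattern.quote for that; SplitDemo and splitLiteral are illustrative names.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // Splits on the delimiter taken literally, not as a regex.
    static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString("aa:bb:cc".split(":")));          // [aa, bb, cc]
        // "." as a regex matches every character, so nothing is left over.
        System.out.println("a.b.c".split(".").length);                       // 0
        System.out.println(Arrays.toString(splitLiteral("a.b.c", ".")));     // [a, b, c]
        // A character class splits on several delimiters at once.
        System.out.println(Arrays.toString("aa, bb;cc".split("[,;]\\s*")));  // [aa, bb, cc]
    }
}
```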

4. Replace or delete parts of a string

Java code

String regex = "@+"; // matches one or more @
Pattern pat = Pattern.compile(regex);
Matcher mat = pat.matcher("@aa@b cc@@");
String s = mat.replaceAll("#");

The result is "#aa#b cc#", because each run of one or more @ is replaced by a single #.

To delete every @ in the string, replace with the empty string instead:

Java code

String s = mat.replaceAll("");

The result is "aab cc".
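The following sketch puts replacement and deletion side by side, and contrasts the pattern "@+" with a plain "@" through String.replaceAll, which replaces each @ individually. The input string "@aa@b cc@@" is an assumed sample, and ReplaceDemo and replaceRuns are illustrative names.

```java
import java.util.regex.Pattern;

class ReplaceDemo {
    private static final Pattern AT_RUN = Pattern.compile("@+");

    // Replaces each run of one or more '@' with the given replacement.
    static String replaceRuns(String input, String replacement) {
        return AT_RUN.matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String s = "@aa@b cc@@";
        System.out.println(replaceRuns(s, "#"));     // #aa#b cc#
        System.out.println(replaceRuns(s, ""));      // aab cc
        // Without the + quantifier every single '@' becomes its own '#'.
        System.out.println(s.replaceAll("@", "#"));  // #aa#b cc##
    }
}
```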

Notes on the Pattern class:
1. public final class java.util.regex.Pattern is the compiled representation of a regular expression.

The following statement creates a Pattern object and assigns it to the reference pat: Pattern pat = Pattern.compile(regex);
Interestingly, Pattern is a final class and its constructor is private. There is a design-pattern rationale behind this, which you can look up yourself. The conclusion here is that Pattern cannot be subclassed, and a Pattern object cannot be created with new.
For this reason, Pattern provides two overloaded static compile methods whose return value is a Pattern object (reference). For example:

Java code

public static Pattern compile(String regex) {
    return new Pattern(regex, 0);
}

Of course, we can still declare a reference of type Pattern, for example Pattern pat = null;

2. pat.matcher(str) creates a Matcher that applies the pattern to the string str; its return value is a reference to a Matcher object.
The whole sequence can be written in one statement: boolean rs = Pattern.compile(regex).matcher(str).find();
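Two practical consequences of these notes can be sketched as follows: a compiled Pattern can be stored once and reused, and find() (match anywhere) behaves differently from matches() (match the whole input). PatternNotes, hasDigits and isAllDigits are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

class PatternNotes {
    // Compile once and reuse; a Pattern is immutable and safe to share.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // find(): true if any part of the input matches.
    static boolean hasDigits(String s) {
        return DIGITS.matcher(s).find();
    }

    // matches(): true only if the entire input matches.
    static boolean isAllDigits(String s) {
        return DIGITS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasDigits("abc123"));            // true
        System.out.println(isAllDigits("abc123"));          // false
        System.out.println(isAllDigits("123"));             // true
        // Static shortcut that compiles and matches the whole input in one call.
        System.out.println(Pattern.matches("\\d+", "123")); // true
    }
}
```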

Appendix:
Common regular expressions:

Matching specific numbers:
^[1-9]\d*$ // matches a positive integer
^-[1-9]\d*$ // matches a negative integer
^-?[1-9]\d*$ // matches an integer (zero excluded)
^([1-9]\d*|0)$ // matches a non-negative integer (positive integer or 0)
^(-[1-9]\d*|0)$ // matches a non-positive integer (negative integer or 0)
^([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ // matches a positive floating-point number
^-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ // matches a negative floating-point number
^-?([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$ // matches a floating-point number
^([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$ // matches a non-negative floating-point number (positive floating-point number or 0)
^((-([1-9]\d*\.\d*|0\.\d*[1-9]\d*))|0?\.0+|0)$ // matches a non-positive floating-point number (negative floating-point number or 0)
Note: these are useful when processing large amounts of data; adjust them as needed for specific applications.
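When using patterns like these from Java, two details are easy to get wrong. First, every backslash must be doubled in a Java string literal, so \d is written "\\d". Second, ^ and $ bind more tightly than |, so an alternation needs parentheses for the anchors to cover every branch. The sketch below compares an ungrouped and a grouped non-negative integer pattern with find(); NumberPatterns and found are illustrative names.

```java
import java.util.regex.Pattern;

class NumberPatterns {
    // Ungrouped: parsed as (^[1-9]\d*) OR (0$), so each anchor guards only one branch.
    static final Pattern UNGROUPED = Pattern.compile("^[1-9]\\d*|0$");
    // Grouped: both anchors apply to the whole alternation.
    static final Pattern GROUPED = Pattern.compile("^([1-9]\\d*|0)$");

    static boolean found(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"120", "0", "12abc", "abc0"}) {
            System.out.println(s + " ungrouped=" + found(UNGROUPED, s)
                    + " grouped=" + found(GROUPED, s));
        }
    }
}
```

With find(), the ungrouped pattern accepts "12abc" and "abc0", while the grouped one accepts only "120" and "0".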

Matching specific strings:
^[A-Za-z]+$ // matches a string consisting of the 26 English letters
^[A-Z]+$ // matches a string consisting of the 26 uppercase English letters
^[a-z]+$ // matches a string consisting of the 26 lowercase English letters
^[A-Za-z0-9]+$ // matches a string consisting of digits and the 26 English letters
^\w+$ // matches a string consisting of digits, the 26 English letters, or underscores

The following validation rules and expressions are intended for input validation with the RegularExpressionValidator control:

Only digits can be entered: "^[0-9]*$"
Only n digits can be entered: "^\d{n}$"
At least n digits must be entered: "^\d{n,}$"
Only m to n digits can be entered: "^\d{m,n}$"
Only zero or a number that does not start with zero can be entered: "^(0|[1-9][0-9]*)$"
Only a positive number with two decimal places can be entered: "^[0-9]+(\.[0-9]{2})?$"
Only a positive number with 1 to 3 decimal places can be entered: "^[0-9]+(\.[0-9]{1,3})?$"
Only a non-zero positive integer can be entered: "^\+?[1-9][0-9]*$"
Only a non-zero negative integer can be entered: "^-[1-9][0-9]*$"
Only 3 characters can be entered: "^.{3}$"
Only a string consisting of the 26 English letters can be entered: "^[A-Za-z]+$"
Only a string consisting of the 26 uppercase English letters can be entered: "^[A-Z]+$"
Only a string consisting of the 26 lowercase English letters can be entered: "^[a-z]+$"
Only a string consisting of digits and the 26 English letters can be entered: "^[A-Za-z0-9]+$"
Only a string consisting of digits, the 26 English letters, or underscores can be entered: "^\w+$"
Verify a user password: "^[a-zA-Z]\w{5,17}$". The correct format starts with a letter, is 6 to 18 characters long, and contains only letters, digits, and underscores.
Check whether the input contains characters such as ^%&',;=?$\": "[^%&',;=?$\x22]+"
Only Chinese characters can be entered: "^[\u4e00-\u9fa5]{0,}$"
Verify an email address: "^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$"
Verify an Internet URL: "^http://([\w-]+\.)+[\w-]+(/[\w-./?%&=]*)?$"
Verify a phone number: "^(\(\d{3,4}\)|\d{3,4}-)?\d{7,8}$". Correct formats: "XXXX-XXXXXXX", "XXXX-XXXXXXXX", "XXX-XXXXXXX", "XXX-XXXXXXXX", "XXXXXXX", "XXXXXXXX".
Verify an ID card number (15 or 18 digits): "^(\d{15}|\d{18})$"
Verify the 12 months of a year: "^(0?[1-9]|1[0-2])$". The correct format is "01" to "09" and "1" to "12".
Verify the 31 days of a month: "^((0?[1-9])|((1|2)[0-9])|30|31)$". The correct format is "01" to "09" and "1" to "31".

Regular expression matching Chinese characters: [\u4e00-\u9fa5]
Regular expression matching double-byte characters (including Chinese characters): [^\x00-\xff]
Regular expression matching empty lines: \n[\s| ]*\r
Regular expression matching HTML tags: /<(.*)>.*<\/\1>|<(.*) \/>/
Regular expression matching leading and trailing spaces: (^\s*)|(\s*$)
Regular expression matching an email address: \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
Regular expression matching a URL: http://([\w-]+\.)+[\w-]+(/[\w-./?%&=]*)?
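As a rough sketch of how a few of the validation expressions above might be used from Java, the helper below precompiles three of them and checks input with matches(), which requires the whole input to match. The class Validators, its constant names, and isValid are assumptions made for this illustration; the patterns themselves are taken from the list with backslashes doubled for Java string literals.

```java
import java.util.regex.Pattern;

class Validators {
    // Month: "01" to "09" and "1" to "12".
    static final Pattern MONTH = Pattern.compile("^(0?[1-9]|1[0-2])$");
    // Phone: optional area code in parentheses or followed by '-', then 7 or 8 digits.
    static final Pattern PHONE = Pattern.compile("^(\\(\\d{3,4}\\)|\\d{3,4}-)?\\d{7,8}$");
    // Password: starts with a letter, 6 to 18 word characters in total.
    static final Pattern PASSWORD = Pattern.compile("^[a-zA-Z]\\w{5,17}$");

    // True only if the entire input matches the pattern.
    static boolean isValid(Pattern p, String input) {
        return input != null && p.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid(MONTH, "09"));             // true
        System.out.println(isValid(MONTH, "13"));             // false
        System.out.println(isValid(PHONE, "021-1234567"));    // true
        System.out.println(isValid(PHONE, "12345"));          // false
        System.out.println(isValid(PASSWORD, "abc_123"));     // true
        System.out.println(isValid(PASSWORD, "1abcdef"));     // false
    }
}
```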

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
