Topic: use of regular expressions in Java

Source: Internet
Author: User

In Java, checking whether a string contains a given character or substring, splitting a string, or replacing or deleting characters in it is often done by hand with a combination of if-else and for loops, as follows:

Java code

    public class Test {
        public static void main(String[] args) {
            String str = "@Shang Hai Hong Qiao Fei Ji Chang";
            boolean rs = false;
            // Scan character by character and stop at the first 'a' or 'F'.
            for (int i = 0; i < str.length(); i++) {
                char z = str.charAt(i);
                if ('a' == z || 'F' == z) {
                    rs = true;
                    break;
                }
            }
            System.out.println(rs);
        }
    }

This approach is simple and intuitive, but it does not scale to more complicated tasks: the amount of code grows quickly and becomes hard to maintain.

In such cases we can use regular expressions instead, which keeps the code short and easy to maintain. The following sections describe several common regular-expression operations on strings in Java (all of them use the java.util.regex package):

1. Searching a string for a character or substring

Java code

    // requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
    String s = "@Shang Hai Hong Qiao Fei Ji Chang";
    String regEx = "a|F"; // matches a or F
    Pattern pat = Pattern.compile(regEx);
    Matcher mat = pat.matcher(s);
    boolean rs = mat.find();

If s contains a match for regEx, rs is true; otherwise it is false.

To ignore case during the search, compile the pattern with a flag: Pattern pat = Pattern.compile(regEx, Pattern.CASE_INSENSITIVE);
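The effect of the flag can be sketched as follows. This is a minimal illustration, not the article's own code: the class name CaseInsensitiveSearch and the helper contains are hypothetical, and the sample string is assumed to contain an uppercase F but no lowercase f.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaseInsensitiveSearch {
    // Hypothetical helper: reports whether `input` contains a match for `regex`,
    // optionally ignoring case.
    static boolean contains(String input, String regex, boolean ignoreCase) {
        int flags = ignoreCase ? Pattern.CASE_INSENSITIVE : 0;
        Matcher mat = Pattern.compile(regex, flags).matcher(input);
        return mat.find();
    }

    public static void main(String[] args) {
        String s = "@Shang Hai Hong Qiao Fei Ji Chang";
        System.out.println(contains(s, "f", false)); // case-sensitive search for lowercase f
        System.out.println(contains(s, "f", true));  // same search, ignoring case
    }
}
```

Run as written, this prints false and then true, because the only F in the sample string is uppercase.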

2. Extracting a substring from a file path

Java code

    String regEx = ".+\\\\(.+)$"; // the regex .+\\(.+)$ captures everything after the last backslash
    String s = "c:\\test.txt";    // the path c:\test.txt
    Pattern pat = Pattern.compile(regEx);
    Matcher mat = pat.matcher(s);
    boolean rs = mat.find();
    // group(i) may only be called after a successful match
    for (int i = 1; rs && i <= mat.groupCount(); i++) {
        System.out.println(mat.group(i));
    }

Running this prints test.txt. The extracted substrings are available through mat.group(i), where i ranges from 1 to mat.groupCount().
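With more than one pair of parentheses, each group can be read separately. The sketch below assumes a Windows-style path with backslash separators; the class PathParts and its two helpers are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PathParts {
    // Group 1 is the directory part, group 2 is everything after the last backslash.
    // In a Java string literal each regex backslash is doubled, so "\\\\" is the
    // regex \\ and matches one literal backslash.
    private static final Pattern WINDOWS_PATH = Pattern.compile("(.+)\\\\(.+)$");

    // Returns the part after the last backslash, or the input itself if there is none.
    static String fileName(String path) {
        Matcher mat = WINDOWS_PATH.matcher(path);
        return mat.find() ? mat.group(2) : path;
    }

    // Returns the part before the last backslash, or an empty string if there is none.
    static String directory(String path) {
        Matcher mat = WINDOWS_PATH.matcher(path);
        return mat.find() ? mat.group(1) : "";
    }

    public static void main(String[] args) {
        String path = "c:\\dir\\test.txt"; // the path c:\dir\test.txt
        System.out.println(directory(path));
        System.out.println(fileName(path));
    }
}
```

Because the first group is greedy, it takes as much of the path as it can, so the split always happens at the last backslash.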

3. Splitting a string

Java code

    String regEx = ":";
    Pattern pat = Pattern.compile(regEx);
    String[] rs = pat.split("aa:bb:cc");

After execution, rs is {"aa", "bb", "cc"}.

For a simple split like the one above, we generally use the shorter String.split method, which also takes a regular expression:

Java code

    String s = "aa:bb:cc";
    String[] rs = s.split(":");
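Because the argument of String.split is itself a regular expression, delimiters that are regex metacharacters need escaping. The following is a minimal sketch of both situations; the class SplitExamples, its helpers, and the sample inputs are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitExamples {
    // Splits on ':' or ';', swallowing any whitespace around the delimiter.
    static String[] splitFields(String line) {
        return line.split("\\s*[:;]\\s*");
    }

    // Splits on a literal dot. Pattern.quote turns the delimiter into a literal,
    // so the dot is not read as "any character".
    static String[] splitOnDots(String dotted) {
        return dotted.split(Pattern.quote("."));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitFields("aa : bb;cc")));
        System.out.println(Arrays.toString(splitOnDots("a.b.c")));
        // An unescaped dot treats every character as a delimiter, leaving nothing behind.
        System.out.println("a.b.c".split(".").length);
    }
}
```

This prints [aa, bb, cc], then [a, b, c], then 0. The last line shows why an unescaped "." is a common source of surprises.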

4. String replacement and deletion

Java code

    String regEx = "@+"; // one or more @
    Pattern pat = Pattern.compile(regEx);
    Matcher mat = pat.matcher("@aa@b cc@@");
    String s = mat.replaceAll("#");

The result is "#aa#b cc#": each run of one or more @ is replaced by a single #.
  
To delete every @ from the string, replace the matches with the empty string:

Java code

    String s = mat.replaceAll("");

The result is "aab cc".
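Replacement and deletion can be wrapped in one small helper, sketched below under the assumption that the input is "@aa@b cc@@". The class ReplaceExamples and the method replaceAtRuns are hypothetical names. Note that $ and \ have a special meaning in the replacement string, so a literal replacement is best passed through Matcher.quoteReplacement.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceExamples {
    // One or more consecutive @ characters.
    private static final Pattern AT_RUN = Pattern.compile("@+");

    // Replaces every run of @ with `replacement`; an empty replacement deletes the runs.
    static String replaceAtRuns(String input, String replacement) {
        return AT_RUN.matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String input = "@aa@b cc@@";
        System.out.println(replaceAtRuns(input, "#"));
        System.out.println(replaceAtRuns(input, ""));
        // quoteReplacement makes the dollar sign a literal instead of a group reference.
        System.out.println(replaceAtRuns(input, Matcher.quoteReplacement("$")));
    }
}
```

For simple cases the same result is available without an explicit Pattern, since input.replaceAll("@+", "#") compiles the expression internally.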

Notes on the Pattern class:
1. public final class java.util.regex.Pattern is the compiled representation of a regular expression.

The following statement creates a Pattern object and assigns it to the reference pat: Pattern pat = Pattern.compile(regEx);
Interestingly, Pattern is a final class and its constructor is private. The reasons are a matter of class design, which you can read up on separately; the practical consequence is that Pattern cannot be subclassed and we cannot create a Pattern object with new.
Instead, the Pattern class provides two overloaded static compile methods, each of which returns a Pattern object (reference). For example:

Java code

    public static Pattern compile(String regex) {
        return new Pattern(regex, 0);
    }

Of course, we can still declare a reference of type Pattern, such as Pattern pat = null;

2. pat.matcher(str) creates a matcher that will match the string str against the pattern; its return value is a reference to a Matcher object.
A compact form is: boolean rs = Pattern.compile(regEx).matcher(str).find();
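The one-line form compiles the expression on every call. When the same expression is used repeatedly, a common alternative is to compile it once and create a new Matcher per input; Pattern objects are immutable and safe to share between threads, while Matcher objects are not. A minimal sketch, with the hypothetical class ReusablePattern, which also shows how find() differs from matches():

```java
import java.util.regex.Pattern;

class ReusablePattern {
    // Compiled once and shared; each call below creates its own Matcher.
    private static final Pattern A_OR_F = Pattern.compile("a|F");

    // find(): true if the pattern matches anywhere in the string.
    static boolean containsAOrF(String str) {
        return A_OR_F.matcher(str).find();
    }

    // matches(): true only if the pattern matches the entire string.
    static boolean isExactlyAOrF(String str) {
        return A_OR_F.matcher(str).matches();
    }

    public static void main(String[] args) {
        System.out.println(containsAOrF("Shang"));  // contains an a
        System.out.println(isExactlyAOrF("Shang")); // more than a single a or F
        System.out.println(isExactlyAOrF("F"));
    }
}
```

This prints true, false, true.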

Appendix:
Common regular expressions:

Matching specific numbers:
^[1-9]\d*$ // match a positive integer
^-[1-9]\d*$ // match a negative integer
^-?[1-9]\d*$ // match a non-zero integer
^([1-9]\d*|0)$ // match a non-negative integer (positive integer + 0)
^(-[1-9]\d*|0)$ // match a non-positive integer (negative integer + 0)
^([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ // match a positive floating point number
^-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ // match a negative floating point number
^-?([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$ // match a floating point number
^([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$ // match a non-negative floating point number (positive floating point number + 0)
^(-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)|0?\.0+|0)$ // match a non-positive floating point number (negative floating point number + 0)
Note: these are useful when processing large amounts of data; adjust them as needed for the specific application.
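Two details matter when such patterns are used from Java. First, every backslash must be doubled in a string literal, so \d is written "\\d". Second, alternation binds more loosely than the anchors, so whether an alternation between ^ and $ is wrapped in a group changes what the pattern accepts. The sketch below compares both forms of the non-negative integer pattern; the class IntegerPatterns and its helpers are hypothetical names.

```java
import java.util.regex.Pattern;

class IntegerPatterns {
    // Ungrouped: read as (^[1-9]\d*) OR (0$), so each anchor applies to one branch only.
    private static final Pattern UNGROUPED = Pattern.compile("^[1-9]\\d*|0$");

    // Grouped: both anchors apply to the whole alternation.
    private static final Pattern GROUPED = Pattern.compile("^([1-9]\\d*|0)$");

    static boolean findUngrouped(String s) {
        return UNGROUPED.matcher(s).find();
    }

    static boolean findGrouped(String s) {
        return GROUPED.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"42", "0", "042", "12abc", "x0"}) {
            System.out.println(s + " ungrouped=" + findUngrouped(s) + " grouped=" + findGrouped(s));
        }
    }
}
```

With find(), the ungrouped form accepts "12abc" (a prefix matches the first branch) and "x0" (the second branch matches at the end), while the grouped form rejects both. Matcher.matches() always requires the whole input to match, which sidesteps the problem, but the grouped form behaves the same with either method.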

Matching specific strings:
^[A-Za-z]+$ // match a string consisting of the 26 English letters
^[A-Z]+$ // match a string consisting of the 26 uppercase letters
^[a-z]+$ // match a string consisting of the 26 lowercase letters
^[A-Za-z0-9]+$ // match a string consisting of digits and the 26 English letters
^\w+$ // match a string consisting of digits, the 26 English letters, or underscores

The following validation expressions are commonly used with the RegularExpressionValidator control:

Only digits can be entered: "^[0-9]*$"
Only n digits can be entered: "^\d{n}$"
At least n digits must be entered: "^\d{n,}$"
Only m to n digits can be entered: "^\d{m,n}$"
Only zero, or a number that does not start with zero, can be entered: "^(0|[1-9][0-9]*)$"
Only a positive number with two decimal places can be entered: "^[0-9]+(\.[0-9]{2})?$"
Only a positive number with 1 to 3 decimal places can be entered: "^[0-9]+(\.[0-9]{1,3})?$"
Only a non-zero positive integer can be entered: "^\+?[1-9][0-9]*$"
Only a non-zero negative integer can be entered: "^-[1-9][0-9]*$"
Only 3 characters can be entered: "^.{3}$"
Only a string consisting of the 26 English letters can be entered: "^[A-Za-z]+$"
Only a string consisting of the 26 uppercase letters can be entered: "^[A-Z]+$"
Only a string consisting of the 26 lowercase letters can be entered: "^[a-z]+$"
Only a string consisting of digits and the 26 English letters can be entered: "^[A-Za-z0-9]+$"
Only a string consisting of digits, the 26 English letters, or underscores can be entered: "^\w+$"
Verify a user password: "^[a-zA-Z]\w{5,17}$". The correct format starts with a letter, is 6 to 18 characters long, and contains only letters, digits, and underscores.
Check for characters such as % & ' , ; = ? $ and the double quote: "[^%&',;=?$\x22]+" (matches a run of characters that contains none of them)
Only Chinese characters can be entered: "^[\u4e00-\u9fa5]{0,}$"
Verify an email address: "^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$"
Verify an Internet URL: "^http://([\w-]+\.)+[\w-]+(/[\w./?%&=-]*)?$"
Verify a phone number: "^(\(\d{3,4}\)|\d{3,4}-)?\d{7,8}$". Correct formats: "XXXX-XXXXXXX", "XXXX-XXXXXXXX", "XXX-XXXXXXX", "XXX-XXXXXXXX", "XXXXXXX", "XXXXXXXX".
Verify an ID card number (15 or 18 digits): "^(\d{15}|\d{18})$"
Verify the 12 months of a year: "^(0?[1-9]|1[0-2])$". Correct formats: "01" to "09" and "1" to "12".
Verify the 31 days of a month: "^((0?[1-9])|((1|2)[0-9])|30|31)$". Correct formats: "01" to "09" and "1" to "31".
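Validation expressions like these can be applied in Java with Matcher.matches(), which succeeds only when the whole input matches. The following is a minimal sketch using the month and password expressions; the class InputValidators and its methods are hypothetical, and the backslash in \w is doubled because the pattern is written as a Java string literal.

```java
import java.util.regex.Pattern;

class InputValidators {
    // Months "1" to "12", with an optional leading zero for single digits.
    private static final Pattern MONTH = Pattern.compile("^(0?[1-9]|1[0-2])$");

    // A letter followed by 5 to 17 word characters (letters, digits, underscore).
    private static final Pattern PASSWORD = Pattern.compile("^[a-zA-Z]\\w{5,17}$");

    static boolean isMonth(String input) {
        return MONTH.matcher(input).matches();
    }

    static boolean isPassword(String input) {
        return PASSWORD.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isMonth("09"));         // valid month
        System.out.println(isMonth("13"));         // out of range
        System.out.println(isPassword("abc_12"));  // letter first, 6 characters
        System.out.println(isPassword("1abcde"));  // starts with a digit
    }
}
```

This prints true, false, true, false. Compiling each expression once as a constant avoids recompiling it for every input that is validated.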

Regular expression matching Chinese characters: [\u4e00-\u9fa5]
Regular expression matching double-byte characters (including Chinese characters): [^\x00-\xff]
Regular expression matching empty lines: \n[\s| ]*\r
Regular expression matching HTML tags: /<(.*)>.*|<(.*)\/>/
Regular expression matching leading and trailing spaces: (^\s*)|(\s*$)
Regular expression matching an email address: \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
Regular expression matching a URL: http://([\w-]+\.)+[\w-]+(/[\w./?%&=-]*)?
