Java Regular Expressions

Source: Internet
Author: User
In program development, strings constantly need to be matched, searched, replaced, and validated. These tasks can get complicated, and solving them with hand-written string-handling code often wastes the programmer's time and energy. Regular expressions are the main tool for solving this problem.

A regular expression describes a text pattern and is used for pattern matching and replacement. The sections below define it more precisely and show how Java supports it.

I. What is a regular expression?

1. Definition: A regular expression is a notation for describing text patterns that can be used for pattern matching and replacement. It consists of ordinary characters (such as the letters a to z) and special characters (metacharacters), and it describes one or more strings to look for when a body of text is searched. The regular expression acts as a template that is matched against the string being searched.

2. Purpose:

String Matching (character matching)

String search

String replacement

String segmentation

For example:

Extracting email addresses from a web page

Checking whether an IP address is valid

Extracting links from a web page
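The purposes listed above can be sketched with plain String methods. This is a minimal illustration, not a definitive implementation: the class and method names (RegexPurposes, looksLikeIp, maskDigits, splitFields) are made up for this example, and the IP check only verifies the dotted shape, not that each number is at most 255. Searching needs Matcher.find(), which the article covers under Method Analysis.

```java
import java.util.Arrays;

class RegexPurposes {
    // Matching: does the whole string have the shape of four 1-3 digit groups separated by dots?
    // Assumption: shape check only; "999.1.1.1" would also pass.
    static boolean looksLikeIp(String s) {
        return s.matches("\\d{1,3}(\\.\\d{1,3}){3}");
    }

    // Replacement: mask every digit with an asterisk.
    static String maskDigits(String s) {
        return s.replaceAll("\\d", "*");
    }

    // Splitting: break a line on commas, ignoring whitespace around each comma.
    static String[] splitFields(String s) {
        return s.split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIp("192.168.0.1"));              // true
        System.out.println(looksLikeIp("192.168.0.aaa"));            // false
        System.out.println(maskDigits("a2389a"));                    // a****a
        System.out.println(Arrays.toString(splitFields("x , y,z"))); // [x, y, z]
    }
}
```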

3. Classes for processing regular expressions in Java:

java.lang.String

java.util.regex.Pattern: the pattern class. It represents the pattern that a string is matched against. A Pattern is compiled once, so reusing it is more efficient than recompiling the expression for every match.

java.util.regex.Matcher: the matcher class. It holds the results of matching a pattern against a string, and there may be many results.

4. A small program gives a first look at regular expressions.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        // matches() checks whether the whole string matches the expression; "." stands for any single character
        p("abc".matches("..."));
        // Replace every digit in "a2389a" with *; \d stands for a digit 0-9
        p("a2389a".replaceAll("\\d", "*"));
        // Compile the pattern "exactly three letters from a to z"; a compiled pattern matches faster when reused
        Pattern p = Pattern.compile("[a-z]{3}");
        // Match the input and keep the result in a Matcher object
        Matcher m = p.matcher("abc");
        p(m.matches());
        // The three lines above can be replaced by this single line
        p("abc".matches("[a-z]{3}"));
    }

    public static void p(Object o) {
        System.out.println(o);
    }
}

The program prints:

true
a****a
true
true

Now some experiments illustrate the matching rules of regular expressions. The quantifiers used here are the greedy ones.

.  any character

a?  a, once or not at all

a*  a, zero or more times

a+  a, one or more times

a{n}  a, exactly n times

a{n,}  a, at least n times

a{n,m}  a, at least n times but not more than m times

// First look at . * + ?
p("a".matches(".")); // true
p("aa".matches("aa")); // true
p("aaaa".matches("a*")); // true
p("aaaa".matches("a+")); // true
p("".matches("a*")); // true
p("aaaa".matches("a?")); // false
p("".matches("a?")); // true
p("a".matches("a?")); // true
p("1232435463685899".matches("\\d{3,100}")); // true
p("192.168.0.aaa".matches("\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}")); // false
p("192".matches("[0-2][0-9][0-9]")); // true
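To make "greedy" concrete, the sketch below collects every match that find() reports. A greedy quantifier takes as many characters as it is allowed to, so \d{3,5} consumes five digits before the next search begins. The class name GreedyDemo and the helper findAll are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Collect every non-overlapping match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // The first match greedily takes 5 digits; the remaining 3 form a second match.
        System.out.println(findAll("\\d{3,5}", "12345678")); // [12345, 678]
        // Here the leftover "67" is too short for {3,5}, so there is only one match.
        System.out.println(findAll("\\d{3,5}", "1234567"));  // [12345]
        // a+ takes the whole run of a's each time.
        System.out.println(findAll("a+", "aaa-a"));          // [aaa, a]
    }
}
```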

[abc]  a, b, or c (simple class)

[^abc]  any character except a, b, or c (negation)

[a-zA-Z]  a through z or A through Z, inclusive (range)

[a-d[m-p]]  a through d, or m through p: [a-dm-p] (union)

[a-z&&[def]]  d, e, or f (intersection)

[a-z&&[^bc]]  a through z, except for b and c: [ad-z] (subtraction)

[a-z&&[^m-p]]  a through z, and not m through p: [a-lq-z] (subtraction)

// Ranges

p("a".matches("[abc]")); // true
p("a".matches("[^abc]")); // false
p("A".matches("[a-zA-Z]")); // true
p("A".matches("[a-z]|[A-Z]")); // true
p("A".matches("[a-z[A-Z]]")); // true
p("R".matches("[A-Z&&[RFG]]")); // true
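The union and subtraction forms are easiest to see with a few single-character checks and one replacement. The following is a small sketch under stated assumptions: ClassOpsDemo and hideConsonants are names invented for this example, and "consonant" here just means a lowercase ASCII letter that is not a, e, i, o, or u.

```java
class ClassOpsDemo {
    // Subtraction: lowercase letters minus the vowels, each replaced by an underscore.
    static String hideConsonants(String s) {
        return s.replaceAll("[a-z&&[^aeiou]]", "_");
    }

    public static void main(String[] args) {
        // Union: a-d or m-p
        System.out.println("n".matches("[a-d[m-p]]"));    // true
        System.out.println("e".matches("[a-d[m-p]]"));    // false
        // Subtraction: a-z but not m-p
        System.out.println("q".matches("[a-z&&[^m-p]]")); // true
        System.out.println("m".matches("[a-z&&[^m-p]]")); // false
        System.out.println(hideConsonants("hello world")); // _e__o _o___
    }
}
```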

\d  a digit: [0-9]

\D  a non-digit: [^0-9]

\s  a whitespace character: [ \t\n\x0B\f\r]

\S  a non-whitespace character: [^\s]

\w  a word character: [a-zA-Z_0-9]

\W  a non-word character: [^\w]

// Getting to know \s \w \d \
p(" \n\r\t".matches("\\s{4}")); // true: space, \n, \r, and \t are four whitespace characters
p(" ".matches("\\S")); // false
p("a_8".matches("\\w{3}")); // true
p("abc888&^%".matches("[a-z]{1,3}\\d+[&^#%]+")); // true
p("\\".matches("\\\\")); // true
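The doubled backslashes deserve a closer look: the Java compiler first turns "\\d" into the two characters \d, and only then does the regex engine read \d as "a digit". The sketch below illustrates this and also uses Pattern.quote, a standard method that makes the regex engine treat a string literally. The class name EscapeDemo is hypothetical.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    public static void main(String[] args) {
        // The Java literal "\\d" holds two characters: a backslash and the letter d.
        System.out.println("\\d".length());               // 2
        // An unescaped dot matches any character...
        System.out.println("axb".matches("a.b"));         // true
        // ...while an escaped dot matches only a literal dot.
        System.out.println("axb".matches("a\\.b"));       // false
        System.out.println("a.b".matches("a\\.b"));       // true
        // Pattern.quote escapes a whole string so it is matched literally.
        System.out.println("a.b".matches(Pattern.quote("a.b"))); // true
        System.out.println("axb".matches(Pattern.quote("a.b"))); // false
    }
}
```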

Boundary

^  the beginning of a line

$  the end of a line

\b  a word boundary

\B  a non-word boundary

\A  the beginning of the input

\G  the end of the previous match

\Z  the end of the input but for the final terminator, if any

\z  the end of the input

// Boundary matching
p("hello sir".matches("^h.*")); // true
p("hello sir".matches(".*ir$")); // true
p("hello sir".matches("^h[a-z]{1,3}o\\b.*")); // true
p("hellosir".matches("^h[a-z]{1,3}o\\b.*")); // false
// Blank line: zero or more whitespace characters other than a line break, followed by a line break at the end
p(" \n".matches("^[\\s&&[^\\n]]*\\n$")); // true
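Boundaries matter most when searching inside longer text. The sketch below counts whole-word occurrences with \b and shows that ^ matches at the start of every line only when the standard Pattern.MULTILINE flag is set; without it, ^ matches only at the start of the input. BoundaryDemo, countWholeWord, and countMatches are hypothetical names used for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryDemo {
    // Count how many times the pattern is found in the text.
    static int countMatches(Pattern p, String text) {
        Matcher m = p.matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    // Count occurrences of word that are delimited by word boundaries on both sides.
    static int countWholeWord(String word, String text) {
        return countMatches(Pattern.compile("\\b" + Pattern.quote(word) + "\\b"), text);
    }

    public static void main(String[] args) {
        // "siren" contains "sir" but has no word boundary after it, so it is not counted.
        System.out.println(countWholeWord("sir", "hello sir, the siren rang, sir")); // 2
        String lines = "hello\nhi\nbye";
        System.out.println(countMatches(Pattern.compile("^h"), lines));                    // 1
        System.out.println(countMatches(Pattern.compile("^h", Pattern.MULTILINE), lines)); // 2
    }
}
```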

Method Analysis

matches(): matches the entire string

find(): finds the next substring that matches

lookingAt(): always matches from the beginning of the string, but does not require the whole string to match

// Email
p("asdsfdfagf@adsdsfd.com".matches("[\\w[.-]]+@[\\w[.-]]+\\.[\\w]+")); // true
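The same kind of pattern can pull addresses out of a longer text with find(), which is the "extract email addresses from a web page" use case mentioned earlier. This is a minimal sketch: the pattern is a deliberate simplification rather than a full address grammar, and the class name EmailFinder, the method findEmails, and the sample addresses are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmailFinder {
    // Simplified shape: word characters, dots, or hyphens, then @, then a dotted domain.
    private static final Pattern EMAIL = Pattern.compile("[\\w.-]+@[\\w.-]+\\.\\w+");

    // Return every substring of text that has the simplified email shape.
    static List<String> findEmails(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Contact alice@example.com or bob.smith@mail.example.org today.";
        System.out.println(findEmails(text)); // [alice@example.com, bob.smith@mail.example.org]
    }
}
```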

// matches() find() lookingAt()
Pattern p = Pattern.compile("\\d{3,5}");
Matcher m = p.matcher("123-34345-234-00");

// matches() makes the regex engine try to match the entire input "123-34345-234-00". It stops at the first "-", which does not match,
// but it does not give back the characters it has already consumed.
p(m.matches());
// reset() gives those characters back by returning the matcher to the start of the input.
m.reset();

// 1: With only p(m.matches()) in front, find() searches from "...34345-234-00" onward.
//    The first and second calls find "34345" and "234"; the last two calls find nothing and return false.
// 2: With both p(m.matches()) and m.reset() in front, find() searches from the start of "123-34345-234-00".
//    The results are true, true, true, false.
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
// When nothing was found, start() and end() throw java.lang.IllegalStateException.
// p(m.start() + "---" + m.end());

// lookingAt() always starts from the beginning of the input, so every call here returns true.
p(m.lookingAt());
p(m.lookingAt());
p(m.lookingAt());
p(m.lookingAt());
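The three methods can be compared side by side in one runnable sketch. To keep the results independent of matcher state, this version creates a fresh Matcher for each method instead of relying on reset(). MatcherMethodsDemo and describe are hypothetical names for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherMethodsDemo {
    // Summarize what matches(), lookingAt(), and repeated find() report for one input.
    static String describe(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        StringBuilder sb = new StringBuilder();
        // A fresh matcher per method avoids one call influencing the next.
        sb.append("matches=").append(p.matcher(input).matches());
        sb.append(" lookingAt=").append(p.matcher(input).lookingAt());
        Matcher m = p.matcher(input);
        while (m.find()) {
            // start() is inclusive, end() is exclusive.
            sb.append(" [").append(m.start()).append(",").append(m.end()).append(")");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("\\d{3,5}", "123-34345-234-00"));
        // matches=false lookingAt=true [0,3) [4,9) [10,13)
    }
}
```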

String replacement: the approach below gives very flexible control over how each match is replaced.

// String replacement
// Pattern.CASE_INSENSITIVE makes the match case-insensitive
Pattern p = Pattern.compile("java", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("java Java jAva ILoveJavA youHateJAVA adsdsfd");
// Buffer that collects the result
StringBuffer buf = new StringBuffer();
// Counter used to tell odd matches from even ones
int i = 0;
while (m.find()) {
    i++;
    if (i % 2 == 0) {
        m.appendReplacement(buf, "java");
    } else {
        m.appendReplacement(buf, "JAVA");
    }
}
// Without this call, the trailing text "adsdsfd" would be dropped.
m.appendTail(buf);
p(buf);

Printed result:

JAVA java JAVA ILovejava youHateJAVA adsdsfd
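The flexibility comes from computing each replacement from the text that was matched. As a further sketch of the same appendReplacement and appendTail technique, the example below doubles every number in a string. NumberDoubler and doubleNumbers are hypothetical names, and the code assumes the numbers fit in an int. If a replacement could contain $ or \, it would need to be passed through Matcher.quoteReplacement first.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberDoubler {
    // Replace every run of digits with twice its numeric value.
    static String doubleNumbers(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        StringBuffer buf = new StringBuffer();
        while (m.find()) {
            int value = Integer.parseInt(m.group());
            // The replacement is built from the matched text itself.
            m.appendReplacement(buf, String.valueOf(value * 2));
        }
        // Copy whatever follows the last match.
        m.appendTail(buf);
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("3 apples and 12 pears")); // 6 apples and 24 pears
    }
}
```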

Group

// Groups, ()
Pattern p = Pattern.compile("(\\d{3,5})([a-z]{2})");
String s = "123aa-34345bb-234cc-00";
Matcher m = p.matcher(s);
p(m.groupCount()); // 2 groups
while (m.find()) {
    p(m.group()); // the whole match: digits and letters
    // p(m.group(1)); // digits only
    // p(m.group(2)); // letters only
}
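A runnable sketch of the same idea reads both groups from every match. Group 0 is the whole match, and groups 1 and 2 are numbered by the order of their opening parentheses. GroupDemo and pairs are hypothetical names, and the "digits:letters" output format is chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // For each match, return "digits:letters" built from group 1 and group 2.
    static List<String> pairs(String input) {
        Matcher m = Pattern.compile("(\\d{3,5})([a-z]{2})").matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group(1) + ":" + m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        // The trailing "00" has too few digits and no letters, so it is not matched.
        System.out.println(pairs("123aa-34345bb-234cc-00")); // [123:aa, 34345:bb, 234:cc]
    }
}
```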

II. Simple use of regular expressions
