Java Regular Expressions Basics: Getting Started with Regular Expressions

Source: Internet
Author: User

During development you will inevitably need to match, search, replace, and validate strings. These tasks can be complex, and solving them with hand-written code alone often wastes the programmer's time and energy. Learning and using regular expressions is the main way to resolve this.

One: What is a regular expression

1. Definition: A regular expression is a specification for pattern matching and substitution. It is a text pattern made up of ordinary characters (such as the letters a through z) and special characters (metacharacters), and it describes one or more strings to match when searching a body of text. The regular expression acts as a template that is matched against the string being searched.

2. Uses:

String matching (character matching)

String searching

String replacement

String splitting

For example:

Extract email addresses from a web page

Check whether an IP address is valid

Find links in a web page
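The four uses listed above can be sketched with the standard library alone. This is a minimal illustration, not code from the original article: the class name RegexUses and its helper methods are made up for the example, and the patterns are deliberately simple.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, one method per use case.
class RegexUses {
    // Matching: does the whole string consist of digits?
    static boolean isAllDigits(String s) {
        return s.matches("\\d+");
    }

    // Searching: return the first run of digits, or null if there is none.
    static String firstNumber(String s) {
        Matcher m = Pattern.compile("\\d+").matcher(s);
        return m.find() ? m.group() : null;
    }

    // Replacement: mask every digit with '*'.
    static String maskDigits(String s) {
        return s.replaceAll("\\d", "*");
    }

    // Splitting: break a string on commas with optional surrounding whitespace.
    static String[] splitOnCommas(String s) {
        return s.split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("2389"));                       // true
        System.out.println(firstNumber("order 42 of 100"));            // 42
        System.out.println(maskDigits("a2389a"));                      // a****a
        System.out.println(Arrays.toString(splitOnCommas("a , b,c"))); // [a, b, c]
    }
}
```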

3. Classes that handle regular expressions in Java:

java.lang.String

java.util.regex.Pattern: the pattern class. A pattern is compiled once before strings are matched against it, which makes repeated matching much more efficient.

java.util.regex.Matcher: the matcher class. It holds the results of matching a pattern against a string, and there may be many results.
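To illustrate why a compiled Pattern pays off, here is a small sketch that compiles a pattern once and reuses it across several inputs. The class name PrecompiledPattern and the method countMatching are assumptions made for this example. By contrast, String.matches compiles its pattern again on every call.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class: one compiled Pattern, many Matchers.
class PrecompiledPattern {
    // Compiled once and reused; Pattern instances are immutable and safe to share.
    private static final Pattern THREE_LETTERS = Pattern.compile("[a-z]{3}");

    // Count how many inputs consist of exactly three lowercase letters.
    static int countMatching(String[] inputs) {
        int count = 0;
        for (String input : inputs) {
            Matcher m = THREE_LETTERS.matcher(input); // a new Matcher per input
            if (m.matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] inputs = {"abc", "ab", "xyz", "ABCD"};
        System.out.println(countMatching(inputs)); // 2
    }
}
```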

4. The following small program introduces basic regular expression usage:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        // matches() checks whether the whole string matches an expression; "." stands for any one character
        p("abc".matches("..."));
        // Replace every digit in "a2389a" with *; \d stands for a digit 0-9
        p("a2389a".replaceAll("\\d", "*"));
        // Compile a pattern for any string of three letters a-z; compiling in advance speeds up matching
        Pattern p = Pattern.compile("[a-z]{3}");
        // Match, and keep the result in the Matcher object
        Matcher m = p.matcher("abc");
        p(m.matches());
        // The three lines above can be replaced with the following single line
        p("abc".matches("[a-z]{3}"));
    }

    // Print helper used by all the examples
    public static void p(Object o) {
        System.out.println(o);
    }
}

Here is the printed result:

true
a****a
true
true

The following experiments illustrate the matching rules of regular expressions. The quantifiers shown here are greedy:

. Any character

X? X, once or not at all

X* X, zero or more times

X+ X, one or more times

X{n} X, exactly n times

X{n,} X, at least n times

X{n,m} X, at least n times but not more than m times

// Getting to know . * + ?
p("a".matches("."));     // true
p("aa".matches("aa"));   // true
p("aaaa".matches("a*")); // true
p("aaaa".matches("a+")); // true
p("".matches("a*"));     // true
p("aaaa".matches("a?")); // false
p("".matches("a?"));     // true
p("a".matches("a"));     // true
p("1232435463685899".matches("\\d{3,100}")); // true
p("192.168.0.aaa".matches("\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}")); // false
p("192".matches("[0-2][0-9][0-9]")); // true
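Since matches() only reports whether the whole string fits, it does not show how much a greedy quantifier takes. The sketch below uses find() and group() to make that visible. The class GreedyDemo and the helper firstMatch are hypothetical names introduced for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo: what does a greedy quantifier actually consume?
class GreedyDemo {
    // Return the text of the first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("a{2,3}", "aaaa"));          // aaa (takes as many as allowed)
        System.out.println(firstMatch("a+", "aaaab"));             // aaaa
        System.out.println(firstMatch("a?", "aaaa"));              // a
        System.out.println(firstMatch("\\d{1,3}", "192.168.0.1")); // 192
        // a* can match zero characters, so the first match is the empty string before 'b'
        System.out.println("[" + firstMatch("a*", "baaa") + "]");  // []
    }
}
```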

[abc] a, b, or c (simple class)

[^abc] Any character except a, b, or c (negation)

[a-zA-Z] a through z or A through Z, inclusive (range)

[a-d[m-p]] a through d, or m through p: [a-dm-p] (union)

[a-z&&[def]] d, e, or f (intersection)

[a-z&&[^bc]] a through z, except for b and c: [ad-z] (subtraction)

[a-z&&[^m-p]] a through z, and not m through p: [a-lq-z] (subtraction)

// Ranges
p("a".matches("[abc]"));        // true
p("a".matches("[^abc]"));       // false
p("A".matches("[a-zA-Z]"));     // true
p("A".matches("[a-z]|[A-Z]"));  // true
p("A".matches("[a-z[A-Z]]"));   // true
p("R".matches("[A-Z&&[RFG]]")); // true
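The union, intersection, and subtraction forms in the table can be checked character by character. The following sketch filters a string of candidate characters through a character class. CharClassDemo and matchingChars are names invented for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo: which candidate characters does a character class accept?
class CharClassDemo {
    // Keep only the characters of candidates that match the given character class.
    static String matchingChars(String charClass, String candidates) {
        Pattern p = Pattern.compile(charClass);
        StringBuilder sb = new StringBuilder();
        for (char c : candidates.toCharArray()) {
            if (p.matcher(String.valueOf(c)).matches()) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(matchingChars("[a-d[m-p]]", "abcdefmnopq")); // abcdmnop (union)
        System.out.println(matchingChars("[a-z&&[def]]", "abcdefg"));   // def (intersection)
        System.out.println(matchingChars("[a-z&&[^bc]]", "abcdef"));    // adef (subtraction)
        System.out.println(matchingChars("[a-z&&[^m-p]]", "klmnopqr")); // klqr (subtraction)
    }
}
```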

\d A digit: [0-9]

\D A non-digit: [^0-9]

\s A whitespace character: [ \t\n\x0B\f\r]

\S A non-whitespace character: [^\s]

\w A word character: [a-zA-Z_0-9]

\W A non-word character: [^\w]

// Getting to know \s \w \d \
p(" \n\r\t".matches("\\s{4}")); // true
p(" ".matches("\\S"));          // false
p("a_8".matches("\\w{3}"));     // true
p("abc888&^%".matches("[a-z]{1,3}\\d+[&^#%]+")); // true
// A single backslash: written "\\" in a Java string and "\\\\" in a regex inside a Java string
p("\\".matches("\\\\"));        // true
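The negated classes \D, \S, and \W are often most useful in replaceAll and split. Here is a brief sketch under that assumption. The class PredefinedClassDemo and its three helpers are illustrative names only.

```java
import java.util.Arrays;

// Hypothetical demo of the predefined classes in everyday string cleanup.
class PredefinedClassDemo {
    // Remove everything that is not a digit (\D).
    static String digitsOnly(String s) {
        return s.replaceAll("\\D", "");
    }

    // Split on runs of whitespace (\s+) after trimming the ends.
    static String[] words(String s) {
        return s.trim().split("\\s+");
    }

    // Replace each run of non-word characters (\W+) with a single hyphen.
    static String hyphenate(String s) {
        return s.replaceAll("\\W+", "-");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("+1 (555) 010-9999"));                       // 15550109999
        System.out.println(Arrays.toString(words("  hello \t regex\nworld ")));   // [hello, regex, world]
        System.out.println(hyphenate("abc888&^%def"));                             // abc888-def
    }
}
```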

Boundary matchers

^ The beginning of a line

$ The end of a line

\b A word boundary

\B A non-word boundary

\A The beginning of the input

\G The end of the previous match

\Z The end of the input but for the final terminator, if any

\z The end of the input

// Boundary matching
p("hello sir".matches("^h.*"));  // true
p("hello sir".matches(".*ir$")); // true
p("hello sir".matches("^h[a-z]{1,3}o\\b.*")); // true
p("hellosir".matches("^h[a-z]{1,3}o\\b.*"));  // false
// Blank line: zero or more whitespace characters other than newline, followed by a newline
p(" \n".matches("^[\\s&&[^\\n]]*\\n$")); // true
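As a small illustration of \b outside of matches(), the sketch below counts whole-word occurrences and wraps the blank-line pattern from the example above in a helper. BoundaryDemo, countWord, and isBlankLine are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo of word boundaries and the blank-line pattern.
class BoundaryDemo {
    // Count occurrences of word that are not part of a longer word.
    static int countWord(String word, String text) {
        // Pattern.quote treats the word as a literal, so metacharacters in it are harmless
        Matcher m = Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // True if the line holds only non-newline whitespace followed by one newline.
    static boolean isBlankLine(String line) {
        return line.matches("^[\\s&&[^\\n]]*\\n$");
    }

    public static void main(String[] args) {
        System.out.println(countWord("sir", "hello sir, sirloin and sir")); // 2
        System.out.println(isBlankLine(" \t\n")); // true
        System.out.println(isBlankLine("x\n"));   // false
    }
}
```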

Method overview

matches(): matches the entire string

find(): finds the next matching substring

lookingAt(): always starts matching from the beginning of the string, but does not need to match all of it
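The difference between the three methods is easiest to see on the same input. The following sketch is an illustration only. The class MatchMethodsDemo and the method describe are made-up names, and the pattern \d{3,5} is borrowed from the article's later example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo: matches() vs lookingAt() vs find() on one input.
class MatchMethodsDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d{3,5}");

    // Returns "<matches> <lookingAt> <find>" for the given input.
    static String describe(String input) {
        Matcher m = DIGITS.matcher(input);
        boolean whole = m.matches();    // the entire input must match
        boolean prefix = m.lookingAt(); // a match must start at index 0
        m.reset();                      // start the search from the beginning again
        boolean anywhere = m.find();    // a match may start anywhere
        return whole + " " + prefix + " " + anywhere;
    }

    public static void main(String[] args) {
        System.out.println(describe("12345"));   // true true true
        System.out.println(describe("123-456")); // false true true
        System.out.println(describe("ab-456"));  // false false true
        System.out.println(describe("ab"));      // false false false
    }
}
```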

// Email
p("asdsfdfagf@adsdsfd.com".matches("[\\w[.-]]+@[\\w[.-]]+\\.[\\w]+")); // true
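The same email pattern can also be used with find() to pull addresses out of a larger text, which is the "extract email addresses from a web page" use mentioned earlier. This is a sketch under the assumption that the simplified pattern is good enough; it is not a full address validator. EmailExtractor and extract are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that collects every substring matching the article's email pattern.
class EmailExtractor {
    // Simplified pattern from the example above; real-world addresses can be more varied.
    private static final Pattern EMAIL = Pattern.compile("[\\w[.-]]+@[\\w[.-]]+\\.[\\w]+");

    static List<String> extract(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Contact alice@example.com or bob.smith@mail.example.org.";
        System.out.println(extract(text)); // [alice@example.com, bob.smith@mail.example.org]
    }
}
```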

// matches(), find(), lookingAt()
Pattern p = Pattern.compile("\\d{3,5}");
Matcher m = p.matcher("123-34345-234-00");

// matches() tries to match the entire string "123-34345-234-00". It fails at the first "-",
// but the matcher does not give back the characters it has already consumed.
p(m.matches());
// reset() gives the consumed characters back, so the next search starts from the beginning.
m.reset();

// Case 1: with p(m.matches()) but without m.reset(), find() continues after the consumed "123",
// so it finds "34345" and "234", and the four find() calls print true, true, false, false.
// This carry-over is observed behavior rather than something the Javadoc promises, so call reset() when in doubt.
// Case 2: with both p(m.matches()) and m.reset(), find() starts from the beginning of "123-34345-234-00",
// and the four find() calls print true, true, true, false.
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
p(m.start() + "---" + m.end());
p(m.find());
// After a failed find(), start() and end() throw java.lang.IllegalStateException,
// so the following line is left commented out.
// p(m.start() + "---" + m.end());

// lookingAt() always starts from the beginning, so every call prints true
p(m.lookingAt());
p(m.lookingAt());
p(m.lookingAt());
p(m.lookingAt());
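A common way to avoid the IllegalStateException is to call start(), end(), and group() only inside a while (m.find()) loop. The sketch below collects every match of the same pattern that way. FindAllDemo and findAll are names assumed for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that lists every match with its start and end index.
class FindAllDemo {
    static List<String> findAll(String regex, String input) {
        List<String> spans = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        // start(), end() and group() are only valid after a successful find()
        while (m.find()) {
            spans.add(m.start() + "---" + m.end() + " " + m.group());
        }
        return spans;
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d{3,5}", "123-34345-234-00"));
        // [0---3 123, 4---9 34345, 10---13 234]
    }
}
```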

String replacement: the following approach is a very flexible way to replace strings

// String replacement
// Pattern.CASE_INSENSITIVE makes matching ignore case
Pattern p = Pattern.compile("java", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("Java Java Java ilovejava Youhatejava adsdsfd");
// Buffer that collects the result
StringBuffer buf = new StringBuffer();
// Counter used to tell odd matches from even ones
int i = 0;
while (m.find()) {
    i++;
    if (i % 2 == 0) {
        m.appendReplacement(buf, "Java");
    } else {
        m.appendReplacement(buf, "JAVA");
    }
}
// Without this call, the trailing text " adsdsfd" would be dropped
m.appendTail(buf);
p(buf);

Printed result:

JAVA Java JAVA iloveJava YouhateJAVA adsdsfd
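On Java 9 or later, the same alternating replacement can also be written with Matcher.replaceAll and a replacement function. This is an alternative sketch, not the article's method. The class AlternatingReplace and the method alternate are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical alternative using Matcher.replaceAll(Function), available since Java 9.
class AlternatingReplace {
    private static final Pattern JAVA = Pattern.compile("java", Pattern.CASE_INSENSITIVE);

    // Replace odd-numbered matches with "JAVA" and even-numbered ones with "Java".
    static String alternate(String input) {
        int[] count = {0}; // one-element array so the lambda can update the counter
        return JAVA.matcher(input)
                   .replaceAll(match -> (++count[0] % 2 == 0) ? "Java" : "JAVA");
    }

    public static void main(String[] args) {
        System.out.println(alternate("Java Java Java ilovejava Youhatejava adsdsfd"));
        // JAVA Java JAVA iloveJava YouhateJAVA adsdsfd
    }
}
```

The explicit appendReplacement and appendTail loop remains the more general tool, since it lets other work happen between matches.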

Groups

// Groups are delimited by ()
Pattern p = Pattern.compile("(\\d{3,5})([a-z]{2})");
String s = "123aa-34345bb-234cc-00";
Matcher m = p.matcher(s);
p(m.groupCount()); // 2 groups
while (m.find()) {
    p(m.group());  // digits and letters together
    p(m.group(1)); // digits only
    p(m.group(2)); // letters only
}
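Groups also help with the "check whether an IP address is valid" use mentioned earlier: the regular expression checks the shape, and each captured group is then checked numerically. This is a minimal sketch under that assumption. Ipv4Check and isValid are names invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical IPv4 check: regex for the shape, groups for the numeric range.
class Ipv4Check {
    private static final Pattern IPV4 =
            Pattern.compile("(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})");

    static boolean isValid(String s) {
        Matcher m = IPV4.matcher(s);
        if (!m.matches()) {
            return false; // wrong shape, for example letters or too few parts
        }
        // Groups are numbered from 1; group 0 is the whole match
        for (int g = 1; g <= m.groupCount(); g++) {
            if (Integer.parseInt(m.group(g)) > 255) {
                return false; // each part must be in 0..255
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid("192.168.0.1"));   // true
        System.out.println(isValid("192.168.0.aaa")); // false
        System.out.println(isValid("300.1.1.1"));     // false
    }
}
```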

Two: Simple use of regular expressions

Java Regular Expressions use the
