JAVA 65th-Regular Expressions

Source: Internet
Author: User
Tags character classes

Regular expression: a notation built from special symbols that is mainly used to operate on strings.

Example:

QQ number verification

A QQ number has 6 to 9 digits, cannot begin with 0, and must consist only of digits.

The matches method in the String class

matches(String regex)
Tells whether or not this string matches the given regular expression.

regex is the given regular expression.

public static void checkQQ() {
    // The first digit is 1-9 and every remaining digit is 0-9;
    // there are 5 to 8 digits after the first one.
    String regex = "[1-9][0-9]{5,8}"; // regular expression
    String qq = "123459";
    boolean flag = qq.matches(regex);
    System.out.println(qq + ":" + flag);
}
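To make the rule above easier to try out, here is a minimal sketch that runs the same regular expression against several candidate strings. The class name QqCheckDemo and the helper isValidQq are made up for this illustration. It relies on the fact that String.matches only succeeds when the whole string matches the expression.

```java
final class QqCheckDemo {
    // Same rule as above: first digit 1-9, then 5 to 8 more digits.
    static final String QQ_REGEX = "[1-9][0-9]{5,8}";

    // Hypothetical helper: true only if the entire string satisfies the rule.
    static boolean isValidQq(String candidate) {
        return candidate.matches(QQ_REGEX);
    }

    public static void main(String[] args) {
        String[] samples = {"123459", "012345", "12345", "1234567890", "12345a"};
        for (String s : samples) {
            System.out.println(s + ":" + isValidQq(s));
        }
    }
}
```

Only the first sample passes: the others start with 0, are too short, are too long, or contain a letter.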

PS: Regular expressions make the code shorter to write, but much harder to read.

Meaning of the symbols

Regular expressions are hard to read because they contain so many symbols.

Predefined character classes
. Any character (may or may not match line terminators)
\d A digit: [0-9]
\D A non-digit: [^0-9]
\s A whitespace character: [ \t\n\x0B\f\r]
\S A non-whitespace character: [^\s]
\w A word character: [a-zA-Z_0-9]
\W A non-word character: [^\w]
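The predefined classes above can be tried with a few one-line checks. This is only a sketch; the class PredefinedClassDemo and its helper names are invented for the example. Note that each backslash has to be doubled inside a Java string literal.

```java
final class PredefinedClassDemo {
    // \d+ : one or more digits, nothing else
    static boolean isDigits(String s) {
        return s.matches("\\d+");
    }

    // \w+ : one or more word characters (letters, digits, underscore)
    static boolean isWord(String s) {
        return s.matches("\\w+");
    }

    // \s+ : replace every run of whitespace with a single space
    static String collapseWhitespace(String s) {
        return s.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(isDigits("2024"));
        System.out.println(isDigits("20a4"));
        System.out.println(isWord("user_01"));
        System.out.println(isWord("user-01"));
        System.out.println(collapseWhitespace("a \t\n b"));
    }
}
```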
Character classes
[abc] a, b, or c (simple class)
[^abc] Any character except a, b, or c (negation)
[a-zA-Z] a through z or A through Z, inclusive (range)
[a-d[m-p]] a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]] d, e, or f (intersection)
[a-z&&[^bc]] a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]] a through z, and not m through p: [a-lq-z] (subtraction)
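The union, intersection, and subtraction forms in the table are easier to follow when tested on single characters. The sketch below does that; CharClassDemo and its helper is are hypothetical names used only here.

```java
final class CharClassDemo {
    // Hypothetical helper: does the single character c belong to the class?
    static boolean is(String characterClass, char c) {
        return String.valueOf(c).matches(characterClass);
    }

    public static void main(String[] args) {
        System.out.println(is("[a-d[m-p]]", 'n'));     // union: in m-p
        System.out.println(is("[a-d[m-p]]", 'f'));     // union: in neither range
        System.out.println(is("[a-z&&[def]]", 'e'));   // intersection
        System.out.println(is("[a-z&&[def]]", 'a'));   // intersection excludes a
        System.out.println(is("[a-z&&[^bc]]", 'b'));   // subtraction removes b
        System.out.println(is("[a-z&&[^m-p]]", 'q'));  // subtraction keeps q
    }
}
```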

Boundary matchers
^ The beginning of a line
$ The end of a line
\b A word boundary
\B A non-word boundary
\A The beginning of the input
\G The end of the previous match
\Z The end of the input but for the final terminator, if any
\z The end of the input
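Two of the boundary matchers above are easy to confuse with plain text matching, so here is a small sketch. It counts whole-word hits with \b and contrasts \Z with \z on an input that ends in a line terminator. BoundaryDemo, countMatches, and found are names invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BoundaryDemo {
    // Counts how many times the pattern is found in the input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // True if the pattern is found anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String text = "cat concat cat's category";
        System.out.println(countMatches("cat", text));       // every occurrence
        System.out.println(countMatches("\\bcat\\b", text)); // whole words only
        System.out.println(found("abc\\Z", "abc\n"));        // final terminator is ignored
        System.out.println(found("abc\\z", "abc\n"));        // must be the very end
    }
}
```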

Greedy quantifiers
X? X, once or not at all
X* X, zero or more times
X+ X, one or more times
X{n} X, exactly n times
X{n,} X, at least n times
X{n,m} X, at least n but not more than m times
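"Greedy" means each quantifier takes as many characters as it can while still letting the overall match succeed. The sketch below shows this by returning the first match of a pattern; QuantifierDemo and firstMatch are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuantifierDemo {
    // Hypothetical helper: the first substring that matches, or null if none does.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("a.*b", "aXbXXb"));   // runs to the last b
        System.out.println(firstMatch("colou?r", "color")); // u is optional
        System.out.println(firstMatch("ab{2}", "abbbb"));   // exactly two b
        System.out.println(firstMatch("ab{2,3}", "abbbb")); // takes three, the maximum
        System.out.println(firstMatch("ab+", "a"));         // needs at least one b
    }
}
```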

Logical operators
XY X followed by Y
X|Y Either X or Y
(X) X, as a capturing group

Back references
\n Whatever the nth capturing group matched

public static void check() {
    String string = "aoooooz";
    String regex = "ao{4,}z"; // regular expression: a, then at least four o, then z
    boolean flag = string.matches(regex);
    System.out.println(string + ":" + flag);
}

Common functions

1. Match 2. Split 3. Replace 4. Get

Match: uses the matches method of the String class.

public static void check() {
    // Check whether the mobile phone number is valid.
    String tel = "18753377511";
    // The first digit is 1, the second digit is 3, 5, or 8, and 9 more digits follow.
    // String regex = "1[358][0-9]{9}";
    // In a Java string literal, \ starts an escape sequence,
    // so write \\ to pass a single \ to the regular expression.
    String regex = "1[358]\\d{9}";
    boolean flag = tel.matches(regex);
    System.out.println(tel + ":" + flag);
}


Split: this uses the split(String regex) method of the String class that was used earlier. Earlier the argument was a plain string such as " "; a delimiter made of ordinary (non-special) characters, such as a space, already works as a rule.

Space

public static void check() {
    // Split on spaces; the space may appear several times in a row.
    String str = "a   b  c d    e f";
    String regex = " +";
    String[] line = str.split(regex);
    for (String i : line) {
        System.out.println(i);
    }
}

Dot. PS: the dot itself is a special symbol in regular expressions (it matches any character).

String str = "a.b.c.d.e..f";
// \. escapes the dot for the regular expression, and the \ must be escaped again in the Java string.
String regex = "\\.+";
String[] line = str.split(regex);
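The reason the dot has to be escaped shows up clearly when the three variants are compared side by side. The following is a sketch with invented names (SplitDotDemo and its three helpers). Pattern.quote is a standard way to treat a string as a literal instead of escaping it by hand.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class SplitDotDemo {
    // Unescaped: "." matches every character, so only empty pieces remain
    // and split drops them all.
    static String[] unescaped(String s) {
        return s.split(".");
    }

    // Escaped, one or more dots as a single delimiter.
    static String[] escaped(String s) {
        return s.split("\\.+");
    }

    // Quoted: each single dot is a literal delimiter.
    static String[] quoted(String s) {
        return s.split(Pattern.quote("."));
    }

    public static void main(String[] args) {
        String str = "a.b..c";
        System.out.println(Arrays.toString(unescaped(str)));
        System.out.println(Arrays.toString(escaped(str)));
        System.out.println(Arrays.toString(quoted(str)));
    }
}
```

The unescaped call returns an empty array, the escaped call returns a, b, c, and the quoted call keeps an empty string between the two adjacent dots.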

Split on repeated characters

Regular expressions use () to capture groups.

A repeated character can therefore be expressed as follows: . matches any character, (.) captures that character as group 1, and (.)\1 means the next character is the same as whatever group 1 matched.

String str = "a@@@b####c...dtttef";
String regex = "(.)\\1+";
String[] line = str.split(regex);

Groups: which groups does ((A)(B(C))) contain?

Count the opening parentheses from left to right:

((A)(B(C))) 1

(A) 2

(B(C)) 3

(C) 4

Group 0, which has no parentheses of its own, always stands for the whole expression.
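The numbering can be checked directly with Matcher.group(int) and Matcher.groupCount(). Below is a small sketch; GroupNumberDemo and groups are names made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupNumberDemo {
    // Hypothetical helper: groups 0..groupCount() if the whole input matches,
    // otherwise an empty list.
    static List<String> groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> result = new ArrayList<>();
        if (m.matches()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                result.add(m.group(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Index 0 is the whole match; 1 to 4 follow the opening parentheses.
        System.out.println(groups("((A)(B(C)))", "ABC"));
    }
}
```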


Replace:

replaceAll(String regex, String replacement)
Replaces each substring of this string that matches the given regular expression with the given replacement.

replaceFirst(String regex, String replacement)
Replaces the first substring of this string that matches the given regular expression with the given replacement.

public static void check() {
    // Replace each run of a repeated character with a single character.
    String str = "abgggggcffffdggggs";
    String regex = "(.)\\1+";
    str = str.replaceAll(regex, "$1");
    System.out.println(str);
}

PS: In the replacement argument, a dollar sign followed by a group number ($1, $2, ...) refers to whatever that capturing group of the regular expression in the first argument matched.

public static void check() {
    // 18753377511 -> 187****7511
    String str = "18753377511";
    String regex = "(\\d{3})\\d{4}(\\d{4})";
    System.out.println(str);
    str = str.replaceAll(regex, "$1****$2");
    System.out.println(str);
}
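As a side-by-side sketch of the two replace methods, the example below collapses repeated characters with replaceFirst and with replaceAll. It also uses Matcher.quoteReplacement, which is a convenient way to insert a replacement that contains a literal dollar sign. The class and helper names are invented for the example.

```java
import java.util.regex.Matcher;

final class ReplaceDemo {
    static final String REPEATED = "(.)\\1+";

    // Collapses only the first run of a repeated character.
    static String collapseFirst(String s) {
        return s.replaceFirst(REPEATED, "$1");
    }

    // Collapses every run of a repeated character.
    static String collapseAll(String s) {
        return s.replaceAll(REPEATED, "$1");
    }

    // Replaces each run of digits with the literal text, even if it contains $.
    static String replaceDigitsLiterally(String s, String literal) {
        return s.replaceAll("\\d+", Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        String str = "abgggggcffffdggggs";
        System.out.println(collapseFirst(str));
        System.out.println(collapseAll(str));
        System.out.println(replaceDigitsLiterally("price: 5", "$1"));
    }
}
```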

Get

A regular expression can be encapsulated as an object.

Pattern class

A regular expression, specified as a string, must first be compiled into an instance of this class. The resulting pattern can then be used to create a Matcher object that can match arbitrary character sequences against the regular expression. All of the state involved in performing a match resides in the matcher, so many matchers can share the same pattern.

// Encapsulate the regular expression in an object.
// Pattern p = Pattern.compile("a*b");
// Call the pattern's matcher method on the string to get a Matcher object for operating on it.
// Matcher m = p.matcher("aaaab");
// Operate on the string through the Matcher object's methods.
// boolean b = m.matches();
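Because the match state lives in the Matcher and not in the Pattern, one compiled Pattern can be kept in a constant and reused for many inputs. A minimal sketch of that idea follows; PatternReuseDemo and its helpers are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class PatternReuseDemo {
    // Compiled once; every call below creates its own Matcher from it.
    static final Pattern A_STAR_B = Pattern.compile("a*b");

    static boolean fullyMatches(String input) {
        return A_STAR_B.matcher(input).matches();
    }

    // Counts how many inputs match the shared pattern in full.
    static long countMatching(List<String> inputs) {
        return inputs.stream().filter(PatternReuseDemo::fullyMatches).count();
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("aaaab", "b", "ab ", "");
        for (String s : inputs) {
            System.out.println("[" + s + "]:" + fullyMatches(s));
        }
        System.out.println(countMatching(inputs));
    }
}
```

Compiling once avoids recompiling the expression on every call, which is what String.matches does internally each time it is invoked.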

Matcher class

The matches method attempts to match the entire input sequence against the pattern.

The lookingAt method attempts to match the input sequence, starting at the beginning, against the pattern.

The find method scans the input sequence looking for the next subsequence that matches the pattern.
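The difference between the three methods is easiest to see on inputs where they disagree. The sketch below reports all three results for a pattern and an input. MatcherMethodsDemo and describe are invented names, and a fresh Matcher is created for each call so that no state carries over.

```java
import java.util.regex.Pattern;

final class MatcherMethodsDemo {
    // Hypothetical helper: the result of matches, lookingAt, and find as one string.
    static String describe(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        boolean matches = p.matcher(input).matches();     // whole input
        boolean lookingAt = p.matcher(input).lookingAt(); // prefix of the input
        boolean find = p.matcher(input).find();           // anywhere in the input
        return "matches=" + matches + " lookingAt=" + lookingAt + " find=" + find;
    }

    public static void main(String[] args) {
        System.out.println(describe("\\d+", "123"));
        System.out.println(describe("\\d+", "123abc"));
        System.out.println(describe("\\d+", "abc123"));
    }
}
```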

public static void check() {
    String str = "ni hao, wohao, ta ye hao";
    String regex = "\\b[a-z]{3}\\b"; // \b: word boundary
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(str);
    while (m.find()) { // find the next match before reading it
        System.out.println(m.group());
        System.out.println(m.start() + ":" + m.end()); // start index (inclusive) and end index (exclusive)
    }
}

Exercise:

Change aaa...aa..aaa...bbb...b...bbb...ccc...ccc to abc.

public static void test() {
    String str = "aaa...aa..aaa...bbb...b...bbb...ccc...ccc";
    System.out.println(str);
    String regex = "\\.+";
    str = str.replaceAll(regex, ""); // remove the dots
    regex = "(.)\\1+";
    str = str.replaceAll(regex, "$1"); // collapse repeated characters
    System.out.println(str);
}

Sort IP addresses

public static void test() {
    // First attempt: split on spaces and sort the raw strings.
    // String str = "192.0.0.1 127.0.0.24 3.3.3.5 150.15.3.41";
    // System.out.println("ip:" + str);
    // String regex = " +";
    // String[] strs = str.split(regex);
    // TreeSet<String> ts = new TreeSet<String>(); // sorts automatically
    // for (String s : strs) {
    //     ts.add(s);
    // }
    // for (String s : ts) { // sorted as strings, so the order is wrong
    //     System.out.println(s);
    // }

    // So first pad every segment of every IP address with two zeros.
    String str = "192.0.0.1 127.0.0.24 3.3.3.5 150.15.3.41";
    String regex = "(\\d+)";
    str = str.replaceAll(regex, "00$1");
    System.out.println("fill 0:" + str);
    // Then keep only the last three digits of each segment.
    regex = "0*(\\d{3})";
    str = str.replaceAll(regex, "$1");
    System.out.println("keep 3 digits:" + str);
    regex = " +";
    String[] strs = str.split(regex);
    TreeSet<String> ts = new TreeSet<String>(); // sorts automatically
    for (String s : strs) {
        ts.add(s);
    }
    for (String s : ts) {
        // Strip the leading zeros again when printing.
        System.out.println(s.replaceAll("0*(\\d+)", "$1"));
    }
}
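For comparison only, the same ordering can be obtained without zero padding by comparing the segments as numbers. This is a different technique from the one above, not the author's method. The sketch assumes well-formed dotted addresses, and IpSortDemo, BY_SEGMENTS, and sortIps are hypothetical names. Unlike the TreeSet version, it keeps duplicate addresses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class IpSortDemo {
    // Compares two dotted addresses segment by segment, as integers.
    static final Comparator<String> BY_SEGMENTS = (a, b) -> {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.min(x.length, y.length); i++) {
            int c = Integer.compare(Integer.parseInt(x[i]), Integer.parseInt(y[i]));
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(x.length, y.length);
    };

    // Splits on runs of spaces, then sorts numerically.
    static List<String> sortIps(String spaceSeparated) {
        List<String> ips = new ArrayList<>(Arrays.asList(spaceSeparated.trim().split(" +")));
        ips.sort(BY_SEGMENTS);
        return ips;
    }

    public static void main(String[] args) {
        for (String ip : sortIps("192.0.0.1 127.0.0.24 3.3.3.5 150.15.3.41")) {
            System.out.println(ip);
        }
    }
}
```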

Simple email address verification

public static void test() {
    String mail = "aa_a@163.com.cn";
    String regex = "\\w+@\\w+(\\.[a-zA-Z]{2,3})+"; // + means one or more times
    boolean flag = mail.matches(regex);
    System.out.println(mail + ":" + flag);
}

Note: because regular expressions are hard to read, in real development they are usually verified repeatedly and then encapsulated for reuse.


Exercise: web crawler. A web crawler is a program that collects data matching specified rules from the Internet.

Crawl email addresses.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class asd {
    public static void main(String[] args) throws Exception {
        // List<String> list = getmail(); // local file
        List<String> list = getweb(); // network
        for (String i : list) {
            System.out.println(i);
        }
    }

    public static List<String> getweb() throws Exception {
        // URL url = new URL("http://192.168.0.1:8080/myweb/mymail.html");
        URL url = new URL("http://news.baidu.com/");
        BufferedReader brin = new BufferedReader(new InputStreamReader(url.openStream()));
        String mail_regex = "\\w+@\\w+(\\.\\w+)+";
        Pattern p = Pattern.compile(mail_regex);
        List<String> list = new ArrayList<String>();
        String line = null;
        while ((line = brin.readLine()) != null) {
            Matcher m = p.matcher(line);
            while (m.find()) {
                list.add(m.group());
            }
        }
        brin.close();
        return list;
    }

    public static List<String> getmail() throws Exception {
        // 1. Read the source file.
        BufferedReader br = new BufferedReader(new FileReader("g:\\mymail.html"));
        String mail_regex = "\\w+@\\w+(\\.\\w+)+";
        Pattern p = Pattern.compile(mail_regex);
        List<String> list = new ArrayList<String>();
        String line = null;
        // 2. Match each line against the rule to get the data that complies with it.
        while ((line = br.readLine()) != null) {
            Matcher m = p.matcher(line);
            while (m.find()) {
                // 3. Store the data that meets the rule in the collection.
                list.add(m.group());
            }
        }
        br.close();
        return list;
    }
}



