Dark Horse Programmer--Regular expression regex

Source: Internet
Author: User

A regular expression is a "rule string": a pattern built from predefined special characters, and combinations of them, that expresses filtering logic for strings. Given a regular expression and a string, we can:
1. Check whether the string conforms to the filtering logic of the regular expression (called a "match");
2. Extract the specific parts we want from the string.
Regular expressions are characterized by:
1. Great flexibility, logic, and functionality;
2. Complex string handling can be achieved quickly and in a very simple way;
3. For newcomers, they are obscure and hard to read.
Because regular expressions are applied primarily to text, they are supported in many text editors, from small ones such as EditPlus to large ones such as Microsoft Word and Visual Studio, all of which can use regular expressions to manipulate text.
Function: dedicated to operating on strings, simply and conveniently;
Advantage: simplifies complex string operations;
Disadvantage: the more symbols a pattern defines, the longer it gets and the harder it is to read.
Some common symbols in regular expressions:
[bcd]: the character at this position may be b, c, or d, and only one character is matched.
\d matches a digit 0-9; ? means 0 or 1 times; * means 0 or more times; + means 1 or more times.
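The symbols above can be tried out with a minimal sketch like the following. The class name SymbolDemo, the constant names, and the sample inputs are assumptions made for illustration; only String.matches() from the standard library is used, and it tests the whole string against the pattern.

```java
public class SymbolDemo {
    // Illustrative patterns (names are hypothetical, not from the article).
    static final String ONE_OF_BCD = "[bcd]";        // exactly one character: b, c, or d
    static final String A_OPTIONAL_DIGIT = "a\\d?";  // 'a' followed by 0 or 1 digit
    static final String A_ANY_DIGITS = "a\\d*";      // 'a' followed by 0 or more digits
    static final String A_SOME_DIGITS = "a\\d+";     // 'a' followed by 1 or more digits

    public static void main(String[] args) {
        System.out.println("b".matches(ONE_OF_BCD));          // true
        System.out.println("bc".matches(ONE_OF_BCD));         // false: two characters
        System.out.println("a".matches(A_OPTIONAL_DIGIT));    // true
        System.out.println("a12".matches(A_OPTIONAL_DIGIT));  // false: two digits
        System.out.println("a".matches(A_ANY_DIGITS));        // true
        System.out.println("a123".matches(A_ANY_DIGITS));     // true
        System.out.println("a".matches(A_SOME_DIGITS));       // false: no digit
        System.out.println("a123".matches(A_SOME_DIGITS));    // true
    }
}
```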

Specific operations:
1. Match: the String matches() method tests the entire string against the rule; the result is false as soon as any part fails to match.
Requirement: validate a QQ number. It must be 5 to 15 digits long, must not start with 0, and may contain only digits. The example code is as follows:
public class RegexDemo {
    public static void main(String[] args) {
        checkQQ();
    }

    public static void checkQQ() {
        String qq = "32430";
        // The first digit is 1-9; the remaining digits are 0-9, repeated 4 to 14 times.
        String regex = "[1-9][0-9]{4,14}";
        // Equivalent: String regex = "[1-9]\\d{4,14}";
        boolean flag = qq.matches(regex);
        if (flag)
            System.out.println(qq + " legal");
        else
            System.out.println(qq + " illegal");
    }
}


2. Cutting: String split();
String reg = " +";       // cut on one or more spaces
String reg = "\\.";      // cut on a literal .
String reg = "(.)\\1+";  // cut on repeated characters
To let the result of a rule be reused, the rule is wrapped in a group with (). Groups are numbered starting from 1, and an existing group is referenced with \n, where n is the group number.
The example code is as follows:
public class RegexDemo {
    public static void main(String[] args) {
        // Cut on runs of spaces
        splitDemo("Xiaoming     xiaoqiang   xiaohong", " +");
    }

    public static void splitDemo(String str, String reg) {
        String[] arr = str.split(reg);
        System.out.println(arr.length);
        for (String s : arr) {
            System.out.println(s);
        }
    }
}
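The example above only exercises the space rule. A minimal sketch of the other two rules, the literal dot and the repeated-character group, might look like this. The class name, the helper names, and the sample strings are assumptions chosen for illustration.

```java
import java.util.Arrays;

public class SplitRulesDemo {
    // Cut on a literal dot; the dot must be escaped because a bare . matches any character.
    static String[] splitOnDots(String s) {
        return s.split("\\.");
    }

    // Cut on runs of a repeated character: group 1 captures one character,
    // and \1+ requires that same character one or more times again.
    static String[] splitOnRepeats(String s) {
        return s.split("(.)\\1+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnDots("zhangsan.lisi.wangwu")));
        // [zhangsan, lisi, wangwu]
        System.out.println(Arrays.toString(splitOnRepeats("erkktyqqquizzzzzo")));
        // [er, ty, ui, o]
    }
}
```

Note that split(".") without the escape would treat every character as a separator and return an empty array.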

3. Replacement: String replaceAll();
public class RegexDemo {
    public static void main(String[] args) {
        String str1 = "sdkaj234567dsflk5678";
        String regex1 = "\\d{1,}";
        String replacement1 = "#";
        // Replace each run of digits in the string with #
        replaceAllDemo(str1, regex1, replacement1);

        String str2 = "Sdhhhjollll";
        String regex2 = "(.)\\1+";   // . matches any character
        String replacement2 = "$1";  // $1 refers to the text captured by group 1
        // Repeated characters are replaced with a single character: hhh --> h
        replaceAllDemo(str2, regex2, replacement2);
    }

    // Replacement method
    public static void replaceAllDemo(String str, String regex, String replacement) {
        str = str.replaceAll(regex, replacement);
        System.out.println(str);
    }
}
Program Run Result:
sdkaj#dsflk#
Sdhjol

4. Get: extract from a string the substrings that satisfy a given rule.
The code is implemented as follows:
import java.util.regex.*;

public class RegexDemo2 {
    public static void main(String[] args) {
        getDemo();
    }

    public static void getDemo() {
        String str = "wo yao tong zhi shi jie";
        // Gets words of exactly four lower-case letters; \\b denotes a word boundary
        String reg = "\\b[a-z]{4}\\b";
        // Encapsulate the rule in a Pattern object
        Pattern p = Pattern.compile(reg);
        // Associate the pattern with the string to be searched and get a Matcher
        Matcher m = p.matcher(str);
        // Apply the rule to the string and look for the next matching substring
        while (m.find()) {
            // Get the matched text
            System.out.println(m.group());
            // Get the start and end indices of the match
            System.out.println(m.start() + "..." + m.end());
        }
    }
}
Program Run Result:
tong
7...11
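The four steps in this example (compile the rule, associate it with the text, call find(), read group()) can be wrapped in a small reusable helper. The following is only a sketch: the class name FindAllDemo, the helper findAll, and the lower-case sample sentence are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindAllDemo {
    // Hypothetical helper: collects every substring of text that matches regex.
    static List<String> findAll(String regex, String text) {
        Pattern p = Pattern.compile(regex);  // encapsulate the rule
        Matcher m = p.matcher(text);         // associate the rule with the text
        List<String> found = new ArrayList<>();
        while (m.find()) {                   // advance to the next match
            found.add(m.group());            // the text of the current match
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "wo yao tong zhi shi jie";
        System.out.println(findAll("\\b[a-z]{4}\\b", text));  // [tong]
        System.out.println(findAll("\\b[a-z]{3}\\b", text));  // [yao, zhi, shi, jie]
    }
}
```

Unlike matches(), which tests the whole string, find() searches for the pattern anywhere in the text, which is why the word boundaries \\b are needed to pick out whole words.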

Regular expressions have many practical uses, such as verifying IP addresses and mailbox names. Here are two exercises to consolidate their application.
Requirement 1: Sort IP addresses in order of their address segments.
IP address: 192.168.1.34 102.67.46.1 45.78.1.0 35.156.229.56
The natural ordering of strings can be used for this, as long as every segment has exactly 3 digits.
Idea: 1. Pad every segment with as many zeros as the shortest segment needs (two), so that each segment has at least 3 digits;
2. Keep only the last 3 digits of each segment, so that every segment of every IP address has exactly 3 digits.
Requirement 2: Check an email address. The program is implemented as follows:
import java.util.TreeSet;

public class RegexTest {
    public static void main(String[] args) {
        ipSort();
        checkMail();
    }

    public static void ipSort() {
        String ip = "192.168.1.34 102.67.46.1 45.78.1.0 35.156.229.56";
        // Prefix every run of digits with 00
        ip = ip.replaceAll("(\\d+)", "00$1");
        System.out.println(ip);
        // Keep the last 3 digits of each segment: 00192 becomes 192
        ip = ip.replaceAll("0*(\\d{3})", "$1");
        System.out.println(ip);
        // Cut on spaces
        String[] arr = ip.split(" +");
        // TreeSet sorts its elements in natural order
        TreeSet<String> ts = new TreeSet<String>();
        for (String s : arr) {
            // Add each cut string to ts
            ts.add(s);
        }
        for (String s : ts) {
            // Strip the padding: 001 becomes 1
            System.out.println(s.replaceAll("0*(\\d+)", "$1"));
        }
    }

    /*
     * Requirement: check a mailbox address
     */
    public static void checkMail() {
        // The address is obscured in the source; substitute a well-formed address before running.
        String mail = "[email protected]";
        // Stricter match
        String reg = "[a-zA-Z0-9_]+@[a-zA-Z0-9]+(\\.[a-zA-Z]+)+";
        // A relatively coarse match: String reg = "\\w+@\\w+(\\.\\w+)+";
        System.out.println(mail.matches(reg));
    }
}
Operation Result (the two intermediate println lines are omitted):
35.156.229.56
45.78.1.0
102.67.46.1
192.168.1.34
true
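The exercises above are introduced with the remark that regular expressions can also verify IP addresses, but only sorting is shown. A minimal sketch of such a check for dotted IPv4 addresses follows. The class name, the helper isIpv4, and the rule that each segment must lie in 0-255 with no leading zeros are assumptions chosen for illustration, not part of the original exercise.

```java
public class Ipv4Check {
    // One segment: 250-255, 200-249, 100-199, or 0-99 without a leading zero.
    private static final String OCTET = "(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)";
    // Three segments each followed by a literal dot, then a final segment.
    private static final String IPV4 = "(" + OCTET + "\\.){3}" + OCTET;

    // Hypothetical helper: true if s is a well-formed dotted IPv4 address.
    static boolean isIpv4(String s) {
        return s.matches(IPV4);
    }

    public static void main(String[] args) {
        System.out.println(isIpv4("192.168.1.34"));  // true
        System.out.println(isIpv4("256.1.1.1"));     // false: 256 is out of range
        System.out.println(isIpv4("1.2.3"));         // false: only three segments
    }
}
```

The range check is spelled out with alternatives because a plain \\d{1,3} would also accept values such as 999.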

An important application of regular expressions is building crawler tools that extract the desired information from a specified text file or web page, which feels a little like being a hacker. The program is implemented as follows:
import java.io.*;
import java.net.*;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexText2 {
    public static void main(String[] args) throws Exception {
        getMails_1();
        getMails_2();
    }

    // Gets the mailbox addresses in the specified web page
    public static void getMails_2() throws Exception {
        URL url = new URL("http://zhidao.baidu.com/question/518382624466008525.html?fr=iks&word=%D3%CA%CF%E4%C3%FB&ie=gbk");
        URLConnection conn = url.openConnection();
        BufferedReader bufIn = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        String line = null;
        String regex = "\\w+@\\w+(\\.\\w+)+";
        Pattern p = Pattern.compile(regex);
        // Read one line at a time
        while ((line = bufIn.readLine()) != null) {
            Matcher m = p.matcher(line);
            while (m.find()) {
                System.out.println(m.group());
            }
        }
        bufIn.close();
    }

    // Gets the mailbox addresses in the specified file
    public static void getMails_1() throws Exception {
        BufferedReader bufr = new BufferedReader(new FileReader("mail.txt"));
        String line = null;
        String regex = "\\w+@\\w+(\\.\\w+)+";
        Pattern p = Pattern.compile(regex);
        // Read one line at a time
        while ((line = bufr.readLine()) != null) {
            Matcher m = p.matcher(line);
            while (m.find()) {
                System.out.println(m.group());
            }
        }
        bufr.close();
    }
}






