Some small examples of using regular expressions

Source: Internet
Author: User
Regular Expression Syntax

Metacharacter

Description

\

Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, "n" matches the character "n", while "\n" matches a line break. The sequence "\\" matches "\" and "\(" matches "(". This is equivalent to the concept of "escape characters" in many programming languages.

^

Matches the start position of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after "\n" or "\r".

$

Matches the end position of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before "\n" or "\r".

*

Matches the preceding subexpression zero or more times. For example, "zo*" can match "z", "zo", and "zoo". * is equivalent to {0,}.

+

Matches the preceding subexpression one or more times. For example, "zo+" can match "zo" and "zoo", but cannot match "z". + is equivalent to {1,}.

?

Matches the preceding subexpression zero or one time. For example, "do(es)?" can match "do" or "does". ? is equivalent to {0,1}.

{n}

n is a non-negative integer. Matches exactly n times. For example, "o{2}" cannot match the "o" in "Bob", but can match the two o's in "food".

{n,}

n is a non-negative integer. Matches at least n times. For example, "o{2,}" cannot match the "o" in "Bob", but can match all the o's in "foooood". "o{1,}" is equivalent to "o+". "o{0,}" is equivalent to "o*".

{n,m}

m and n are non-negative integers, where n <= m. Matches at least n times and at most m times. For example, "o{1,3}" matches the first three o's in "fooooood". "o{0,1}" is equivalent to "o?". Note that there must be no space between the comma and the two numbers.

?

When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the matching is non-greedy. A non-greedy pattern matches as little of the searched string as possible, while the default greedy pattern matches as much of it as possible. For example, for the string "oooo", "o+?" matches a single "o", while "o+" matches all the o's.

.

Matches any single character except "\r" and "\n". To match any character including "\r" and "\n", use a pattern like "[\s\S]".

(pattern)

Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection, using the SubMatches collection in VBScript or the $0...$9 properties in JScript. To match parenthesis characters, use "\(" or "\)".

(?:pattern)

Matches pattern but does not capture the match; that is, it is a non-capturing match and is not stored for later use. This is useful for combining parts of a pattern with the "or" character (|). For example, "industr(?:y|ies)" is a more compact expression than "industry|industries".

(?=pattern)

Positive lookahead. Matches the search string at any position where a string matching pattern begins. This is a non-capturing match; that is, the match is not stored for later use. For example, "Windows(?=95|98|NT|2000)" matches "Windows" in "Windows2000", but not "Windows" in "Windows3.1". Lookahead does not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.

(?!pattern)

Negative lookahead. Matches the search string at any position where a string not matching pattern begins. This is a non-capturing match; that is, the match is not stored for later use. For example, "Windows(?!95|98|NT|2000)" matches "Windows" in "Windows3.1", but not "Windows" in "Windows2000".

(?<=pattern)

Positive lookbehind. Similar to positive lookahead, but in the opposite direction. For example, "(?<=95|98|NT|2000)Windows" matches "Windows" in "2000Windows", but not "Windows" in "3.1Windows".

(?<!pattern)

Negative lookbehind. Similar to negative lookahead, but in the opposite direction. For example, "(?<!95|98|NT|2000)Windows" matches "Windows" in "3.1Windows", but not "Windows" in "2000Windows".

x|y

Matches x or y. For example, "z|food" matches "z" or "food" (be careful here: the alternation covers everything on each side, so this does not mean "zood" or "food"). "(z|f)ood" matches "zood" or "food".

[xyz]

Character set. Matches any one of the enclosed characters. For example, "[abc]" can match the "a" in "plain".

[^xyz]

Negated character set. Matches any character that is not enclosed. For example, "[^abc]" can match "p", "l", "i", and "n" in "plain".

[a-z]

Character range. Matches any character within the specified range. For example, "[a-z]" matches any lowercase alphabetic character in the range "a" through "z".

Note: A hyphen denotes a range only when it is inside the character class and appears between two characters; at the beginning of the character class it represents only the hyphen itself.

[^a-z]

Negated character range. Matches any character that is not in the specified range. For example, "[^a-z]" matches any character that is not in the range "a" through "z".

\b

Matches a word boundary, that is, the position between a word and a space. (A regular expression "match" has two notions: matching characters and matching positions; \b matches a position.) For example, "er\b" can match the "er" in "never", but cannot match the "er" in "verb".

\B

Matches a non-word boundary. "er\B" can match the "er" in "verb", but cannot match the "er" in "never".

\cx

Matches the control character indicated by x. For example, \cM matches Control-M, the carriage return character. The value of x must be in the range A-Z or a-z. Otherwise, c is treated as a literal "c" character.

\d

Matches a digit character. Equivalent to [0-9].

\D

Matches a non-digit character. Equivalent to [^0-9].

\f

Matches a form feed character. Equivalent to \x0c and \cL.

\n

Matches a line feed character. Equivalent to \x0a and \cJ.

\r

Matches a carriage return character. Equivalent to \x0d and \cM.

\s

Matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to [ \f\n\r\t\v].

\S

Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].

\t

Matches a tab character. Equivalent to \x09 and \cI.

\v

Matches a vertical tab character. Equivalent to \x0b and \cK.

\w

Matches any word character, including the underscore. Similar but not equivalent to "[A-Za-z0-9_]", because the word characters here use the Unicode character set.

\W

Matches any non-word character. Equivalent to "[^A-Za-z0-9_]".

\xn

Matches n, where n is a hexadecimal escape value. The hexadecimal escape value must be exactly two digits long. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" followed by "1". ASCII codes can thus be used in regular expressions.

\num

Matches num, where num is a positive integer. A backreference to a captured match. For example, "(.)\1" matches two consecutive identical characters.

\n

Identifies either an octal escape value or a backreference. \n is a backreference if it is preceded by at least n captured subexpressions. Otherwise, if n is an octal digit (0-7), n is an octal escape value.

\nm

Identifies either an octal escape value or a backreference. \nm is a backreference if it is preceded by at least nm captured subexpressions. If \nm is preceded by at least n captures, n is a backreference followed by the literal m. If neither of these conditions holds and both n and m are octal digits (0-7), \nm matches the octal escape value nm.

\nml

If n is an octal digit (0-3) and both m and l are octal digits (0-7), matches the octal escape value nml.

\un

Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
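
The greedy, lookaround, and backreference entries in the table can be tried out with a small Java sketch. The class name MetacharacterDemo and the helper firstMatch are illustrative names chosen here, not part of any library; the patterns are the ones from the table, written with Java string escaping (each backslash is doubled inside a string literal).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetacharacterDemo {
    // Illustrative helper: returns the first substring of input matched by regex,
    // or null if there is no match.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Greedy vs. non-greedy quantifier on "oooo"
        System.out.println(firstMatch("o+", "oooo"));   // oooo
        System.out.println(firstMatch("o+?", "oooo"));  // o

        // Positive and negative lookahead
        System.out.println(firstMatch("Windows(?=95|98|NT|2000)", "Windows2000")); // Windows
        System.out.println(firstMatch("Windows(?=95|98|NT|2000)", "Windows3.1"));  // null
        System.out.println(firstMatch("Windows(?!95|98|NT|2000)", "Windows3.1"));  // Windows

        // Positive lookbehind
        System.out.println(firstMatch("(?<=95|98|NT|2000)Windows", "2000Windows")); // Windows

        // Backreference: two consecutive identical characters
        System.out.println(firstMatch("(.)\\1", "food")); // oo
    }
}
```

Note that the lookahead matches return only "Windows": the version number is checked but not consumed.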

\< \> Match the start (\<) and end (\>) of a word. For example, the regular expression \<the\> matches "the" in the string "for the wise", but not the "the" in the string "otherwise". Note: This metacharacter is not supported by all software.
\( \) The expression between \( and \) is defined as a group, and the characters that match the expression are saved to a temporary area (up to 9 per regular expression), where they can be referenced using \1 through \9.
| Performs a logical "or" between two matching conditions. For example, the regular expression (him|her) matches "it belongs to him" and "it belongs to her", but not "it belongs to them". Note: This metacharacter is not supported by all software.
+ Matches 1 or more occurrences of the character just before it. For example, the regular expression 9+ matches 9, 99, 999, and so on. Note: This metacharacter is not supported by all software.
? Matches 0 or 1 occurrences of the character just before it. Note: This metacharacter is not supported by all software.
{i} {i,j} Matches a specified number of occurrences of the expression that precedes it. For example, the regular expression A[0-9]{3} matches the character "A" followed by exactly 3 digits, such as A123 and A348, but does not match A1234 as a whole. The regular expression [0-9]{4,6} matches any 4, 5, or 6 consecutive digits.
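
As a concrete case of "not supported by all software": java.util.regex has no \< and \> word anchors, so a Java version of the \<the\> example would normally use \b on both sides instead. The sketch below assumes that substitution; WordBoundaryDemo and contains are illustrative names.

```java
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // Illustrative helper: true if regex matches somewhere inside input.
    static boolean contains(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // \b on both sides plays the role of \< and \>
        System.out.println(contains("\\bthe\\b", "for the wise")); // true
        System.out.println(contains("\\bthe\\b", "otherwise"));    // false

        // Counted repetition, tested against the whole string with String.matches
        System.out.println("A123".matches("A[0-9]{3}"));    // true
        System.out.println("A1234".matches("A[0-9]{3}"));   // false
        System.out.println("12345".matches("[0-9]{4,6}"));  // true
    }
}
```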

Quantifiers (X represents a specification)

Specification    Description
X                Must appear once
X?               Can appear 0 or 1 times
X*               Can appear 0, 1, or more times
X+               Can appear 1 or more times
X{n}             Must appear exactly n times
X{n,}            Must appear at least n times
X{n,m}           Must appear n to m times
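
Each row of the quantifier table can be checked against whole strings. This is a minimal sketch in which the letter "a" stands in for the specification X; QuantifierDemo and whole are illustrative names.

```java
public class QuantifierDemo {
    // Illustrative helper: true if the entire input fits the regex.
    static boolean whole(String regex, String input) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(whole("a?", ""));          // true: 0 times
        System.out.println(whole("a?", "aa"));        // false: more than once
        System.out.println(whole("a*", ""));          // true: 0 times is allowed
        System.out.println(whole("a+", ""));          // false: at least once required
        System.out.println(whole("a{3}", "aaa"));     // true: exactly 3 times
        System.out.println(whole("a{3,}", "aa"));     // false: fewer than 3
        System.out.println(whole("a{3,}", "aaaa"));   // true: at least 3
        System.out.println(whole("a{3,5}", "aaaaaa")); // false: more than 5
    }
}
```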

Logical operators (X and Y represent specifications)

Specification    Description
XY               X specification followed by Y specification
X|Y              X specification or Y specification
(X)              Treated as a capturing group
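
The three logical operators can be sketched in Java as follows. The sample patterns and inputs are made up for illustration, as are the names LogicalOperatorDemo and groupOne.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogicalOperatorDemo {
    // Illustrative helper: returns capturing group 1 if the whole input
    // matches regex, otherwise null.
    static String groupOne(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // XY: a digit followed by a lowercase letter
        System.out.println("7a".matches("[0-9][a-z]"));  // true
        // X|Y: either alternative is accepted
        System.out.println("cat".matches("cat|dog"));    // true
        System.out.println("cow".matches("cat|dog"));    // false
        // (X): the parenthesized part is captured as group 1
        System.out.println(groupOne("(\\d+)-\\d+", "2024-06")); // 2024
    }
}
```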

To actually apply the regular expressions above in Java, you must rely on the Pattern class and the Matcher class (both in java.util.regex).

Pattern mainly represents a rule: the regular expression to be used.

To obtain a Pattern instance, you must call the Pattern.compile(String regex) method.

Matcher applies the validation rule specified by a Pattern. The main methods of Matcher:

public boolean matches() checks whether the entire string matches the rule

public boolean find() checks whether the string contains a matching substring

public String group(int group) returns the string matched by the specified capturing group
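
The difference between the three methods can be seen in a short sketch. The rule "(\\w+)@(\\w+)" and the input strings are hypothetical samples, and MatcherMethodsDemo, wholeMatch, and groupOfFirstMatch are illustrative names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherMethodsDemo {
    // Hypothetical sample rule: word characters, "@", word characters.
    static final Pattern RULE = Pattern.compile("(\\w+)@(\\w+)");

    // matches(): the entire input must fit the rule.
    static boolean wholeMatch(String input) {
        return RULE.matcher(input).matches();
    }

    // find() + group(int): locate the first matching substring and read one group.
    static String groupOfFirstMatch(String input, int group) {
        Matcher m = RULE.matcher(input);
        return m.find() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        String input = "contact: user@example";
        System.out.println(wholeMatch(input));            // false
        System.out.println(wholeMatch("user@example"));   // true
        System.out.println(groupOfFirstMatch(input, 0));  // user@example
        System.out.println(groupOfFirstMatch(input, 1));  // user
        System.out.println(groupOfFirstMatch(input, 2));  // example
    }
}
```

Group 0 always denotes the entire match; groups 1 and up correspond to the parentheses in the pattern, counted from left to right.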


The three main functions of regular expressions:

String matching

String lookup

String substitution



Example one:

Determine whether the whole target string matches the regular expression, using matcher.matches()

For example, determine whether a string is in mailbox (email) format

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailMatchExample {
    public static void main(String[] args) {
        // String to be validated
        String str = "service@xsoftlab.net";
        // Mailbox validation rule
        String regEx = "[a-zA-Z_]{1,}[0-9]{0,}@(([a-zA-Z0-9]-*){1,}\\.){1,3}[a-zA-Z\\-]{1,}";
        // Compile the regular expression
        Pattern pattern = Pattern.compile(regEx);
        // Case-insensitive variant:
        // Pattern pat = Pattern.compile(regEx, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(str);
        // Whether the whole string matches the regular expression
        boolean rs = matcher.matches();
        System.out.println(rs);
    }
}



Example two:

Determine whether the target string contains a substring that satisfies the regular expression

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ContainsMatchExample {
    public static void main(String[] args) {
        // String to be validated
        String str = "baike.xsoftlab.net";
        // Regular expression rule
        String regEx = "baike.*";
        // Compile the regular expression
        Pattern pattern = Pattern.compile(regEx);
        // Case-insensitive variant:
        // Pattern pat = Pattern.compile(regEx, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(str);
        // Whether the string contains a substring that matches the regular expression
        boolean rs = matcher.find();
        System.out.println(rs);
    }
}
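
Examples one and two both mention a case-insensitive variant in a comment. A minimal sketch of what that flag changes, using an upper-case version of the sample host name as an assumed input (CaseInsensitiveDemo and found are illustrative names):

```java
import java.util.regex.Pattern;

public class CaseInsensitiveDemo {
    // Illustrative helper: compiles regex with the given flags and
    // reports whether it matches somewhere inside input.
    static boolean found(String regex, String input, int flags) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        String str = "BAIKE.XSOFTLAB.NET";
        // Without flags the lower-case rule does not match upper-case input
        System.out.println(found("baike.*", str, 0));                        // false
        // With CASE_INSENSITIVE the same rule matches
        System.out.println(found("baike.*", str, Pattern.CASE_INSENSITIVE)); // true
    }
}
```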

Example three:

Find the first substring in the target string that satisfies the regular expression

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstMatchExample {
    public static void main(String[] args) {
        String address = "Shenzhen Baoan District Wuhan Xixiang Street";
        String para_city = "Shenzhen|Wuhan"; // several alternatives to match
        String city = "";
        Pattern pattern = Pattern.compile(para_city);
        Matcher matcher = pattern.matcher(address);
        if (matcher.find()) { // an if is used here, so only the first match is taken
            city = matcher.group(0);
        }
        System.out.println(city);
    }
}

Output: Shenzhen

What is output if the string "Wuhan" comes first in the target string?

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstMatchOrderExample {
    public static void main(String[] args) {
        String address = "Wuhan Shenzhen Baoan District Xixiang Street";
        String para_city = "Shenzhen|Wuhan";
        String city = "";
        Pattern pattern = Pattern.compile(para_city);
        Matcher matcher = pattern.matcher(address);
        if (matcher.find()) {
            city = matcher.group(0);
        }
        System.out.println(city);
    }
}

The result is: Wuhan

This shows the order in which a regular expression is matched: the expression is applied as a whole, scanning the string from front to back, so the earliest match in the string wins regardless of the order of the alternatives.
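
A small sketch makes the point explicit: swapping the alternatives does not change which city is found first, and the order of alternatives only matters when two of them can start at the same position. LeftmostMatchDemo and first are illustrative names, and the "Shen|Shenzhen" pattern is an added sample.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LeftmostMatchDemo {
    // Illustrative helper: returns the first match of regex in input, or null.
    static String first(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(0) : null;
    }

    public static void main(String[] args) {
        String address = "Wuhan Shenzhen Baoan District Xixiang Street";
        // The earliest position in the string wins, whatever the alternative order
        System.out.println(first("Shenzhen|Wuhan", address)); // Wuhan
        System.out.println(first("Wuhan|Shenzhen", address)); // Wuhan
        // At the same start position, alternatives are tried left to right
        System.out.println(first("Shen|Shenzhen", "Shenzhen")); // Shen
        System.out.println(first("Shenzhen|Shen", "Shenzhen")); // Shenzhen
    }
}
```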


Example four:

To search the entire string, so that the variable ends up holding the last match, change the if to a while loop.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastMatchExample {
    public static void main(String[] args) {
        String address = "Wuhan Shenzhen Baoan District Xixiang Street";
        String para_city = "Shenzhen|Wuhan";
        String city = "";
        Pattern pattern = Pattern.compile(para_city);
        Matcher matcher = pattern.matcher(address);
        while (matcher.find()) { // each iteration overwrites city with the next match
            city = matcher.group(0);
        }
        System.out.println(city);
    }
}

Output: Shenzhen
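
If every match is wanted rather than only the last one, the same while loop can collect the matches into a list. This is a sketch under the assumption that the address lists Wuhan before Shenzhen; AllMatchesDemo and findAll are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AllMatchesDemo {
    // Illustrative helper: collects every match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(0));
        }
        return result;
    }

    public static void main(String[] args) {
        String address = "Wuhan Shenzhen Baoan District Xixiang Street";
        System.out.println(findAll("Shenzhen|Wuhan", address)); // [Wuhan, Shenzhen]
    }
}
```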



Example five: Text substitution

Method one: Matcher.replaceAll

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherReplaceAllExample {
    public static void main(String[] args) {
        String REGEX = "a*b";
        String INPUT = "aabfooaabfooabfoob";
        String REPLACE = "-";
        Pattern p = Pattern.compile(REGEX);
        Matcher m = p.matcher(INPUT); // get the matcher object
        INPUT = m.replaceAll(REPLACE);
        System.out.println(INPUT);
    }
}

Method two: String natively supports regular expressions through String.replaceAll

public class StringReplaceAllExample {
    public static void main(String[] args) {
        String REGEX = "a*b";
        String INPUT = "aabfooaabfooabfoob";
        String REPLACE = "-";
        System.out.println(INPUT.replaceAll(REGEX, REPLACE));
    }
}
	

Method three:

The Matcher class also provides two methods for text substitution, appendReplacement and appendTail

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AppendReplacementExample {
    public static void main(String[] args) {
        String REGEX = "a*b";
        String INPUT = "aabfooaabfooabfoob";
        String REPLACE = "-";

        Pattern p = Pattern.compile(REGEX);
        Matcher m = p.matcher(INPUT); // get the matcher object
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Append the text before the match, then REPLACE in place of the match, to sb
            m.appendReplacement(sb, REPLACE);
        }
        m.appendTail(sb); // append the text remaining after the last match to sb
        System.out.println(sb.toString());
    }
}
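
With a fixed replacement string, appendReplacement and appendTail give the same result as replaceAll. They become useful when the replacement is computed from each match. The sketch below replaces every match with its length in angle brackets, a made-up transformation for illustration; AppendReplacementDemo and replaceWithLength are illustrative names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AppendReplacementDemo {
    // Illustrative helper: replaces each match of regex with "<length of match>".
    static String replaceWithLength(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String replacement = "<" + m.group().length() + ">";
            // quoteReplacement keeps any $ or \ in the replacement literal
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb); // text after the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceWithLength("a*b", "aabfooaabfooabfoob"));
        // <3>foo<3>foo<2>foo<1>
    }
}
```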
   


Example six:

Remove various symbols from a string

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StripSymbolsExample {
    public static void main(String[] args) {
        // Note that only punctuation typed with the English input method is replaced successfully
        String targetStr = "Are you happy? I am not happy!";
        String regEx = "[~!@#$%^&*()\\-+={}':;,\\[\\].<>/?¥%…_|\"、。\\s]";
        Pattern p = Pattern.compile(regEx);
        Matcher m = p.matcher(targetStr);
        System.out.println(m.replaceAll("").trim());
    }
}
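
Listing every symbol by hand is easy to get wrong. One possible alternative, offered here as a sketch and not as the approach used above, is the Unicode punctuation category \p{P} supported by java.util.regex. It covers punctuation in any script, but not symbol characters such as + or =, which belong to a different Unicode category. PunctuationCategoryDemo and strip are illustrative names.

```java
public class PunctuationCategoryDemo {
    // Illustrative helper: removes Unicode punctuation (\p{P}) and whitespace (\s).
    static String strip(String text) {
        return text.replaceAll("[\\p{P}\\s]", "");
    }

    public static void main(String[] args) {
        System.out.println(strip("Are you happy? I am not happy!")); // AreyouhappyIamnothappy
        // The full-width question mark U+FF1F is punctuation as well
        System.out.println(strip("a\uFF1Fb")); // ab
        // + and = are symbols, not punctuation, so they are kept
        System.out.println(strip("1+1=2")); // 1+1=2
    }
}
```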
	

