Regular expression notes (i)

Source: Internet
Author: User

(1) Quantifiers:

? occurs 0 or 1 times

* occurs 0 or more times

+ occurs 1 or more times
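
As a minimal sketch of the three basic quantifiers, the snippet below tests a few sample strings. The patterns, the sample words, and the name quantifierResults are illustrative assumptions, not part of the original notes.

```javascript
// Hypothetical sample patterns; each is anchored so the whole string must match.
const optional = /^colou?r$/;  // "u" may occur 0 or 1 times
const anyCount = /^ab*c$/;     // "b" may occur 0 or more times
const atLeastOne = /^ab+c$/;   // "b" must occur 1 or more times

const quantifierResults = {
  color: optional.test("color"),        // zero "u"
  colour: optional.test("colour"),      // one "u"
  colouur: optional.test("colouur"),    // two "u" is too many for ?
  acStar: anyCount.test("ac"),          // zero "b" is allowed by *
  acPlus: atLeastOne.test("ac"),        // zero "b" is rejected by +
  abbbcPlus: atLeastOne.test("abbbc")   // three "b" satisfies +
};
console.log(quantifierResults);
```

Only the colouur and acPlus checks are expected to be false; the other four pass.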

1. Greedy, lazy (reluctant), and possessive quantifiers

A. Greedy: first tries to match as much of the string as possible; if no match is found, it gives back the last character and tries again, and so on.

B. Lazy: the opposite of greedy. It first tries to match as little as possible; if that fails, it takes in the next character and tries again, and so on.

C. Possessive: takes as much of the string as possible and tries only once. It never gives characters back.


How to tell them apart: a quantifier on its own (for example ?) is greedy, a quantifier followed by a question mark (??) is lazy, and a quantifier followed by a plus sign (?+) is possessive.


Greedy    Lazy    Possessive    Description
?         ??      ?+            0 or 1 occurrences
*         *?      *+            0 or more occurrences
+         +?      ++            1 or more occurrences
{n}       {n}?    {n}+          exactly n occurrences


	// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;

	/**
	 * Greedy method:
	 * The quantifier first takes as many characters as it can; if the rest of the pattern
	 * does not match, it gives back the last character and tries again, and so on.
	 * After a successful match, find() does not stop: it resumes right after the match
	 * and starts another greedy match.
	 */
	public void greedy() {
		String str = "1abbbaabbbaaabbb1234abbbc";
		Pattern p2 = Pattern.compile("[^1]*bbb");
		Matcher m2 = p2.matcher(str);
		while (m2.find()) {
			System.out.println(m2.group());
		}
		// [^1] cannot match the leading 1, so the match starts at the first a.
		// [^1]* takes everything up to the next 1, then backs off until bbb fits:
		// abbbaabbbaaabbb + bbb   mismatch
		// abbbaabbbaaabb  + bbb   mismatch
		// ...
		// abbbaabbbaaa    + bbb   match: abbbaabbbaaabbb
		// The search then resumes after the match; the next 1 is skipped as well:
		// 234abbbc + bbb   mismatch
		// ...
		// 234a     + bbb   match: 234abbb
		// So there are two matches: abbbaabbbaaabbb and 234abbb
	}





	/**
	 * Lazy (reluctant) method:
	 * The quantifier starts by taking no characters; if the rest of the pattern does not
	 * match, it takes in one more character and tries again, and so on.
	 */
	public void reluctant() {
		String str = "1abbbaabbbaaabbb1234abbbc";
		Pattern p2 = Pattern.compile("[^1]*?bbb");
		Matcher m2 = p2.matcher(str);
		while (m2.find()) {
			System.out.println(m2.group());
		}
		// The leading 1 is skipped again, so matching starts at the first a:
		// ""  + bbb   mismatch
		// a   + bbb   match: abbb
		// The search resumes right after the match:
		// ""  + bbb   mismatch
		// a   + bbb   mismatch
		// aa  + bbb   match: aabbb
		// And so on to the end, so the results are abbb, aabbb, aaabbb, 234abbb
	}
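
The same comparison can be sketched in JavaScript. JavaScript has no possessive quantifiers, so the third pattern below emulates one with a lookahead plus a backreference; that workaround, the helper allMatches, and the variable names are assumptions added for illustration and are not taken from the notes above.

```javascript
const text = "1abbbaabbbaaabbb1234abbbc";

// Collect every match of a global pattern as plain strings.
function allMatches(pattern, input) {
  return Array.from(input.matchAll(pattern), (m) => m[0]);
}

// Greedy: [^1]* takes as much as it can, then backs off until bbb fits.
const greedyMatches = allMatches(/[^1]*bbb/g, text);

// Lazy: [^1]*? takes as little as it can and grows one character at a time.
const lazyMatches = allMatches(/[^1]*?bbb/g, text);

// Emulated possessive: the lookahead captures the longest [^1]* run and,
// once it has succeeded, is never re-entered, so nothing is given back.
const possessiveMatches = allMatches(/(?=([^1]*))\1bbb/g, text);

console.log(greedyMatches, lazyMatches, possessiveMatches);
```

Under these assumptions the greedy pattern yields two long matches, the lazy pattern four short ones, and the emulated possessive pattern none, because the run of non-1 characters always swallows the bbb it would need afterwards.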



(2) Groups: use ()

(3) Backreferences: the text matched by each group is stored in a special place for later use. These stored values are called backreferences.

The expression (a?(b?(c?))) produces three backreferences, one for each pair of parentheses, numbered by the position of the opening parenthesis:

(1) (a?(b?(c?)))

(2) (b?(c?))

(3) (c?)
For example, in JavaScript all backreferences are stored as properties of the RegExp constructor (RegExp.$1, RegExp.$2, and so on).


<script type="text/javascript">
       var str1 = "#123456789";
       var reNumbers = /#(\d+)/;
       alert(reNumbers.test(str1)); // true
       alert(RegExp.$1); // the string matched by \d+ in the parentheses: 123456789
</script>
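
The numbering rule for nested groups can be checked with a small sketch. It reads the groups from the array returned by exec() instead of the RegExp properties; the input "abc" and the names nestedMatch and nestedGroups are illustrative assumptions.

```javascript
// Groups are numbered by the position of their opening parenthesis.
const nestedMatch = /(a?(b?(c?)))/.exec("abc");

const nestedGroups = {
  whole: nestedMatch[0],   // the full match
  group1: nestedMatch[1],  // (a?(b?(c?)))
  group2: nestedMatch[2],  // (b?(c?))
  group3: nestedMatch[3]   // (c?)
};
console.log(nestedGroups);
```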


You can also use a backreference directly inside the expression that defines the group. It is written as an escape sequence: \1, \2, and so on.


<script type="text/javascript">

var str2 = "dogdog";
var reDog = /(dog)\1/;
alert(reDog.test(str2)); // true; \1 refers back to the text captured by (dog), so this is equivalent to /dogdog/
alert(RegExp.$1); // dog

</script>
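
One common use of an in-pattern backreference is spotting a word that is typed twice in a row. The following is a minimal sketch of that idea; the pattern, the sample sentences, and the variable names are assumptions chosen for illustration.

```javascript
// \1 must repeat exactly what (\w+) captured; \b keeps the match on whole words.
const reDoubled = /\b(\w+) \1\b/;

const hit = reDoubled.exec("this is is a test");
const miss = reDoubled.exec("this is a test");

const doubled = {
  text: hit[0],             // the repeated pair as it appears in the input
  word: hit[1],             // the captured word that \1 refers to
  missIsNull: miss === null // no doubled word in the second sentence
};
console.log(doubled);
```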


Backreferences can also be used in replace().

They are written as $1, $2, and so on.

Example: implementing a trim() function


<script type="text/javascript">

String.prototype.trim = function () {
    // Requires whitespace at both ends; $1 is the lazily captured middle part
    var reg = /^\s+(.*?)\s+$/;
    return this.replace(reg, "$1");
};
var t = "  sdsdsd sdsdsd  ";
alert("{" + t.trim() + "}"); // {sdsdsd sdsdsd}: the leading and trailing spaces are removed and only the middle part is kept, via $1

</script>


Example 2: swapping parts of a string using backreferences


<script type="text/javascript">


var str3 = "1234 5678";
var reMatch = /(\d{4}) (\d{4})/;
var sNews = str3.replace(reMatch, "$2 $1");
alert(sNews); // 5678 1234: the values of the two groups are swapped

</script>
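
The same $1, $2 mechanism extends to more groups. As a hedged sketch, the snippet below reorders a date; the date format, the sample value, and the variable names are assumptions. It also shows the replacer-function form of replace(), which receives the groups as arguments.

```javascript
const isoDate = "2024-01-31";
const reDate = /(\d{4})-(\d{2})-(\d{2})/;

// String form: $1 = year, $2 = month, $3 = day.
const reordered = isoDate.replace(reDate, "$3/$2/$1");

// Function form: the captured groups arrive as parameters after the full match.
const viaFunction = isoDate.replace(reDate, (match, y, m, d) => `${d}/${m}/${y}`);

console.log(reordered, viaFunction);
```

Both calls are expected to produce the same day/month/year string.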

(4) Alternation: use the pipe symbol (|) to mean "or"


For example:

<script type="text/javascript">

var str4 = "This is a string using badword1 and badword2";
var reMatch4 = /badword|anotherbadword/ig;
var result = str4.replace(reMatch4, "**");
alert(result); // returns: This is a string using **1 and **2

</script>
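
Alternation is often wrapped in a group so that the alternatives share what comes before and after them. The sketch below illustrates this; the word list, the sentence, and the names rePets and found are assumptions made up for the example.

```javascript
// (cat|dog|bird) picks one alternative; s? and \b apply to whichever one matched.
const rePets = /\b(cat|dog|bird)s?\b/gi;
const sentence = "Cats chase birds, but the dog sleeps.";

// Keep only the captured alternative, normalised to lower case.
const found = Array.from(sentence.matchAll(rePets), (m) => m[1].toLowerCase());
console.log(found);
```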


(5) Non-capturing groups

A group that creates a backreference is called a capturing group. Storing backreferences slows matching down; a non-capturing group can still match a sequence of characters, but without the overhead of storing the result.

Rule: put a question mark immediately followed by a colon right after the opening parenthesis: (?:...)

<script type="text/javascript">

 var str1 = "#123456789";
 var reNumbers = /#(?:\d+)/;
 alert(reNumbers.test(str1)); // true
 alert(RegExp.$1); // the string matched by \d+ in the parentheses is not stored: ""

</script>



Compared with the capturing group, only ?: was added. The result is still true, but the value of the group is no longer stored in RegExp.
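
The difference is also visible in the array returned by exec(). This is a minimal sketch; the variable names are assumptions, and the input is the same "#123456789" string used in the example.

```javascript
const capturing = /#(\d+)/.exec("#123456789");
const nonCapturing = /#(?:\d+)/.exec("#123456789");

const comparison = {
  capturingLength: capturing.length,       // full match plus one stored group
  capturedDigits: capturing[1],            // the text stored for (\d+)
  nonCapturingLength: nonCapturing.length, // full match only, nothing stored
  sameFullMatch: capturing[0] === nonCapturing[0]
};
console.log(comparison);
```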

Example 2

<script type="text/javascript">

var str3 = "1234 5678";
var reMatch = /(?:\d{4}) (\d{4})/;
var sNews = str3.replace(reMatch, "$2 $1");
alert(sNews); // $2 5678

</script>

With ?: added, the first group is non-capturing and stores nothing. The only capturing group, (\d{4}), becomes $1 and holds 5678, while $2 refers to no group and is output literally, so the result is "$2 5678".

Example 3

<script type="text/javascript">


// Strip the HTML tags and keep the content between them
String.prototype.scripthtml = function () {
    var reg = /<(?:.|\s)*?>/ig;
    return this.replace(reg, "");
};
var sTest = "<br>232323 233swew23 232</br>";
alert(sTest.scripthtml()); // 232323 233swew23 232

</script>


This removes the opening and closing tags and leaves the content between them.


(6) Lookahead

Sometimes you want to capture a group of characters only when it appears before another string.

Lookahead comes in two forms, positive and negative. Positive lookahead checks that a particular set of characters comes next; negative lookahead checks that a particular set of characters does not come next.

A positive lookahead puts the pattern between (?= and ). This is not a group, and group numbering ignores lookaheads (whether positive or negative). It means that XX must be followed by YY.


<script type="text/javascript">

var sToMatch1 = "bedroom";
var sToMatch2 = "bedyard";
var regStr = /(bed(?=room))/; // "bed" must be immediately followed by "room"
alert(regStr.test(sToMatch1)); // true
alert(RegExp.$1); // bed
alert(RegExp.$2); // "": the second parenthesis uses ?=, which is not a group, so no value is stored
alert(regStr.test(sToMatch2)); // false

</script>
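
A lookahead checks the following text without making it part of the match. The sketch below illustrates that with a made-up price sentence; the pattern, the sample strings, and the variable names are assumptions.

```javascript
// \d+ is matched; " dollars" is only checked, not consumed.
const rePrice = /\d+(?= dollars)/;

const priceMatch = rePrice.exec("It costs 42 dollars");
const euroMatch = rePrice.exec("It costs 42 euros");

const lookaheadDemo = {
  matched: priceMatch[0],        // only the digits are in the match
  index: priceMatch.index,       // position of the first digit
  euroIsNull: euroMatch === null // the lookahead fails, so nothing matches
};
console.log(lookaheadDemo);
```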

A negative lookahead puts the pattern between (?! and ), which means that XX must not be followed by YY.

For example:


<script type="text/javascript">


var sToMatch1 = "bedroom";
var sToMatch2 = "bedyard";
var regStr = /(bed(?!room))/; // "bed" must not be followed by "room"
alert(regStr.test(sToMatch1)); // false
alert(regStr.test(sToMatch2)); // true
alert(RegExp.$1); // bed
alert(RegExp.$2); // "": the second parenthesis uses ?!, which is not a group, so no value is stored

</script>
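
Two small sketches of negative lookahead follow. The first filters a word list with the same bed/room idea; the second is a widely used thousands-separator pattern that combines a positive and a negative lookahead. The word list and the helper name addThousands are assumptions introduced for illustration.

```javascript
// Keep words where "bed" is not followed by "room".
const words = ["bedroom", "bedyard", "bedtime"];
const notRoom = words.filter((w) => /bed(?!room)/.test(w));

// Insert a comma at every position that is followed by groups of exactly
// three digits and then no further digit. \B skips the start of the number.
function addThousands(digits) {
  return digits.replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

const grouped = [addThousands("1234567"), addThousands("1000"), addThousands("999")];
console.log(notRoom, grouped);
```

Because lookaheads consume nothing, replace() inserts the comma at a zero-width position and leaves every digit in place.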


Note that older JavaScript engines have no lookbehind; ECMAScript 2018 added it as (?<=...) and (?<!...).

(7) Boundaries

^ matches the start of a line, $ matches the end of a line, \b matches a word boundary, and \B matches a non-word boundary.

For example:

<script type="text/javascript">

var ss = "The important is a improve";
var regStr = /^(\w+)/; // one or more word characters at the start: matches The
var regStr = /^(.+?)\b/; // lazy match from the left, stopping at the first word boundary
alert(regStr.test(ss)); // true
alert(RegExp.$1); // The

</script>
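
To make the difference between \b and \B concrete, the sketch below records where "cat" is found under three boundary combinations. The sample text, the helper positions, and the result names are assumptions for illustration.

```javascript
const sample = "cat concat category";

// Return the start index of every match of a global pattern.
function positions(pattern, input) {
  return Array.from(input.matchAll(pattern), (m) => m.index);
}

const wholeWord = positions(/\bcat\b/g, sample);  // boundary on both sides
const wordStart = positions(/\bcat\B/g, sample);  // starts a longer word
const wordEnd = positions(/\Bcat\b/g, sample);    // ends a longer word
console.log(wholeWord, wordStart, wordEnd);
```

Each pattern finds exactly one of the three occurrences: the standalone word, the start of "category", and the end of "concat".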


(8) Multi-line mode

<script type="text/javascript">

var ss = "The important is a improve\n Asdsds abc\n wewe 2we";
var regStr = /(\w+)$/g; // matches the word characters at the end of the string
alert(ss.match(regStr)); // the returned result is 2we

</script>

The string contains \n newline characters, so returning improve, abc, and 2we requires multi-line mode.

Multi-line mode only needs the m flag.


<script type="text/javascript">

var ss = "The important is a improve\n Asdsds abc\n wewe 2we";
var regStr = /(\w+)$/mg; // matches the word characters at the end of each line
alert(ss.match(regStr)); // returns improve,abc,2we

</script>
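
The m flag changes ^ in the same way as $. As a minimal sketch, the snippet below compares the two modes on a three-line string; the sample text and the variable names are assumptions.

```javascript
const lines = "alpha one\nbeta two\ngamma three";

// Without m, ^ only matches at the very start of the string.
const firstWithoutM = lines.match(/^\w+/g);

// With m, ^ and $ also match next to each newline.
const firstWithM = lines.match(/^\w+/gm);
const lastWithM = lines.match(/\w+$/gm);

console.log(firstWithoutM, firstWithM, lastWithM);
```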



