Introduction to Regular Expressions and Common Usage

Source: Internet
Author: User
<span id="Label3">
<p class="MsoNormal">Regular expressions (regex), also known as formal representations or regular representations, are often used in real-world software development projects. A regular expression uses a single string to describe, match, and extract text that conforms to a certain syntactic rule.</p>
<p class="MsoNormal">This <b>regular expression</b> <b>tutorial</b> starts from the most basic part.</p>
<p class="MsoNormal"><b>The origin of regular expressions</b></p>
<p class="MsoNormal">In 1956, the mathematician Stephen Kleene, building on the early work of Warren McCulloch and Walter Pitts on the nervous system, designed a mathematical notation called regular sets, which computer scientists soon used for scanning and lexical analysis in compilers. The text processing power of regular expressions was soon applied to the Unix tool grep. Since then, regular expressions have been widely used in Unix systems, Perl, PHP, Delphi, JavaScript, C# (.NET), Java, Python, Ruby, and other languages and development environments.</p>
<p class="MsoNormal"><b>Regular expression definitions</b></p>
<p class="MsoNormal">A regular expression is a language for describing a specific structure (rule) of a string, and it is executed by a regular expression engine. Figure 1 visualizes a regular expression.</p>
<p class="MsoNormal">Figure 1. Visualization of a regular expression</p>
<p class="MsoNormal"><b>The role of regular expressions</b></p>
<p class="MsoNormal">1. Data validation</p>
<p class="MsoNormal">Tests whether an input string conforms to a certain rule and whether the input is allowed. For example, you can validate an email address, URL, phone number, or date of birth.</p>
<p class="MsoNormal">2. Manipulating text</p>
<p class="MsoNormal">Identifies specific text in a document so that it can be deleted completely or replaced with other text.</p>
<p class="MsoNormal">3. Extracting substrings</p>
<p class="MsoNormal">Finds specific text within a document or input field based on pattern matching. Text often needs to be extracted first when it is involved in a replacement operation.</p>
<p class="MsoNormal"><b>Regular expression basic syntax</b></p>
<p class="MsoNormal">1. Literal characters (usually non-printing characters and characters that match themselves)</p>
<p class="MsoNormal">2. Character classes (can match multiple characters)</p>
<p class="MsoNormal">3. Repetition</p>
<br></span>
<p class="MsoNormal">4. Alternation, grouping, and anchors</p>
<p class="MsoNormal">5. Flags (indicating the operating mode of the engine)</p>
<p class="MsoNormal">Description</p>
<p class="MsoNormal">Place the "(?i)" flag before the regular expression to make matching case-insensitive. For example, with (?i)^root$, the strings root, Root, and ROOT all match.</p>
<p class="MsoNormal">6. Other</p>
<p class="MsoNormal">1) Matching metacharacters: the characters ( [ { \ ^ $ | ) ? * + . must be escaped with "\" to be matched literally.</p>
<p class="MsoNormal">For example, to match ".", the regular expression is "\.".</p>
<p class="MsoNormal">2) Greedy quantifiers and lazy quantifiers</p>
<p class="MsoNormal">A lazy quantifier is a greedy quantifier with a "?" appended.</p>
<p class="MsoNormal">Conceptually, a greedy quantifier first tries to match as much of the string as possible. If that attempt fails, it gives up the last character and tries again, repeating until a match is found or no characters remain.</p>
<p class="MsoNormal">A lazy quantifier first tries to match as little as possible. If that attempt succeeds, it stops; if it fails, it takes one more character and tries again, repeating until a suitable match is found.</p>
<p class="MsoNormal">For example, "\d+" is a greedy quantifier, and "\d+?" is a non-greedy (lazy) quantifier.</p>
<p class="MsoNormal">3) Sub-matches</p>
<p class="MsoNormal">Parentheses "()" mark a group inside the expression.</p>
<p class="MsoNormal">The text matched by each group is stored for later use. These stored values are called captured groups, and a reference to one of them is called a back reference.</p>
<p class="MsoNormal">For example, to verify that the input is a date and then extract the month, use the regular expression "^\d{4}\-(\d{2})\-\d{2}$".</p>
<p class="MsoNormal">4) Lookahead and lookbehind assertions</p>
<p class="MsoNormal">Positive assertions: (?&lt;=characters) for lookbehind or (?=characters) for lookahead</p>
<p class="MsoNormal">Note: the equals sign is required.</p>
<p class="MsoNormal">Negative assertions: (?&lt;!characters) for lookbehind or (?!characters) for lookahead</p>
<p class="MsoNormal">Note: the exclamation mark is required.</p>
<p class="MsoNormal">In other words, we can decide for ourselves where the boundaries of a match are, which is often used in string extraction.</p>
<p class="MsoNormal">Examples:</p>
<p class="MsoNormal">Example 1: take the characters before "#", not including "#". Regular expression: [\w]+(?=#)</p>
<p class="MsoNormal">Example 2: take characters that are not followed by "#", not including "#". Regular expression: [\w]+(?!#)</p>
<p class="MsoNormal">Example 3: take the characters between "&lt;" and "&gt;", not including "&lt;" and "&gt;". Regular expression: (?&lt;=&lt;)[\w]+(?=&gt;)</p>
<p class="MsoNormal"><b>Common usage of regular expressions</b></p>
<p class="MsoNormal">1. Numbers</p>
<p class="MsoNormal">1) Positive integer: ^[1-9][0-9]*$</p>
<p class="MsoNormal">2) Non-positive integer: ^((-[1-9][0-9]*)|(0))$</p>
<p class="MsoNormal">3) Negative integer: ^-[1-9][0-9]*$</p>
<p class="MsoNormal">4) Integer: ^(0|-?[1-9][0-9]*)$</p>
<p class="MsoNormal">5) Non-negative floating point number: ^\d+(\.\d+)?$</p>
<p class="MsoNormal">2. Letters</p>
<p class="MsoNormal">1) English letter string: ^[A-Za-z]+$</p>
<p class="MsoNormal">2) English uppercase string: ^[A-Z]+$</p>
<p class="MsoNormal">3) English lowercase string: ^[a-z]+$</p>
<p class="MsoNormal">4) English letters and digits string: ^[A-Za-z0-9]+$</p>
<p class="MsoNormal">5) English letters, digits, and underscore string: ^\w+$</p>
<p class="MsoNormal">3. Other</p>
<p class="MsoNormal">1) E-mail address: ^[\w-]+(\.[\w-]+)*@[\w-]+(\.[\w-]+)+$</p>
<p class="MsoNormal">2) URL: ^http:\/\/[A-Za-z0-9]+\.[A-Za-z0-9]+[\/=\?%\-&amp;_~'@[\]\':+!]*([^&lt;&gt;\"\"])*$</p>
<p class="MsoNormal">3) Zip code: ^[1-9]\d{5}$</p>
<p class="MsoNormal">4) Chinese characters: ^[\u4e00-\u9fa5]+$</p>
<p class="MsoNormal">5) Phone number: ^((\(\d{2,3}\))|(\d{3}\-))?(\(0\d{2,3}\)|0\d{2,3}-)?[1-9]\d{6,7}(\-\d{1,4})?$</p>
<p class="MsoNormal">6) Mobile number: ^1\d{10}$</p>
<p class="MsoNormal">7) Leading and trailing spaces: (^\s+)|(\s+$)</p>
<p class="MsoNormal">8) Identity card: ^(\d{15}|\d{18})$ (note: China's identity card number has 15 or 18 digits)</p>
<p class="MsoNormal">9) Account number: ^[A-Za-z]\w{4,15}$ (note: must begin with a letter, 5-16 characters long, letters, digits, and underscores allowed)</p>
<p class="MsoNormal">10) IP address: ^([1-9]?\d|1\d{2}|2[0-4]\d|25[0-5])(\.([1-9]?\d|1\d{2}|2[0-4]\d|25[0-5])){3}$ (each number in an IP address is greater than or equal to 0 and less than or equal to 255, so the expression validates one number and then repeats the same check for each of the remaining numbers joined by ".")</p>
<p class="MsoNormal"><b>Summary</b></p>
<p class="MsoNormal">Regular expressions are very powerful, as you will find when you use them in practice. Of course, remembering so many regular expression rules at once is not easy. You can keep a record of common regular expressions (such as those in this article) and refer to them when you need them.</p>
<p class="MsoNormal">Source: C blog /programmer_zhou's Column</p>
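<p class="MsoNormal">The difference between the greedy "\d+" and the lazy "\d+?" described above can be seen in a small sketch. It assumes Python's standard re module, and the sample text is made up for illustration.</p>

```python
import re

# Hypothetical sample text, used only for illustration.
text = "order 12345 shipped"

# Greedy: \d+ takes as many digits as it can.
greedy = re.search(r"\d+", text).group()

# Lazy: \d+? stops as soon as one digit has been matched.
lazy = re.search(r"\d+?", text).group()

print(greedy)  # 12345
print(lazy)    # 1
```

<p class="MsoNormal">Both patterns start matching at the same position; only the amount of text they consume differs.</p>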
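<p class="MsoNormal">The date example from the sub-match section can be sketched as follows. This is a minimal illustration using Python's re module; the function name extract_month is an assumption introduced here, not something from the article.</p>

```python
import re

# The date pattern from the sub-match section; the parentheses capture the month.
DATE_PATTERN = re.compile(r"^\d{4}\-(\d{2})\-\d{2}$")


def extract_month(value):
    """Return the month part of a YYYY-MM-DD string, or None if it is not a date."""
    match = DATE_PATTERN.match(value)
    # group(1) is the text captured by the first pair of parentheses.
    return match.group(1) if match else None


print(extract_month("2024-03-15"))  # 03
print(extract_month("2024/03/15"))  # None
```

<p class="MsoNormal">Note that the pattern checks only the shape of the string, so a value such as 2024-13-99 would still pass.</p>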
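<p class="MsoNormal">The three lookaround examples above can be tried out with a short sketch, again assuming Python's re module and made-up sample strings. The second result shows a behavior that is easy to miss: with a negative lookahead, the engine backtracks one character so that the match is no longer followed by "#".</p>

```python
import re

# Example 1: characters before "#", excluding "#".
before_hash = re.findall(r"[\w]+(?=#)", "alpha#beta")

# Example 2: characters not followed by "#".
# "alpha" is followed by "#", so the engine gives back one character
# and matches "alph" (which is followed by "a", not "#").
not_before_hash = re.findall(r"[\w]+(?!#)", "alpha#beta")

# Example 3: characters between "<" and ">", excluding both.
between = re.findall(r"(?<=<)[\w]+(?=>)", "<tag> and <other>")

print(before_hash)      # ['alpha']
print(not_before_hash)  # ['alph', 'beta']
print(between)          # ['tag', 'other']
```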
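<p class="MsoNormal">Several of the common patterns listed above can be collected into a small validation helper. This is only a sketch under stated assumptions: the names PATTERNS and is_valid are hypothetical, and the IP pattern uses [1-9]?\d for one- and two-digit numbers so that 0 is accepted, as the note on IP addresses requires.</p>

```python
import re

# Hypothetical lookup table built from patterns in the "common usage" list.
PATTERNS = {
    "positive_integer": re.compile(r"^[1-9][0-9]*$"),
    "integer": re.compile(r"^(0|-?[1-9][0-9]*)$"),
    "mobile": re.compile(r"^1\d{10}$"),
    # Assumption: [1-9]?\d is used so that a number may be 0.
    "ipv4": re.compile(
        r"^([1-9]?\d|1\d{2}|2[0-4]\d|25[0-5])"
        r"(\.([1-9]?\d|1\d{2}|2[0-4]\d|25[0-5])){3}$"
    ),
}


def is_valid(kind, value):
    """Return True if value matches the pattern registered under kind."""
    return PATTERNS[kind].match(value) is not None


print(is_valid("positive_integer", "42"))  # True
print(is_valid("integer", "-0"))           # False
print(is_valid("ipv4", "192.168.0.1"))     # True
print(is_valid("ipv4", "256.1.1.1"))       # False
```

<p class="MsoNormal">Compiling each pattern once and reusing it keeps the validation rules in one place, which fits the advice to record common expressions for later reference.</p>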
