"JS Review notes" 05 Regular expressions

Last Update:2016-01-23 Source: Internet

Author: User


Well, regular expressions: I have never memorized them. I have always just copied them from the Internet.

Here I'll go over them briefly, and whenever I come across a useful regular expression later I'll collect it in this post. (Though I doubt that will happen, since I'm not a professional front-end developer and am only dabbling. \(^o^)/)

Application scope: Regular expressions are primarily used to implement find, replace, and extract operations on information in a string.

Six methods work with regular expressions:

regexp.exec, regexp.test, string.match, string.replace, string.search, and string.split
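To make the list concrete, here is a small sketch that calls each of the six methods once. The sample text and the variable names are my own, chosen only for illustration.

```javascript
var text = "2016-01-23 and 2017-02-24";
var datePattern = /(\d{4})-(\d{2})-(\d{2})/;

// regexp.exec: returns the match array (full match plus capture groups) or null
var execResult = datePattern.exec(text);

// regexp.test: returns true or false
var hasDate = datePattern.test(text);

// string.match: without the g flag it returns the same kind of array as exec
var matchResult = text.match(datePattern);

// string.replace: $1, $2, $3 refer to the capture groups;
// without the g flag only the first match is replaced
var swapped = text.replace(datePattern, "$3/$2/$1");

// string.search: index of the first match, or -1
var position = text.search(/and/);

// string.split: the separator may be a regular expression
var pieces = text.split(/\s+and\s+/);

console.log(execResult, hasDate, matchResult, swapped, position, pieces);
```

exec and test belong to the regular expression object, while the other four are string methods that take a regular expression as an argument.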

Application reason: in JS, regular expressions usually have a significant performance advantage over the equivalent string operations.

Cons: as most people find, regular expressions can look complicated and hard to understand. If you asked a rookie like me to maintain one, I couldn't. When I can't copy one from the Internet, I usually solve the problem without regular expressions and give that a nice name: code readability!

In JS a regular expression literal must be written on a single line, and whitespace inside it is significant, so spaces need special attention.

Consider the following code:

var myregexp = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/; var url = "http://www.ora.com:8041/goodparts?q#fragment"; var result = myregexp.exec(url);

Can you tell at a glance what the code above means? Most people can't, and even those who can need a while to work through it. That's why people don't want to write this stuff. All right, that's the chapter, we're done.

Even so, I'm going to keep writing, because of what it achieves:

The result is the following array:

["http://www.ora.com:8041/goodparts?q#fragment", "http", "//", "www.ora.com", "8041", "goodparts", "q", "fragment"]

That's why I keep on writing.

Well, let's grit our teeth and learn the syntax:

^ anchors the match at the start of the string.
(?:([A-Za-z]+):)? matches a protocol name, but only when it is followed by a colon. (?:...) is a non-capturing group and the trailing ? makes it optional. The inner ([A-Za-z]+) is capture group 1, so the colon is matched but not captured; here group 1 is http.
(\/{0,3}) is capture group 2, which here matches the two slashes //.
- \/ is an escaped slash. The backslash is the escape character, the same one used in \n.
- {0,3} means the / is matched 0 to 3 times.
([0-9.\-A-Za-z]+) is capture group 3, which matches a host name such as www.baidu.com: one or more digits, letters, . or - characters. That means a URL like www.baidu...----com--- also matches.
(?::(\d+))? is an optional non-capturing group containing capture group 4, which matches the port number, that is, a run of digits preceded by a colon. Only the digits are captured and placed in the result array.
- \d represents a digit character; [0-9] achieves the same effect.
(?:\/([^?#]*))? is another optional non-capturing group, containing capture group 5, which captures goodparts.
- (?:\/(...))? matches a string beginning with a slash /, 0 or 1 times.
- [^?#] matches any character except ? and #. The ^ at the start of a character class means "not".
- The suffix * means match 0 or more times. The suffix + is almost the same, but starts from 1.
(?:\?([^#]*))? works the same way: an optional group beginning with ?, containing capture group 6, which captures the query string.
(?:#(.*))? is similar: an optional group beginning with #, containing capture group 7, which captures the fragment.

. matches any character except a line terminator.

$ anchors the match at the end of the string.
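The walk-through above can be checked with a short sketch. The labels in names and the describe helper are my own additions for illustration; only the regular expression comes from the text.

```javascript
var parseUrl = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;

// Hypothetical labels for result[0] through result[7]
var names = ["url", "scheme", "slash", "host", "port", "path", "query", "hash"];

// Hypothetical helper: turn the exec array into an object keyed by label
function describe(url) {
    var result = parseUrl.exec(url);
    if (result === null) {
        return null;
    }
    var parts = {};
    for (var i = 0; i < names.length; i += 1) {
        parts[names[i]] = result[i];
    }
    return parts;
}

var full = describe("http://www.ora.com:8041/goodparts?q#fragment");
var bare = describe("www.ora.com");             // optional groups that did not match are undefined
var odd = describe("www.baidu...----com---");   // the host class is very permissive

console.log(full, bare, odd);
```

Optional groups that take no part in the match show up as undefined in the result array, while group 2 matches the empty string when there are no slashes.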

To tell you the truth, while reading the book and writing this summary I kept thinking about one question.

When was it that I found regular expressions difficult?

It was when I was a complete rookie and didn't like to learn. Everything looked difficult, I was impatient, and I didn't want to settle down and study, so that impression stuck. Now it all seems simple.

I'll admit that I have basically never written a regular expression myself; I just copied them.

But after just one hour of study I think I can do it. I could write an impressive regular expression right now, however long: just write each capturing group on its own line, then join the lines into one when pasting into the code.

Sudden insight: a programmer just needs a quiet mind and an interest in learning.

I won't tell you that I'm writing this post while reading the book, so let's carry on.

At any rate, I now understand that regular expressions are not difficult, but it is still best to keep them as simple as possible.
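One possible way to do the "one group per line" trick without pasting lines together by hand is to keep each piece as its own small literal and join their source strings. This is only a sketch of that idea; the names urlPieces and composed are made up for the example.

```javascript
// Each piece is a regexp literal, so no double escaping of backslashes is needed.
var urlPieces = [
    /^/,
    /(?:([A-Za-z]+):)?/,      // group 1: scheme, only when followed by a colon
    /(\/{0,3})/,              // group 2: zero to three slashes
    /([0-9.\-A-Za-z]+)/,      // group 3: host
    /(?::(\d+))?/,            // group 4: port
    /(?:\/([^?#]*))?/,        // group 5: path
    /(?:\?([^#]*))?/,         // group 6: query
    /(?:#(.*))?/,             // group 7: fragment
    /$/
];

// Join the source text of every piece into one pattern
var composed = new RegExp(urlPieces.map(function (piece) {
    return piece.source;
}).join(""));

var composedResult = composed.exec("http://www.ora.com:8041/goodparts?q#fragment");
console.log(composedResult);
```

The composed expression behaves like the single-line literal, and each line can carry its own comment.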

So let's write a regular expression that matches numbers.

var myregexp = /^-?\d+(?:\.\d*)?(?:e[+\-]?\d+)?$/i; var url = "-1.3e-3"; var result = myregexp.test(url); // result is true
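To see what the number pattern accepts and rejects, here is a quick sketch that runs it over a handful of sample strings. The samples are my own picks.

```javascript
var numberPattern = /^-?\d+(?:\.\d*)?(?:e[+\-]?\d+)?$/i;

var samples = ["-1.3e-3", "42", "1.", "98.6", "1E10", ".5", "1e", " -1.3e-3", "12abc"];

// One true/false verdict per sample, in the same order
var verdicts = samples.map(function (sample) {
    return numberPattern.test(sample);
});

samples.forEach(function (sample, i) {
    console.log(JSON.stringify(sample), verdicts[i]);
});
```

The first five samples match. ".5" fails because at least one digit is required before the decimal point, "1e" fails because the exponent needs digits, and the last two fail because ^ and $ force the whole string, including any leading space, to be part of the number.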

The trailing i on the regular expression above means that case is ignored when matching. Let's expand on the flags:

Ending with i: ignore case when matching.
Ending with g: global (match multiple times). Using g with the test method is not recommended, and the string search method automatically ignores the g flag.
Ending with m: multiline (^ and $ can match at line terminators).
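The three flags can be tried out in a few lines. This is just a sketch with made-up sample strings; it also shows why g and test are an awkward combination.

```javascript
// i: ignore case
var caseInsensitive = /javascript/i.test("JavaScript");

// g: compare match without and with the flag
var firstOnly = "a1b2c3".match(/\d/);
var allDigits = "a1b2c3".match(/\d/g);

// g with test: lastIndex carries over between calls on the same object
var globalA = /a/g;
var firstCall = globalA.test("a");    // matches, lastIndex moves to 1
var secondCall = globalA.test("a");   // search starts at index 1, so no match

// search ignores the g flag and always reports the first match
var searchIndex = "xxa".search(/a/g);

// m: ^ and $ also match at line terminators
var twoLines = "first\nsecond";
var withoutM = /^second$/.test(twoLines);
var withM = /^second$/m.test(twoLines);

console.log(caseInsensitive, firstOnly[0], allDigits, firstCall, secondCall, searchIndex, withoutM, withM);
```

The second test call returns false even though the input is unchanged, which is the usual reason given for avoiding g with test.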

How to create a regular expression:

The simplest way is a regular expression literal, as I used above:
```
var myregexp = /^-?\d+$/i;
```
Another way is to use the RegExp constructor, which suits situations where the regular expression must be generated dynamically at run time. Because the pattern is passed as a string, backslashes must be doubled and quotes escaped.

var myregexp = new RegExp("\"(?:\\\\.|[^\\\\\\\"])*\"", 'g');
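As a sketch of the "generated at run time" case, the following builds a pattern from a word that is only known when the code runs. escapeForRegExp and wholeWord are hypothetical helpers written for this example, not built-in functions. The last lines compare the constructor form with the equivalent literal.

```javascript
// Hypothetical helper: put a backslash in front of every character
// that has a special meaning in a regular expression.
function escapeForRegExp(text) {
    return text.replace(/[\\\/\[\](){}?+*|.^$\-]/g, "\\$&");
}

// Hypothetical helper: build a pattern for a whole word chosen at run time.
function wholeWord(word) {
    return new RegExp("\\b" + escapeForRegExp(word) + "\\b", "gi");
}

var escaped = escapeForRegExp("1+1=2?");
var hits = "a.b axb A.B".match(wholeWord("a.b"));

// The same string-matching pattern written as a literal and via the constructor
var literalForm = /"(?:\\.|[^\\\"])*"/g;
var constructorForm = new RegExp("\"(?:\\\\.|[^\\\\\\\"])*\"", "g");
var sample = 'say "a\\"b" and "c"';
var fromLiteral = sample.match(literalForm);
var fromConstructor = sample.match(constructorForm);

console.log(escaped, hits, fromLiteral, fromConstructor);
```

Every backslash in the literal becomes two in the constructor string, which is why the constructor form is so much harder to read.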

Properties of a RegExp object
- global: true if the g flag was used.
- ignoreCase: true if the i flag was used.
- lastIndex: the index at which the next exec match starts. The initial value is 0.
- multiline: true if the m flag was used.
- source: the source text of the regular expression.
It is said that RegExp objects created from the same regular expression literal share a single instance. (I measured it myself and found otherwise. That sharing was ES3 behaviour; since ES5 every evaluation of a literal creates a new object.)
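The properties above can be inspected directly. The sketch below uses a made-up pattern and text, records lastIndex after each exec call, and checks whether two evaluations of the same literal give the same object.

```javascript
var vowelO = /o/gi;

var flagsSeen = {
    global: vowelO.global,
    ignoreCase: vowelO.ignoreCase,
    multiline: vowelO.multiline,
    source: vowelO.source
};

// With the g flag, each exec call resumes from lastIndex
var sampleText = "Foo boo";
var lastIndexes = [];
while (vowelO.exec(sampleText) !== null) {
    lastIndexes.push(vowelO.lastIndex);
}
var afterLoop = vowelO.lastIndex;   // reset to 0 once exec finds nothing more

// Evaluate the same literal twice and compare the resulting objects
function makeLiteral() {
    return /a/g;
}
var sameObject = makeLiteral() === makeLiteral();   // false in ES5 and later engines

console.log(flagsSeen, lastIndexes, afterLoop, sameObject);
```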

About elements that make up regular expressions

Alternation : two regular expressions can be joined into one with |. If the string matches either of the alternatives separated by |, the whole choice matches.
Quantifiers : a quantifier suffix says, simply put, how many times something should match.
- {3,6} means match 3 to 6 times
- * is equivalent to {0,}
- + is equivalent to {1,}
- ? is equivalent to {0,1}
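A small sketch of alternation and the quantifier equivalences listed above. The patterns and inputs are my own examples.

```javascript
// Alternation: either side of | may match
var catOrDog = /^(?:cat|dog)$/;
var choiceResults = ["cat", "dog", "cow"].map(function (word) {
    return catOrDog.test(word);
});

// {3,6}: between three and six repetitions
var threeToSix = /^a{3,6}$/;
var rangeResults = ["aa", "aaa", "aaaaaa", "aaaaaaa"].map(function (word) {
    return threeToSix.test(word);
});

// Each shorthand next to its {m,n} spelling
var pairs = [
    [/^a*$/, /^a{0,}$/],
    [/^a+$/, /^a{1,}$/],
    [/^a?$/, /^a{0,1}$/]
];
var inputs = ["", "a", "aa"];

// true when both spellings give the same verdict on every input
var allAgree = pairs.every(function (pair) {
    return inputs.every(function (input) {
        return pair[0].test(input) === pair[1].test(input);
    });
});

console.log(choiceResults, rangeResults, allAgree);
```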
A character class that matches the 32 ASCII special characters :
```
[!-\/:-@\[-`{-~]
```
Very ugly and hard to read. That's regular expressions for you, alas~~~
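To check what such a class really covers, one can simply run it over the whole ASCII table. This sketch assumes the class consists of the four ranges !-/, :-@, [-` and {-~.

```javascript
// Assumed form of the class: four ranges of ASCII punctuation
var asciiSpecial = /[!-\/:-@\[-`{-~]/;

// Collect every ASCII character the class matches
var matchedChars = "";
for (var code = 0; code < 128; code += 1) {
    var ch = String.fromCharCode(code);
    if (asciiSpecial.test(ch)) {
        matchedChars += ch;
    }
}

console.log(matchedChars.length, matchedChars);
```

The loop collects 32 characters, none of which is a letter, a digit, or whitespace.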
Regular expression group types
- Capturing: (...)
- Non-capturing: (?:...) groups the pattern without capturing the matched text, which gives a slight performance advantage.
- Positive lookahead: (?=...) The book's author says that this feature and the next one are not good features, so I've decided to start forgetting them.
- Negative lookahead: (?!...)
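The four group types can be compared side by side on one input. The patterns below are my own examples, not taken from the book.

```javascript
// Capturing: the digits appear as an extra element of the result
var capturing = /(\d+)px/.exec("12px");

// Non-capturing: grouping only, nothing extra in the result
var nonCapturing = /(?:\d+)px/.exec("12px");

// Positive lookahead: digits that are followed by "px"; the "px" is not consumed
var positive = /\d+(?=px)/.exec("12px 7em");

// Negative lookahead: digits that are NOT followed by another digit or by "px"
var negative = /\d+(?!\d|px)/.exec("12px 7em");

console.log(capturing, nonCapturing, positive, negative);
```

The \d alternative inside the negative lookahead matters: without it the engine backtracks to "1", which is not followed by "px", and reports that instead.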
Characters that must be escaped with a backslash to be matched literally : \ / [ ] ( ) { } ? + * | . ^ $ (and - inside a character class)
Some useful escape sequences:
- \f is the form feed character
- \n is the newline character
- \r is the carriage return character
- \t is the tab character
- \u specifies a Unicode character as a 16-bit hexadecimal constant
- \d is equivalent to [0-9]; \D is the opposite, equivalent to [^0-9]
- \s is equivalent to [\f\n\r\t\u000B\u0020\u00A0\u2028\u2029], which is a partial subset of the Unicode whitespace characters; \S is the opposite
- \w is equivalent to [0-9A-Z_a-z]; \W is the opposite. \w is meant to represent the characters that appear in words, but it is rarely useful for real text.
- A simple letter class is [A-Za-z\u00C0-\u1FFF\u2800-\uFFFD]. It includes all Unicode letters, but also thousands of characters that are not letters. An exact letter class would be huge and inefficient, so this simple one is used instead.
- \b is a word boundary anchor, meant to make it easy to match text at word boundaries. However, it uses \w to find the boundaries, so it is a bad feature for many languages.
- \1 \2 \3 are references to the text captured by groups 1, 2, and 3 respectively
  - So this regular expression can be used to search text for duplicated words separated by whitespace:
```
var doubledword = /([A-Za-z\u00C0-\u1FFF\u2800-\uFFFD]+)\s+\1/gi;
```
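Here is a sketch of the doubled-word pattern in use, together with a quick look at how \w and \b treat a non-ASCII word. The sample sentence is made up, and the accented word is written with \u escapes so the code stays plain ASCII.

```javascript
var doubledWords = /([A-Za-z\u00C0-\u1FFF\u2800-\uFFFD]+)\s+\1/gi;

// "\u00e9t\u00e9" is the French word for summer, spelled with escapes
var sentence = "It is is a test of the the regexp, \u00e9t\u00e9   \u00e9t\u00e9.";

// With the g flag, match returns every doubled word it finds
var found = sentence.match(doubledWords);

// $1 keeps a single copy of each doubled word
var deduped = sentence.replace(doubledWords, "$1");

// \w only knows ASCII letters, digits and underscore
var wordOnly = /^\w+$/.test("\u00e9t\u00e9");

// so \b sees "boundaries" in the middle of the accented word
var boundaryInside = /\bt\b/.test("\u00e9t\u00e9");

console.log(found, deduped, wordOnly, boundaryInside);
```

The last two results illustrate why \w and \b are described as poor tools for text that is not plain English.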
This article is an English version of an article originally written in Chinese on aliyun.com.
