JavaScript Regular Expressions (RegExp) Overview

Source: Internet
Author: User

First, consider the following two scenarios:

    1. On Windows, we sometimes need to find a file but cannot remember where it was saved. What should we do?
    2. While writing a paper in Word, we discover that we mistyped a word (for example, the word "order") throughout the document. What should we do?

In the first case, we use the system's file search: we enter the file name, and the system finds the file's location for us.
In the second case, we use the find-and-replace function provided by the software: first find the wrong text, then replace it with the correct text.

There are many similar scenarios.
What the two cases have in common is that both start from an input keyword, find specific content, and then process it further.

This is the most common use case for regular expressions.

In computing, regular expressions are used to match specific patterns of characters. Matching can also be understood as searching.

Let's take a look at the simplest regular expressions:

    var pat1 = /hi/;
    var pat2 = /hello/;
    var pat3 = /a/;
    var pat4 = /d/;
    var pat5 = /e/;
    var pat6 = /[f]/;

These are about as simple as regular expressions get.

As the examples show, a regular expression literal is written between two slashes.
Remember, the delimiter is a slash (/), not a backslash (\).
The top of a slash leans to the right; the top of a backslash leans to the left.
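
As a quick check of the slash syntax described above, the following sketch tests two of the literals against sample strings. The sample strings and the extra variable names (foundHi, foundA, pat1FromString, sameSource) are illustrative assumptions, not part of the original article.

```javascript
// A regular expression literal is written between two forward slashes.
var pat1 = /hi/;
var pat3 = /a/;

// test() returns true when the pattern occurs anywhere in the string.
var foundHi = pat1.test("say hi to everyone"); // true: the text contains "hi"
var foundA = pat3.test("hello");               // false: there is no "a" in "hello"

// The RegExp constructor builds the same pattern from a plain string.
var pat1FromString = new RegExp("hi");
var sameSource = pat1.source === pat1FromString.source; // true

console.log(foundHi, foundA, sameSource);
```

The literal form is usually the more readable choice when the pattern is known in advance; the constructor is useful when the pattern has to be assembled at run time.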

Next, let's get to know three kinds of brackets:
Parentheses: ()
Square brackets: []
Curly braces: {}
These three kinds of brackets are the heart of regular expression syntax.


Parentheses group part of a regular expression. Any run of characters or sub-pattern can be grouped, usually so that the group as a whole can be qualified by something else, such as a length constraint or a choice between alternatives.

    // matches: acd, bcd
    var patter1 = /(a|b)cd/;
    // matches: abcdefgh
    var patter2 = /abcd(ef)gh/;
    // matches: aefg, aefh, befg, befh, cefg, cefh, defg, defh
    var patter3 = /[abcd](ef)[gh]/;
    // matches the same strings as patter3, written without the group
    var patter4 = /[abcd]ef[gh]/;
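
The grouping behaviour can be tried out with a minimal sketch like the one below. The anchors ^ and $ are an addition here: they pin the pattern to the start and end of the string so that only whole-string matches count. The variable names and sample strings are assumptions made for illustration.

```javascript
// Alternation inside a group: "a" or "b", followed by "cd".
var groupAlt = /^(a|b)cd$/;

// The group around "ef" does not change which strings match,
// but the text it matched is captured and can be read back.
var groupEf = /^[abcd](ef)[gh]$/;

var r1 = groupAlt.test("acd"); // true
var r2 = groupAlt.test("bcd"); // true
var r3 = groupAlt.test("ccd"); // false: "c" is not one of the alternatives

var m = groupEf.exec("befh");
var wholeMatch = m[0]; // "befh"
var captured = m[1];   // "ef", the content of the first group

console.log(r1, r2, r3, wholeMatch, captured);
```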

Square brackets denote a set or range of characters. No matter how much is written inside them, one pair of square brackets matches exactly one character.

    // matches: ac, bc, ad, bd
    var patter1 = /[ab][cd]/;
    // matches: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z
    var patter2 = /[a-z]/;
    // matches: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    var patter3 = /[0-9]/;
    // matches: h, e, l, o
    var patter4 = /[helo]/;
    // matches any character except a, b, c
    var patter5 = /[^abc]/;
    // matches any character other than a lowercase letter
    var patter6 = /[^a-z]/;
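
To see that each pair of square brackets consumes exactly one character, here is a small sketch. The anchors ^ and $ (which force the whole string to match) and the sample data are assumptions added for the demonstration.

```javascript
// Two bracket sets in a row match exactly two characters.
var twoSets = /^[ab][cd]$/;

var candidates = ["ac", "bd", "ab", "abc"];
var accepted = candidates.filter(function (s) {
  return twoSets.test(s);
});
// accepted is ["ac", "bd"]: "ab" fails on the second set, "abc" is too long

// A leading ^ inside the brackets negates the set.
var notLower = /[^a-z]/;
var onlyLetters = notLower.test("hello");  // false: every character is a-z
var hasOther = notLower.test("hello!");    // true: "!" is outside a-z

console.log(accepted, onlyLetters, hasOther);
```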

Curly braces control length. They specify how many times the single character or sub-expression immediately before them is repeated.

    // matches: a
    var p1 = /a{1}/;
    // matches: abcabc
    var p2 = /(abc){2}/;
    // matches: abcabc, or more than two repetitions of abc
    var p3 = /(abc){2,}/;
    // matches: between 2 and 6 repetitions of abc, inclusive
    var p4 = /(abc){2,6}/;
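
One detail worth checking by hand is what the curly braces attach to: they repeat only the single character or the group directly in front of them. The sketch below contrasts the two; the anchors ^ and $ and the variable names are assumptions added for illustration.

```javascript
// Without parentheses, {2} applies only to the last character, "c".
var lastCharTwice = /^abc{2}$/;
// With parentheses, {2} applies to the whole group "abc".
var groupTwice = /^(abc){2}$/;

var a1 = lastCharTwice.test("abcc");   // true
var a2 = lastCharTwice.test("abcabc"); // false
var b1 = groupTwice.test("abcabc");    // true
var b2 = groupTwice.test("abcc");      // false

console.log(a1, a2, b1, b2);
```

So to repeat a whole word, the word has to be wrapped in parentheses first.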

Let's look at some examples of regular expressions:

    // matches a or b or c
    var p1 = /[abc]/;
    // matches abc
    var p2 = /abc/;
    // matches any one letter from a to z
    var p3 = /[a-z]/;
    // matches any one Arabic numeral
    var p4 = /[0-9]/;
    // matches 1 or 2 or 3 or 4
    var p5 = /[1234]/;
    // matches one, two or three repetitions of "hello"
    var p6 = /(hello){1,3}/;
    // matches ac, acac, ad, adad, bc, bcbc, bd, bdbd (and mixed pairs such as acbd)
    var p7 = /([ab][cd]){1,2}/;
    // matches one abd or one acd
    var p8 = /(a[bc]d){1}/;
    // matches one or more (1, 2, 3, 4, 5, 6 ...) repetitions of abd or acd
    var p9 = /(a[bc]d){1,}/;
    // matches
    // adad, aeae, afaf
    // bdbd, bebe, bfbf
    // cdcd, cece, cfcf
    // (and mixed pairs such as adbe)
    var p10 = /([abc][def]){2}/;
    // matches abab
    var p11 = /([a][b]){2}/;
    // matches abab
    var p12 = /(ab){2}/;
    // matches abb
    var p13 = /ab{2}/;
    // matches abab, cdcd (and abcd, cdab); parentheses inside square brackets
    // are literal characters, so the group must be written outside them
    var p14 = /(ab|cd){2}/;

From this we can see that the core of regular expression syntax is three kinds of brackets: (), [], {}
Parentheses group characters into a sub-expression, which can then be treated like a single character.
Square brackets list a set of characters; the match is exactly one of them.
Curly braces only set the number of repetitions of a character or sub-expression (which can be treated like a single character), in three forms:
{n} means exactly n repetitions
{n,} means at least n repetitions, with no upper limit
{n,m} means between n and m repetitions, inclusive
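
The three curly-brace forms can be compared side by side with a short sketch. It assumes an environment that provides String.prototype.repeat (ES2015 or later); the variable names and the helper function are hypothetical.

```javascript
var exactly3 = /^a{3}$/;   // {n}: exactly 3
var atLeast2 = /^a{2,}$/;  // {n,}: 2 or more
var from2to4 = /^a{2,4}$/; // {n,m}: between 2 and 4, inclusive

var lengths = [1, 2, 3, 4, 5];

// Returns the lengths for which a string of that many "a" characters matches.
function matchingLengths(pattern) {
  return lengths.filter(function (n) {
    return pattern.test("a".repeat(n));
  });
}

var exactlyOk = matchingLengths(exactly3); // [3]
var atLeastOk = matchingLengths(atLeast2); // [2, 3, 4, 5]
var rangeOk = matchingLengths(from2to4);   // [2, 3, 4]

console.log(exactlyOk, atLeastOk, rangeOk);
```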

Learning regular expressions starts with mastering this basic syntax.

Are these really the regular expressions people talk about? They look different from the ones you may have seen before. Don't worry, let's look at a few more:

    var pt1 = /^((https|http|ftp|rtsp|mms)?:\/\/)[^\s]+/;
    var pt2 = /\w[-\w.+]*@([A-Za-z0-9][-A-Za-z0-9]+\.)+[A-Za-z]{2,14}/;
    var pt3 = /[A-Za-z0-9_\-\u4e00-\u9fa5]+/;
    var pt4 = /[^\x00-\xff]/;
    var pt5 = /[\u4e00-\u9fa5]/;

These five look more like the regular expressions you may have seen.

Yes, they are regular expressions too. So why do they look different from what we just learned?
Recall that the core of a regular expression is three kinds of brackets: (), [], {}
Now look at the expressions above:
The first contains () and []
The second contains (), [] and {}
The third contains []
The fourth and fifth contain []
But what are the other, seemingly chaotic symbols in these expressions?
For example: |, ^, \, ., +, ?
There are others as well, not listed here.
Let's go back to the curly braces.
{0}      exactly 0 times
{0,1}    0 or 1 time
{0,}     0 or more times
{1}      exactly once
{1,}     at least once
Curly braces set the number of repetitions. Some people found it tedious to always write curly braces, so the following equivalent shorthand symbols were created:
*, any number of times, 0 or more, i.e. 0, 1, 2, 3, 4, 5, ...
Equivalent to {0,}
+, one or more times (at least once)
Equivalent to {1,}
?, 0 or 1 time
Equivalent to {0,1}
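
The equivalences listed above can be checked mechanically. The following sketch compares each shorthand with its curly-brace form on a few sample strings; the patterns, samples, and variable names are assumptions chosen for the demonstration.

```javascript
// Each pair holds a shorthand pattern and its curly-brace equivalent.
var pairs = [
  [/^ab*c$/, /^ab{0,}c$/],
  [/^ab+c$/, /^ab{1,}c$/],
  [/^ab?c$/, /^ab{0,1}c$/]
];
var samples = ["ac", "abc", "abbc", "abbbc"];

// True when both patterns of every pair agree on every sample.
var allAgree = pairs.every(function (pair) {
  return samples.every(function (s) {
    return pair[0].test(s) === pair[1].test(s);
  });
});

function resultsFor(pattern) {
  return samples.map(function (s) { return pattern.test(s); });
}

var starResults = resultsFor(/^ab*c$/); // [true, true, true, true]
var plusResults = resultsFor(/^ab+c$/); // [false, true, true, true]
var optResults = resultsFor(/^ab?c$/);  // [true, true, false, false]

console.log(allAgree, starResults, plusResults, optResults);
```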

So the five expressions above can also be written as follows.

    var pt1 = /^((https|http|ftp|rtsp|mms){0,1}:\/\/)[^\s]{1,}/;
    var pt2 = /\w[-\w.+]{0,}@([A-Za-z0-9][-A-Za-z0-9]{1,}\.){1,}[A-Za-z]{2,14}/;
    var pt3 = /[A-Za-z0-9_\-\u4e00-\u9fa5]{1,}/;
    var pt4 = /[^\x00-\xff]/;
    var pt5 = /[\u4e00-\u9fa5]/;
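
As a rough check that the two spellings of the first expression behave the same, the sketch below runs both against a handful of inputs. The patterns are written here without stray spaces, and the sample strings (using the reserved example.com domain) and variable names are assumptions.

```javascript
var shortForm = /^((https|http|ftp|rtsp|mms)?:\/\/)[^\s]+/;
var longForm = /^((https|http|ftp|rtsp|mms){0,1}:\/\/)[^\s]{1,}/;

var urlSamples = [
  "https://example.com/a?b=1", // scheme, "://", then non-space characters
  "ftp://files.example.com",
  "://no-scheme",              // the scheme is optional in this pattern
  "example.com",               // no "://" at the start
  "http:// space"              // whitespace directly after "://"
];

var shortResults = urlSamples.map(function (s) { return shortForm.test(s); });
var longResults = urlSamples.map(function (s) { return longForm.test(s); });
// both are [true, true, true, false, false]

console.log(shortResults, longResults);
```

Note that the pattern accepts "://no-scheme" because the scheme group is optional; it is a loose check, not a full URL validator.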

Therefore, when expressing a number of repetitions, the three symbols (*, +, ?) and the corresponding curly-brace forms are interchangeable.

Once these symbols appear in a regular expression, it naturally starts to look like a mess.

Are there other shorthands like these? Of course.

The table below lists shorthand escapes that can stand in for longer expressions in a regular expression.

\b    Matches a word boundary, that is, the position between a word character and a non-word character such as a space.
\B    Matches a non-word boundary.
\cx   Matches the control character indicated by x.
\d    Matches a digit. Equivalent to [0-9].
\D    Matches a non-digit character. Equivalent to [^0-9].
\f    Matches a form feed (page break). Equivalent to \x0c and \cL.
\n    Matches a line feed (newline). Equivalent to \x0a and \cJ.
\r    Matches a carriage return. Equivalent to \x0d and \cM.
\s    Matches any whitespace character, including spaces, tabs, form feeds, and so on. For ASCII text, equivalent to [ \f\n\r\t\v].
\S    Matches any non-whitespace character. For ASCII text, equivalent to [^ \f\n\r\t\v].
\t    Matches a tab character. Equivalent to \x09 and \cI.
\v    Matches a vertical tab. Equivalent to \x0b and \cK.
\w    Matches any word character, including the underscore. In JavaScript this is equivalent to [A-Za-z0-9_]; some other regular expression engines also count Unicode letters as word characters.
\W    Matches any non-word character. Equivalent to [^A-Za-z0-9_].
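
A few of these escapes in action, as a minimal sketch. The sample sentence is made up, and the g (global) flag used here makes match() and replace() act on every occurrence rather than only the first.

```javascript
var text = "Order 66 shipped on 2024-01-15";

var digitRuns = text.match(/\d+/g); // ["66", "2024", "01", "15"]
var words = text.match(/\w+/g);     // ["Order", "66", "shipped", "on", "2024", "01", "15"]
var parts = text.split(/\s+/);      // ["Order", "66", "shipped", "on", "2024-01-15"]

// \b requires a word boundary on each side of "on".
var wholeWordOn = /\bon\b/.test(text);     // true: "on" stands alone
var onInsideWord = /\bon\b/.test("money"); // false: "on" is inside a word

// \D matches everything that is not a digit.
var digitsOnly = "a1b2".replace(/\D/g, ""); // "12"

console.log(digitRuns, words, parts, wholeWordOn, onInsideWord, digitsOnly);
```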

Using the table above, we can rewrite the five regular expressions once more, replacing some of their character sets with shorthand.

Readers may want to try this as an exercise.
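
One possible answer for the third expression is sketched below: because \w covers A-Z, a-z, 0-9 and the underscore in JavaScript, it can replace that part of the set. The anchors ^ and $, the sample strings, and the variable names are assumptions added so that the two versions can be compared on whole strings.

```javascript
var pt3Long = /^[A-Za-z0-9_\-\u4e00-\u9fa5]+$/;
var pt3Short = /^[\w\-\u4e00-\u9fa5]+$/; // \w replaces A-Za-z0-9_

var samples = [
  "user_01",
  "\u4e2d\u6587-name", // two Chinese characters, a hyphen, then "name"
  "has space",
  "42"
];

var longResults = samples.map(function (s) { return pt3Long.test(s); });
var shortResults = samples.map(function (s) { return pt3Short.test(s); });
// both are [true, true, false, true]

// The same idea for digits: \d stands in for [0-9].
var digitsAgree = /^[0-9]+$/.test("42") === /^\d+$/.test("42") &&
                  /^[0-9]+$/.test("4a2") === /^\d+$/.test("4a2"); // true

console.log(longResults, shortResults, digitsAgree);
```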

Now consider a question: what if we want to match parentheses, square brackets, or curly braces themselves?
This is where the backslash plays a big role.
Escape character: \
An escape character changes the meaning of the character that follows it: a special character becomes a literal one, and some ordinary letters become special.
\\ matches "\"
\( matches "("
\) matches ")"
\n matches a newline character
\[ matches "["

For example:

    // matches: hi[j]
    var patter1 = /hi\[j\]/;
    // matches: \hello
    var patter2 = /\\hello/;
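
When the text to match comes from a variable, the escaping has to be done in code. The helper below, escapeRegExp, is a hypothetical function written for this sketch (it is a commonly used idiom, not something defined in this article): it puts a backslash in front of every character that has a special meaning in a pattern.

```javascript
// Hypothetical helper: prefix each special character with a backslash.
// In the replacement string, "$&" stands for the matched character.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

var literal = "hi[j]";
var escaped = escapeRegExp(literal); // the characters hi\[j\]

var safePattern = new RegExp(escaped);
var rawPattern = new RegExp(literal); // same as /hi[j]/, which matches "hij"

var safeHit = safePattern.test("say hi[j] now"); // true
var rawHit = rawPattern.test("say hi[j] now");   // false
var rawHij = rawPattern.test("hij");             // true

console.log(escaped, safeHit, rawHit, rawHij);
```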

Escape characters exist in essentially every programming language, and they work the same way there: the character after the escape character is given a different meaning.

This has been the simplest possible description of regular expressions, with almost no introduction to the underlying concepts. If you want to learn more, you will need to get familiar with the following terms:

RegExp, quantifier, metacharacter, backreference, backtracking, grouping, sub-expression, zero-width assertion, greedy mode, lazy mode, lookahead and lookbehind.

In daily work, though, the basic syntax is enough to get started. With hands-on practice on real tasks, you will become more and more familiar with regular expressions.

That concludes the introduction to regular expressions.

Note that regular expressions are not a programming language. Almost every programming language now has a regular expression module for handling strings. JavaScript, for example, has a dedicated RegExp object for regular expression processing. Other languages, such as PHP, Java, and Python, have similar facilities.

I am a front-end developer, so let me share some of my experience of using regular expressions in JavaScript for front-end development.

JavaScript has seven methods in total for working with regular expressions.

The RegExp object has three methods:

compile() (deprecated in modern JavaScript)

exec()

test()

The String object has four methods:

search()

match()

replace()

split()

    // Create a string
    var a_str = "Hello World, She's a beautiful girl and he's a boy";
    // Create a regular expression object
    var patter = /he/i;

    // RegExp object methods:
    // exec() returns an array holding the result of the match, or null if there is none.
    // This method is very powerful; this is just one simple example.
    patter.exec(a_str);
    // test() reports whether the string contains a match: true if so, false otherwise
    patter.test(a_str); // true

    // String methods:
    // search() returns the starting index of the first match, or -1 if there is none
    a_str.search(patter);
    // match() returns one or more matching values
    a_str.match(patter);
    // replace() replaces a matched substring
    a_str.replace(patter, "Hello");
    // split() splits the string at each match
    a_str.split(patter);
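
To make the return values concrete, here is the same string and pattern with each result stored in a variable. The result variable names are assumptions; the expected values in the comments follow from the pattern /he/i matching "He" at the start of the string, "he" inside "She's", and "he" in "he's".

```javascript
var a_str = "Hello World, She's a beautiful girl and he's a boy";
var patter = /he/i;

var firstMatch = patter.exec(a_str);  // array; firstMatch[0] is "He", firstMatch.index is 0
var hasMatch = patter.test(a_str);    // true
var position = a_str.search(patter);  // 0
var oneMatch = a_str.match(patter);   // without the g flag: only the first match, "He"
var allMatches = a_str.match(/he/gi); // with the g flag: ["He", "he", "he"]
var replaced = a_str.replace(patter, "Hello");
// "Hellollo World, She's a beautiful girl and he's a boy" (only the first match is replaced)
var pieces = a_str.split(patter);
// ["", "llo World, S", "'s a beautiful girl and ", "'s a boy"]

console.log(firstMatch[0], hasMatch, position, allMatches, replaced, pieces);
```

Without the g flag, replace() touches only the first match, while split() always cuts at every match.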

JavaScript regular expressions also have modifiers (flags), which have not been mentioned yet. The three most common are global (g), ignore case (i), and multiline (m).

    // Ignore case
    var patter1 = /hello/i;
    // Multiline matching
    var patter2 = /hello/m;
    // Global match
    var patter3 = /hello/g;
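
The effect of each modifier is easiest to see on a small multi-line string. The sketch below is illustrative only; the sample text and variable names are assumptions.

```javascript
var text = "Hello\nhello\nHELLO";

// g: find every match instead of stopping at the first one.
var caseSensitive = text.match(/hello/g); // ["hello"]
// i: ignore upper/lower case.
var ignoreCase = text.match(/hello/gi);   // ["Hello", "hello", "HELLO"]

// Without g, replace() changes only the first match.
var firstOnly = text.replace(/hello/i, "bye"); // "bye\nhello\nHELLO"
var everyOne = text.replace(/hello/gi, "bye"); // "bye\nbye\nbye"

// m: ^ and $ match at the start and end of each line, not just the whole string.
var withoutM = text.match(/^hello/gi); // ["Hello"]
var withM = text.match(/^hello/gim);   // ["Hello", "hello", "HELLO"]

console.log(caseSensitive, ignoreCase, firstOnly, everyOne, withoutM, withM);
```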

Regular expressions are a powerful tool in development. Once you have learned them, there is little in text processing you need to fear.

I'm the planet Don. Programmers of every kind are welcome to get in touch.

No.; Pelligit

qq:2653807423

GitHub: www.github.com/pelligit

Good night.
