Use of Regular Expressions in JavaScript and Basic Syntax
Preface
To many people, regular expressions look like a string of incomprehensible characters, but these symbols implement efficient string operations. Often the problem itself is not complicated, yet it becomes a big problem without regular expressions. Regular expressions are important knowledge in JavaScript. This article introduces their basic syntax.
Definition
A regular expression is the syntax specification of a small language. It is a powerful, convenient, and efficient text processing tool, used by a number of methods to search, replace, and extract information in strings.
Regular expressions in JavaScript are represented by RegExp objects. There are two ways to write them: the literal form and the constructor form.
Regular expressions are especially useful for processing strings, and they can be used in many places in JavaScript. This article summarizes the basics of regular expressions and their use in JavaScript.
The first section briefly lists the use cases of regular expressions in JavaScript. The second section describes the basics of regular expressions in detail, with examples for ease of understanding.
This article is a summary I wrote after reading about how to write regular expressions and the section on JavaScript regular expressions in the rhino book, so it may have omissions and may not be rigorous. If an expert passing by finds a mistake in the text, please point it out. Thank you!
Use of Regular Expressions in Javascript
A regular expression can be thought of as a description of the features of a character sequence; its job is to find the substrings that meet those conditions within a string. For example, I define a regular expression in JavaScript:
var reg = /hello/; // or: var reg = new RegExp("hello");
This regular expression can then be used to find the word hello in a string. The result of the "find" action may be to locate the position of the first hello, to replace hello with another string, or to find every hello. The following lists the functions that accept regular expressions in JavaScript and briefly introduces what each one does. More complex usage is introduced in the second section.
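As a minimal sketch of the two ways of defining a regular expression mentioned above, the following compares the literal form with the constructor form. The variable names (literalReg, constructedReg, word, dynamicReg, digitReg) are made up for illustration and do not come from the original text.

```javascript
// Literal form and constructor form describe the same pattern.
var literalReg = /hello/;
var constructedReg = new RegExp("hello");
var sameSource = literalReg.source === constructedReg.source; // both are "hello"

// The constructor form is handy when the pattern is only known at run time.
// `word` is a hypothetical value standing in for such input.
var word = "hello";
var dynamicReg = new RegExp(word, "g"); // the second argument carries the flags
var found = "abchelloasdasdhelloasd".match(dynamicReg);

// Inside a string, a backslash must be doubled: "\\d+" is the pattern \d+.
var digitReg = new RegExp("\\d+");
var hasDigits = digitReg.test("abc123");

console.log(sameSource, found, hasDigits);
```

The literal form is usually easier to read; the constructor form is mainly worth its extra escaping when the pattern has to be assembled from a variable.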
String.prototype.search method
Finds the index of the first occurrence of a substring in the original string. If no match is found, -1 is returned.
"abchello".search(/hello/); // 3
String.prototype.replace method
Replaces substrings in a string.
"abchello".replace(/hello/,"hi"); // "abchi"
String.prototype.split method
Splits a string.
"abchelloasdasdhelloasd".split(/hello/); //["abc", "asdasd", "asd"]
String.prototype.match method
Captures the matching substrings of a string into an array. By default, only the first result is captured. When the regular expression has the global attribute (the g flag is added when the regular expression is defined), all results are captured into the array.
"abchelloasdasdhelloasd".match(/hello/); // ["hello"]
"abchelloasdasdhelloasd".match(/hello/g); // ["hello", "hello"]
The match method also treats groups differently depending on whether the regular expression passed to it has the global attribute; this is discussed in the section on regular expression groups below.
RegExp.prototype.test method
Used to test whether a string contains a substring.
/hello/.test("abchello"); // true
RegExp.prototype.exec method
Similar to the string match method, this method also captures matching substrings into an array, but there are two differences.
1. The exec method captures only one match at a time, regardless of whether the regular expression has the global attribute.
var reg = /hello/g;
reg.exec("abchelloasdasdhelloasd"); // ["hello"]
2. The regular expression object (the RegExp object in JavaScript) has a lastIndex property, which indicates the position from which the next capture starts. When the regular expression has the global attribute, each call to exec moves lastIndex forward; once no further match is found, exec returns null and lastIndex is reset, so capturing starts from the beginning again. This property can be used to traverse all the matching substrings in a string.
var reg = /hello/g;
reg.lastIndex; // 0
reg.exec("abchelloasdasdhelloasd"); // ["hello"]
reg.lastIndex; // 8
reg.exec("abchelloasdasdhelloasd"); // ["hello"]
reg.lastIndex; // 19
reg.exec("abchelloasdasdhelloasd"); // null
reg.lastIndex; // 0
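The traversal described in point 2 can be sketched as a loop. This is only an illustration; the names execReg, execText, positions, and execResult are assumptions introduced here.

```javascript
// Walk through every match by calling exec repeatedly on a global regex.
var execReg = /hello/g;
var execText = "abchelloasdasdhelloasd";
var positions = [];
var execResult;

// exec returns null once no further match exists, which ends the loop
// and resets lastIndex to 0.
while ((execResult = execReg.exec(execText)) !== null) {
  // index is where this match starts; lastIndex is where the next search begins.
  positions.push({ start: execResult.index, next: execReg.lastIndex });
}

console.log(positions);         // matches start at 3 and 14
console.log(execReg.lastIndex); // 0 after the loop
```

Note that the loop relies on the g flag: without it, lastIndex never advances and exec would keep returning the first match.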
Regular Expression Basics
Metacharacters
The first section used /hello/ as an example. In practice, however, you may need to match a digit whose value is not known in advance, the start of the string, the end of the string, or whitespace. In these cases, metacharacters can be used.
Metacharacters:
// Match a digit: \d
"ad3ad2ad".match(/\d/g); // ["3", "2"]
// Match any character except a line break: .
"a\nb\rc".match(/./g); // ["a", "b", "c"]
// Match a letter, digit, or underscore: \w
"a5_汉字@!-=".match(/\w/g); // ["a", "5", "_"]
// Match whitespace: \s
"\n \r".match(/\s/g); // ["\n", " ", "\r"] the first result is \n, the last result is \r
// Match the start or end position of a word: \b
"how are you".match(/\b\w/g); // ["h", "a", "y"]
// Match the start and end of the string: start ^, end $
"how are you".match(/^\w/g); // ["h"]
The negated form of a metacharacter is written by changing its lowercase letter to uppercase. For example, \D matches every character that is not a digit.
There are also metacharacters that express repetition; they are described below.
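As a small sketch of the negated metacharacters, the following applies \D, \W, \S, and \B to short strings. The variable names and sample strings are assumptions chosen for illustration.

```javascript
var sample = "a1 _-";

// \D: every character that is not a digit.
var nonDigits = sample.match(/\D/g);     // "a", " ", "_", "-"
// \W: every character that is not a letter, digit, or underscore.
var nonWordChars = sample.match(/\W/g);  // " ", "-"
// \S: every character that is not whitespace.
var nonSpaces = sample.match(/\S/g);     // "a", "1", "_", "-"
// \B: a position that is not a word boundary, so the first letter is skipped.
var innerLetters = "hello".match(/\B\w/g); // "e", "l", "l", "o"

console.log(nonDigits, nonWordChars, nonSpaces, innerLetters);
```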
Character range
Use the - symbol inside [] to indicate a character range. For example:
// Match any letter from a to z
/[a-z]/
// Match any character whose Unicode code lies between 0 and z
/[0-z]/
// Unicode encoding query address:
// Chinese characters lie between \u4E00 and \u9FA5, so we can write a regular expression to determine whether a string contains Chinese characters
/[\u4E00-\u9FA5]/.test("测试"); // true
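A few more character-range patterns, as a hedged sketch. The sample strings and variable names are assumptions; the [^...] form (a negated set) is standard regular expression syntax, although the text above does not introduce it by name.

```javascript
var mixed = "a1B2-z";

// Several ranges can be combined inside one pair of brackets.
var lowerOrDigit = mixed.match(/[a-z0-9]/g);          // "a", "1", "2", "z"
// A leading ^ inside the brackets negates the set.
var neitherLowerNorDigit = mixed.match(/[^a-z0-9]/g); // "B", "-"
// Individual characters can be listed without a range.
var spellings = "gray grey groy".match(/gr[ae]y/g);   // "gray", "grey"
// [0-z] covers every code from "0" (48) to "z" (122), so "A" (65) is included
// while "-" (45) is not.
var upperInRange = /[0-z]/.test("A");
var hyphenInRange = /[0-z]/.test("-");

console.log(lowerOrDigit, neitherLowerNorDigit, spellings, upperInRange, hyphenInRange);
```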
Repetition & greedy and lazy modes
First, repetition. When we want to match repeated characters, we need the repetition-related syntax shown below:
// Repeat n times: {n}
"test12".match(/test\d{3}/); // null
"test123".match(/test\d{3}/); // ["test123"]
// Repeat n or more times: {n,}
"test123".match(/test\d{3,}/); // ["test123"]
// Repeat n to m times: {n,m}
"test12".match(/test\d{3,5}/); // null
"test12345".match(/test\d{3,5}/); // ["test12345"]
"test12345678".match(/test\d{3,5}/); // ["test12345"]
// Match "test" followed by digits repeated zero or more times: *
"test".match(/test\d*/); // ["test"]
"test123".match(/test\d*/); // ["test123"]
// Repeat one or more times: +
"test".match(/test\d+/); // null
"test1".match(/test\d+/); // ["test1"]
// Repeat zero times or once: ?
"test".match(/test\d?/); // ["test"]
"test1".match(/test\d?/); // ["test1"]
From the above results, we can see that when the digits following test may repeat zero or more times, the substring captured by the regular expression contains as many digits as possible. For example, when /test\d*/ is matched against test123, test123 is returned instead of test or test12.
When a regular expression captures a string, it captures as many characters as possible while the conditions are still met. This is the so-called "greedy mode".
The corresponding "lazy mode" captures as few characters as possible while the conditions are still met. Lazy mode is enabled by adding "?" after the repetition symbol, written as follows:
// Digits repeated 3 to 5 times; when the conditions are met, return as few digits as possible
"test12345".match(/test\d{3,5}?/); // ["test123"]
// Digits repeated once or more; when the conditions are met, return only one digit
"test12345".match(/test\d+?/); // ["test1"]
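The difference between greedy and lazy repetition is easiest to see when a delimiter appears more than once. The following is a sketch under the assumption of a made-up tagged string; it is not meant as a way to parse real HTML.

```javascript
// Hypothetical sample text with two tagged segments.
var tagged = "<b>one</b><b>two</b>";

// Greedy: .* runs to the last </b> it can reach.
var greedyMatch = tagged.match(/<b>.*<\/b>/)[0];
// Lazy: .*? stops at the first </b> that satisfies the pattern.
var lazyMatch = tagged.match(/<b>.*?<\/b>/)[0];
// Combined with the g flag, the lazy form yields each segment separately.
var lazyAll = tagged.match(/<b>.*?<\/b>/g);

console.log(greedyMatch); // the whole string
console.log(lazyMatch);   // only the first segment
console.log(lazyAll);     // both segments, one per array element
```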
Character escape
In a regular expression, metacharacters have special meanings. To match a metacharacter itself, we need to escape it with a backslash. For example:
/\./.test("."); // true
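When a pattern is built from arbitrary text, every metacharacter in that text needs escaping. The helper below, escapeForRegExp, is a hypothetical function written for this sketch (it is not a built-in), using a commonly seen escaping pattern.

```javascript
// Hypothetical helper: put a backslash in front of every metacharacter.
function escapeForRegExp(text) {
  // "$&" in the replacement stands for the matched character itself.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Unescaped, the dot matches any character.
var unescapedMatchesX = /a.b/.test("axb");

// Escaped, the dot only matches a literal dot.
var literalDot = new RegExp(escapeForRegExp("a.b"));
var escapedMatchesDot = literalDot.test("a.b");
var escapedMatchesX = literalDot.test("axb");

var escapedText = escapeForRegExp("1+1=2?");

console.log(unescapedMatchesX, escapedMatchesDot, escapedMatchesX, escapedText);
```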
Group & Branch Conditions
A regular expression can be divided into groups with "()". In addition to the substring matched by the regular expression as a whole, the part of the string matched by each group is also captured.
Each group is assigned a numeric group number based on nesting and on left-to-right order. In some scenarios, a group can be referred to by its group number.
Groups play different roles in the replace, match, and exec functions.
In the replace function, $ followed by a group number can be used in the second parameter to refer to the content matched by that group, for example:
"The best language in the world is java".replace(/(java)/, "$1script"); // "The best language in the world is javascript"
"/static/app1/js/index.js".replace(/(\/\w+)\.js/, "$1-v0.0.1.js"); // "/static/app1/js/index-v0.0.1.js"
The (\/\w+) group matches /index, and the second parameter appends the version number to it.
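To make group numbering and $-references more concrete, here is a minimal sketch. The sample strings (a date and "3 apples") and all variable names are assumptions used only for illustration.

```javascript
// Groups are numbered by the position of their opening parenthesis,
// so the outer group is 1 and the nested groups are 2 and 3.
var nested = /((a)(b))c/.exec("abc");

// Several groups can be rearranged in the replacement string.
var isoDate = "2024-01-31"; // hypothetical input
var reordered = isoDate.replace(/(\d{4})-(\d{2})-(\d{2})/, "$3/$2/$1");

// The second argument may also be a function; it receives the whole match
// followed by the content of each group.
var doubled = "3 apples".replace(/(\d+) (\w+)/, function (whole, count, item) {
  return (Number(count) * 2) + " " + item;
});

console.log(nested[1], nested[2], nested[3]);
console.log(reordered);
console.log(doubled);
```

The function form is useful when the replacement has to be computed from the group content instead of merely repositioned.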
In the match function, when a regular expression has a global attribute, all the substrings that meet the regular expression are captured.
"abchellodefhellog".match(/h(ell)o/g); //["hello", "hello"]
However, when a regular expression does not have the global attribute and contains groups, the match function returns only the first match of the entire regular expression, and the strings matched by the groups are also placed in the result array:
"abchellodefhellog".match(/h(ell)o/); // ["hello", "ell"]
// We can use the match function to break down a URL and get information such as the protocol, host, path, and query string
"http://www.baidu.com/test?t=5".match(/^((\w+):\/\/([\w\.]+))\/([^?]+)\?(\S+)$/); // ["http://www.baidu.com/test?t=5", "http://www.baidu.com", "http", "www.baidu.com", "test", "t=5"]
The exec function is similar to the match function when there are groups in a regular expression. It only returns one result regardless of whether the regular expression has a global attribute.
/h(ell)o/g.exec("abchellodefhellog"); //["hello", "ell"]
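Because match with the g flag drops group content while exec keeps it, a loop over exec is one way to collect the groups of every match. The following is a sketch; groupReg, groupText, captured, and groupMatch are names introduced here.

```javascript
var groupReg = /h(ell)o/g;
var groupText = "abchellodefhellog";
var captured = [];
var groupMatch;

// Each exec call returns one match together with its groups.
while ((groupMatch = groupReg.exec(groupText)) !== null) {
  captured.push({
    whole: groupMatch[0],   // text matched by the entire expression
    group: groupMatch[1],   // text matched by the first group
    index: groupMatch.index // where the match starts
  });
}

console.log(captured); // two entries, starting at 3 and 11
```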
When a regular expression needs to match any one of several alternatives, a branch condition can be used, for example:
"asdasd hi asdad hello asdasd".replace(/hi|hello/,"nihao"); //"asdasd nihao asdad hello asdasd"
"asdasd hi asdad hello asdasd".split(/hi|hello/); //["asdasd ", " asdad ", " asdasd"]
Note that a branch condition applies to everything on both sides of the | symbol. For example, hi|hello matches hi or hello, not hiello or hhello.
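How far a branch condition reaches matters most when anchors are involved. The sketch below uses made-up test words; the names loose and strict are assumptions.

```javascript
// Without parentheses, ^ belongs only to "hi" and $ only to "hello".
var loose = /^hi|hello$/;
var looseHigh = loose.test("high");          // "^hi" alone is enough
var looseSayHello = loose.test("say hello"); // "hello$" alone is enough

// With parentheses, both anchors apply to whichever alternative is chosen.
var strict = /^(hi|hello)$/;
var strictHigh = strict.test("high");
var strictHi = strict.test("hi");
var strictHello = strict.test("hello");
var strictHiello = strict.test("hiello");

console.log(looseHigh, looseSayHello, strictHigh, strictHi, strictHello, strictHiello);
```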
The branch condition in the group does not affect the content outside the group.
"abc acd bbc bcd ".match(/(a|b)bc/g); //["abc", "bbc"]
Backreference
A group in a regular expression can be referenced later in the same expression with \ followed by its group number.
For example:
// Match duplicated words
/(\b[a-zA-Z]+\b)\s+\1/.exec("asd sf hello hello asd"); // ["hello hello", "hello"]
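Another typical use of a backreference is to require that a closing delimiter be the same character as the opening one. This is a sketch with assumed sample strings; quotedReg and the other names are introduced here.

```javascript
// Group 1 captures the opening quote; \1 demands the same quote at the end.
var quotedReg = /(["'])[^"']*\1/;

var doubleQuoted = quotedReg.exec('say "hi" now');
var singleQuoted = quotedReg.exec("it is 'ok' here");
// Opening with " and closing with ' does not satisfy \1.
var mismatched = quotedReg.test("say \"hi' now");

console.log(doubleQuoted[0]); // the double-quoted part
console.log(singleQuoted[0]); // the single-quoted part
console.log(mismatched);      // false
```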
Assertions
(?:exp): a group defined this way still matches the content of the group, but no group number is assigned to it, so the group no longer plays its role in replace, match, and other functions. The effect is as follows:
/(hello)\sworld/.exec("asdadasd hello world asdasd"); // ["hello world", "hello"], the matched string and the group string are both captured normally
/(?:hello)\sworld/.exec("asdadasd hello world asdasd"); // ["hello world"]
"/static/app1/js/index.js".replace(/(\/\w+)\.js/, "$1-v0.0.1.js"); // "/static/app1/js/index-v0.0.1.js"
"/static/app1/js/index.js".replace(/(?:\/\w+)\.js/, "$1-v0.0.1.js"); // "/static/app1/js$1-v0.0.1.js"
(?=exp): placed after an expression, this group requires that the matched text be followed by exp. The content of the group is not included in the result, and no group number is assigned.
/hello\s(?=world)/.exec("asdadasd hello world asdasd") // ["hello "]
(?!exp): the opposite of the previous assertion. Placed after an expression, it requires that the matched text not be followed by exp. It likewise does not capture the group content or assign a group number.
/hello\s(?!world)/.exec("asdadasd hello world asdasd") //null
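The two assertions above can be combined in practical patterns. The sketch below uses assumed sample values: it picks out numbers followed by px and inserts thousands separators into a digit string.

```javascript
// (?=px): match digits only when "px" follows; "px" itself is not returned.
var pxValues = "100px 20em".match(/\d+(?=px)/g);

// Insert a comma at every position that is followed by one or more groups of
// three digits and then no further digit. \B keeps the start of the string
// from receiving a comma.
var withSeparators = "1234567".replace(/\B(?=(\d{3})+(?!\d))/g, ",");
var shortNumber = "123".replace(/\B(?=(\d{3})+(?!\d))/g, ",");

console.log(pxValues);       // only the number in front of px
console.log(withSeparators); // commas between groups of three digits
console.log(shortNumber);    // unchanged
```

Since assertions consume no characters, the replacement only inserts commas and never removes a digit.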
Processing options
This article covers three processing options of regular expressions in JavaScript: g, i, and m, which stand for global matching, case-insensitive matching, and multiline mode. The three options can be freely combined.
// Global match: g
"abchelloasdasdhelloasd".match(/hello/); // ["hello"]
"abchelloasdasdhelloasd".match(/hello/g); // ["hello", "hello"]
// Case-insensitive: i
"abchelloasdasdHelloasd".match(/hello/g); // ["hello"]
"abchelloasdasdHelloasd".match(/hello/gi); // ["hello", "Hello"]
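As a brief sketch of combining options, the following sets g and i together in both the literal and the constructor form and reads them back through the RegExp properties global, ignoreCase, and multiline. The sample string and variable names are assumptions.

```javascript
// Literal form: the options follow the closing slash.
var flaggedReg = /hello/gi;
var flagState = [flaggedReg.global, flaggedReg.ignoreCase, flaggedReg.multiline];

// Constructor form: the options are passed as the second argument.
var builtReg = new RegExp("hello", "gi");
var allCases = "Hello hello HELLO".match(builtReg);

// With i but without g, only the first match is returned.
var firstOnly = "Hello hello HELLO".match(/hello/i);

console.log(flagState);    // g and i are set, m is not
console.log(allCases);     // all three spellings
console.log(firstOnly[0]); // the first spelling only
```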
In the default mode, the metacharacters ^ and $ match the start and end of the string respectively. The m option changes the definition of these metacharacters so that they match the beginning and end of each line.
"aadasd\nbasdc".match(/^[a-z]+$/g); // null: there is a line break between ^ and $, which [a-z]+ cannot match, so null is returned
"aadasd\nbasdc".match(/^[a-z]+$/gm); // ["aadasd", "basdc"]: m changes the meaning of ^ and $ to match the beginning and end of each line, so the results for both lines are returned
Summary
The above describes the use and basic syntax of regular expressions in JavaScript. I hope it is helpful to you.