JavaScript regular expressions
Before we start on regular expressions, here is a question to think about; see what solutions you can come up with.
Problem: calculate the length of a string, counting each double-byte character as 2 and each ASCII character as 1.
var str = "汉字a bc;";
Now, down to business. Let's start with regular expressions in JavaScript.
Below is a short list of the basics of regular expressions in JavaScript:
<1> i, m, g
<2> ^, $, \w, \d, \s, \b, .
<3> *, +, ?, {n}, {n,}, {n,m}
<4> test, exec, match, search, replace, split
///////////////////////////////////////////////////
<1>
i: case-insensitive matching
m: multiline matching
g: global matching
<2>
^: matches the start of the string
$: matches the end of the string
\w: matches a letter, digit, or underscore
\d: matches a digit
\s: matches a whitespace character
\b: matches a word boundary (the start or end of a word)
.: matches any character except a line break
<3>
*: matches the preceding character 0 or more times
+: matches the preceding character 1 or more times
?: matches the preceding character 0 or 1 time
{n}: matches the preceding character exactly n times
{n,}: matches the preceding character at least n times
{n,m}: matches the preceding character at least n and at most m times
<4>
test: reg.test(str); // returns true or false
exec: reg.exec(str); // returns the first match as an array, or null if there is none
match: str.match(reg); // returns an array, or null if there is no match
search: str.search(reg); // returns a number (the index of the first match, or -1)
replace: str.replace(reg, replacement); // returns a string
split: str.split(reg); // returns an array
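To make the three flags from <1> concrete, here is a minimal sketch. The sample text and variable names are assumptions chosen for illustration only.

```javascript
// Flags are written after the closing slash of a regex literal.
const text = "Cat\ncat\nCAT";

// No flags: case-sensitive, and only the first match is returned.
const first = text.match(/cat/);

// i ignores case; g finds every match instead of stopping at the first.
const all = text.match(/cat/gi);

// m makes ^ and $ match at the start and end of each line as well.
const lineStarts = text.match(/^c/gim);
const withoutM = text.match(/^c/gi);

console.log(first.index);  // 4
console.log(all);          // ["Cat", "cat", "CAT"]
console.log(lineStarts);   // ["C", "c", "C"]
console.log(withoutM);     // ["C"]
```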
For ease of understanding, we first explain the matching patterns one by one, and then cover the 6 methods.
I. Matching patterns in detail
1. Simple pattern (matches only a literal sequence of ordinary characters)
var reg = /ab0c/;
2. The symbol ^
Matches the start of a string
3. The symbol $
Matches the end of a string
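A small sketch of how ^ and $ anchor a pattern. The pattern names are made up for this example.

```javascript
const startsWithAb = /^ab/;  // "ab" must be at the very start
const endsWithAb = /ab$/;    // "ab" must be at the very end
const exactlyAb = /^ab$/;    // the whole string must be "ab"

console.log(startsWithAb.test("abc"));  // true
console.log(startsWithAb.test("cab"));  // false
console.log(endsWithAb.test("cab"));    // true
console.log(exactlyAb.test("ab"));      // true
console.log(exactlyAb.test("abab"));    // false
```

Using both anchors together is a common way to validate a whole string rather than search inside it.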
4. The symbol *
Matches the preceding character 0 or more times
5. The symbol +
Matches the preceding character 1 or more times
6. The symbol ?
In general, ? matches the preceding character 0 or 1 time.
However, it has two special uses:
When it directly follows *, +, ?, or {}, it makes that quantifier non-greedy, so the match uses as few repetitions as possible.
It also introduces the special groups (?:x), x(?=y), and x(?!y) described in items 9 to 11.
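The difference between a normal (greedy) quantifier and one followed by ? is easiest to see side by side. This is only a sketch; the sample string is an assumption.

```javascript
const html = "<b>one</b><b>two</b>";

// Greedy: .* takes as much as possible, so the match runs to the last </b>.
const greedy = html.match(/<b>.*<\/b>/)[0];

// Non-greedy: .*? takes as little as possible, so the match stops at the first </b>.
const lazy = html.match(/<b>.*?<\/b>/)[0];

// The same idea applies to {n,m}: with ? it settles for the minimum count.
const fewest = "aaaa".match(/a{2,3}?/)[0];

console.log(greedy);  // "<b>one</b><b>two</b>"
console.log(lazy);    // "<b>one</b>"
console.log(fewest);  // "aa"
```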
7. The symbol .
Matches any single character except a line break
8. The symbol (x): matches x and remembers the match (a capturing group).
var regx = /a(b)c/; var rs = regx.exec("abcddabcd"); console.log(rs);
As shown above, /a(b)c/ matches "abc" in "abcddabcd". Because of the (), "b" is recorded as well,
so rs contains: ["abc", "b"]
9. The symbol (?:x): matches x but does not remember the match (a non-capturing group).
var regx = /a(?:b)c/; var rs = regx.exec("abcddabcd"); console.log(rs);
As shown above, /a(?:b)c/ matches "abc" in "abcddabcd". Because of the (?:), "b" is not recorded,
so rs contains: ["abc"]
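As a further illustration of remembered and unremembered groups, the sketch below pulls the parts out of a date-like string. The input string and variable names are assumptions.

```javascript
// Two capturing groups: both parts are remembered after the full match.
const capturing = /(\d{4})-(\d{2})/.exec("date: 2024-05");

// The first group is non-capturing, so only the month is remembered.
const nonCapturing = /(?:\d{4})-(\d{2})/.exec("date: 2024-05");

console.log(capturing[0], capturing[1], capturing[2]);  // "2024-05" "2024" "05"
console.log(capturing.index);                            // 6
console.log(nonCapturing[0], nonCapturing[1]);           // "2024-05" "05"
console.log(nonCapturing.length);                        // 2
```

Element 0 is always the full match; remembered groups follow from element 1 in the order of their opening parentheses.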
10. x(?=y): matches x only when it is followed by y. x is remembered; y is not.
var regx = /user(?=name)/; var rs = regx.exec("The username is Mary");
Result: the match succeeds, and rs is ["user"]
11. x(?!y): matches x only when it is not followed by y. x is remembered; y is not.
var regx = /user(?!name)/; var rs = regx.exec("The user name is Mary");
Result: the match succeeds, and rs is ["user"]
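The two lookahead forms can be compared on the same inputs. A minimal sketch, with sample strings assumed for illustration:

```javascript
const followedByName = /user(?=name)/;
const notFollowedByName = /user(?!name)/;

console.log(followedByName.test("username"));      // true
console.log(followedByName.test("user name"));     // false: a space follows "user"
console.log(notFollowedByName.test("username"));   // false
console.log(notFollowedByName.test("user name"));  // true

// The looked-ahead text is checked but not consumed, so replace() leaves it alone.
const renamed = "username".replace(followedByName, "nick");
console.log(renamed);  // "nickname"
```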
12. x|y: matches x or y. If both occur in the string, exec() returns only the one that appears first.
var regx = /beijing|shanghai/; var rs = regx.exec("I love beijing and shanghai");
Result: the match succeeds, and rs is ["beijing"]. "shanghai" would also match, but exec() stops at the first match, so it is not recorded.
13. The symbol {}
{n}: matches the preceding character exactly n times
{n,}: matches the preceding character at least n times
{n,m}: matches the preceding character at least n times and at most m times
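The three brace forms can be checked with a few digit strings. This sketch adds ^ and $ so that the count applies to the whole string; the pattern names are assumptions.

```javascript
const exactlyThree = /^\d{3}$/;   // {n}: exactly 3 digits
const atLeastTwo = /^\d{2,}$/;    // {n,}: 2 or more digits
const twoToFour = /^\d{2,4}$/;    // {n,m}: between 2 and 4 digits

console.log(exactlyThree.test("123"));   // true
console.log(exactlyThree.test("1234"));  // false
console.log(atLeastTwo.test("1"));       // false
console.log(atLeastTwo.test("12345"));   // true
console.log(twoToFour.test("123"));      // true
console.log(twoToFour.test("12345"));    // false
```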
14. The symbol [xyz], where xyz stands for a set of characters
Matches any one character listed inside the []. [xyz] is equivalent to [x-z].
15. The symbol [^xyz], where xyz stands for a set of characters
Matches any one character not listed inside the []. [^xyz] is equivalent to [^x-z].
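A short sketch of a character set and its negation, using a sample string assumed for illustration:

```javascript
const word = "x-ray zone";

// [xyz] and [x-z] pick out the same characters.
const listed = word.match(/[xyz]/g);
const ranged = word.match(/[x-z]/g);

// [^xyz] picks out everything else, including the hyphen and the space.
const others = word.match(/[^xyz]/g).join("");

console.log(listed);  // ["x", "y", "z"]
console.log(ranged);  // ["x", "y", "z"]
console.log(others);  // "-ra one"
```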
16. The symbol \d matches a digit character; it is equivalent to [0-9].
var regx = /user\d/; var rs = regx.exec("user145"); console.log(rs);
Result: the match succeeds, and rs is ["user1", index: 0, input: "user145"].
var regx = /user[0-9]+/; var rs = regx.exec("user145"); console.log(rs);
Result: the match succeeds, and rs is ["user145", index: 0, input: "user145"].
Note the difference between the two: \d matches exactly one digit, while [0-9]+ matches one or more digits.
17. The symbol \r matches a carriage return.
18. The symbol \s matches a whitespace character (space, tab, line break, and so on).
19. The symbol \w matches a letter, a digit, or _; it is equivalent to [A-Za-z0-9_].
20. The symbol \xhh matches the character given by a two-digit hexadecimal number.
21. The symbol \uhhhh matches the character given by a four-digit hexadecimal number.
22. The symbol [\u4E00-\u9FA5] matches Chinese characters.
23. The symbol [^\x00-\xff] matches double-byte characters (any character outside the range \x00 to \xff).
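Items 19, 22, and 23 overlap in subtle ways, so a side-by-side sketch may help. The sample string is an assumption; it contains two Chinese characters, a full-width comma, and some ASCII text.

```javascript
const sample = "正则，ok_1";

// Only the two Chinese characters fall inside \u4E00-\u9FA5.
const chinese = sample.match(/[\u4E00-\u9FA5]/g);

// The full-width comma is outside \x00-\xff too, so it is also counted here.
const doubleByte = sample.match(/[^\x00-\xff]/g);

// \w only covers ASCII letters, digits, and the underscore.
const wordChars = sample.match(/\w+/)[0];

console.log(chinese);     // ["正", "则"]
console.log(doubleByte);  // ["正", "则", "，"]
console.log(wordChars);   // "ok_1"
```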
II. Matching methods
<1> The test() method:
reg.test(str); // returns true or false
var str = "12abc678abc2hh";
var reg = /abc/;
console.log(reg.test(str)); // true
reg = /rgy/;
console.log(reg.test(str)); // false
////
<2> The exec() method:
reg.exec(str); // returns the first match as an array
var str = "12abc678abc2hh";
var reg = /abc/;
console.log(reg.exec(str)); // ["abc", index: 2, input: "12abc678abc2hh"]
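exec() returns only one match per call. With the g flag, the regular expression remembers where it stopped (in its lastIndex property), so calling exec() in a loop walks through all matches. A minimal sketch with assumed variable names:

```javascript
const input = "12abc678abc2hh";
const pattern = /abc/g;
const positions = [];

let found;
// Each call continues from pattern.lastIndex; exec() returns null when no match is left.
while ((found = pattern.exec(input)) !== null) {
  positions.push(found.index);
}

console.log(positions);          // [2, 8]
console.log(pattern.lastIndex);  // 0, reset after the failed final call
```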
////
<3> The match() method:
str.match(reg); // returns an array
Without the g flag, the result describes the first match only: [matched string, index, input].
With the g flag, an array of all the matched strings is returned. See the following example.
var str = "12abc678abc2hh";
var reg = /abc/;
console.log(str.match(reg)); // ["abc", index: 2, input: "12abc678abc2hh"]
reg = /abc/g; // global match
console.log(str.match(reg)); // ["abc", "abc"]
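One detail worth keeping in mind is that match() returns null, not an empty array, when nothing matches. The sketch below shows one common way to guard against that when counting matches; the helper name countMatches is hypothetical.

```javascript
// Hypothetical helper: counts how many times a global pattern matches.
function countMatches(text, globalPattern) {
  // match() returns null on no match, so fall back to an empty array.
  return (text.match(globalPattern) || []).length;
}

console.log("12abc".match(/xyz/g));                   // null
console.log(countMatches("12abc678abc2hh", /abc/g));  // 2
console.log(countMatches("12abc678abc2hh", /xyz/g));  // 0
```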
////
<4> The search() method:
str.search(reg); // returns a number
var str = "12abc678abc2hh";
var reg = /abc/;
console.log(str.search(reg)); // 2
reg = /abc/g;
console.log(str.search(reg)); // 2
As you can see, search() only finds the first match and returns its index; the g flag makes no difference.
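When there is no match at all, search() returns -1, much like indexOf(). A brief sketch with assumed variable names:

```javascript
const source = "12abc678abc2hh";

const firstLetter = source.search(/[a-z]/);  // index of the first lowercase letter
const missing = source.search(/xyz/);        // no match

console.log(firstLetter);  // 2
console.log(missing);      // -1
```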
////
<5> The replace() method:
str.replace(reg, replacement); // returns a string
var str = "12abc678abc2hh";
var reg = /abc/g;
console.log(str.replace(reg, "ABC")); // 12ABC678ABC2hh
console.log(str.replace(/(\d+)([a-z]+)/g, "$2[$1]")); // abc[12]abc[678]hh[2]
The second call is a more advanced usage and is worth understanding well.
(\d+) and ([a-z]+) are two capturing groups, which correspond to $1 and $2 respectively. The rest is easy to follow.
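Besides a replacement string with $1 and $2, replace() also accepts a function, which receives the full match followed by the captured groups. The following sketch assumes the same sample string and uppercases the letters while swapping the groups.

```javascript
const original = "12abc678abc2hh";

// The callback receives the full match, then each captured group in order.
const swapped = original.replace(/(\d+)([a-z]+)/g, function (match, digits, letters) {
  return letters.toUpperCase() + "[" + digits + "]";
});

// Without the g flag, only the first match is replaced.
const firstOnly = original.replace(/abc/, "ABC");

console.log(swapped);    // "ABC[12]ABC[678]HH[2]"
console.log(firstOnly);  // "12ABC678abc2hh"
```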
////
<6> The split() method:
str.split(reg); // returns an array of the pieces between the delimiters
var str = "12abc678abc2hh";
console.log(str.split(/[a-z]+/)); // ["12", "678", "2", ""]
This shows how a string is split into an array using a regular expression as the delimiter, and also how powerful regular expressions are.
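Two follow-ups to the split() example, offered as a sketch: the trailing empty string can be filtered out, and wrapping the delimiter in a capturing group keeps the delimiters in the result. The variable names are assumptions.

```javascript
const raw = "12abc678abc2hh";

// The string ends with a delimiter, so the last piece is ""; filter(Boolean) drops empty pieces.
const numbers = raw.split(/[a-z]+/).filter(Boolean);

// With a capturing group, the matched delimiters are included in the array.
const withDelimiters = raw.split(/([a-z]+)/);

console.log(numbers);         // ["12", "678", "2"]
console.log(withDelimiters);  // ["12", "abc", "678", "abc", "2", "hh", ""]
```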
/////////////////////////////////////////////////////////////////////////
By now you probably have some ideas about the problem posed at the beginning. Here is one solution using a regular expression.
// Replace every double-byte character with two ASCII characters, then take the length.
String.prototype.len = function () {
  return this.replace(/[^\x00-\xff]/g, "aa").length;
};
var str = "汉字a bc;";
console.log(str.len()); // 9
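If you would rather not add a method to String.prototype, the same idea can be written as a standalone function. This is only a sketch of an alternative, not part of the solution above; the function name displayLength and the sample strings are assumptions.

```javascript
// Hypothetical helper: counts double-byte characters as 2 and all others as 1.
function displayLength(text) {
  // match() returns null when there are no double-byte characters.
  const wide = text.match(/[^\x00-\xff]/g) || [];
  // Every character counts once; each double-byte character counts once more.
  return text.length + wide.length;
}

console.log(displayLength("汉字a bc;"));  // 9
console.log(displayLength("abc"));        // 3
console.log(displayLength(""));           // 0
```

Counting the matches instead of replacing them avoids building a second string, but both approaches rest on the same [^\x00-\xff] pattern.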