Examining JS regular expression performance through a trim prototype function

Source: Internet
Author: User
Tags net regex expression engine

Generally, trim is implemented with a regular expression as follows:
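
The original snippet is not preserved in this copy. Judging from the variants A, B, and C discussed later, it was probably close to the following sketch. The pattern is inferred from those variants, not quoted from the source, and the name `trimRegex` is hypothetical (a standalone function is used so the built-in `String.prototype.trim` is not overwritten).

```javascript
// Assumed baseline, inferred from the later variants (not the author's verbatim code).
// Note: \t is redundant inside [\s\t] because \s already includes the tab character.
function trimRegex(str) {
    // Strip whitespace anchored at the start, or whitespace anchored at the end.
    return str.replace(/^[\s\t]+|[\s\t]+$/g, '');
}

console.log(JSON.stringify(trimRegex(' \t hi \n'))); // "hi"
```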

If you run this on a very long string of variable length, you will find that it consumes a lot of resources. Efficiency is poor, and sometimes intolerably so.
[Demo page omitted: a text area in which to enter a long run of space or tab characters.]
To explain the cause, I recalled the description in the book Mastering Regular Expressions: NFA and DFA engines work differently. JavaScript, Perl, PHP, Java, and .NET all use NFA engines.
The difference between the DFA and NFA mechanisms has five consequences:
1. A DFA scans each character of the text only once, so it is fast but supports fewer features. An NFA repeatedly consumes characters and gives them back, so it is slower but richer in features, which is why it is so widely used. Most major regular expression engines today, such as Perl, Ruby, Python's re module, Java, and the .NET regex library, are NFA engines.
2. Only NFA engines support features such as lazy quantifiers and backreferences.
3. An NFA is eager to report a match: the leftmost alternative that succeeds is taken first, so the best match is occasionally missed. A DFA follows the rule "the longest leftmost match wins".
4. NFA quantifiers are greedy by default (for example /.*/ and /\w+/: the pattern repeats as many times as possible and matches as many characters as it can until no more can be consumed), and the NFA matches quantifiers first.
5. An NFA can fall into the trap of deep recursion (heavy backtracking), which results in poor performance.
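
Points 2 to 4 can be observed directly in JavaScript. The following is a small illustrative sketch; the sample strings and variable names are made up for the demonstration.

```javascript
// Point 3: the leftmost alternative that succeeds wins, even if a longer one exists.
const leftmost = 'foobar'.match(/foo|foobar/)[0];   // 'foo'

// Point 4: quantifiers are greedy by default and take as much as they can.
const greedy = '<a><b>'.match(/<.*>/)[0];           // '<a><b>'

// Point 2: a lazy quantifier stops at the first position that allows a match.
const lazy = '<a><b>'.match(/<.*?>/)[0];            // '<a>'

// Point 2: a backreference (\1) must repeat whatever group 1 captured.
const hasDoubledChar = /(\w)\1/.test('hello');      // true ('ll')

console.log(leftmost, greedy, lazy, hasDoubledChar);
```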

Backtracking
When an NFA finds that it has consumed too much, it gives characters back one at a time and retries the match. This process is called backtracking. Because of it, an NFA scans the text repeatedly, especially when running a poorly written regular expression, and the efficiency loss is considerable. Understanding this helps in writing efficient regular expressions.
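
To make the cost concrete, here is a simplified simulation of how a naive backtracking matcher would try /\s+$/ at every start position, counting each character test and each check of the $ anchor. It is a model under stated assumptions, not the algorithm of any real engine (real engines apply optimizations), and the function name `countBacktrackingSteps` is hypothetical.

```javascript
// Simplified model of a backtracking engine running /\s+$/ (illustration only).
function countBacktrackingSteps(subject) {
    const isSpace = (ch) => /\s/.test(ch);
    let steps = 0;
    for (let start = 0; start < subject.length; start++) {
        // Greedy \s+ : consume as many whitespace characters as possible.
        let end = start;
        while (end < subject.length) {
            steps++;                                   // one character test
            if (!isSpace(subject[end])) break;
            end++;
        }
        // Backtrack: give characters back one at a time, testing $ each time.
        for (let stop = end; stop > start; stop--) {
            steps++;                                   // one test of the $ anchor
            if (stop === subject.length) return steps; // overall match found
        }
    }
    return steps; // no match anywhere
}

// A whitespace run that is NOT at the end of the string costs (n + 1)^2 steps in this model.
console.log(countBacktrackingSteps(' '.repeat(10) + 'x'));  // 121
console.log(countBacktrackingSteps(' '.repeat(100) + 'x')); // 10201
// A trailing run is found almost immediately.
console.log(countBacktrackingSteps('x' + ' '.repeat(3)));   // 5
```

In this model, a tenfold increase in the length of the whitespace run multiplies the work by roughly one hundred, which is the kind of growth the text describes.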

Locate/analyze the cause
Consider the trim prototype method above. Setting aside for now whether the results are still correct, testing shows several changes that reduce the amount of backtracking in the JS NFA engine.
A. Remove the quantifier from the trailing match. Change it to:
The code is as follows:
String.prototype.trim = function () {
    return this.replace(/^[\s\t]+|[\s\t]$/g, '');
};

B. Remove the match at the end of the string. Change it to:
The code is as follows:
String.prototype.trim = function () {
    return this.replace(/^[\s\t]+/g, '');
};

C. Add multi-line matching. Change it to:
The code is as follows:
String.prototype.trim = function () {
    return this.replace(/^[\s\t]+|[\s\t]+$/mg, '');
};
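
Since the text sets correctness aside, it may help to see what each variant actually returns. The sketch below rewrites the three variants as standalone functions; the names `trimA`, `trimB`, and `trimC` are hypothetical and are used only so the built-in `String.prototype.trim` is not overwritten.

```javascript
// Variant A: no quantifier on the trailing part, so only ONE trailing character is removed.
function trimA(str) {
    return str.replace(/^[\s\t]+|[\s\t]$/g, '');
}

// Variant B: no trailing match at all, so trailing whitespace is kept.
function trimB(str) {
    return str.replace(/^[\s\t]+/g, '');
}

// Variant C: with the m flag, ^ and $ match at every line boundary,
// so every line is trimmed, not just the string as a whole.
function trimC(str) {
    return str.replace(/^[\s\t]+|[\s\t]+$/mg, '');
}

console.log(JSON.stringify(trimA('  a  ')));        // "a "
console.log(JSON.stringify(trimB('  a  ')));        // "a  "
console.log(JSON.stringify(trimC('  a  \n  b  '))); // "a\nb"
```

None of the three is a drop-in replacement for trim: they are experiments for locating the slow part of the pattern.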

Combining the three experiments above with the NFA background at the beginning of the article, we can roughly identify the possible causes of the trim performance problem:
1. The quantifiers are matched first (greedily).
2. With the end-of-string anchor in place, the JS regular expression engine may keep backtracking and fall into the recursion trap. The recursion becomes too deep, and with a larger string a stack overflow may occur.
Multi-line matching works and its cost is small, so it shows no performance problem. Yet from the point of view of whoever wrote the regular expression engine, multi-line mode has far more whitespace runs to replace than single-line mode. So the second conclusion should be the correct one.
Improvement
First, we have established that the regular expression matching the start of the string has no efficiency problem; the performance problem appears when matching the end. We can therefore combine a regular expression with traditional string handling to improve the performance of trim.
For example:
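
The example from the original page is not preserved in this copy. A minimal sketch of the idea described above, assuming a regular expression for the leading whitespace and a plain backward loop for the trailing whitespace, might look like this. The function name `trimHybrid` is hypothetical, and this is one possible reading of the approach, not the author's code.

```javascript
// Hybrid trim (illustrative sketch): regex for the start, plain loop for the end.
function trimHybrid(str) {
    // Leading whitespace: the ^ anchor lets the regex fail quickly everywhere but position 0.
    const withoutLeading = str.replace(/^\s+/, '');

    // Trailing whitespace: walk backward from the end, one character test per step,
    // so the work is linear in the length of the trailing run and never backtracks.
    const whitespace = /\s/; // no g flag, so test() keeps no state between calls
    let end = withoutLeading.length;
    while (end > 0 && whitespace.test(withoutLeading.charAt(end - 1))) {
        end--;
    }
    return withoutLeading.slice(0, end);
}

console.log(JSON.stringify(trimHybrid('  a b \t\n'))); // "a b"
```

The backward loop only ever looks at the trailing run, so long whitespace runs in the middle of the string are never rescanned.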
