How regular expressions work

Source: Internet
Author: User
Tags: expression, engine

Poorly written regular expressions can be a major cause of performance bottlenecks, but there are many ways to improve their efficiency. Two regular expressions that match the same text do not necessarily run at the same speed.
Many factors affect the efficiency of a regular expression. First, the text it is applied to makes a big difference, because a regular expression spends more time on partial matches than on obvious non-matches. Each browser's regular expression engine also has different internal optimizations.
To use regular expressions effectively, it is important to understand how they work. The following are the basic steps involved in processing a regular expression:
Step 1: Compilation
When you create a regular expression object (using a regular expression literal or the RegExp constructor), the browser checks your pattern for errors and then converts it into a native code routine that performs the actual matching. If you assign the regular expression to a variable, you can avoid repeating this step for the same pattern.
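As a minimal sketch of the advice above, the following compares a pattern stored once in a variable with a pattern rebuilt on every loop iteration. The function names and the sample pattern are hypothetical, chosen only for illustration. Engines may cache compiled patterns internally, so the real-world difference can be small and is worth measuring rather than assuming.

```javascript
// Created once and reused: the pattern is checked and compiled a single time.
const DIGITS = /\d+/;

function countLinesWithDigits(lines) {
  let count = 0;
  for (const line of lines) {
    // test() on a non-global regex always starts at index 0,
    // so sharing one regex object across calls is safe here.
    if (DIGITS.test(line)) {
      count++;
    }
  }
  return count;
}

// For contrast: calling the RegExp constructor inside the loop asks the
// engine to process the pattern string again on every iteration.
function countLinesWithDigitsSlow(lines) {
  let count = 0;
  for (const line of lines) {
    if (new RegExp("\\d+").test(line)) {
      count++;
    }
  }
  return count;
}

console.log(countLinesWithDigits(["abc", "a1", "22"])); // 2
```

Both functions return the same result; only the amount of repeated setup work differs.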
Step 2: Setting the start position
When a regular expression is put to use, the first step is to determine the position in the target string where the search should start. Initially this is the start of the string or the position specified by the regular expression's lastIndex property. When the engine returns here from step 4 (because a match attempt failed), the position is one character after the position where the last attempt started.
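The role of lastIndex can be seen in a short example. This is a sketch with a made-up sample string; it relies on the standard behavior that exec() honors lastIndex only when the regular expression has the g (or y) flag.

```javascript
const text = "cat hat bat";
const re = /[a-z]at/g; // the g flag makes exec() start at lastIndex

re.lastIndex = 4; // begin the search at index 4, skipping "cat"
const first = re.exec(text);
console.log(first[0], first.index); // hat 4
console.log(re.lastIndex);          // 7

// The next call resumes from lastIndex (7) and finds the following match.
const second = re.exec(text);
console.log(second[0], second.index); // bat 8

// Without the g flag, lastIndex is ignored and the search starts at index 0.
const plain = /[a-z]at/;
plain.lastIndex = 4;
const third = plain.exec(text);
console.log(third[0]); // cat
```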
Browser makers build optimizations into their regular expression engines that avoid unnecessary work at this stage by deciding early that certain steps can be skipped. For example, if a regular expression starts with "^", IE and Chrome can usually determine that no match can be found after the start of the string, and so avoid pointlessly searching subsequent positions. Another example: if every possible match has "x" as its third character, a smart implementation can first search for the next "x" and then set the start position two characters back from where it was found.
Step 3: Matching each token of the regular expression
Once the regular expression knows where to start, it steps through the target text and the pattern one token at a time. When a particular token fails to match, the regular expression tries to backtrack to an earlier point in the match attempt and follow other possible paths through the pattern.
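Backtracking is not directly observable from script, but its effect shows up in which match is returned. The sketch below uses made-up sample strings; the comments describe the behavior of a typical backtracking engine, which is an assumption about engine internals rather than something the code itself can prove.

```javascript
// Greedy ".*" first consumes the rest of the string, then gives characters
// back one at a time until "b" can match. That giving back is backtracking.
const greedy = /a.*b/.exec("a1b2b3");
console.log(greedy[0]); // a1b2b

// The lazy ".*?" instead extends one character at a time, so it stops at
// the first "b" it reaches.
const lazy = /a.*?b/.exec("a1b2b3");
console.log(lazy[0]); // a1b

// Alternation: after "h" matches, "ello" is tried first. When that path
// fails, the engine returns to the point just after "h" and tries "appy".
const alt = /h(ello|appy) hippo/.exec("hello there, happy hippo");
console.log(alt[0], alt.index); // happy hippo 13
```

In the last case the attempt starting at index 0 matches "hello" but then fails on " hippo", so the engine has to discard that partial match and keep trying later positions.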
Step 4: Success or failure
If a complete match is found at the current position in the string, the regular expression declares success. If all possible paths through the pattern have been tried without finding a match, the regular expression engine goes back to step 2 and tries again from the next character in the string. Only after every character in the string (and the position after the last character) has gone through this process without a successful match does the regular expression declare overall failure.
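Steps 2 through 4 can be sketched as a toy matcher. This is not how any browser engine is implemented; it is a deliberately simplified illustration that supports only literal characters, "." and a trailing "*" on a single character. The names matchHere and search are hypothetical.

```javascript
// Step 3: match pattern[pi..] against text starting at index ti.
// Returns the end index of the match, or -1 if every path fails.
function matchHere(pattern, pi, text, ti) {
  if (pi === pattern.length) return ti; // whole pattern consumed: success
  const c = pattern[pi];
  const fits = (i) => i < text.length && (c === "." || c === text[i]);
  if (pattern[pi + 1] === "*") {
    // Greedy: consume as many characters as possible, then back off one
    // at a time (backtracking) until the rest of the pattern matches.
    let end = ti;
    while (fits(end)) end++;
    for (; end >= ti; end--) {
      const result = matchHere(pattern, pi + 2, text, end);
      if (result !== -1) return result;
    }
    return -1;
  }
  return fits(ti) ? matchHere(pattern, pi + 1, text, ti + 1) : -1;
}

// Steps 2 and 4: try each start position in turn, including the position
// after the last character, and fail only when all of them have failed.
function search(pattern, text) {
  for (let start = 0; start <= text.length; start++) {
    const end = matchHere(pattern, 0, text, start);
    if (end !== -1) return { index: start, match: text.slice(start, end) };
  }
  return null; // every position was tried: overall failure
}

console.log(search("a.*b", "xxa1b2b3")); // { index: 2, match: 'a1b2b' }
console.log(search("z*", "abc"));        // { index: 0, match: '' }
console.log(search("ab", "ba"));         // null
```

The second call shows why the position after the last character matters in general: a pattern that can match the empty string succeeds even where no character is consumed.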
Keeping this process in mind will help you make informed decisions about the kinds of issues that affect regular expression performance.
