Some patterns are more efficient than others. For example, a character class such as [aeiou] is more efficient than the equivalent set of alternatives (a|e|i|o|u). In general, the simplest construction that provides the required behaviour is usually the most efficient. Jeffrey Friedl's book (Mastering Regular Expressions) contains a lot of discussion about the performance of regular expressions.
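As a small illustration of the point above, the following sketch checks that a character class and the equivalent alternation find exactly the same matches, so the simpler form can be substituted safely. It uses Python's `re` module rather than PCRE, and the sample text is made up for the example; it demonstrates equivalence only and makes no timing claim.

```python
import re

# Made-up sample text for the illustration.
text = "Regular expressions reward simple constructions."

# Two ways of asking for "any single lowercase vowel".
char_class = re.compile(r"[aeiou]")
alternation = re.compile(r"(?:a|e|i|o|u)")

class_hits = char_class.findall(text)
alt_hits = alternation.findall(text)

# Both patterns report the same matches in the same order;
# only the amount of work done per character differs.
print(class_hits == alt_hits)
```

Since the results are identical, choosing the character class costs nothing in behaviour.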
When a pattern begins with .* and the PCRE_DOTALL option is set, PCRE implicitly anchors the pattern, because it can only match at the start of the subject string. However, if PCRE_DOTALL is not set, PCRE cannot perform this optimization, because the . metacharacter does not match a newline. If the subject string contains newlines, the pattern may match starting from the character immediately after a newline rather than from the very start. For example, the pattern (.*)second matches the subject "first\nand second" (where \n is a newline), and the first captured substring is "and ". To find this match, PCRE has to retry the pattern after every newline in the subject.
If you use such a pattern with subject strings that do not contain newlines, you get the best performance by setting PCRE_DOTALL, or by starting the pattern with ^.* to indicate explicit anchoring. This saves PCRE from having to scan along the subject looking for newlines at which to restart.
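The matching behaviour described in the last two paragraphs can be reproduced with a short sketch. It uses Python's `re` module, where `re.DOTALL` plays the role of PCRE_DOTALL; this is an assumption made for illustration, and it shows only which text is matched, not PCRE's internal anchoring optimization.

```python
import re

subject = "first\nand second"

# Without DOTALL, "." stops at the newline, so the only possible
# match begins immediately after it.
after_newline = re.search(r"(.*)second", subject).group(1)

# With DOTALL, ".*" can run across the newline from position 0.
from_start = re.search(r"(.*)second", subject, re.DOTALL).group(1)

# Anchored with ^ but without DOTALL: the match must begin at
# position 0 and cannot cross the newline, so nothing matches.
anchored = re.search(r"^(.*)second", subject)

print(repr(after_newline))  # 'and '
print(repr(from_start))     # 'first\nand '
print(anchored)             # None
```

The third case shows why explicit anchoring is only a safe optimization when the subject strings are known to contain no newlines: with a newline present, it changes the result rather than just the speed.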
Beware of patterns that contain nested indefinite repeats. They can take a long time to run when applied to a string that does not match. Consider the pattern fragment (a+)*.
This pattern can match "aaaa" in 33 different ways, and the number increases very rapidly as the string gets longer. (The * repeat can match 0, 1, 2, 3, or 4 times, and for each of those cases other than 0, the + repeats can match different numbers of times.) When the remainder of the pattern is such that the entire match is going to fail, PCRE has in principle to try every possible variation, and this can take an extremely long time.
An optimization catches some of the simpler cases, such as (a+)*b, where a literal character follows. Before starting the standard matching procedure, PCRE checks whether the subject string contains a "b", and if it does not, the match fails immediately. However, this optimization cannot be used when no literal character follows. You can see the difference by comparing the behaviour of (a+)*\d with the pattern above. Applied to a whole line of "a" characters, (a+)*b fails almost immediately, whereas (a+)*\d takes an appreciable time once the subject is longer than about 20 characters.
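To give a feel for the growth, here is a simplified model written in Python. It is not PCRE's implementation: `tail_attempts` and `fails_fast` are hypothetical names introduced for this sketch. The first counts how often a naive backtracker, starting at the beginning of a run of n "a" characters, would reach the token that follows (a+)* and find it failing. The second mimics the required-literal pre-check described above. Because the model is simplified, its counts are not expected to equal PCRE's internal figures.

```python
from functools import lru_cache


def tail_attempts(n: int) -> int:
    """Count how often a naive backtracker reaches the token after (a+)*
    when the subject is n 'a' characters and that token never matches."""

    @lru_cache(maxsize=None)
    def explore(pos: int) -> int:
        # Option 1: stop repeating here and try the following token (it fails).
        attempts = 1
        # Option 2: let one more a+ iteration consume k characters, then recurse.
        for k in range(1, n - pos + 1):
            attempts += explore(pos + k)
        return attempts

    return explore(0)


def fails_fast(subject: str, required: str = "b") -> bool:
    """Hypothetical pre-check: if the required literal is absent,
    no match is possible and the backtracking can be skipped."""
    return required not in subject


for n in (5, 10, 20):
    print(n, tail_attempts(n))  # 5 32, then 10 1024, then 20 1048576

print(fails_fast("a" * 20))  # True
```

In this model the count doubles with every extra character, which is why a subject of about 20 characters already means roughly a million failed attempts, while the literal check settles the question with a single scan of the subject.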