·Replace greed with laziness
One way to fix the problem described above is to make the plus lazy instead of greedy. You do this by putting a question mark after the plus: "+?". The same works for the other repetition operators: the star, the curly braces, and the question mark itself can each be followed by a question mark to make them lazy. Our example therefore becomes "<.+?>". Let's look at how the regular expression engine processes it.
Again, the regex token "<" matches the first "<" in the string. The next token is the dot, this time repeated by a lazy plus. This tells the engine to repeat the dot as few times as possible, and the minimum is one. So the engine matches the dot with the character "E" and then tries to match ">" against "M", which fails. The engine backtracks again, but this time differently from the previous example: because the repetition is lazy, backtracking expands it rather than reducing it, so the match of ".+?" grows from "E" to "EM". The engine then tries the next token, ">", again, and this time it succeeds. The last token of the regex has been matched, so the engine reports "<EM>" as a successful match. Overall, the process is much the same as before, except for the direction of the backtracking.
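The difference between the greedy and the lazy plus can be checked with a short Python sketch. The sample string is an assumption chosen for illustration, and Python's re module stands in for whichever regex engine you use.

```python
import re

# Assumed sample text, chosen only to illustrate the two behaviours.
text = "This is a <EM>first</EM> test"

# Greedy plus: the dot runs to the end of the line, then backtracks
# until the last ">" lets the overall match succeed.
greedy = re.search(r"<.+>", text).group(0)

# Lazy plus: the dot starts with one character and expands one
# character at a time until the next token ">" can match.
lazy = re.search(r"<.+?>", text).group(0)

print(greedy)  # <EM>first</EM>
print(lazy)    # <EM>
```

The lazy version stops at the first ">" it can reach, which is why it returns only the opening tag.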
·An alternative to laziness
In this case there is a better option than a lazy plus. We can use a greedy plus together with a negated character class: "<[^>]+>". It is better because of backtracking. With the lazy plus, the engine has to backtrack once for every character of the tag before it finds a successful match. With the negated character class, no backtracking is needed to match the tag.
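As a minimal sketch, the negated character class can be tried in Python as follows. The sample string is again an assumption made for illustration only.

```python
import re

# Assumed sample text, not taken from the article.
text = "This is a <EM>first</EM> test"

# "[^>]+" greedily consumes every character that is not ">",
# so it stops exactly at the closing ">" of each tag.
tags = re.findall(r"<[^>]+>", text)

# The lazy pattern finds the same tags on this input.
lazy_tags = re.findall(r"<.+?>", text)

print(tags)       # ['<EM>', '</EM>']
print(lazy_tags)  # ['<EM>', '</EM>']
```

Both patterns return the same matches here. The difference lies in how the engine gets there: the negated class can never run past a ">", so it has nothing to give back.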