PS: In all examples, the text matched by the regular expression is enclosed in [ and ] in the source text. Some examples use Java; where a regular expression is used in Java, this is noted at that point. All Java examples were tested under JDK 1.6.0_13.
I. Problem Introduction
First, let's look at an example. Some phrases, such as Windows 2000, consist of several words but are really a single unit. A non-breaking space (&nbsp;) keeps such a phrase on one line in the browser. Now suppose we want to match two or more of these spaces in a row:
Text: Your operation system is Windows&nbsp;&nbsp;2000.
Regular Expression: &nbsp;{2,}
Result: Your operation system is Windows&nbsp;&nbsp;2000.
Analysis: The pattern is meant to match two or more consecutive non-breaking spaces, but the result shows that nothing was matched. The reason is that &nbsp;{2,} only matches text that starts with &nbsp and is followed by two or more consecutive semicolons.
A repetition metacharacter applies only to the single character immediately before it. So what should we do if we want to repeat a whole string?
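The scope of the quantifier can be checked with a minimal Java sketch. The class name QuantifierScopeDemo, the helper firstMatch, and the second input string (with extra semicolons) are illustrative assumptions, not part of the original examples.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierScopeDemo {
    // Hypothetical helper: returns the first match of regex in text,
    // or null when nothing matches.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // {2,} applies only to the ';' directly before it, so two
        // consecutive &nbsp; entities are not matched.
        System.out.println(firstMatch("&nbsp;{2,}", "Windows&nbsp;&nbsp;2000")); // null

        // The same pattern does match &nbsp followed by several semicolons.
        System.out.println(firstMatch("&nbsp;{2,}", "Windows&nbsp;;;2000"));     // &nbsp;;;
    }
}
```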
II. Subexpressions
This brings us to subexpressions. A subexpression is a part of a larger expression; dividing an expression into subexpressions lets each one be treated as an independent element. A subexpression must be enclosed in ( and ). The regular expression in the preceding example should therefore be written as (&nbsp;){2,}.
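As a rough illustration in Java, the grouped pattern can be tried on text containing two consecutive entities. The class SubexpressionDemo and the helper mark are assumed names for this sketch; mark wraps each match in [ and ], following the notation used in this article.

```java
import java.util.regex.Pattern;

class SubexpressionDemo {
    // Hypothetical helper: wraps every match of regex in text with [ and ].
    // In a Java replacement string, $0 stands for the whole match.
    static String mark(String regex, String text) {
        return Pattern.compile(regex).matcher(text).replaceAll("[$0]");
    }

    public static void main(String[] args) {
        String text = "Your operation system is Windows&nbsp;&nbsp;2000.";
        // The parentheses make {2,} repeat the whole entity, not just ';'.
        System.out.println(mark("(&nbsp;){2,}", text));
        // Your operation system is Windows[&nbsp;&nbsp;]2000.
    }
}
```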
Let's look at a regular expression matching a valid year:
Text: 1988-11-13
Regular Expression: (19|20)\d{2}
Result: [1988]-11-13
Analysis: To exclude meaningless years, this example limits the first two digits of the year to 19 or 20. | is the regular expression OR operator. 19|20 must be placed in a subexpression, that is, (19|20). Otherwise the pattern 19|20\d{2} is read as "19" or "20\d{2}", so it matches either the two characters 19 on their own or a four-digit year starting with 20.
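The difference between the grouped and ungrouped forms can be seen in a short Java sketch. The class AlternationDemo, the helper mark, and the second date 2009-01-01 are assumptions made for illustration; note that a backslash in a Java string literal must be doubled.

```java
import java.util.regex.Pattern;

class AlternationDemo {
    // Hypothetical helper: wraps every match of regex in text with [ and ].
    static String mark(String regex, String text) {
        return Pattern.compile(regex).matcher(text).replaceAll("[$0]");
    }

    public static void main(String[] args) {
        // Grouped: 19 or 20, followed by two digits.
        System.out.println(mark("(19|20)\\d{2}", "1988-11-13")); // [1988]-11-13

        // Ungrouped: either "19" alone, or "20" followed by two digits.
        System.out.println(mark("19|20\\d{2}", "1988-11-13"));   // [19]88-11-13
        System.out.println(mark("19|20\\d{2}", "2009-01-01"));   // [2009]-01-01
    }
}
```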
III. Nested Subexpressions
Subexpressions can be nested, and nested to multiple levels. In theory there is no limit on the nesting depth.
In the expression (A)(B(C)), the following subexpressions exist:
0 (A)(B(C))
1 (A)
2 (B(C))
3 (C)
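There are four in total; number 0 always represents the entire expression, and the others are numbered by the position of their opening parenthesis, from left to right. In the section on backreferences, we will see how to refer to a subexpression with \n, where n is the number of the subexpression.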
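In Java, this numbering can be inspected through Matcher.group(n). The following is a minimal sketch in which A, B, and C are treated as literal letters and matched against the input "ABC"; the class GroupNumberingDemo and the helper groups are assumed names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberingDemo {
    // Hypothetical helper: returns group 0..n for a full match of regex
    // against text, or an empty array when the text does not match.
    static String[] groups(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        if (!m.matches()) {
            return new String[0];
        }
        // groupCount() does not include group 0, hence the + 1.
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] g = groups("(A)(B(C))", "ABC");
        for (int i = 0; i < g.length; i++) {
            System.out.println(i + " " + g[i]);
        }
        // 0 ABC
        // 1 A
        // 2 BC
        // 3 C
    }
}
```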