I. Problem
Use commas (,) to separate a numeric string into groups of three digits, working from right to left. For example, replace "123456789" with "123,456,789".
II. Analysis
In essence, the algorithm splits the string into groups of three characters, counting from right to left, and then inserts a comma between each pair of adjacent groups.
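To make the grouping step concrete, here is a minimal sketch that does it by hand, without the regular-expression assertions introduced later. It is only an illustration of the idea; the helper name `group_digits` is hypothetical and does not come from the original text.

```perl
use strict;
use warnings;

# Hypothetical helper: split a digit string into groups of three,
# counting from the right, and join the groups with commas.
sub group_digits {
    my ($digits) = @_;
    my @groups;
    # Peel off three characters at a time from the right end.
    while (length($digits) > 3) {
        unshift @groups, substr($digits, -3);
        $digits = substr($digits, 0, -3);
    }
    unshift @groups, $digits;    # the leftmost group holds whatever is left
    return join(',', @groups);
}

print group_digits("123456789"), "\n";   # 123,456,789
print group_digits("1234"), "\n";        # 1,234
```

The rest of the article reaches the same result with a single substitution by matching the positions between groups instead of the groups themselves.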
III. Syntax
Regular expressions provide lookahead and lookbehind assertions (rendered in some translations as "prediction" and "review"). They do not match text; they match a position. A lookahead inspects the text to the right of the position (scanning left to right), while a lookbehind inspects the text to the left of the position (scanning right to left).
(In the following table, the parenthesized constructs match only a position; they do not consume any text.)
Regular Expression | Name | Description | Example |
(?=...) | Lookahead | Asserts that the subexpression matches immediately to the right of this position | ABCD(?=\d) matches ABCD only when a digit follows it |
(?!...) | Negative lookahead | Asserts that the subexpression does not match immediately to the right of this position | ABCD(?!\d) matches ABCD only when no digit follows it |
(?<=...) | Lookbehind | Asserts that the subexpression matches immediately to the left of this position | (?<=\d)ABCD matches ABCD only when a digit precedes it |
(?<!...) | Negative lookbehind | Asserts that the subexpression does not match immediately to the left of this position | (?<!\d)ABCD matches ABCD only when no digit precedes it |
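The four assertions can be tried out with a short script. This is a sketch with made-up sample strings; the variable names are illustrative only. Each pattern is wrapped in a capture group so the script can show that the captured text is always just ABCD and never includes the neighbouring character.

```perl
use strict;
use warnings;

# Pattern text paired with a sample string on which it should match.
my @cases = (
    [ 'ABCD(?=\d)',  'ABCD1' ],   # lookahead: a digit follows
    [ 'ABCD(?!\d)',  'ABCDx' ],   # negative lookahead: no digit follows
    [ '(?<=\d)ABCD', '1ABCD' ],   # lookbehind: a digit precedes
    [ '(?<!\d)ABCD', 'xABCD' ],   # negative lookbehind: no digit precedes
);

my %captured;
for my $case (@cases) {
    my ($pattern, $text) = @$case;
    # The capture group wraps the whole pattern, so $1 is the matched text.
    if ($text =~ /($pattern)/) {
        $captured{$pattern} = $1;
        printf "%-12s on %s captured '%s'\n", $pattern, $text, $1;
    }
}
```

Swapping the sample strings (for example, trying `ABCD(?=\d)` on `ABCDx`) makes the corresponding match fail, since the asserted condition no longer holds.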
Lookahead and lookbehind match only a position; the subexpression inside them is not part of the match. For example, given the string "ABCDEFG", the regular expression
(?=EFG)(?<=ABCD) or (?<=ABCD)(?=EFG)
matches the position between D and E, not EFG, ABCD, or any other text. The order of the lookahead and the lookbehind does not matter, because each is only an assertion about a position; both forms match the position between D and E. The first reads "match a position whose right side is EFG and whose left side is ABCD". The second reads "match a position whose left side is ABCD and whose right side is EFG". The following Perl inserts a comma between ABCD and EFG:
$foo = "ABCDEFG";
$foo =~ s/(?<=ABCD)(?=EFG)/,/;
print $foo;
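As a small check of the two claims above (the order of the assertions is irrelevant, and the match is a position rather than text), the following sketch applies both orders and then inspects the match offsets through Perl's `@-` and `@+` arrays. The variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "ABCDEFG";

# Apply the two assertions in both orders on copies of the string.
(my $first  = $text) =~ s/(?<=ABCD)(?=EFG)/,/;
(my $second = $text) =~ s/(?=EFG)(?<=ABCD)/,/;
print "$first\n";    # ABCD,EFG
print "$second\n";   # ABCD,EFG

# $-[0] and $+[0] are the start and end offsets of the last successful match.
# Equal offsets mean the match is empty: a position, not a piece of text.
my ($start, $end) = (-1, -1);
if ($text =~ /(?<=ABCD)(?=EFG)/) {
    ($start, $end) = ($-[0], $+[0]);
}
print "match spans offsets $start to $end\n";   # 4 to 4
```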
Suppose we want to replace every occurrence of the word Jeffs in a piece of text with Jeff's (matching at word boundaries). There are several ways to do it:
Solution | Description |
s/\bJeffs\b/Jeff's/g | The most intuitive method: replace each matched Jeffs with Jeff's. |
s/\b(Jeff)(s)\b/$1'$2/g | Capture groups: insert a single quote between the two groups. |
s/\bJeff(?=s\b)/Jeff'/g | Lookahead: matches only a Jeff that is followed by an s at a word boundary. (Others, such as the Jeff in Jeffrey, are not matched.) |
s/(?<=\bJeff)(?=s\b)/'/g | Uses both lookbehind and lookahead to insert a single quote at the matched position. |
s/(?=s\b)(?<=\bJeff)/'/g | Behaves exactly like the previous one. Because lookahead and lookbehind match a position rather than text, their order does not matter. |
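The five solutions can be compared side by side. The sketch below runs each substitution on the same sample sentence, which is invented for this illustration, and prints the results; all five should agree, and Jeffreys should be left alone.

```perl
use strict;
use warnings;

# Made-up sample text containing the target word and a near miss.
my $text = "Jeffs book is not Jeffreys book, and Jeffs pen is here.";

# One anonymous sub per solution; each works on its own copy of the text.
my @solutions = (
    sub { (my $t = shift) =~ s/\bJeffs\b/Jeff's/g;       $t },
    sub { (my $t = shift) =~ s/\b(Jeff)(s)\b/$1'$2/g;    $t },
    sub { (my $t = shift) =~ s/\bJeff(?=s\b)/Jeff'/g;    $t },
    sub { (my $t = shift) =~ s/(?<=\bJeff)(?=s\b)/'/g;   $t },
    sub { (my $t = shift) =~ s/(?=s\b)(?<=\bJeff)/'/g;   $t },
);

my @results = map { $_->($text) } @solutions;
print "$_\n" for @results;
```

The last two solutions replace an empty match, so nothing from the original text is rewritten; only the quote is added.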
IV. Solving the Problem
Using lookahead and lookbehind, we can solve the problem of inserting commas into a number. Working from right to left, the digits are divided into groups of three, and a comma is inserted between adjacent groups. In other words, once those positions are matched, a comma can be inserted at each of them. A first attempt:
$foo = "123456789";
$foo =~ s/(?=\d\d\d)/,/g;
print $foo;
The result is ",1,2,3,4,5,6,789", which is obviously not what we want. The following table refines the expression step by step:
Regular Expression | Result | Description |
s/(?=\d\d\d)/,/g or s/(?=(\d\d\d)+)/,/g | ,1,2,3,4,5,6,789 | The position to the left of 1 is matched because three digits (123) lie to its right. Likewise, the position to the right of 1 is matched because three digits (234) lie to its right, and so on. The position to the right of 7 is not matched because fewer than three digits remain to its right. |
s/(?=\d\d\d$)/,/g | 123456,789 | Only the position to the left of 7 is matched, because it alone meets both conditions: three digits lie to its right, and those three digits are at the end of the line (nothing follows them). |
s/(?=(\d\d\d)+$)/,/g | ,123,456,789 | The position to the left of 1 meets the condition "one or more groups of three digits lie to the right, and the last group is at the end of the line". The position to the left of 2 does not, because it is followed by 234, 567, and 89, and the trailing 89 cannot form a group. The remaining positions follow the same reasoning. |
s/(?=(\d\d\d)+$)(?<=\d)/,/g | 123,456,789 | The position to the left of 1 is no longer matched, because (?<=\d) requires a digit immediately to the left of the position, and there is none before the 1. |
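The progression in the table can be reproduced with a short script. This is a sketch; the labels and variable names are illustrative, and each substitution runs on a fresh copy of the same input.

```perl
use strict;
use warnings;

my $number = "123456789";

# Each step pairs a short label with the pattern from the table.
my @steps = (
    [ 'three digits ahead',              qr/(?=\d\d\d)/ ],
    [ 'three digits, then end of line',  qr/(?=\d\d\d$)/ ],
    [ 'groups of three to end of line',  qr/(?=(\d\d\d)+$)/ ],
    [ 'same, with a digit on the left',  qr/(?=(\d\d\d)+$)(?<=\d)/ ],
);

my @outputs;
for my $step (@steps) {
    my ($label, $re) = @$step;
    (my $copy = $number) =~ s/$re/,/g;   # insert a comma at every matched position
    push @outputs, $copy;
    printf "%-32s %s\n", $label, $copy;
}
```

Only the last pattern yields 123,456,789; the lookbehind is what removes the unwanted leading comma.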
This regular expression works when the entire string is a number. However, if you want to replace 281421906 in "The population of 281421906 is growing" with 281,421,906, it fails, because $ requires the last group of three digits to sit at the end of the line. The condition therefore needs to become "the last digit is not followed by another digit", which calls for a negative lookahead:
s/(?=(\d\d\d)+(?!\d))(?<=\d)/,/g
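To see the difference between the two end conditions, the following sketch applies both substitutions to the example sentence and to a second, made-up string containing two numbers. Variable names are illustrative.

```perl
use strict;
use warnings;

my $sentence = "The population of 281421906 is growing";

# Anchored with $: the number is not at the end of the line, so nothing changes.
(my $with_dollar = $sentence) =~ s/(?=(\d\d\d)+$)(?<=\d)/,/g;

# Negative lookahead: the digit groups only need to end where the digits end.
(my $with_negative = $sentence) =~ s/(?=(\d\d\d)+(?!\d))(?<=\d)/,/g;

print "$with_dollar\n";     # The population of 281421906 is growing
print "$with_negative\n";   # The population of 281,421,906 is growing

# The same substitution handles several numbers in one string.
(my $several = "1234 and 5678900") =~ s/(?=(\d\d\d)+(?!\d))(?<=\d)/,/g;
print "$several\n";         # 1,234 and 5,678,900
```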
(The IRegExp2 COM object provided by Microsoft's vbscript.dll, which is supposed to support the lookbehind syntax, throws an exception when the syntax is actually used, for reasons that are unclear.)