Are you sure you want to use the Java replaceAll function?
Every Java programmer uses the replace, replaceAll, and replaceFirst methods of String. I have used them for more than two years. But do we really understand them?
The following describes how to use these three methods:
· replace(CharSequence target, CharSequence replacement): replaces every occurrence of target with replacement. Both parameters are treated as literal text.
· replaceAll(String regex, String replacement): replaces every match of regex with replacement. regex is a regular expression, and replacement is a string.
· replaceFirst(String regex, String replacement): the same as replaceAll, except that only the first match is replaced.
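To make the literal-versus-regex distinction concrete, here is a minimal sketch. The class name ReplaceKinds and the sample input "a.b.c" are hypothetical, chosen only for illustration; the dot is used because it is a regex metacharacter that matches any character.

```java
class ReplaceKinds {
    // replace: both arguments are literal text, so only real dots are replaced.
    static String literal(String s) {
        return s.replace(".", "-");
    }

    // replaceAll: the first argument is a regex, and "." matches any character.
    static String regexAll(String s) {
        return s.replaceAll(".", "-");
    }

    // replaceFirst: same regex semantics, but only the first match is replaced.
    static String regexFirst(String s) {
        return s.replaceFirst(".", "-");
    }

    public static void main(String[] args) {
        System.out.println(literal("a.b.c"));    // a-b-c
        System.out.println(regexAll("a.b.c"));   // -----
        System.out.println(regexFirst("a.b.c")); // -.b.c
    }
}
```

With the same arguments, the three methods give three different results, which is why it matters to know which parameters are read as regular expressions.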
Now for a simple requirement: replace every a in the source string with \a. The code is as follows:
System.out.println("abac".replace("a", "\\a")); // \ab\ac
System.out.println("abac".replaceAll("a", "\\a")); // abac
System.out.println("abac".replaceFirst("a", "\\a")); // abac
The results are surprising. After years of using these methods, I found this a bit confusing.
The source string is "abac"; we find each "a" and replace it with \a. Because \ is the escape character in Java string literals, \a must be written as "\\a": the first backslash escapes the second, turning it into an ordinary character.
Of the three calls, only replace produces the expected result. What went wrong?
replaceAll and replaceFirst require the first parameter to be a regular expression. "a" contains no regex metacharacters, so it matches the same text whether it is read as a literal string or as a regular expression; the first parameter is not the problem.
The problem lies in the second parameter. If you read the Javadoc of replaceAll carefully, you will find the following note:
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string; see Matcher.replaceAll. Use java.util.regex.Matcher.quoteReplacement to suppress the special meaning of these characters, if desired.
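The note above suggests Matcher.quoteReplacement as a way to switch off the special meaning of \ and $. A minimal sketch of that approach follows; the class QuoteReplacementDemo and the helper replaceLiterally are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;

class QuoteReplacementDemo {
    // Replaces every "a" in the input with the given text, taken literally:
    // quoteReplacement escapes any backslashes and dollar signs for us.
    static String replaceLiterally(String input, String replacement) {
        return input.replaceAll("a", Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replaceLiterally("abac", "\\a")); // \ab\ac
        System.out.println(replaceLiterally("abac", "$"));   // $b$c
    }
}
```

This is mainly useful when the replacement text comes from outside the program and cannot be escaped by hand.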
Because the first parameter of replaceAll and replaceFirst is a regular expression, the second parameter can refer back to what was matched. For example, suppose we want to replace each a, together with the character after it, with two copies of that character. The code is as follows:
System.out.println("abac".replaceAll("a(\\w)", "$1$1")); // bbcc
System.out.println("abac".replaceFirst("a(\\w)", "$1$1")); // bbac
In the regular expression, a(\w) matches an a followed by one word character, and the parentheses capture that character as group 1. In the second parameter, the $ symbol refers to a group's content: here $1 is the content of the first group, that is, the character that follows a.
The $ symbol therefore has a special meaning in the second parameter, and an exception is thrown if it is not followed by a valid group reference:
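As a further illustration of group references, here is a small sketch. The class GroupReferenceDemo, its method names, and the sample inputs are hypothetical; the only behavior relied on is that $n refers to capture group n and $0 to the whole match.

```java
class GroupReferenceDemo {
    // Swaps two whitespace-separated words: $1 is the first word, $2 the second.
    static String swapWords(String s) {
        return s.replaceAll("(\\w+)\\s+(\\w+)", "$2 $1");
    }

    // Wraps every "a" in brackets; $0 stands for the entire match.
    static String bracketA(String s) {
        return s.replaceAll("a", "[$0]");
    }

    public static void main(String[] args) {
        System.out.println(swapWords("hello world")); // world hello
        System.out.println(bracketA("abac"));         // [a]b[a]c
    }
}
```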
System.out.println("abac".replaceAll("a(\\w)", "$")); // Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 1
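The failure can also be observed without crashing the program. The sketch below is hypothetical (the class DanglingDollarDemo and its method are illustration only). It catches RuntimeException on purpose, on the assumption that the exact exception type depends on the JDK version: the run above reported StringIndexOutOfBoundsException, while newer JDKs may report an IllegalArgumentException instead.

```java
class DanglingDollarDemo {
    // Returns the simple class name of the exception caused by a dangling "$"
    // in the replacement string, or null if no exception is thrown.
    static String danglingDollarFailure() {
        try {
            "abac".replaceAll("a(\\w)", "$");
            return null;
        } catch (RuntimeException e) {
            // The concrete type is not assumed here; it may vary between JDKs.
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(danglingDollarFailure());
    }
}
```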
What if we want the replacement to be a literal $? It has to be escaped:
System.out.println("abac".replaceAll("a", "\\$")); // $b$c
At this point, the reader may realize that the backslash also has a special meaning (escaping) in the second parameter, so to produce a literal backslash we must escape it once more:
System.out.println("abac".replaceAll("a", "\\\\a")); // \ab\ac
System.out.println("abac".replaceFirst("a", "\\\\a")); // \abac
To put it simply, there are two layers of escaping in "\\\\a". First, in the Java source, the first and third backslashes escape the second and fourth, so the string held in memory is \\a (two backslashes followed by a). Then, when replaceAll processes this replacement string, the first backslash escapes the second, which therefore becomes an ordinary character rather than an escape for whatever follows it. The text finally substituted is \a, so each a is replaced with \a.
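The two layers can be checked directly. This is a minimal sketch; the class EscapeLayersDemo, the constant REPLACEMENT, and the method apply are hypothetical names introduced for illustration.

```java
class EscapeLayersDemo {
    // Layer 1: the compiler turns the literal "\\\\a" into the 3 characters \\a.
    static final String REPLACEMENT = "\\\\a";

    // Layer 2: replaceAll reads \\ in the replacement as one literal backslash.
    static String apply(String input) {
        return input.replaceAll("a", REPLACEMENT);
    }

    public static void main(String[] args) {
        System.out.println(REPLACEMENT.length()); // 3
        System.out.println(REPLACEMENT);          // \\a
        System.out.println(apply("abac"));        // \ab\ac
    }
}
```

Printing the length shows that four backslashes in the source become two in memory, and the final output shows that those two become one after replaceAll has processed them.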
The escaping rules are indeed tangled. I hope this article helps readers stay alert when using these methods, watch for special characters in the parameters, and avoid planting time bombs in their code.