The code is as follows:
public String reverse(char[] value) {
    // Start at the middle index and walk toward the front,
    // swapping each char with its mirror position.
    for (int i = (value.length - 1) >> 1; i >= 0; i--) {
        char temp = value[i];
        value[i] = value[value.length - 1 - i];
        value[value.length - 1 - i] = temp;
    }
    return new String(value);
}
Algorithmically, there is nothing wrong with this code. However, while reading the StringBuffer source code today, I found that its reverse method is more subtle. The source code is as follows:
public AbstractStringBuilder reverse() {
    boolean hasSurrogate = false;
    int n = count - 1;
    for (int j = (n - 1) >> 1; j >= 0; --j) {
        char temp = value[j];
        char temp2 = value[n - j];
        if (!hasSurrogate) {
            hasSurrogate = (temp >= Character.MIN_SURROGATE && temp <= Character.MAX_SURROGATE)
                || (temp2 >= Character.MIN_SURROGATE && temp2 <= Character.MAX_SURROGATE);
        }
        value[j] = temp2;
        value[n - j] = temp;
    }
    if (hasSurrogate) {
        // Reverse back all valid surrogate pairs
        for (int i = 0; i < count - 1; i++) {
            char c2 = value[i];
            if (Character.isLowSurrogate(c2)) {
                char c1 = value[i + 1];
                if (Character.isHighSurrogate(c1)) {
                    value[i++] = c1;
                    value[i] = c2;
                }
            }
        }
    }
    return this;
}
This method is defined in AbstractStringBuilder, the parent class of StringBuffer, so its return type is AbstractStringBuilder. The method actually called on the subclass is as follows:
public synchronized StringBuffer reverse() {
    super.reverse();
    return this;
}
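As a small illustration of the override above, the sketch below shows that calling reverse() on a StringBuffer yields a StringBuffer (not an AbstractStringBuilder) and that it is the same object, reversed in place. The class name ReverseReturnTypeDemo and the helper reversed are hypothetical names chosen for this example.

```java
class ReverseReturnTypeDemo {
    // Hypothetical helper: reverses s through StringBuffer and checks object identity.
    static String reversed(String s) {
        StringBuffer sb = new StringBuffer(s);
        // The declared return type of the override is StringBuffer, so no cast is needed.
        StringBuffer same = sb.reverse();
        // reverse() mutates the buffer in place and returns the same instance.
        if (same != sb) {
            throw new AssertionError("reverse() should return the same instance");
        }
        return same.toString();
    }

    public static void main(String[] args) {
        System.out.println(reversed("abc")); // prints "cba"
    }
}
```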
Judging from the method body, the basic idea in the source code is the same: it traverses half of the string and swaps each char with its mirror. The difference is that, while swapping, it checks whether any char falls between Character.MIN_SURROGATE (\ud800) and Character.MAX_SURROGATE (\udfff). If such a char is found anywhere in the string, it traverses the string again from start to end and checks whether value[i] satisfies Character.isLowSurrogate(). If so, it goes on to check whether value[i + 1] satisfies Character.isHighSurrogate(). If that condition is also met, the chars at positions i and i + 1 are swapped back. Some may wonder why this is necessary: characters in Java already use Unicode, and each char can hold a Chinese character. Why?
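A small experiment makes the question concrete. The sketch below runs a plain char-by-char reversal (the same loop as the first listing) and StringBuilder.reverse() on a string containing U+20000, a CJK Extension B ideograph that UTF-16 stores as two chars. The class name SurrogateReverseDemo and the helpers naiveReverse and isWellFormed are hypothetical names introduced for this example.

```java
class SurrogateReverseDemo {
    // Plain char-by-char reversal, same loop as the first listing.
    // It knows nothing about surrogate pairs.
    static String naiveReverse(String s) {
        char[] value = s.toCharArray();
        for (int i = (value.length - 1) >> 1; i >= 0; i--) {
            char temp = value[i];
            value[i] = value[value.length - 1 - i];
            value[value.length - 1 - i] = temp;
        }
        return new String(value);
    }

    // Returns true if every high surrogate is immediately followed by a
    // low surrogate and no low surrogate appears on its own.
    static boolean isWellFormed(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return false;
                }
                i++; // skip the low half of the pair
            } else if (Character.isLowSurrogate(c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // 'a', then U+20000 (high 0xD840, low 0xDC00), then 'b'
        String s = "a\uD840\uDC00b";
        String naive = naiveReverse(s);
        String jdk = new StringBuilder(s).reverse().toString();

        System.out.println(isWellFormed(naive));             // false: low surrogate now precedes high
        System.out.println(isWellFormed(jdk));               // true
        System.out.println(jdk.equals("b\uD840\uDC00a"));    // true
    }
}
```

The naive version leaves the two halves of the pair in the wrong order, so the result no longer encodes U+20000. The JDK version restores the order in its second pass.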
A complete Unicode character is called a code point, while a Java char is called a code unit. A String object stores Unicode text as UTF-16, so a character outside the basic range (for example, a Chinese character from an extended character set) needs two chars. This representation is called a surrogate pair: the first char is the high surrogate and the second is the low surrogate. Note the following:
To determine whether a char lies in the surrogate range, use Character.isHighSurrogate()/isLowSurrogate(). To obtain a complete Unicode code point from a high/low surrogate pair, use Character.toCodePoint()/codePointAt().
A code point may take one or two chars, so CharSequence.length() cannot be used directly to get the number of characters in a String. Use String.codePointCount()/Character.codePointCount() instead.
To locate the nth character in a String, n cannot be used directly as the offset. Instead, you must walk from the start of the String in order, or use String/Character.offsetByCodePoints().
To find the previous character from the current position in a String, offset-- alone is not enough. Use String.codePointBefore()/Character.codePointBefore(), or String/Character.offsetByCodePoints().
To find the next character from the current position, offset++ alone is not enough. Determine the length of the current code point first and advance by that amount, or use String/Character.offsetByCodePoints().
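The notes above can be sketched with the standard String and Character methods they mention. This is a minimal illustration rather than a complete utility: the class name CodePointNavigationDemo and its helper methods are hypothetical, and the sample string reuses U+20000 as an assumed example of a character that needs two chars.

```java
class CodePointNavigationDemo {
    // Number of characters (code points), as opposed to s.length(), which counts chars.
    static int characterCount(String s) {
        return s.codePointCount(0, s.length());
    }

    // Code point of the n-th character (0-based), located via offsetByCodePoints.
    static int nthCodePoint(String s, int n) {
        int offset = s.offsetByCodePoints(0, n);
        return s.codePointAt(offset);
    }

    // Offset of the character that follows the one starting at offset.
    static int nextOffset(String s, int offset) {
        return offset + Character.charCount(s.codePointAt(offset));
    }

    // Offset of the character that ends just before offset.
    static int previousOffset(String s, int offset) {
        return offset - Character.charCount(s.codePointBefore(offset));
    }

    public static void main(String[] args) {
        // 'a', then U+20000 (high 0xD840, low 0xDC00), then 'b'
        String s = "a\uD840\uDC00b";

        System.out.println(s.length());          // 4 chars
        System.out.println(characterCount(s));   // 3 code points

        // Combining the two halves by hand gives the same code point as codePointAt.
        System.out.println(Integer.toHexString(Character.toCodePoint('\uD840', '\uDC00'))); // 20000
        System.out.println(Integer.toHexString(nthCodePoint(s, 1)));                        // 20000

        System.out.println(nextOffset(s, 1));     // 3: skips both chars of the pair
        System.out.println(previousOffset(s, 3)); // 1: steps back over the whole pair
    }
}
```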