How to reverse a string in Java, and the related character encoding issues

Source: Internet
Author: User

A common way to reverse a string is as follows:

    public String reverse(char[] value) {
        // Walk from the middle back to the start, swapping each char with its mirror.
        for (int i = (value.length - 1) >> 1; i >= 0; i--) {
            char temp = value[i];
            value[i] = value[value.length - 1 - i];
            value[value.length - 1 - i] = temp;
        }
        return new String(value);
    }

Algorithmically, this code is fine. However, while reading the StringBuffer source code today, I found that its reverse method is more subtle. The source code is as follows:

    public AbstractStringBuilder reverse() {
        boolean hasSurrogate = false;
        int n = count - 1;
        // First pass: plain symmetric swap, noting whether any surrogate was seen.
        for (int j = (n - 1) >> 1; j >= 0; --j) {
            char temp = value[j];
            char temp2 = value[n - j];
            if (!hasSurrogate) {
                hasSurrogate = (temp >= Character.MIN_SURROGATE && temp <= Character.MAX_SURROGATE)
                    || (temp2 >= Character.MIN_SURROGATE && temp2 <= Character.MAX_SURROGATE);
            }
            value[j] = temp2;
            value[n - j] = temp;
        }
        if (hasSurrogate) {
            // Reverse back all valid surrogate pairs
            for (int i = 0; i < count - 1; i++) {
                char c2 = value[i];
                if (Character.isLowSurrogate(c2)) {
                    char c1 = value[i + 1];
                    if (Character.isHighSurrogate(c1)) {
                        value[i++] = c1;
                        value[i] = c2;
                    }
                }
            }
        }
        return this;
    }

This method is defined in AbstractStringBuilder, the parent class of StringBuffer, so its return type is AbstractStringBuilder. The subclass overrides it as follows:

    public synchronized StringBuffer reverse() {
        super.reverse();
        return this;
    }
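One way to see the difference between the two approaches is to reverse a string that contains a character Java stores as two chars. The sketch below is only an illustration: the class name ReverseDemo and the helper naiveReverse are hypothetical, and the sample character U+20BB7 is an arbitrary choice of a CJK ideograph that needs two chars.

```java
public class ReverseDemo {
    // Naive reversal: swaps chars symmetrically and knows nothing about surrogates.
    static String naiveReverse(String s) {
        char[] value = s.toCharArray();
        for (int i = (value.length - 1) >> 1; i >= 0; i--) {
            char temp = value[i];
            value[i] = value[value.length - 1 - i];
            value[value.length - 1 - i] = temp;
        }
        return new String(value);
    }

    public static void main(String[] args) {
        // 'a', then U+20BB7 stored as the two chars D842 and DFB7, then 'b'.
        String s = "a\uD842\uDFB7b";

        String naive = naiveReverse(s);
        String library = new StringBuilder(s).reverse().toString();

        // The naive result has the pair in the wrong order (low surrogate first).
        System.out.println(Character.isLowSurrogate(naive.charAt(1)));  // prints true

        // StringBuilder.reverse() keeps the pair intact.
        System.out.println(library.codePointAt(1) == 0x20BB7);          // prints true
    }
}
```

For strings made only of single-char characters the two methods give the same result, which is why the naive version looks correct in most tests.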

Judging from the method body, the basic idea in the JDK source is the same: it walks half of the string and swaps each character with its mirror. The difference is that it also checks whether any char lies between Character.MIN_SURROGATE (\ud800) and Character.MAX_SURROGATE (\udfff). If such a char is found anywhere in the string, the method traverses the string once more from start to end and checks whether value[i] satisfies Character.isLowSurrogate(). If so, it checks whether value[i + 1] satisfies Character.isHighSurrogate(). If both conditions hold, the chars at positions i and i + 1 are swapped. Some may wonder why this is necessary: Java characters already use Unicode, and a single char can hold a Chinese character. Why the extra work?
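A complete Unicode character is called a code point, while a Java char is a code unit. A String stores Unicode text as UTF-16, so a character outside the Basic Multilingual Plane, such as a Chinese character from one of the extended character sets, needs two chars. This representation is called a surrogate pair: the first char is the high surrogate and the second is the low surrogate. A plain char-by-char reversal turns each (high, low) pair into (low, high), which is not a valid encoding, so reverse() swaps such pairs back in its second pass. Note the following: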
To determine whether a char lies in the surrogate range, use Character.isHighSurrogate() / Character.isLowSurrogate(). To build a complete code point from a high/low surrogate pair, use Character.toCodePoint() or codePointAt().
A code point may take one or two chars, so CharSequence.length() cannot be used directly to get the number of characters in a string. Use String.codePointCount() / Character.codePointCount() instead.
To locate the Nth character in a string, N cannot be used directly as the offset. You must walk the string from its start, or use String.offsetByCodePoints() / Character.offsetByCodePoints().
To find the previous character from the current position, you cannot simply use offset--. Use String.codePointBefore() / Character.codePointBefore(), or String.offsetByCodePoints() / Character.offsetByCodePoints().
To find the next character from the current position, you cannot simply use offset++. You need to determine the length of the current code point before computing the next offset, or use String.offsetByCodePoints() / Character.offsetByCodePoints().
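The notes above can be sketched with the standard String and Character methods. This is a minimal illustration, not a reference implementation: the class name CodePointDemo and the helper reverseByCodePoint are hypothetical, and the sample string reuses U+20BB7 as an arbitrary two-char character.

```java
public class CodePointDemo {
    // Hypothetical helper: reverses by code point, walking backwards
    // with codePointBefore() instead of decrementing the offset by one.
    static String reverseByCodePoint(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = s.length();
        while (i > 0) {
            int cp = s.codePointBefore(i);   // code point that ends just before index i
            out.appendCodePoint(cp);
            i -= Character.charCount(cp);    // step back by 1 or 2 chars
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String s = "a\uD842\uDFB7b";         // 'a', U+20BB7, 'b'

        System.out.println(s.length());                       // prints 4 (chars)
        System.out.println(s.codePointCount(0, s.length()));  // prints 3 (code points)

        // char index of the third code point (two code points after index 0)
        System.out.println(s.offsetByCodePoints(0, 2));       // prints 3

        // Forward iteration: advance by the length of the current code point.
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            System.out.printf("U+%04X%n", cp);
            i += Character.charCount(cp);
        }

        // Combining a high and a low surrogate into one code point.
        System.out.println(Character.toCodePoint('\uD842', '\uDFB7') == 0x20BB7); // prints true
    }
}
```

Walking by code point in both directions is what keeps each surrogate pair together, which is the same effect the second pass of reverse() achieves after the fact.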
