While passing a string from the front end to the back end for processing, I ran into a strange problem: a space in the string (ASCII 32) arrives UTF-8 encoded as two unexpected bytes (194 and 160). On the back end, however, it still looks like a space.
The byte pair 0xC2 0xA0 is the UTF-8 encoding of a special character, the no-break space (U+00A0). It renders like an ordinary half-width space (ASCII 0x20), but unlike an ordinary space it is not collapsed and does not allow a line break, so it is often used in typographic layout. GB2312 has no such character, so after conversion the front end displays it as "?". That "?" is only how the character is displayed, not a real question mark, so replacing "?" does not work!
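To make the byte-level difference concrete, here is a minimal sketch that prints the UTF-8 bytes of both characters. The class name NbspBytes and the helper ToHex are hypothetical names chosen for this illustration; only Encoding.UTF8.GetBytes and BitConverter.ToString come from the framework.

```csharp
using System;
using System.Text;

public static class NbspBytes
{
    // Hypothetical helper: returns the UTF-8 bytes of a string
    // as space-separated hex, e.g. "C2 A0".
    public static string ToHex(string s)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(s);
        // BitConverter.ToString yields "C2-A0"; swap the dashes for spaces.
        return BitConverter.ToString(bytes).Replace("-", " ");
    }

    public static void Main()
    {
        Console.WriteLine(ToHex(" "));       // ordinary space: prints 20
        Console.WriteLine(ToHex("\u00A0"));  // no-break space: prints C2 A0
    }
}
```

The ordinary space takes one byte and the no-break space takes two, which is where the extra byte in the examples here comes from.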
Converting the two seemingly identical strings to bytes confirms this. The original string is "# '% $ ()_-{}. B" and the escaped string is "# '% $ ()_-{}. B" (the double quotation marks are not counted). As byte arrays, the original is 16 bytes and the escaped one is 17 bytes: the extra byte is the result of 32 --> 194 160.
string tmp1 = "# '% $ () _-{}. B";  // original string
string tmp2 = "# '% $ () _-{}. B";  // escaped string (its space arrived as U+00A0)
byte[] o1 = Encoding.UTF8.GetBytes(tmp1);
byte[] o2 = Encoding.UTF8.GetBytes(tmp2);
Once the cause is known, the fix is easy to code: convert the 194 160 byte pair back to an ordinary space. The program (C#) is as follows:
private string ChangeUtf8Space(string targetStr)
{
    try
    {
        string currentStr = string.Empty;
        // 0xC2 0xA0 is the UTF-8 encoding of the no-break space
        byte[] utf8Space = new byte[] { 0xc2, 0xa0 };
        string tempSpace = Encoding.GetEncoding("UTF-8").GetString(utf8Space);
        // Swap every no-break space for an ordinary space
        currentStr = targetStr.Replace(tempSpace, " ");
        return currentStr;
    }
    catch (Exception)
    {
        // On any failure, return the input unchanged
        return targetStr;
    }
}
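As a possible simplification, the same replacement can be sketched without building the character from its bytes, since a C# string can name the no-break space directly as '\u00A0'. This is an alternative under that assumption, not the code used above; the names SpaceFixer and ReplaceNoBreakSpaces are hypothetical.

```csharp
using System;
using System.Text;

public static class SpaceFixer
{
    // Hypothetical helper: swaps every no-break space (U+00A0)
    // for an ordinary space (U+0020).
    public static string ReplaceNoBreakSpaces(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return input; // nothing to replace
        }
        return input.Replace('\u00A0', ' ');
    }

    public static void Main()
    {
        string fromFrontEnd = "a\u00A0b"; // 'a', no-break space, 'b'
        string cleaned = ReplaceNoBreakSpaces(fromFrontEnd);

        Console.WriteLine(Encoding.UTF8.GetByteCount(fromFrontEnd)); // prints 4
        Console.WriteLine(Encoding.UTF8.GetByteCount(cleaned));      // prints 3
        Console.WriteLine(cleaned == "a b");                         // prints True
    }
}
```

Comparing the byte counts before and after is a quick way to check that the two-byte sequence really was replaced by a one-byte space.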
For more characters and their UTF-8 encodings, see this article: http://www.utf8-chartable.de/unicode-utf8-table.pl?utf8=dec