Conversion between a hexadecimal Unicode encoded string and a Chinese String

Source: Internet
Author: User


The URL obtained from a library client project is as follows:

    String baseurl = "http://innopac.lib.xjtu.edu.cn/availlim/search~S1*chx?/X{u848B}{u4ECB}{u77F3}&searchscope=1&SORT=DZ/X{u848B}{u4ECB}{u77F3}&searchscope=1&SORT=DZ&extended=0&SUBKEY=%E8%92%8B%E4%BB%8B%E7%9F%B3/51%2C607%2C607%2CB/browse";


If you use this URL directly to send an HTTP GET request, an exception reporting invalid characters is thrown, because the URL cannot contain "{" or "}".
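The rejection can be reproduced without any HTTP client. The sketch below assumes the client validates the URL with the same rules as java.net.URI, which may not match every client; the class name BraceCheck and the host example.com are placeholders, not part of the original project.

```java
import java.net.URI;
import java.net.URISyntaxException;

class BraceCheck {
    // Returns true if java.net.URI accepts the string, false if it rejects it.
    static boolean isValidUri(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // example.com is a placeholder host used only for illustration.
        System.out.println(isValidUri("http://example.com/search?/X{u848B}"));
        System.out.println(isValidUri("http://example.com/search?/X%E8%92%8B"));
    }
}
```

The first call reports the braces as illegal characters, while the percent-encoded form is accepted.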

What is the content inside the braces? It turned out to be the hexadecimal Unicode encoding of Chinese characters: {u848B}{u4ECB}{u77F3} above stands for the Chinese characters "蒋介石" (Chiang Kai-shek).
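One way to check this reading is to decode the three hexadecimal values by hand and compare them with the percent-encoded SUBKEY parameter from the same URL. This is a minimal sketch; the class and method names are made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class CodePointDemo {
    // Turns four hex digits such as "848B" into the char with that value.
    static char fromHex(String hex) {
        return (char) Integer.parseInt(hex, 16);
    }

    // Decodes a percent-encoded UTF-8 value such as the SUBKEY parameter.
    static String fromPercentEncoded(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        String fromBraces = "" + fromHex("848B") + fromHex("4ECB") + fromHex("77F3");
        String fromSubkey = fromPercentEncoded("%E8%92%8B%E4%BB%8B%E7%9F%B3");
        System.out.println(fromBraces);
        // Both forms should describe the same three characters.
        System.out.println(fromBraces.equals(fromSubkey));
    }
}
```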

In this case, you need to convert the hexadecimal Unicode encoded string into a Chinese string. The code is as follows:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    /**
     * Convert a Chinese string to a hexadecimal Unicode encoded string.
     *
     * @param s the Chinese string
     * @return the encoded string
     */
    public static String stringToUnicode(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            int ch = s.charAt(i);
            if (ch > 255) {
                // Pad to four hex digits so that unicodeToString can parse the result.
                sb.append(String.format("\\u%04x", ch));
            } else {
                // Characters up to 255 are written as a backslash plus their hex value.
                // unicodeToString does not convert this form back.
                sb.append("\\").append(Integer.toHexString(ch));
            }
        }
        return sb.toString();
    }

    /**
     * Convert a hexadecimal Unicode encoded string to a Chinese string.
     * Note the expected format: a backslash, the letter u, then four hex digits.
     *
     * @param str e.g. \u848B\u4ECB\u77F3
     * @return the decoded string, here the name Chiang Kai-shek in Chinese characters
     */
    public static String unicodeToString(String str) {
        // Group 1 is the whole escape; group 2 is the four hex digits.
        Pattern pattern = Pattern.compile("(\\\\u(\\p{XDigit}{4}))");
        Matcher matcher = pattern.matcher(str);
        while (matcher.find()) {
            char ch = (char) Integer.parseInt(matcher.group(2), 16);
            str = str.replace(matcher.group(1), String.valueOf(ch));
        }
        return str;
    }


After that, the URL is easy to process. First remove every "}" from the URL, then replace every "{" with "\", and finally convert the resulting \u848B\u4ECB\u77F3 into Chinese characters.
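The same steps can also be folded into a single pass. The sketch below is an alternative, not the article's method: it assumes every escape has exactly the form of "{u" plus four hex digits plus "}" and leaves any other braces untouched. The class name BraceEscapeDecoder is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BraceEscapeDecoder {
    // Matches "{u" + four hex digits + "}" and captures the digits in group 1.
    private static final Pattern BRACE_ESCAPE =
            Pattern.compile("\\{u(\\p{XDigit}{4})\\}");

    static String decode(String url) {
        Matcher m = BRACE_ESCAPE.matcher(url);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char ch = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against '$' or '\' being read as group syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(ch)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("/X{u848B}{u4ECB}{u77F3}&searchscope=1"));
    }
}
```

Matching the braces and the digits together avoids the intermediate backslash form, so a stray backslash elsewhere in the URL cannot be mistaken for an escape.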

 
    /**
     * Replace the braces in the URL with a backslash, then convert the
     * Unicode escapes to Chinese characters.
     *
     * @param baseUrl e.g.
     *        "http://innopac.lib.xjtu.edu.cn/availlim/search~S1*chx?/X{u848B}{u4ECB}{u77F3}&searchscope=1&SORT=DZ/X{u848B}{u4ECB}{u77F3}&searchscope=1&SORT=DZ&extended=0&SUBKEY=%E8%92%8B%E4%BB%8B%E7%9F%B3/51%2C607%2C607%2CB/browse"
     * @return the URL with the escapes replaced by Chinese characters
     */
    public static String replaceUni2Chinese(String baseUrl) {
        // Log is android.util.Log; TAG is a String constant defined in the enclosing class.
        Log.d(TAG, "original URL --> " + baseUrl);
        if (baseUrl.contains("{")) {
            Log.d(TAG, "the original URL contains Chinese characters");
            // Step 1: drop every closing brace.
            String removeLast = baseUrl.replace("}", "");
            // Step 2: turn every opening brace into a backslash.
            String replaceBefore = removeLast.replace("{", "\\");
            // Step 3: decode the resulting escapes.
            String result = unicodeToString(replaceBefore);
            Log.d(TAG, "after converting Unicode to a string --> " + result);
            return result;
        } else {
            Log.d(TAG, "no Chinese characters in the original URL");
            return baseUrl;
        }
    }


 

