Solving garbled Chinese characters in URL addresses

Source: Internet
Author: User


Introduction: In RESTful service design, it is often necessary to pass Chinese text as a parameter in a URL. Unless the Chinese characters are encoded and decoded correctly, the parameter arrives garbled. This article looks at why that happens and how to fix it.

1. Problem Introduction

In RESTful service design, a query URL is typically designed like get/basic/service?keyword=historical, where the keyword value is Chinese text. In actual development, however, the keyword read in the background comes out garbled and cannot be interpreted correctly.

2. How is garbled code produced?

Parameters passed in a URL depend on the browser environment: the key=value pairs in the URL are encoded by the browser when the request is sent from the address bar, and are then decoded in the background.

If we do no processing ourselves and the URL requested from JavaScript contains Chinese parameters (that is, when the user has typed Chinese into the input box), the browser encodes those parameters according to its own mechanism. When the background does not decode them with the same character encoding, the value it reads is garbled.
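As a rough illustration of how an encoding mismatch produces garbled text, the sketch below takes the UTF-8 bytes of a Chinese word and reads them back as ISO-8859-1. It assumes Node.js (for its built-in Buffer); the sample word and the choice of ISO-8859-1 as the receiving charset are illustrative assumptions, not taken from a specific browser or server.

```javascript
// Illustrative only: assumes the sender uses UTF-8 and the receiver ISO-8859-1.
const originalText = "测试"; // sample Chinese input ("test")

// The sender turns the text into UTF-8 bytes: e6 b5 8b e8 af 95.
const utf8Bytes = Buffer.from(originalText, "utf8");

// The receiver interprets every byte as one ISO-8859-1 character.
const garbledText = utf8Bytes.toString("latin1");

console.log(utf8Bytes.toString("hex"));     // e6b58be8af95
console.log(garbledText.length);            // 6 characters instead of 2
console.log(garbledText === originalText);  // false
```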

3. First encoding: using encodeURI() in JavaScript

When the Chinese URL parameter is encoded with encodeURI() in JavaScript, the word 测试 ("test") is converted to "%E6%B5%8B%E8%AF%95". However, the problem persists. The cause lies in the encoded string itself: "%" is the URL escape character, so each "%XX" sequence is treated as an escape and decoded before the value reaches the background code, instead of being passed through as ordinary text. If that automatic decoding does not use UTF-8, the result no longer matches what encodeURI() produced, and the value is still garbled.
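A minimal sketch of the single-encoding step, using the Chinese word 测试 ("test") as sample input. The variable names are made up for illustration.

```javascript
// Sample input; 测试 is the Chinese word for "test".
const word = "测试";

// encodeURI() escapes each UTF-8 byte of a non-ASCII character as %XX.
const encodedOnce = encodeURI(word);
console.log(encodedOnce); // %E6%B5%8B%E8%AF%95

// A receiver that decodes the escapes as UTF-8 gets the original text back.
console.log(decodeURI(encodedOnce) === word); // true
```

The round trip only works when the decoding side also uses UTF-8, which is exactly the condition that fails in the scenario described above.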

4. Second encoding: applying encodeURI() twice

Operation:

encodeURI(encodeURI("/order?name=" + name));

After one encodeURI() pass the parameter is "%E6%B5%8B%E8%AF%95"; after two passes it becomes "%25E6%25B5%258B%25E8%25AF%2595". The second pass re-encodes each "%" that would otherwise be parsed as an escape character, turning it into the ordinary character sequence "%25".

At this point the front-end JavaScript has finished encoding the URL that contains Chinese characters and passes it to the background. The parameter "%25E6%25B5%258B%25E8%25AF%2595" contains only ASCII characters, so it reaches the Action without being garbled; the Chinese text it stands for is the 测试 ("test") we entered.
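The effect of the two passes can be checked directly. In this sketch, customerName is a hypothetical stand-in for the value typed by the user, and /order?name= is the path from the snippet above.

```javascript
const customerName = "测试"; // hypothetical user input ("test")

// First pass: Chinese characters become %XX escapes of their UTF-8 bytes.
const encodedUrlOnce = encodeURI("/order?name=" + customerName);

// Second pass: every "%" from the first pass is itself escaped as "%25".
const encodedUrlTwice = encodeURI(encodedUrlOnce);

console.log(encodedUrlOnce);  // /order?name=%E6%B5%8B%E8%AF%95
console.log(encodedUrlTwice); // /order?name=%25E6%25B5%258B%25E8%25AF%2595
```

Note that encodeURI() leaves "/", "?" and "=" untouched, so the structure of the URL survives both passes.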

5. How can I correctly parse Chinese characters in the background?

After the double encodeURI(), the value that reaches the background is still percent-encoded and cannot be read directly as Chinese. Decode it explicitly with:

URLDecoder.decode("chinese string","UTF-8") 

The decode(String s, String enc) method of URLDecoder takes two parameters: the first is the string to be decoded, and the second is the name of the character encoding to use when decoding.
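To see why the second layer of encoding helps, the following sketch simulates the two decoding stages in JavaScript. It rests on assumptions: percentDecodeLatin1 is a hypothetical helper standing in for an automatic first-stage decode that uses ISO-8859-1, and decodeURIComponent() plays the role of the explicit URLDecoder.decode(..., "UTF-8") call. It is not a model of any particular server.

```javascript
// Hypothetical first-stage decoder: maps every %XX escape to one ISO-8859-1 character.
function percentDecodeLatin1(text) {
  return text.replace(/%([0-9A-Fa-f]{2})/g, (_, hex) =>
    String.fromCharCode(parseInt(hex, 16)));
}

const singleEncoded = encodeURI("测试");         // %E6%B5%8B%E8%AF%95
const doubleEncoded = encodeURI(singleEncoded);  // %25E6%25B5%258B%25E8%25AF%2595

// Single encoding: the first stage already yields the wrong characters.
console.log(percentDecodeLatin1(singleEncoded) === "测试"); // false

// Double encoding: the first stage only turns %25 back into %, leaving pure ASCII.
const afterFirstStage = percentDecodeLatin1(doubleEncoded);
console.log(afterFirstStage); // %E6%B5%8B%E8%AF%95

// Second stage: explicit UTF-8 decoding restores the Chinese text.
console.log(decodeURIComponent(afterFirstStage)); // 测试
```

Because the string that survives the first stage is plain ASCII, the charset used by that stage no longer matters; only the explicit second decode has to use UTF-8.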

6. encodeURI, encodeURIComponent, escape

6.1 escape() function

The escape() function encodes a string so that it can be read on all computers.

Returned value: a copy of the encoded string, in which some characters are replaced with hexadecimal escape sequences.

Note: This method does not encode ASCII letters and digits or the following ASCII punctuation marks: @ * _ + - . /. All other characters, including spaces, other punctuation marks, and non-ASCII characters, are replaced by escape sequences: %xx for character codes below 256 (xx is the hexadecimal code of the character; for example, a space becomes %20) and %uxxxx for anything larger, which is the form Chinese characters take.

6.2 encodeURI() method

Converts a URI string to an escaped string using UTF-8 encoding. Characters not encoded by this method (besides ASCII letters and digits): - _ . ! ~ * ' ( ) ; , / ? : @ & = + $ #

6.3 encodeURIComponent() method

Converts a URI component to an escaped string using UTF-8 encoding. Compared with encodeURI(), this method encodes more characters, such as / ? : @ & = + $ , ; and #. Therefore, it should not be applied to a string that contains several parts of a URI: once the / characters are encoded, the URL no longer works.

Characters not encoded by this method (besides ASCII letters and digits): - _ . ! ~ * ' ( )

Therefore, for a Chinese string, if you do not need the result in UTF-8 format (for example, when the original page and the target page use the same charset), escape() is sufficient. If your page is in GB2312 or another encoding and the page that receives the parameter is in UTF-8, use encodeURI() or encodeURIComponent().
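The differences between the three functions are easiest to see side by side. The sample string below is made up for illustration; it mixes Chinese characters, a space, and URI delimiters.

```javascript
const sample = "测试 a/b?c=d&e"; // illustrative input

// escape(): %uXXXX for Chinese characters; keeps "/" but escapes "?", "=" and "&".
const viaEscape = escape(sample);

// encodeURI(): UTF-8 %XX escapes; keeps URI delimiters such as "/", "?", "=" and "&".
const viaEncodeURI = encodeURI(sample);

// encodeURIComponent(): UTF-8 %XX escapes; also escapes the URI delimiters.
const viaEncodeURIComponent = encodeURIComponent(sample);

console.log(viaEscape);             // %u6D4B%u8BD5%20a/b%3Fc%3Dd%26e
console.log(viaEncodeURI);          // %E6%B5%8B%E8%AF%95%20a/b?c=d&e
console.log(viaEncodeURIComponent); // %E6%B5%8B%E8%AF%95%20a%2Fb%3Fc%3Dd%26e
```

In practice this means encodeURI() suits a whole URL, while encodeURIComponent() suits a single parameter value that may itself contain "&" or "=".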

7. Another way to handle garbled Chinese in URLs

On the requesting side, encode the characters with encodeURI(), for example:

   var url="/ajax?name="+encodeURI(name);

Server code:

  name=new String(name.getBytes("iso8859-1"),"UTF-8");

Note: name is the string obtained from the request, and ISO-8859-1 is the project's default character encoding. If the default is a Chinese encoding such as GBK or GB2312, do not apply this step.

Analysis: this approach has been verified in a program and works. It suggests that the default encoding applied to the parameter is ISO-8859-1, even though encodeURI() produced UTF-8 escapes. Plain content such as ASCII letters and other visible ASCII characters survives unchanged because those characters have the same encoding in ISO-8859-1 and in UTF-8. Escape functions such as encodeURI() mainly deal with escaping special characters such as %.
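The byte-level idea behind the server-side line can be sketched in JavaScript as well. This assumes Node.js (for Buffer) and assumes the value was garbled by reading UTF-8 bytes as ISO-8859-1; reinterpretLatin1AsUtf8 is a hypothetical helper name, not part of any library.

```javascript
// Hypothetical helper: recover the raw bytes via ISO-8859-1, then decode them as UTF-8.
// This mirrors new String(name.getBytes("iso8859-1"), "UTF-8") on the Java side.
function reinterpretLatin1AsUtf8(misreadText) {
  return Buffer.from(misreadText, "latin1").toString("utf8");
}

// Simulate a value whose UTF-8 bytes were misread as ISO-8859-1.
const misreadName = Buffer.from("测试", "utf8").toString("latin1");

const repairedName = reinterpretLatin1AsUtf8(misreadName);
console.log(repairedName); // 测试
```

The repair works because ISO-8859-1 maps each byte to exactly one character, so no information is lost in the wrong decode. Applying it to a value that was already decoded correctly would damage it, which matches the warning in the note above.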

