Introduction: When designing RESTful services, you often need to pass Chinese text in a URL as a query condition. This requires the Chinese characters to be set up and encoded correctly; otherwise the text arrives garbled. How do you solve that? This article walks through the details.
1. The problem arises
In RESTful service design, a query is typically exposed through a URL such as GET /basic/service?keyword=history, where the keyword value is entered in Chinese. In actual development and use, however, the keyword read by the backend comes out garbled and cannot be interpreted correctly.
2. How are garbled characters generated?
Passing parameters through a URL depends on the browser environment: the URL and the key=value pairs it contains are encoded according to the browser's address-bar rules and then passed to the backend, which decodes them.
If we do no processing of our own, then whenever JavaScript requests a URL whose parameters contain Chinese (for example, text typed into an input box), the Chinese parameters are encoded by whatever mechanism the browser applies. If the backend then decodes them with a different character set, the result is garbled.
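The mismatch described above can be reproduced without a browser or a server. The following is a minimal sketch, assuming Node.js (for its Buffer API) and the sample input 测试 (the word "test"); the variable names are illustrative only. It reads the same UTF-8 bytes once as ISO-8859-1 and once as UTF-8.

```javascript
// Sketch only: Node.js Buffer is assumed; nothing here talks to a real server.
const original = "测试"; // sample Chinese input (the word "test")

// The sending side turns the text into UTF-8 bytes: e6 b5 8b e8 af 95.
const utf8Bytes = Buffer.from(original, "utf8");

// A receiver that assumes ISO-8859-1 maps every single byte to one character.
const garbled = utf8Bytes.toString("latin1");

// A receiver that assumes UTF-8 gets the original text back.
const intact = utf8Bytes.toString("utf8");

console.log(utf8Bytes.toString("hex")); // e6b58be8af95
console.log(garbled.length);            // 6 (six unrelated characters instead of two)
console.log(intact === original);       // true
```

The bytes on the wire are identical in both cases; only the charset assumed by the reader differs.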
3. First encoding: using encodeURI() in JavaScript
When the Chinese URL parameter is encoded in JavaScript with encodeURI(), the word "测试" ("test") is converted to "%E6%B5%8B%E8%AF%95". But the problem still exists. The reason is that "%" is treated as an escape character on the way to the backend: the %XX sequences are decoded automatically before the application code sees the parameter, and that decoding does not necessarily use UTF-8. What arrives therefore does not match what encodeURI() produced, because "%" was never handled as an ordinary character.
4. Second encoding: applying encodeURI() twice
Operation: encodeURI(encodeURI("/order?name=" + name));
The URL now contains not the single-pass string "%E6%B5%8B%E8%AF%95" but the result of running encodeURI() over that string again: "%25E6%25B5%258B%25E8%25AF%2595". The second pass re-encodes each "%" that would otherwise be parsed as an escape character, turning it into the ordinary sequence "%25". At this point the front-end JavaScript has finished encoding the Chinese URL. When the parameter is passed to the backend, the automatic decoding step only turns "%25" back into "%", so the action receives the un-garbled ASCII string "%E6%B5%8B%E8%AF%95", which still represents the word "测试" that we entered.
5. How does the backend correctly parse the Chinese characters?
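To see exactly what the backend will be asked to undo, the two encoding passes can be reproduced in a few lines of JavaScript. This is a small sketch, assuming the sample value 测试 (the word "test") and the /order?name= URL from the example; the variable names are illustrative only.

```javascript
// Sketch: one pass versus two passes of encodeURI() on the same URL.
const inputName = "测试"; // sample value typed by the user

// First pass: the Chinese characters become UTF-8 percent-escapes.
const once = encodeURI("/order?name=" + inputName);

// Second pass: every "%" from the first pass becomes "%25".
const twice = encodeURI(once);

console.log(once);  // /order?name=%E6%B5%8B%E8%AF%95
console.log(twice); // /order?name=%25E6%25B5%258B%25E8%25AF%2595
```

Note that the separators /, ? and = survive both passes, because encodeURI() leaves them untouched.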
After two passes of encodeURI(), the backend still cannot get the correct text by reading the parameter directly. One more step is needed:
URLDecoder.decode("Chinese string", "UTF-8")
The URLDecoder.decode(String s, String enc) method takes two parameters: the first is the string to decode, and the second is the character encoding to use when decoding.
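The two decoding steps can be traced in JavaScript as well. This is only an illustration: decodeURIComponent() is used here as a stand-in both for the automatic decode that happens before the parameter reaches the backend and for the Java call URLDecoder.decode(..., "UTF-8"); it is not the Java API itself, and the variable names are hypothetical.

```javascript
// Sketch: tracing a double-encoded parameter value through two decode steps.
// The parameter value as sent after two encodeURI() passes.
const sent = "%25E6%25B5%258B%25E8%25AF%2595";

// Step 1: stand-in for the automatic decode on the way to the backend.
// Only ASCII comes out ("%25" becomes "%"), so the charset used here cannot garble it.
const received = decodeURIComponent(sent);

// Step 2: stand-in for URLDecoder.decode(received, "UTF-8") in the backend code.
const decodedName = decodeURIComponent(received);

console.log(received);    // %E6%B5%8B%E8%AF%95
console.log(decodedName); // 测试
```

The point of the design is that the first, uncontrolled decode only ever handles ASCII, while the charset-sensitive decode happens in code where UTF-8 is named explicitly.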
6. encodeURI, encodeURIComponent, and escape
6.1 escape() function
The escape() function encodes a string so that it can be read on all computers.
Return value: a copy of the encoded string, in which some characters are replaced with hexadecimal escape sequences.
Description: this method does not encode ASCII letters and digits, nor the following ASCII characters: @ * _ + - . /. All other characters are replaced by escape sequences. Spaces, other punctuation marks, and special characters are converted to the %XX format, where XX is the character's two-digit hexadecimal code; for example, a space becomes %20. Characters with code points above 0xFF, which includes Chinese characters, are converted to the %uXXXX format.
6.2 encodeURI() method
Converts a URI string into escaped form using the UTF-8 encoding. Characters that are not encoded by this method, besides ASCII letters and digits: - _ . ! ~ * ' ( ) ; / ? : @ & = + $ , #
6.3 encodeURIComponent() method
Converts a URI string into escaped form using the UTF-8 encoding. Compared with encodeURI(), this method encodes more characters, including the URI separators ; / ? : @ & = + $ , #. So a string that contains several parts of a URI cannot be encoded in one go this way; once the / characters are encoded, the URL no longer works.
Characters that are not encoded by this method, besides ASCII letters and digits: - _ . ! ~ * ' ( )
Therefore, for a Chinese string, if you do not need the string converted to UTF-8 (for example, when the source page and the target page use the same charset), escape is enough. If your page is GB2312 or some other encoding while the page that receives the parameter is UTF-8, you need encodeURI or encodeURIComponent.
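The differences between the three functions are easiest to see on one sample string. The following is a small sketch, assuming a made-up URL with a Chinese value and a space; the URL and variable names are illustrative only, and a runtime that still provides the legacy escape() function (such as Node.js or a browser) is assumed.

```javascript
// Sketch: the same string passed through the three encoding functions.
const sample = "/order?name=测试&tag=a b";

// escape(): keeps "/" but encodes "?", "=", "&"; Chinese becomes %uXXXX, not UTF-8.
const viaEscape = escape(sample);

// encodeURI(): keeps all URI separators; Chinese becomes UTF-8 percent-escapes.
const viaEncodeURI = encodeURI(sample);

// encodeURIComponent(): also encodes the separators, so use it on values only.
const viaComponent = encodeURIComponent(sample);
const valueOnly = "/order?name=" + encodeURIComponent("测试");

console.log(viaEscape);    // /order%3Fname%3D%u6D4B%u8BD5%26tag%3Da%20b
console.log(viaEncodeURI); // /order?name=%E6%B5%8B%E8%AF%95&tag=a%20b
console.log(viaComponent); // %2Forder%3Fname%3D%E6%B5%8B%E8%AF%95%26tag%3Da%20b
console.log(valueOnly);    // /order?name=%E6%B5%8B%E8%AF%95
```

In practice this means encodeURI() suits a whole URL, while encodeURIComponent() suits an individual parameter value that may itself contain characters such as & or =.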
7. Another approach to Chinese garbled characters in URLs
On the requesting side, the Chinese characters are encoded once with encodeURI, for example:
var url = "/ajax?name=" + encodeURI(name);
Server-side code:
name = new String(name.getBytes("ISO-8859-1"), "UTF-8");
Note: name is the string that was received, and ISO-8859-1 is the project's default character encoding. If the default is a Chinese encoding such as GBK or GB2312, this step is not needed.
Analysis: testing confirms that this approach works. It suggests that the default encoding applied before the parameter reaches the code is ISO-8859-1: even though encodeURI encodes the text as UTF-8, the received string is built by reading those bytes as ISO-8859-1 characters. ASCII characters are unaffected because they are encoded identically in ISO-8859-1 and UTF-8. Functions such as encodeURI mainly take care of escaping special characters such as %.
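The server-side repair can be mirrored in JavaScript to show why it is lossless. This is a sketch under assumptions: Node.js Buffer is used as a rough equivalent of Java's getBytes and new String, the sample value is 测试, and the variable names are hypothetical.

```javascript
// Sketch: undoing an ISO-8859-1 misread of UTF-8 bytes (Node.js Buffer assumed).
const sentBytes = Buffer.from("测试", "utf8");

// What the parameter looks like on arrival when the bytes are read as ISO-8859-1.
const misread = sentBytes.toString("latin1");

// Rough equivalent of: new String(name.getBytes("ISO-8859-1"), "UTF-8")
// ISO-8859-1 maps each byte to exactly one character, so the bytes come back unchanged.
const recoveredBytes = Buffer.from(misread, "latin1");
const repaired = recoveredBytes.toString("utf8");

console.log(recoveredBytes.equals(sentBytes)); // true
console.log(repaired);                         // 测试
```

The repair works only because ISO-8859-1 maps every byte value to exactly one character; a misread through a multi-byte charset would not in general be reversible this way.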