Complete parameter encoding solution

This article is from: http://www.cnblogs.com/zhangziqiu/archive/2009/01/20/Encoding.html

Parameter Encoding Specification

I. Summary

We often need to pass Chinese data between pages, yet text encoding is a frequent source of confusion: sometimes it is unclear whether a problem comes from the browser's encoding or the server's. This article analyzes how data is encoded when transmitted over the Internet and proposes a complete, easy-to-use solution.

II. Principles

Avoid passing Chinese characters directly in GET or POST parameters. Chinese parameters must be encoded before they are passed, and the server must decode them using the same encoding format.

 

III. Common Misconceptions

1. Many programmers think that Chinese characters can be passed in the URL.
A URL never actually carries raw Chinese parameters. If we type a URL whose query string contains Chinese characters into the browser's address bar, it looks as though the URL contains Chinese characters. In fact, when you press Enter, the browser automatically encodes the Chinese characters and only then sends them to the server.

2. When garbled characters appear, the first thing to check is the server program's encoding format.
Many people believe that a URL can carry Chinese characters and are unaware that the browser encodes them automatically, so they assume the problem must lie on the server side. In fact, even if we work out which encoding format the server should use, we should not casually change the server's default encoding format.

3. Encode the parameters before passing them, and decode the parameters obtained from the Request object.
Many programmers think that UrlEncode should be used to encode parameters when sending them and UrlDecode to decode them when receiving them. This is a common error. When you use the default Request.QueryString and Request.Form, the values are already decoded automatically, using the default encoding format configured on the server.
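The risk behind the third misconception can be illustrated with a small JavaScript sketch. It only simulates the situation: the first decodeURIComponent call stands in for the automatic decoding that the server framework performs, and the sample value is a hypothetical one chosen because it contains a percent sign.

```javascript
// The literal text the user wants to send. It contains the characters "%25".
const original = "50%25 off";

// The client encodes the value once before sending it.
const encoded = encodeURIComponent(original); // "50%2525%20off"

// Stand-in for the automatic decoding done by the server framework.
const decodedOnce = decodeURIComponent(encoded);

// A second, manual decode corrupts the value: "%25" collapses to "%".
const decodedTwice = decodeURIComponent(decodedOnce);

console.log(decodedOnce);  // 50%25 off
console.log(decodedTwice); // 50% off
```

After one decode the value matches what was sent; the extra manual decode silently changes it.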

IV. Reasons

When Chinese characters are passed, the format used for automatic encoding and decoding depends on the settings of both the browser and the server.

In tests that sent Chinese parameters via GET from Firefox 3 and IE6, Firefox encoded the Chinese parameters as UTF-8 by default, whereas IE6 still encoded them as GB2312 automatically, even with "Always send URLs as UTF-8" enabled in its advanced settings.
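To make the difference concrete, here is a minimal sketch of the two percent-encoded forms of the characters "中文". The byte values are the standard UTF-8 and GB2312 encodings of those characters; the helper name tryDecodeAsUtf8 is made up for this illustration. JavaScript's decodeURIComponent always assumes UTF-8, so it plays the role of a receiver that expects UTF-8.

```javascript
// "中文" percent-encoded from its UTF-8 bytes.
const utf8Encoded = "%E4%B8%AD%E6%96%87";
// The same two characters percent-encoded from their GB2312 bytes.
const gb2312Encoded = "%D6%D0%CE%C4";

// Decode as UTF-8; return null when the bytes are not valid UTF-8.
function tryDecodeAsUtf8(encoded) {
  try {
    return decodeURIComponent(encoded);
  } catch (e) {
    return null; // decodeURIComponent throws URIError on malformed UTF-8
  }
}

console.log(tryDecodeAsUtf8(utf8Encoded));   // 中文
console.log(tryDecodeAsUtf8(gb2312Encoded)); // null
```

The same text therefore arrives as different byte sequences depending on the sender, and a receiver that assumes the wrong format cannot recover it.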

We are free to control the decoding format on the server side, although this usually means changing the server configuration. For example, in an ASP.NET program you can set the server-side request and response encodings in Web.config:

<globalization culture="zh-CN" uiCulture="zh-CN" requestEncoding="utf-8" responseEncoding="gb2312" />

However, we cannot control browser behavior. Users may use different browsers.

V. Solutions

1. Unify the default encoding format

(1) Set the server-side encoding format to UTF-8.

(2) Encode every parameter that is passed. On the server (C#), use the Server.UrlEncode method; on the client (JavaScript), use the encodeURIComponent method.

Note:

The client-side JavaScript function encodeURIComponent can only produce UTF-8. You therefore need to set both the request and the response encoding on the server side to UTF-8.

This solution is simple and suits most scenarios. Its drawback is that if some partners must pass parameters in another encoding format, the server side may receive garbled characters.
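As a rough illustration of step (2) on the client, the sketch below encodes a single parameter value with encodeURIComponent before appending it to a URL. The host, path, parameter name, and sample value are hypothetical; the URL class is used only to show how a parser would interpret each string.

```javascript
// A value containing URL-significant characters as well as Chinese text.
const keyword = "C#&中文";

// Concatenating the raw value lets "#" and "&" change the URL's structure.
const unsafeUrl = "http://localhost/search?q=" + keyword;

// Encoding the value first keeps it inside the q parameter.
const safeUrl = "http://localhost/search?q=" + encodeURIComponent(keyword);

console.log(safeUrl);
// http://localhost/search?q=C%23%26%E4%B8%AD%E6%96%87

console.log(new URL(unsafeUrl).searchParams.get("q")); // C
console.log(new URL(safeUrl).searchParams.get("q"));   // C#&中文
```

The encoded form is always UTF-8, which is why the server side must be configured for UTF-8 under this solution.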

2. Specify the encoding format through an encoding parameter

To address the problems that a single unified encoding format can cause, we use an "encoding" parameter to state the encoding format explicitly. The encoding parameter must be passed in every request, whether GET or POST.
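A minimal sketch of how a JavaScript client might attach such a parameter, using the standard URL and URLSearchParams APIs (which serialize values as UTF-8). The function name withEncodingParam, the base URL, and the parameter name q are assumptions made for this example and are not part of the original article.

```javascript
// Build a request URL from a base URL and a plain object of parameters,
// always adding encoding=UTF-8 so the server knows how to decode the values.
function withEncodingParam(baseUrl, params) {
  const url = new URL(baseUrl);
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, value);
  }
  url.searchParams.set("encoding", "UTF-8");
  return url.href;
}

console.log(withEncodingParam("http://localhost/search", { q: "中文" }));
// http://localhost/search?q=%E4%B8%AD%E6%96%87&encoding=UTF-8
```

The value of the encoding parameter itself is plain ASCII, so the server can read it reliably before it knows how to decode the other parameters.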

(1) On a JavaScript client, still use the encodeURIComponent method for encoding, and set the value of the encoding parameter to "UTF-8".

(2) For parameters that reach the server in another encoding format, such as GB2312, we cannot use the default Request.Form or Request.QueryString, because the server-side encoding format may be set to UTF-8 and both collections are decoded automatically with the encoding format configured on the server. Instead, use the following method to process the request and obtain the parameters:

 

// Required namespaces: System, System.Collections.Specialized, System.IO, System.Text, System.Web

/// <summary>
/// Returns the request parameter collection decoded with the specified encoding format. Ziqiu.Zhang 2009.1.19
/// </summary>
/// <param name="request">The HttpRequest object of the current request</param>
/// <param name="encode">Encoding format string</param>
/// <returns>A NameValueCollection whose keys are parameter names and whose values are parameter values</returns>
public static NameValueCollection GetRequestParameters(HttpRequest request, string encode)
{
    NameValueCollection result = null;
    Encoding destEncode = null;

    // Obtain the Encoding object for the specified encoding format.
    if (!String.IsNullOrEmpty(encode))
    {
        try
        {
            destEncode = Encoding.GetEncoding(encode);
        }
        catch
        {
            // If the specified encoding format cannot be resolved, set it to null.
            destEncode = null;
        }
    }

    // Obtain the request parameters according to the HTTP method.
    // If there is no Encoding object, use the server's default encoding.
    if (request.HttpMethod == "POST")
    {
        if (null != destEncode)
        {
            Stream resStream = request.InputStream;
            byte[] fileContent = new byte[resStream.Length];
            resStream.Read(fileContent, 0, fileContent.Length);
            string postQuery = destEncode.GetString(fileContent);
            result = HttpUtility.ParseQueryString(postQuery, destEncode);
        }
        else
        {
            result = request.Form;
        }
    }
    else
    {
        if (null != destEncode)
        {
            result = HttpUtility.ParseQueryString(request.Url.Query, destEncode);
        }
        else
        {
            result = request.QueryString;
        }
    }

    // Return the result.
    return result;
}
 
 

With the method above, we can read parameters in the specified encoding format for both GET and POST requests. If writing such a method seems like too much trouble, reread the third misconception in Section III.

This method returns a NameValueCollection object. To determine whether a parameter exists, you cannot check whether the key exists; instead, get the value by its key and then check whether the value is null (which differs somewhat from a List):

// Obtain the parameter. Assume that paramList is a NameValueCollection object.
string p1 = paramList["P1"];
// Determine whether this parameter exists. If not, p1 is null.
if (!String.IsNullOrEmpty(p1))
{
    ...
}

 

In addition, if the encoding parameter is not passed, or the passed string cannot be converted to a strongly typed Encoding object, the server's default encoding format is used (that is, the parameters are read directly from the QueryString and Form properties of the Request object).
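The same idea can be sketched in JavaScript for readers who want to experiment outside ASP.NET. This is not a port of the C# method, only an approximation under stated assumptions: the names percentDecodeToBytes and parseQuery are invented here, the raw query is assumed to contain only ASCII characters (everything else percent-encoded), and support for non-UTF-8 labels such as "gb2312" depends on the runtime's TextDecoder.

```javascript
// Turn a percent-encoded token into raw bytes without assuming any charset.
function percentDecodeToBytes(token) {
  const bytes = [];
  for (let i = 0; i < token.length; i++) {
    const hex = token.slice(i + 1, i + 3);
    if (token[i] === "%" && /^[0-9a-fA-F]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 2;
    } else if (token[i] === "+") {
      bytes.push(0x20); // "+" means a space in form encoding
    } else {
      bytes.push(token.charCodeAt(i) & 0xff); // assumes an ASCII character
    }
  }
  return Uint8Array.from(bytes);
}

// Parse a raw query string, decoding names and values with the given encoding.
// An unknown or missing encoding name falls back to UTF-8 (the assumed default).
function parseQuery(rawQuery, encodingName) {
  let decoder;
  try {
    decoder = new TextDecoder(encodingName || "utf-8");
  } catch (e) {
    decoder = new TextDecoder("utf-8");
  }
  const result = new Map();
  for (const pair of rawQuery.replace(/^\?/, "").split("&")) {
    if (pair === "") continue;
    const idx = pair.indexOf("=");
    const rawName = idx === -1 ? pair : pair.slice(0, idx);
    const rawValue = idx === -1 ? "" : pair.slice(idx + 1);
    result.set(
      decoder.decode(percentDecodeToBytes(rawName)),
      decoder.decode(percentDecodeToBytes(rawValue))
    );
  }
  return result;
}

const params = parseQuery("?a=%E4%B8%AD%E6%96%87&encoding=UTF-8", "UTF-8");
console.log(params.get("a")); // 中文
```

As with the NameValueCollection above, a missing parameter is detected by reading the value: params.get("missing") returns undefined.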

VI. JavaScript Encoding Methods

The sender of a request is called the client. We often need to use JavaScript to encode Chinese parameters on the client. The following JavaScript functions are related to encoding:

Function name: escape()
Function description: Encodes a string so that it can be read on all computers.
Explanation: This method does not encode ASCII letters and digits, nor the following ASCII punctuation marks: @ * _ + - . / All other characters are replaced by escape sequences.
[Deprecated] Use encodeURI() or encodeURIComponent() instead.

Function name: unescape()
Function description: Decodes a string encoded by escape().
Explanation: The function finds character sequences of the form %XX and %uXXXX (where X is a hexadecimal digit) and replaces them with the Unicode characters \u00XX and \uXXXX.
[Deprecated] Use decodeURI() or decodeURIComponent() instead.

Function name: encodeURI()
Function description: Encodes a string as a URI.
Explanation: This method does not encode ASCII letters and digits, nor the following ASCII punctuation marks: - _ . ! ~ * ' ( )
Because this method is meant to encode a complete URI, it also leaves unescaped the following ASCII punctuation marks, which have special meaning in URIs: ; / ? : @ & = + $ , #
[Tip] If a URI parameter contains characters that must not be passed through as they are, use encodeURIComponent() to encode each parameter separately.

Function name: decodeURI()
Function description: Decodes a URI encoded by encodeURI().

Function name: encodeURIComponent()
Function description: Encodes a string as a URI component.
Explanation: This method does not encode ASCII letters and digits, nor the following ASCII punctuation marks: - _ . ! ~ * ' ( )
All other characters, including the punctuation marks used to separate URI components (; / ? : @ & = + $ , #), are replaced by one or more hexadecimal escape sequences.
[Tip] This method also encodes the special characters in a URI.

Function name: decodeURIComponent()
Function description: Decodes a URI component encoded by encodeURIComponent().

escape and unescape are no longer recommended as of the ECMAScript v3 standard; encodeURI and encodeURIComponent should be used instead. For a URI (a URL is also a kind of URI), if we want to send a complete URL as the request and the URL contains Chinese characters, we should use encodeURI. To encode an individual parameter, use encodeURIComponent.
The following examples illustrate the differences between the two methods:

document.write(encodeURIComponent("http://www.w3school.com.cn") + "<br />")
document.write(encodeURI("http://www.w3school.com.cn") + "<br />")

Result:

http%3A%2F%2Fwww.w3school.com.cn
http://www.w3school.com.cn
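The difference becomes clearer when the URL contains Chinese characters and a query string. The following sketch uses a hypothetical URL and can be run in a browser console or Node.js; it also shows the non-standard %uXXXX form produced by the deprecated escape().

```javascript
// A hypothetical URL with a Chinese parameter value and a second parameter.
const url = "http://localhost/page?name=中文&x=1";

// encodeURI keeps the URI structure (: / ? & =) and encodes only the Chinese text.
const asWholeUri = encodeURI(url);
console.log(asWholeUri);
// http://localhost/page?name=%E4%B8%AD%E6%96%87&x=1

// encodeURIComponent treats the whole string as one value and encodes the separators too.
const asComponent = encodeURIComponent(url);
console.log(asComponent);
// http%3A%2F%2Flocalhost%2Fpage%3Fname%3D%E4%B8%AD%E6%96%87%26x%3D1

// The deprecated escape() emits %uXXXX sequences, which are not valid URI escapes.
const legacy = escape("中文");
console.log(legacy); // %u4E2D%u6587

// Each decoder reverses its matching encoder.
console.log(decodeURI(asWholeUri) === url);           // true
console.log(decodeURIComponent(asComponent) === url); // true
```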

VII. Automatic Encoding by the Browser

GET request

For GET requests, different browsers use different encoding formats when they automatically encode Chinese parameters.

For example, Firefox/3.0.5 uses UTF-8, while IE6 uses GB2312.

POST request

For POST requests, the name/value pairs in the form are sent to the server in the request body. In this case, the browser encodes the parameters according to the Content-Type of the web page (for example "text/html; charset=GBK") and then sends them to the server. For example, add the following to the HTML code:
<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />

Firefox/3.0.5 encodes the Chinese parameters of a POST request using the encoding format set in charset.

IE6 does not work.

The experiment shows that the default encoding format of the client browser cannot be relied on. Therefore, we need to encode parameters manually when passing Chinese characters.
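For POST requests sent from script, manual encoding might look like the following sketch. It only builds the pieces of the request and does not send anything; the function name buildPostRequest and the parameter names are assumptions made for illustration, and the resulting object could be handed to whatever request API the page uses.

```javascript
// Build a form-style POST request whose body is explicitly UTF-8 percent-encoded,
// so the result does not depend on the page charset or on browser defaults.
function buildPostRequest(params) {
  const body = Object.entries(params)
    .map(([name, value]) => encodeURIComponent(name) + "=" + encodeURIComponent(value))
    .join("&");
  return {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8" },
    body: body,
  };
}

const request = buildPostRequest({ name: "中文", encoding: "UTF-8" });
console.log(request.body);
// name=%E4%B8%AD%E6%96%87&encoding=UTF-8
```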

VIII. Summary

The purpose of this article is to remind web programmers to pay attention to the browser's automatic encoding. Applying the solutions described here in a project avoids the garbled-text problems caused by passing Chinese parameters. I reorganized this article after reading "cnblogs blog typographical skills" on yjinglee's blog.
