URL encoding and decoding

Source: Internet
Author: User

In a project I ran into a problem: parameters passed via AJAX arrived garbled when the back end read them. I am recording the problem and its solutions here.

This article covers:
    • Why encoding is needed
    • How to encode
    • Solutions to the problems that actually came up
1. Why do I need to encode?
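URLs can only be sent over the Internet using the ASCII character set. In other words, a URL may contain only English letters, Arabic numerals, and certain punctuation marks; other characters and symbols are not allowed.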

This means that if a URL contains Chinese characters, they must be encoded before use.

The trouble is that the standards body did not prescribe a specific encoding method and instead left the decision to the application (the browser).

As a result, "URL encoding" became a confusing area.

1.1 Browser Encoding for Chinese

Chrome and Firefox behave the same way. For example, the UTF-8 encodings of "文" and "章" are "E6 96 87" and "E7 AB A0",

so the displayed "%e6%96%87%e7%ab%a0" is simply those bytes in order, each prefixed with %.

Edge and IE behave the same as each other, but I could not work out which encoding method they use. Pointers from anyone who knows are welcome.
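
The byte values quoted above can be checked directly in JavaScript. This is a minimal sketch, assuming a runtime that provides TextEncoder (modern browsers and Node.js); the helper name utf8Hex is hypothetical and exists only for illustration.

```javascript
// Hypothetical helper: list the UTF-8 bytes of a string as uppercase hex.
function utf8Hex(text) {
    const bytes = new TextEncoder().encode(text); // TextEncoder always produces UTF-8
    return Array.from(bytes, b => b.toString(16).toUpperCase().padStart(2, "0")).join(" ");
}

console.log(utf8Hex("文"));               // E6 96 87
console.log(utf8Hex("章"));               // E7 AB A0
console.log(encodeURIComponent("文章"));  // %E6%96%87%E7%AB%A0
```

JavaScript emits uppercase hex digits, while the browser address bar example uses lowercase; the hex digits in percent-encoding are case-insensitive, so both forms denote the same bytes.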

1.2 Reasons why encoding is needed:

Have you ever wondered what happens when the value in key=value itself contains characters such as ? or =? Have you considered that different operating systems, browsers, and web character sets (charset) might affect your values?

If you have, then you already know why encoding is needed.
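
The ambiguity described above can be sketched in a few lines. The sample value is hypothetical, and URLSearchParams is used here only as a stand-in for whatever query-string parser the server applies; real servers may differ in details.

```javascript
// A value that itself contains "&" and "=" (hypothetical sample data).
const value = "a=1&b=2";

// Unencoded: the parser sees two parameters instead of one.
const broken = new URLSearchParams("key=" + value);
console.log(broken.get("key")); // a=1
console.log(broken.get("b"));   // 2

// Encoded: the delimiters inside the value are escaped, so it survives intact.
const safe = new URLSearchParams("key=" + encodeURIComponent(value));
console.log(safe.get("key"));   // a=1&b=2
console.log(safe.get("b"));     // null
```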

2. How do I encode?
URL encoding is also commonly called percent-encoding, because the scheme is very simple: a percent sign (%) followed by two hexadecimal digits (0123456789abcdef) represents one byte.

For ASCII characters: the byte for the letter a in ASCII is 0x61, so its URL encoding is %61, and the string abc is encoded as %61%62%63.

For non-ASCII characters, the RFC recommends encoding them with UTF-8 to obtain the corresponding bytes and then percent-encoding each byte. For example, "中文" in UTF-8 yields the bytes 0xE4 0xB8 0xAD 0xE6 0x96 0x87, which after URL encoding become "%e4%b8%ad%e6%96%87".

Encode the URL with JavaScript first and only then submit it to the server; do not give the browser a chance to interfere. This guarantees that the client uses a single encoding method when sending requests to the server.
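
The byte-to-%XX mapping described in this section can be written out by hand. The following is only an illustrative sketch: the function name percentEncodeAllBytes is hypothetical, and unlike the built-in encoders it also encodes letters and digits, purely so that the %61 example above can be reproduced.

```javascript
// Hypothetical helper: percent-encode every UTF-8 byte of a string.
// Real encoders leave unreserved characters such as letters alone.
function percentEncodeAllBytes(text) {
    const bytes = new TextEncoder().encode(text); // UTF-8 bytes
    let out = "";
    for (const b of bytes) {
        out += "%" + b.toString(16).padStart(2, "0"); // one byte -> %xx
    }
    return out;
}

console.log(percentEncodeAllBytes("a"));    // %61
console.log(percentEncodeAllBytes("abc"));  // %61%62%63
console.log(percentEncodeAllBytes("中文")); // %e4%b8%ad%e6%96%87

// The built-in decoder reverses the mapping.
console.log(decodeURIComponent(percentEncodeAllBytes("中文"))); // 中文
```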
3. Solving the actual problem
First, a look at JavaScript's three encoding functions: escape, encodeURI, and encodeURIComponent.

3.1. escape function:

This was the first encoding function in JavaScript. Its use is discouraged, because it does not follow the URL encoding rules described in section 2 ("How do I encode?").

What it actually does is:

Return the Unicode code value of each character, so that the string can be read on all computers.

The specific rules are:

Spaces, most punctuation, and other characters outside ASCII letters and digits are replaced with %XX encoding; for example, a space becomes %20. Characters with a code value greater than 255 are stored in %uXXXX format.

So if you ever see %u in an encoded string, it was produced by the escape function.

Looking at an example makes the rules clear.
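
A few calls illustrate these rules. This is a small sketch with hypothetical sample strings; escape and unescape are legacy globals that browsers and Node.js still provide for compatibility.

```javascript
console.log(escape("abc 123"));   // abc%20123      (space -> %20)
console.log(escape("ä"));         // %E4            (code 228, below 256 -> %XX)
console.log(escape("文章"));      // %u6587%u7AE0   (above 255 -> %uXXXX)
console.log(escape("@*_+-./"));   // @*_+-./        (these symbols are left untouched)

// unescape reverses the transformation.
console.log(unescape("%u6587%u7AE0")); // 文章
```

Note that %u6587 is the Unicode code point of "文", not its UTF-8 bytes (E6 96 87), which is why the output of escape does not match standard URL encoding.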

Used in the project:

Front end:

    function handlerAddress() {
        $.ajax({
            type: "GET",
            // using the JS escape method (the original value was a Chinese address)
            url: "handler/handler.ashx?address=" + escape("Da Tun Lu Dong, Chaoyang District"),
            contentType: "application/json; charset=utf-8",
            success: function (data) {
                // TODO: success handler
            },
            error: function (XMLHttpRequest, textStatus, errorThrown) {
                // TODO: failure handler
            }
        });
    }

Back end:

QueryString decodes the value automatically, so there is no need to write any decoding code.

One more thing to note:

escape() does not encode "+". However, when a web page submits a form, any spaces are converted to "+" characters, and the server turns "+" back into a space when it processes the data. So be careful when using it.
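
The "+" pitfall can be reproduced as follows. This is a sketch under one assumption: URLSearchParams stands in for a server that applies form-style decoding (where "+" means space); the sample input is hypothetical.

```javascript
const input = "1+1=2";

// escape() leaves "+" alone (it does encode "=").
const viaEscape = "q=" + escape(input);                 // q=1+1%3D2
// encodeURIComponent() encodes "+" as %2B.
const viaComponent = "q=" + encodeURIComponent(input);  // q=1%2B1%3D2

// Form-style decoding turns the bare "+" into a space.
console.log(new URLSearchParams(viaEscape).get("q"));    // 1 1=2
console.log(new URLSearchParams(viaComponent).get("q")); // 1+1=2
```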

3.2. encodeURI function

This is the function that is really meant for encoding URLs in JavaScript.

It follows the rules described in section 2 above and uses UTF-8.
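
As a short illustration of encodeURI on a whole URL (the URL itself is hypothetical and used only as sample input):

```javascript
const url = "http://www.example.com/my page.html?address=文章";

// Non-ASCII characters become UTF-8 percent-escapes; URL delimiters stay intact.
console.log(encodeURI(url));
// http://www.example.com/my%20page.html?address=%E6%96%87%E7%AB%A0

// decodeURI reverses it.
console.log(decodeURI(encodeURI(url)) === url); // true
```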

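With this method the value came out garbled. Many people have asked about this, and the usual answer is to use escape instead. But does that really solve the problem?

If I want to use jQuery's serialize() method to collect the form values and serialize them (standard URL encoding) before sending them to the back end, escape is not convenient to use.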

Solving the garbled text problem:

The cause of the garbling is that my Web.config file contains this setting:

<globalization requestEncoding="gb2312" responseEncoding="gb2312" />

Solution 1: Remove this setting or change it to utf-8 (the risk of this approach goes without saying, especially when the project is almost finished).

Solution 2: Use the AJAX POST method, or keep GET but pass the value through the method's data parameter. jQuery URL-encodes data values once more, so the automatic decoding on the back end only undoes that outer layer, and the value received is still percent-encoded.

Front end:

    $.ajax({
        type: "GET",
        // the JS encodeURI method is used (the original value was a Chinese address)
        url: "handler/handler.ashx",
        // passed as the data parameter
        data: { address: encodeURI("Da Tun Lu Dong, Chaoyang District") },
        contentType: "application/json; charset=utf-8",
        success: function (data) {
            // TODO: success handler
        },
        error: function (XMLHttpRequest, textStatus, errorThrown) {
            // TODO: failure handler
        }
    });

Back end: manual decoding is required

    string ad = HttpUtility.UrlDecode(context.Request["address"]);

HttpUtility.UrlDecode and Server.UrlDecode are different: HttpUtility.UrlDecode has an overload that lets you specify the encoding.

For example:

    string adsx = HttpUtility.UrlDecode(context.Request.QueryString["address"], System.Text.Encoding.UTF8);
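
Why Solution 2 needs a manual decode can be sketched in JavaScript. The sketch assumes that jQuery serializes data values with encodeURIComponent and that the server's automatic decoding removes exactly one layer of percent-encoding; the sample string is hypothetical.

```javascript
const original = "文章";

const once = encodeURI(original);        // what the page code produces
const twice = encodeURIComponent(once);  // assumed: what jQuery does to data values

console.log(once);   // %E6%96%87%E7%AB%A0
console.log(twice);  // %25E6%2596%2587%25E7%25AB%25A0

// Automatic decoding on the server removes only the outer layer.
const afterServerDecode = decodeURIComponent(twice);
console.log(afterServerDecode === once);  // true

// The manual UTF-8 decode then recovers the text.
console.log(decodeURIComponent(afterServerDecode)); // 文章
```

After the first decode the value consists only of ASCII characters, which is presumably why the gb2312 request encoding cannot damage it.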

Solution 3: Get the raw encoded data and decode it yourself

Inspecting the Request object shows that context.Request.Url.Query holds the data before any decoding, which is exactly what we need.

Code:

    string address = HttpUtility.ParseQueryString(context.Request.Url.Query, Encoding.UTF8)["address"];
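
The same idea, taking the raw query string and decoding it as UTF-8 yourself, can be sketched in JavaScript. This is an analogue for illustration, not the C# code above; the helper name getRawParam and the sample query are hypothetical.

```javascript
// Hypothetical helper: read one parameter from a raw (still encoded) query string.
function getRawParam(rawQuery, name) {
    const query = rawQuery.startsWith("?") ? rawQuery.slice(1) : rawQuery;
    for (const pair of query.split("&")) {
        const eq = pair.indexOf("=");
        const key = eq === -1 ? pair : pair.slice(0, eq);
        const val = eq === -1 ? "" : pair.slice(eq + 1);
        if (decodeURIComponent(key) === name) {
            // decodeURIComponent always interprets %XX sequences as UTF-8.
            return decodeURIComponent(val.replace(/\+/g, " "));
        }
    }
    return null; // parameter not present
}

const raw = "?address=%E6%96%87%E7%AB%A0&id=7";
console.log(getRawParam(raw, "address")); // 文章
console.log(getRawParam(raw, "id"));      // 7
console.log(getRawParam(raw, "missing")); // null
```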

Solution 4 (for discussion): Take the data that QueryString has already decoded, re-encode it using the encoding it was decoded with, and then decode it as UTF-8. This method is somewhat problematic: the last character comes out garbled, and I have not found the reason.

After re-encoding, the data is not the same encoded value that the browser originally sent: the last byte should be %9c, but it is now %3f. Since %3f is the encoding of "?", this suggests that the final byte could not be decoded as gb2312 and was replaced with a question mark.

3.3.encodeURIComponent function

The difference from encodeURI() is that it is used to encode an individual part of a URL, not the entire URL.

Therefore the symbols "; / ? : @ & = + $ , #", which encodeURI() leaves unencoded, are all encoded by encodeURIComponent().

Otherwise the encoding rules are the same as for encodeURI. For example, encodeURI does not encode ? and @, while encodeURIComponent does.

As its name suggests, encodeURIComponent encodes a single component of a URI; it cannot be applied to a whole URI.
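
The difference between the two functions is easy to see side by side. In this sketch the address value is hypothetical; the handler path is the one used earlier in the article.

```javascript
const reserved = ";/?:@&=+$,#";
console.log(encodeURI(reserved));          // ;/?:@&=+$,#
console.log(encodeURIComponent(reserved)); // %3B%2F%3F%3A%40%26%3D%2B%24%2C%23

// Typical use: encode each value on its own, then assemble the URL.
const address = "A&B=文";
const requestUrl = "handler/handler.ashx?address=" + encodeURIComponent(address);
console.log(requestUrl); // handler/handler.ashx?address=A%26B%3D%E6%96%87
```

Applying encodeURIComponent to a complete URL would also encode the ":" and "/" of the scheme and path, which is why it is reserved for individual components.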

This is my first article. If anything is wrong or unclear, feel free to point it out. Criticize away; I am thick-skinned anyway.

Reference: Nanyi-About URL encoding
