HTTP URL escape characters and special characters

Source: Internet
Author: User
Tags base64 urlencode in python

Space - %20

" - %22

# - %23

% - %25

& - %26

( - %28

) - %29

+ - %2B

, - %2C

/ - %2F

: - %3A

; - %3B

< - %3C

= - %3D

> - %3E

? - %3F

@ - %40

\ - %5C

| - %7C
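The table above can be reproduced with Python's standard library. This is a minimal sketch, assuming Python 3; the helper name `escape_table` is illustrative and does not come from the original article.

```python
from urllib.parse import quote

# The characters listed in the table above.
SPECIAL_CHARS = ' "#%&()+,/:;<=>?@\\|'


def escape_table(chars=SPECIAL_CHARS):
    """Map each character to its percent-encoded form (uppercase hex)."""
    # safe="" is needed because quote() leaves "/" unescaped by default.
    return {ch: quote(ch, safe="") for ch in chars}


if __name__ == "__main__":
    for ch, escaped in escape_table().items():
        print(repr(ch), "-", escaped)
```

Percent-escapes are case-insensitive, so `%2c` and `%2C` denote the same byte; `quote` emits the uppercase form.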



Passing special characters such as a plus sign in a URL

Some characters in a URL are escaped or carry a special meaning. For example, a space in a query string is encoded as a plus sign, so when a parameter value really contains a plus sign, the value received on the other end contains a space instead. The fix is to encode every parameter value before placing it in the URL.
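The plus-sign problem described above can be seen directly in Python. This is a small sketch, assuming Python 3 and a hypothetical parameter named `arg`; it shows what a form-style decoder makes of a raw plus sign and how escaping the value first avoids the loss.

```python
from urllib.parse import parse_qs, quote, urlencode

# A raw "+" in a query string is read back as a space.
raw_query = "arg=1+1"
decoded_raw = parse_qs(raw_query)["arg"][0]

# Percent-encoding the value first preserves the plus sign.
safe_query = "arg=" + quote("1+1", safe="")
decoded_safe = parse_qs(safe_query)["arg"][0]

# urlencode() builds the whole query string and escapes "+" as well.
built_query = urlencode({"arg": "1+1"})

if __name__ == "__main__":
    print(repr(decoded_raw))
    print(safe_query, "->", repr(decoded_safe))
    print(built_query)
```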


Workaround:


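Add the following function in JavaScript:

function UrlEncode(sStr) {
    // escape() leaves "+" and "/" untouched, so they are replaced explicitly.
    // escape() is deprecated; encodeURIComponent() is the modern alternative.
    return escape(sStr).replace(/\+/g, '%2B').replace(/"/g, '%22').replace(/'/g, '%27').replace(/\//g, '%2F');
}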



Then pass the string through it. For example:

var str = UrlEncode("abc+");


Or, in Java, replace each plus sign in the encoded string with %20:

dst_fname = dst_fname.replaceAll("\\+", "%20");

RFC documentation for HTTP: http://www.w3.org/Protocols/rfc2616/rfc2616.html

If the problem of transmitting Chinese text over HTTP is left unsolved, it causes many secondary problems during real-world web crawling. It is therefore particularly important to understand how Chinese text is encoded in transit.


The transfer process for Chinese text may be: Unicode in memory, then a byte encoding such as GBK, GB18030, or UTF-8, then URL encoding, and finally, possibly, Base64 encoding. So what mechanism is in charge of this conversion process, and what channel does the conversion use?

UrlEncode encodes a string for use in a URL and returns a string. For example, when you search Google for "中国人" ("Chinese people"), the URL is as follows: http://www.google.com.hk/search?q=%d6%d0%b9%fa%c8%cb&client=aff-360daohang&hl=zh-cn&ie=gb2312&newwindow=1. The %D6%D0%B9%FA%C8%CB part is the URL-encoded search term, here in GB2312 as the ie=gb2312 parameter indicates. So how does this encoding work? No ready-made tool handles it for you.
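The stages described above (Unicode in memory, a byte encoding, URL encoding, and optionally Base64) can be walked through one at a time. This is a sketch assuming Python 3; the text 中国人 is the search term whose GB2312 bytes appear in the URL above, and the variable names are illustrative.

```python
import base64
from urllib.parse import quote

text = "中国人"  # Unicode string in memory

# Stage 1: choose a byte encoding; the charset decides the bytes.
gb_bytes = text.encode("gb2312")    # 2 bytes per character here
utf8_bytes = text.encode("utf-8")   # 3 bytes per character here

# Stage 2: percent-encode those bytes for use in a URL.
gb_escaped = quote(gb_bytes)
utf8_escaped = quote(utf8_bytes)

# Stage 3 (optional): Base64 of the same bytes.
utf8_b64 = base64.b64encode(utf8_bytes).decode("ascii")

if __name__ == "__main__":
    print(gb_escaped)
    print(utf8_escaped)
    print(utf8_b64)
```

The same characters produce different escape sequences depending on the charset chosen in stage 1, which is why sender and receiver have to agree on it.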




One approach to try:

Always URL-encode the value as UTF-8 before transferring it, so that the receiving end can accept it in a known encoding. Use it together with URLEncoder, for example:

id = java.net.URLEncoder.encode(id, "utf-8");

response.sendRedirect("personalpage.jsp?user=" + id);


This approach works, but it leaves one case unconsidered: Chinese characters transmitted directly in GB2312 encoding. Can the parameters be passed without converting them to UTF-8?
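One possible way to cope with a sender that may use either charset is to try UTF-8 first and fall back to GB2312. This is only a heuristic sketch, assuming Python 3; the function name `decode_param` is hypothetical, and some GB2312 byte sequences are also valid UTF-8, so an explicit charset declaration (such as an `ie=gb2312` parameter) is more reliable when one is available.

```python
from urllib.parse import unquote


def decode_param(value, charsets=("utf-8", "gb2312")):
    """Try each charset in order and return the first clean decode."""
    for charset in charsets:
        try:
            # errors="strict" makes invalid byte sequences raise instead of
            # being silently replaced.
            return unquote(value, encoding=charset, errors="strict")
        except UnicodeDecodeError:
            continue
    raise ValueError("value does not decode in any of: %r" % (charsets,))


if __name__ == "__main__":
    print(decode_param("%D6%D0%B9%FA%C8%CB"))
    print(decode_param("%E4%B8%AD%E5%9B%BD%E4%BA%BA"))
```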



Some additional information I found:


UrlEncode encoding


Used primarily to encode a string for use in a URL; returns a string.

How to use:

1. Usage in ASP: Server.URLEncode("content"), for example:

<% Response.Write Server.URLEncode("工具网") %>

2. Usage in PHP: urlencode("content"), for example:

<?php echo urlencode("工具网"); ?>

3. Usage in JSP: URLEncoder.encode("content", "UTF-8"), for example:

<% java.net.URLEncoder.encode("工具网", "UTF-8"); %>

The approach I tried for transmitting Chinese text is to simply encode on the sending end, with no explicit decoding on the receiving end:

<%

id = java.net.URLEncoder.encode(id, "UTF-8");

response.sendRedirect("/index.jsp?error=" + id);

%>

4. Usage in JavaScript: encodeURI("content"), for example:

encodeURI("工具网");

5. Usage in Python (Python 2; in Python 3 the function is urllib.parse.quote):

import urllib2

urllib2.quote("工具网")

UrlDecode decoding

Used primarily to URL-decode a string; returns the decoded string.

1. Usage in ASP: Server.URLDecode("content"), for example:

<% Response.Write Server.URLDecode("%e5%b7%a5%e5%85%b7%e7%bd%91") %>

2. Usage in PHP: urldecode("content"), for example:

<?php echo urldecode("%e5%b7%a5%e5%85%b7%e7%bd%91"); ?>

3. Usage in JSP: URLDecoder.decode("content", "UTF-8"), for example:

<% java.net.URLDecoder.decode("%e5%b7%a5%e5%85%b7%e7%bd%91", "UTF-8"); %>

4. Usage in JavaScript, for example:

decodeURI("%e5%b7%a5%e5%85%b7%e7%bd%91");

5. Usage in Python (Python 2; in Python 3 the function is urllib.parse.unquote), for example:

import urllib2

urllib2.unquote("%e5%b7%a5%e5%85%b7%e7%bd%91")
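The Python lines above use the Python 2 module `urllib2`. For reference, a Python 3 equivalent might look like the following sketch; it uses the string 工具网, whose UTF-8 bytes are the percent-encoded value in the decoding examples above.

```python
from urllib.parse import quote, unquote

# Encode: str -> UTF-8 bytes -> percent-escapes (UTF-8 is the default).
encoded = quote("工具网")

# Decode: percent-escapes -> bytes -> str; hex digits may be lower case.
decoded = unquote("%e5%b7%a5%e5%85%b7%e7%bd%91")

if __name__ == "__main__":
    print(encoded)
    print(decoded)
```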

UrlEncode encoding and decoding with GB2312 and UTF-8

The mapping between Unicode code points and their UTF-8 encoding is:

Code point (hexadecimal)    UTF-8 encoding (binary)

--------------------------------------------

0000-007F    0xxxxxxx

0080-07FF    110xxxxx 10xxxxxx

0800-FFFF    1110xxxx 10xxxxxx 10xxxxxx (Chinese characters fall in this range)

......

--------------------------------------------
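The bit templates in the table can be applied by hand. The following is a minimal sketch, assuming Python 3 and covering only the three ranges listed above; the function name `utf8_encode` is illustrative, and the surrogate range and code points above U+FFFF are not handled.

```python
def utf8_encode(code_point):
    """Encode one code point in U+0000..U+FFFF using the table's templates."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    if code_point <= 0x7F:
        # 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0xFFFF:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    raise ValueError("code points above U+FFFF are outside this sketch")


if __name__ == "__main__":
    # U+4E2D is the character 中.
    print(utf8_encode(0x4E2D).hex())
```

Percent-encoding each resulting byte gives the sequence that appears in a UTF-8 URL.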

For example:

When you search Baidu for "中国人" ("Chinese people"), the Chinese URL parameter is converted to the hexadecimal representation of its GB2312 encoding, with each Chinese character taking 2 bytes:

http://www.baidu.com/s?wd=%D6%D0%B9%FA%C8%CB

When you search Google for "中国人", the Chinese URL parameter is converted to the hexadecimal representation of its UTF-8 encoding, with each Chinese character taking 3 bytes:

http://www.google.cn/search?client=opera&rls=en&q=%e4%b8%ad%e5%9b%bd%e4%ba%ba&sourceid=opera&ie=utf-8&oe=utf-8
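Both URLs above carry the same three characters, only in different charsets. The following sketch, assuming Python 3, parses each query string with the matching charset and counts the bytes behind each value; the variable names are illustrative.

```python
from urllib.parse import parse_qs, unquote_to_bytes, urlsplit

baidu_url = "http://www.baidu.com/s?wd=%D6%D0%B9%FA%C8%CB"
google_url = ("http://www.google.cn/search?client=opera&rls=en"
              "&q=%e4%b8%ad%e5%9b%bd%e4%ba%ba&sourceid=opera&ie=utf-8&oe=utf-8")

# Decode each query string with the charset its sender used.
baidu_wd = parse_qs(urlsplit(baidu_url).query, encoding="gb2312")["wd"][0]
google_q = parse_qs(urlsplit(google_url).query, encoding="utf-8")["q"][0]

# Count the raw bytes behind each escaped value.
gb_len = len(unquote_to_bytes("%D6%D0%B9%FA%C8%CB"))
utf8_len = len(unquote_to_bytes("%e4%b8%ad%e5%9b%bd%e4%ba%ba"))

if __name__ == "__main__":
    print(baidu_wd, gb_len)
    print(google_q, utf8_len)
```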

Objective-C: URL-encoding URLs

URLs need to be encoded when developing iOS apps for Apple's iPhone, iPad, and other devices. In a URL such as http://www.baidu.com/s?wd=中国人, the Chinese characters, special symbols such as & and %, and spaces must be escaped for the URL to be accessed properly.

Java, .NET, and JavaScript all provide corresponding URL-encoding methods. In Objective-C, you can try

- (NSString *)stringByAddingPercentEscapesUsingEncoding:(NSStringEncoding)enc;

to encode the complete URL (including request parameters). For example, consider the following code:

NSString *url = @"http://www.baidu.com/s?wd=中国人";

NSString *encodedValue = [url stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];

Because the encoding is NSUTF8StringEncoding, encodedValue becomes:

http://www.baidu.com/s?wd=%E4%B8%AD%E5%9B%BD%E4%BA%BA

As you can see, it does not escape the ?, %, and & symbols in the URL. This is expected, because it cannot tell whether an & is a separator between parameters or part of a parameter value. Instead, encode each parameter value separately, replacing ?, %, &, and similar symbols in the value with their escape sequences, and then assemble the URL.
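The advice above (encode each parameter separately, then assemble the URL) can be sketched in Python. This assumes Python 3; the parameter `note` and its value are hypothetical and were chosen only because they contain &, = and %.

```python
from urllib.parse import quote, urlencode

base = "http://www.baidu.com/s"
params = {"wd": "中国人", "note": "a&b=50%"}  # "note" is a made-up parameter

# Escape every key and value on its own, then join with "&" and "=".
query = "&".join(
    quote(key, safe="") + "=" + quote(value, safe="")
    for key, value in params.items()
)
url = base + "?" + query

# urlencode() performs the same per-parameter escaping in one call.
same_url = base + "?" + urlencode(params, quote_via=quote)

if __name__ == "__main__":
    print(url)
```

Because the & inside the value is escaped as %26, it cannot be mistaken for a separator between parameters.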

Base64 encoding method:

Base64 C# encoding function

public static string Encrypt(string pToEncrypt)
{
    // Base64 is an encoding, not encryption; the original function name is kept.
    byte[] bArray = System.Text.UnicodeEncoding.Unicode.GetBytes(pToEncrypt);
    return Convert.ToBase64String(bArray);
}


Base64 C# decoding function


public string Decrypt(string pToDecrypt)
{
    // Reverse of the encoding step: Base64 text -> UTF-16 bytes -> string.
    byte[] mingwen = Convert.FromBase64String(pToDecrypt);
    string str = System.Text.UnicodeEncoding.Unicode.GetString(mingwen);
    return str;
}
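For comparison, the same two C# steps can be sketched in Python 3. The `Unicode` encoding in .NET is UTF-16 little-endian, so the sketch uses the `utf-16-le` codec; the function names are illustrative and not part of the original code.

```python
import base64


def encode_utf16_base64(text):
    """Base64 of the UTF-16LE bytes of text, with no byte order mark."""
    return base64.b64encode(text.encode("utf-16-le")).decode("ascii")


def decode_utf16_base64(encoded):
    """Inverse of encode_utf16_base64."""
    return base64.b64decode(encoded).decode("utf-16-le")


if __name__ == "__main__":
    token = encode_utf16_base64("abc")
    print(token)
    print(decode_utf16_base64(token))
```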



If the encoded string contains "/", "+", or "=", those characters change during web transmission (including in action requests), as follows:

"/" becomes "%2F" on the client

"+" becomes " " (a space)

"=" becomes "%3D"

So the client should restore the correct Base64 string before decoding it. The following is the code in ASP:

' vbTextCompare makes the match case-insensitive, so %2f and %2F are both handled.
str = Replace(str, " ", "+")
str = Replace(str, "%2f", "/", 1, -1, vbTextCompare)
str = Replace(str, "%3d", "=", 1, -1, vbTextCompare)
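The effect described above, and two ways around it, can be sketched in Python 3. The two input bytes are chosen only because their Base64 form contains all three troublesome characters; the URL-safe alphabet shown at the end is an alternative to repairing the string afterwards, not something the original text uses.

```python
import base64
from urllib.parse import quote, unquote_plus

raw = b"\xfb\xff"
standard = base64.b64encode(raw).decode("ascii")  # contains "+", "/" and "="

# Option 1: percent-encode the Base64 text before putting it in a URL.
escaped = quote(standard, safe="")

# What a form-style decoder makes of the unescaped text: "+" turns into " ".
mangled = unquote_plus(standard)

# The repair step from the text above: turn the space back into "+".
repaired = mangled.replace(" ", "+")

# Option 2: the URL-safe alphabet uses "-" and "_" instead of "+" and "/".
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

if __name__ == "__main__":
    print(standard, escaped, repr(mangled), repaired, url_safe)
```

The repair step is only safe when the original value cannot contain a real space, which holds for Base64 text.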







