Space - %20
" - %22
# - %23
% - %25
& - %26
( - %28
) - %29
+ - %2B
, - %2C
/ - %2F
: - %3A
; - %3B
< - %3C
= - %3D
> - %3E
? - %3F
@ - %40
\ - %5C
| - %7C
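The table above can be reproduced with a short Python sketch. It assumes Python 3's urllib.parse.quote; the names SPECIAL_CHARS and percent_table are made up for this illustration.

```python
from urllib.parse import quote

# The characters listed in the table; the backslash is escaped for Python.
SPECIAL_CHARS = ' "#%&()+,/:;<=>?@\\|'

def percent_table(chars):
    """Return (character, percent-encoding) pairs for the given characters."""
    # safe="" makes quote() escape "/" as well, which it leaves alone by default.
    return [(ch, quote(ch, safe="")) for ch in chars]

for ch, encoded in percent_table(SPECIAL_CHARS):
    print(repr(ch), "-", encoded)
```

Hexadecimal digits in percent-encodings are case-insensitive, so %2b and %2B denote the same byte.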
Passing special characters such as a plus sign in a URL
Some characters in a URL have a special meaning. For example, a plus sign in a query string is decoded as a space, so when the argument you send is a plus sign, the value received is a space. The solution: whenever you pass a parameter through a URL, encode it first.
Workaround:
Add this function in JavaScript:
function URLencode(sStr) {
  // escape() leaves "+" and "/" unencoded, so replace them explicitly
  return escape(sStr).replace(/\+/g, '%2B').replace(/"/g, '%22').replace(/'/g, '%27').replace(/\//g, '%2F');
}
Then pass the string through it. For example:
var str = URLencode("abc+");
Or, in Java, replace each plus sign with %20:
dst_fname = dst_fname.replaceAll("\\+", "%20");
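The plus-sign problem described above can be seen in a minimal Python 3 sketch. It uses only urllib.parse; the sample value "abc+ def" is an assumption chosen for illustration.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

value = "abc+ def"

# quote() percent-encodes "+" and writes a space as %20.
encoded = quote(value, safe="")
# quote_plus() follows the form convention: a space becomes "+",
# and a literal "+" becomes %2B.
form_encoded = quote_plus(value)

print(encoded)        # abc%2B%20def
print(form_encoded)   # abc%2B+def

# A raw "+" that was never encoded is read back as a space
# by form-style decoding, which is the bug described above.
print(repr(unquote_plus("abc+")))   # 'abc '
print(repr(unquote("abc%2B")))      # 'abc+'
```

Either encoded form survives the round trip; only the unencoded plus sign is lost.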
RFC documentation for HTTP: http://www.w3.org/Protocols/rfc2616/rfc2616.html
If the problem of transmitting Chinese text over HTTP is left unsolved, it causes many secondary problems in real-world web crawling. It is therefore important to understand how Chinese text is encoded for transmission.
The path Chinese text takes may be: Unicode in memory; then an encoding stage (GBK, GB18030, or UTF-8); then URL encoding; and finally, possibly, Base64 encoding. So what mechanism is in charge of this conversion, and through what channel does it happen?
UrlEncode is an encoding function: it encodes a string for use in a URL and returns a string. For example, when you search Google for "中国人" ("Chinese people"), the URL is as follows:
http://www.google.com.hk/search?q=%d6%d0%b9%fa%c8%cb&client=aff-360daohang&hl=zh-cn&ie=gb2312&newwindow=1
The %D6%D0%B9%FA%C8%CB part is the URL-encoded query. So how is this encoding produced, and are there tools for handling it?
One approach to try:
Always URL-encode the data as UTF-8 before transferring it. For example:
id = java.net.URLEncoder.encode(id, "utf-8");
The receiving side can then accept the value, provided it decodes with the same charset that was passed to URLEncoder.
id = java.net.URLEncoder.encode(id, "utf-8");
response.sendRedirect("personalpage.jsp?user=" + id);
This attempt succeeded, but one case is still not covered: pages that use the gb2312 encoding and transmit Chinese characters directly. Can the parameters be passed without UTF-8 encoding?
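Why both sides must agree on the charset can be shown with a small Python 3 sketch. The helper names encode_param and decode_param are hypothetical, and the sample value assumes the parameter is the Chinese string 中国人.

```python
from urllib.parse import quote, unquote

def encode_param(text, charset):
    """Percent-encode text after converting it to bytes in the given charset."""
    return quote(text, safe="", encoding=charset)

def decode_param(encoded, charset):
    """Percent-decode, interpreting the bytes in the given charset."""
    # errors="replace" keeps this sketch from raising on a charset mismatch.
    return unquote(encoded, encoding=charset, errors="replace")

user = "中国人"
sent = encode_param(user, "utf-8")
print(sent)                                   # %E4%B8%AD%E5%9B%BD%E4%BA%BA
print(decode_param(sent, "utf-8") == user)    # True
print(decode_param(sent, "gb2312") == user)   # False: wrong charset gives garbled text
```

The percent-encoded string carries only bytes, not the charset name, so the receiver has to know the charset from some other source, such as a convention or an ie= style parameter.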
Below is some information I collected:
UrlEncode encoding
Used primarily to encode a string for use in a URL; returns a string.
How to use (the sample string 工具网 means "tool net"):
1. Usage in ASP: Server.URLEncode("content"), for example:
<% Response.Write Server.URLEncode("工具网") %>
2. Usage in PHP: urlencode("content"), for example:
<?php echo urlencode("工具网"); ?>
3. Usage in JSP: URLEncoder.encode("content", "UTF-8"), for example:
<%= java.net.URLEncoder.encode("工具网", "UTF-8") %>
The Chinese transmission method I tried myself: simply encode on the sending side; the other end does not decode explicitly:
<%
id = java.net.URLEncoder.encode(id, "UTF-8");
response.sendRedirect("/index.jsp?error=" + id);
%>
4. Usage in JavaScript: encodeURI("content"), for example:
encodeURI("工具网");
5. Usage in Python:
import urllib.parse  # Python 3; in Python 2, use urllib.quote
urllib.parse.quote("工具网")
UrlDecode decoding
Used primarily to URL-decode a string; returns the decoded string.
1. Usage in ASP: Server.URLDecode("content"), for example:
<% Response.Write Server.URLDecode("%e5%b7%a5%e5%85%b7%e7%bd%91") %>
2. Usage in PHP: urldecode("content"), for example:
<?php echo urldecode("%e5%b7%a5%e5%85%b7%e7%bd%91"); ?>
3. Usage in JSP: URLDecoder.decode("content", "UTF-8"), for example:
<%= java.net.URLDecoder.decode("%e5%b7%a5%e5%85%b7%e7%bd%91", "UTF-8") %>
4. Usage in JavaScript: decodeURI("content"), for example:
decodeURI("%e5%b7%a5%e5%85%b7%e7%bd%91");
5. Usage in Python:
import urllib.parse  # Python 3; in Python 2, use urllib.unquote
urllib.parse.unquote("%e5%b7%a5%e5%85%b7%e7%bd%91")
UrlEncode encoding and decoding of GB2312 and UTF-8 text
The template that maps Unicode code points to UTF-8 bytes is:
Code point (hexadecimal)    UTF-8 encoding (binary)
--------------------------------------------
0000-007F                   0xxxxxxx
0080-07FF                   110xxxxx 10xxxxxx
0800-FFFF                   1110xxxx 10xxxxxx 10xxxxxx (Chinese characters fall in this range)
......
--------------------------------------------
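The template above can be applied by hand. The following Python 3 sketch covers only the three rows shown (code points up to FFFF); the function name utf8_encode_bmp is made up for this illustration, and the sketch does not check for surrogate code points.

```python
def utf8_encode_bmp(code_point):
    """Encode one code point in 0000-FFFF using the bit templates above."""
    if code_point <= 0x7F:
        # 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 110xxxxx 10xxxxxx
        return bytes([0b11000000 | (code_point >> 6),
                      0b10000000 | (code_point & 0b111111)])
    if code_point <= 0xFFFF:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0b11100000 | (code_point >> 12),
                      0b10000000 | ((code_point >> 6) & 0b111111),
                      0b10000000 | (code_point & 0b111111)])
    raise ValueError("code points above FFFF are outside this sketch")

encoded = utf8_encode_bmp(ord("中"))            # U+4E2D
print(" ".join(f"{b:08b}" for b in encoded))    # 11100100 10111000 10101101
print("".join(f"%{b:02X}" for b in encoded))    # %E4%B8%AD
```

The three bytes E4 B8 AD are the first three percent-escapes of the UTF-8 form %e4%b8%ad%e5%9b%bd%e4%ba%ba.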
For example:
For the query "中国人" ("Chinese people"), Baidu converts the Chinese URL parameter to the hexadecimal representation of its GB2312 bytes, two bytes per Chinese character:
http://www.baidu.com/s?wd=%D6%D0%B9%FA%C8%CB
For the same query, Google converts the Chinese URL parameter to the hexadecimal representation of its UTF-8 bytes, three bytes per Chinese character:
http://www.google.cn/search?client=opera&rls=en&q=%e4%b8%ad%e5%9b%bd%e4%ba%ba&sourceid=opera&ie=utf-8&oe=utf-8
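Both forms can be reproduced in Python 3, assuming the query string is 中国人 (which is what the two percent-encoded values decode to). This is a sketch using urllib.parse.quote with its encoding parameter.

```python
from urllib.parse import quote

query = "中国人"

gb2312_form = quote(query, encoding="gb2312")
utf8_form = quote(query, encoding="utf-8")

print(gb2312_form)   # %D6%D0%B9%FA%C8%CB
print(utf8_form)     # %E4%B8%AD%E5%9B%BD%E4%BA%BA

# Two bytes per character in GB2312, three in UTF-8.
print(len(query.encode("gb2312")), len(query.encode("utf-8")))   # 6 9
```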
URL encoding in Objective-C
When developing iOS apps for Apple's iPhone, iPad, and other devices, URLs need to be encoded. In a URL such as http://www.baidu.com/s?wd=中国人, the Chinese characters, special symbols such as & and %, and spaces must be escaped before the URL can be accessed properly.
Java, .NET, and JavaScript all provide URL-encoding methods. In Objective-C, you can try
- (NSString *)stringByAddingPercentEscapesUsingEncoding:(NSStringEncoding)enc;
to encode a complete URL (including its request parameters). For example, run the following code:
NSString *url = @"http://www.baidu.com/s?wd=中国人";
NSString *encodedValue = [url stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
The code above sets encodedValue to the following (UTF-8 escapes, because NSUTF8StringEncoding was requested):
http://www.baidu.com/s?wd=%E4%B8%AD%E5%9B%BD%E4%BA%BA
As you can see, it does not convert the ?, %, and & symbols in the URL. This is expected, because it cannot tell whether an & separates two parameters or belongs to a parameter value. Instead, encode each parameter value separately, so that ?, %, & and similar symbols inside the value are replaced by their percent-encodings, and then assemble the URL.
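Encoding each parameter separately before assembling the URL can be sketched in Python 3 as follows. The helper build_url and the sample value are assumptions for illustration, not part of any library.

```python
from urllib.parse import quote

def build_url(base, params):
    """Percent-encode each name and value separately, then join with & and =."""
    pairs = [f"{quote(name, safe='')}={quote(value, safe='')}"
             for name, value in params.items()]
    return base + "?" + "&".join(pairs)

# The value itself contains a space, &, % and ?.
url = build_url("http://www.baidu.com/s", {"wd": "中国人 & 100%?"})
print(url)
# http://www.baidu.com/s?wd=%E4%B8%AD%E5%9B%BD%E4%BA%BA%20%26%20100%25%3F
```

Because the separators are added after encoding, an & inside a value can no longer be mistaken for the & between parameters.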
Base64 encoding method:
C# Base64 encoding function (Base64 is an encoding, not encryption, although the function is named Encrypt)
public static string Encrypt(string pToEncrypt)
{
    // Convert the string to UTF-16 bytes, then to Base64 text
    byte[] bArray = System.Text.UnicodeEncoding.Unicode.GetBytes(pToEncrypt);
    return Convert.ToBase64String(bArray);
}
C# Base64 decoding function
public string Decrypt(string pToDecrypt)
{
    // Convert the Base64 text back to UTF-16 bytes, then to a string
    byte[] mingwen = Convert.FromBase64String(pToDecrypt);
    string str = System.Text.UnicodeEncoding.Unicode.GetString(mingwen);
    return str;
}
If the encoded string contains "/", "+", or "=", these characters change during web transmission (including in action requests), as follows:
"/" becomes "%2f" on the client
"+" becomes " " (a space)
"=" becomes "%3d"
So the client should restore the correct Base64 string before decoding it. The following is the restoration code in ASP:
str = Replace(str, " ", "+")
str = Replace(str, "%2f", "/")
str = Replace(str, "%3d", "=")
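An alternative to repairing the string afterwards is to percent-encode the Base64 text before it goes into the URL. The following Python 3 sketch shows both approaches; the function names are hypothetical, and "utf-16-le" is used because it produces the same bytes as C#'s Encoding.Unicode.

```python
import base64
from urllib.parse import quote, unquote

def encode_for_url(text):
    """Base64-encode text, then percent-encode the result for use in a URL."""
    b64 = base64.b64encode(text.encode("utf-16-le")).decode("ascii")
    return quote(b64, safe="")  # "+", "/" and "=" become %2B, %2F and %3D

def decode_from_url(param):
    """Reverse encode_for_url()."""
    return base64.b64decode(unquote(param)).decode("utf-16-le")

def restore_base64(received):
    """Undo the changes listed above, mirroring the ASP Replace calls."""
    return (received.replace(" ", "+")
                    .replace("%2f", "/").replace("%2F", "/")
                    .replace("%3d", "=").replace("%3D", "="))

param = encode_for_url("中国人")
print(decode_from_url(param))        # 中国人
print(restore_base64(" %2f8%3d"))    # +/8=
```

Python's base64.urlsafe_b64encode is another option: it uses "-" and "_" in place of "+" and "/", although the "=" padding remains.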