I have been working on a program that takes a keyword, sends it to Google, Yahoo, and other search engines, and opens the results page. The principle is simple. For example, searching Google for "China" gives the results URL "http://www.google.com/search?hl=zh-CN&q=China&lr=". To search for a different keyword, you only need to replace the value of the q parameter.
However, problems arise when the keyword is Chinese. For example, when Google searches for "中国" (China), the URL is "http://www.google.com/search?hl=zh-CN&newwindow=1&q=%E4%B8%AD%E5%9B%BD&lr=". The characters "中国" have been percent-encoded from their UTF-8 bytes.
Chinese characters are not the only ones that get encoded; some special characters are encoded too. For example, searching for "C#" gives the URL "http://www.google.com/search?hl=zh-CN&newwindow=1&q=C%23&lr=".
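The keyword substitution described above could be sketched roughly as follows. This is only an illustration: BuildSearchUrl is a hypothetical helper name, and it assumes the source file is saved as UTF-8 so the Chinese literal survives compilation. Uri.EscapeDataString percent-encodes the UTF-8 bytes of the keyword.

```csharp
using System;

public static class SearchUrlDemo
{
    // Hypothetical helper: builds a Google search URL for the given keyword.
    // Uri.EscapeDataString percent-encodes the keyword's UTF-8 bytes and
    // leaves ASCII letters and digits untouched.
    public static string BuildSearchUrl(string keyword)
    {
        return "http://www.google.com/search?hl=zh-CN&q="
               + Uri.EscapeDataString(keyword) + "&lr=";
    }

    public static void Main()
    {
        Console.WriteLine(BuildSearchUrl("China")); // q=China
        Console.WriteLine(BuildSearchUrl("中国"));   // q=%E4%B8%AD%E5%9B%BD
        Console.WriteLine(BuildSearchUrl("C#"));     // q=C%23
    }
}
```

The same helper would work for any engine that expects UTF-8; only the URL prefix and parameter name would change.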
In general, foreign websites encode with UTF-8, while Baidu encodes with GB2312. For example, searching Baidu for "中国" gives the URL "http://www.baidu.com/s?wd=%D6%D0%B9%FA&cl=3".
Let's compare the encodings of "C#中国":
Encoding   Result                   Website
UTF-8      C%23%E4%B8%AD%E5%9B%BD   Google
GB2312     C%23%D6%D0%B9%FA         Baidu
Summary:
In UTF-8, a Chinese character such as these takes three bytes; in GB2312, it takes two bytes.
In either encoding, letters and digits are left unencoded, and an ASCII special symbol is encoded as a single byte.
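The byte counts in this summary can be checked with a small sketch like the one below. It assumes .NET Core or .NET 5+, where GB2312 is only available after registering CodePagesEncodingProvider (on the classic .NET Framework that call is usually unnecessary). ByteCountDemo and GetGb2312 are names made up for this example.

```csharp
using System;
using System.Text;

public static class ByteCountDemo
{
    // Assumption: on .NET Core / .NET 5+ legacy code pages such as GB2312
    // must be registered before Encoding.GetEncoding can find them.
    public static Encoding GetGb2312()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("GB2312");
    }

    public static void Main()
    {
        Encoding gb2312 = GetGb2312();

        Console.WriteLine(Encoding.UTF8.GetByteCount("中")); // 3
        Console.WriteLine(gb2312.GetByteCount("中"));        // 2

        // The raw bytes match the percent-encoded values in the URLs above.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes("中国"))); // E4-B8-AD-E5-9B-BD
        Console.WriteLine(BitConverter.ToString(gb2312.GetBytes("中国")));        // D6-D0-B9-FA

        // An ASCII symbol such as '#' is a single byte in both encodings.
        Console.WriteLine(Encoding.UTF8.GetByteCount("#")); // 1
        Console.WriteLine(gb2312.GetByteCount("#"));        // 1
    }
}
```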
// Encode as UTF-8 (the default)
string tempSearchString1 = System.Web.HttpUtility.UrlEncode("C#中国");
// Encode as GB2312
string tempSearchString2 = System.Web.HttpUtility.UrlEncode("C#中国", System.Text.Encoding.GetEncoding("GB2312"));
//--------------------------------------------------------------
[Repost] Handling URL encoding with C# in ASP.NET
Problems to be solved:
A URL like the following has to be passed as a parameter to another page:
1. http://domain/de.retrial?uid=12&page=15
2. The parameters following the URL contain Chinese characters, such as ...aspx?title=crane (where "crane" stands in for a Chinese word)
In both cases the value must be URL-encoded when it is passed and URL-decoded when it is read; otherwise errors may occur.
The code is as follows:
// Pass the values: encode both the page URL and the title
string temp = "<a href='add.aspx?url=" + Server.UrlEncode(skin.Page.Request.Url.AbsoluteUri) + "&title=" + Server.UrlEncode(skin.Page.Header.Title) + "'>Add to favorites</a>";
// Read the values back in the receiving page
// (Request.QueryString already returns URL-decoded values, so the extra UrlDecode is normally redundant)
if (Request.QueryString["url"] != null)
{
    string url = Server.UrlDecode(Request.QueryString["url"].ToString());
    this.txtAddress.Text = url;
}
if (Request.QueryString["title"] != null)
{
    string title = Server.UrlDecode(Request.QueryString["title"].ToString());
    this.txtTitle.Text = title;
}
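To see why the encoding step matters when a URL is passed inside another URL, here is a minimal sketch. It assumes System.Web.HttpUtility is available (it ships with .NET Core and .NET 5+; on the .NET Framework it requires a reference to System.Web). NestedUrlDemo, ReadUrlParameter, and the inner URL are hypothetical and used only for illustration.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

public static class NestedUrlDemo
{
    // Returns the "url" parameter roughly as a receiving page would see it.
    // ParseQueryString splits on '&' and URL-decodes each value.
    public static string ReadUrlParameter(string queryString)
    {
        NameValueCollection query = HttpUtility.ParseQueryString(queryString);
        return query["url"];
    }

    public static void Main()
    {
        string inner = "http://domain/page.aspx?uid=12&page=15"; // hypothetical URL

        // Unencoded: the inner '&' ends the url parameter early,
        // so "page=15" becomes a separate parameter of the outer URL.
        Console.WriteLine(ReadUrlParameter("url=" + inner));

        // Encoded: the whole inner URL survives as a single value.
        Console.WriteLine(ReadUrlParameter("url=" + HttpUtility.UrlEncode(inner)));
    }
}
```

In the unencoded case the receiving page only gets "http://domain/page.aspx?uid=12"; in the encoded case it gets the full inner URL back.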
//-----------------------------------------------
URL encoding table
1. string s = System.Web.HttpUtility.UrlEncode(byte[] data);
Here, s is the resulting URL-encoded string. Note that the byte array must be an ASCII array; using System.Text.Encoding.Default.GetBytes(str.ToCharArray()) is incorrect and the result will not be escaped correctly!
2. Write a small function based on the URL encoding rules
public string UrlEncode(byte[] byt)
{
    // Percent-encodes every byte, including letters and digits
    string desstr = "";
    for (int i = 0; i < byt.Length; i++)
    {
        desstr += "%";
        desstr += byt[i].ToString("X2"); // two uppercase hex digits
    }
    return desstr;
}
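The function above escapes every byte. A variant that leaves letters, digits, and the unreserved symbols - _ . ~ alone (the set RFC 3986 calls "unreserved") might look like the sketch below. PercentEncoder.Encode is a hypothetical name, and the GB2312 part assumes .NET Core or .NET 5+, where the code page provider has to be registered first.

```csharp
using System;
using System.Text;

public static class PercentEncoder
{
    // Sketch: converts the text to bytes in the given encoding, then
    // percent-encodes every byte that is not an unreserved ASCII character.
    public static string Encode(string text, Encoding encoding)
    {
        var result = new StringBuilder();
        foreach (byte b in encoding.GetBytes(text))
        {
            char c = (char)b;
            bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                              || (c >= '0' && c <= '9')
                              || c == '-' || c == '_' || c == '.' || c == '~';
            if (unreserved)
                result.Append(c);
            else
                result.Append('%').Append(b.ToString("X2"));
        }
        return result.ToString();
    }

    public static void Main()
    {
        // Assumption: needed on .NET Core / .NET 5+ to make GB2312 available.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Console.WriteLine(Encode("C#中国", Encoding.UTF8));                  // C%23%E4%B8%AD%E5%9B%BD
        Console.WriteLine(Encode("C#中国", Encoding.GetEncoding("GB2312"))); // C%23%D6%D0%B9%FA
    }
}
```

The two outputs reproduce the Google and Baidu forms of "C#中国" from the comparison earlier in the article.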
The URL encoding table is as follows (entries above %7E show each byte as an ISO-8859-1 character):
Backspace %08
Tab %09
Linefeed %0A
Carriage return %0D
Space %20
! %21
" %22
# %23
$ %24
% %25
& %26
' %27
( %28
) %29
* %2A
+ %2B
, %2C
- %2D
. %2E
/ %2F
0 %30
1 %31
2 %32
3 %33
4 %34
5 %35
6 %36
7 %37
8 %38
9 %39
: %3A
; %3B
< %3C
= %3D
> %3E
? %3F
@ %40
A %41
B %42
C %43
D %44
E %45
F %46
G %47
H %48
I %49
J %4A
K %4B
L %4C
M %4D
N %4E
O %4F
P %50
Q %51
R %52
S %53
T %54
U %55
V %56
W %57
X %58
Y %59
Z %5A
[ %5B
\ %5C
] %5D
^ %5E
_ %5F
` %60
a %61
b %62
c %63
d %64
e %65
f %66
g %67
h %68
i %69
j %6A
k %6B
l %6C
m %6D
n %6E
o %6F
p %70
q %71
r %72
s %73
t %74
u %75
v %76
w %77
x %78
y %79
z %7A
{ %7B
| %7C
} %7D
~ %7E
¢ %A2
£ %A3
¥ %A5
¦ %A6
§ %A7
« %AB
¬ %AC
Soft hyphen %AD
° %B0
± %B1
² %B2
´ %B4
µ %B5
» %BB
¼ %BC
½ %BD
¿ %BF
À %C0
Á %C1
Â %C2
Ã %C3
Ä %C4
Å %C5
Æ %C6
Ç %C7
È %C8
É %C9
Ê %CA
Ë %CB
Ì %CC
Í %CD
Î %CE
Ï %CF
Ð %D0
Ñ %D1
Ò %D2
Ó %D3
Ô %D4
Õ %D5
Ö %D6
Ø %D8
Ù %D9
Ú %DA
Û %DB
Ü %DC
Ý %DD
Þ %DE
ß %DF
à %E0
á %E1
â %E2
ã %E3
ä %E4
å %E5
æ %E6
ç %E7
è %E8
é %E9
ê %EA
ë %EB
ì %EC
í %ED
î %EE
ï %EF
ð %F0
ñ %F1
ò %F2
ó %F3
ô %F4
õ %F5
ö %F6
÷ %F7
ø %F8
ù %F9
ú %FA
û %FB
ü %FC
ý %FD
þ %FE
ÿ %FF
//--------------------------------------------------------------
http://www.baidu.com?wd=%CA%C0%BD%E7%B1%AD&cl=3
%CA%C0%BD%E7%B1%AD can be decoded into "世界杯" (World Cup) by using System.Web.HttpUtility.UrlDecode.
But when I search for "世界杯" on http://www.google.com, the URL is http://www.google.com/search?hl=zh-CN&q=%E4%B8%96%E7%95%8C%E6%9D%AF&lr= and System.Web.HttpUtility.UrlDecode does not decode it into "世界杯". The result is garbled.
How can %E4%B8%96%E7%95%8C%E6%9D%AF be decoded into "世界杯"?
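One possible answer, offered as a sketch rather than a definitive fix: HttpUtility.UrlDecode has an overload that takes the encoding explicitly, so the UTF-8 string from Google and the GB2312 string from Baidu can each be decoded with the charset that produced them. Which encoding was in effect in the failing case is not stated above, so naming it explicitly avoids relying on any default. DecodeDemo and Decode are hypothetical names, and the provider registration assumes .NET Core or .NET 5+.

```csharp
using System;
using System.Text;
using System.Web;

public static class DecodeDemo
{
    // Decodes a percent-encoded string using the named character set.
    public static string Decode(string encoded, string charsetName)
    {
        // Assumption: needed on .NET Core / .NET 5+ to make GB2312 available.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return HttpUtility.UrlDecode(encoded, Encoding.GetEncoding(charsetName));
    }

    public static void Main()
    {
        // Google-style value: UTF-8 bytes
        Console.WriteLine(Decode("%E4%B8%96%E7%95%8C%E6%9D%AF", "UTF-8"));
        // Baidu-style value: GB2312 bytes
        Console.WriteLine(Decode("%CA%C0%BD%E7%B1%AD", "GB2312"));
    }
}
```

Both calls should yield "世界杯"; decoding either string with the other charset is what produces garbled text.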
This article is from the CSDN blog; when reposting, please cite the source: http://blog.csdn.net/zhongzhengfeng/archive/2008/11/06/3236551.aspx