encodeURI/decodeURI and UrlEncode/UrlDecode: the nightmare continues


 

********************************************************************
* Copyright Notice
*
* This document is released under a Creative Commons license. Please strictly abide by the terms of that license.
* This article was first published on the blog listed below. This statement is an integral part of this article.
* Author's network name: prodigal son
* Author email: Dayichen (AT) 163.com
* Author's blog: http://www.cnblogs.com/walkingboy
*
********************************************************************


-Written by prodigal son @ cnblogs.com (07-04-11)

Abstract:

The standards seem to be defined as follows:

"Replace a space with %20. Replace any other special character with % followed by its ASCII code in hex. Characters that take several bytes, such as Chinese characters, become several %XX sequences."

However, MS has always gone its own way. Following my earlier post on the problem of Chinese URL parameters with encodeURI, the encodeURI nightmare continues...

I. Damn Space

Recently I was implementing a data exchange between two pages. PageA sends an Ajax request to PageB; PageB reads data from the database and returns it to PageA. Because I was afraid that special characters might break the JS, the server encodes the data with UrlEncode and the client decodes it with decodeURI.

The result was that spaces were not recognized correctly: UrlEncode encodes a space as +, while decodeURI only recognizes a space written as %20. My preliminary conclusion was that the UrlEncode format is inconsistent with that of encodeURI. To verify this, I took some of the special symbols on the keyboard and compared their encodings:

 

 
 
character:  ~    !  @    #    $    %    ^    &    *  (  )  _  +    -  =
UrlEncode:  %7e  !  %40  %23  %24  %25  %5e  %26  *  (  )  _  %2b  -  %3d
encodeURI:  ~    !  @    #    $    %25  %5E  &    *  (  )  _  +    -  =

character:  {    }    [    ]    \    |    '  ;    :    "    /    ?    .  ,    <    >
UrlEncode:  %7b  %7d  %5b  %5d  %5c  %7c  '  %3b  %3a  %22  %2f  %3f  .  %2c  %3c  %3e
encodeURI:  %7B  %7D  %5B  %5D  %5C  %7C  '  ;    :    %22  /    ?    .  ,    %3C  %3E

 

These comparisons show that there is a big difference between the two encodings, especially for handling special characters.
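The encodeURI rows of the comparison can be reproduced in any browser console or in Node.js. The following is a minimal sketch; the helper name encodeEach is hypothetical and only exists to print one result per character. The UrlEncode rows come from the server-side .NET method and cannot be reproduced this way.

```javascript
// Hypothetical helper: run encodeURI on each character separately and
// join the results with spaces, one token per input character.
function encodeEach(chars) {
    return chars.split("").map(function (ch) {
        return encodeURI(ch);
    }).join(" ");
}

var row1 = encodeEach("~!@#$%^&*()_+-=");
var row2 = encodeEach("{}[]\\|';:\"/?.,<>");

console.log(row1); // ~ ! @ # $ %25 %5E & * ( ) _ + - =
console.log(row2); // %7B %7D %5B %5D %5C %7C ' ; : %22 / ? . , %3C %3E
```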

So the conclusion in my earlier post on Chinese URL parameters with encodeURI, namely that ASP.NET encodes the form action twice, was wrong. In fact nothing is encoded twice. After receiving the action encoded by encodeURI, ASP.NET decodes it with UrlDecode and then re-encodes it with UrlEncode when writing it back into the HTML. Because the two encoding formats are inconsistent, JS cannot correctly decode the URI with decodeURI after a postback.
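The round trip described here can be imitated entirely in JS. This is only a sketch under stated assumptions: urlEncodeLike is a hypothetical stand-in for the server's UrlEncode that reproduces just the space rule, decodeURIComponent stands in for UrlDecode, and the sample string is made up.

```javascript
// Hypothetical stand-in for the server-side UrlEncode. Only the space
// rule ("+" instead of "%20") matters for this demonstration.
function urlEncodeLike(s) {
    return encodeURIComponent(s).replace(/%20/g, "+");
}

var original = "my page";
var sentByClient = encodeURI(original);               // "my%20page"
var seenByServer = decodeURIComponent(sentByClient);  // stand-in for UrlDecode
var writtenToHtml = urlEncodeLike(seenByServer);      // "my+page"
var afterPostBack = decodeURI(writtenToHtml);         // the "+" survives

console.log(sentByClient);   // my%20page
console.log(writtenToHtml);  // my+page
console.log(afterPostBack);  // my+page
```

The value that comes back after the postback is no longer the original string, which matches the symptom described above.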

From this we can see that a string encoded with encodeURI can be decoded by UrlDecode. In other words, UrlDecode recognizes both encoding formats: encodeURI (JS) and UrlEncode (C#). So when MS designed this class library, it evidently took into account that the server would receive encodeURI-encoded input. By common sense, whoever thinks about decoding would also think about encoding, so UrlEncode should offer a way to produce output that decodeURI can decode. However, I have never been able to find such a method. I don't know whether this is the designers' joke, or a flaw left in on purpose so that we don't get bored with doing the same coding work...
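One common client-side workaround, sketched here under the assumption that the server sends UrlEncode output in UTF-8, is to turn every + back into %20 and then call decodeURIComponent. The helper name decodeUrlEncoded is hypothetical. decodeURIComponent is used rather than decodeURI because decodeURI deliberately leaves the escapes of reserved characters, such as %26 and %2b, untouched.

```javascript
// Hypothetical helper: decode a string produced by a "+ for space"
// encoder. A literal "+" in the data arrives as %2b, so replacing every
// raw "+" with %20 before decoding is safe.
function decodeUrlEncoded(s) {
    return decodeURIComponent(s.replace(/\+/g, "%20"));
}

console.log(decodeUrlEncoded("a+b%26c%2bd")); // a b&c+d
console.log(decodeURI("a+b%26c%2bd"));        // a+b%26c%2bd
console.log(decodeUrlEncoded("%e4%b8%ad"));   // 中
```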

II. Let the SPs come harder

Because of this encoding inconsistency, if your program needs a lot of server-client data communication, you can only use other formats (JSON, XML, and other non-URI approaches). Even for a simple string, you have to add a lot of extra data to fit the format.

As with a lot of MS software, SPs (service packs) are everywhere. It seems I have to write an SP myself.

 

On analysis, the encoding differences are concentrated almost entirely in the handling of special characters, while the handling of Chinese characters appears to be consistent (testing has not turned up any differences so far). I therefore settled on the scheme "use encodeURI/decodeURI for Chinese characters and handle everything else manually", and modified my earlier JS code accordingly:

 

 
 
// Namespace setup so that the snippet runs on its own.
var Kinn = Kinn || {};
Kinn.util = Kinn.util || {};

// Encode unzipStr in a format that the server-side UrlDecode can read.
// With isCusEncode set, a custom numeric format is used instead: the
// decimal char codes, then "N", then the code lengths joined by "O".
Kinn.util.encodeURI = function (unzipStr, isCusEncode) {
    var zipStr = "";
    var i;
    if (isCusEncode) {
        var lens = [];
        for (i = 0; i < unzipStr.length; i++) {
            var ac = unzipStr.charCodeAt(i);
            zipStr += ac;
            lens.push(ac.toString().length);
        }
        return [zipStr, lens.join("O")].join("N");
    }
    // return encodeURI(unzipStr);
    var strSpecial = "!\"#$%&'()*+,/:;<=>?[]^`{|}~";
    for (i = 0; i < unzipStr.length; i++) {
        var chr = unzipStr.charAt(i);
        var code = unzipStr.charCodeAt(i);
        if (code > 0x7f) {
            // Non-ASCII: let the built-in encodeURI produce the UTF-8 escapes.
            // A surrogate pair is passed as a whole, otherwise encodeURI throws.
            var isPair = code >= 0xd800 && code <= 0xdbff && i + 1 < unzipStr.length;
            zipStr += encodeURI(unzipStr.substr(i, isPair ? 2 : 1));
            if (isPair) {
                i++;
            }
        } else if (chr == " ") {
            zipStr += "+";
        } else if (strSpecial.indexOf(chr) != -1) {
            zipStr += "%" + Kinn.util.stringToAscii(chr);
        } else {
            zipStr += chr;
        }
    }
    return zipStr;
};

// Decode a string produced by Kinn.util.encodeURI or by UrlEncode.
Kinn.util.decodeURI = function (zipStr, isCusEncode) {
    var uzipStr = "";
    if (isCusEncode) {
        var zipArray = zipStr.split("N");
        var zipSrcStr = zipArray[0];
        // No length list means the encoded string was empty.
        var zipLens = zipArray[1] ? zipArray[1].split("O") : [];
        for (var j = 0; j < zipLens.length; j++) {
            var charLen = parseInt(zipLens[j], 10);
            uzipStr += String.fromCharCode(parseInt(zipSrcStr.substr(0, charLen), 10));
            zipSrcStr = zipSrcStr.slice(charLen);
        }
        return uzipStr;
    }
    // return decodeURI(zipStr);
    for (var i = 0; i < zipStr.length; i++) {
        var chr = zipStr.charAt(i);
        if (chr == "+") {
            uzipStr += " ";
        } else if (chr == "%") {
            var asc = parseInt(zipStr.substring(i + 1, i + 3), 16);
            if (asc > 0x7f) {
                // The UTF-8 lead byte tells how many %XX groups (2 to 4)
                // belong to this character; decode them together.
                var n = asc >= 0xf0 ? 4 : (asc >= 0xe0 ? 3 : 2);
                uzipStr += decodeURI(zipStr.substr(i, n * 3));
                i += n * 3 - 1;
            } else {
                uzipStr += Kinn.util.asciiToString(asc);
                i += 2;
            }
        } else {
            uzipStr += chr;
        }
    }
    return uzipStr;
};

// Hex code of the first UTF-16 unit of str, e.g. "A" gives "41".
Kinn.util.stringToAscii = function (str) {
    return str.charCodeAt(0).toString(16);
};

// Character for a numeric code, e.g. 0x41 gives "A".
Kinn.util.asciiToString = function (ascCode) {
    return String.fromCharCode(ascCode);
};
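The custom format selected by the isCusEncode flag in the code above is easier to follow in isolation. The sketch below re-implements just that branch under hypothetical standalone names (encodeNumeric, decodeNumeric): the decimal character codes are concatenated, followed by "N", followed by the digit count of each code joined with "O". The output contains only digits and the two letters, so no URI escaping is involved at all.

```javascript
// Hypothetical standalone version of the numeric format.
function encodeNumeric(s) {
    var codes = "";
    var lens = [];
    for (var i = 0; i < s.length; i++) {
        var code = String(s.charCodeAt(i));
        codes += code;
        lens.push(code.length);
    }
    return codes + "N" + lens.join("O");
}

function decodeNumeric(packed) {
    var parts = packed.split("N");
    var codes = parts[0];
    var lens = parts[1] ? parts[1].split("O") : [];
    var out = "";
    var pos = 0;
    for (var j = 0; j < lens.length; j++) {
        var len = parseInt(lens[j], 10);
        // Cut the next code out of the digit run and turn it back into a char.
        out += String.fromCharCode(parseInt(codes.slice(pos, pos + len), 10));
        pos += len;
    }
    return out;
}

console.log(encodeNumeric("a b"));          // 973298N2O2O2
console.log(decodeNumeric("973298N2O2O2")); // a b
```

The price of this format is size: every character costs its decimal code plus a length entry, so it suits short strings only.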

 

 

III. Prodigal's words:

It's strange: why are there always these minor problems in ASP.NET? I don't know whether it is a design oversight, or whether it is really meant to liven up our boring lives as programmers.

The encoding in Java is the same as that of encodeURI.

Perhaps standards exist to be broken. IE is like this, and ASP.NET is still like this...
