How to fix garbled characters when XMLHttpRequest reads a Chinese web page
XMLHttpRequest decodes response data as UTF-8 by default. When the server returns UTF-8 data, this works well (when developing web applications, using UTF-8 consistently on the server, the client, and the database effectively avoids garbled text). If the server sends a correct Content-Type response header that includes the encoding, XMLHttpRequest also handles other encodings correctly.
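To make the role of the encoding information concrete, here is a minimal sketch that pulls the charset parameter out of a Content-Type header value. The function name charsetFromContentType is hypothetical and not part of any browser API; in a browser, the header value could be read with xmlhttp.getResponseHeader("Content-Type").

```javascript
// Hypothetical helper: extract the charset parameter from a Content-Type value.
// Returns the charset in lower case, or null when the header carries none.
function charsetFromContentType(headerValue) {
  if (!headerValue) {
    return null;
  }
  // Matches charset=gb2312 and charset="gb2312", with optional spaces around "=".
  var match = /charset\s*=\s*"?([^";\s]+)"?/i.exec(headerValue);
  return match ? match[1].toLowerCase() : null;
}

console.log(charsetFromContentType("text/html; charset=GB2312")); // gb2312
console.log(charsetFromContentType("text/html"));                  // null
```

A null result corresponds to the problematic case: the client has no declared encoding to work with and falls back to its default.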
However, when XMLHttpRequest reads a Chinese web page and the server-side program does not set the Content-Type response header, or the header does not specify the encoding, reading the responseText property may return garbled characters. The following code uses XMLHttpRequest to fetch the home page of the astrology channel on the Yahoo China website:
// getXMLHttpRequest() is a helper (not shown) that returns an XMLHttpRequest object.
var xmlhttp = getXMLHttpRequest();
var url = "http://cn.astrology.yahoo.com/";
xmlhttp.open("GET", url, true);
xmlhttp.onreadystatechange = function () {
    // readyState 4 means the request has completed.
    if (xmlhttp.readyState === 4)
        if (xmlhttp.status === 200)
            alert(xmlhttp.responseText);
};
xmlhttp.send(null);
Even a professional website like Yahoo China does not fully support web standards: the HTML source shown in the pop-up is full of tags that do not comply with web standards and, as expected, the Chinese text is garbled.
Fortunately, there is a solution for both Firefox and IE.
Firefox
Firefox's XMLHttpRequest object supports the overrideMimeType method, which lets you specify the encoding of the returned data. This solves the garbled Chinese text. The preceding code is modified as follows:
var xmlhttp = getXMLHttpRequest();
var url = "http://cn.astrology.yahoo.com/";
xmlhttp.open("GET", url, true);
// Tell the browser to decode the response as GB2312; must be called before send().
xmlhttp.overrideMimeType("text/html; charset=gb2312");
xmlhttp.onreadystatechange = function () {
    if (xmlhttp.readyState === 4)
        if (xmlhttp.status === 200)
            alert(xmlhttp.responseText);
};
xmlhttp.send(null);
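Because not every browser provides overrideMimeType, the call can be guarded by feature detection. The following is a minimal sketch; the helper name forceCharset and the stand-in objects are hypothetical and exist only so that the example runs outside a browser.

```javascript
// Hypothetical helper: force a charset on a request object when possible.
// It should be called after open() and before send().
function forceCharset(xhr, charset) {
  if (typeof xhr.overrideMimeType === "function") {
    xhr.overrideMimeType("text/html; charset=" + charset);
    return true;  // responseText will be decoded with this charset
  }
  return false;   // the caller needs a fallback
}

// Stand-in objects that mimic a request with and without the method.
var calls = [];
var withOverride = { overrideMimeType: function (mime) { calls.push(mime); } };
var withoutOverride = {};

console.log(forceCharset(withOverride, "gb2312"));    // true
console.log(forceCharset(withoutOverride, "gb2312")); // false
console.log(calls[0]);                                // text/html; charset=gb2312
```

The return value tells the caller whether another approach is needed for that browser.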
Internet Explorer
IE does not support the overrideMimeType method, so the problem can only be solved in a rather ugly way. You need to introduce a hybrid function:
function gb2utf8(data) {
    var glbEncode = [];  // cache: GB2312 code (hex) -> escaped Unicode character
    // gb2utf8_data and gb2utf8_char must be globals: execScript runs in the global scope.
    gb2utf8_data = data;
    execScript("gb2utf8_data = MidB(gb2utf8_data, 1)", "VBScript");
    var t = escape(gb2utf8_data).replace(/%u/g, "").replace(/(.{2})(.{2})/g, "%$2%$1").replace(/%([A-Z].)%(.{2})/g, "@$1$2");
    t = t.split("@");
    var i = 0, j = t.length, k;
    while (++i < j) {
        k = t[i].substring(0, 4);
        if (!glbEncode[k]) {
            gb2utf8_char = parseInt(k, 16);
            // VBScript Chr() converts the code to a character using the system code page.
            execScript("gb2utf8_char = Chr(gb2utf8_char)", "VBScript");
            glbEncode[k] = escape(gb2utf8_char).substring(1, 6);
        }
        t[i] = glbEncode[k] + t[i].substring(4);
    }
    gb2utf8_data = gb2utf8_char = null;
    return unescape(t.join("%"));
}

var xmlhttp = getXMLHttpRequest();
var url = "http://cn.astrology.yahoo.com/";
xmlhttp.open("GET", url, true);
xmlhttp.onreadystatechange = function () {
    if (xmlhttp.readyState === 4)
        if (xmlhttp.status === 200)
            alert(gb2utf8(xmlhttp.responseBody)); // be sure to use responseBody here
};
xmlhttp.send(null);
The gb2utf8 function directly parses the binary data returned by XMLHttpRequest. Because it uses execScript to run VBScript functions, it is a hybrid function. Thanks to the blueidea forum for providing the algorithm.
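The same idea, decoding the raw bytes instead of trusting responseText, can be sketched without VBScript in environments that provide the standard TextDecoder API with support for the gb2312 label (current browsers, and Node.js builds with full ICU data). That runtime support is an assumption, and the function name decodeGb2312 is hypothetical. In a browser, the bytes could come from a request whose responseType is set to "arraybuffer".

```javascript
// Hypothetical helper: decode GB2312-encoded bytes into a JavaScript string.
// Assumes a runtime whose TextDecoder supports the "gb2312" label.
function decodeGb2312(bytes) {
  return new TextDecoder("gb2312").decode(bytes);
}

// The two characters "中文" encoded in GB2312.
var bytes = new Uint8Array([0xD6, 0xD0, 0xCE, 0xC4]);

// Decoding with the right charset recovers the text.
console.log(decodeGb2312(bytes)); // 中文

// Decoding the same bytes as UTF-8 yields replacement characters (U+FFFD),
// which is the kind of garbling described in this article.
var garbled = new TextDecoder("utf-8").decode(bytes);
console.log(garbled.indexOf("\uFFFD") !== -1); // true
```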
Although a solution exists, it is ugly and does not comply with web standards, so it should be avoided. When developing web applications, use UTF-8 encoding wherever possible, or set the correct encoding information on the server. As for the example above, taking content from other websites is not recommended.