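Tested on: Windows XP Professional (Chinese edition), MSXML 4.0 SP2, C#
Programmers who have used XMLHTTP as a client and then tried to make it talk to server-side ASP are, I think, a resourceful bunch (and yes, that is me patting myself on the back :)). The biggest headache is probably garbled Chinese text. I went through a lot of material, on MSDN and around the web, and tried many approaches without much success. Luckily I did not give up, and here is the newest and simplest fix:
Change the header of the XML that the client sends from: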
<?xml version="1.0" encoding="gb2312" ?>
to:
<?xml version="1.0" encoding="utf-8" ?>
When the server-side ASP page sends its XML result back to the client, add:
Response.ContentType = "text/xml"
Response.CharSet = "gb2312"
On the client, read the returned result with XmlDom.loadXml(xmlhttp.responseText) and that is all.
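The reason the declared encoding matters can be sketched outside ASP. The following is a minimal illustration in Python (not the C#/ASP code used in this article), assuming a made-up <query><name> payload; build_request and read_name are hypothetical helper names. It shows that a parser trusts the encoding named in the XML header, so the bytes must really be in that encoding.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The header recommended above for everything the client sends.
PROLOG = '<?xml version="1.0" encoding="utf-8" ?>'


def build_request(query_text):
    """Return the request body as bytes that really are UTF-8 (hypothetical payload)."""
    xml_text = PROLOG + "<query><name>" + escape(query_text) + "</name></query>"
    return xml_text.encode("utf-8")


def read_name(body):
    """Parse raw bytes; the parser decodes them using the encoding in the header."""
    return ET.fromstring(body).find("name").text


good = build_request("中文")
print(read_name(good))

# Same text, but the bytes are GB2312 while the header still claims utf-8.
bad = (PROLOG + "<query><name>中文</name></query>").encode("gb2312")
try:
    read_name(bad)
except ET.ParseError as err:
    print("mismatch rejected:", err)
```

Whether MSXML fails loudly or silently produces garbage in the mismatched case is not something this sketch can show; it only illustrates why header and bytes have to agree.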
============================================================================
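Here is my analysis of the possible causes:
It may be because our operating system itself uses UTF-8 encoding.
If you write all of Request.ServerVariables to a text file, you will see something like this: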
ALL_HTTP:HTTP_ACCEPT:*/*
HTTP_ACCEPT_LANGUAGE:zh-cn
HTTP_CONNECTION:Keep-Alive
HTTP_HOST:localhost
HTTP_USER_AGENT:Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1)
HTTP_COOKIE:ASPSESSIONIDAQBCSQRA=FNEHNOCCMHECCOPIOKKECEFM
HTTP_CONTENT_LENGTH:94
HTTP_CONTENT_TYPE:text/xml;charset=gb2312
HTTP_ACCEPT_ENCODING:gzip, deflate
HTTP_CACHE_CONTROL:no-cache
ALL_RAW:Accept: */*
Accept-Language: zh-cn
Connection: Keep-Alive
Host: localhost
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1)
Cookie: ASPSESSIONIDAQBCSQRA=FNEHNOCCMHECCOPIOKKECEFM
Content-Length: 94
Content-Type: text/xml;charset=gb2312
Accept-Encoding: gzip, deflate
Cache-Control: no-cache
APPL_MD_PATH:/LM/W3SVC/1/Root/zdqs
APPL_PHYSICAL_PATH:C:/Inetpub/systems/ZDS/qry/
AUTH_PASSWORD:
AUTH_TYPE:
AUTH_USER:
CERT_COOKIE:
CERT_FLAGS:
CERT_ISSUER:
CERT_KEYSIZE:
CERT_SECRETKEYSIZE:
CERT_SERIALNUMBER:
CERT_SERVER_ISSUER:
CERT_SERVER_SUBJECT:
CERT_SUBJECT:
CONTENT_LENGTH:94
CONTENT_TYPE:text/xml;charset=gb2312
GATEWAY_INTERFACE:CGI/1.1
HTTPS:off
HTTPS_KEYSIZE:
HTTPS_SECRETKEYSIZE:
HTTPS_SERVER_ISSUER:
HTTPS_SERVER_SUBJECT:
INSTANCE_ID:1
INSTANCE_META_PATH:/LM/W3SVC/1
LOCAL_ADDR:127.0.0.1
LOGON_USER:
PATH_INFO:/zdqs/QURY.asp
PATH_TRANSLATED:C:/Inetpub/systems/ZDS/qry/QURY.asp
QUERY_STRING:
REMOTE_ADDR:127.0.0.1
REMOTE_HOST:127.0.0.1
REMOTE_USER:
REQUEST_METHOD:POST
SCRIPT_NAME:/zdqs/QURY.asp
SERVER_NAME:localhost
SERVER_PORT:80
SERVER_PORT_SECURE:0
SERVER_PROTOCOL:HTTP/1.1
SERVER_SOFTWARE:Microsoft-IIS/5.1
URL:/zdqs/QURY.asp
HTTP_ACCEPT:*/*
HTTP_ACCEPT_LANGUAGE:zh-cn
HTTP_CONNECTION:Keep-Alive
HTTP_HOST:localhost
HTTP_USER_AGENT:Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1)
HTTP_COOKIE:ASPSESSIONIDAQBCSQRA=FNEHNOCCMHECCOPIOKKECEFM
HTTP_CONTENT_LENGTH:94
HTTP_CONTENT_TYPE:text/xml;charset=gb2312
HTTP_ACCEPT_ENCODING:gzip, deflate
HTTP_CACHE_CONTROL:no-cache
Guess 1: the encoding used during network transmission is gb2312.
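Guess 1 rests on the Content-Type value in the dump above. As a small sketch in Python (again only an illustration, not ASP), this is how the charset parameter can be pulled out of such a header value; charset_of is a hypothetical helper name, and the standard library's email.message.Message is used purely as a header parser.

```python
from email.message import Message


def charset_of(content_type, default=None):
    """Return the charset parameter of a Content-Type value, or default if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


# The value seen in CONTENT_TYPE / HTTP_CONTENT_TYPE in the dump.
print(charset_of("text/xml;charset=gb2312"))
# A header without a charset parameter falls back to the default.
print(charset_of("text/xml", default="unknown"))
```

Note that the header only states what the sender claims; it does not prove which encoding the body bytes are actually in.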
Next, look at a help topic from the MSXML 4 SDK, Enforcing Character Encoding with DOM:
In some cases, an XML document is passed to and processed by an application—for example, an ASP page—that cannot properly decode rare or new characters. When this happens, you might be able to work around the problem by relying on DOM to handle the character encoding. This bypasses the incapable application.
For example, the following XML document contains the character entity ("&#8364;") that corresponds to the Euro currency symbol (€). The ASP page, incapable.asp, cannot process currency.xml.
XML Data (currency.xml)
<?xml version="1.0" encoding="utf-8"?>
<currency>
  <name>Euro</name>
  <symbol>&#8364;</symbol>
  <exchange>
    <base>US$</base>
    <rate>1.106</rate>
  </exchange>
</currency>
ASP Page (incapable.asp)
<%@language = "javascript"%>
<%
  var doc = new ActiveXObject("Msxml2.DOMDocument.4.0");
  doc.async = false;
  if (doc.load(Server.MapPath("currency.xml"))==true) {
    Response.ContentType = "text/xml";
    Response.Write(doc.xml);
  }
%>
When incapable.asp is opened from a Web browser, an error such as the following results:
An invalid character was found in text content. Error processing resource 'http://MyWebServer/MyVirtualDirectory/incapable.asp'. Line 4, Position 10
This error is caused by the use of the Response.Write(doc.xml) instruction in the incapable.asp code. Because it calls upon ASP to encode/decode the Euro currency symbol character found in currency.xml, it fails.
However, you can fix this error. To do so, replace this Response.Write(doc.xml) instruction in incapable.asp with the following line:
doc.save(Response);
With this line, the error does not occur. The ASP code does produce the correct output in a Web browser, as follows:
<?xml version="1.0" encoding="utf-8" ?> <currency> <name>Euro</name> <symbol>€</symbol> <exchange> <base>US$</base> <rate>1.106</rate> </exchange> </currency>
The effect of the change in the ASP page is to let the DOM object (doc)—instead of the Response object on the ASP page—handle the character encoding.
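Note the last sentence: the change to the ASP page lets the DOM object (doc), rather than ASP's Response object, handle the character encoding.
So I conclude:
Guess 2: when you use the DOM object's load and save methods, you can treat the Request or Response object as a file handle.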
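The difference between the two approaches in the SDK topic can be mimicked in Python. This is only an analogy under stated assumptions: write_via_string and save_to_stream are hypothetical names, io.BytesIO stands in for the Response object, and latin-1 stands in for a code page that lacks the Euro sign. The first function converts the document to a string and leaves the byte encoding to a separate layer; the second lets the XML library write bytes to the stream itself.

```python
import io
import xml.etree.ElementTree as ET

doc = ET.fromstring("<currency><symbol>\u20ac</symbol></currency>")


def write_via_string(root, codepage):
    """Like Response.Write(doc.xml): make a string, then let another layer encode it."""
    text = ET.tostring(root, encoding="unicode")
    return text.encode(codepage)


def save_to_stream(root, stream):
    """Like doc.save(Response): the XML library writes encoded bytes to the stream."""
    ET.ElementTree(root).write(stream, encoding="utf-8", xml_declaration=True)


try:
    write_via_string(doc, "latin-1")  # assumed code page without the Euro sign
except UnicodeEncodeError as err:
    print("string route failed:", err)

response = io.BytesIO()  # stand-in for the Response object
save_to_stream(doc, response)
print(response.getvalue().decode("utf-8"))
```

In the second route the header and the bytes come from the same component, so they cannot disagree.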
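From Guess 1 and Guess 2 follows:
Guess 3: the strings used by the system the client was compiled on are themselves GB2312-encoded, and XMLHTTP automatically converts the data to GB2312 when sending it. When the server loads the request with the DOM object, it is effectively loading a byte stream; it looks at the encoding in the XML header, sees GB2312, and therefore does no conversion and simply treats the byte stream as a string. Unfortunately it forgets one thing: when that string is displayed on my system, it is taken to be UTF-8. So all that is needed is to force the XML to be converted. I also recall seeing other people's solutions that included a hand-written gb2312-to-utf-8 conversion function...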
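The kind of gb2312-to-utf-8 conversion function mentioned here, and the garbling it works around, can be sketched in a few lines of Python. This is an illustration only, not the code from those other solutions; gb2312_to_utf8 is a hypothetical name.

```python
def gb2312_to_utf8(data):
    """Transcode GB2312 bytes to UTF-8 bytes by going through a Unicode string."""
    return data.decode("gb2312").encode("utf-8")


raw = "中文".encode("gb2312")

# Reading GB2312 bytes as if they were UTF-8 yields replacement characters.
print(raw.decode("utf-8", errors="replace"))

# After transcoding, a UTF-8 reader sees the original text again.
print(gb2312_to_utf8(raw).decode("utf-8"))
```

Declaring utf-8 on the client side, as described at the top, avoids the need for such a function in the first place.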
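Finally I tried it in practice, and it works.
To sum up in one sentence: every XML document the client sends to the server declares encoding utf-8, everything the server sends back to the client specifies the charset gb2312, and all is well.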