The Inet control (Internet Transfer Control) makes it very convenient to download the HTML source of a webpage. However, garbled characters may appear when the page is encoded in UTF-8. This is not surprising: VB strings are UNICODE internally, and when Inet returns the response as a string, the bytes are not decoded as UTF-8, so multi-byte UTF-8 sequences come out wrong.
(Note: if you are not familiar with the character encodings mentioned above, first read the article on the various character encoding methods (ANSI, UNICODE, UTF-8, GB2312, GBK). If you already know them, please continue.)
How can we convert UTF-8 webpage data to UNICODE?
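To see the garbling in isolation, the following sketch (not part of the original article) builds the three UTF-8 bytes of the character U+4E2D by hand and converts them with StrConv, which decodes bytes using the system ANSI code page. The procedure name DemoGarbled is hypothetical, and the exact garbled output depends on the system locale.

```vb
Option Explicit

' Hypothetical demo procedure; paste into a module and run it from the Immediate window.
Public Sub DemoGarbled()
    Dim Utf(0 To 2) As Byte

    ' UTF-8 encoding of U+4E2D: E4 B8 AD
    Utf(0) = &HE4
    Utf(1) = &HB8
    Utf(2) = &HAD

    ' StrConv decodes the bytes with the system ANSI code page, not UTF-8,
    ' so the three bytes are not reassembled into the single original character.
    Debug.Print StrConv(Utf, vbUnicode)
End Sub
```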
First, the data must be retrieved from Inet in binary form (as a byte array) rather than as a string, for example:
Asynchronous call:
Private Sub Command1_Click()
    Dim StrUrl As String
    StrUrl = Text1.Text
    ' Start the request; completion is reported through Inet1_StateChanged
    Inet1.Execute StrUrl, "GET"
End Sub
Private Sub Inet1_StateChanged(ByVal State As Integer)
    If State = icResponseCompleted Then
        Dim BinBuff() As Byte
        ' Request the response as raw bytes, not as a string
        BinBuff = Inet1.GetChunk(0, icByteArray)
    End If
End Sub
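The handler above fetches the response with a single GetChunk call. A commonly used alternative is to read the response in fixed-size chunks until nothing is left. The sketch below is a variant of the same handler and is not the original author's code. It assumes that GetChunk with icByteArray returns an empty byte array once the response has been fully read, and the 1024-byte chunk size is an arbitrary choice.

```vb
' Variant of the StateChanged handler for a form that hosts Inet1.
Private Sub Inet1_StateChanged(ByVal State As Integer)
    Const CHUNK_SIZE As Long = 1024   ' arbitrary chunk size (assumption)
    Dim Chunk() As Byte
    Dim BinBuff() As Byte
    Dim Total As Long
    Dim n As Long
    Dim i As Long

    If State <> icResponseCompleted Then Exit Sub

    Do
        Chunk = Inet1.GetChunk(CHUNK_SIZE, icByteArray)
        n = UBound(Chunk) - LBound(Chunk) + 1
        ' Assumption: an empty array signals that all data has been read
        If n <= 0 Then Exit Do

        ' Grow the buffer and append the new chunk
        ReDim Preserve BinBuff(0 To Total + n - 1)
        For i = 0 To n - 1
            BinBuff(Total + i) = Chunk(LBound(Chunk) + i)
        Next i
        Total = Total + n
    Loop

    ' BinBuff now holds Total bytes of the response (it is unallocated if Total = 0)
End Sub
```

Growing the array with ReDim Preserve on every chunk is simple but copies the buffer each time. For very large pages, growing it in bigger steps would be a reasonable refinement.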
Synchronous call:
Private Sub Command2_Click()
    Dim BinBuff() As Byte
    Dim StrUrl As String
    StrUrl = Text1.Text
    ' OpenURL blocks until the download finishes and returns the raw bytes
    BinBuff = Inet1.OpenURL(StrUrl, icByteArray)
    RichTextBox1.Text = Utf8ToUnicode(BinBuff)
End Sub
The above is the code for retrieving binary data from Inet.
Next, add a module containing the following:
' UTF-8 to UNICODE conversion code
Option Explicit
Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long
Private Const CP_UTF8 As Long = 65001
Function Utf8ToUnicode(ByRef Utf() As Byte) As String
    Dim lRet As Long
    Dim lLength As Long
    Dim lBufferSize As Long
    lLength = UBound(Utf) - LBound(Utf) + 1
    If lLength <= 0 Then Exit Function
    ' Generous output buffer: decoding never yields more characters than input bytes
    lBufferSize = lLength * 2
    Utf8ToUnicode = String$(lBufferSize, vbNullChar)
    ' Returns the number of wide characters written, or 0 on failure
    lRet = MultiByteToWideChar(CP_UTF8, 0, VarPtr(Utf(LBound(Utf))), lLength, StrPtr(Utf8ToUnicode), lBufferSize)
    If lRet <> 0 Then
        ' Trim the unused part of the buffer
        Utf8ToUnicode = Left$(Utf8ToUnicode, lRet)
    Else
        Utf8ToUnicode = ""
    End If
End Function
The Utf8ToUnicode function converts the byte array received from Inet to a UNICODE string: it takes the byte array as its parameter and returns the converted string.
The asynchronous StateChanged handler shown earlier can then be changed to:
Private Sub Inet1_StateChanged(ByVal State As Integer)
    If State = icResponseCompleted Then
        Dim BinBuff() As Byte
        BinBuff = Inet1.GetChunk(0, icByteArray)
        ' Decode the UTF-8 bytes before displaying them
        RichTextBox1.Text = Utf8ToUnicode(BinBuff)
    End If
End Sub
The result is displayed in RichTextBox1, and the Chinese characters that were garbled before are now displayed correctly.
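The conversion function allocates a buffer of twice the input length and trims it afterwards. As an alternative sketch, MultiByteToWideChar can be called twice: once with cchWideChar set to 0, which only reports the required number of wide characters, and again with a buffer of exactly that size. The function name Utf8ToUnicodeExact is hypothetical, and the block repeats the declaration so that it can stand alone in its own module.

```vb
Option Explicit

Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long

Private Const CP_UTF8 As Long = 65001

' Hypothetical two-pass variant of Utf8ToUnicode.
Public Function Utf8ToUnicodeExact(ByRef Utf() As Byte) As String
    Dim lLength As Long
    Dim lChars As Long
    Dim lpSrc As Long

    lLength = UBound(Utf) - LBound(Utf) + 1
    If lLength <= 0 Then Exit Function
    lpSrc = VarPtr(Utf(LBound(Utf)))

    ' Pass 1: cchWideChar = 0 asks only for the required length in wide characters
    lChars = MultiByteToWideChar(CP_UTF8, 0, lpSrc, lLength, 0, 0)
    If lChars = 0 Then Exit Function

    ' Pass 2: convert into a buffer of exactly the required size
    Utf8ToUnicodeExact = String$(lChars, vbNullChar)
    If MultiByteToWideChar(CP_UTF8, 0, lpSrc, lLength, _
                           StrPtr(Utf8ToUnicodeExact), lChars) = 0 Then
        Utf8ToUnicodeExact = ""
    End If
End Function
```

It is called the same way as the original, for example RichTextBox1.Text = Utf8ToUnicodeExact(BinBuff). The trade-off is a second API call in exchange for not over-allocating the string.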