Fixing garbled characters when the Inet control downloads UTF-8 web pages

Source: Internet
Author: User

The Inet control (Internet Transfer Control) makes it very convenient to download the HTML source of a web page. However, pages encoded in UTF-8 may come out garbled. This is not surprising: VB strings are UNICODE (UTF-16) internally, and when the downloaded bytes are turned into a string they are not decoded as UTF-8, so multi-byte UTF-8 sequences end up as the wrong characters.

(Note: if you are not familiar with the character encodings mentioned above, first read the article "A variety of character encoding methods (ANSI, UNICODE, UTF-8, GB2312, GBK)". If you already know them, read on.)
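
The mismatch can be reproduced without any network access. The following is a minimal sketch, assuming a system whose ANSI code page is not UTF-8; the names DecodeAsAnsi and DemoGarbled are hypothetical and not part of the original code. It builds the three bytes that encode the character 中 (U+4E2D) in UTF-8 and converts them with StrConv, which interprets bytes using the system default code page rather than UTF-8.

```vb
Option Explicit

' Hypothetical helper: decode raw bytes with the system ANSI code page,
' which is what StrConv(..., vbUnicode) does.
Public Function DecodeAsAnsi(ByRef Data() As Byte) As String
    DecodeAsAnsi = StrConv(Data, vbUnicode)
End Function

Public Sub DemoGarbled()
    Dim b(0 To 2) As Byte

    ' UTF-8 encoding of U+4E2D is the byte sequence E4 B8 AD
    b(0) = &HE4
    b(1) = &HB8
    b(2) = &HAD

    ' The bytes are read as ANSI text, so the result is not the single
    ' character U+4E2D; what is printed depends on the system code page
    Debug.Print DecodeAsAnsi(b)
End Sub
```

Running DemoGarbled from the Immediate window shows the kind of output a UTF-8 page produces when its bytes are treated as ANSI text, which is why the data has to be fetched as bytes and decoded explicitly.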

How can UTF-8 web page data be converted to UNICODE?
First, the data must be retrieved from Inet in binary form, as a byte array. For example:
Asynchronous call:
Private Sub Command1_Click()
    Dim StrUrl As String

    StrUrl = Text1.Text
    ' Start the request; the result is delivered via Inet1_StateChanged
    Inet1.Execute StrUrl, "GET"
End Sub

Private Sub Inet1_StateChanged(ByVal State As Integer)
    If State = icResponseCompleted Then
        Dim BinBuff() As Byte

        ' Request a byte array instead of a string
        BinBuff = Inet1.GetChunk(0, icByteArray)
    End If
End Sub

Synchronous call:
Private Sub Command2_Click()
    Dim BinBuff() As Byte
    Dim StrUrl As String

    StrUrl = Text1.Text
    ' OpenURL blocks until the download finishes
    BinBuff = Inet1.OpenURL(StrUrl, icByteArray)
    RichTextBox1.Text = Utf8ToUnicode(BinBuff)
End Sub

The code above retrieves the data from Inet as binary. The synchronous example already calls Utf8ToUnicode, which is defined next.

Add a module containing the following:
' UTF-8 to UNICODE conversion
Option Explicit

Private Declare Function MultiByteToWideChar Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long
Private Const CP_UTF8 As Long = 65001

Public Function Utf8ToUnicode(ByRef Utf() As Byte) As String
    Dim lRet As Long
    Dim lLength As Long
    Dim lBufferSize As Long

    lLength = UBound(Utf) - LBound(Utf) + 1
    If lLength <= 0 Then Exit Function
    ' Size the buffer in characters; lLength * 2 is more than enough,
    ' since each UTF-8 byte yields at most one UTF-16 code unit
    lBufferSize = lLength * 2
    Utf8ToUnicode = String$(lBufferSize, vbNullChar)
    lRet = MultiByteToWideChar(CP_UTF8, 0, VarPtr(Utf(LBound(Utf))), lLength, StrPtr(Utf8ToUnicode), lBufferSize)
    If lRet <> 0 Then
        ' lRet is the number of characters written; trim the unused padding
        Utf8ToUnicode = Left$(Utf8ToUnicode, lRet)
    Else
        Utf8ToUnicode = ""
    End If
End Function

The Utf8ToUnicode function converts the byte array received by Inet into a UNICODE string: it takes the byte array as its parameter and returns the converted string.
The asynchronous handler shown earlier can then be changed to:
Private Sub Inet1_StateChanged(ByVal State As Integer)
    If State = icResponseCompleted Then
        Dim BinBuff() As Byte

        BinBuff = Inet1.GetChunk(0, icByteArray)
        ' Decode the UTF-8 bytes before displaying them
        RichTextBox1.Text = Utf8ToUnicode(BinBuff)
    End If
End Sub
The result is displayed in RichTextBox1, and the Chinese characters that were garbled before are now displayed correctly.
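
The conversion function above allocates twice as many characters as there are input bytes and trims the result afterwards. A variant, sketched here under assumptions, asks MultiByteToWideChar for the required length first: when cchWideChar is 0, the API returns the number of characters needed without writing anything. The name Utf8ToUnicodeExact is hypothetical, and the block is assumed to live in its own module; if it shares a module with Utf8ToUnicode, drop the duplicate Declare and constant.

```vb
Option Explicit

Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long
Private Const CP_UTF8 As Long = 65001

' Hypothetical two-pass variant of Utf8ToUnicode (not from the original article)
Public Function Utf8ToUnicodeExact(ByRef Utf() As Byte) As String
    Dim lLength As Long
    Dim lChars As Long

    lLength = UBound(Utf) - LBound(Utf) + 1
    If lLength <= 0 Then Exit Function

    ' First call: cchWideChar = 0 only asks for the required length in characters
    lChars = MultiByteToWideChar(CP_UTF8, 0, VarPtr(Utf(LBound(Utf))), lLength, 0, 0)
    If lChars = 0 Then Exit Function

    ' Second call: convert into a buffer of exactly that length
    Utf8ToUnicodeExact = String$(lChars, vbNullChar)
    If MultiByteToWideChar(CP_UTF8, 0, VarPtr(Utf(LBound(Utf))), lLength, _
            StrPtr(Utf8ToUnicodeExact), lChars) = 0 Then
        Utf8ToUnicodeExact = vbNullString
    End If
End Function
```

It is called the same way as the original, for example RichTextBox1.Text = Utf8ToUnicodeExact(BinBuff). The extra API call costs little and avoids the temporary oversized string, which may matter for large pages.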
