Unicode and ANSI formats in VB
String processing in 32-bit Visual Basic uses Unicode; that is, strings are stored internally as Unicode.
What is Unicode? In short, each character is stored as 2 bytes, and each "whole character" counts as
one character. Therefore,
Len("大家好")   ' Chinese for "Hello everyone"
Len("ABC")
both return 3, because "大" and "A" are each one character.
However, this is a disaster when processing Chinese strings from sources such as plain-text data files, because you must
locate each field by byte position, and Unicode throws those positions off. For example:
Len("good morning") returns 12, while
Len("今天天气很好") (Chinese for "The weather is good today") returns 6, even though the same text occupies 12 bytes in a double-byte ANSI encoding such as GBK or Big5.
For beginners, getting a program written in VB is already an achievement, so running straight into
Chinese text handling problems is a real blow. But don't be afraid. In fact, you only need to learn a few more commands
to solve the problem of Chinese processing.
Which commands? The most important one is StrConv. Syntax: StrConv(string to convert, conversion format)
The conversion formats used here are:
vbUnicode converts an ANSI string to Unicode
vbFromUnicode converts a Unicode string to ANSI
After the string is converted to ANSI, use the byte versions of the string functions, whose names end in B: LeftB, RightB,
MidB, ChrB, InStrB, LenB, InputB, and so on.
After processing, convert the result back to Unicode so that you can use the ordinary string functions again.
Is that clear? If not, take a look at the following example:
Simple Example
Let's look at the basic example below. It should give you a feel for how VB handles strings.
Private Sub Command1_Click()
    Dim sUnicode As String
    Dim sAnsi As String
    ' The literal is Chinese for "Wang Xiaoming, a123456789, 651023, No. 100, Zhongshan Road, Shanghai, (02)2345678".
    ' The byte counts below assume a Chinese (double-byte) system code page.
    ' Unicode operations
    sUnicode = "王小明,a123456789,651023,上海市中山路100号,(02)2345678"
    Debug.Print Len(sUnicode)                ' returns 44
    Debug.Print Mid$(sUnicode, 5, 10)        ' returns a123456789
    Debug.Print InStr(sUnicode, "上海市")     ' returns 23
    ' Convert the Unicode string to ANSI
    sAnsi = StrConv(sUnicode, vbFromUnicode)
    ' ANSI operations
    Debug.Print LenB(sAnsi)                  ' returns 54
    Debug.Print MidB$(sAnsi, 8, 10)          ' returns ?????, because the result was not converted back to Unicode
    ' Returns a123456789. The conversion back to Unicode is required.
    Debug.Print StrConv(MidB$(sAnsi, 8, 10), vbUnicode)
    ' Returns 26, a byte position (InStr above returned character position 23).
    ' Don't forget to convert "上海市" to ANSI as well, otherwise it will not be found.
    Debug.Print InStrB(sAnsi, StrConv("上海市", vbFromUnicode))
End Sub
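The convert, slice, and convert-back pattern in the example above can be wrapped in small helper functions so the conversion back to Unicode is never forgotten. The following is only a sketch: AnsiLen and AnsiMid are hypothetical names, not built-in VB functions, and the byte counts they report for Chinese text depend on the system code page that StrConv uses.

```vb
' Hypothetical helpers (not part of VB) that work in ANSI byte positions.

' Returns the length of sText in bytes after conversion to ANSI.
Public Function AnsiLen(ByVal sText As String) As Long
    AnsiLen = LenB(StrConv(sText, vbFromUnicode))
End Function

' Returns lLength bytes of sText starting at byte lStart (1-based),
' converted back to an ordinary Unicode string.
Public Function AnsiMid(ByVal sText As String, ByVal lStart As Long, _
                        ByVal lLength As Long) As String
    Dim sAnsi As String
    sAnsi = StrConv(sText, vbFromUnicode)
    AnsiMid = StrConv(MidB$(sAnsi, lStart, lLength), vbUnicode)
End Function
```

With these helpers, a call such as AnsiMid(sUnicode, 8, 10) does the same work as the StrConv(MidB$(...), vbUnicode) line in the example. Cutting a string in the middle of a double-byte character still produces garbage, so the byte positions must fall on field boundaries.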
Read text files
One common VB tip is this fast way to read a whole file:
Private Sub Command1_Click()
    Dim sFile As String
    Open "C:\filename.txt" For Input As #1
    sFile = Input$(LOF(1), #1)    ' read LOF(1) characters in one call
    Close #1
End Sub
Unfortunately, if the file you read contains Chinese text, the program above fails with an "Input past end of
file" error. LOF returns the number of bytes in the file, while the Input function reads a number of characters.
Because the file contains Chinese characters, it holds fewer characters than bytes, so the error occurs.
To solve this problem, use the StrConv and InputB functions:
Private Sub Command1_Click()
    Dim sFile As String
    Open "C:\filename.txt" For Input As #1
    ' InputB reads LOF(1) bytes; StrConv turns the ANSI bytes into a Unicode string
    sFile = StrConv(InputB$(LOF(1), #1), vbUnicode)
    Close #1
End Sub
The corrected program reads the file with InputB, but the data InputB returns is in ANSI format,
so you must convert it to Unicode with StrConv.
Random Data File
Many text data files are divided into fields of fixed byte widths, as in the following data format:
Wang Xiaomin 650110, No. 100, Zhongshan Road, Shanghai (02) 1234567
Dazhang stay 660824 Hualien County DaJia town Guangdong Street No. 23 (03) 9876543
......
How do you handle files of this type? This calls for a user-defined Type and byte arrays.
Private Type tagRecord
    Username(5) As Byte    ' name, 6 bytes
    Birthday(5) As Byte    ' birthday, 6 bytes
    Address(21) As Byte    ' address, 22 bytes
    Tel(11) As Byte        ' phone, 12 bytes
    CRLF(1) As Byte        ' line break, 2 bytes
End Type
Private Sub Command1_Click()
    Dim uRecord As tagRecord
    Open "C:\filename.dat" For Random As #1 Len = LenB(uRecord)
    Get #1, 2, uRecord     ' get the second record
    With uRecord           ' With...End With is convenient here
        Debug.Print .Username                        ' returns ???
        Debug.Print StrConv(.Username, vbUnicode)    ' returns the second record's name as readable text
    End With
    Close #1
End Sub
In this example you must use byte arrays, because only a byte array lets you locate each byte correctly.
The old approach of positioning with strings no longer works. Remember, though, that data read into a byte array
is in ANSI format; to process or manipulate it, convert it to Unicode first.
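The same technique extends to reading every record rather than only the second one. The sketch below is an illustration under assumptions: ListNames is a hypothetical procedure name, the file is assumed to contain only whole records in the layout shown above, and the name field is assumed to be padded with spaces. The record count is taken as the file size divided by the record size.

```vb
Private Type tagRecord
    Username(5) As Byte    ' name, 6 bytes
    Birthday(5) As Byte    ' birthday, 6 bytes
    Address(21) As Byte    ' address, 22 bytes
    Tel(11) As Byte        ' phone, 12 bytes
    CRLF(1) As Byte        ' line break, 2 bytes
End Type

' Hypothetical procedure: prints the name field of every record.
Private Sub ListNames()
    Dim uRecord As tagRecord
    Dim lCount As Long
    Dim i As Long

    Open "C:\filename.dat" For Random As #1 Len = LenB(uRecord)
    ' Assumes the file holds only whole records
    lCount = LOF(1) \ LenB(uRecord)
    For i = 1 To lCount
        Get #1, i, uRecord
        ' Convert the ANSI bytes to Unicode, then drop the assumed space padding
        Debug.Print Trim$(StrConv(uRecord.Username, vbUnicode))
    Next i
    Close #1
End Sub
```

Each field is converted to Unicode only at the point where it is displayed or processed as text; the positioning itself stays entirely in bytes.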
Using byte arrays
Apart from cases like the preceding example, where exact byte positioning is required, byte arrays are generally not needed for plain-text processing.
They are usually used to process binary data, which we will discuss in another article.
As you can see, once you are familiar with StrConv, you can convert freely between Unicode and ANSI formats.
After reading this article, Chinese text should no longer be a worry!