VBScript reads and writes files in binary mode

Source: Internet
Author: User

This problem came up in a project. The program uses ADODB.Stream to generate a UTF-8 CSV file, which is then imported into a MySQL database with the LOAD DATA INFILE command. After the import, the first column of the first row in the table always began with a question mark, even though the file showed no visible problem when opened in a text editor before the import. Investigation showed that the file starts with three invisible bytes; in UltraEdit's hexadecimal mode they appear as EF BB BF. (Only newer versions of UltraEdit display a UTF-8 file correctly here. I first used UE 10.2, which showed the two bytes FF FE and confused me; UE 14.2 shows the correct bytes.) Some research showed that these bytes are the byte order mark (BOM) of a UTF-8 file. UTF-8 files come in two variants, with and without a BOM. Files generated by ADODB.Stream include the BOM, and during the database import these three bytes are inserted into the table as ordinary characters, which produces the unwanted result.
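The hex-editor check described above can also be done from a script. The following is a minimal sketch, not part of the original article: the function name FirstBytesHex is hypothetical, and the path is the one used in the test example in this article. It reads the first few bytes in binary mode and prints them as hexadecimal, so a file with a UTF-8 BOM should begin with EF BB BF.

```vbscript
' Sketch: return the first n bytes of a file as a hex string.
' FirstBytesHex is an illustrative name, not part of ADODB or the article.
Function FirstBytesHex(path, n)
    Dim stm, bytes, i, result
    Set stm = CreateObject("ADODB.Stream")
    stm.Type = 1                 ' 1 = adTypeBinary
    stm.Open
    stm.LoadFromFile path
    bytes = stm.Read(n)          ' reads at most n bytes from the start
    stm.Close

    result = ""
    If Not IsNull(bytes) Then    ' Read returns Null for an empty file
        ' MidB/AscB pick individual bytes out of the returned byte array.
        For i = 1 To LenB(bytes)
            result = result & Right("0" & Hex(AscB(MidB(bytes, i, 1))), 2) & " "
        Next
    End If
    FirstBytesHex = Trim(result)
End Function

WScript.Echo FirstBytesHex("E:/tmp/apachelog.txt", 3)
```

Run it with cscript so the result is printed to the console instead of a message box.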
So I tried to solve this problem. Because a UTF-8 file is stored as a byte sequence in the same order as its encoding, the only difference between the BOM and BOM-free variants is the three extra bytes EF BB BF at the start of the file. If the file is read in binary mode, the first three bytes can be skipped to remove the BOM. In VBScript, binary file operations require the ADODB.Stream object. A simple test example follows:
<textarea cols="50" rows="15" name="code" class="vb:nogutter">' Copy a file in binary mode, skipping the three-byte UTF-8 BOM
Set STM = CreateObject("ADODB.Stream")
STM.Type = 1                  ' 1 = binary mode
STM.Open
STM.LoadFromFile "E:/tmp/apachelog.txt"
STM.Position = 3              ' skip the first three bytes (EF BB BF)
bytearr = STM.Read(-1)        ' -1 = read everything from the current position
STM.Close
' Reopen the stream empty and write the remaining bytes
STM.Open
STM.Write bytearr
STM.SaveToFile "E:/tmp/apachelog22.txt", 2   ' 2 = overwrite if the file exists
STM.Flush
STM.Close</textarea>
Http://blog.csdn.net/zmxj/archive/2009/02/27/3943742.aspx
When working with binary files, the Type property must be set to 1 (binary mode); otherwise the Read method fails. Setting the Position property to 3 means the first three bytes are not read. The Read method returns a byte array. To verify the contents of the first three bytes, leave the Position property unset and inspect the array at a breakpoint: the first three bytes EF BB BF are 239, 187, 191 in decimal. In this example, apachelog22.txt is the processed file. Open both files in hexadecimal mode to check that the result is correct.
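The test example skips three bytes unconditionally, which would damage a file that has no BOM. The following sketch, an addition rather than the author's code, compares the first three bytes with 239, 187, 191 and skips them only when they match. The name StripUtf8Bom is hypothetical; the paths are the ones from the example above.

```vbscript
' Sketch: copy src to dst, dropping a leading UTF-8 BOM only if one is present.
' StripUtf8Bom is an illustrative name, not part of ADODB or the article.
Sub StripUtf8Bom(src, dst)
    Dim inStm, outStm, head, hasBom
    Set inStm = CreateObject("ADODB.Stream")
    inStm.Type = 1               ' 1 = adTypeBinary
    inStm.Open
    inStm.LoadFromFile src

    hasBom = False
    If inStm.Size >= 3 Then
        head = inStm.Read(3)
        ' EF BB BF in decimal is 239, 187, 191
        hasBom = (AscB(MidB(head, 1, 1)) = 239) And _
                 (AscB(MidB(head, 2, 1)) = 187) And _
                 (AscB(MidB(head, 3, 1)) = 191)
    End If

    ' Start copying after the BOM if found, otherwise from the beginning.
    If hasBom Then
        inStm.Position = 3
    Else
        inStm.Position = 0
    End If

    Set outStm = CreateObject("ADODB.Stream")
    outStm.Type = 1
    outStm.Open
    inStm.CopyTo outStm          ' copies from the current position to the end
    outStm.SaveToFile dst, 2     ' 2 = adSaveCreateOverWrite
    outStm.Close
    inStm.Close
End Sub

StripUtf8Bom "E:/tmp/apachelog.txt", "E:/tmp/apachelog22.txt"
```

Using CopyTo between two streams avoids holding the whole file in a byte array variable, but the Read and Write approach from the test example would work here as well.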

For more information about the UTF-8 BOM, see:

Http://www.ruanyifeng.com/blog/2007/10/ascii_unicode_and_utf-8.html

Http://www.paulsadowski.com/WSH/getremotebinaryfile.htm
