Working with UTF-8 files in ASP

Source: Internet
Author: User
Tags: character set, control characters

Note: "ASP" here refers to classic ASP.

ASP's support for UTF-8 is poor, largely because of how some of its commonly used objects behave.

For example, to generate a file in UTF-8 format, the usual first choice is the Scripting.FileSystemObject object.

Scripting.FileSystemObject creates text files with the CreateTextFile method, whose signature is:

FileSystemObject.CreateTextFile(filename[, overwrite[, unicode]])

The unicode parameter is described as follows:

Optional. A Boolean value that indicates whether the file is created as a Unicode or an ASCII file. True creates the file in Unicode format; False creates it in ASCII format. If omitted, an ASCII file is assumed.

Neither setting produces UTF-8 (the "Unicode" format here is UTF-16, not UTF-8), so this method cannot be used to create a UTF-8 file.

Instead, we can use the ADODB.Stream object, as follows:
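To see why neither value of the unicode parameter helps, it is enough to compare the bytes each format would contain. The sketch below uses Python purely as a byte inspector (it is not part of the ASP solution). It assumes that a FileSystemObject "Unicode" file is UTF-16 little-endian with a leading byte order mark; the variable names are illustrative only.

```python
import codecs

text = "A中"  # one ASCII letter and one Chinese character

# Assumed layout of an FSO "Unicode" file: BOM followed by UTF-16 LE code units.
as_fso_unicode = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
# What a UTF-8 file would need to contain instead.
as_utf8 = text.encode("utf-8")

print(as_fso_unicode.hex(" "))  # ff fe 41 00 2d 4e
print(as_utf8.hex(" "))         # 41 e4 b8 ad

# Strict ASCII has no code for the Chinese character at all.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)                 # False
```

The two byte sequences differ even for the letter A (two bytes against one), so a consumer expecting UTF-8 cannot read the "Unicode" output correctly.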

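' str holds the text to write; it is assumed to be defined earlier
Set objStream = Server.CreateObject("ADODB.Stream")
With objStream
    .Open
    .Charset = "utf-8"          ' encode written text as UTF-8
    .Position = .Size           ' move to the end of the stream (0 for a new stream)
    .WriteText str              ' WriteText is a method, so it is called, not assigned
    ' 2 = adSaveCreateOverWrite: replace the file if it already exists
    .SaveToFile Server.MapPath("/sitemap.xml"), 2
    .Close
End With
Set objStream = Nothing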
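One way to check the result is to inspect the saved file from outside ASP. The following Python sketch is only an illustration: the helper name describe_utf8_file is hypothetical, and it assumes the stream may have written a UTF-8 byte order mark (EF BB BF) at the start of the file, which ADODB.Stream is generally reported to do for text streams. The demo builds its own temporary stand-in file rather than touching a real sitemap.xml.

```python
import codecs
import os
import tempfile


def describe_utf8_file(path):
    """Return (has_bom, text) for a file expected to be UTF-8.

    Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    """
    with open(path, "rb") as f:
        raw = f.read()
    has_bom = raw.startswith(codecs.BOM_UTF8)
    # "utf-8-sig" drops a leading BOM if present and is otherwise plain UTF-8.
    return has_bom, raw.decode("utf-8-sig")


# Demo: build a stand-in for the file the ASP code would have saved.
sample = '<?xml version="1.0" encoding="UTF-8"?><title>中文</title>'
fd, demo_path = tempfile.mkstemp(suffix=".xml")
with os.fdopen(fd, "wb") as f:
    f.write(codecs.BOM_UTF8 + sample.encode("utf-8"))

has_bom, text = describe_utf8_file(demo_path)
os.remove(demo_path)
print(has_bom, text == sample)
```

If the decode step raises an error on a real file, the file was not written as UTF-8, which usually points back to the Charset setting on the stream.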

Appendix:

Introduction to ASCII, Unicode, UTF-8:

ASCII is a character set that includes uppercase and lowercase letters, digits, control characters, and so on. Each character fits in one byte, with codes in the range 0-127.

Because ASCII can represent only a very limited number of characters, many countries and regions defined their own character sets on top of it. One example is GB2312, widely used in China, which adds encodings for Chinese characters, each expressed in two bytes.

These character sets are incompatible with one another: the same number may represent different characters in different sets, which causes trouble when exchanging information.

Unicode is a character set that aims to map every character in the world to a unique number (code point); for example, the number 0x0041 corresponds to the letter A. Unicode is still under development and contains more and more characters.
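The incompatibility is easy to reproduce. In this Python sketch (Python is used here only to inspect bytes), the same two bytes are decoded once as GB2312 and once as ISO-8859-1 (Latin-1), a Western European character set chosen purely as an example of a second national encoding.

```python
raw = bytes([0xD6, 0xD0])  # two bytes with no encoding label attached

as_gb2312 = raw.decode("gb2312")   # interpreted as Simplified Chinese
as_latin1 = raw.decode("latin-1")  # interpreted as Western European

print(as_gb2312)  # 中
print(as_latin1)  # ÖÐ
```

Nothing in the bytes themselves says which reading is correct, which is why data exchanged without an agreed encoding so often turns into garbled text.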
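The mapping between characters and code points can be inspected directly. A minimal Python sketch, with illustrative variable names:

```python
code_a = ord("A")            # character -> code point
code_zhong = ord("\u4e2d")   # the Chinese character 中
char_from_code = chr(0x0041) # code point -> character

print(hex(code_a))        # 0x41
print(hex(code_zhong))    # 0x4e2d
print(char_from_code)     # A

# Code points are not limited to 16 bits: this emoji lies above U+FFFF.
code_emoji = ord("\U0001F600")
print(hex(code_emoji))    # 0x1f600
```

A code point is only a number; how that number is laid out in bytes is decided separately by an encoding.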

Storing Unicode characters also requires an encoding. UCS-2, for example, represents each character in two bytes. UTF-8 is another encoding of the Unicode character set. It is variable length: the original design allowed up to 6 bytes per character, while the current standard (RFC 3629) limits it to 4. Code points 0-127 are encoded in a single byte with the same value as in ASCII, so compatibility is very good: English text in ASCII can be processed as UTF-8 without modification. This is one reason UTF-8 is so widely used.
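The variable-length behaviour of UTF-8 can be observed by encoding a few characters and counting bytes. This is a small Python sketch with arbitrarily chosen sample characters; under the current specification (RFC 3629) a character takes at most four bytes.

```python
# A, é, 中, and an emoji, written as escapes so the code points are unambiguous.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]
utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]
print(utf8_lengths)  # [1, 2, 3, 4]

# ASCII-only text is byte-for-byte identical in ASCII and UTF-8.
english = "Hello, ASP"
same_bytes = english.encode("ascii") == english.encode("utf-8")
print(same_bytes)    # True

# A fixed two-byte unit cannot hold code points above U+FFFF;
# UTF-16 needs a surrogate pair (4 bytes) for the emoji.
utf16_len = len("\U0001F600".encode("utf-16-le"))
print(utf16_len)     # 4
```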
