What is UTF8
UTF8 is not a character code in itself but a form of storage and transmission. As described above, each Unicode/UCS character is stored in 2 or 4 bytes. Compare the following, taking "I am Chinese" as an example:
Stored as ANSI: ten bytes
Stored as UNICODE/UCS2: bytes + 2 bytes (header)
Stored as UCS4: bytes + 4 bytes (header)
This shows
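The comparison above can be reproduced with a short Python sketch. The sample sentence (我是中国人, five characters) and the use of GBK as the "ANSI" code page are assumptions made for illustration, since the original example text is not preserved here. Python's `utf-16` and `utf-32` codecs prepend a BOM, which plays the role of the header.

```python
# Assumption: the sample sentence and GBK as the "ANSI" code page are illustrative choices.
TEXT = "我是中国人"  # five Chinese characters


def encoded_sizes(text):
    """Return the number of bytes each encoding needs to store text."""
    return {
        "gbk": len(text.encode("gbk")),        # 2 bytes per Chinese character
        "utf-16": len(text.encode("utf-16")),  # 2 bytes per character plus a 2-byte BOM
        "utf-32": len(text.encode("utf-32")),  # 4 bytes per character plus a 4-byte BOM
        "utf-8": len(text.encode("utf-8")),    # 3 bytes per character here, no BOM
    }


if __name__ == "__main__":
    for name, size in encoded_sizes(TEXT).items():
        print(f"{name}: {size} bytes")
```

UTF-8 is listed alongside the others only for comparison; it is a storage form of the same code points, not a separate character set.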
Today, while converting a program to UTF-8, I found that the generated UTF-8 documents were garbled. The code of the original file, create_html.asp, is as follows:
Set objrs = Server.CreateObject("Scripting.FileSystemObject")
conn = Server.MapPath("Example.xml")
Convert Chinese characters to UTF-8
' Writes each character of str as an HTML numeric character reference (&#xHHHH;)
Function Chinese2unicode(str)
    Dim i
    Dim str_one
    Dim str_unicode
    For i = 1 To Len(str)
        str_one = Mid(str, i, 1)
        str_unicode = str_unicode & Chr(38)              ' "&"
        str_unicode = str_unicode & Chr(35)              ' "#"
        str_unicode = str_unicode & Chr(120)             ' "x"
        str_unicode = str_unicode & Hex(AscW(str_one))   ' character code in hexadecimal
        str_unicode = str_unicode & Chr(59)              ' ";"
    Next
    Response.Write str_unicode
End Function
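For comparison, the same idea can be sketched in Python. This is an illustration and not part of the original ASP page: the name `to_hex_ncr` is hypothetical, and unlike `AscW`, which works on UTF-16 code units, Python's `ord` returns the full code point. The output is a sequence of HTML numeric character references, which a browser resolves regardless of the page encoding.

```python
import html


def to_hex_ncr(text):
    """Encode every character as an HTML hexadecimal numeric character reference."""
    return "".join(f"&#x{ord(ch):X};" for ch in text)


if __name__ == "__main__":
    encoded = to_hex_ncr("中文")
    print(encoded)
    # html.unescape turns the references back into the original characters.
    print(html.unescape(encoded))
```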
As UTF-8 is an 8-bit encoding, no BOM is required, and any U+FEFF character in the decoded Unicode string (even if it's the first character) is treated as a ZERO WIDTH NO-BREAK SPACE.
UTF-8 is encoded byte by byte, so its byte order is the same on all systems and there is no byte order problem; therefore it does not actually require a
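The behaviour described above can be observed with Python's standard codecs. This is a minimal sketch with hypothetical helper names: the plain `utf-8` codec keeps a leading U+FEFF as an ordinary character, while `utf-8-sig` drops it when present.

```python
import codecs


def decode_keep_bom(data):
    """Decode as plain UTF-8; a leading BOM survives as U+FEFF."""
    return data.decode("utf-8")


def decode_strip_bom(data):
    """Decode as UTF-8 and drop one leading BOM if there is one."""
    return data.decode("utf-8-sig")


if __name__ == "__main__":
    raw = codecs.BOM_UTF8 + "abc".encode("utf-8")
    print(repr(decode_keep_bom(raw)))
    print(repr(decode_strip_bom(raw)))
```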
View the file header:
- If there is no special file header and the first character is already text content, it is an ANSI file.
- Unicode files start with a BOM (byte order mark). The BOM can be 0xFEFF (big endian), 0xFFFE (little endian), or 0xEFBBBF (UTF-8).
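The header check in the list above might be sketched as follows. The function name `detect_bom` is hypothetical, and only the three marks listed above are tested; UTF-32 marks are deliberately left out, so a UTF-32 little-endian file would be reported as UTF-16 little endian by this sketch.

```python
import codecs

# Only the three byte order marks named in the text are checked.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
)


def detect_bom(data):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None  # no header: treated as an ANSI file by the rule above


if __name__ == "__main__":
    print(detect_bom(b"\xef\xbb\xbfhello"))
    print(detect_bom(b"hello"))
```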
Note that saving a UTF-8 file with
character "B", opened in a Mac OS environment, will be displayed as "Kui". This case shows that the byte order of UTF-16 can be confused if it is not explicitly defined. This leads to the concepts of big-endian and little-endian UTF-16 (UTF-16 BE and UTF-16 LE) and to the byte order mark that can be attached as a solution. Windows and Linux systems on PCs currently use UTF-16 by default for
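The byte order confusion described above can be demonstrated in a few lines of Python. The character U+4E59 is chosen here as an assumption for illustration; the helper names are hypothetical. Reading big-endian bytes as little-endian swaps the two bytes of each code unit and yields a different character, while a BOM lets the decoder pick the right order.

```python
def misread_utf16(ch):
    """Encode ch as big-endian UTF-16, then wrongly decode it as little-endian."""
    return ch.encode("utf-16-be").decode("utf-16-le")


def roundtrip_with_bom(ch):
    """Encode with a BOM and let the decoder choose the byte order from it."""
    return ch.encode("utf-16").decode("utf-16")


if __name__ == "__main__":
    original = "\u4e59"
    swapped = misread_utf16(original)
    print(f"U+{ord(original):04X} read with the wrong byte order is U+{ord(swapped):04X}")
    print(roundtrip_with_bom(original) == original)
```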
numbers into strings of 0s and 1s and save them to the computer. There have to be different ways of saving them. This is where UTF (Unicode Transformation Format) comes in, with UTF-8 and UTF-16.
2. The difference between UTF-8 and UTF-16
UTF-16 is easy to understand, that is, any character corresponding to the n
Reference address: http://www.cnblogs.com/kingcat/archive/2012/10/16/2726334.html
In Java, the char type describes one code unit of the UTF-16 encoding.
Why Unicode is required
We know that the computer is actually quite dumb: it only understands strings such as 0101. Of course, looking at such strings of 0s and 1s makes us dizzy, so for simplicity we often write them in decimal, hexadecimal, or octal notation. These are all actually equivalent, and there is not much difference ...
version of Unicode uses two bytes (16 bits) to represent all characters.
In fact, this easily leads to confusion. We tend to think that "two bytes" means two bytes stored in the computer, and therefore that any character stored in Unicode occupies two bytes. In fact, this is incorrect.
Unicode actually involves two steps. The first step is to define a specification that assigns a unique number to every character. This is purely a mathematical problem and need not have anything to do with computers
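The separation between a character's number and its stored bytes can be made concrete with a small Python sketch. The helper name `describe` is hypothetical; it shows one code point next to the bytes that three different storage forms produce for it.

```python
def describe(ch):
    """Return the code point of ch and its bytes (as hex) in three encodings."""
    return {
        "code_point": f"U+{ord(ch):04X}",          # the number assigned by Unicode
        "utf-8": ch.encode("utf-8").hex(),         # variable length, 1 to 4 bytes
        "utf-16-be": ch.encode("utf-16-be").hex(), # 2 or 4 bytes
        "utf-32-be": ch.encode("utf-32-be").hex(), # always 4 bytes
    }


if __name__ == "__main__":
    for ch in ("A", "中"):
        print(ch, describe(ch))
```

The same number is stored in one, two, three, or four bytes depending on the encoding, which is why "Unicode is two bytes" is misleading.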
Let's start with the basic concepts: what Unicode is, what UTF-8 is, and what UTF-16 is. For a complete description of Unicode, UTF-8, and UTF-16, see Wiki (UNICODE,
Two perfect PHP functions for checking whether a string is UTF-8 encoded. UTF-8 strings are sometimes used in PHP development, such as with the iconv() and
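The PHP functions themselves are not reproduced here. As a rough illustration of the idea only, a strict decode attempt can serve as the check; the sketch below is in Python, and the name `is_utf8` is hypothetical. Note that pure ASCII always passes, and a byte sequence from another encoding can occasionally be valid UTF-8 by coincidence.

```python
def is_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")  # strict mode raises on any invalid sequence
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(is_utf8("中文".encode("utf-8")))
    print(is_utf8("中文".encode("gbk")))
```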