<script type="text/javascript">
// Returns the byte length of str, counting each character whose code
// value is above 255 as two bytes and every other character as one.
function getBytes(str) {
    // An empty string is valid input and occupies zero bytes.
    if (str === '')
        return 0;
    // No argument, or a null/undefined value: nothing to measure.
    if (!arguments.length || !str)
        return null;
    var len = str.length;
    var bytes = 0; // running byte count
    for (var i = 0; i < len; i++) {
        /* charCodeAt returns the Unicode code value of the character at the specified position; a value greater than 255 indicates a non-Latin character, such as Chinese or Japanese */
        if (str.charCodeAt(i) > 255) {
            bytes += 2;
        } else {
            bytes++;
        }
    }
    return bytes;
}
</script>
Unicode has only one character set. Chinese, Japanese, and Korean characters share a region of Unicode that extends up to 0x9FFF. The most widely used form of Unicode at present is UCS-2, which uses two bytes to encode a character. For example, the Chinese character "jing" (经) is encoded as 0x7ECF. Character codes are generally written in hexadecimal; to distinguish them from decimal numbers, hexadecimal values start with 0x, and 0x7ECF is 32463 in decimal.
Because UCS-2 uses two bytes per character, each code is a 16-bit binary number. 2 to the 16th power is 65536, so UCS-2 can encode at most 65536 characters. The characters encoded from 0 to 127 are the same as the ASCII characters. For example, the Unicode encoding of the letter "a" is 0x0061, which is 97 in decimal, and the ASCII code of "a" is 0x61, which is also 97 in decimal.
In fact, Unicode in its UCS-2 form does not support Chinese characters very well. Simplified and Traditional Chinese together contain some sixty to seventy thousand characters, while UCS-2 can represent at most 65536 characters in total, so Unicode has to leave out some Chinese characters that are almost never used. Fortunately, there are only a little over seven thousand commonly used simplified Chinese characters. To represent all Chinese characters, Unicode also has the UCS-4 specification, which uses four bytes to encode each character.
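The figures quoted above can be checked with a few lines of JavaScript. This is only a minimal sketch: the variable names are chosen for illustration, and `\u7ecf` is the escape for the character "jing" mentioned in the text.

```javascript
// Code value of the Chinese character used in the text (0x7ECF).
var jing = '\u7ecf';
var code = jing.charCodeAt(0);        // numeric code value
var hex = '0x' + code.toString(16);   // the same value as a hexadecimal string

// Number of distinct values that two bytes (16 bits) can represent.
var ucs2Capacity = Math.pow(2, 16);

// Codes 0 to 127 coincide with ASCII, so "a" has the same value in both.
var lowerA = 'a'.charCodeAt(0);

console.log(hex, code);       // 0x7ecf 32463
console.log(ucs2Capacity);    // 65536
console.log(lowerA);          // 97
```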
In a computer's storage unit, an ASCII code value occupies one byte (eight binary digits), and its highest bit (b7) is used as the parity bit. A parity check is a method for detecting whether an error occurred while a code was being transferred. There are two types: odd parity and even parity. Odd parity rule: a correct code must contain an odd number of 1 bits in the byte; if the count is not odd, the highest bit b7 is set to 1. Even parity rule: a correct code must contain an even number of 1 bits in the byte; if the count is not even, the highest bit b7 is set to 1.
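The two parity rules can be sketched in JavaScript as follows. The function names `countOnes` and `addParity` are hypothetical helpers introduced for illustration. The sketch assumes the ASCII value sits in the low seven bits and the parity bit in b7, as described above.

```javascript
// Count the 1 bits in the low seven bits of an ASCII code.
function countOnes(code) {
    var ones = 0;
    for (var bit = 0; bit < 7; bit++) {
        if (code & (1 << bit)) ones++;
    }
    return ones;
}

// Return the 8-bit value with b7 set where needed so that the total
// number of 1 bits is odd (odd === true) or even (odd === false).
function addParity(code, odd) {
    var data = code & 0x7f;                  // keep only the seven data bits
    var ones = countOnes(data);
    var needsBit = odd ? (ones % 2 === 0) : (ones % 2 === 1);
    return needsBit ? (data | 0x80) : data;  // 0x80 is bit b7
}

// "a" is 0x61 = 1100001 in binary, which contains three 1 bits.
console.log(addParity(0x61, true).toString(16));   // 61 (already odd, b7 stays 0)
console.log(addParity(0x61, false).toString(16));  // e1 (b7 set to make the count even)
```

A receiver would apply the same count to all eight bits and treat a byte with the wrong parity as corrupted.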