I needed a way to determine, in JavaScript, how many bytes a string occupies when stored as UTF-8.
The database character set is UTF-8, so when validating input text with JavaScript on the page, what matters is the byte length the text will occupy in UTF-8 storage. JavaScript's String object has a length property, but it counts characters, not bytes. (This problem keeps coming back: when I used Delphi, I had to write code to count the characters in a String, because the length of a Delphi String is a byte count...) The lazy way out is to cap the validated maximum length at 1/3 of the corresponding database field's length, but that is too inaccurate.
After reading many Unicode documents on the Internet, the key fact turned out to be the storage length for each range of character code values:
UCS-2 code point (hex)    UTF-8 byte stream (binary)
0000-007F                 0xxxxxxx (1 byte)
0080-07FF                 110xxxxx 10xxxxxx (2 bytes)
0800-FFFF                 1110xxxx 10xxxxxx 10xxxxxx (3 bytes)
The code is as follows:
function mbStringLength(s) {
    var totalLength = 0;
    var i;
    var charCode;
    for (i = 0; i < s.length; i++) {
        charCode = s.charCodeAt(i);
        if (charCode <= 0x007f) {
            totalLength += 1;
        } else if ((0x0080 <= charCode) && (charCode <= 0x07ff)) {
            totalLength += 2;
        } else if ((0x0800 <= charCode) && (charCode <= 0xffff)) {
            totalLength += 3;
        }
    }
    // alert(totalLength);
    return totalLength;
}
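A quick usage check (the function is repeated here only so the snippet runs on its own; the test strings are my own examples):

```javascript
// Sums per-character UTF-8 byte counts from the table above.
function mbStringLength(s) {
    var totalLength = 0;
    var i;
    var charCode;
    for (i = 0; i < s.length; i++) {
        charCode = s.charCodeAt(i);
        if (charCode <= 0x007f) {
            totalLength += 1;
        } else if ((0x0080 <= charCode) && (charCode <= 0x07ff)) {
            totalLength += 2;
        } else if ((0x0800 <= charCode) && (charCode <= 0xffff)) {
            totalLength += 3;
        }
    }
    return totalLength;
}

console.log(mbStringLength("abc"));          // 3: ASCII characters are 1 byte each
console.log(mbStringLength("\u4E2D\u6587")); // 6: "中文", 3 bytes per CJK character
```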
In fact, characters between 0x0080 and 0x07ff are rarely used in actual user input.