Data is often validated with JavaScript before it is submitted to the database. This is error-prone for Chinese text: in JavaScript a Chinese character has a length of 1, whereas in the database it occupies two bytes. JavaScript strings are String objects, and the length property of a String object returns the length of the string. However, a Chinese character, a full-width symbol and an English letter each count as 1, which is not how strlen() behaves in PHP.
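To make the mismatch concrete, the short sketch below prints the length property for a few inputs. The sample strings are illustrative and do not come from the original article.

```javascript
// String length counts UTF-16 code units, not bytes.
// Samples: an English letter, a Chinese character, a full-width comma, a mixed string.
var lengthSamples = ["a", "中", "，", "中国a"];
var sampleLengths = lengthSamples.map(function (s) {
  return s.length;
});
console.log(sampleLengths); // [ 1, 1, 1, 3 ]
```

Each of the first three samples reports 1 even though a double-byte database encoding would store the Chinese character and the full-width comma in two bytes each.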
function strlen(str) {
    var s = 0;
    for (var i = 0; i < str.length; i++) {
        // Full-width and Chinese characters count as two
        if (str.charAt(i).match(/[\u0391-\uFFE5]/)) {
            s += 2;
        } else {
            s++;
        }
    }
    return s;
}
Each character is examined in turn: if it matches a full-width or Chinese character it is counted as two characters, otherwise as one.
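<script>
alert(fucCheckLength("中国a"));

function fucCheckLength(strTemp) {
    var i, sum;
    sum = 0;
    for (i = 0; i < strTemp.length; i++) {
        // Character codes 0-255 count as one, everything else as two
        if ((strTemp.charCodeAt(i) >= 0) && (strTemp.charCodeAt(i) <= 255))
            sum = sum + 1;
        else
            sum = sum + 2;
    }
    return sum;
}
</script>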
The result is 5. But is that the length in bytes? Note the difference between bytes and characters: the byte length depends on the encoding. For example, "中国a" is five bytes in GBK/GB2312 but seven bytes in UTF-8 (a Chinese character usually takes three bytes in UTF-8).
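As a rough check of these figures, the sketch below measures the UTF-8 size with the standard TextEncoder API (available in current browsers and Node.js) and estimates the GBK size by counting two bytes for every non-ASCII character. The function names utf8ByteLength and gbkByteLengthEstimate are hypothetical, and the GBK figure is only an estimate that assumes every non-ASCII character in the input is a two-byte GBK character.

```javascript
// UTF-8 size: encode the string and count the resulting bytes.
function utf8ByteLength(str) {
  return new TextEncoder().encode(str).length;
}

// GBK size estimate: 1 byte for ASCII, 2 bytes for everything else.
// Assumption: the input only contains characters GBK stores in two bytes.
function gbkByteLengthEstimate(str) {
  var total = 0;
  for (var i = 0; i < str.length; i++) {
    total += str.charCodeAt(i) <= 0x7f ? 1 : 2;
  }
  return total;
}

console.log(utf8ByteLength("中国a"));        // 7
console.log(gbkByteLengthEstimate("中国a")); // 5
```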
We can convert the characters to a single encoding before performing such operations. For example, the following function decodes a UTF-8 byte string into a Unicode string:
function Utf8ToUnicode(strUtf8) {
    var bstr = "";
    var nTotalChars = strUtf8.length;   // total chars to be processed
    var nOffset = 0;                    // processing point on strUtf8
    var nRemainingBytes = nTotalChars;  // how many bytes left to be converted
    var nOutputPosition = 0;
    var iCode, iCode1, iCode2;          // the value of the unicode

    while (nOffset < nTotalChars) {
        iCode = strUtf8.charCodeAt(nOffset);
        if ((iCode & 0x80) == 0) {              // 1 byte
            if (nRemainingBytes < 1)            // not enough data
                break;
            bstr += String.fromCharCode(iCode & 0x7F);
            nOffset++;
            nRemainingBytes -= 1;
        } else if ((iCode & 0xE0) == 0xC0) {    // 2 bytes
            iCode1 = strUtf8.charCodeAt(nOffset + 1);
            if (nRemainingBytes < 2 ||          // not enough data
                (iCode1 & 0xC0) != 0x80) {      // invalid pattern
                break;
            }
            bstr += String.fromCharCode(((iCode & 0x3F) << 6) | (iCode1 & 0x3F));
            nOffset += 2;
            nRemainingBytes -= 2;
        } else if ((iCode & 0xF0) == 0xE0) {    // 3 bytes
            iCode1 = strUtf8.charCodeAt(nOffset + 1);
            iCode2 = strUtf8.charCodeAt(nOffset + 2);
            if (nRemainingBytes < 3 ||          // not enough data
                (iCode1 & 0xC0) != 0x80 ||      // invalid pattern
                (iCode2 & 0xC0) != 0x80) {
                break;
            }
            bstr += String.fromCharCode(((iCode & 0x0F) << 12) |
                                        ((iCode1 & 0x3F) << 6) |
                                        (iCode2 & 0x3F));
            nOffset += 3;
            nRemainingBytes -= 3;
        } else {                                // 4 or more bytes -- unsupported
            break;
        }
    }

    if (nRemainingBytes != 0) {                 // bad UTF8 string
        return "";
    }
    return bstr;
}
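In environments that provide the standard TextDecoder API (current browsers and Node.js), the same kind of decoding can be delegated to the platform. The sketch below is an alternative to a hand-written decoder, not a reconstruction of it: the helper name decodeUtf8Bytes is hypothetical, and it takes an array of byte values rather than a byte string.

```javascript
// Decode an array of UTF-8 byte values into a JavaScript string.
function decodeUtf8Bytes(bytes) {
  var decoder = new TextDecoder("utf-8");
  return decoder.decode(new Uint8Array(bytes));
}

// 0xE4 0xB8 0xAD is "中", 0xE5 0x9B 0xBD is "国", 0x61 is "a".
var decodedSample = decodeUtf8Bytes([0xe4, 0xb8, 0xad, 0xe5, 0x9b, 0xbd, 0x61]);
console.log(decodedSample);        // 中国a
console.log(decodedSample.length); // 3
```

Seven input bytes become a string whose length is 3, which is the byte-versus-character gap this article is about.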
How can this be solved? This article shows how to get the length of Chinese text with JavaScript.
First, we define a new function, getBytes(), that returns the number of bytes in a string, because JavaScript does not provide this as a standard function.
String.prototype.getBytes = function () {
    // Collect every character outside the single-byte range \x00-\xff
    var cArr = this.match(/[^\x00-\xff]/ig);
    return this.length + (cArr == null ? 0 : cArr.length);
};

function paramCheck(cur) {
    if (cur.value.getBytes() > 64) {
        alert("more than 64 characters");
        return false;
    }
    return true;
}
getBytes uses a regular expression to find the Chinese characters in the string and places them in the array cArr, so the length of cArr is the number of Chinese characters. The method returns the string length plus the number of Chinese characters, which is the total number of bytes when each Chinese character takes two bytes.
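The same counting rule can also be written as a plain function that leaves String.prototype untouched. This is a sketch under the article's assumption that every character outside \x00-\xff occupies two bytes; the names countBytes and fitsInColumn are hypothetical.

```javascript
function countBytes(str) {
  // Collect every character outside the single-byte range \x00-\xff.
  var wide = str.match(/[^\x00-\xff]/g);
  return str.length + (wide === null ? 0 : wide.length);
}

// True when the string fits into a column limited to maxBytes bytes.
function fitsInColumn(str, maxBytes) {
  return countBytes(str) <= maxBytes;
}

console.log(countBytes("中国a"));      // 5
console.log(fitsInColumn("中国a", 4)); // false
```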
Using only [^\x00-\xff] is rather crude, because some special characters that are not Chinese are matched as well.
But if you use [^\u4E00-\u9FA5], Chinese characters are not matched at all.
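The difference between these character classes can be checked directly. In the sketch below the sample characters are illustrative: the euro sign (U+20AC) stands in for a non-Chinese symbol outside \x00-\xff, and the variable names are hypothetical.

```javascript
var nonLatin1 = /[^\x00-\xff]/;   // anything above U+00FF
var notCjk = /[^\u4E00-\u9FA5]/;  // anything EXCEPT the common Chinese range
var cjk = /[\u4E00-\u9FA5]/;      // the common Chinese range itself

console.log(nonLatin1.test("中")); // true
console.log(nonLatin1.test("€"));  // true: a non-Chinese symbol matches too
console.log(notCjk.test("中"));    // false: the negated class excludes Chinese
console.log(cjk.test("中"));       // true
console.log(cjk.test("€"));        // false
```

Dropping the ^ gives a class that matches only the common Chinese range, at the cost of no longer counting full-width punctuation as double width.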
You can test the following methods:
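Method one:
function _length(str) {
    var len = 0;
    for (var i = 0; i < str.length; i++) {
        // Anything that sorts after '~' (the last printable ASCII character) counts as two
        if (str.charAt(i) > '~') {
            len += 2;
        } else {
            len++;
        }
    }
    return len;
}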
Method two:
String.prototype.gblen = function () {
    var len = 0;
    for (var i = 0; i < this.length; i++) {
        if (this.charCodeAt(i) > 127 || this.charCodeAt(i) == 94) {
            len += 2;
        } else {
            len++;
        }
    }
    return len;
};

String.prototype.gbtrim = function (len, s) {
    var str = '';
    var sp = s || '';
    var len2 = 0;
    // First pass: measure the whole string
    for (var i = 0; i < this.length; i++) {
        if (this.charCodeAt(i) > 127 || this.charCodeAt(i) == 94) {
            len2 += 2;
        } else {
            len2++;
        }
    }
    if (len2 <= len) {
        return this;
    }
    // Second pass: copy characters until the limit (minus the suffix) is reached
    len2 = 0;
    len = (len > sp.length) ? len - sp.length : len;
    for (var i = 0; i < this.length; i++) {
        if (this.charCodeAt(i) > 127 || this.charCodeAt(i) == 94) {
            len2 += 2;
        } else {
            len2++;
        }
        if (len2 > len) {
            str += sp;
            break;
        }
        str += this.charAt(i);
    }
    return str;
};

var str1 = 'World\'s best # % & World\'s Best #%&';
document.write('str1 = ' + str1 + '<br />');
document.write('length = ' + str1.gblen() + '<br />');
document.write('gbtrim(10) = ' + str1.gbtrim(10) + '<br />');
document.write('gbtrim(10, \'...\') = ' + str1.gbtrim(10, '...') + '<br />');
document.write('gbtrim(12, \'-\') = ' + str1.gbtrim(12, '-') + '<br />');

// gbtrim(len, s): len is the truncation length, measured in English (single-byte) characters;
// s is the string appended after truncation, such as "...".
// Note: a Chinese character counts as two, so with len = 10 at most five Chinese characters are displayed.
// When there are more than five Chinese characters, only four are displayed, because "…" is appended after truncation.
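To illustrate the truncation rule on Chinese input, here is a compact standalone sketch. It is not the gbtrim method itself: the names charWidth, displayWidth and truncateByWidth are hypothetical, every character above code 127 is assumed to be two units wide, and room for the suffix is reserved by its display width rather than by its length.

```javascript
// Width of one character: 1 for ASCII, 2 for anything above code 127 (assumption).
function charWidth(code) {
  return code > 127 ? 2 : 1;
}

function displayWidth(str) {
  var width = 0;
  for (var i = 0; i < str.length; i++) {
    width += charWidth(str.charCodeAt(i));
  }
  return width;
}

// Cut str so that the result, including the suffix, is at most maxWidth units wide.
function truncateByWidth(str, maxWidth, suffix) {
  suffix = suffix || "";
  if (displayWidth(str) <= maxWidth) {
    return str;
  }
  var budget = maxWidth - displayWidth(suffix);
  var out = "";
  var used = 0;
  for (var i = 0; i < str.length; i++) {
    var w = charWidth(str.charCodeAt(i));
    if (used + w > budget) {
      break;
    }
    out += str.charAt(i);
    used += w;
  }
  return out + suffix;
}

console.log(truncateByWidth("中文字符串测试", 10));      // 中文字符串
console.log(truncateByWidth("中文字符串测试", 10, "…")); // 中文字符…
console.log(truncateByWidth("abc", 10, "…"));            // abc
```

With a limit of 10, five Chinese characters fit when there is no suffix and four fit once "…" is appended.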