JavaScript string byte-length functions and an efficiency analysis (for loop vs. regular expression)

Source: Internet
Author: User

Let's look at two pieces of code that compute the byte length of a string, one with a for loop and one with a regular expression. Both count a character whose code is above 255 as 2 bytes and every other character as 1 byte.

Method 1: byte length with a for loop:

var lenFor = function (str) {
    // Empty, null or undefined input counts as 0 bytes.
    if (!str) {
        return 0;
    }
    var byteLen = 0, len = str.length;
    for (var i = 0; i < len; i++) {
        // Char codes above 255 count as 2 bytes, all others as 1.
        if (str.charCodeAt(i) > 255) {
            byteLen += 2;
        } else {
            byteLen++;
        }
    }
    return byteLen;
};

Usage:
var strLength = lenFor(str);
Method 2: byte length with a for loop:

function len(str) {
    var i, code, sum = 0;
    for (i = 0; i < str.length; i++) {
        code = str.charCodeAt(i);
        // Char codes 0-255 count as 1 byte, everything else as 2.
        if (code >= 0 && code <= 255) {
            sum = sum + 1;
        } else {
            sum = sum + 2;
        }
    }
    return sum;
}

Method 3: byte length with a regular expression:
The code is more concise, but according to the test below it is less efficient, so the functions above may be the better choice.

var lenReg = function (str) {
    // Replace each character outside \x00-\xff with two placeholders, then measure the result.
    return str.replace(/[^\x00-\xff]/g, '**').length;
};

Usage:
var strLength2 = lenReg(str);

I used the following snippet to test two of the functions above (lenReg and lenFor), mainly to measure their running time:

var s = '...'; // a long string, which is not listed here

function a() {
    var timeStart, timeEnd;
    timeStart = new Date();
    var s1 = lenReg(s);
    timeEnd = new Date();
    var t1 = (timeEnd - timeStart) * 1000; // elapsed milliseconds x 1000
    timeStart = new Date();
    var s2 = lenFor(s);
    timeEnd = new Date();
    var t2 = (timeEnd - timeStart) * 1000; // elapsed milliseconds x 1000
    alert('lenReg: ' + s1 + ' time: ' + t1 + '\nlenFor: ' + s2 + ' time: ' + t2);
}
window.onload = function () {
    a();
};

When this code loads in the browser, an alert box appears with two lines of information: the first shows the byte length computed by the regular expression and the time it took (×1000); the second shows the byte length computed by the for loop and the time it took (×1000).

I got two different results.

First:

lenReg: 25824 time: 20000

lenFor: 25824 time: 10000

Second:

lenReg: 48795 time: 15000

lenFor: 48795 time: 25000

Note that both tests used the same string.

Why is the difference so big? What did I quietly change? As mentioned earlier, "Chinese characters occupy 2 bytes (related to encoding)" (the third section in this article). How many bytes a Chinese character occupies depends on the encoding: 2 bytes in GB2312, and 3 bytes for most Chinese characters in UTF-8. ISO-8859-1 is a single-byte encoding that cannot represent Chinese characters at all, so when the page is decoded as ISO-8859-1, every byte of the file is read as a separate character and the same source text becomes a different string with a different count.

Yes, the problem is the document encoding. The first result was produced with charset=UTF-8, and the second with charset=iso-8859-1.
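The functions above apply a fixed rule (2 bytes for any char code above 255), which matches double-byte counting in the style of GB2312 but not UTF-8. If the actual UTF-8 size is what you need, one option that the original article does not use is the standard TextEncoder API, available in modern browsers and Node.js. The helper name utf8ByteLength is my own.

```javascript
// Hypothetical helper: number of bytes the string occupies when encoded as UTF-8.
function utf8ByteLength(str) {
    return new TextEncoder().encode(str).length;
}

console.log(utf8ByteLength('abc'));  // 3
console.log(utf8ByteLength('中'));   // 3 (the fixed-rule functions report 2)
console.log(utf8ByteLength('é'));    // 2 (the fixed-rule functions report 1)
```

TextEncoder always encodes to UTF-8, so this result does not depend on the charset declared for the page.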

Chinese web pages generally do not use charset=iso-8859-1 (Chinese text comes out garbled); they use charset=UTF-8 or GB2312. So the case that matters is the first one above:
lenReg: 25824 time: 20000
lenFor: 25824 time: 10000
As these numbers show, the regular expression takes about twice as long as the for loop. (Over repeated runs the ratio was not always two to one, but it was in most of them.)

Why?

str.replace(/[^\x00-\xff]/g, '**').length;

Look at the statement above (the body of the lenReg function). In my understanding, the problem is here: replace has to traverse the whole string and build a new one, and only then is length read. (Reading length does not traverse the string again, since it is a stored property, so the extra work is more likely the construction of the new string.) The for loop traverses the string once and builds nothing. This is probably the cause, but I am not sure.

I'm not sure this understanding is correct, but on the surface the analysis points that way.

So is the regular-expression approach inherently more complex, or does the code above simply fail to take full advantage of regular expressions? I don't know yet and need to think about it further, so I'll leave the question open. ^_^
