JavaScript String Byte Length Functions and Efficiency Analysis (for Loop vs. Regular Expression)

Last Update:2018-12-08 Source: Internet

Author: User


Let's look at two pieces of code that use a for loop and a regular expression, respectively, to compute the length of a string in bytes:

Method 1: byte length with a for loop

    var lenfor = function (str) {
        var byteLen = 0, len, i;
        if (str) {
            len = str.length;
            for (i = 0; i < len; i++) {
                // Code units above 255 are counted as 2 bytes
                if (str.charCodeAt(i) > 255) {
                    byteLen += 2;
                } else {
                    byteLen++;
                }
            }
            return byteLen;
        } else {
            return 0;
        }
    };

Usage:

    var strLength = lenfor(str);

Method 2: byte length with a for loop

    function len(str) {
        var i, code, sum = 0;
        for (i = 0; i < str.length; i++) {
            code = str.charCodeAt(i);
            // Code units 0-255 count as 1 byte, everything else as 2
            if (code >= 0 && code <= 255) {
                sum = sum + 1;
            } else {
                sum = sum + 2;
            }
        }
        return sum;
    }

Method 3: byte length with a regular expression

The code is more concise, but according to the test below it is less efficient, so you may prefer the functions above.

    var lenreg = function (str) {
        // Replace every character outside \x00-\xFF with two placeholders, then measure
        return str.replace(/[^\x00-\xff]/g, '**').length;
    };

Usage:

    var strLength2 = lenreg(str);

I used the following snippet to test lenreg and lenfor, mainly to compare their running time:

    var s = '...'; // a long string, not listed here

    function a() {
        var timeStart, timeEnd;
        timeStart = new Date();
        var s1 = lenreg(s);
        timeEnd = new Date();
        var t1 = (timeEnd - timeStart) * 1000; // elapsed milliseconds x 1000
        timeStart = new Date();
        var s2 = lenfor(s);
        timeEnd = new Date();
        var t2 = (timeEnd - timeStart) * 1000;
        alert('lenreg: ' + s1 + ' time: ' + t1 + '\nlenfor: ' + s2 + ' time: ' + t2);
    }
    window.onload = function () {
        a();
    };

When the code above is loaded in a browser, an alert window appears with two lines of information: the first line shows the byte length and time (×1000) measured with the regular expression; the second shows the byte length and time (×1000) measured with the for loop.

I got two different results.

First:

    lenreg: 25824 time: 20000
    lenfor: 25824 time: 10000

Second:

    lenreg: 48795 time: 15000
    lenfor: 48795 time: 25000

Note that both tests used the same string.

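Why is the difference so large? What did I quietly change? As mentioned above, "Chinese characters occupy 2 bytes (related to encoding)" (the third section of this article): the number of bytes a Chinese character occupies depends on the encoding. In GB2312 a Chinese character occupies 2 bytes. In UTF-8 most Chinese characters occupy 3 bytes, although the functions above still count them as 2. ISO-8859-1 cannot represent Chinese characters at all, so a page declared with that charset has its bytes misread and the same text is counted differently (in this case, as 5 bytes per Chinese character).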

Yes, the problem is the document encoding. In the first case the page was declared with charset=UTF-8, and in the second with charset=iso-8859-1.
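
To make the encoding effect concrete, the sketch below compares the 2-bytes-per-character rule with the real UTF-8 size reported by the standard TextEncoder API, and then simulates a misread page. The helper names `utf8Length` and `misreadAsLatin1` are hypothetical, and the simulation is a simplification: it maps each UTF-8 byte to the character with the same code point, which is how strict ISO-8859-1 decoding behaves, whereas real browsers may decode some bytes differently.

```javascript
// The counting rule from this article: 2 bytes per code unit above 255.
function lenfor(str) {
  let byteLen = 0;
  for (let i = 0; i < str.length; i++) {
    byteLen += str.charCodeAt(i) > 255 ? 2 : 1;
  }
  return byteLen;
}

// Actual UTF-8 size, measured by encoding the string.
function utf8Length(str) {
  return new TextEncoder().encode(str).length;
}

// Simplified simulation (assumption): read UTF-8 bytes as ISO-8859-1,
// so every byte becomes one character with the same code point.
function misreadAsLatin1(str) {
  const bytes = new TextEncoder().encode(str);
  return Array.from(bytes, function (b) {
    return String.fromCharCode(b);
  }).join('');
}

const text = '中文';
const garbled = misreadAsLatin1(text);

console.log(lenfor(text));      // 4: two characters counted as 2 bytes each
console.log(utf8Length(text));  // 6: each character is 3 bytes in UTF-8
console.log(garbled.length);    // 6: the misread string has six characters
console.log(lenfor(garbled));   // 6: each garbled character counts as 1 byte
```

The point is that the functions measure whatever string the browser produced after decoding, so a different charset declaration yields a different input and therefore a different count.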

On Chinese web pages we generally do not use charset=iso-8859-1 (Chinese text would be garbled) but charset=UTF-8 or GB2312, so the case that matters is the first one above:

    lenreg: 25824 time: 20000
    lenfor: 25824 time: 10000

As shown above, the regular expression takes about twice as long as the for loop. (Across repeated tests the ratio was not always exactly two, but it was in most runs.)

Why?

    str.replace(/[^\x00-\xff]/g, '**').length;

Look at the statement above (from the lenreg function). My personal understanding is that the problem lies here: replace has to traverse the string once, and I assumed that reading length traverses it again, so the whole operation walks the string twice, whereas the for loop walks it only once. In practice length is a stored property that is read without another traversal, so a more likely cost is that replace has to build an entire new string before its length can be read.

I'm not sure whether this understanding is correct, but on the surface the analysis points that way.

So, is the regular-expression approach inherently more complex as an algorithm? Or does the code above simply fail to take full advantage of regular expressions? I don't know yet and need to think about it further, so I'll leave the question open. ^_^

This article is an English version of an article originally published in Chinese on aliyun.com.
