Log2 Rounding Efficiency test

Source: Internet
Author: User
Tags: cmath

The RMQ problem can be solved with the ST (sparse table) algorithm, and of course there are other standard algorithms as well. LCA problems can be reduced to a restricted RMQ (RMQ±1) problem. Let us say that a query in these problems has $O(1)$ time complexity. Note, however, that an RMQ (or RMQ±1) instance has a bounded length, which we denote by n. Each query asks about a range [l,r] with 1<=l<=r<=n, and the length of this range is r-l+1. On the sparse table we split the range into two sub-intervals whose power-of-two length is determined by int_log2(r-l+1)-1 and take the min of the two, that is, we merge the information of the two sub-ranges.

So the question is: computing int_log2(r-l+1)-1 also takes time. Some people recommend floor(log(n)/lg2)-1 (where lg2 is the constant ln 2), which depends heavily on the log function of the cmath library, and subconsciously assume that it is $O(1)$. I am very much against this kind of "invisible" cost, so I came up with a series of algorithms to compute int_log2(n)-1 and ran some tests to compare their efficiency.

1) Cmath: log function solution
2) Iterate: iterative right-shift solution
3) Binary: binary-search right-shift solution
4) Float conversion: convert to a floating-point number and use bit operations

Here is a brief description of these solutions.

Log function Solver

The function log is available in the <cmath> library. Call log directly.

Code

// L2 is ln 2 (defined as 0.6931471805599453 in the test program)
inline int ilog2_cmath(int n) { return floor(log(n + .0) / L2) - 1; }

That's it. Very simple.

Iterative right-shift solver

We repeatedly shift n right and exit the loop when n becomes 0. Each iteration adds 1 to a counter.

inline int ilog2_iter(int n) {
    int i;
    for (i = 0, n >>= 1; n; ++i) n >>= 1;
    return i - 1;
}

Binary-search right-shift solver

We binary-search over the bits of n.

inline int ilog2_bin(int n) {
    int i = 0;
    if (n >> 16) i |= 16, n >>= 16;
    if (n >> 8) i |= 8, n >>= 8;
    if (n >> 4) i |= 4, n >>= 4;
    if (n >> 2) i |= 2, n >>= 2;
    if (n >> 1) i |= 1, n >>= 1;
    return i - 1;
}

Float conversion and bit operation solver

This method is more difficult to understand.

We need to start with the construction of floating-point numbers.

Float: [1-bit sign][8-bit exponent][23-bit mantissa]

00000000101000100010001000100010
^                                 sign bit
 ^------^                         exponent bits
         ^---------------------^  mantissa bits (significant digits; the leading 1 is omitted)

What we want is the information in the exponent bits.

The low 5 bits of the exponent field happen to be exactly int_log2(n)-1: the exponent is stored with a bias of 127, and 127 & 31 = 31, which acts as -1 modulo 32. So we don't even have to subtract 1. This is a very nice property.

So we just need to convert the integer to a float, shift its bit pattern right by 23, and then mask the result with the & operation.

inline int ilog2_kf(int n) { float q = (float)n; return (*(int*)&q) >> 23 & 31; }

The code is short, and it is very fast with -O3 enabled.

Test

Practice is the only criterion to test truth.

Code

#define SIZEX 100000000
#define L2 0.6931471805599453
#include <cmath>
inline int ilog2_cmath(int n) { return floor(log(n + .0) / L2) - 1; }
inline int ilog2_iter(int n) { int i; for (i = 0, n >>= 1; n; ++i) n >>= 1; return i - 1; }
inline int ilog2_bin(int n) {
    int i = 0;
    if (n >> 16) i |= 16, n >>= 16;
    if (n >> 8) i |= 8, n >>= 8;
    if (n >> 4) i |= 4, n >>= 4;
    if (n >> 2) i |= 2, n >>= 2;
    if (n >> 1) i |= 1, n >>= 1;
    return i - 1;
}
inline int ilog2_kf(int n) { float q = (float)n; return (*(int*)&q) >> 23 & 31; }
#include <cstdio>
#include <cstdlib>
#include <random>
#include <sys/time.h>
// Import only the names we need, so the global `data` does not clash with std::data.
using std::mt19937;
using std::uniform_int_distribution;
int *data, res;
// Current wall-clock time in microseconds.
long long mytic() {
    long long result = 0;
    struct timeval tv;
    gettimeofday(&tv, NULL);
    result = ((long long)tv.tv_sec) * 1000000 + (long long)tv.tv_usec;
    return result;
}
#define DIC1() DisA(generator)
// Fill data[0..a-1] with random non-negative ints.
void gendata(int a) {
    mt19937 generator;
    uniform_int_distribution<int> DisA(0, 2147483647);
    int i = 0;
    for (; i < a; ++i) data[i] = DIC1();
}
void testn(int k) {
    int i;
    printf("Cmath log method\n");
    long long start = mytic();
    for (i = 0; i < k; ++i) { res = ilog2_cmath(data[i]); }
    start = mytic() - start;
    printf("%d\n", res);
    printf("Time usage: %lld us\n", start);
}
void testu(int k) {
    int i;
    printf("Iterate log method\n");
    long long start = mytic();
    for (i = 0; i < k; ++i) { res = ilog2_iter(data[i]); }
    start = mytic() - start;
    printf("%d\n", res);
    printf("Time usage: %lld us\n", start);
}
void testp(int k) {
    int i;
    printf("Binary divide log method\n");
    long long start = mytic();
    for (i = 0; i < k; ++i) { res = ilog2_bin(data[i]); }
    start = mytic() - start;
    printf("%d\n", res);
    printf("Time usage: %lld us\n", start);
}
void testup(int k) {
    int i;
    printf("Float conversion log method\n");
    long long start = mytic();
    for (i = 0; i < k; ++i) { res = ilog2_kf(data[i]); }
    start = mytic() - start;
    printf("%d\n", res);
    printf("Time usage: %lld us\n", start);
}
int main() {
    int a, n, u, p, up;  // a: data size; n, u, p, up: flags selecting which tests to run
    data = (int*)malloc(400000000 * sizeof(int));
    while (printf("0 to Quit>"), scanf("%d", &a), a) {
        printf("Cmlog Iter Bina flcv\n"), scanf("%d%d%d%d", &n, &u, &p, &up);
        if (a > 400000000) continue;
        gendata(a);
        if (n) testn(a);
        if (u) testu(a);
        if (p) testp(a);
        if (up) testup(a);
        printf("%d %d %d %d\n", ilog2_cmath(a), ilog2_iter(a), ilog2_bin(a), ilog2_kf(a));
    }
    free(data);
    return 0;
}

Test results

Data size: 4x10^8 numbers

cmath log: 19277285 us ~ 19.3 s
iterate: 6197113 us ~ 6.2 s, 3.1x faster than cmath log
binary iterate: 2018023 us ~ 2.0 s, 3.1x faster than iterate
float bit operation: 406996 us ~ 0.41 s, 5.0x faster than binary iterate and 47.4x faster than cmath log

The data is correct. The results are correct.

(Machine data: i7 4700M 2.0 GHz (16 GB = 15.6 GiB RAM, DDR3 800 MHz)?)
(Compile command: gcc ... -O3)

Results analysis

The fourth method is particularly fast. In fact, the timing shows that each operation takes almost exactly 2 clock cycles: 0.41 s for 4x10^8 calls is about 1 ns per call, which is about 2 cycles at 2.0 GHz.

Thus it appears that int-to-float conversion on the i7 costs about 1 clock cycle.

Below is the -O2 data. Columns: Cmlog, Iter, Binary, Float conversion bit operation. Unit: us. The data are randomly generated by the mt19937 random number algorithm.

19301840 6186242 2379282 400056

Below is the -O1 data.

19267001 6466776 2446129 385642

Below is the -O data.

19302953 6472134 2460882 400694

Below is the -Os data.

19247815 8508664 2500930 390131

Below is the data without any optimization option.

19198380 25362664 6802623 1290716

Below is the data for -O3 with -march=corei7-avx.

19301717 6196286 2010706 377347

Data analysis:

With any optimization option enabled, FCLM (the float conversion log method) is the fastest, at around 0.4 s. Without optimization FCLM is still the fastest, at about 1.3 s.

With -Os, Iter went from 6.2 s to 8.5 s, a considerable slowdown.

Without optimization, everything except cmath log (which is precompiled and linked directly) is much slower, by around 3-4x. This is probably function call overhead!

-march=corei7-avx reduces the FCLM time.

Program optimization note: wherever -O2 is enabled, you basically don't need to worry about speed.

Evaluation program: the test program above, modified and checked.

Programming recommendation: use the FCLM method for int_log2 applications such as highbit.
Advantages:
1) Short code.
2) No branches. It takes advantage of the processor architecture.
3) The code is not easy to read.
4) It can be used on any IEEE 754 compatible machine. (There are almost no machines on which it cannot be used.)
5) Very, very fast. It is almost as fast as lowbit, although turning a lowbit result into an exponent still requires FCLM.

Disadvantages:

1) It is not available on machines that are particularly old or particularly exotic.
2) It is not available in dynamically typed languages. This can be worked around with a native extension or with the binary method. Dynamic languages do not demand high efficiency anyway.

The problem is solved perfectly.

An algorithm that takes 2 clock cycles cannot be considered $O(\log{\log{n}})$ in any case. It is clearly an $O(1)$ time complexity algorithm.
