Using bitwise AND operations instead of modulo operations

Source: Internet
Author: User

Building a hash table instead of doing a simple linear search

is a common optimization technique.

Once the hash value has been computed from the key,

the final bucket index is usually obtained with a modulo operation,

so that the index never runs past the end of the bucket array.

The Linux kernel also uses a large number of hash tables for lookups.

Some kernel code computes the bucket index with the modulo method as well,

but hash tables implemented in the kernel nowadays

usually choose a bucket count of 2^n

and compute the bucket index with a bitwise AND against (2^n - 1).

Either way, the goal is to determine the bucket index.
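
To make the equivalence concrete, here is a minimal sketch in C. The function names bucket_mod, bucket_mask and is_power_of_two are made up for this illustration and do not come from the kernel. It assumes unsigned 32-bit hash values and a bucket count that is a power of two; under those assumptions both functions return the same index.

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero if n is a power of two (n > 0 with exactly one bit set). */
static int is_power_of_two(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Bucket index via modulo: works for any nonzero bucket count. */
static uint32_t bucket_mod(uint32_t hash, uint32_t nbuckets)
{
    return hash % nbuckets;
}

/* Bucket index via mask: only valid when nbuckets is a power of two,
 * because (nbuckets - 1) then has exactly the low n bits set. */
static uint32_t bucket_mask(uint32_t hash, uint32_t nbuckets)
{
    assert(is_power_of_two(nbuckets));
    return hash & (nbuckets - 1);
}
```

For example, with 16 buckets the mask is 0xf, so only the low four bits of the hash are kept, which is exactly the remainder of dividing by 16.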

So why does the kernel prefer bitwise AND to modulo?

A small program, andmod.c, shows the difference between bitwise AND and modulo:

int
main(void)
{
    int a = 0x11;
    int b = 0x22;
    int c = 0x33;
    int d = 0x44;

    c = a & b;    /* bitwise AND */
    d = a % b;    /* modulo */

    return 0;
}

Compile the program directly with gcc -o andmod andmod.c.

Do not enable optimization with the -O option,

because this code has no observable effect

and the compiler would optimize it away.

Then use objdump -d andmod to view the assembly code.

Distinctive values such as 0x11 are used in the program to make the disassembly easier to follow.

Focus on the main function.

08048394 <main>:
 8048394: 55                    push   %ebp
 8048395: 89 e5                 mov    %esp,%ebp
 8048397: 83 ec 10              sub    $0x10,%esp
 804839a: c7 45 f0 11 00 00 00  movl   $0x11,-0x10(%ebp)  # local variable a = 0x11
 80483a1: c7 45 f4 22 00 00 00  movl   $0x22,-0xc(%ebp)   # b = 0x22
 80483a8: c7 45 f8 33 00 00 00  movl   $0x33,-0x8(%ebp)   # c = 0x33
 80483af: c7 45 fc 44 00 00 00  movl   $0x44,-0x4(%ebp)   # d = 0x44
 80483b6: 8b 45 f4              mov    -0xc(%ebp),%eax    # load b into register eax
 80483b9: 8b 55 f0              mov    -0x10(%ebp),%edx   # load a into register edx
 80483bc: 21 d0                 and    %edx,%eax          # bitwise AND, result in eax
 80483be: 89 45 f8              mov    %eax,-0x8(%ebp)    # store the result in c
 80483c1: 8b 45 f0              mov    -0x10(%ebp),%eax   # build the dividend: load a into eax
 80483c4: 89 c2                 mov    %eax,%edx          # copy a into edx
 80483c6: c1 fa 1f              sar    $0x1f,%edx         # arithmetic right shift by 31 bits (sign-extend a into edx)
 80483c9: f7 7d f4              idivl  -0xc(%ebp)         # divide edx:eax by b; quotient in eax, remainder in edx
 80483cc: 89 55 fc              mov    %edx,-0x4(%ebp)    # store the remainder in d, the result of the modulo
 80483cf: b8 00 00 00 00        mov    $0x0,%eax          # return value 0 in eax
 80483d4: c9                    leave
 80483d5: c3                    ret

As you can see, bitwise AND is computed with the and instruction,

while modulo is computed with the division instruction idivl (a signed 32-bit division here),

taking the remainder as the result.
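
One caveat worth spelling out: the two operations agree only for non-negative values. The sketch below uses hypothetical helper names (rem_signed, mask_signed) and assumes a two's-complement int, which is what gcc on x86 provides. Since C99 the % operator truncates toward zero, so a negative dividend gives a negative remainder, while the mask always yields a value in the range 0 to b - 1.

```c
#include <assert.h>

/* Remainder as computed by C's % operator (truncates toward zero since C99). */
static int rem_signed(int a, int b)
{
    return a % b;
}

/* Mask with (b - 1); b is assumed to be a positive power of two.
 * Assumes two's-complement representation for negative a. */
static int mask_signed(int a, int b)
{
    assert(b > 0 && (b & (b - 1)) == 0);
    return a & (b - 1);
}
```

With a = 13 and b = 8 both functions return 5, but with a = -5 and b = 8 the remainder is -5 while the mask gives 3. This is one reason hash values are normally kept unsigned when the mask is used as a bucket index.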

According to an instruction timing table found online, coding_asm_-_intel_instruction_set_codes_and_cycles.pdf,

and taking the Pentium figures in that table as an example:

The bitwise AND uses 2 mov instructions and 1 and instruction,

for a total of 3 CPU cycles.

The modulo operation uses 2 mov instructions, 1 sar instruction and 1 idivl instruction,

for a total of 52 CPU cycles.

Of course, the cycle table is only a reference.

How large the gap is on different machines and with different compiler optimizations

has to be measured case by case.

So the reason the kernel uses bitwise AND instead of modulo is clear:

it saves CPU cycles and improves overall performance.

Because the bucket index is computed very frequently during hash table lookups,

the CPU cycles saved add up to a considerable amount.
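
As a rough illustration of where the bucket index sits on the lookup path, here is a toy chained hash table in C. Everything in it (TABLE_BITS, toy_hash, struct entry, table_insert, table_lookup) is invented for this sketch and is not the kernel's implementation; the hash function is an arbitrary multiplicative mixer. The point is only that every insert and every lookup calls bucket_index, so that is where a mask replaces a division.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_BITS 4u                   /* hypothetical size: 2^4 buckets */
#define TABLE_SIZE (1u << TABLE_BITS)   /* always a power of two */
#define TABLE_MASK (TABLE_SIZE - 1u)

struct entry {
    uint32_t key;
    int value;
    struct entry *next;                 /* chain of entries in one bucket */
};

struct table {
    struct entry *buckets[TABLE_SIZE];
};

/* Arbitrary multiplicative mixer, for illustration only. */
static uint32_t toy_hash(uint32_t key)
{
    return key * 2654435761u;
}

/* Map a key to a bucket with a single AND instead of a division. */
static uint32_t bucket_index(uint32_t key)
{
    return toy_hash(key) & TABLE_MASK;
}

/* Push the entry onto the front of its bucket's chain. */
static void table_insert(struct table *t, struct entry *e)
{
    uint32_t i;

    assert(t != NULL && e != NULL);
    i = bucket_index(e->key);
    e->next = t->buckets[i];
    t->buckets[i] = e;
}

/* Walk one bucket's chain; returns NULL if the key is absent. */
static struct entry *table_lookup(const struct table *t, uint32_t key)
{
    struct entry *e;

    for (e = t->buckets[bucket_index(key)]; e != NULL; e = e->next)
        if (e->key == key)
            return e;
    return NULL;
}
```

Because TABLE_SIZE is fixed at a power of two, growing the table in this sketch would mean increasing TABLE_BITS rather than picking an arbitrary bucket count.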
