Replacing a simple linear traversal with a hash table lookup
is a common optimization.
After the hash value has been computed from the key,
the final bucket index is usually obtained with a modulo operation,
so that the index never runs past the end of the bucket array.
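As a minimal sketch of that reduction step (the hash function toy_hash and the helper bucket_index_mod are hypothetical names made up for illustration, not taken from any particular library):

```c
#include <stddef.h>

/* Hypothetical toy hash for illustration: multiply-and-add over the bytes. */
static unsigned int toy_hash(const char *key)
{
    unsigned int h = 0;

    while (*key)
        h = h * 31u + (unsigned char)*key++;
    return h;
}

/* Reduce a hash value to a bucket index in the range [0, nbuckets). */
static unsigned int bucket_index_mod(unsigned int hash, unsigned int nbuckets)
{
    return hash % nbuckets;
}
```

With this form nbuckets can be any nonzero value, for example a prime; the price is one division per lookup.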
The Linux kernel also uses hash tables extensively for lookups.
In the kernel, the bucket index is likewise obtained by reducing the hash value modulo the number of buckets.
However, current kernel hash table implementations
usually choose the number of buckets to be a power of two, 2^n,
and compute the bucket index with a bitwise AND against (2^n - 1).
For a non-negative hash value h, h & (2^n - 1) gives the same result as h % 2^n.
Either way, the goal is to determine the bucket index.
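The power-of-two case can be sketched as follows. The helper names are hypothetical and this is not the kernel's own code; it only illustrates the masking idea for unsigned hash values:

```c
#include <assert.h>

/* True when n is a nonzero power of two (exactly one bit set). */
static int is_power_of_two(unsigned int n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Bucket index for a table whose size is a power of two.
 * nbuckets - 1 has all low bits set, so the AND keeps exactly
 * the bits that hash % nbuckets would keep. */
static unsigned int bucket_index_and(unsigned int hash, unsigned int nbuckets)
{
    assert(is_power_of_two(nbuckets));
    return hash & (nbuckets - 1);
}
```

For example, with 16 buckets the mask is 0xf, so a hash of 0x1234 lands in bucket 4, the same value as 0x1234 % 16.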
So why does the kernel prefer the bitwise AND?
A small program, andmod.c, shows the difference between bitwise AND and modulo:
int
main(void)
{
    int a = 0x11;
    int b = 0x22;
    int c = 0x33;
    int d = 0x44;

    c = a & b;    /* bitwise AND */
    d = a % b;    /* modulo */
    return 0;
}
Compile this small program directly with gcc -o andmod andmod.c.
Do not enable optimization with the -O option:
this code has no observable effect,
so the compiler would optimize it away.
Then use objdump -d andmod to view the assembly code.
Special values such as 0x11 are used in the program to help analyze the disassembled code
Focus on the main function.
08048394 <main>:
 8048394:  55                     push   %ebp
 8048395:  89 e5                  mov    %esp,%ebp
 8048397:  83 ec 10               sub    $0x10,%esp
 804839a:  c7 45 f0 11 00 00 00   movl   $0x11,-0x10(%ebp)   # local variable a = 0x11
 80483a1:  c7 45 f4 22 00 00 00   movl   $0x22,-0xc(%ebp)    # b = 0x22
 80483a8:  c7 45 f8 33 00 00 00   movl   $0x33,-0x8(%ebp)    # c = 0x33
 80483af:  c7 45 fc 44 00 00 00   movl   $0x44,-0x4(%ebp)    # d = 0x44
 80483b6:  8b 45 f4               mov    -0xc(%ebp),%eax     # load b into eax
 80483b9:  8b 55 f0               mov    -0x10(%ebp),%edx    # load a into edx
 80483bc:  21 d0                  and    %edx,%eax           # bitwise AND, result in eax
 80483be:  89 45 f8               mov    %eax,-0x8(%ebp)     # store the result in c
 80483c1:  8b 45 f0               mov    -0x10(%ebp),%eax    # build the dividend: load a into eax
 80483c4:  89 c2                  mov    %eax,%edx           # copy a into edx
 80483c6:  c1 fa 1f               sar    $0x1f,%edx          # arithmetic right shift by 31 bits: edx holds the sign extension of a
 80483c9:  f7 7d f4               idivl  -0xc(%ebp)          # divide edx:eax by b; quotient in eax, remainder in edx
 80483cc:  89 55 fc               mov    %edx,-0x4(%ebp)     # store the remainder in d, the result of the modulo
 80483cf:  b8 00 00 00 00         mov    $0x0,%eax           # return value 0 in eax
 80483d4:  c9                     leave
 80483d5:  c3                     ret
You can see that the bitwise AND is computed with a single and instruction,
while the modulo is computed with the division instruction idivl (here a signed 32-bit division),
taking the remainder as the result.
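The signed division is there because a and b are declared int: for a negative dividend, C's % yields a remainder that is zero or negative, so the compiler cannot simply substitute a mask. A small sketch of the difference, with hypothetical helper names and assuming two's-complement int as on x86:

```c
/* Remainder as C computes it for signed ints (quotient truncates toward zero). */
static int signed_mod(int a, int b)
{
    return a % b;
}

/* Mask with b - 1; matches signed_mod only when a >= 0 and b is a power of two. */
static int and_mask(int a, int b)
{
    return a & (b - 1);
}
```

With a = 19 and b = 8 both helpers return 3, but with a = -3 the modulo gives -3 while the mask gives 5. Hash values are therefore best kept unsigned when the mask is used.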
According to an instruction cycle reference found online, coding_asm_-_intel_instruction_set_codes_and_cycles.pdf,
and taking the Pentium figures in its table as an example:
the bitwise AND uses 2 mov instructions and 1 and instruction,
a total of 3 CPU cycles.
The modulo operation uses 2 mov instructions, 1 sar instruction and 1 idivl instruction,
a total of 52 CPU cycles.
Of course, the cycle table is only a reference.
How large the gap is on a different machine, or after different compiler optimizations,
has to be analyzed case by case.
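One way to check the gap on a particular machine and compiler is a rough timing harness such as the sketch below. All names here are hypothetical, and clock() is a coarse timer, so the numbers should be read as an indication rather than a precise measurement:

```c
#include <stddef.h>
#include <time.h>

typedef unsigned int (*reduce_fn)(unsigned int hash, unsigned int nbuckets);

static unsigned int reduce_mod(unsigned int hash, unsigned int nbuckets)
{
    return hash % nbuckets;
}

static unsigned int reduce_and(unsigned int hash, unsigned int nbuckets)
{
    return hash & (nbuckets - 1);   /* valid only when nbuckets is a power of two */
}

/* Apply reduce to the hashes 0..iterations-1 and accumulate a checksum.
 * nbuckets is volatile so the compiler cannot turn the division into
 * a mask at compile time. Elapsed CPU seconds are written to *seconds. */
static unsigned long run_reduce(reduce_fn reduce, unsigned int iterations,
                                double *seconds)
{
    volatile unsigned int nbuckets = 1024;
    unsigned long checksum = 0;
    clock_t start = clock();

    for (unsigned int h = 0; h < iterations; h++)
        checksum += reduce(h, nbuckets);

    *seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
    return checksum;
}
```

Calling run_reduce once with reduce_mod and once with reduce_and over a large iteration count gives two timings to compare; the two checksums should be equal, which confirms that both reductions select the same buckets.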
So the reason the kernel uses bitwise AND instead of modulo is self-evident:
it saves CPU cycles and improves overall performance.
When the bucket index is computed very frequently during hash table lookups,
the CPU cycles saved are considerable.