"Lao Liu Talks Algorithms 001" Bit arithmetic done right: an analysis of an assembly implementation of the strlen function

Source: Internet
Author: User
Tags: arithmetic

First, the code:

The original author of this function is an unknown foreign programmer, and the source comes from the MASM32 development package; my thanks to them. The Chinese comments (translated here) were modified and added by Lao Liu:

    .486
    .model flat, stdcall
    option casemap:none

    .code

    OPTION PROLOGUE:NONE
    OPTION EPILOGUE:NONE

    align 4

    StrLen proc item:DWORD

        mov     eax, [esp+4]            ; get the parameter item, i.e. the string pointer
        lea     edx, [eax+3]            ; edx = pointer + 3
        push    ebp                     ; back up ebp and edi
        push    edi
        mov     ebp, 80808080h
      @@:
      REPEAT 3
        mov     edi, [eax]              ; edi = the 4 bytes just read
        add     eax, 4                  ; move the string pointer to the start of the next 4 bytes, i.e. +4
        lea     ecx, [edi-01010101h]    ; ecx = the four bytes, each minus 1
        not     edi                     ; edi = the four bytes, bitwise inverted
        and     ecx, edi                ; ecx = (each byte minus 1) AND (each byte inverted)
        and     ecx, ebp                ; test the 8th binary bit of each byte; 1 means the original byte = 0
        jnz     nxt                     ; a null was found: jump to nxt to locate it
      ENDM
        mov     edi, [eax]              ; same as above
        add     eax, 4
        lea     ecx, [edi-01010101h]
        not     edi
        and     ecx, edi
        and     ecx, ebp
        jz      @B                      ; no null: go back to the loop above; otherwise fall through to nxt
      nxt:
        test    ecx, 00008080h          ; test whether the null is in the first 2 bytes
        jnz     @F
        shr     ecx, 16                 ; not in the first 2 bytes: shift ecx right by 16 bits (2 bytes), so bytes 3 and 4 become bytes 1 and 2
        add     eax, 2                  ; string pointer + 2
      @@:
        shl     cl, 1                   ; shift the first byte left by 1 bit: if the first byte is null then CF=1, otherwise the second byte is null
        sbb     eax, edx                ; eax = eax - edx - CF = eax - (item+3) - CF = string length
        pop     edi
        pop     ebp
        ret     4

    StrLen endp

    OPTION PROLOGUE:PrologueDef
    OPTION EPILOGUE:EpilogueDef

    end

The author's low-level knowledge is clearly very strong, as the use of bit operations shows.
The code is not compact, but it is very efficient.
The price of that efficiency is code that is hard for a human to read directly, so let us analyze the few places that are hardest to understand.

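First, high efficiency: testing 4 bytes at a time
This corresponds to the mov / add / lea / not / and / and sequence that is unrolled four times in the main loop.
That sequence uses bit operations to decide, in a single pass over a register, whether any of its 4 bytes is null.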
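As a rough C model of that sequence (a sketch for illustration, not part of the MASM32 source; the name zero_byte_mask is made up here), the whole test reduces to one expression on a 32-bit word:

```c
#include <stdint.h>

/* Models: lea ecx,[edi-01010101h] / not edi / and ecx,edi / and ecx,ebp
   (with ebp = 80808080h). The result is nonzero exactly when at least
   one of the four bytes of w is zero. */
uint32_t zero_byte_mask(uint32_t w)
{
    uint32_t minus_one = w - 0x01010101u;  /* subtract 1 from every byte */
    uint32_t inverted  = ~w;               /* invert every byte          */
    return minus_one & inverted & 0x80808080u;
}
```

For example, zero_byte_mask(0x64636261) (the bytes "abcd" as loaded on x86) gives 0, while zero_byte_mask(0x00636261) ("abc" followed by a null) gives 0x80000000.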
We analyze a single byte and restate the code in mathematical terms; all decimal numbers below are unsigned 8-bit (ubyte) values.
Let a = the unsigned value of the byte.
Let f(x) be the function corresponding to bitwise NOT,
so f(x) = 255 - x.
b = a - 1
Because of wraparound, b = a - 1 for a ∈ [1,255], and b = 255 for a = 0.
We write this as the function b = g(a).
c = f(a)
d = b AND c
The result of a bitwise AND is never greater than either operand,
so d <= min(b, c).
For a ∈ [1,127]: d <= g(a), and the maximum of g(a) is g(127) = 126.
For a ∈ [128,255]: d <= f(a), and the maximum of f(a) is f(128) = 127.
For a = 0: d = 255 AND 255 = 255.
e = d AND 0x80
For a ∈ [1,127]: d <= 126, so the 8th binary bit of d is 0 and e = 0.
Similarly, for a ∈ [128,255]: d <= 127, so e = 0.
For a = 0: e = 0x80.
That is, e = 0 when a is not 0, and e = 0x80 when a is 0.
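Since a byte has only 256 possible values, the conclusion above can also be checked by brute force. The following C sketch does so; the function names byte_flag and count_mismatches are made up for this illustration and do not come from the original code.

```c
#include <stdint.h>

/* e = ((a - 1) AND (NOT a)) AND 0x80, in 8-bit unsigned arithmetic. */
uint8_t byte_flag(uint8_t a)
{
    uint8_t b = (uint8_t)(a - 1u);  /* g(a): wraps around to 255 when a == 0 */
    uint8_t c = (uint8_t)~a;        /* f(a) = 255 - a                        */
    uint8_t d = b & c;
    return d & 0x80u;
}

/* Counts the byte values for which "e == 0x80 exactly when a == 0" fails. */
int count_mismatches(void)
{
    int bad = 0;
    for (unsigned a = 0; a <= 255u; ++a) {
        uint8_t expected = (a == 0) ? 0x80u : 0x00u;
        if (byte_flag((uint8_t)a) != expected)
            ++bad;
    }
    return bad;
}
```

count_mismatches() returns 0, which agrees with the case analysis.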

Second, no register wasted: the final calculation of the string length
This corresponds to lea edx,[eax+3], the add eax,4 instructions, add eax,2, shl cl,1 and sbb eax,edx.
After reading the bytes to be tested, the author adds 4 to the pointer unconditionally (the add eax,4 instructions), whether or not a null was found.
This makes it easy to loop again and simplifies the code.
add eax,2 then moves the pointer forward by 2 when the null is not in the first two bytes.
shl cl,1 sets CF = the 8th binary bit of the flag byte for the byte being examined.
sbb eax,edx is the final calculation.
From part one, the 8th binary bit is 1 when the byte is null.
That is, when the byte being examined is null, CF = 1,
and the result of sbb = current pointer - string start pointer - 3 - 1.
When that byte is not null, the earlier tests tell us that the next byte is definitely null.
In that case the 8th binary bit is 0, so CF = 0,
and sbb subtracts one less, which again matches the actual string length.
This foreigner really knows his stuff: even while chasing efficiency he did not forget to trim the code a little.
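To make the pointer arithmetic easier to follow, here is a C sketch that mirrors the assembly step by step. It is an illustration under stated assumptions, not the MASM32 code: the names strlen_by_words, load_le32 and zero_byte_mask are invented here, the four unrolled copies are folded into one loop, and, like the assembly, it may read up to 3 bytes past the terminator, so the buffer must be readable there. One detail worth knowing: because the 32-bit subtraction borrows across bytes, a byte sitting above a null can be flagged too (0x61620100 yields 0x00008080 although only the lowest byte is zero). The lowest flagged byte is always a real null, and that is the only one the tail logic relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 7 is set in every zero byte of w; bytes above a zero byte may be
   flagged as well because the subtraction borrows across bytes. */
uint32_t zero_byte_mask(uint32_t w)
{
    return (w - 0x01010101u) & ~w & 0x80808080u;
}

/* Builds the value that "mov edi,[eax]" would load on x86 (little-endian). */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Assumption: at least 3 bytes after the terminator are readable. */
size_t strlen_by_words(const char *item)
{
    assert(item != NULL);
    const unsigned char *base = (const unsigned char *)item;
    size_t pos = 0;      /* plays the role of eax - item */
    uint32_t flags;      /* plays the role of ecx        */

    do {
        flags = zero_byte_mask(load_le32(base + pos));
        pos += 4;                        /* add eax,4 (unconditional)    */
    } while (flags == 0);

    if ((flags & 0x00008080u) == 0) {    /* test ecx,00008080h           */
        flags >>= 16;                    /* shr ecx,16                   */
        pos += 2;                        /* add eax,2                    */
    }
    unsigned cf = (flags >> 7) & 1u;     /* shl cl,1: CF = bit 7 of cl   */
    return pos - 3 - cf;                 /* sbb eax,edx, edx = item + 3  */
}
```

Tracing "abc" stored in a zero-padded buffer: the first word is flagged only in its top byte, so the mask is shifted down and pos becomes 6; CF is 0, and 6 - 3 - 0 = 3.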

The code also has a limitation: it counts bytes, so every double-byte character is counted as length 2.
For a foreigner programming in English, though, that is understandable.
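The difference between byte length and character count can be sketched in C as follows. The helper dbcs_char_count is hypothetical and deliberately simplified: it assumes a GBK-style encoding and treats any byte of 0x81 or above as the lead byte of a two-byte character, without validating the trail byte.

```c
#include <stddef.h>
#include <string.h>

/* Counts characters in a GBK-style double-byte string.
   Simplifying assumption: any byte >= 0x81 starts a two-byte character. */
size_t dbcs_char_count(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t chars = 0;
    while (*p != 0) {
        if (*p >= 0x81 && p[1] != 0)
            p += 2;   /* lead byte + trail byte = one character */
        else
            p += 1;   /* single-byte (ASCII) character          */
        ++chars;
    }
    return chars;
}

/* The bytes C4 E3 BA C3 encode two characters in GB2312/GBK; '!' is ASCII. */
const char dbcs_sample[] = {
    (char)0xC4, (char)0xE3, (char)0xBA, (char)0xC3, '!', 0
};
```

For dbcs_sample, a byte-counting routine such as strlen reports 5, while dbcs_char_count reports 3.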
The code quality can fairly be called very high; it just makes life painful for whoever has to analyze it (ˉ▽ˉ;) ...
Perhaps this is what "maximum efficiency and readability cannot both be had" means!

