GLIBC strlen source code analysis


Tony Bai

Using the string functions provided by the C standard library directly is risky: a moment of carelessness can cause memory errors. This week I wrote a small safe string library in my spare time. Testing, however, showed that my implementation had serious performance problems.

I first ran a simple performance comparison on Solaris. The measurements below use strlen as the example.

With an input string of length 10:
strlen execution time: 32762 milliseconds
my_strlen execution time: 491836 milliseconds

With an input string of length 20:
strlen execution time: 35075 milliseconds
my_strlen execution time: 770397 milliseconds

Clearly, the standard library's strlen costs far less than my_strlen. Its cost also barely grows with the string length (about 7% when the length doubles from 10 to 20), whereas my_strlen's cost grows noticeably (about 57%). As you have probably guessed, my_strlen uses the traditional implementation, testing one byte at a time for the terminator, which is consistent with these measurements. Wanting to get to the bottom of this, I looked up the source of the strlen implementation in the GNU C library (glibc) to see what techniques it uses to achieve such performance. To be honest, I am still a beginner at performance optimization, and it is an area I intend to work on.

I downloaded the complete glibc source package and found strlen.c in the string subdirectory. This is the source of the strlen implementation used by most UNIX platforms, Linux platforms, and the vast majority of GNU software. The code was written by Torbjorn Granlund (who also implemented memcpy), with help and commentary from Jim Blandy and Dan Sahlin. Including comments, strlen in glibc runs to nearly 130 lines, and it is hard to follow without a careful reading. The following is an excerpt of the strlen source; I will write up my understanding of this code afterwards:

#include <string.h>
#include <stdlib.h>

/* Return the length of the null-terminated string STR.  Scan for
   the null terminator quickly by testing four bytes at a time.  */
size_t
strlen (str)
     const char *str;
{
  const char *char_ptr;
  const unsigned long int *longword_ptr;
  unsigned long int longword, magic_bits, himagic, lomagic;

  /* Handle the first few characters by reading one character at a time.
     Do this until CHAR_PTR is aligned on a longword boundary.  */

  for (char_ptr = str; ((unsigned long int) char_ptr
                        & (sizeof (longword) - 1)) != 0;
       ++char_ptr)
    if (*char_ptr == '\0')
      return char_ptr - str;

  /* All these elucidatory comments refer to 4-byte longwords,
     but the theory applies equally well to 8-byte longwords.  */

  longword_ptr = (unsigned long int *) char_ptr;

  himagic = 0x80808080L;
  lomagic = 0x01010101L;
  if (sizeof (longword) > 4)
    {
      /* 64-bit version of the magic.  Do the shift in two steps to
         avoid a warning if long has 32 bits.  */
      himagic = ((himagic << 16) << 16) | himagic;
      lomagic = ((lomagic << 16) << 16) | lomagic;
    }
  if (sizeof (longword) > 8)
    abort ();

  /* Instead of the traditional loop which tests each character,
     we will test a longword at a time.  The tricky part is testing
     if *any of the four* bytes in the longword in question are zero.  */

  for (;;)
    {
      longword = *longword_ptr++;

      if (((longword - lomagic) & himagic) != 0)
        {
          /* Which of the bytes was the zero?  If none of them were, it was
             a misfire; continue the search.  */

          const char *cp = (const char *) (longword_ptr - 1);

          if (cp[0] == 0)
            return cp - str;
          if (cp[1] == 0)
            return cp - str + 1;
          if (cp[2] == 0)
            return cp - str + 2;
          if (cp[3] == 0)
            return cp - str + 3;
          if (sizeof (longword) > 4)
            {
              if (cp[4] == 0)
                return cp - str + 4;
              if (cp[5] == 0)
                return cp - str + 5;
              if (cp[6] == 0)
                return cp - str + 6;
              if (cp[7] == 0)
                return cp - str + 7;
            }
        }
    }
}
