Transferred from: http://blog.csdn.net/kailan818/article/details/6731772
English Original: http://www.gratisoft.us/todd/papers/strlcpy.html
Original authors: Todd C. Miller, Theo de Raadt
Translator: Linhai Maple
Translation Address: http://blog.csdn.net/linyt/archive/2009/07/27/4383328.aspx
Note: The copyright of this translation is owned by the translator. Reprints are welcome, but please credit the translator and the original, and do not use it for any commercial purpose.
strlcpy and strlcat: consistent, safe string copy and concatenation
Todd C. Miller
University of Colorado, Boulder
Theo de Raadt
OpenBSD Project
Overview
With the rise of buffer overflow attacks, more and more programmers are using size- or length-bounded string functions such as strncpy() and strncat(). While this trend is encouraging, the standard C string functions generally used were not really designed for the task. This paper presents an alternative, intuitive, and consistent API designed with safe string copies in mind.
Several problems arise when strncpy() and strncat() are used as safe versions of strcpy() and strcat(). First, the two functions deal with NUL termination and the length parameter in different and non-intuitive ways that confuse even experienced programmers. Second, they provide no easy way to detect when truncation occurs. Finally, strncpy() zero-fills the remainder of the destination string, which incurs a performance penalty. Of all these issues, the confusion caused by the length parameters and the related issue of NUL termination are the most important. When we audited the OpenBSD source tree for potential security holes, we found rampant misuse of strncpy() and strncat(). While not all of these resulted in exploitable security holes, they made it clear that the rules for using strncpy() and strncat() in safe string operations are widely misunderstood. The proposed replacement functions, strlcpy() and strlcat(), address these problems by presenting an API designed for safe string copies (see Figure 1 for the function prototypes). Both functions guarantee NUL termination, take the size of the destination in bytes as their length parameter, and provide an easy way to detect truncation. Neither function zero-fills unused bytes in the destination.
Introduction
In 1996, the authors, along with other members of the OpenBSD project, undertook an audit of the OpenBSD source tree looking for security problems, starting with an emphasis on buffer overflows. Buffer overflows [1] had recently received a lot of attention in forums such as BugTraq [2] and were being widely exploited. We found a large number of overflows caused by unbounded string copies using sprintf(), strcpy(), and strcat(), as well as by loops that manipulated strings without an explicit length check. In addition, we found many cases where the programmer had tried to do safe string manipulation with strncpy() and strncat() but had failed to grasp the subtleties of the API.
Thus, when auditing code, we found that it was not only necessary to check for unsafe functions such as strcpy() and strcat(), but also for incorrect use of strncpy() and strncat(). Checking for correct usage is not always obvious, especially in the case of "static" variables or buffers allocated via calloc(), which are effectively pre-terminated. We came to the conclusion that a foolproof alternative to strncpy() and strncat() was needed, primarily to simplify the job of the programmer, but also to make code auditing easier.
size_t strlcpy(char *dst, const char *src, size_t size);
size_t strlcat(char *dst, const char *src, size_t size);
Figure 1: ANSI C prototypes for strlcpy() and strlcat()
Common Misconceptions
The most common misconception is that strncpy() always NUL-terminates the destination string. This is only true, however, if the length of the source string is less than the size parameter. This becomes a problem when copying user input of arbitrary length into a fixed-size buffer. The safest way to use strncpy() in this situation is to pass it one less than the size of the destination buffer and then terminate the string by hand. That way the destination string is guaranteed to be NUL-terminated. Strictly speaking, it is not necessary to terminate the string by hand if it is a "static" variable or was allocated via calloc(), since such buffers are zeroed when allocated. However, relying on this feature is generally confusing to those who must later maintain the code.
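The pitfall and the hand-termination idiom described above can be sketched as follows. This is a minimal illustration rather than code from the paper; the helper names copy_bounded and strncpy_leaves_unterminated, and the buffer sizes, are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst (a buffer of dstsize bytes)
 * with strncpy() and always leave dst NUL-terminated.
 */
static void copy_bounded(char *dst, const char *src, size_t dstsize)
{
    assert(dstsize > 0);
    strncpy(dst, src, dstsize - 1);   /* copy at most dstsize - 1 bytes */
    dst[dstsize - 1] = '\0';          /* terminate by hand */
}

/*
 * Returns 1 if a plain strncpy() of an over-long source leaves the
 * buffer without any NUL byte, i.e. not a valid C string.
 */
static int strncpy_leaves_unterminated(void)
{
    char buf[4];

    memset(buf, 'X', sizeof(buf));
    strncpy(buf, "abcdef", sizeof(buf));   /* source is longer than buf */
    return memchr(buf, '\0', sizeof(buf)) == NULL;
}
```

With a 4-byte buffer, copy_bounded() stores the first three characters of the source followed by a NUL, while the plain strncpy() call fills all four bytes with characters.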
Another misconception is that replacing strcpy() and strcat() with strncpy() and strncat() causes little performance degradation. This is true for strncat(), but not for strncpy(), because it zero-fills the remaining bytes that are not used to store the string being copied. This can lead to a measurable performance hit when the destination buffer is much larger than the source string. The exact penalty for using strncpy() due to this behavior varies by CPU architecture and implementation.
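The zero-filling behavior can be made visible with a small sketch. The helper name count_nul_padding is hypothetical; the buffer is pre-filled with a marker byte so that every NUL found afterwards must have been written by strncpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into buf (size bytes) with strncpy()
 * and count how many bytes of buf end up as NUL.
 */
static size_t count_nul_padding(const char *src, char *buf, size_t size)
{
    size_t i, nuls = 0;

    assert(size > 0);
    memset(buf, 'X', size);      /* marker bytes, none of them NUL */
    strncpy(buf, src, size);     /* pads the tail of buf with NULs */
    for (i = 0; i < size; i++)
        if (buf[i] == '\0')
            nuls++;
    return nuls;
}
```

For a 19-character source and a 1024-byte buffer, the function reports 1005 NUL bytes: strncpy() writes the whole buffer even though only 20 bytes are needed for the string and its terminator.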
The most common mistake made with strncat() is to use an incorrect size parameter. While strncat() does guarantee that the destination is NUL-terminated, the size parameter must not count the space for the NUL. Most importantly, the size parameter is not the size of the destination buffer itself, but the amount of space still available in it. Because this is almost always a value that must be computed, rather than a known constant, it is often computed incorrectly.
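The size computation that strncat() requires can be sketched as below. The helper name append_bounded is an assumption made for this example; it takes the full buffer size and derives the value strncat() actually wants.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: append src to the C string held in dst, where
 * dst is a buffer of dstsize bytes in total. strncat() expects the
 * number of characters that may still be added, not counting the NUL.
 */
static void append_bounded(char *dst, const char *src, size_t dstsize)
{
    size_t used = strlen(dst);

    assert(used < dstsize);      /* dst must already be a valid string */
    strncat(dst, src, dstsize - used - 1);
}
```

A call such as strncat(dst, src, sizeof(dst)) is the typical mistake: it allows up to sizeof(dst) characters to be appended after the existing contents, which can overflow the buffer.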
How do strlcpy() and strlcat() help?
The strlcpy() and strlcat() functions provide a consistent, unambiguous API to help programmers write more bullet-proof code. First and foremost, both strlcpy() and strlcat() guarantee that the destination string is NUL-terminated whenever the given size is non-zero. Second, both functions take the full size of the destination buffer as the size parameter. In most cases this value is easily computed at compile time using the sizeof operator. Finally, neither strlcpy() nor strlcat() zero-fills the unused bytes of the destination (apart from the single NUL that terminates the string).
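The three guarantees can be illustrated with a minimal sketch of the semantics. This is not the OpenBSD source; the names sketch_strlcpy and sketch_strlcat are hypothetical and are used so the example also builds on systems whose C library lacks these functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Sketch of strlcpy() semantics: copy at most size - 1 bytes of src,
 * always terminate when size > 0, and return strlen(src).
 */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen;

    assert(src != NULL);
    srclen = strlen(src);
    if (size > 0) {
        size_t n = srclen < size - 1 ? srclen : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';       /* one NUL only, the rest is left untouched */
    }
    return srclen;
}

/*
 * Sketch of strlcat() semantics: size is the full size of dst.
 * Returns the initial length of dst plus strlen(src).
 */
static size_t sketch_strlcat(char *dst, const char *src, size_t size)
{
    const char *end = memchr(dst, '\0', size);
    size_t dstlen;

    if (end == NULL)         /* dst is not terminated within size bytes */
        return size + strlen(src);
    dstlen = (size_t)(end - dst);
    return dstlen + sketch_strlcpy(dst + dstlen, src, size - dstlen);
}
```

Both functions are called with sizeof(buf) for an array destination, so the caller never has to compute the remaining space by hand.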
The strlcpy() and strlcat() functions return the total length of the string they tried to create. For strlcpy() that is simply the length of the source string; for strlcat() it is the length of the destination string (before concatenation) plus the length of the source string. To detect truncation, the programmer only needs to check whether the return value is greater than or equal to the size parameter. Thus, if truncation has occurred, the number of bytes needed to store the entire string is known, and the programmer may allocate more space and copy the string again if desired. The return value has similar semantics to the return value of snprintf() as implemented in BSD and as specified by the upcoming C9X standard [4] (note that not all current snprintf() implementations comply with C9X). If no truncation occurred, the programmer now has the length of the resulting string. This is useful because it is common practice to build a string with strncpy() and strncat() and then find the length of the result with strlen(). With strlcpy() and strlcat(), the final strlen() is no longer necessary.
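The detect-and-recover pattern enabled by this return value can be sketched with snprintf(), which the text notes has similar return semantics and which is used here as a stand-in because it is part of standard C. The helper name copy_or_grow is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: try to copy src into the fixed buffer. If the
 * copy was truncated, allocate exactly enough space and copy again.
 * Returns fixed, or a malloc()ed block the caller must free(), or
 * NULL on error.
 */
static char *copy_or_grow(char *fixed, size_t fixedsize, const char *src)
{
    int n;
    char *big;

    assert(fixedsize > 0);
    n = snprintf(fixed, fixedsize, "%s", src);
    if (n < 0)
        return NULL;                 /* output error */
    if ((size_t)n < fixedsize)
        return fixed;                /* it fit: n is the string length */

    /* Truncated: n + 1 bytes are needed to hold the whole string. */
    big = malloc((size_t)n + 1);
    if (big == NULL)
        return NULL;
    memcpy(big, src, (size_t)n + 1);
    return big;
}
```

With strlcpy() the structure would be the same: compare the return value against the buffer size and, if it is not smaller, use it to size the new allocation.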
Example 1a is a code snippet with a potential buffer overflow (the HOME environment variable is controlled by the user and can be any length).
strcpy(path, homedir);
strcat(path, "/");
strcat(path, ".foorc");
len = strlen(path);
Example 1a: Code fragment using strcpy() and strcat()
Example 1b is the same fragment converted to use strncpy() and strncat() safely (note that we have to terminate the destination string by hand).
strncpy(path, homedir, sizeof(path) - 1);
path[sizeof(path) - 1] = '\0';
strncat(path, "/", sizeof(path) - strlen(path) - 1);
strncat(path, ".foorc", sizeof(path) - strlen(path) - 1);
len = strlen(path);
Example 1b: Converted to strncpy() and strncat()
Example 1c is a trivial conversion to the strlcpy()/strlcat() API. It has the advantage of being as simple as Example 1a, but it does not yet take advantage of the new API's return value.
strlcpy(path, homedir, sizeof(path));
strlcat(path, "/", sizeof(path));
strlcat(path, ".foorc", sizeof(path));
len = strlen(path);
Example 1c: Trivial conversion to strlcpy()/strlcat()
Because Example 1c is so easy to read and understand, it is simple to add further checks to it. Example 1d checks the return value to make sure there was enough space for the source string. If there was not, an error is returned. Although this is slightly more complicated, it is more robust and also avoids the final strlen() call.
len = strlcpy(path, homedir, sizeof(path));
if (len >= sizeof(path))
        return (ENAMETOOLONG);
len = strlcat(path, "/", sizeof(path));
if (len >= sizeof(path))
        return (ENAMETOOLONG);
len = strlcat(path, ".foorc", sizeof(path));
if (len >= sizeof(path))
        return (ENAMETOOLONG);
Example 1d: Now with a check for truncation
Design Decisions
A great deal of thought went into deciding what the semantics of strlcpy() and strlcat() should be. The original idea was to make strlcpy() and strlcat() identical to strncpy() and strncat(), except that they would always NUL-terminate the destination string. However, looking back on the common use (and misuse) of strncat() convinced us that the size parameter of strlcat() should be the full size of the destination buffer, not just the number of characters left unallocated. The return value started out as the number of characters copied. We soon decided that a return value with the same semantics as that of snprintf() was a better choice, because it gives the programmer the most flexibility for truncation detection and recovery.
Performance
Programmers are starting to avoid strncpy() because of its poor performance when the destination buffer is significantly larger than the source string. For example, the Apache group [6] replaced calls to strncpy() with an internal function and reported a performance improvement [7]. Similarly, the ncurses [8] package recently removed its strncpy() calls, resulting in a factor of four speedup of the tic utility. It is our hope that, in the future, more programmers will use the interface provided by strlcpy() rather than a custom interface.
To get a feel for the worst-case difference between strncpy() and strlcpy(), we ran a test program that copies the string "This is just a test" 1000 times into a 1024-byte buffer. This is somewhat unfair to strncpy(), because with a short string and a large buffer strncpy() has to fill most of the buffer with NUL characters. In practice, however, it is common to use a buffer that is much larger than the expected user input. For example, pathname buffers are MAXPATHLEN (1024 bytes) long, but most filenames are much shorter than that. The average run times in Table 1 were generated on an HP9000/425t with a 25MHz 68040 CPU running OpenBSD 2.5 and on a DEC AXPPCI166 with a 166MHz alpha CPU, also running OpenBSD 2.5. In all cases the same C versions of the functions were used, and the times are the "real time" as reported by the time utility.
CPU Architecture | Function | Time (seconds)
m68k | strcpy | 0.137
m68k | strncpy | 0.464
m68k | strlcpy | 0.14
alpha | strcpy | 0.018
alpha | strncpy | 0.10
alpha | strlcpy | 0.02
Table 1: Performance timings in seconds
As can be seen in Table 1, the timings for strncpy() are far worse than those for strcpy() and strlcpy(). This is probably due not only to the cost of the NUL padding, but also to the CPU's data cache being effectively flushed by the long stream of zeros.
What strlcpy() and strlcat() are not
Although strlcpy() and strlcat() are well suited to dealing with fixed-size buffers, they cannot replace strncpy() and strncat() in all cases. There are still times when it is necessary to manipulate buffers that are not true C strings (the strings in struct utmp, for example). However, we would argue that such "pseudo strings" should not be used in new code, because they are prone to misuse and, in our experience, a common source of bugs. In addition, the strlcpy() and strlcat() functions are not an attempt to "fix" string handling in C. Instead, they are designed to fit within the normal framework of C strings. If you need string functions that support dynamically allocated buffers of arbitrary size, you may wish to examine the "astring" package from mib software [9].
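A "pseudo string" of the kind mentioned above can be sketched as a fixed-width field that is NUL-padded but not necessarily NUL-terminated. The struct record type and its helper functions below are hypothetical and only loosely modeled on such fields; they show why strncpy() remains the right tool there.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record with a fixed-width, NUL-padded name field. */
struct record {
    char name[8];    /* not a C string: no NUL when all 8 bytes are used */
};

/* Store name in the field; strncpy() pads short names with NULs. */
static void record_set_name(struct record *r, const char *name)
{
    assert(r != NULL && name != NULL);
    strncpy(r->name, name, sizeof(r->name));
}

/* Format the field into out (outsize bytes) as a proper C string. */
static int record_get_name(const struct record *r, char *out, size_t outsize)
{
    /* "%.*s" reads at most sizeof(r->name) bytes, so no NUL is needed */
    return snprintf(out, outsize, "%.*s", (int)sizeof(r->name), r->name);
}
```

Passing such a field directly to strlen(), strcpy(), or strlcpy() as a source would read past its end whenever the name occupies all eight bytes, which is the kind of misuse the text warns about.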
Who uses strlcpy() and strlcat()?
The strlcpy() and strlcat() functions first appeared in OpenBSD 2.4. The functions have also recently been approved for inclusion in a future version of Solaris. Third-party packages are starting to pick up the API as well. For example, the rsync [5] package now uses strlcpy() and provides its own version if the OS does not support it. It is our hope that other operating systems and applications will use strlcpy() and strlcat() in the future, and that the API will gain standards acceptance at some point.
What's next?
We plan to replace occurrences of strncpy() and strncat() with strlcpy() and strlcat() in OpenBSD where it is sensible to do so. While new code in OpenBSD is being written to use the new API, there is still a large amount of code that was converted to use strncpy() and strncat() during our original security audit. To this day, we continue to discover bugs in existing code that are caused by incorrect use of strncpy() and strncat(). Updating older code to use strlcpy() and strlcat() should speed up some programs and uncover bugs in others.
Where can I get the source code?
The source code for strlcpy() and strlcat() is available free of charge under a BSD-style license as part of the OpenBSD operating system. You can also download the code and its manual page from the ftp.openbsd.org/pub/openbsd/src/lib/libc/string directory via anonymous FTP. The source code for strlcpy() and strlcat() is in the files strlcpy.c and strlcat.c respectively. The documentation (which uses the tmac.doc troff macros) can be found in strlcpy.3.
Author information
Todd C. Miller has been involved in the free software community since 1993, when he took over maintenance of the sudo package. He later joined the OpenBSD project as an active developer. Todd received his bachelor's degree in Computer Science from the University of Colorado, Boulder in 1997. He can be reached by email at [email protected].
Theo de Raadt has been involved with free UNIX operating systems since 1990. His early development work included porting Minix to the sun3/50 and Amiga, as well as porting PDP-11 BSD 2.9 to a 68030 computer. As one of the founders of the NetBSD project, Theo worked on maintaining and improving many system components, including the sparc port and a free YP implementation that is now used by most free systems. Theo created the OpenBSD project in 1995, which focuses on security, integrated cryptography, and code correctness. Theo works full time on advancing the OpenBSD project. He can be reached by email at [email protected].
References
[1] Aleph One. "Smashing The Stack For Fun And Profit." Phrack Magazine, Volume Seven, Issue Forty-Nine.
[2] BugTraq Mailing List Archives. http://www.geek-girl.com/bugtraq/. This web page contains searchable archives of the BugTraq mailing list.
[3] Brian W. Kernighan, Dennis M. Ritchie. The C Programming Language, Second Edition. Prentice Hall, PTR, 1988.
[4] International Standards Organization. "C9X FCD, Programming languages - C." http://wwwold.dkuug.dk/jtc1/sc22/open/n2794/. This web page contains the current draft of the upcoming C9X standard.
[5] Andrew Tridgell, Paul Mackerras. The rsync algorithm. http://rsync.samba.org/rsync/tech_report/. This web page contains a technical report describing the rsync program.
[6] The Apache Group. The Apache Web Server. http://www.apache.org. This web page contains information on the Apache web server.
[7] The Apache Group. New features in Apache version 1.3. http://www.apache.org/docs/new_features_1_3.html. This web page describes the new features in version 1.3 of the Apache web server.
[8] The ncurses (new curses) home page. http://www.clark.net/pub/dickey/ncurses/. This web page contains ncurses information and distributions.
[9] Forrest J. Cavalier III. "Libmib allocated string functions." http://www.mibsoftware.com/libmib/astring/. This web page contains a description and implementation of a set of string functions that dynamically allocate memory as necessary.