Heartbleed and Fixed-Size Buffers: An Analysis

Last Update:2017-02-28 Source: Internet

Author: User

Tags openssl readline stdin


Heartbleed is the subject of an emergency security advisory from OpenSSL: OpenSSL contains a security vulnerability nicknamed "Heartbleed". The flaw allows anyone to read the memory of a running system. The name has also been rendered as "Heart Bleeding", "Breakdown of the Heart", and so on.

Why fixed-size buffers are so popular

Heartbleed is a recently discovered security issue caused by a buffer overrun. The most common buffer overruns occur when the following two conditions are met:

Component A of a program passes a pointer to another component B, possibly along with length information.
Component B ignores the length information or uses it incorrectly. That information states how much data the memory the pointer refers to can hold.

Any program structure that meets both conditions can overrun a buffer. One important reason such structures exist is that caller A allocates a chunk of memory, but how much memory is actually needed becomes known only when the data is read, because the size of data that has not yet been read cannot be known in advance. In other words, having one function allocate space and then call another function to fill that space with data is inherently somewhat unsafe.
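The two conditions can be illustrated with a small C sketch. It is a simplified illustration, not OpenSSL's code: the `message` struct and both `echo_*` functions are hypothetical names invented here. The unchecked version trusts a length taken from the input, while the checked version validates that length against both the data actually received and the capacity of the destination.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: a payload plus the length the sender CLAIMS it has. */
struct message {
    const unsigned char *payload; /* bytes actually received */
    size_t actual_len;            /* how many bytes were really received */
    size_t claimed_len;           /* length field taken from the input itself */
};

/* Unsafe pattern, shown only for contrast: component B trusts the claimed
   length, so it reads past the end of payload (and may write past the end
   of out) whenever claimed_len is too large. */
size_t echo_unchecked(const struct message *msg, unsigned char *out)
{
    memcpy(out, msg->payload, msg->claimed_len);
    return msg->claimed_len;
}

/* Checked pattern: B is told the capacity of out and validates both lengths
   before copying. Returns 0 and stores the byte count in *copied on success,
   or returns -1 without copying when a length check fails. */
int echo_checked(const struct message *msg, unsigned char *out,
                 size_t out_cap, size_t *copied)
{
    assert(msg != NULL && out != NULL && copied != NULL);
    if (msg->claimed_len > msg->actual_len)
        return -1;                /* the claim exceeds the data we hold */
    if (msg->claimed_len > out_cap)
        return -1;                /* the destination is too small */
    memcpy(out, msg->payload, msg->claimed_len);
    *copied = msg->claimed_len;
    return 0;
}
```

The checked version costs two comparisons per call, and it works only because the caller passes the real sizes along with the pointers.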

Even though this risk can be avoided by properly checking memory bounds, bounds checking has negative effects of its own. For example, a former colleague of mine created a text file consisting of a single line tens of thousands of characters long. He then passed the file as input to many other programs, such as compilers, text processors, and so on. Almost all of these programs misbehaved, either crashing outright or silently ignoring the tail of the input line.

The simple solution to this problem is that whichever part of the program handles input of indeterminate length should be responsible for allocating enough memory to hold that input. In C++, of course, this is easy to do with the standard library. But in C there is no simple, efficient code that reads a single line from the input and returns a pointer to memory containing it, regardless of the length of the input. Any attempt to implement such a function in C has some side effects.
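To make the failure mode concrete, here is a minimal sketch of a bounds-checked read into a fixed-size buffer, using the standard `fgets`. The function name `read_fixed` and its `truncated` flag are assumptions made for this illustration. The read is memory-safe, but a long line is split: unless the caller checks the flag, the rest of the line is either treated as a separate line or dropped.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Reads at most cap - 1 characters of one line into a caller-supplied buffer.
   Returns buf, or NULL at end of input. *truncated is set to 1 when the line
   did not fit and its remainder is still waiting in the stream. */
char *read_fixed(FILE *in, char *buf, size_t cap, int *truncated)
{
    size_t len;

    assert(in != NULL && buf != NULL && cap > 1 && truncated != NULL);
    *truncated = 0;
    if (fgets(buf, (int)cap, in) == NULL)
        return NULL;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';            /* the whole line fit */
    } else if (len == cap - 1) {
        int c = fgetc(in);              /* buffer full: peek at what follows */
        if (c != EOF && c != '\n') {
            ungetc(c, in);              /* more of the same line remains */
            *truncated = 1;
        }
    }
    return buf;
}
```

A caller that ignores `truncated` behaves like the programs in the anecdote: nothing crashes, but the input is no longer processed line by line.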

In the department where I worked at the time, I quietly tried to solve this problem by adding a function to a C library. I wanted anyone who shared code that used my function elsewhere to be able to distribute the function along with it. The function I added is named ReadLine, and it is designed for ease of use: you simply pass in a file pointer (such as stdin), and the function reads an entire line of input and returns a pointer to the first character of a null-terminated string, regardless of the length of the input. At end of file (EOF), it returns a null pointer.

Obviously, any function that allocates memory and returns a pointer to it raises a question: when is the memory released? I considered making the caller of ReadLine responsible for releasing it, but felt that many callers would forget to free the memory. The buffer overrun problem would then become a memory leak problem.

In the end, I adopted a strategy I had seen elsewhere: ReadLine returns a pointer to memory whose contents are guaranteed to remain unchanged until the next call to ReadLine. This strategy not only spares the user from worrying about release, but also simplifies the implementation: the function keeps a static pointer to a dynamically allocated buffer, and the buffer is resized as needed to fit the line being read. This mechanism keeps ReadLine simple and safe in the most common scenarios.
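The strategy can be sketched in C as follows. This is a reconstruction from the description above, not the author's original code. The initial capacity of 64, the doubling growth policy, and the choice to return NULL on allocation failure are all assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch of the interface described above. The buffer is owned by the
   function and reused, so the returned pointer is valid only until the next
   call. Returns NULL at end of input or if memory cannot be allocated. */
char *ReadLine(FILE *in)
{
    static char *buf = NULL;    /* dynamically allocated, kept across calls */
    static size_t cap = 0;      /* current capacity of buf in bytes */
    size_t len = 0;
    int c;

    assert(in != NULL);
    if (buf == NULL) {
        buf = malloc(64);
        if (buf == NULL)
            return NULL;
        cap = 64;
    }
    while ((c = fgetc(in)) != EOF && c != '\n') {
        if (len + 1 >= cap) {   /* always keep room for the terminator */
            char *p = realloc(buf, cap * 2);
            if (p == NULL)
                return NULL;    /* buf itself is still valid and reusable */
            buf = p;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0)
        return NULL;            /* nothing left to read */
    buf[len] = '\0';            /* the newline itself is not stored */
    return buf;
}
```

In this sketch the buffer only grows, so after the last call it still holds as much memory as the longest line required.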

The code is as follows:

    char *line;
    while ((line = ReadLine(stdin)) != NULL) {
        /* Process a line */
    }

Of course, this mechanism has problems of its own. For example, calling ReadLine twice in the same expression results in undefined behavior: a programmer who expects to hold the data from both calls after the second one returns will find that the memory from the first call was released or overwritten during the second. In addition, the function keeps holding memory after the last line of input has been read, because it is never called again. The memory wasted this way is, in fact, the length of the longest line in the entire input: in my implementation, I reallocate a larger buffer whenever the buffer is too small for the input line, but I never let the buffer shrink. I judged that the performance cost of repeatedly reallocating memory was not worth the small amount of memory saved in a few rare cases.
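One way to live with the "valid until the next call" rule is to copy a line before asking for the next one. The helper below is a minimal sketch. `save_line` is a hypothetical name, not part of the library described in this article.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s that stays valid after the
   buffer s points into has been reused. Returns NULL if allocation fails.
   The caller must free() the result. */
char *save_line(const char *s)
{
    size_t n;
    char *copy;

    assert(s != NULL);
    n = strlen(s) + 1;          /* include the terminating '\0' */
    copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}
```

A caller that needs two lines at once would check each result of ReadLine for NULL, pass the first line to `save_line`, and only then call ReadLine again. This reintroduces the release obligation that the static buffer was meant to avoid, but only for callers that actually need it.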

Evidently, I overestimated how much memory-allocation overhead people would tolerate: when I looked at the code again a few months later, I found that someone had rewritten my version of ReadLine to use a fixed 4096-character buffer. As far as I know, the motivation was to avoid the overhead of run-time storage allocation entirely. In other words, to avoid repeated calls to the memory allocator that happen only in a few cases, this person quietly exposed every program using ReadLine to a serious security risk whenever an input line is longer than 4096 characters.

The reason I have spent so much time telling this story is that it reveals several points I consider very important:

    Buffer overruns usually occur where one part of a program, A, allocates memory while only another part, B, knows how much storage is actually needed.
    Allocating and filling the memory inside the same function solves the buffer allocation problem, at the cost that some other function must be responsible for releasing the memory. Allocation and release then happen in two different functions of the program.
    Splitting allocation and release across two functions creates a usability problem that is hard to avoid unless the programming language offers systematic support.
    Even when users need to accept this reality for the sake of security and generality, they may be unwilling to accept the overhead that dynamic memory allocation introduces.
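The second and third points can be seen in a variant that allocates and fills the buffer in one function and leaves the release to the caller. This is a sketch under assumptions: `read_line_alloc` is a hypothetical name, and the starting capacity and growth policy are arbitrary.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variant: allocates and fills a fresh buffer for every line.
   Returns NULL at end of input or if allocation fails. Every non-NULL
   result must eventually be passed to free() by the caller. */
char *read_line_alloc(FILE *in)
{
    size_t cap = 64, len = 0;
    char *buf;
    int c;

    assert(in != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    while ((c = fgetc(in)) != EOF && c != '\n') {
        if (len + 1 >= cap) {           /* keep room for the terminator */
            char *p = realloc(buf, cap * 2);
            if (p == NULL) {
                free(buf);
                return NULL;
            }
            buf = p;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {
        free(buf);                      /* nothing was read */
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Two lines can now be held at the same time, but each call costs at least one allocation, and a caller that forgets `free` leaks memory. C offers no systematic way to enforce that pairing.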

I think that programmers' reluctance to accept run-time overhead in exchange for security is a common cause of many security problems. We'll discuss this phenomenon in detail next week.

This article is an English version of an article originally written in Chinese on aliyun.com and is provided for information purposes only.
