Exploitability of Several Software Defects
Author: Wu Shi
My topic is several less common software defects. The content may contain inaccuracies, and I am also a beginner, so there may be many problems in the slides. I hope the experts here will point them out.
Let's first take a look at common software defects:
First, stack overflow. A memory region, usually an array or a string buffer, is allocated on the stack. When the program operates on this region, an incorrect write can let the user's input control stack locations that have special meaning, such as the saved return address. The problem was first found in string-handling functions such as strcat, and strncpy runs into the same problem if it is not used correctly. Finally, the Windows Unicode handling functions also run into this problem if they are not used properly. Next comes the integer overflow problem.
Integer overflow occurs frequently, especially when an addition or multiplication is performed just before a memory allocation. The sum or product is not necessarily larger than the two original operands. There are also problems with comparing positive and negative numbers, and with sign extension. Even now, this problem still exists in much software. It has become rare in popular software, such as Microsoft software and the software of large foreign companies, but much software in China still has it. It also often exists in Java software. For example, a banking system mishandles an error and adds the amount deducted from another account to your account.
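As a minimal sketch of the arithmetic described above, the following Python code mimics 32-bit unsigned C arithmetic by masking the result. The function names (alloc_size_unchecked, alloc_size_checked, sign_extend_8_to_32) are hypothetical and exist only for illustration; they are not taken from any real product.

```python
UINT32_MAX = 0xFFFFFFFF  # largest value of a 32-bit unsigned integer


def alloc_size_unchecked(count: int, elem_size: int) -> int:
    """Mimic `count * elem_size` in 32-bit unsigned C arithmetic (wraps silently)."""
    return (count * elem_size) & UINT32_MAX


def alloc_size_checked(count: int, elem_size: int) -> int:
    """Same computation, but reject inputs whose product does not fit in 32 bits."""
    if elem_size != 0 and count > UINT32_MAX // elem_size:
        raise OverflowError("count * elem_size does not fit in 32 bits")
    return count * elem_size


def sign_extend_8_to_32(byte_value: int) -> int:
    """Mimic converting a signed 8-bit value to a 32-bit unsigned integer."""
    signed = byte_value - 0x100 if byte_value & 0x80 else byte_value
    return signed & UINT32_MAX
```

With count = 0x40000001 and elem_size = 4, the unchecked version yields 4, so a buffer far smaller than the data that follows would be allocated. The checked version raises an error instead. The sign extension helper shows how the byte 0xFF becomes 0xFFFFFFFF when it is treated as a negative number and then used as an unsigned size.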
Second, heap overflow. The heap is the main way memory is allocated in modern C programs, so heap overflows are far more common than stack overflows. Microsoft has implemented many protection measures, so exploitation is very complicated, especially on versions later than Windows XP SP2, such as Vista. Heap management mainly uses two tables, freelist and lookaside. freelist[0] holds irregular free chunks, especially the relatively large ones. freelist[1] to freelist[n] hold the free chunks in the heap whose sizes are integer powers of 2. To exploit a heap overflow, you need to be familiar with Windows heap management. For example, someone has successfully exploited the freelist[0] linked list. Currently, a program called immdbg is very helpful for this research because it can display everything allocated on the heap. In theory, attacks against software on Vista should not exist, because Vista strictly controls heap management. However, much software, such as Office, uses its own memory management. These heap and memory management methods differ from Vista's and often follow textbook methods or the methods of earlier systems, so they may be exploitable.
Third, uninitialized-use issues. The stack case was discussed in detail by German researchers. When a function is called for the first time, it writes the content it needs onto the stack. When the function exits, the stack pointer moves back, but the content stays in place. When another function is called, it reuses this stack space, and if that function has uninitialized variables such as arrays, they will contain the content written earlier. For the heap, first write the desired content into most of the heap. If an uninitialized-use problem then occurs, for example with a pointer in the heap, that pointer may point to the content we need. This problem currently exists in many places, and Office has many such problems. For example, this month's Microsoft patches, including the Excel patch, fix many such problems. If you compare old and new Office versions, you will find that some newly added code in Office 2007 performs initialization.
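The stack reuse described above can be sketched with a toy model. This is an illustration only: ToyStack, first_call and second_call are hypothetical names, and a real stack also holds return addresses and saved registers that this model ignores.

```python
class ToyStack:
    """A toy downward-growing stack that, like a real one, never clears freed frames."""

    def __init__(self, size: int = 64) -> None:
        self.memory = bytearray(size)
        self.sp = size  # the stack pointer starts at the high end

    def push_frame(self, frame_size: int) -> int:
        """Reserve a frame and return its base offset; the contents are left as-is."""
        if frame_size > self.sp:
            raise MemoryError("toy stack exhausted")
        self.sp -= frame_size
        return self.sp

    def pop_frame(self, frame_size: int) -> None:
        """Release a frame; only the stack pointer moves, the bytes stay."""
        self.sp += frame_size


def first_call(stack: ToyStack, data: bytes) -> None:
    """Write caller-supplied data into a 16-byte local buffer, then return."""
    base = stack.push_frame(16)
    chunk = data[:16]
    stack.memory[base:base + len(chunk)] = chunk
    stack.pop_frame(16)


def second_call(stack: ToyStack) -> bytes:
    """Read a 16-byte local buffer without initializing it first."""
    base = stack.push_frame(16)
    leftover = bytes(stack.memory[base:base + 16])
    stack.pop_frame(16)
    return leftover
```

Calling first_call with 16 bytes of input and then second_call on the same ToyStack returns exactly those 16 bytes, which is the situation where an uninitialized local ends up holding content chosen by whoever supplied the earlier input.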
Fourth, double free. Memory leaks are an enemy of modern software, especially server software. Many programmers are afraid of them, so whenever they allocate memory they are eager to free it. As a result, the memory is sometimes freed more than once, which also causes security problems. The exploitation methods used on Linux are clever and classic, but they are difficult to apply on Windows. However, a lot of software uses its own memory management, so it is likely to be exploitable.
Now we can discuss which software defects are exploitable. The sufficient condition for executing arbitrary code is the ability to write any 4 bytes of content to any memory address. If this condition is met, the defect is certainly exploitable. In a stack overflow or heap overflow, controlled 4-byte content is written to specific memory addresses that have special meaning, because code and data are mixed on x86. In other special cases, a single byte is written to specific addresses that have special meaning, which may also cause this problem. Therefore, the closer a defect comes to the sufficient condition, the more likely it is that it can be exploited to execute arbitrary code.
The key issue lies in memory writes, not memory reads. A memory write can change a lot of things. A memory read, even if it raises an exception, is usually not exploitable. When we need to quickly find exploitable vulnerabilities in a large number of programs, we should:
First, find the exceptions caused by write operations.
Second, there are a few special cases. For uninitialized-use problems, you need to know what values uninitialized memory holds by default. This is an example. We can analyze in detail which results are exploitable, based on our principles:
1. A write operation. This one seems complicated and may be an integer overflow problem.
Next, let's talk about my own experience, starting with stack overflow. This one is caused by a recursive function, which is different from a stack overflow in the usual sense. Because stack space is limited, continuously pushing new frames eventually leads to reads and writes outside the memory allocated to the stack. The key to finding the problem is to find a "cycle" in the code flow graph that the user can control. In general, this crashes the program, and it may lead to malicious code execution under some special circumstances. The special circumstances are:
1. The stacks of two threads are laid out right next to each other.
2. The space allocated on the stack each time is relatively large.
3. The compiler must be old. New compilers check for this situation.
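Finding a "cycle" in a code flow graph can be sketched as a depth-first search for a back edge. The following is a minimal illustration on a call graph given as a dictionary; the function name find_cycle and the graph format are assumptions, and the sketch only reports that a cycle exists, not whether user input controls how often it repeats.

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]], start: str) -> Optional[List[str]]:
    """Return one cycle reachable from `start` as a list of nodes, or None."""
    path: List[str] = []   # nodes on the current DFS path, in order
    on_path = set()        # the same nodes, for fast membership tests
    finished = set()       # nodes whose successors are fully explored

    def visit(node: str) -> Optional[List[str]]:
        path.append(node)
        on_path.add(node)
        for callee in graph.get(node, []):
            if callee in on_path:
                # Back edge: the cycle runs from `callee` along the path to `node`.
                return path[path.index(callee):] + [callee]
            if callee not in finished:
                found = visit(callee)
                if found is not None:
                    return found
        path.pop()
        on_path.discard(node)
        finished.add(node)
        return None

    return visit(start)
```

For a hypothetical parser where parse_file calls parse_record, parse_record calls parse_group, and parse_group calls parse_record again, the function returns the cycle parse_record, parse_group, parse_record. Each cycle found this way is a candidate that still has to be checked by hand for user-controlled recursion depth.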
After considering stack overflow, we can consider the corresponding situation on the heap, which is more complex but more flexible. In some cases, it may be exploitable with this method.
Security research can borrow from mathematical theory, because mathematical methods have been distilled by humans over thousands of years and are universal and forward-looking. We can see that most security issues arise when attackers or researchers find a controllable piece of code flow or data flow that the programmers did not think of. A traditional stack overflow can basically be seen as a relatively small backdoor in a system or library function. Attackers and researchers know these backdoors better than ordinary programmers. Basically, a program can be seen as two graphs, one called the data flow graph and the other called the code flow graph. A security problem is a path found on these two graphs. Once such an execution path is found, attackers can carry out the functions they want.
Mathematics already has the relevant theory for this problem, since graph theory has been studied for a long time. In my opinion, researchers will in the future combine their own experience with graph theory and use computers to quickly find paths to security problems. What vendors need to do is use their powerful computing capabilities to quickly traverse all possible paths to prove that the software is secure. This is my view.
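Treating a program as a graph and searching it for a path can be sketched with a plain breadth-first search. This is a minimal illustration under stated assumptions: the graph is a dictionary of successor lists, and the node names in the usage note are hypothetical. Real code flow and data flow graphs would come from a disassembler or compiler and are far larger.

```python
from collections import deque
from typing import Dict, List, Optional


def shortest_path(graph: Dict[str, List[str]], source: str, sink: str) -> Optional[List[str]]:
    """Breadth-first search: return the shortest node path from source to sink, or None."""
    previous: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            # Walk the predecessor links back to the source.
            path: List[str] = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for successor in graph.get(node, []):
            if successor not in previous:
                previous[successor] = node
                queue.append(successor)
    return None
```

Given a hypothetical flow read_input, parse_header, compute_size, allocate, copy_data, the search from read_input to copy_data returns that chain of nodes, while a search from a node that cannot reach copy_data returns None. The first case is the kind of path a researcher looks for; exhaustively showing that no such path exists is the vendor's side of the same search.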
The third theory that can be borrowed is compiler theory. C/C++ programs must go through compilation and optimization. During compilation, the compiler actually obtains a lot of useful information. Many companies have made meaningful attempts to use this information for software security, but there is still a gap. In theory, this method should be better than fuzzing, because it knows the structure of the software and can discover problems quickly.
The following is my experience. I prefer to use this method to obtain some useful information and automatically construct test cases for testing. In this way, testing performance is greatly improved and labor costs are greatly reduced.
The fourth theory that can be borrowed is artificial intelligence and statistics. At present there is a major obstacle, because there are too few samples. We can start with the template method. For example, we can use a template to scan the source code and look for + or × operations that occur before a memory allocation. From template matching we can then move on to pattern matching. Artificial intelligence and statistics do play a major role in other network security problems. For example, they should play a major role against DDoS or against methods that crash websites and programs. Their classification of network traffic is fairly successful, although it cannot be called a great success.
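A template of this kind can be sketched with a regular expression. The following is a minimal illustration, not a real analyzer: the list of allocator names and the function name scan_source are assumptions, and the template works on one line at a time without understanding C syntax.

```python
import re
from typing import List, Tuple

# Template: an allocation call whose argument contains + or *.
# The allocator names listed here are an assumption for illustration.
ALLOC_WITH_ARITHMETIC = re.compile(
    r"\b(malloc|calloc|realloc|alloca)\s*\(([^;]*[+*][^;]*)\)"
)


def scan_source(source: str) -> List[Tuple[int, str]]:
    """Return (line number, line text) for every line matching the template."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if ALLOC_WITH_ARITHMETIC.search(line):
            hits.append((number, line.strip()))
    return hits
```

On a source file that contains malloc(64), malloc(count * sizeof(struct item)) and malloc(len + 1), only the last two lines are reported. The template also flags harmless code such as malloc(sizeof(char *)), so each hit is only a candidate for manual review or for a later, more precise pattern-matching stage.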
The following is an example of MS Office memory management. Taking Office 2007 as an example, msopvalloccore is used to allocate memory and msofreepv is used to free it. Each time memory is freed, the address value is adjusted so that it is in the correct position. When memory is allocated, the first element of the freelist is read. If it meets the request, that memory is handed to the calling function, and the next element of the freelist becomes the head. When memory is freed, the address of the freed block becomes the head, and the previous head becomes its next element. A double free problem can occur here. It is a well-known weakness of Office's memory management, because it does not check for this case, and it may lead to code execution. The specific method is on the next slide. The lesson is that a lot of software implements its own memory management, and some classic problems may be exploitable in these self-written memory managers even though they no longer exist at the OS layer.
Attack ideas. This slide is really an explanation of the exploit code. First, we allocate a lot of memory, fill it with the content we need, and free it. We do this to make the exploit more likely to succeed. Then we allocate a block of memory and free it twice. After the two frees, Office's memory management mistakenly treats the table address as a freed block that can be handed out again. We then request relatively small blocks and fill in the memory address to be overwritten. After a few more small allocations, Office's memory management incorrectly writes the content we supplied over the target memory. The second memory problem in MS Office is use of uninitialized memory, which is similar to double free. This reminds Microsoft that securing the operating system alone is not enough. Excel 2003 contains such a low-level error. Heap overflow also exists in this memory management system. It is not protected by the OS and can be exploited easily.
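The effect of a double free on a simple free list can be sketched with a toy allocator. This is a generic singly linked free list written for illustration only: ToyFreeList, double_free_demo, the block layout and the addresses are all assumptions, and nothing here reproduces the internals of msopvalloccore or msofreepv.

```python
import struct

NULL = 0xFFFFFFFF  # sentinel meaning "end of free list" in this toy model


class ToyFreeList:
    """Singly linked free list; each free block stores the next pointer in its first 4 bytes."""

    def __init__(self, memory_size: int, block_size: int, first_block: int, block_count: int) -> None:
        self.memory = bytearray(memory_size)
        self.head = NULL
        # Free every block once so the list starts populated.
        for i in reversed(range(block_count)):
            self.free(first_block + i * block_size)

    def alloc(self) -> int:
        """Pop the head block; its stored next pointer becomes the new head."""
        if self.head == NULL:
            raise MemoryError("free list is empty")
        block = self.head
        (self.head,) = struct.unpack_from("<I", self.memory, block)
        return block

    def free(self, block: int) -> None:
        """Push a block, with no check for whether it is already on the list."""
        struct.pack_into("<I", self.memory, block, self.head)
        self.head = block

    def write(self, address: int, data: bytes) -> None:
        """Model the program writing user data into a block it was given."""
        self.memory[address:address + len(data)] = data


def double_free_demo(target: int = 200) -> tuple:
    heap = ToyFreeList(memory_size=256, block_size=16, first_block=64, block_count=4)
    block = heap.alloc()
    heap.free(block)
    heap.free(block)                              # double free: the block now links to itself
    first = heap.alloc()                          # returns the block; head still points to it
    heap.write(first, struct.pack("<I", target))  # user data overwrites the stored next pointer
    second = heap.alloc()                         # same block again; head becomes `target`
    third = heap.alloc()                          # the allocator hands out the chosen address
    heap.write(third, b"PWN!")                    # the "allocation" content lands at `target`
    return first, second, third, bytes(heap.memory[target:target + 4])
```

In this model the same block is returned twice, and the third allocation returns the address that was written into the block, so ordinary user data ends up at a location of the caller's choosing. A check in free that the block is not already on the list would stop the sequence at the second free.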
The next problem is out-of-bounds array reads and writes, which is also a mainstream source of vulnerabilities today. Vulnerabilities can no longer be found through the problematic library functions of the past, and even integer overflows have become rare, but out-of-bounds array accesses still exist to a certain extent. So far I have found many problems with out-of-bounds array reads and writes. If an input parameter is used in a write operation, content may be written to an arbitrary memory address.
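How an input-controlled index turns into a write outside the array can be sketched on a flat byte buffer. The layout constants and the function names store_unchecked and store_checked are hypothetical; the unchecked version mimics C code that tests only the upper bound of a signed index.

```python
import struct

ARRAY_BASE = 32      # offset of a 4-element array of 32-bit integers
ARRAY_LEN = 4
HANDLER_SLOT = 24    # an unrelated 32-bit value stored just below the array


def store_unchecked(memory: bytearray, index: int, value: int) -> None:
    """Mimic `array[index] = value` with only an upper-bound check on a signed index."""
    if index >= ARRAY_LEN:  # negative indexes slip through this check
        raise IndexError("index too large")
    struct.pack_into("<I", memory, ARRAY_BASE + 4 * index, value)


def store_checked(memory: bytearray, index: int, value: int) -> None:
    """The same write with both bounds checked."""
    if not 0 <= index < ARRAY_LEN:
        raise IndexError("index out of range")
    struct.pack_into("<I", memory, ARRAY_BASE + 4 * index, value)
```

With a 64-byte buffer, store_unchecked(memory, -2, 0x41414141) writes to offset 24, which is HANDLER_SLOT rather than any element of the array. store_checked rejects the same index.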
Next I will talk about common problems in IM software. The first problem is that too much third-party software is integrated. This software often comes without source code, which leads to an incomplete or incorrect understanding of its interfaces. Because there are many things IM software is not good at, it often relies on third-party software. However, the code quality of third-party software is not necessarily good, which may cause security problems. Some third-party software is also carried over from old versions. After a security problem is reported in the old version, the IM software does not update the package it uses. Second, IM software assumes by default that the client or the server is trusted or partially trusted, which causes major security problems. Complex functions in particular, such as audio and video, have a large number of security problems. Third, some instant messaging software uses simplified encryption algorithms, which resist brute-force cracking poorly. Some software does not use encryption correctly. For example, if a fixed key is used and the key is stored on the client, anyone who extracts the key can read the chat content. Fourth, the software team is unstable, which results in poor code quality and an inconsistent design philosophy. A large number of unfinished functions can be found in some Chinese products, which may be a security risk.
This is my speech today. Thank you!
On-site Q&A:
Q: First of all, you said that the compiler generates some code or problems in the Microsoft software you just mentioned. Why does this happen in the compiler? Another point: what did you mean by Microsoft's default application? You also said you found many unfinished functions in software. Such software is generally large, with a lot of libraries. Do you look at all of it?
Wu Shi: No. I mainly look at the part that actually gets executed.
Q: How do you know where it can be executed?
Wu Shi: First, go through the debugger. Use the software first, then attach a debugger, and then check the code that it may execute.
Q: How long does it take for you to analyze such things?
Wu Shi: I do not analyze all of its content. I only analyze the code flow where there is a problem.
Q: Is there any practical use of graph theory for reducing software bugs?
Wu Shi: I am currently working on this. Look at the IDA software, and the German I mentioned just now. He simply draws the code flow as a graph to show it to you.
Q: Is the graph generated from the source code or from the executable code?
Wu Shi: Executable code.
Q: Will there be many omissions?
Wu Shi: This is done by IDA. So far I have seen no omissions, or very few.
Q: Can this mathematical method be used to solve software defects?
Wu Shi: As I mentioned just now, a company uses compiler theory to solve software problems and software security problems. These techniques are currently used by all the major vendors, so there are still some prospects.
Q: Microsoft's software has its own memory management method. Why do they write their own memory management instead of using the operating system's?
Wu Shi: Because of some special requirements. The operating system's memory management often does not support these special requirements very well, or is not fast enough. The most important reason is that the OS method is not very fast, so they use their own method to manage memory.
Q: Can you give an example of factors other than speed?
Wu Shi: With the OS method, the fewest memory fragments are generated when each memory request is a power of 2 in size. If the size is not a power of 2, a lot of fragments are produced. If you use your own memory management method, it knows what allocations the program will make, so it can avoid such problems, which improves performance.
Q: As mentioned earlier, many security problems are no longer just the use of simple dangerous functions but are deeper problems. Is there any good method or good software that can find them automatically?
Wu Shi: If the source code is available, Coverity is currently the best. For security issues, however, it is not much better. A large amount of data still has to be analyzed manually if you want to solve the security problem completely. It does well on other software problems, such as memory leaks or uninitialized variables. Why does it not do well on security? I think there is no one there who really understands security to guide this work.
Q: Can you introduce the tools you use for vulnerability mining?
Wu Shi: I originally wanted to write a set of slides as a lightning-fast introduction, but it did not get written because my notebook ran out of power. Vulnerability mining has several technical prerequisites. First, you must be very familiar with debugging, have a good understanding of the software's structure, and know a lot about the functions you want to explore. Second, you need some ideas or experience to know under what circumstances software may have problems. Next, you need to write a program to find such problems instead of looking for them by hand. Looking by hand takes too much energy and time. I used to look by hand, but the results were not very good.
This article is from the CSDN blog. Please credit the source when reposting: http://blog.csdn.net/shewey/archive/2010/10/16/5945141.aspx