CSAPP Bomb Lab Records

Source: Internet
Author: User

A record of working through the CSAPP binary bomb lab.

(This is the self-study version of the Bomb Lab from the CSAPP companion teaching website; lab address: http://csapp.cs.cmu.edu/2e/labs.html)

(Personal experience: you need a clear understanding of x86 assembly addressing modes. For example, the mov instruction accesses the value of the storage unit that the computed address points to, whereas the lea instruction keeps the computed address itself; a $ prefix on a number marks it as a constant;

In the lab, the jump table and the linked-list handling are the assembly-level realization of C constructs. They are more complicated to follow, but they give a clearer understanding of how these constructs are implemented at the low level;

When pointers are involved, it is particularly important to note which instructions compute an address and which access the value stored at an address;

Although the self-study version involves no scoring or penalties, it still takes a lot of patience and effort to complete, and it is fun~)
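As a rough illustration of the mov/lea distinction noted above, the following C sketch mirrors the two cases. The function names load_element and address_of_element are made up for this note, and a 4-byte int is assumed. On 32-bit x86 a compiler would typically use a mov with a memory operand for the first and an lea for the second, though the exact instructions depend on the compiler and its flags.

```c
#include <stddef.h>

/* Reads the value stored at base + i*4 (4-byte int assumed):
   roughly "mov (base,i,4), %eax". */
int load_element(const int *base, size_t i) {
    return base[i];
}

/* Only computes the address base + i*4 and does not touch memory:
   roughly "lea (base,i,4), %eax". */
const int *address_of_element(const int *base, size_t i) {
    return &base[i];
}
```

For an array a = {10, 20, 30}, load_element(a, 2) yields the value 30, while address_of_element(a, 2) yields the pointer a + 2.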

1. Lab preparation

Read the writeup and README files that come with the lab to get a general understanding of its contents.

README: the binary bomb is a C program that runs in a Linux environment and contains 6 "bombs" (phases). Defusing each one requires debugging tools and techniques to find the target string. The file also describes how to deploy the lab dynamically for grading and other features, which are not discussed here;

Writeup: hints at some of the tools that may be useful and provides other useful information about the lab.

(1) The bomb reads its input in two ways: directly from the terminal while it runs, or from a specified file (one string per line). The latter takes the form ./bomb file.txt (./ refers to the current directory, bomb is the executable, and file.txt is the file that records the target strings);

(2) Tools that help with the lab include gdb and, on Linux, objdump -t (prints the program's symbol table), objdump -d (prints the disassembly of the program), and strings (prints the printable strings in the program). An important finding from these tools is that, besides the 6 explicit string-processing functions (phase_1 to phase_6), there are two other functions named fun7 and secret_phase, which suggests there may be an additional string to crack;

In addition, simply viewing the program with objdump -d bomb shows that main performs initialization, selects the input mode, and processes and checks the target strings, and that string reading is done by functions such as read_line. After each string is read, a separate function is called to validate it, so the lab can focus directly on those string-processing functions; the remaining logic can be explored according to your own interest.

2. The lab process

The main function handles each input string as follows.

main calls read_line to read a string and pushes %eax onto the stack. Note that %eax here holds the value after the function call, so what is pushed is the return value of read_line (the details are in the definition of read_line). Pushing onto the stack is how arguments are passed in a function call; the string is then validated by the phase-specific function.

    • Phase_1

   

The function argument is loaded into %eax, then 0x80497c0 and %eax are pushed onto the stack and strings_not_equal is called. The guess is that 0x80497c0 is the address where the target string is stored. Setting a breakpoint in GDB and printing the string at that address gives the following result.

  

Thus the first string is obtained.
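The check in phase_1 can be sketched in C as below. This is only an approximation under stated assumptions: TARGET is a hypothetical placeholder for whatever is stored at 0x80497c0 (the real text has to be read out of the binary with GDB), and phase_1_ok is a made-up name that returns a flag instead of exploding.

```c
#include <string.h>

/* Hypothetical stand-in for the string stored at 0x80497c0. */
static const char *TARGET = "placeholder target string";

/* Returns nonzero when the two strings differ, like strings_not_equal. */
int strings_not_equal(const char *a, const char *b) {
    return strcmp(a, b) != 0;
}

/* Returns 1 if the phase would be defused, 0 if the bomb would explode. */
int phase_1_ok(const char *input) {
    return !strings_not_equal(input, TARGET);
}
```

Only an input identical to the stored string passes; anything else triggers the explosion path.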

    • Phase_2

The analysis focuses mainly on the functions called by phase_2.

read_six_numbers: pushes the addresses of six consecutive stack slots, from %ebp-24 (low address) to %ebp-4 (high address). Here %ebp is the frame pointer of phase_2, not that of read_six_numbers, since %ebp is updated when a function is called. It then pushes the address 0x8049b1b and the address of the input string in turn, and calls sscanf;

sscanf: reads formatted data from a string. Its typical arguments are the address of the string to read from, the format string, and the addresses where the parsed values are stored. Since function arguments are pushed onto the stack from right to left, the guess is that the address of the input string is the source to read from, 0x8049b1b holds the format string, and the remaining six addresses are where the parsed values are stored. Printing the string at 0x8049b1b, as shown below, confirms the conjecture.

  

In short, read_six_numbers parses the input string (6 numbers) into the corresponding stack locations: the numbers are stored, in input order, in the 6 consecutive storage units from %ebp-24 to %ebp-4 (note the order in which the arguments are pushed, and that %ebp is relative to phase_2). The subsequent validation then works on these values.
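A minimal C sketch of what read_six_numbers does is given below. The format string "%d %d %d %d %d %d" is an assumption about the contents of 0x8049b1b, and the treatment of a short input as a failure is also assumed; the function returns a flag rather than calling the bomb routine.

```c
#include <stdio.h>

/* Sketch of read_six_numbers: parse six integers from the input line
   into nums[0..5], in input order. The format string is an assumption
   about what is stored at 0x8049b1b. Returns 1 on success and 0 when
   fewer than six numbers could be read. */
int read_six_numbers(const char *input, int nums[6]) {
    int count = sscanf(input, "%d %d %d %d %d %d",
                       &nums[0], &nums[1], &nums[2],
                       &nums[3], &nums[4], &nums[5]);
    return count == 6;
}
```

Here nums plays the role of the six slots from %ebp-24 to %ebp-4 in the caller's frame: nums[0] is the lowest address and receives the first number.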

phase_2 validation section:

It first verifies that the first number is 1.

  

The first three instructions set %ebx to 1, %esi to %ebp-24, and %eax to 2. The imul instruction then multiplies %eax by the value at address %esi+%ebx*4-4, which is %ebp-24+1*4-4 = %ebp-24. The value there is the previously validated 1, so %eax stays 2.

  

After that comes a loop: %ebx is increased by 1, and as long as %ebx is not greater than 5 the process repeats, i.e.

%ebx=%ebx+1;

%eax=%ebx+1,

%eax=%eax * the value of the previously validated number, and %eax is compared with the number currently being validated

Therefore the first value is 1, the second value should be (1+1)*1=2, the third (2+1)*2=6, the fourth (3+1)*6=24, the fifth (4+1)*24=120, and the sixth (5+1)*120=720.
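The validation described above can be written as a short C sketch. The name phase_2_ok is made up, and the function returns a flag instead of exploding; the index i corresponds to %ebx.

```c
/* Returns 1 if the six numbers pass the phase_2 check described above:
   nums[0] == 1 and nums[i] == (i + 1) * nums[i - 1] for i = 1..5. */
int phase_2_ok(const int nums[6]) {
    if (nums[0] != 1)
        return 0;
    for (int i = 1; i <= 5; i++) {
        /* %eax = %ebx + 1, then imul with the previous number */
        if (nums[i] != (i + 1) * nums[i - 1])
            return 0;
    }
    return 1;
}
```

The sequence 1 2 6 24 120 720 passes this check, while for example 1 2 3 4 5 6 fails at the third number.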

    • Phase_3

phase_3 also calls sscanf. Its arguments, from high address to low address on the stack, are %ebp-4, %ebp-5, %ebp-12 (%ebp relative to phase_3), 0x80497de, and %edx (the address of the string read from input).

As with sscanf in phase_2, the input is parsed according to a format string; this time numbers and a character are read, and the validation is based on them.

  

Note that function arguments are pushed from right to left, so for the input string, %ebp-4 holds the third item entered (a number), %ebp-5 holds the character, and %ebp-12 holds the first number entered; pay attention to this order.

  

In the validation that follows, the first number (%ebp-12) is checked first: it must not be greater than 7, and the code then jumps according to its value. At this step it is not yet possible to tell which specific value is required, so keep reading.

  

The code that follows contains several similar blocks: each one sets %bl, then checks the third item (%ebp-4), and finally jumps to 0x8048c8f, where the character in the middle (%ebp-5) is verified.

  

  

The guess is that there are several possible values for the first number, each selecting one branch and therefore one valid combination of inputs. (Note: it later turned out that the form jmp *xx(,%eax,4) is the jump-table structure used for switch statements, i.e. an indirect jump.)

The figure on the left shows the contents of the jump table at 0x80497e8.

The solution here uses case 0, i.e. %eax=0, which gives %bl=0x71, the value at %ebp-4 equal to 0x309, and the character at %ebp-5 equal to 0x71, so the required string is 0 q 777.
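The switch structure of phase_3 can be sketched in C as follows. Several things here are assumptions: the "%d %c %d" format stands in for the string at 0x80497de, phase_3_ok is a made-up name that returns a flag, and only case 0 is filled in because it is the only branch worked out above.

```c
#include <stdio.h>

/* Sketch of phase_3. Returns 1 if the input would defuse the phase. */
int phase_3_ok(const char *input) {
    int first, third;
    char middle;
    char expected;              /* plays the role of %bl */

    if (sscanf(input, "%d %c %d", &first, &middle, &third) != 3)
        return 0;
    if (first > 7)              /* index must stay inside the jump table */
        return 0;

    switch (first) {            /* compiled as jmp *0x80497e8(,%eax,4) */
    case 0:
        expected = 0x71;        /* 'q' */
        if (third != 0x309)     /* 777 */
            return 0;
        break;
    default:
        return 0;               /* cases 1..7 are omitted in this sketch */
    }
    return middle == expected;  /* the common check at 0x8048c8f */
}
```

With this sketch the input 0 q 777 is accepted; in the real bomb the other cases would each accept their own character and number.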

  

    • Phase_4

phase_4 also calls sscanf, passing %ebp-4, 0x8049808, and %edx as arguments; it reads one integer.

  

  

The value read is first compared with 0; if it is less than 0 the bomb explodes. Otherwise the value is passed as the argument to func4.

  

phase_4 is defused only if the function's return value is 0x37.

  

The logic of func4:

func4 is a recursive function. As k increases, its return values follow the Fibonacci sequence.

    int func4(int k) {
        if (k <= 1) return 1;
        return func4(k - 1) + func4(k - 2);
    }

From this analysis, the input for which func4 returns 0x37 (that is, 55) is 9.
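To double-check the value 9, the reconstructed func4 can simply be searched. The helper find_input below is hypothetical and exists only for this check; it is not part of the bomb.

```c
/* func4 as reconstructed above: a Fibonacci-style recursion. */
int func4(int k) {
    if (k <= 1)
        return 1;
    return func4(k - 1) + func4(k - 2);
}

/* Hypothetical helper: smallest k in [0, limit] with func4(k) == target,
   or -1 if there is none. */
int find_input(int target, int limit) {
    for (int k = 0; k <= limit; k++) {
        if (func4(k) == target)
            return k;
    }
    return -1;
}
```

The values of func4 for k = 0..9 are 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, so find_input(0x37, 20) returns 9.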

    • Phase_5

phase_5 first calls string_length to get the length of the input string and compares it with 6 (again, the %eax that appears right after a function call most likely holds the function's return value, if there is one); only then does the processing continue.

Some assignments follow: %edx=0 (via xor), %ecx=%ebp-8, %esi=0x804b220.

Next comes a loop that processes the string. The addressing mode (%edx,%ebx,1) means computing %edx+%ebx*1 and using the result as an address to access the value of the corresponding storage unit. (%ebx holds the address of the first character of the input string.)

  

The processing logic is:

(1) Load the n-th character into %al (n=0,1,2,3,4,5), keep its low 4 bits (and operation), and sign-extend the result into %eax

(2) Using %eax as an offset, load the value of the storage unit at 0x804b220 (that is, %esi) + %eax into %al

(3) Store the resulting value at %ebp-8+n

After the loop, 0x804980b and %ebp-8 are pushed and the two strings are compared. Since they must be equal, the expected result of the processing is the string stored at 0x804980b.

  

The specifics are as follows.

The processed characters should be giants, whose character codes are 0x67,0x69,0x61,0x6e,0x74,0x73.

Display the 16 consecutive characters starting at 0x804b220.

The offsets of those six characters in the table are 0xf,0x0,0x5,0xb,0xd,0x1. The input string only has to guarantee that the low four bits of each character equal the corresponding offset.

The choice here is 0x6f,0x70,0x65,0x6b,0x6d,0x61, which is opekma (the answer is not unique).
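The table lookup in phase_5 can be sketched in C as below. Only the six table entries derived above (offsets 0xf, 0x0, 0x5, 0xb, 0xd, 0x1) are filled in; the '.' characters mark entries of the 16-byte table at 0x804b220 that are not reproduced in this note. The name phase_5_ok is made up, and the function returns a flag instead of exploding.

```c
#include <string.h>

/* The 16-byte table at 0x804b220, with only the entries used in the
   analysis above filled in; '.' marks entries not recovered here. */
static const char TABLE[] = "is...a.....n.t.g";

/* Maps each of the 6 input characters through the table using its low
   4 bits and compares the result with "giants". Returns 1 on success. */
int phase_5_ok(const char *input) {
    char out[7];

    if (strlen(input) != 6)
        return 0;
    for (int n = 0; n < 6; n++)
        out[n] = TABLE[input[n] & 0xf];   /* keep the low 4 bits */
    out[6] = '\0';
    return strcmp(out, "giants") == 0;
}
```

The input opekma maps to giants and passes. Any other six characters with the same low four bits would pass as well, which is why the answer is not unique.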

    • Phase_6 (at first contact the processing feels rather complex, with several loops involved; others later pointed out that it also involves linked-list operations)

First come the assignments: %edx = the value stored at %ebp+8 (the start address of the input string, which is also the argument passed to phase_6) and %eax=%ebp-24. Then %eax and %edx are pushed and read_six_numbers is called; its behavior was described earlier.

  

The numbers that were read are then processed, starting with a fairly large loop. Its logic is described directly here, without going through each instruction.

(1) %eax = the value stored at %ebp-24+%edi*4 (%edi is initially 0)

(2) %eax--, then %eax is compared with 5; if %eax (the number minus 1) is not greater than 5, the validation continues

(3) %ebx=%edi+1; if %ebx>5, jump to (7)

(4) %eax=%edi*4, and the value of %eax is stored at %ebp-0x38; %esi is set to %ebp-24

(5) %edx = the value stored at %ebp-0x38 (note that there are two consecutive instructions here, one lea and one mov); %eax = the value stored at %edx+%esi*1, which is compared with the value stored at %esi+%ebx*4

(6) If the two are not equal, %ebx++; if %ebx is not greater than 5, jump to (5); otherwise execution falls through

(7) %edi++; if %edi is not greater than 5, jump to (1)

The loop above takes each number in turn, verifies that it is not greater than 6, and verifies that no two numbers are equal.
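The first loop of phase_6 corresponds roughly to the C sketch below, with i playing the role of %edi and j the role of %ebx. One detail is an assumption: the comparison of the decremented value with 5 is treated as unsigned, which would also reject values below 1. The name phase_6_input_ok is made up.

```c
/* Returns 1 if every number is in range and all six are distinct. */
int phase_6_input_ok(const int nums[6]) {
    for (int i = 0; i <= 5; i++) {
        /* Assumed unsigned compare of nums[i] - 1 against 5: it rejects
           values above 6 and, through wraparound, values below 1. */
        if ((unsigned)nums[i] - 1u > 5u)
            return 0;
        /* Compare nums[i] with every later number. */
        for (int j = i + 1; j <= 5; j++) {
            if (nums[i] == nums[j])
                return 0;
        }
    }
    return 1;
}
```

Under this sketch any permutation of 1 to 6 is accepted at this stage; the ordering constraint comes from the linked-list check that follows.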

  

Then comes a loop similar in structure to the one described above:

(1) %ecx=%ebp-24, %eax=%ebp-48, and the value of %eax is stored at %ebp-60

(2) %esi = the value stored at %ebp-52, %ebx=1, %eax=4*%edi (%edi is initially 0), %edx=%eax

(3) Compare %ebx with the value stored at %eax+%ecx; if the former is greater than or equal to the latter, jump to (6); otherwise execution falls through

(4) Assign the value stored at %edx+%ecx to %eax, then %esi = the value stored at %esi+8

(5) %ebx++; if %ebx<%eax, repeat the update of %esi in (4)

(6) Load the value stored at %ebp-60 into %edx and store the value of %esi in the storage unit at %edx+4*%edi; %edi++; if %edi is less than or equal to 5, jump to (2); otherwise execution falls through

(This code contains a %eiz register; the explanation was eventually found in a Stack Overflow answer. My own understanding of the top-voted answer is this: the assembler sometimes needs to insert padding between instructions, for example so that the next instruction starts at an aligned address, and this can be done by inserting the appropriate number of nop instructions. However, the processor handles one long instruction more efficiently than the corresponding number of short instructions such as nop, so this odd-looking lea instruction, which occupies 7 bytes, is sometimes inserted instead; it runs faster than 7 nop instructions and leaves the program's behavior unchanged. Here %eiz is a pseudo-register whose value is always 0, so the instruction lea 0x0(%esi,%eiz,1),%esi has an effect similar to a nop.)

At first glance this is rather baffling ...

As others pointed out, this part of the program involves a linked list. The assumption is that the code above is a list operation (because an address is always kept in %esi, and a new address is obtained through %esi+8): the initial value of %esi (that is, the value at %ebp-52) is the address of the list's head node, and %esi+8 is the node's pointer field, which stores the address of the next node. In principle, this assumption holds up.

  

Under this conjecture, the loop above takes each input number, finds the node it corresponds to (a number of 1 or less corresponds to the head node and 6 to the 6th node; it was shown earlier that the inputs are not greater than 6 and are pairwise distinct), and stores the node addresses, in input order, in the storage space from %ebp-48 to %ebp-28.

Accordingly, the code that follows rewrites the pointer field of each node according to the addresses stored above. Each time it takes the node whose address is stored at %ebp-48+(i-1)*4 and sets its pointer field to the address stored at %ebp-48+i*4.

Immediately after that, the code appears to assign NULL to the pointer field of the tail node, which makes the conjecture more convincing.

  

Under this conjecture, the next piece of code works as follows:

(1) %esi is set to the address of the head node, %edi=0;

(2) %edx = the value of the node's pointer field; (%esi) is the value of the node's data field and (%edx) is the value of the successor's data field (the data field is a double word: four bytes, exactly the size of a register)

(3) Compare the current node's data field with the successor's: if the current one is greater than or equal to the successor, move on to the next node; otherwise the bomb explodes

The purpose of this code is to verify that the reordered list is monotonically decreasing.

  

What remains is to examine the original linked list, whose start address is stored at %ebp-52 (before the reordering).

GDB can print the values stored in the list nodes directly; from them one can work out how the list must be rearranged to be in decreasing order.

So the input sequence can be 4 2 6 3 1 5
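The relinking and the final check can be sketched in C as below. The struct layout follows the analysis (data at offset 0, pointer at offset 8 on 32-bit x86); the id field in between is an assumption, and phase_6_order_ok is a made-up name. For brevity the sketch picks nodes out of an array instead of walking the original list from the head.

```c
#include <stddef.h>

struct node {
    int value;          /* data field at offset 0 */
    int id;             /* assumed 4-byte field at offset 4 */
    struct node *next;  /* pointer field at offset 8 (32-bit x86) */
};

/* Relinks the six nodes in the order given by nums (each in 1..6, all
   distinct) and returns 1 if no node is smaller than its successor. */
int phase_6_order_ok(struct node nodes[6], const int nums[6]) {
    struct node *picked[6];   /* the slots from %ebp-48 to %ebp-28 */

    /* Step 1: picked[i] is the nums[i]-th node of the original list. */
    for (int i = 0; i < 6; i++)
        picked[i] = &nodes[nums[i] - 1];

    /* Step 2: rewrite the pointer fields, then terminate the list. */
    for (int i = 0; i < 5; i++)
        picked[i]->next = picked[i + 1];
    picked[5]->next = NULL;

    /* Step 3: each node must be >= its successor. */
    for (struct node *p = picked[0]; p->next != NULL; p = p->next) {
        if (p->value < p->next->value)
            return 0;
    }
    return 1;
}
```

As a usage example with hypothetical node values 50, 90, 60, 100, 40, 70 for nodes 1 to 6 (chosen only to be consistent with the answer above, not read from the binary), the order 4 2 6 3 1 5 gives 100, 90, 70, 60, 50, 40 and passes, while 1 2 3 4 5 6 fails at the first comparison.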

That covers the solutions to all 6 string bombs; the main task is complete ...

However, since the preparation stage we have known that a side task, secret_phase, exists ...

    • Secret_phase

The disassembly shows a function secret_phase, but none of the code examined so far, including phase_1 to phase_6 and main, calls it. Among the functions called by main, apart from the phases, only initialize_bomb, read_line, and phase_defused are custom functions; looking at their definitions shows that secret_phase is called from phase_defused.

phase_defused works as shown. main calls phase_i for validation (if the condition is not met, the bomb explosion function is called) and then immediately calls phase_defused. I had assumed that this function just prints the corresponding message after a bomb is defused; it later turned out that the message is printed by the printf that follows it, with a hint string stored at 0x80496e0.

  

  

Analysis of phase_defused:

It compares the value at 0x804b480 with 6. From the disassembly, this value is the number of strings that have been entered. If it is not 6, execution jumps straight to the end of phase_defused; in other words, secret_phase can only be reached after the first six strings have been solved.

  

  

sscanf is then called; the value of each argument can be displayed in GDB.

The first argument displayed is the format string; as for the second, my first reaction was that it is the input 9 from phase_4.

That is, besides the input 9, phase_defused can also read an extra string from that line and process it. When there is no extra string, sscanf does not return 2 and execution jumps straight to the end of phase_defused, without affecting the main flow.

  

phase_defused then compares the string it read with the string stored at 0x8049d09, as shown. When the two match, secret_phase is called.
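The gate in front of secret_phase can be summarized with the C sketch below. It rests on assumptions: the "%d %s" format is inferred from sscanf having to return 2, SECRET_WORD is a hypothetical placeholder for the string at 0x8049d09, and enters_secret_phase is a made-up name that returns a flag instead of calling secret_phase.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical placeholder for the string stored at 0x8049d09. */
static const char *SECRET_WORD = "placeholder";

/* Returns 1 if secret_phase would be entered. num_inputs mirrors the
   counter at 0x804b480; phase4_line is the line typed for phase_4. */
int enters_secret_phase(int num_inputs, const char *phase4_line) {
    int number;
    char word[80];

    if (num_inputs != 6)          /* only after all six phases */
        return 0;
    /* Assumed format: the number for phase_4 followed by an extra word. */
    if (sscanf(phase4_line, "%d %79s", &number, word) != 2)
        return 0;
    return strcmp(word, SECRET_WORD) == 0;
}
```

A phase_4 line containing only 9 makes sscanf return 1, so the normal flow is unaffected; only a line of the form 9 followed by the matching word opens the secret phase.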

  

  

  

Analysis of secret_phase:

It first calls __strtol_internal, whose prototype is long int __strtol_internal (const char *__nptr, char **__endptr, int __base, int __group); when the __group argument is 0, the function behaves the same as strtol.

The first parameter is the target string, the second parameter can be used to report where parsing stopped (NULL is passed here), and the third parameter is the base in which the first parameter is written. The return value is the integer value corresponding to the target string. The %eax pushed here is the return value of read_line, i.e. the start address of the input string. Let the return value of the function be n. The number entered (in string form) is parsed with base 0xa, i.e. decimal.

  

Next, %ebx=n and %eax=n-1; the first requirement is that %eax≤0x3e8 in an unsigned comparison.

fun7 is then called; the bomb is avoided only when fun7 returns 7. The arguments passed are n and the constant 0x804b320.

  

  

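fun7 is a recursive function, and its logic is as follows:

    struct tree { int value; struct tree *left; struct tree *right; };

    int fun7(struct tree *p, int n) {
        if (!p) return 0;
        if (n < p->value) return 2 * fun7(p->left, n);    /* pointer at p+4 */
        else if (n == p->value) return 0;
        else return 2 * fun7(p->right, n) + 1;            /* pointer at p+8 */
    }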

The values stored at the 0x804b320 are as follows:

  

  

A fun7 return value of 7 is assumed to arise as 7=2*3+1, 3=2*1+1, 1=2*0+1.

So the first layer: p=0x804b320, n>*p, *(p+8)=0x804b308, and the function returns 2*3+1=7,

Second layer: p=0x804b308, n>*p, *(p+8)=0x804b2d8, and the function returns 2*1+1=3,

Third layer: p=0x804b2d8, n>*p (i.e. 107), *(p+8)=0x804b278, and the function returns 2*0+1=1

Fourth layer: p=0x804b278, n=*p (that is, 0x3e9; n≤0x3e9 was established earlier), and the function returns 0.

This gives a final recursive return value of 7, with n=0x3e9. The input string is its decimal representation, 1001, which __strtol_internal converts as a base-10 number.

So the input is 1001.
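The whole of secret_phase can be condensed into the C sketch below, under stated assumptions: strtol stands in for __strtol_internal, struct tree with value/left/right fields models the node layout at offsets 0, 4 and 8, and secret_phase_ok is a made-up name that returns a flag instead of exploding.

```c
#include <stddef.h>
#include <stdlib.h>

struct tree {
    int value;            /* *p */
    struct tree *left;    /* pointer stored at p+4 (32-bit x86) */
    struct tree *right;   /* pointer stored at p+8 (32-bit x86) */
};

/* fun7 as reconstructed above: the path to n encoded in binary. */
int fun7(const struct tree *p, int n) {
    if (p == NULL)
        return 0;
    if (n < p->value)
        return 2 * fun7(p->left, n);
    if (n == p->value)
        return 0;
    return 2 * fun7(p->right, n) + 1;
}

/* Returns 1 if the input string would defuse secret_phase. */
int secret_phase_ok(const struct tree *root, const char *input) {
    long n = strtol(input, NULL, 10);

    /* Unsigned check n - 1 <= 0x3e8, i.e. 1 <= n <= 0x3e9. */
    if ((unsigned long)n - 1 > 0x3e8)
        return 0;
    return fun7(root, (int)n) == 7;
}
```

As a usage example, build a chain of four nodes linked through their right pointers with values 36, 50, 107 and 1001. Only 107 and 1001 come from the analysis above; 36 and 50 are hypothetical placeholders for the first two nodes, and any values below the input behave the same. With that tree, fun7 returns 7 for n = 1001, so the input 1001 is accepted.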

3. Experimental results

Get the final result:

  

  
