ARM Assembly for iOS Reverse Engineering in Practice

Source: Internet
Author: User

Let's start with some basic knowledge of ARM assembly. (We take ARMv7 as the example; the 64-bit architecture of the latest iPhone 5s is not discussed.)

Basic knowledge section:

First, the registers:

R0-R3: Used to pass function parameters and return values.

R4-R6, R8, R10-R11: No special role; these are general-purpose registers.

R7: Frame pointer. Points to the location on the stack where the previous frame pointer and the link register (LR) were saved.

R9: Reserved by the operating system in iOS 2.x; see the ABI quote at the end of this article.

R12: Also called IP (intra-procedure scratch register). Explaining it properly takes some space, so it is covered in detail later.

R13: Also called SP (stack pointer); points to the top of the stack.

R14: Also called LR (link register); holds the return address of the function.

R15: Also called PC (program counter); points to the current instruction.

CPSR: Current Program Status Register. Holds status bits such as the condition flags and the interrupt-disable bits.

In exception modes there is also an SPSR (Saved Program Status Register), which holds a saved copy of the CPSR. It is not covered here.

There are also VFP (vector floating-point) registers, which we skip here; if you are interested, see the reference links.

Basic instructions:

ADD: addition.

SUB: subtraction.

STR: stores the contents of a register to memory (in this article, to the stack).

LDR: loads a value from memory (in this article, from the stack) into a register.

.W: an optional instruction width specifier. It does not change what the instruction does; it only ensures that a 32-bit encoding is generated. See infocenter.arm.com for more information.

BL: calls a function and sets LR to the instruction following the call in the caller, which is the function's return address.

BLX: same as BL, but can also switch between the ARM and Thumb instruction sets.

BX: bx lr returns to the caller.
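The relationship between BL, LR and BX can be sketched in C. This is only an illustrative model, not how the hardware is implemented: the Cpu struct and the functions bl and bx_lr are hypothetical names, and the model ignores details such as the Thumb bit in LR and the fact that reading PC on real hardware yields an address ahead of the current instruction.

```c
#include <stdint.h>

/* Hypothetical two-register model of the CPU: just PC and LR. */
typedef struct {
    uint32_t pc; /* address of the instruction being executed */
    uint32_t lr; /* return address */
} Cpu;

/* bl target: save the address of the next instruction in LR, then jump.
   The offset is 4 because bl is a 32-bit instruction. */
void bl(Cpu *cpu, uint32_t target) {
    cpu->lr = cpu->pc + 4;
    cpu->pc = target;
}

/* bx lr: jump back to the address saved in LR. */
void bx_lr(Cpu *cpu) {
    cpu->pc = cpu->lr;
}
```

Calling bl and then bx_lr on the same Cpu leaves pc at the instruction right after the call, which is exactly what "return to the caller" means.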

Next, some rules for function calls.

1. On iOS, functions must be called and returned from with instructions such as BLX and BX; the MOV instruction cannot be used for this (the reason is discussed later).

2. ARM uses a stack to manage function calls and returns. The stack on ARM grows downward, from high addresses toward low addresses.
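A downward-growing stack can be modeled in a few lines of C. This is a sketch under simplifying assumptions: the Stack struct and the functions stack_init, push and pop are hypothetical names, sp is an array index rather than a real address, and there are no bounds checks.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical model of a downward-growing stack: sp is an index into mem,
   and a lower index stands for a lower address. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    int sp; /* index of the word currently on top of the stack */
} Stack;

/* Empty stack: sp sits one past the highest word. */
void stack_init(Stack *s) {
    s->sp = STACK_WORDS;
}

void push(Stack *s, uint32_t value) {
    s->sp -= 1;            /* move down first ... */
    s->mem[s->sp] = value; /* ... then store */
}

uint32_t pop(Stack *s) {
    uint32_t value = s->mem[s->sp]; /* load from the top ... */
    s->sp += 1;                     /* ... then move back up */
    return value;
}
```

Each push lowers sp and each pop raises it again, so the most recently pushed value always sits at the lowest used address.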

The stack layout before and after a function call is shown in Figure (i) (taken from Apple's iOS ABI Reference):

Figure (i)

SP (stack pointer) points to the top of the stack; the bottom of the stack is at the high-address end. A stack frame is a region of the stack delimited by R7 and the old R7 saved on the stack. A stack frame consists of:

Parameter area: holds the arguments passed by the caller. On 32-bit ARM the first 4 arguments are passed in R0-R3; any further arguments are passed on the stack and are stored in this area.

Linkage area: holds the address of the caller's next instruction (the return address).

Saved frame pointer: holds the base of the caller's stack frame. It marks the end of the caller's stack frame and the start of the callee's stack frame.

Local storage area: holds the callee's local variables, as well as the contents of registers that the callee uses and must restore before returning to the caller.

Saved registers area: This is what Apple's documentation says. However, this area is adjacent to the local storage area, and it too stores register contents that have to be restored, so I think the split is either purely conceptual or meant to separate the register-saving role from the local storage area. Either way the distinction is conceptual; in essence there is no difference.

Next, let's look at what has to be done at the start and the end of a function call (officially called the prolog and the epilog).

Start of the call (prolog):

1. Push LR onto the stack.
2. Push R7 onto the stack.
3. Set R7 = SP. After the two pushes, SP has moved down; copying SP into R7 marks the end of the caller's stack frame and the start of the callee's stack frame.
4. Push the registers that the callee modifies and that must be restored before returning to the caller.
5. Allocate stack space for the function's own use. Because the stack grows from high to low addresses, this is usually done with sub sp, #size.

End of the call (epilog):

1. Free the stack space with add sp, #size.
2. Restore the saved registers.
3. Restore R7.
4. Pop the previously saved LR from the stack into PC, so that the function returns.
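The prolog and epilog steps can be replayed on a small simulated machine. This is a rough sketch, not real ARM semantics: the Machine struct and the functions push_word, pop_word, prologue and epilogue are hypothetical names, memory is a small word array, and step 4 of the prolog (saving additional registers) is left out for brevity.

```c
#include <stdint.h>

#define MEM_WORDS 64

/* Hypothetical machine state: a small memory plus four registers.
   Addresses are byte addresses; mem[addr / 4] holds the word at addr. */
typedef struct {
    uint32_t mem[MEM_WORDS];
    uint32_t sp, r7, lr, pc;
} Machine;

void push_word(Machine *m, uint32_t value) {
    m->sp -= 4;
    m->mem[m->sp / 4] = value;
}

uint32_t pop_word(Machine *m) {
    uint32_t value = m->mem[m->sp / 4];
    m->sp += 4;
    return value;
}

/* push {r7, lr}; mov r7, sp; sub sp, #local_bytes */
void prologue(Machine *m, uint32_t local_bytes) {
    push_word(m, m->lr);  /* LR ends up at the higher address */
    push_word(m, m->r7);  /* old R7 just below it */
    m->r7 = m->sp;        /* R7 now marks the start of the new frame */
    m->sp -= local_bytes; /* allocate space for locals */
}

/* add sp, #local_bytes; pop {r7, pc} */
void epilogue(Machine *m, uint32_t local_bytes) {
    m->sp += local_bytes; /* free the locals */
    m->r7 = pop_word(m);  /* restore the caller's frame pointer */
    m->pc = pop_word(m);  /* saved LR goes straight into PC: return */
}
```

Running prologue followed by epilogue with the same local_bytes leaves sp and r7 exactly as they were and puts the saved return address into pc.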

------------------------------------------------------------

Hands-on section (i):

Create a test project in Xcode, add a new .c file, and add the following function:

#include <stdio.h>

int func(int a, int b, int c, int d, int e, int f)
{
    int g = a + b + c + d + e + f;
    return g;
}

Viewing the assembly:

In the upper-left corner of Xcode, select a real device as the build target so that ARM assembly is generated; with the simulator selected, x86 assembly is generated instead.

Click Xcode => Product => Perform Action => Assemble file.c to generate the assembly code.

The output is long and contains many lines beginning with ".", such as ".section" and ".loc". These are directives for the assembler and we can ignore them. With the lines beginning with "." and the comments removed, the code looks like this:

_func:
  .cfi_startproc
Lfunc_begin0:
  add r0, r1
Ltmp0:
  ldr.w r12, [sp]
  add r0, r2
  ldr.w r9, [sp, #4]
  add r0, r3
  add r0, r12
  add r0, r9
  bx lr
Ltmp2:
Lfunc_end0:

_func: marks the start of the func function. Lfunc_begin0 and Lfunc_end0 mark the beginning and end of the function definition. The beginning and end of a function generally take the form "xxx_beginN:" and "xxx_endN:".

Here is a line-by-line explanation:

add r0, r1: adds parameter a and parameter b and stores the result in r0.
ldr.w r12, [sp]: loads parameter e, the first argument passed on the stack, into r12.
add r0, r2: adds parameter c to r0.
ldr.w r9, [sp, #4]: loads parameter f, the second argument passed on the stack, into r9.
add r0, r3: adds parameter d to r0.
add r0, r12: adds parameter e to r0.
add r0, r9: adds parameter f to r0.

At this point all six values, a through f, have been summed into r0, which is the register that holds the return value.

bx lr: returns to the caller.
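To check the walkthrough, the instruction sequence can be replayed one instruction at a time in C. This is only a sketch: run_func is a hypothetical helper, regs stands for r0-r3 on entry, and stack[0] and stack[1] stand for the words at [sp] and [sp, #4].

```c
#include <stdint.h>

/* Hypothetical replay of the instruction sequence for func().
   regs[0..3] stand for r0-r3; stack[0] is the word at [sp],
   stack[1] the word at [sp, #4]. */
int32_t run_func(const int32_t regs[4], const int32_t stack[2]) {
    int32_t r0 = regs[0], r1 = regs[1], r2 = regs[2], r3 = regs[3];
    int32_t r9, r12;

    r0 += r1;       /* add   r0, r1       */
    r12 = stack[0]; /* ldr.w r12, [sp]    */
    r0 += r2;       /* add   r0, r2       */
    r9 = stack[1];  /* ldr.w r9, [sp, #4] */
    r0 += r3;       /* add   r0, r3       */
    r0 += r12;      /* add   r0, r12      */
    r0 += r9;       /* add   r0, r9       */
    return r0;      /* bx    lr           */
}
```

With r0-r3 set to 1, 2, 3, 4 and the two stack words set to 5 and 6, the function returns 21, the same value that func(1, 2, 3, 4, 5, 6) computes.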

------------------------------------------------------------

Hands-on section (ii):

To show how the stack changes during function calls, here is an example with three functions and two calls, in C and in assembly.

The code:

#include <stdio.h>

__attribute__((noinline))
int addfunction(int a, int b, int c, int d, int e, int f) {
    int r = a + b + c + d + e + f;
    return r;
}

__attribute__((noinline))
int foofunction(int a, int b, int c, int d, int f) {
    int r = addfunction(a, b, c, d, f, 66);
    return r;
}

int initfunction()
{
    int r = foofunction(11, 22, 33, 44, 55);
    return r;
}

Because we want to observe function calls and the resulting stack changes, we add __attribute__((noinline)) to stop the compiler from inlining the functions (if you are not familiar with inlining, please look it up).

As before, select a real device as the build target in the upper-left corner of Xcode so that ARM assembly is generated rather than x86 assembly for the simulator.

Click Xcode => Product => Perform Action => Assemble to generate the assembly code, shown below.

To follow the order in which the code runs, we start with the outermost caller.

initfunction:

_initfunction:
  .cfi_startproc
Lfunc_begin2:
@ BB#0:
  push {r7, lr}
  mov r7, sp
  sub sp, #4
  movs r0, #55
  movs r1, #22
Ltmp6:
  str r0, [sp]
  movs r0, #11
  movs r2, #33
  movs r3, #44
  bl _foofunction
  add sp, #4
  pop {r7, pc}
Ltmp7:
Lfunc_end2:

Again, a line-by-line explanation:

1. push {r7, lr}: steps 1 and 2 of the prolog described in the basics section; saves LR and R7 on the stack.

2. mov r7, sp: step 3 of the prolog.

3. sub sp, #4: allocates 4 bytes on the stack for a local slot, which here holds an outgoing argument. As mentioned earlier, R0-R3 carry the first 4 arguments and any further arguments are passed on the stack.

4. movs r0, #55: puts the immediate value 55 into r0.

5. movs r1, #22: puts 22 into r1.

6. str r0, [sp]: stores the value of r0 into the memory that sp points to. The stack now holds the argument 55.

7. The next three instructions, movs r0, #11, movs r2, #33 and movs r3, #44, put the corresponding immediates into the given registers. At this point r0-r3 hold the four arguments 11, 22, 33 and 44, and the stack holds the argument 55.

8. bl _foofunction: calls foofunction. What happens after the jump into foofunction is analyzed below.

9. add sp, #4: moves the stack pointer up by 4 bytes, reclaiming the space allocated by instruction 3, sub sp, #4.

10. pop {r7, pc}: restores the values that instruction 1, push {r7, lr}, saved on the stack, loading the saved LR value into PC. Note: on entry to initfunction, LR held the address of the instruction following the call to initfunction in its caller. Loading that value into PC makes execution continue at that instruction, so the function returns.

Instructions 1, 2 and 3 are the function's prolog, and instructions 9 and 10 are its epilog. This is a fixed pattern; once you have seen it a few times you will recognize it at a glance and will not need to stop and analyze it.
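The way the caller distributes its arguments between registers and the stack can be sketched as a small C helper. This is an illustration only: ArgLayout, place_args and MAX_STACK_ARGS are hypothetical names, and the sketch handles only 32-bit integer arguments.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_STACK_ARGS 8

/* Hypothetical description of where a caller places its integer arguments. */
typedef struct {
    int32_t r[4];                  /* r0-r3 */
    int32_t stack[MAX_STACK_ARGS]; /* stack[0] is the word at [sp] */
    size_t stack_count;
} ArgLayout;

/* Place the first four arguments in r0-r3 and the rest on the stack,
   lowest address first. Arguments beyond MAX_STACK_ARGS are dropped. */
ArgLayout place_args(const int32_t *args, size_t count) {
    ArgLayout layout = {{0}, {0}, 0};
    for (size_t i = 0; i < count; i++) {
        if (i < 4) {
            layout.r[i] = args[i];
        } else if (layout.stack_count < MAX_STACK_ARGS) {
            layout.stack[layout.stack_count++] = args[i];
        }
    }
    return layout;
}
```

For the five arguments 11, 22, 33, 44, 55 this yields r0-r3 = 11, 22, 33, 44 and a single stack word, 55, matching what instructions 3 through 7 set up.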

To relate this to the stack, Figure (ii) shows the stack layout at instruction 8, bl _foofunction:

Figure (ii)

When instruction 8 of initfunction, bl _foofunction, executes, control enters foofunction, whose assembly is as follows:

foofunction:

_foofunction:
  .cfi_startproc
Lfunc_begin1:
  push {r4, r5, r7, lr}
  add r7, sp, #8
  sub sp, #8
  ldr r4, [r7, #8]
  movs r5, #66
  strd r4, r5, [sp]
  bl _addfunction
  add sp, #8
  pop {r4, r5, r7, pc}
Lfunc_end1:

Again, line by line:

1. push {r4, r5, r7, lr}: you will have noticed that this differs from initfunction. Besides LR and R7, R4 and R5 are also pushed, because the function uses R4 and R5 below. Saving them first lets us restore their values when foofunction returns to initfunction. From high address to low address, the stack now holds LR, R7, R5, R4.
2. add r7, sp, #8: in initfunction we did not push R4 and R5, so SP itself was the new R7 value. Here R4 and R5 have been pushed as well, so SP now points to the slot holding R4. Since the stack grows downward, SP + 8 bytes is where the old R7 is stored.
3. sub sp, #8: allocates 8 bytes on the stack.
4. ldr r4, [r7, #8]: R7 plus 8 bytes is exactly the stack slot where initfunction stored the argument 55, so this loads 55 into r4.
5. movs r5, #66: loads an immediate; no explanation needed.
6. strd r4, r5, [sp]: stores r4 at [sp] and r5 at [sp, #4]. The arguments 11, 22, 33 and 44 are still in r0-r3 from initfunction, and now 55 and 66 are on the stack.
7. bl _addfunction: the arguments are ready, so addfunction is called.
8. add sp, #8: reclaims the stack space.
9. pop {r4, r5, r7, pc}: the last two instructions are similar to those in initfunction, except that R4 and R5 are also restored, still with a single instruction.
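The address arithmetic in steps 1 through 4 is easy to get wrong, so here is a small C sketch that derives every address from the value SP had on entry to foofunction. FooFrame and foo_frame are hypothetical names used only for this illustration; the sketch assumes that push stores the lowest-numbered register at the lowest address.

```c
#include <stdint.h>

/* Hypothetical byte addresses inside foofunction's frame,
   derived from the value SP had on entry. */
typedef struct {
    uint32_t saved_r4, saved_r5, saved_r7, saved_lr; /* where push stored each register */
    uint32_t r7;        /* frame pointer after add r7, sp, #8 */
    uint32_t sp;        /* stack pointer after sub sp, #8 */
    uint32_t fifth_arg; /* address read by ldr r4, [r7, #8] */
} FooFrame;

FooFrame foo_frame(uint32_t sp_on_entry) {
    FooFrame f;
    uint32_t sp = sp_on_entry - 16; /* push {r4, r5, r7, lr}: four words */
    f.saved_r4 = sp;                /* lowest-numbered register at the lowest address */
    f.saved_r5 = sp + 4;
    f.saved_r7 = sp + 8;
    f.saved_lr = sp + 12;
    f.r7 = sp + 8;                  /* add r7, sp, #8: points at the saved R7 */
    f.sp = sp - 8;                  /* sub sp, #8: room for two stack arguments */
    f.fifth_arg = f.r7 + 8;         /* ldr r4, [r7, #8] */
    return f;
}
```

The computed fifth_arg equals sp_on_entry, which is the slot where initfunction stored 55 with str r0, [sp] before the call.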

After bl _addfunction jumps into addfunction, the stack layout is as shown in Figure (iii):

Figure (iii)

Instruction 7 of foofunction, bl _addfunction, transfers control to addfunction. Its assembly is as follows:

addfunction:

_addfunction:
  .cfi_startproc
Lfunc_begin0:
  add r0, r1
  ldr.w r12, [sp]
  add r0, r2
  ldr.w r9, [sp, #4]
  add r0, r3
  add r0, r12
  add r0, r9
  bx lr
Lfunc_end0:

Line by line:

add r0, r1: r0 += r1
ldr.w r12, [sp]: loads the word that sp points to into r12 (see Figure (iii)). The strd in foofunction stored 55 at [sp], so r12 holds 55.
add r0, r2: r0 += r2
ldr.w r9, [sp, #4]: the word at sp plus 4 bytes is 66, so r9 holds 66.
add r0, r3: r0 += r3
add r0, r12: r0 += r12
add r0, r9: r0 += r9. At this point the values 11, 22, 33 and 44 from r0-r3 and the values 55 and 66 from the stack have all been summed into r0.
bx lr: return.

You will have noticed that addfunction's prolog and epilog differ from those of initfunction and foofunction, because addfunction does not call any other function. With no bl or blx instruction, LR is never modified, so there is no need to push it.

Why R9 and R12 can be used here without being saved and restored is something I have not fully worked out; if any expert can enlighten me, I would be very grateful.

This is what the iOS ABI Reference says:

About R9:

In iOS 2.x, register R9 is reserved for operating system use and must not be used by application code. Failure to do so can result in application crashes or aberrant behavior. However, in iOS 3.0 and later, register R9 can be used as a volatile register. These guidelines differ from the general usage provided for by the AAPCS document.

About R12:

R12 is the intra-procedure scratch register, also known as IP. It is used by the dynamic linker and is volatile across all function calls. However, it can be used as a scratch register between function calls.
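The quoted rules suggest why addfunction can overwrite R9 and R12 freely: a volatile register is one the callee does not have to preserve. As a rough summary, the classification can be written as a C predicate. is_volatile is a hypothetical helper; treating R0-R3 as volatile and R4-R8, R10 and R11 as preserved is an assumption based on the general 32-bit iOS convention rather than on the passages quoted above.

```c
#include <stdbool.h>

/* Returns true if a callee may overwrite register rN (0-12) without
   restoring it, assuming the 32-bit convention for iOS 3.0 and later. */
bool is_volatile(int reg) {
    switch (reg) {
    case 0: case 1: case 2: case 3: /* arguments and return value */
    case 9:                         /* volatile since iOS 3.0 */
    case 12:                        /* IP, intra-procedure scratch */
        return true;
    default:                        /* r4-r8, r10, r11 must be preserved */
        return false;
    }
}
```

Under this reading, foofunction had to push R4 and R5 before using them, while addfunction could use R9 and R12 without any push.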

That covers the assembly of C functions. The next part covers the assembly of Objective-C functions, including Objective-C blocks.
