Preface
In this tutorial, you will look at the registers the CPU uses and explore and modify the arguments passed to function calls. You will also learn about the common Apple architectures and how their registers are used within a function. This is known as an architecture's calling convention.
Understanding how assembly works, and how a particular architecture's calling convention works, is an extremely important skill. It lets you observe and modify the arguments passed to a function when you do not have the source code. In addition, because the source may use different or unknown variable names, it is sometimes better to work at the assembly level.
For example, suppose you always want to know the second parameter of a function call, regardless of what the parameter is named. Knowledge of assembly gives you a solid base layer for observing and manipulating the parameters of a function.
Assembly
Wait, so what is assembly?
Have you ever stopped in a function you had no source code for and seen a series of memory addresses followed by scary, short commands? Did you curl up into a ball and whisper to yourself that you would never look at this stuff again? Well... that stuff is called assembly!
Here is a picture of a backtrace in Xcode, showing the assembly of a function in the Simulator.
Looking at the picture above, the assembly can be broken into several parts. Each line of assembly contains an opcode, which can be thought of as a very simple instruction for the computer.
So what does an opcode look like? An opcode is an instruction that performs one simple task on the computer. For example, consider the following snippet of assembly:
pushq  %rbx
subq   $0x228, %rsp
movq   %rdi, %rbx
In this block of assembly, you see three opcodes: pushq, subq, and movq. Think of the opcodes as the actions to perform. What follows each opcode are the source and destination labels. These are the items the opcode acts on.
In the example above there are several registers: rbx, rsp, and rdi. Each one is written with a % prefix.
In addition, you can find a numeric constant written in hexadecimal, 0x228. The $ in front of a constant marks it as an absolute number.
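The prefix rules described above can be sketched in C. This is only an illustration of how to read AT&T operands by eye; the names operand_kind, classify_operand, and immediate_value are hypothetical and do not belong to any Apple tool.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names, used only for this illustration. */
enum operand_kind { OPERAND_REGISTER, OPERAND_IMMEDIATE, OPERAND_OTHER };

/* Classify one AT&T-syntax operand by its leading character. */
enum operand_kind classify_operand(const char *operand) {
    if (operand == NULL) {
        return OPERAND_OTHER;
    }
    switch (operand[0]) {
    case '%': return OPERAND_REGISTER;   /* %rbx, %rsp, %rdi */
    case '$': return OPERAND_IMMEDIATE;  /* $0x228 */
    default:  return OPERAND_OTHER;      /* memory references, labels, ... */
    }
}

/* Return the numeric value of an immediate such as "$0x228".
   Base 0 lets strtol accept the 0x prefix; returns 0 for non-immediates. */
long immediate_value(const char *operand) {
    if (classify_operand(operand) != OPERAND_IMMEDIATE) {
        return 0;
    }
    return strtol(operand + 1, NULL, 0);
}
```

With these definitions, classify_operand("%rsp") yields OPERAND_REGISTER, and immediate_value("$0x228") yields 552, the decimal form of 0x228.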
There is no need to know what this code is doing yet, because you first need to learn about registers and function calling conventions.
Note: In the example above, the registers and constants are prefixed with % and $. That is one way of writing assembly. There are two main ways of presenting assembly: the first is Intel syntax and the second is AT&T syntax.
By default, Apple's disassembly tools display assembly in the AT&T format, as in the example above. Although it is a good format to work with, it can admittedly be a little hard on the eyes.
x86_64 vs ARM64
As a developer for Apple platforms, you will deal with two main architectures when learning assembly: x86_64 and ARM64. x86_64 is most likely the architecture of your macOS computer, unless you are running on older hardware. x86_64 is a 64-bit architecture, which means every address can hold up to 64 1s or 0s. Older Apple computers used a 32-bit architecture, but Apple stopped producing 32-bit computers in 2010. Programs running under macOS are 64-bit compatible, including programs running in the Simulator. That said, even on an x86_64 macOS machine, you can still run 32-bit programs.
If you have any doubts about the hardware architecture you are working with, you can run the following command in Terminal:
uname -m
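The same check can be made from code. The following is a minimal sketch, assuming a POSIX system such as macOS; pointer_width_bits and machine_name are hypothetical helper names, while uname() and struct utsname come from the standard sys/utsname.h header.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <sys/utsname.h>

/* Width of a pointer in bits for the current build,
   typically 64 when targeting x86_64 or ARM64. */
unsigned pointer_width_bits(void) {
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}

/* Copy the hardware name reported by uname() into out. This is the same
   field that `uname -m` prints. Returns 0 on success, -1 on failure. */
int machine_name(char *out, size_t out_size) {
    struct utsname info;
    if (out == NULL || out_size == 0 || uname(&info) < 0) {
        return -1;
    }
    strncpy(out, info.machine, out_size - 1);
    out[out_size - 1] = '\0';
    return 0;
}
```

Calling machine_name with a 64-byte buffer should be enough for the usual values. The exact string depends on the machine the code runs on.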
ARM64 is the architecture used on mobile devices such as the iPhone, where limiting power consumption is of paramount importance.
ARM emphasizes power conservation, so it has a reduced set of opcodes, which helps it use less energy than architectures with more complex instructions. This is good news for you, because there are fewer instructions to learn on the ARM architecture.
Here is the same method shown earlier, this time in ARM64 assembly on an iPhone 7:
Apple originally shipped 32-bit ARM processors in many of its devices, but has since moved to 64-bit ARM processors. 32-bit devices are almost obsolete, because Apple has phased them out through successive iOS versions. For example, the iPhone 4s is a 32-bit device that is not supported by iOS 10. The only 32-bit device in the iPhone lineup that supports iOS 10 is the iPhone 5.
Interestingly, all Apple Watches are currently 32-bit. This is most likely because 32-bit ARM CPUs typically draw less power than their 64-bit siblings. This matters for the watch because its battery is very small.
x86_64 register calling convention
Your CPU uses a set of registers to handle the data of your running program. Registers are storage holders, like the memory in your computer. However, they are located on the CPU itself, very close to the parts of the CPU that need them, so the CPU can access them extremely quickly.
Most instructions involve one or more registers and perform operations such as writing the contents of a register to memory, reading the contents of memory into a register, or performing arithmetic (addition, subtraction, and so on) on two registers.
In x64 (from here on, x64 is short for x86_64), the machine uses 16 general-purpose registers to manipulate data.
These registers are RAX, RBX, RCX, RDX, RDI, RSI, RSP, RBP, and R8 through R15. The names may not mean much to you now, but you will explore the most important registers soon.
When you call a function in x64, the way the registers are used follows a very specific convention. It dictates where the function's arguments go and where the return value will be when the function finishes. This matters because it allows code compiled with one compiler to be used with code compiled by another.
For example, take a look at the following Objective-C code:
NSString *name = @"Zoltan";
NSLog(@"Hello world, I am %@. I'm %d, and I live in %@.", name, 30, @"my father's basement");
Four parameters are passed to the NSLog function call. Some values are passed as is, while one is stored in a local variable and then referenced as a parameter in the function. However, when you view the code through assembly, the computer does not care about variable names; it only cares about locations in memory.
The following registers are used as parameters when a function is called in x64 assembly. Try to commit them to memory, because you will use them frequently from now on:
- First parameter: RDI
- Second parameter: RSI
- Third parameter: RDX
- Fourth parameter: RCX
- Fifth parameter: R8
- Sixth parameter: R9
If there are more than six parameters, the additional parameters are passed to the function on the stack.
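The register order above can be written down as a small lookup table. This is a sketch only, and argument_location and argument_position are hypothetical helper names. It covers integer and pointer arguments, which is what the list describes; floating-point arguments travel in separate registers and are not modeled here.

```c
#include <stddef.h>
#include <string.h>

/* Integer and pointer argument registers, in calling-convention order. */
static const char *const kArgumentRegisters[] = {
    "RDI", "RSI", "RDX", "RCX", "R8", "R9"
};
static const size_t kArgumentRegisterCount =
    sizeof kArgumentRegisters / sizeof kArgumentRegisters[0];

/* Where does the Nth (1-based) argument live on entry to a function?
   Returns NULL for position 0, which is not a valid argument number. */
const char *argument_location(unsigned position) {
    if (position == 0) {
        return NULL;
    }
    if (position <= kArgumentRegisterCount) {
        return kArgumentRegisters[position - 1];
    }
    return "stack";  /* seventh argument onward */
}

/* Reverse lookup: 1-based argument position held by a register name,
   or 0 if the register is not used for passing arguments. */
unsigned argument_position(const char *register_name) {
    if (register_name == NULL) {
        return 0;
    }
    for (size_t i = 0; i < kArgumentRegisterCount; i++) {
        if (strcmp(kArgumentRegisters[i], register_name) == 0) {
            return (unsigned)(i + 1);
        }
    }
    return 0;
}
```

For the NSLog call shown earlier, argument_location(1) gives RDI for the format string and argument_location(4) gives RCX for the last string.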
Returning to the Objective-C example above, you can picture the registers being set up like the following pseudo-code:
RDI = @"Hello world, I am %@. I'm %d, and I live in %@.";
RSI = @"Zoltan";
RDX = 30;
RCX = @"my father's basement";
NSLog(RDI, RSI, RDX, RCX);
When the NSLog function starts, these registers will contain the appropriate values, as shown above.
However, as soon as the function prologue (the opening section of a function, which prepares the stack and registers) has executed, the values in these registers are likely to change. The generated assembly typically overwrites these values, or simply discards the references, once the code no longer needs them.
This means that as soon as you leave the start of a function (by stepping over, stepping in, or stepping out), you can no longer assume the registers hold the values you want to observe, unless you actually look at the assembly to see what it is doing.
This calling convention heavily influences your debugging (and breakpoint) strategy. If you want to automate any kind of breaking and exploring, you should stop at the start of the function call so you can inspect or modify the parameters without having to dig into the assembly.
Objective-C and registers
Registers follow a specific calling convention. You can take that same knowledge and apply it to other languages.
When Objective-C executes a method, a special C function named objc_msgSend is executed. There are actually several variants of this function, which will be covered later. This is the heart of message dispatch. As its first parameter, objc_msgSend takes a reference to the object the message is being sent to. This is followed by a selector, which is simply a char * that specifies the name of the method to call on the object. Finally, objc_msgSend takes a variable number of arguments if the selector specifies that there are parameters.
Let's look at a concrete example in an iOS context:
[UIApplication sharedApplication];
The compiler turns this code into the following pseudo-code:
UIApplicationClass = [UIApplication class];
objc_msgSend(UIApplicationClass, "sharedApplication");
The first parameter is a reference to the UIApplication class, followed by the sharedApplication selector.
An easy way to tell whether a method takes parameters is to check for colons in the selector. Each colon represents one parameter.
Here is another Objective-C example:
NSString *helloWorldString = [@"Can't Sleep; " stringByAppendingString:@"Clowns will eat me"];
The compiler turns this into the following pseudo-code:
NSString *helloWorldString;
helloWorldString = objc_msgSend(@"Can't Sleep; ", "stringByAppendingString:", @"Clowns will eat me");
The first argument is an instance of NSString (@"Can't Sleep; "), followed by the selector, followed by a parameter that is also an NSString instance.
Using this knowledge of objc_msgSend, you can use the x64 registers to help explore content, which you will do shortly.
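The colon rule for selectors is easy to express in code. The sketch below treats a selector as a plain C string, as the text does; selector_argument_count and msg_send_argument_count are hypothetical names and are not part of the Objective-C runtime.

```c
#include <stddef.h>

/* Count the colons in a selector name; each colon stands for one argument. */
size_t selector_argument_count(const char *selector) {
    size_t count = 0;
    if (selector == NULL) {
        return 0;
    }
    for (const char *p = selector; *p != '\0'; p++) {
        if (*p == ':') {
            count++;
        }
    }
    return count;
}

/* Total values handed to objc_msgSend for a given selector:
   the receiver, the selector itself, then one per colon. */
size_t msg_send_argument_count(const char *selector) {
    return 2 + selector_argument_count(selector);
}
```

For "sharedApplication" this gives 2, so only RDI and RSI are in use. For "stringByAppendingString:" it gives 3, so RDX carries the string being appended.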
Putting theory into practice
You can download the tutorial project here
In this chapter, you will use a project from the tutorial's resource bundle called Registers. Open the project in Xcode and run it.
This is a fairly simple application that merely displays the contents of some x64 registers. It is important to note that this application cannot display the values of the registers at any given moment; it can only display their values during a specific function call. This means you will not see many changes to the register values, because they are only updated when the function that reads the registers is called.
Now that you understand what the Registers macOS application does, create a symbolic breakpoint on NSViewController's viewDidLoad method. Remember to use "NS" rather than "UI", because you are working with a Cocoa application.
Build and rerun the application. Once the debugger has stopped at the breakpoint, type the following into the LLDB console:
register read
This lists the main registers at the paused state of execution. However, this is too much information. You should selectively print registers and treat them as Objective-C objects instead.
If you recall, -[NSViewController viewDidLoad] will be translated into the following assembly pseudo-code:
RDI = NSViewControllerInstance
RSI = "viewDidLoad"
objc_msgSend(RDI, RSI)
With the x64 calling convention in mind, and knowing how objc_msgSend works, you can find the specific NSViewController instance that is being loaded.
In the LLDB console, enter:
po $rdi
You will get the output:
<Registers.ViewController: 0x6080000c13b0>
This prints the NSViewController reference held in the RDI register, which, as you now know, is where the first argument to the function is located.
In LLDB, it is important to prefix registers with the $ character, so LLDB knows you want the value of a register and not a variable in the current scope of the source code. Yes, that is different from the assembly you see in the disassembly view! Annoying, isn't it?
Note: When you stop in an Objective-C method, you will never see objc_msgSend in the LLDB backtrace. This is because the objc_msgSend family of functions performs a jmp, or jump, opcode in assembly. This means objc_msgSend acts as a trampoline function, and once the Objective-C code starts executing, all stack history of objc_msgSend is gone. This optimization is known as tail call optimization.
Try printing the RSI register, which should contain the selector that was called. Type the following into the LLDB console:
po $rsi
Unfortunately, you will get garbage output that looks something like this:
140735181830794
Why is that so?
An Objective-C selector is essentially just a char *. This means that, as with all C types, LLDB does not know how to format the data. As a result, you must explicitly cast the reference to the data type you want.
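The difference between the raw number and the readable selector can be reproduced in plain C. This is only a model of the idea, not of how LLDB formats values internally; format_raw_value and as_c_string are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Write a pointer's raw numeric value into out, which is roughly what
   you see when a register is printed without any type information.
   Returns the number of characters snprintf would have written. */
int format_raw_value(const char *text, char *out, size_t out_size) {
    return snprintf(out, out_size, "%ju", (uintmax_t)(uintptr_t)text);
}

/* Reinterpret a raw register value as a C string again. The same bits
   are now read as the address of a NUL-terminated character array. */
const char *as_c_string(uintptr_t register_value) {
    return (const char *)register_value;
}
```

Given a selector such as "viewDidLoad", format_raw_value produces only a long decimal address, while as_c_string applied to that same address returns the readable name. The cast changes how the bits are interpreted, not the bits themselves.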
Try casting it to the correct type:
iOS Advanced debug & Reverse Technology-Assembler Register call