iOS development: how to implement objc_msgSend

Source: Internet
Author: User

The objc_msgSend function underpins everything we do in Objective-C. Gwynne Raskind, a reader of Friday Q&A, suggested that I talk about the internals of objc_msgSend. What better way to understand something than to implement it yourself? Let's build our own objc_msgSend.

Tramapoline! Trampopoline! (Trampoline)

When you write an Objective-C message send:

[obj message]

The compiler generates a call to objc_msgSend:

objc_msgSend(obj, @selector(message));

objc_msgSend then takes care of dispatching the message.

What does it do? It looks up the appropriate function pointer, or IMP, and then jumps to it. Any parameters passed to objc_msgSend end up as the parameters to the IMP. The IMP's return value becomes the return value of the original message send.

Because objc_msgSend only receives its parameters, finds the appropriate function pointer, and jumps to it, it is sometimes called a trampoline (https://en.wikipedia.org/wiki/Trampoline_(computing)). More generally, any piece of code that exists to forward execution to another piece of code can be called a trampoline.

This forwarding behavior is what makes objc_msgSend special. Because it simply looks up the right code and jumps straight to it, it is completely generic. Any combination of parameters can be passed in, since it just leaves them in place for the IMP to read. Return values are a bit trickier, but they are all handled by different variants of objc_msgSend.

Unfortunately, this forwarding behavior cannot be implemented in pure C, because there is no way to pass the arbitrary parameters a C function received along to another function. Variable arguments come close, but they are passed differently from normal parameters and they are slow, so they are not suitable for ordinary C parameters.
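To make the limitation concrete, here is a minimal C sketch of a trampoline that works only because every target shares one fixed signature. The names toy_imp, toy_lookup, and toy_send are hypothetical, invented for this illustration; none of them belong to the Objective-C runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed signature shared by every target function. */
typedef int (*toy_imp)(int receiver, int arg);

static int add_impl(int receiver, int arg) { return receiver + arg; }
static int mul_impl(int receiver, int arg) { return receiver * arg; }

/* A toy "method table" mapping selector names to implementations. */
static const struct { const char *name; toy_imp imp; } toy_methods[] = {
    { "add:", add_impl },
    { "mul:", mul_impl },
};

/* Find the implementation for a selector name, or NULL if there is none. */
static toy_imp toy_lookup(const char *sel)
{
    for (size_t i = 0; i < sizeof toy_methods / sizeof toy_methods[0]; i++)
        if (strcmp(toy_methods[i].name, sel) == 0)
            return toy_methods[i].imp;
    return NULL;
}

/* The trampoline: look up the target, then pass the same arguments along.
   This works in C only because the argument list is known at compile time. */
static int toy_send(int receiver, const char *sel, int arg)
{
    toy_imp imp = toy_lookup(sel);
    assert(imp != NULL);  /* this sketch simply refuses unknown selectors */
    return imp(receiver, arg);
}
```

For example, toy_send(3, "add:", 4) calls add_impl(3, 4). A target that takes a double, a struct, or a different number of arguments would need its own variant of toy_send, which is the problem objc_msgSend avoids by never touching the arguments at all.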

If objc_msgSend could be written in C, it would look roughly like this:

id objc_msgSend(id self, SEL _cmd, ...)
{
    Class c = object_getClass(self);
    IMP imp = class_getMethodImplementation(c, _cmd);
    return imp(self, _cmd, ...);  // pseudocode: C cannot forward "..." like this
}

This is a bit oversimplified. In reality there is a method cache to speed up the lookup, like this:

id objc_msgSend(id self, SEL _cmd, ...)
{
    Class c = object_getClass(self);
    IMP imp = cache_lookup(c, _cmd);
    if(!imp)
        imp = class_getMethodImplementation(c, _cmd);
    return imp(self, _cmd, ...);
}

For speed, cache_lookup would typically be implemented as an inline function.
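The cache-then-fallback pattern can be sketched in plain C. This is a toy under stated assumptions, not the runtime's cache: selectors are small non-negative integers, the cache holds a single entry, and the names toy_fn, slow_lookup, and lookup are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*toy_fn)(void);

static int answer(void) { return 42; }

static int slow_lookups = 0;      /* counts trips down the slow path */

/* Stand-in for the expensive full method search. */
static toy_fn slow_lookup(int sel)
{
    slow_lookups++;
    return sel == 1 ? answer : NULL;
}

/* A one-entry cache remembering the most recent successful lookup.
   -1 marks the cache as empty, so valid selectors must be >= 0. */
static int cached_sel = -1;
static toy_fn cached_fn = NULL;

static inline toy_fn cache_lookup(int sel)
{
    return sel == cached_sel ? cached_fn : NULL;
}

/* Try the cache first; on a miss, take the slow path and remember the hit. */
static toy_fn lookup(int sel)
{
    assert(sel >= 0);
    toy_fn fn = cache_lookup(sel);
    if (!fn) {
        fn = slow_lookup(sel);
        if (fn) {
            cached_sel = sel;
            cached_fn = fn;
        }
    }
    return fn;
}
```

Calling lookup(1) twice runs slow_lookup only once; the second call is answered from the cache. Failed lookups are not cached in this sketch, so every miss takes the slow path again.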

Assembly

In Apple's runtime, the entire function is written in assembly for maximum speed. objc_msgSend is called for every message sent in Objective-C, and even the simplest action in an application can involve thousands or millions of messages.

To keep things simple, my own implementation uses as little assembly as possible and pushes the complexity into a separate C function. The assembly code implements the equivalent of the following:

id objc_msgSend(id self, SEL _cmd, ...)
{
    IMP imp = GetImplementation(self, _cmd);
    return imp(self, _cmd, ...);
}

GetImplementation can then do its work in more readable C.

The assembly code needs to:

1. Store all potential parameters somewhere safe so that GetImplementation does not overwrite them.

2. Call GetImplementation.

3. Save the return value somewhere.

4. Restore all parameter values.

5. Jump to the IMP returned by GetImplementation.

Let's get started!

I will use x86-64 assembly here so that the code runs easily on a Mac. The same concepts apply to i386 or ARM.

The function lives in a separate file called msgsend-asm.s. This file can be passed to the compiler like any other source file, and it is then assembled and linked into the program.

The first thing to do is declare the global symbol. For boring historical reasons, the global symbol for a C function has an underscore prefixed to its name:

.globl _objc_msgSend
_objc_msgSend:

The linker will happily link against the nearest available objc_msgSend. Simply linking this file into a test app makes [obj message] expressions use our code instead of Apple's runtime, which makes it convenient to test that the code works.

Integer and pointer parameters are passed in the registers %rdi, %rsi, %rdx, %rcx, %r8, and %r9, in that order. Any additional parameters are passed on the stack. The first thing this function does is save the values of these six registers on the stack so that they can be restored later:

    pushq %rsi
    pushq %rdi
    pushq %rdx
    pushq %rcx
    pushq %r8
    pushq %r9

In addition to these registers, %rax acts as a hidden parameter. It is used for variable-argument calls and holds the number of vector registers used to pass arguments, so that the called function can set up its variable argument list correctly. In case the target function is a variable-argument method, I save this register as well:

    pushq %rax

For completeness, the %xmm registers used to pass floating-point parameters should also be saved. However, as long as I can ensure that GetImplementation does not use any floating point, I can skip them and keep the code more concise.

Next, align the stack. Mac OS X requires the stack to be aligned to a 16-byte boundary when a function is called. The code above already leaves the stack aligned, but it is better to handle alignment explicitly so that everything is guaranteed to be aligned and there is no risk of a crash when calling the function. To align the stack, I push the original value of %r12 and then save the current stack pointer in %r12. The choice of %r12 is arbitrary; any callee-saved register would do. What matters is that the value survives the call to GetImplementation. Then I bitwise-AND the stack pointer with -0x10, which clears its bottom four bits:

    pushq %r12
    mov %rsp, %r12
    andq $-0x10, %rsp

Now the stack pointer is aligned. This cannot clobber the registers saved above, because the stack grows downward and the alignment step can only move the pointer further down.
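The arithmetic behind the alignment step can be checked in C. The following is a small sketch using a hypothetical helper, align_down_16, that applies the same mask to an ordinary integer instead of the stack pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Round an address down to a 16-byte boundary, as `andq $-0x10, %rsp` does.
   In two's complement, -0x10 is ...11110000, so the AND clears the low
   four bits and leaves every other bit alone. */
static uintptr_t align_down_16(uintptr_t addr)
{
    uintptr_t aligned = addr & ~(uintptr_t)0xF;
    /* The result is a multiple of 16 and at most 15 bytes below the input. */
    assert(aligned % 16 == 0 && aligned <= addr && addr - aligned < 16);
    return aligned;
}
```

An address that is already aligned is left unchanged, and any other address moves down by at most 15 bytes. Moving down is the safe direction on a stack that grows downward, because it never lands on data that has already been pushed.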

Now it is time to call GetImplementation. It takes two parameters, self and _cmd. The calling convention places these two parameters in %rdi and %rsi respectively. That is exactly where they were when objc_msgSend was called, and they have not been moved, so nothing needs to change. All that remains is to actually call GetImplementation, again with an underscore prefixed to the function name:

    callq _GetImplementation

Integer and pointer return values are placed in %rax, so that is where the returned IMP can be found. Because %rax needs to be restored to its original state, the IMP has to be moved somewhere else. I arbitrarily chose %r11:

    mov %rax, %r11

Now it is time to put everything back. First, restore the stack pointer that was saved in %r12, and then restore the old value of %r12:

    mov %r12, %rsp
    popq %r12

Then pop the registers in the reverse of the order in which they were pushed:

    popq %rax
    popq %r9
    popq %r8
    popq %rcx
    popq %rdx
    popq %rdi
    popq %rsi

Now everything is ready. The argument registers are all restored to their previous state, so the parameters the target function needs are all in the right places. The IMP is in %r11, and all that remains is to jump to it:

    jmp *%r11

That's it! No other assembly code is required. The jump hands control over to the method implementation. From the method's point of view, it looks as if the sender called it directly. All the roundabout work done before the jump has disappeared. When the method returns, it returns directly to the caller of objc_msgSend with nothing further to do. The method's return value is found in the proper place.

There are some details to note about unconventional return values, such as large structs (those too large to be returned in registers). On x86-64, a large struct is returned through a hidden first parameter. When you write a call like this:

    NSRect r = SomeFunc(a, b, c);

The call is translated into something like this:

    NSRect r;
    SomeFunc(&r, a, b, c);

The address of the memory for the return value is passed in %rdi. Because objc_msgSend expects %rdi and %rsi to contain self and _cmd, it does not work for messages that return a large struct. The same problem exists on several other platforms. The runtime provides objc_msgSend_stret for struct returns. It works just like objc_msgSend, except that it knows to look for self in %rsi and _cmd in %rdx.

A similar issue arises on some platforms for messages that return floating-point values. On those platforms, the runtime provides objc_msgSend_fpret (and on x86-64, objc_msgSend_fp2ret for especially extreme cases).
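The hidden-parameter translation can be written out by hand in C. This is only an illustration of the idea: toy_rect, make_rect, and make_rect_sret are hypothetical names, and a real compiler performs this rewrite internally instead of emitting a second function.

```c
#include <assert.h>
#include <stddef.h>

/* Four longs: 32 bytes on LP64 platforms, too large to return in registers
   under the x86-64 calling convention. */
typedef struct { long x, y, width, height; } toy_rect;

/* What the programmer writes: the struct is returned by value. */
static toy_rect make_rect(long x, long y, long w, long h)
{
    toy_rect r = { x, y, w, h };
    return r;
}

/* Roughly what the compiler arranges: the caller supplies the storage
   through an extra first parameter, and every visible parameter shifts
   over by one argument slot. */
static void make_rect_sret(toy_rect *out, long x, long y, long w, long h)
{
    assert(out != NULL);
    out->x = x;
    out->y = y;
    out->width = w;
    out->height = h;
}
```

In make_rect_sret, the pointer takes the first argument slot and x moves to the second. That one-slot shift is what moves self and _cmd from %rdi and %rsi to %rsi and %rdx for messages that return large structs.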

Method Search

Let's move on to implementing GetImplementation. Thanks to the assembly trampoline above, this code can be written in C. Remember that in the real runtime, this code is written directly in assembly for the greatest possible speed. That not only gives better control over the code, but also avoids having to repeat the register saving and restoring shown above.

GetImplementation could simply call class_getMethodImplementation and defer to the Objective-C runtime's implementation. That would be boring. The real objc_msgSend first searches the class's method cache for maximum speed. Since GetImplementation is meant to imitate objc_msgSend, it will do the same. If the cache does not contain an entry for the given selector, it falls back to querying the runtime.

What we need now are some struct definitions. The method cache is a private struct inside the class struct, so to get at it we need to define our own versions. Although they are private, the definitions of these structs are available in Apple's open-source release of the Objective-C runtime.

First, you need to define a cache entry:

typedef struct {
    SEL name;
    void *unused;
    IMP imp;
} cache_entry;

Fairly simple. Don't ask me what the unused field is for; I have no idea why it is there. Here is the definition of the cache itself:

struct objc_cache {
    uintptr_t mask;
    uintptr_t occupied;
    cache_entry *buckets[1];
};

The cache is implemented as a hash table. The table is built for speed, and everything not needed for that has been stripped away, so it is a little unusual. The table size is always a power of two. The table is indexed by selector: the bucket index is computed directly from the selector value, possibly shifted to remove irrelevant low bits, and then combined with the mask using a bitwise AND. Here are the macros that compute the bucket index given a selector and a mask:

#ifndef __LP64__
# define CACHE_HASH(sel, mask) (((uintptr_t)(sel)>>2) & (mask))
#else
# define CACHE_HASH(sel, mask) (((unsigned int)((uintptr_t)(sel)>>0)) & (mask))
#endif
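The following sketch shows why a power-of-two table size lets a single AND stand in for a modulo operation. The helpers bucket_index and next_index are hypothetical; they mirror the shape of the macros above but take the table size and shift as explicit parameters.

```c
#include <assert.h>
#include <stdint.h>

/* Map a selector address to a bucket index for a table of `size` buckets.
   `size` must be a power of two so that size - 1 is an all-ones bit mask. */
static uintptr_t bucket_index(const void *sel, uintptr_t size, unsigned shift)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    uintptr_t mask = size - 1;
    return ((uintptr_t)sel >> shift) & mask;
}

/* Step to the next bucket, wrapping from the last bucket back to the first. */
static uintptr_t next_index(uintptr_t index, uintptr_t mask)
{
    return (index + 1) & mask;
}
```

With eight buckets the mask is 7, so every result lies between 0 and 7, and stepping past bucket 7 wraps around to bucket 0 without a branch or a division.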

Finally, here is the class struct. This is the type that Class points to:

struct class_t {
    struct class_t *isa;
    struct class_t *superclass;
    struct objc_cache *cache;
    IMP *vtable;
};

The required structures are now in place. Let's start implementing GetImplementation:

IMP GetImplementation(id self, SEL _cmd)
{

The first thing to do is get the object's class. The real objc_msgSend does this with something like self->isa, but here I will use the official API:

    Class c = object_getClass(self);

Because I want to access the raw internals, I cast it to a pointer to the class_t struct:

    struct class_t *classInternals = (struct class_t *)c;

Now it is time to find the IMP. We start by setting it to NULL. If we find it in the cache, we assign it. If it is still NULL after searching the cache, we fall back to the slow path:

    IMP imp = NULL;

Next, get a pointer to the cache:

    struct objc_cache *cache = classInternals->cache;

Compute the bucket index and get a pointer to the buckets array:

    uintptr_t index = CACHE_HASH(_cmd, cache->mask);
    cache_entry **buckets = cache->buckets;

Then search the cache for the selector. The runtime uses linear probing, so this is just a matter of walking the subsequent buckets until either the required entry or a NULL entry is found:

    for(; buckets[index] != NULL; index = (index + 1) & cache->mask)
    {
        if(buckets[index]->name == _cmd)
        {
            imp = buckets[index]->imp;
            break;
        }
    }

If no entry is found, we call into the runtime and take the slow path. In the real objc_msgSend, all of the code above is written in assembly, and this is the point where it leaves assembly and calls a runtime function. Once the cache search fails to find the entry, any hope of a fast message send is lost. Squeezing out more speed no longer matters much at that point, because the send is already doomed to be slow and this path is taken relatively rarely. For that reason, it is acceptable to leave assembly behind and switch to more maintainable C:

    if(imp == NULL)
        imp = class_getMethodImplementation(c, _cmd);

Either way, the IMP is now available. If it was in the cache, it was found there; otherwise the runtime found it. The call to class_getMethodImplementation also populates the cache, so the next call will be faster. All that remains is to return the IMP:

    return imp;
}
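The cache walk in GetImplementation probes successive buckets until it finds a match or an empty slot. The same technique can be tried in isolation with a toy table in plain C. All names here (toy_entry, toy_insert, toy_find) are hypothetical, keys are plain integers instead of selectors, and unlike the real cache this table never grows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uintptr_t key; int value; } toy_entry;

#define TOY_SIZE 8u                     /* must be a power of two */
#define TOY_MASK (TOY_SIZE - 1u)

static toy_entry *toy_buckets[TOY_SIZE];    /* NULL marks an empty bucket */
static toy_entry toy_storage[TOY_SIZE];
static size_t toy_count = 0;

/* Insert by probing forward from the hashed index to the first empty
   bucket. Duplicate keys are not detected in this sketch. */
static void toy_insert(uintptr_t key, int value)
{
    /* Always leave at least one bucket empty so that lookups terminate. */
    assert(toy_count < TOY_SIZE - 1);
    toy_entry *e = &toy_storage[toy_count++];
    e->key = key;
    e->value = value;
    uintptr_t index = key & TOY_MASK;
    while (toy_buckets[index] != NULL)
        index = (index + 1) & TOY_MASK;
    toy_buckets[index] = e;
}

/* Walk the buckets from the hashed index until the key or an empty
   bucket is found. Returns NULL on a miss. */
static const toy_entry *toy_find(uintptr_t key)
{
    for (uintptr_t index = key & TOY_MASK; toy_buckets[index] != NULL;
         index = (index + 1) & TOY_MASK)
        if (toy_buckets[index]->key == key)
            return toy_buckets[index];
    return NULL;
}
```

Keys 3 and 11 both hash to bucket 3 in this table, so the second one lands in bucket 4. A lookup for 11 checks bucket 3 first and then moves on. A lookup for a key that was never inserted stops at the first empty bucket, which is the point where GetImplementation falls back to the runtime.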

Test

To ensure that it works, I wrote a quick test program:

#import <Foundation/Foundation.h>

@interface Test : NSObject
- (void)none;
- (void)param: (int)x;
- (void)params: (int)a : (int)b : (int)c : (int)d : (int)e : (int)f : (int)g;
- (int)retval;
@end

@implementation Test

- (id)init
{
    fprintf(stderr, "in init method, self is %p\n", self);
    return self;
}

- (void)none
{
    fprintf(stderr, "in none method\n");
}

- (void)param: (int)x
{
    fprintf(stderr, "got parameter %d\n", x);
}

- (void)params: (int)a : (int)b : (int)c : (int)d : (int)e : (int)f : (int)g
{
    // One %d per argument so that all seven values are printed.
    fprintf(stderr, "got params %d %d %d %d %d %d %d\n", a, b, c, d, e, f, g);
}

- (int)retval
{
    fprintf(stderr, "in retval method\n");
    return 42;
}

@end

int main(int argc, char **argv)
{
    for(int i = 0; i < 20; i++)
    {
        Test *t = [[Test alloc] init];
        [t none];
        [t param: 9999];
        [t params: 1 : 2 : 3 : 4 : 5 : 6 : 7];
        fprintf(stderr, "retval gave us %d\n", [t retval]);

        NSMutableArray *a = [[NSMutableArray alloc] init];
        [a addObject: @1];
        [a addObject: @{ @"foo" : @"bar" }];
        [a addObject: @("blah")];
        a[0] = @2;
        NSLog(@"%@", a);
    }
}

I added some debug logging to GetImplementation to make sure it was actually being called, in case the runtime's implementation was somehow being used by accident. Everything works. Even the literals and subscripting call the replacement implementation.

Conclusion

The core of objc_msgSend is quite simple. Its implementation requires some assembly code, which makes it harder to understand than it ought to be, and the real thing uses even more assembly for the sake of performance. But by building a simple assembly trampoline and implementing the logic in C, we can see how it works, and there is really nothing magical about it.

Obviously, you should not use a replacement objc_msgSend implementation in your own apps. You will regret it. This is purely a learning exercise.
