In my OS x programming of late, one debugging technique I thought wocould be handy is the ability log and trace Objective-C messages sent. it turns out this is not only possible, but quite easy. however my journey to this easy answer was quite long and took me deep into the Objective-C rabbit hole.
Apple's Cocoa programing environment is based on Objective-C, a Smalltalk-influenced object oriented extension to C. objC is a dynamic language, meaning method CILS ("sending messages" in true OO parlance) are done completely at runtime, rather than "hardwired" at compile-time like in C ++. this runtime dispatching is handled by a function namedobjc_msgSend
Under most circumstances (see the Objective-C Runtime Reference for more information). Thus, the compiler turns the following method call:
[foo doIt];
Into this C-style function call, behind the scenes:
objc_msgSend(foo, @selector(doIt));
Since most message sends go through this single entry point, tracing and logging can be implemented right here, in one central location. I figured I cocould create a custom versionobjc_msgSend
That logs the class and selector and then jumps to the realobjc_msgSend
. If implemented as a shared library, you cocould even replace the system library by usingDYLD_INSERT_LIBRARIES
Environment variable. The ability to replace or override a function in a dynamic library is known as interposing, and Apple even has some nice documentation on how to do this.
The variable argument signatureobjc_msgSend
Makes interposing less than straight forward, though. As is common with most variable argument functions, a "v" variant that takes a pointer to the arguments is available, calledobjc_msgSendv
. However, after some digging, it turns it usingobjc_msgSendv
Is a bit more involved than I wanted to deal.
Surely someone has tried this before, so I turned to Google. google, in turn, led me to OpenDarwin, Apple's open source repository. the whole ObjC runtime is open source, under the package name "objc4 ". A quick glance at the code is quite impressive. someone has clearly spent a lot of time optimizingobjc_msgSend
. One bit of code that looked promising wasOBJC_HELP
Environment variable. When set, this variable shows all the tweaks you can do to the runtime. Here's the output when run on a simple ObjC program:
% OBJC_HELP=1 ./build/Debug/HelloWorldobjc: OBJC_HELP: describe Objective-C runtime environment variablesobjc: OBJC_PRINT_OPTIONS: list which options are setobjc: OBJC_PRINT_IMAGES: log image and library names as the runtime loads themobjc: OBJC_PRINT_CONNECTION: log progress of class and category connectionsobjc: OBJC_PRINT_LOAD_METHODS: log class and category +load methods as they are calledobjc: OBJC_PRINT_RTP: log initialization of the Objective-C runtime pagesobjc: OBJC_PRINT_GC: log some GC operationsobjc: OBJC_PRINT_SHARING: log cross-process memory sharingobjc: OBJC_PRINT_CXX_CTORS: log calls to C++ ctors and dtors for instance variablesobjc: OBJC_DEBUG_UNLOAD: warn about poorly-behaving bundles when unloadedobjc: OBJC_DEBUG_FRAGILE_SUPERCLASSES: warn about subclasses that may have been broken by subsequent changes to superclassesobjc: OBJC_USE_INTERNAL_ZONE: allocate runtime data in a dedicated malloc zoneobjc: OBJC_ALLOW_INTERPOSING: allow function interposing of objc_msgSend()objc: OBJC_FORCE_GC: force GC ON, even if the executable wants it offobjc: OBJC_FORCE_NO_GC: force GC OFF, even if the executable wants it onobjc: OBJC_CHECK_FINALIZERS: warn about classes that implement -dealloc but not -finalize2006-04-22 12:08:17.544 HelloWorld[4831] Hello, World!
Pretty cool! There are a few interesting options here. first is the suite of garbage collection (GC) variables. this is more evidence in support of GC in the next version of OS X, which has been speculated before. there was nothing about logging or tracing, however. theOBJC_ALLOW_INTERPOSING
Looked particle ly interesting, though. This means that not only has someone tried to interposeobjc_msgSend
, But that Apple actually has support for this. Unfortunately, doing a Google search on this environment variable turned up nothing. But now at least I now knew interposingobjc_msgSend
Cocould be done.
The next idea I had was to resort to assembly code. Surely, if I cocould just save the registers on the stack, I cocould log the information, then pop the stack and call the realobjc_msgSend
. By restoring the registers, I cocould bypass all the variable argument ugliness and it woshould all just work. well a few minutes into my PPC assembly hacking, I started thinking that this really sucked. although tracing was cool, and it was not really worth the effort it wocould take to complete. but again, I thought "surely someone else has done this ". now that I had the source code to the ObjC runtime, maybe I cocould create a custom version with logging or something.
One lucky search later (it's all a matter of searching with the right set of words), I stumbled guest SS Technical Note TN2124: Max OS X Debugging Magic. now I 've read this article before, and it does indeed have a lot of good information. but apparently I missed one sentence buried away:
If you setNSObjCMessageLoggingEnabled
Environment variable to "YES", the Objective-C runtime will log all dispatched Objective-C messages to a file named/tmp/msgSends-<pid>
.
Holy crap! This is exactly what I was trying to implement! Sure enough, it works:
% NSObjCMessageLoggingEnabled=YES ./build/Debug/HelloWorld2006-04-22 12:30:52.399 HelloWorld[4870] Hello, World!% cat /tmp/msgSends-4870 + NSObject NSObject initialize+ NSNotificationCenter NSObject initialize+ NSNotificationCenter NSNotificationCenter defaultCenter....
I told you it was quite easy, and it doesn't get much easier than this. All that crap about interposing and writing my ownobjc_msgSend
Was completely unnecessary. Okay, now I was curious. How was this being implemented? Even though I had my answer, I wanted to dig further. well, this environment variable isn't part of the ObjC runtime (it's actually part of Foundation), but the runtime does have a function calledinstrumentObjcMessageSends
, Which enables and disables logging. You can do this in your code to selectively trace only a section of your code:
instrumentObjcMessageSends(YES); // Do stuff... instrumentObjcMessageSends(NO);
You can even call this from GDB to enable it while debugging:
(gdb) call (void)instrumentObjcMessageSends(YES)
After more digging in the runtime, I found these functions:
typedef int (*ObjCLogProc)(BOOL, const char *, const char *, SEL);void logObjcMessageSends(ObjCLogProc logProc);static int LogObjCMessageSend(BOOL isClassMethod, const char * objectsClass, const char * implementingClass, SEL selector);
Ah... you can customize your own logger! With this, you cocould do something like filter by class or method. It turns out usinglogObjcMessageSends
Is not that easy, though. you'll get an "undefined symbol" link error, even though the function is not defined as static. for some reason, Apple decided not to export this function from the shared object:
% nm /usr/lib/libobjc.dylib | egrep 'instrumentObjcMessageSends|logObjcMessageSends'909c7cd4 T _instrumentObjcMessageSends909c7d4c t _logObjcMessageSends
[Update 5-Aug-2006:Greg Miller describes how to override_logObjcMessageSends
Usingnlist(3)
Inthis blog entry. Apparentlynlist(3)
Can lookup private symbols. Thanks for the information, Greg!]
The lower-case "t" means this is a local symbol, not a global symbol. This is confirmed by looking atobjc-exports-ppc
File in the objc4 distribution. only Apple knows why this isn't exported, but since we have the source, we can modify the export list to include this and rebuild. I had to use gcc 3.3 to compile objc4, as gcc 4.0 choked on some of the flags. one way to do this is to usegcc_select
. I just editedCC
Variable inMakefile
, So I can still use gcc 4.0 for everything else. However, it still won't link, as it's not easy to get the compiler to link against the hackedlibobjc
. Using-L
Didn't work for me, even thoughld (1) says "directories specified with-L are searched before the standard directories ".
This requires resorting to some shared library trickery. We can usedlsym
To lookup the address of this symbol at runtime, and assign it to a function pointer:
#include <dlfcn.h>typedef int (*ObjCLogProc)(BOOL, const char *, const char *, SEL);typedef void (*logObjcMessageSends_t)(ObjCLogProc logProc);logObjcMessageSends_t logObjcMessageSends = 0;int main (int argc, const char * argv[]) { logObjcMessageSends = dlsym(RTLD_DEFAULT, "logObjcMessageSends");
Now, to make sure that our app runs even against the standardlibobjc
, Let's only use the pointer if it's not null:
if (logObjcMessageSends != 0) logObjcMessageSends(MyLogObjCMessageSend);
You cocould even leave this in production code, if you wanted. Because of this trickery, you'll need to turn off ZeroLink. Also, to actually use our hackedlibobj
, You'll need to forcedyld
To use our library, instead of the system one:
% DYLD_INSERT_LIBRARIES=~/src/objc4-267.1/libobjc.A.dylib ./build/Debug/HelloWorld
Now,MyLogObjCMessageSend
Gets called and we can log where ever and whatever we like. A cool implementation wocould use some sort of config file with regular expressions. Logging everythingNSObjCMessageLoggingEnabled
Is good enough for me, though. I'll just stop right here, but it's good to know ICocouldCustomize the logging if I needed to, as some future point.
I 've since found this blog post mentioninginstrumentObjcMessageSends
. But it turns out usingNSObjCMessageLoggingEnabled
Is much simpler than the solution of using an input manager, too.