How does Mac OS X execute applications?

As a long-time UNIX user, I keep a set of tools on hand for troubleshooting system problems. Recently I have been developing software and adding support for Apple's OS X. However, unlike other traditional UNIX variants, OS X lacks many of the familiar tools for inspecting how programs are loaded, linked, and executed.

For example, when a shared library relocation error occurs, the first thing I do is run ldd on the executable. The ldd tool lists the shared libraries (including their paths) that the executable depends on. However, if you try to run ldd on OS X, you get an error.

evil:~ mohit$ ldd /bin/ls
-bash: ldd: command not found

Not found? But it exists on practically every UNIX-like system! Let me see whether objdump is available.

evil:~ mohit$ objdump -x /bin/ls
-bash: objdump: command not found

Command not found! What's going on?

The problem is that, unlike Linux, Solaris, HP-UX, and many other UNIX variants, OS X does not use the ELF file format. In addition, OS X is not part of the GNU project, which is where tools such as ldd and objdump come from.

To obtain the list of shared libraries that an executable depends on, you need to use otool.

evil:~ mohit$ otool /bin/ls
otool: one of -fahlLtdoOrTMRIHScis must be specified
Usage: otool [-fahlLDtdorSTMRIHvVcXm] <object file> ...
        -f print the fat headers
        -a print the archive header
        -h print the mach header
        -l print the load commands
        -L print shared libraries used
        -D print shared library id name
        -t print the text section (disassemble with -v)
        -p <routine name> start dissassemble from routine name
        -s <segname> <sectname> print contents of section
        -d print the data section
        -o print the Objective-C segment
        -r print the relocation entries
        -S print the table of contents of a library
        -T print the table of contents of a dynamic shared library
        -M print the module table of a dynamic shared library
        -R print the reference table of a dynamic shared library
        -I print the indirect symbol table
        -H print the two-level hints table
        -v print verbosely (symbolicly) when possible
        -V print disassembled operands symbolicly
        -c print argument strings of a core file
        -X print no leading addresses or headers
        -m don't use archive(member) syntax
evil:~ mohit$ otool -L /bin/ls
/bin/ls:
        /usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.0.0)

Much better! We can see that /bin/ls references two dynamic libraries, although the .dylib file extension looks unfamiliar.
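If you need this dependency list in a script, the otool -L output is regular enough to parse. The following is a minimal sketch in Python, assuming the output keeps the layout shown above; the function name parse_otool_libraries and the regular expression are illustrative and not part of otool.

```python
import re

# Sample text copied from the `otool -L /bin/ls` output shown above.
SAMPLE = """/bin/ls:
\t/usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)
\t/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.0.0)
"""

# One dependency line: indented path, then the two versions in parentheses.
LINE_RE = re.compile(
    r"^\s+(?P<path>\S.*?) \(compatibility version (?P<compat>[\d.]+), "
    r"current version (?P<current>[\d.]+)\)\s*$"
)


def parse_otool_libraries(text):
    """Return (path, compatibility_version, current_version) tuples."""
    libs = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # the "/bin/ls:" title line does not match and is skipped
            libs.append(
                (match.group("path"), match.group("compat"), match.group("current"))
            )
    return libs


if __name__ == "__main__":
    for path, compat, current in parse_otool_libraries(SAMPLE):
        print(path, compat, current)
```

On a real system you would feed it the captured output of otool -L instead of the SAMPLE string.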

I suspect many UNIX and Linux users have had a similar experience on OS X, so I decided to write up what I currently know about OS X executables.

The OS X runtime architecture

A runtime environment is a framework for executing code on OS X. It consists of a set of conventions that define how code is loaded, managed, and executed. When an application is launched, the appropriate runtime environment loads the program into memory, resolves its references to external libraries, and prepares the code for execution.

OS X supports three runtime environments:

  • dyld runtime environment: the recommended environment, based on the dyld library manager.
  • CFM runtime environment: a legacy environment from Mac OS 9. It is intended for applications that want to use new OS X features but have not yet been fully ported to dyld.
  • Classic environment: Mac OS 9 (9.1 or 9.2) programs run directly on OS X without modification.

This article focuses on the dyld runtime environment.

Almost all executable files on OS X use the Mach-O file format: applications, frameworks, libraries, and kernel extensions are all implemented as Mach-O files. Mach-O is both a file format and an ABI (application binary interface) that describes how an executable is loaded and run by the kernel. Specifically, it tells the system which dynamic linker to use, which shared libraries to load, how the process address space is organized, and where the function entry point is.

Mach-O is not new. It was originally designed by the Open Software Foundation (OSF) for its OSF/1 operating system, which was based on the Mach microkernel. It was later adapted for x86 systems with OpenStep.

To work with the dyld runtime environment, all files must be compiled into the Mach-O executable file format.

Organization of the Mach-O file

A Mach-O file is divided into three regions: the header, the load commands, and the raw segment data. The header and the load commands describe the features, layout, and other characteristics of the file. The raw segment data contains the byte sequences that the load commands refer to. To examine the various parts of a Mach-O file, OS X ships with a very useful program, otool, located in /usr/bin.

Next, we will use otool to learn how Mach-O files are organized.

The header

To view the Mach-O header of a file, use the -h option of the otool command.

evil:~ mohit$ otool -h /bin/ls
/bin/ls:
Mach header
      magic cputype cpusubtype filetype ncmds sizeofcmds      flags
 0xfeedface      18          0        2    11       1608 0x00000085

The first field in the header is the magic number. The magic number identifies the file as either a 32-bit or a 64-bit Mach-O file. It also identifies the byte order of the CPU the file was built for. For more information about the magic number, see /usr/include/mach-o/loader.h.

The header also specifies the target architecture of the file. This allows the kernel to ensure that the code is not run on a processor it was not compiled for. For example, in the output above, cputype is set to 18, which is CPU_TYPE_POWERPC as defined in /usr/include/mach/machine.h.

From these two pieces of information, we can infer that this binary is intended for 32-bit PowerPC-based systems.
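The same inference can be made programmatically. The sketch below decodes the seven header fields that otool -h prints, assuming the classic layout of seven 32-bit integers. The function name parse_mach_header is hypothetical, the CPU table is a deliberately small subset, and the sample bytes are synthesized from the values in the output above rather than read from a real /bin/ls.

```python
import struct

MH_MAGIC = 0xFEEDFACE      # 32-bit Mach-O
MH_MAGIC_64 = 0xFEEDFACF   # 64-bit Mach-O
CPU_TYPES = {7: "CPU_TYPE_I386", 18: "CPU_TYPE_POWERPC"}  # small subset only

FIELDS = ("magic", "cputype", "cpusubtype", "filetype",
          "ncmds", "sizeofcmds", "flags")


def parse_mach_header(data):
    """Decode the fixed-size Mach-O header at the start of `data`."""
    # Try both byte orders; the one that yields a known magic is the file's.
    for endian in (">", "<"):
        (magic,) = struct.unpack_from(endian + "I", data, 0)
        if magic in (MH_MAGIC, MH_MAGIC_64):
            break
    else:
        raise ValueError("not a thin Mach-O file")
    header = dict(zip(FIELDS, struct.unpack_from(endian + "IiiIIII", data, 0)))
    header["bits"] = 64 if magic == MH_MAGIC_64 else 32
    header["byte_order"] = "big" if endian == ">" else "little"
    header["cpu_name"] = CPU_TYPES.get(header["cputype"], "unknown")
    return header


# Synthetic big-endian header built from the otool -h values shown above.
SAMPLE = struct.pack(">IiiIIII", 0xFEEDFACE, 18, 0, 2, 11, 1608, 0x85)

if __name__ == "__main__":
    print(parse_mach_header(SAMPLE))
```

To try it on a real file, pass the first 28 bytes read from the binary in place of SAMPLE.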

Sometimes a binary contains code for more than one architecture. Such files are called Universal Binaries and usually begin with an extra header, the fat_header. To examine the contents of the fat_header, use the -f option of the otool command.
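For illustration, here is a sketch of how the fat_header could be read by hand. It assumes the layout I know from /usr/include/mach-o/fat.h: a big-endian magic of 0xcafebabe and an architecture count, followed by one 20-byte fat_arch record per architecture. The function name parse_fat_header is hypothetical, and the sample offsets and sizes are invented for the example.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # fat headers are always stored big-endian
ARCH_FIELDS = ("cputype", "cpusubtype", "offset", "size", "align")


def parse_fat_header(data):
    """Return one dict per architecture slice, or [] if `data` is not fat."""
    if len(data) < 8:
        return []
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        return []
    archs = []
    for i in range(nfat_arch):
        # Each fat_arch record is five 32-bit fields, 20 bytes in total.
        fields = struct.unpack_from(">iiIII", data, 8 + 20 * i)
        archs.append(dict(zip(ARCH_FIELDS, fields)))
    return archs


# Invented example: a PowerPC slice (18) and an i386 slice (7).
SAMPLE = (
    struct.pack(">II", FAT_MAGIC, 2)
    + struct.pack(">iiIII", 18, 0, 4096, 30000, 12)
    + struct.pack(">iiIII", 7, 3, 36864, 28000, 12)
)

if __name__ == "__main__":
    for arch in parse_fat_header(SAMPLE):
        print(arch)
```

Each slice found this way is itself a complete Mach-O file starting at the given offset.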

The cpusubtype field specifies the exact CPU model, and is usually set to CPU_SUBTYPE_POWERPC_ALL or CPU_SUBTYPE_I386_ALL.

The filetype field indicates how the file is to be aligned and used. In effect, it tells you whether the file is a library, a statically linked executable, a core file, and so on. The filetype above equals MH_EXECUTE, which denotes a demand-paged executable file. The following excerpt from /usr/include/mach-o/loader.h lists the different file types.

#define MH_OBJECT     0x1   /* relocatable object file */
#define MH_EXECUTE    0x2   /* demand paged executable file */
#define MH_FVMLIB     0x3   /* fixed VM shared library file */
#define MH_CORE       0x4   /* core file */
#define MH_PRELOAD    0x5   /* preloaded executable file */
#define MH_DYLIB      0x6   /* dynamically bound shared library */
#define MH_DYLINKER   0x7   /* dynamic link editor */
#define MH_BUNDLE     0x8   /* dynamically bound bundle file */
#define MH_DYLIB_STUB 0x9   /* shared library stub for static */
                            /*  linking only, no section contents */
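When reading otool -h output, a small lookup table built from the list above saves a trip to the header file. This is only a convenience sketch; describe_filetype is a made-up helper name.

```python
# Values and names taken from the loader.h excerpt above.
FILE_TYPES = {
    0x1: "MH_OBJECT",
    0x2: "MH_EXECUTE",
    0x3: "MH_FVMLIB",
    0x4: "MH_CORE",
    0x5: "MH_PRELOAD",
    0x6: "MH_DYLIB",
    0x7: "MH_DYLINKER",
    0x8: "MH_BUNDLE",
    0x9: "MH_DYLIB_STUB",
}


def describe_filetype(value):
    """Map the numeric filetype field to its symbolic name."""
    return FILE_TYPES.get(value, "unknown (0x%x)" % value)


if __name__ == "__main__":
    # /bin/ls reported filetype 2 in the otool -h output.
    print(describe_filetype(2))
```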

The next two fields, ncmds and sizeofcmds, concern the load commands region and specify the number of commands and their total size in bytes.

Finally, the flags field holds status information that the kernel may use during loading and execution.
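The flags value 0x00000085 reported for /bin/ls is a bit mask. The sketch below splits it into named bits. The names and values are the ones I know from /usr/include/mach-o/loader.h and are not listed in this article, so check them against your own header; decode_flags is a hypothetical helper.

```python
# Assumed subset of the MH_* flag bits from mach-o/loader.h.
MH_FLAGS = [
    (0x1, "MH_NOUNDEFS"),     # no undefined references
    (0x2, "MH_INCRLINK"),     # output of an incremental link
    (0x4, "MH_DYLDLINK"),     # input for the dynamic linker
    (0x8, "MH_BINDATLOAD"),   # bind undefined references at load time
    (0x10, "MH_PREBOUND"),    # dynamic references are prebound
    (0x20, "MH_SPLIT_SEGS"),  # read-only and read-write segments split
    (0x40, "MH_LAZY_INIT"),
    (0x80, "MH_TWOLEVEL"),    # uses two-level namespace bindings
]


def decode_flags(flags):
    """Return the names of the bits set in `flags`, lowest bit first."""
    names = [name for bit, name in MH_FLAGS if flags & bit]
    known = sum(bit for bit, _ in MH_FLAGS)
    if flags & ~known:
        names.append(hex(flags & ~known))  # report bits this table lacks
    return names


if __name__ == "__main__":
    print(decode_flags(0x85))
```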

The load commands region contains a list of commands that tell the kernel how to load the various raw segments in the file. They typically describe how each segment is aligned, protected, and laid out in memory.

To view the list of load commands in a file, use the -l option of the otool command.

evil:~/Temp mohit$ otool -l /bin/ls
/bin/ls:
Load command 0
      cmd LC_SEGMENT
  cmdsize 56
  segname __PAGEZERO
   vmaddr 0x00000000
   vmsize 0x00001000
  fileoff 0
 filesize 0
  maxprot 0x00000000
 initprot 0x00000000
   nsects 0
    flags 0x4
Load command 1
      cmd LC_SEGMENT
  cmdsize 600
  segname __TEXT
   vmaddr 0x00001000
   vmsize 0x00006000
  fileoff 0
 filesize 24576
  maxprot 0x00000007
 initprot 0x00000005
   nsects 8
    flags 0x0
Section
  sectname __text
   segname __TEXT
      addr 0x00001ac4
      size 0x000046e8
    offset 2756
     align 2^2 (4)
    reloff 0
    nreloc 0
     flags 0x80000400
 reserved1 0
 reserved2 0

[ ___SNIPPED FOR BREVITY___ ]

Load command 4
          cmd LC_LOAD_DYLINKER
      cmdsize 28
         name /usr/lib/dyld (offset 12)
Load command 5
          cmd LC_LOAD_DYLIB
      cmdsize 56
         name /usr/lib/libncurses.5.4.dylib (offset 24)
   time stamp 1111407638 Mon Mar 21 07:20:38 2005
      current version 5.4.0
compatibility version 5.4.0
Load command 6
          cmd LC_LOAD_DYLIB
      cmdsize 52
         name /usr/lib/libSystem.B.dylib (offset 24)
   time stamp 1111407267 Mon Mar 21 07:14:27 2005
      current version 88.0.0
compatibility version 1.0.0
Load command 7
     cmd LC_SYMTAB
 cmdsize 24
  symoff 28672
   nsyms 101
  stroff 31020
 strsize 1440
Load command 8
            cmd LC_DYSYMTAB
        cmdsize 80
      ilocalsym 0
      nlocalsym 0
     iextdefsym 0
     nextdefsym 18
      iundefsym 18
      nundefsym 83
         tocoff 0
           ntoc 0
      modtaboff 0
        nmodtab 0
   extrefsymoff 0
    nextrefsyms 0
 indirectsymoff 30216
  nindirectsyms 201
      extreloff 0
        nextrel 0
      locreloff 0
        nlocrel 0
Load command 9
     cmd LC_TWOLEVEL_HINTS
 cmdsize 16
  offset 29884
  nhints 83
Load command 10
        cmd LC_UNIXTHREAD
    cmdsize 176
     flavor PPC_THREAD_STATE
      count PPC_THREAD_STATE_COUNT
    r0  0x00000000 r1  0x00000000 r2  0x00000000 r3   0x00000000 r4   0x00000000
    r5  0x00000000 r6  0x00000000 r7  0x00000000 r8   0x00000000 r9   0x00000000
    r10 0x00000000 r11 0x00000000 r12 0x00000000 r13  0x00000000 r14  0x00000000
    r15 0x00000000 r16 0x00000000 r17 0x00000000 r18  0x00000000 r19  0x00000000
    r20 0x00000000 r21 0x00000000 r22 0x00000000 r23  0x00000000 r24  0x00000000
    r25 0x00000000 r26 0x00000000 r27 0x00000000 r28  0x00000000 r29  0x00000000
    r30 0x00000000 r31 0x00000000 cr  0x00000000 xer  0x00000000 lr   0x00000000
    ctr 0x00000000 mq  0x00000000 vrsave 0x00000000 srr0 0x00001ac4 srr1 0x00000000

The file above has 11 load commands, numbered 0 to 10, located directly after the header.

The first four commands (LC_SEGMENT), numbered 0 to 3, define how segments in the file are mapped into memory. A segment defines a range of bytes in the Mach-O binary and can contain zero or more sections. We will come back to segments later.

  • Command 4 (LC_LOAD_DYLINKER) specifies the dynamic linker to use. This is almost always the default OS X dynamic linker, /usr/lib/dyld.
  • Commands 5 and 6 (LC_LOAD_DYLIB) specify the shared libraries the file links against. They are loaded by the dynamic linker specified in command 4.
  • Commands 7 and 8 (LC_SYMTAB, LC_DYSYMTAB) specify the symbol tables used by the file and by the dynamic linker, respectively.
  • Command 9 (LC_TWOLEVEL_HINTS) contains the hint table for the two-level namespace.
  • Command 10 (LC_UNIXTHREAD) defines the initial state of the main thread of the process. This command is only present in executable files.
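Walking the load commands by hand is straightforward, because every command starts with the same two 32-bit fields, cmd and cmdsize, and the next command begins cmdsize bytes later. The sketch below shows the idea. The numeric command codes are the ones I know from loader.h (they are not listed in this article), walk_load_commands and fake_command are hypothetical names, and the sample buffer contains only zero-filled stand-ins sized like commands 4, 7, and 9 above.

```python
import struct

# Assumed subset of LC_* codes from mach-o/loader.h.
LC_NAMES = {
    0x1: "LC_SEGMENT",
    0x2: "LC_SYMTAB",
    0x5: "LC_UNIXTHREAD",
    0xB: "LC_DYSYMTAB",
    0xC: "LC_LOAD_DYLIB",
    0xE: "LC_LOAD_DYLINKER",
    0x16: "LC_TWOLEVEL_HINTS",
}


def walk_load_commands(data, ncmds, endian=">"):
    """Yield (index, name, cmdsize) for each load command in `data`.

    `data` must start at the first load command, i.e. right after the
    header (byte 28 of a 32-bit Mach-O file).
    """
    offset = 0
    for index in range(ncmds):
        cmd, cmdsize = struct.unpack_from(endian + "II", data, offset)
        if cmdsize < 8:
            raise ValueError("corrupt load command at offset %d" % offset)
        yield index, LC_NAMES.get(cmd, hex(cmd)), cmdsize
        offset += cmdsize  # cmdsize covers the whole command, header included


def fake_command(cmd, cmdsize):
    """Build a zero-filled stand-in command for the demonstration."""
    return struct.pack(">II", cmd, cmdsize) + b"\0" * (cmdsize - 8)


SAMPLE = fake_command(0xE, 28) + fake_command(0x2, 24) + fake_command(0x16, 16)

if __name__ == "__main__":
    for entry in walk_load_commands(SAMPLE, 3):
        print(entry)
```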

Segments and sections

Most of the load commands mentioned above refer to segments in the file. A segment is a range of bytes that the kernel and the dynamic linker map into virtual memory. The header and the load commands regions are considered part of the first segment of the file. A typical OS X executable consists of the following five segments:

  • __PAGEZERO is located at virtual address 0 and has no protection rights. This segment takes up no space in the file, and it causes any access to NULL to crash immediately.
  • __TEXT contains read-only data and executable code.
  • __DATA contains writable data. Its sections are usually marked copy-on-write by the kernel.
  • __OBJC contains data used by the Objective-C language runtime.
  • __LINKEDIT contains raw data used by the dynamic linker.
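The protection rights of each segment appear as the maxprot and initprot fields in the otool -l output. A small sketch can turn those bit masks into the familiar rwx notation, assuming the usual Mach values (read = 1, write = 2, execute = 4, as I know them from mach/vm_prot.h); prot_string is a hypothetical helper name.

```python
# Assumed values from mach/vm_prot.h.
VM_PROT_READ = 0x1
VM_PROT_WRITE = 0x2
VM_PROT_EXECUTE = 0x4


def prot_string(prot):
    """Render a protection bit mask in rwx notation."""
    pairs = (("r", VM_PROT_READ), ("w", VM_PROT_WRITE), ("x", VM_PROT_EXECUTE))
    return "".join(ch if prot & bit else "-" for ch, bit in pairs)


if __name__ == "__main__":
    # Values taken from the otool -l output for /bin/ls.
    print("__PAGEZERO initprot:", prot_string(0x0))
    print("__TEXT     maxprot: ", prot_string(0x7))
    print("__TEXT     initprot:", prot_string(0x5))
```

Read this way, __PAGEZERO starts with no access at all, while __TEXT starts out readable and executable but not writable.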

The __TEXT and __DATA segments may contain zero or more sections. Each section holds data of a particular type, such as executable code, constants, or C strings.

To view the contents of a section, use the -s option of the otool command.

evil:~/Temp mohit$ otool -sv __TEXT __cstring /bin/ls
/bin/ls:
Contents of (__TEXT,__cstring) section
00006320 00000000 5f5f6479 6c645f6d 6f645f74
00006330 65726d5f 66756e63 73000000 5f5f6479
00006340 6c645f6d 616b655f 64656c61 7965645f
00006350 6d6f6475 6c655f69 6e697469 616c697a
__SNIP__
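The hex words in this dump are just NUL-terminated C strings. As a quick sketch, the following Python decodes the sixteen words shown above (with the leading address column dropped). It assumes the words are printed in file byte order, as in this PowerPC dump; c_strings is an illustrative name.

```python
# The data words from the otool -sv output above, addresses removed.
HEX_WORDS = """
00000000 5f5f6479 6c645f6d 6f645f74
65726d5f 66756e63 73000000 5f5f6479
6c645f6d 616b655f 64656c61 7965645f
6d6f6475 6c655f69 6e697469 616c697a
"""


def c_strings(hex_words):
    """Split a hex dump into the non-empty NUL-separated strings it holds."""
    raw = bytes.fromhex("".join(hex_words.split()))
    return [chunk.decode("ascii") for chunk in raw.split(b"\0") if chunk]


if __name__ == "__main__":
    for text in c_strings(HEX_WORDS):
        print(text)
```

The last string is cut short only because the dump itself was snipped at that point.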

To disassemble the __text section, use the -tv options.

evil:~/Temp mohit$ otool -tv /bin/ls
/bin/ls:
(__TEXT,__text) section
00001ac4        or      r26,r1,r1
00001ac8        addi    r1,r1,0xfffc
00001acc        rlwinm  r1,r1,0,0,26
00001ad0        li      r0,0x0
00001ad4        stw     r0,0x0(r1)
00001ad8        stwu    r1,0xffc0(r1)
00001adc        lwz     r3,0x0(r26)
00001ae0        addi    r4,r26,0x4
__SNIP__

Within the __TEXT segment, there are four major sections:

  • __text: the compiled machine code.
  • __const: general constant data.
  • __cstring: literal string constants.
  • __picsymbol_stub: position-independent code stub routines used by the dynamic linker.

This keeps executable and non-executable data clearly separated within the segment.

Running an application

Now that we know the format of a Mach-O file, let's look at how OS X loads and runs an application. When you run an application, the shell first calls the fork(2) system call. fork creates a logical copy of the calling process (the shell) and makes it ready for execution. The child process then calls the execve(2) system call, passing it the path of the program to execute.
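The fork-then-execve sequence is easy to reproduce. Here is a minimal sketch using Python's os module, which wraps the same system calls. It is a simplification of what a real shell does (no job control, no signal handling), and the function name spawn and the exit code 127 for a failed execve are conventions chosen for this example.

```python
import os


def spawn(path, argv, env=None):
    """Run `path` in a child process, shell style, and return its exit status."""
    pid = os.fork()  # the child starts as a logical copy of this process
    if pid == 0:
        # Child: replace our own image with the program at `path`.
        try:
            os.execve(path, argv, dict(os.environ) if env is None else env)
        finally:
            os._exit(127)  # only reached if execve itself failed
    # Parent: wait for the child, as a shell does for a foreground command.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    print(spawn("/bin/sh", ["sh", "-c", "echo hello from the child"]))
```

Everything the rest of this section describes happens inside that single execve call.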

The kernel loads the specified file and examines its header to verify that it is a valid Mach-O file. It then starts interpreting the load commands, replacing the child process's address space with the segments from the file. At the same time, the kernel runs the dynamic linker specified by the binary, which loads and links all the dependent libraries. Once all the symbols needed to run have been bound, it calls the entry-point function.

The entry-point function is usually a standard function that is statically linked in from /usr/lib/crt1.o when the application is built. This function initializes the kernel environment and calls the main() function of the executable.

The application is now running.

The dynamic linker

The OS X dynamic linker, /usr/lib/dyld, is responsible for loading dependent shared libraries, importing the various symbols and functions, and binding them to the current process. When a process first starts, the linker imports the shared libraries into the process's address space. Depending on how the program was built, the actual binding is performed in one of several ways:

  • Load-time binding: symbols are bound immediately after loading.
  • Just-in-time binding: symbols are bound when they are first referenced.
  • Pre-binding.

If no binding type is specified, just-in-time binding is used.
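The difference between load-time and just-in-time binding can be illustrated with a toy model. The sketch below is an analogy only and says nothing about how dyld is actually implemented: SymbolBinder, its resolver callback, and the fake addresses are all invented for the example.

```python
class SymbolBinder:
    """Toy model: resolve symbols either all at once or on first use."""

    def __init__(self, resolver, symbols, lazy=True):
        self._resolver = resolver       # callable: name -> address
        self._symbols = set(symbols)
        self._bound = {}
        if not lazy:
            # Load-time binding: resolve everything immediately.
            for name in symbols:
                self._bound[name] = resolver(name)

    def lookup(self, name):
        if name not in self._symbols:
            raise KeyError(name)
        if name not in self._bound:
            # Just-in-time binding: resolve on the first reference only.
            self._bound[name] = self._resolver(name)
        return self._bound[name]


if __name__ == "__main__":
    calls = []

    def resolver(name):
        calls.append(name)
        return 0x1000 + len(calls)  # made-up address

    binder = SymbolBinder(resolver, ["_printf", "_exit"], lazy=True)
    print("resolved before first use:", calls)
    binder.lookup("_printf")
    binder.lookup("_printf")
    print("resolved after two lookups:", calls)
```

With lazy=True the resolver runs once per symbol actually used; with lazy=False it runs for every symbol up front, which costs more at start-up but never later.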

The application can continue to run only once all the required symbols and segments from the various object files have been resolved. To find libraries and frameworks, the standard dynamic linker, /usr/lib/dyld, searches a predefined set of directories. To override these directories or to provide fallback paths, set the DYLD_LIBRARY_PATH or DYLD_FALLBACK_LIBRARY_PATH environment variables.
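To make the effect of these two variables concrete, here is a simplified model of the search. It assumes the order I understand from the dyld manual page: directories in DYLD_LIBRARY_PATH are tried first (using only the file name of the library), then the install path recorded in the binary, then directories in DYLD_FALLBACK_LIBRARY_PATH. The real linker handles more cases (frameworks, default fallback directories, and so on), and candidate_paths and find_library are hypothetical names.

```python
import os
import posixpath


def candidate_paths(install_path, env):
    """List the paths to try, in order, for one dependent library."""
    leaf = posixpath.basename(install_path)

    def dirs(variable):
        return [d for d in env.get(variable, "").split(":") if d]

    paths = [posixpath.join(d, leaf) for d in dirs("DYLD_LIBRARY_PATH")]
    paths.append(install_path)  # the path recorded by LC_LOAD_DYLIB
    paths += [posixpath.join(d, leaf) for d in dirs("DYLD_FALLBACK_LIBRARY_PATH")]
    return paths


def find_library(install_path, env, exists=os.path.exists):
    """Return the first candidate that exists, or None."""
    for path in candidate_paths(install_path, env):
        if exists(path):
            return path
    return None


if __name__ == "__main__":
    env = {"DYLD_LIBRARY_PATH": "/opt/test/lib",
           "DYLD_FALLBACK_LIBRARY_PATH": "/usr/local/lib"}
    for path in candidate_paths("/usr/lib/libncurses.5.4.dylib", env):
        print(path)
```

The exists parameter is there so the search can be exercised without touching the file system.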
