Mac OS X Application Format Explained


How OS X Executes Applications

Translator: 51test2003 translated from http://0xfe.blogspot.com/2006/03 ... s-applications.html

As a long-time UNIX user, I keep a set of tools on hand for troubleshooting system problems. Recently I was adding support for Apple's OS X to some software I was developing. Unlike other traditional UNIX variants, however, OS X does not provide many of the tools commonly used to inspect how programs are loaded, linked, and executed.

For example, when a shared library relocation error occurs, the first thing I do is run ldd on the executable. The ldd tool lists the shared libraries (with their paths) that the executable depends on. On OS X, however, trying to run ldd fails.

evil:~ mohit$ ldd /bin/ls
-bash: ldd: command not found

Not found? But it is available on practically every UNIX. I wonder whether objdump works.

$ objdump -x /bin/ls
-bash: objdump: command not found

Command not found. What's going on?

The problem is that, unlike Linux, Solaris, HP-UX, and many other UNIX variants, OS X does not use ELF binaries. In addition, OS X is not built on the GNU toolchain, which is where tools such as ldd and objdump come from.

To list the shared libraries that an executable depends on under OS X, you need to use otool.

evil:~ mohit$ otool /bin/ls
otool: one of -fahlLtdoOrTMRIHScis must be specified
Usage: otool [-fahlLDtdorSTMRIHvVcXm] object_file ...
    -f print the fat headers
    -a print the archive header
    -h print the mach header
    -l print the load commands
    -L print shared libraries used
    -D print shared library id name
    -t print the text section (disassemble with -v)
    -p start dissassemble from routine name
    -s print contents of section
    -d print the data section
    -o print the Objective-C segment
    -r print the relocation entries
    -S print the table of contents of a library
    -T print the table of contents of a dynamic shared library
    -M print the module table of a dynamic shared library
    -R print the reference table of a dynamic shared library
    -I print the indirect symbol table
    -H print the two-level hints table
    -v print verbosely (symbolicly) when possible
    -V print disassembled operands symbolicly
    -c print argument strings of a core file
    -X print no leading addresses or headers
    -m don't use archive(member) syntax

evil:~ mohit$ otool -L /bin/ls
/bin/ls:
    /usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.0.0)

Much better. We can see that /bin/ls references two dynamic libraries, although the file extensions look completely unfamiliar.
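When a script needs the dependency list rather than a human, the text that otool prints for the shared libraries is regular enough to parse. The following is a minimal sketch, assuming the line layout shown in the transcript above; the function name parse_otool_libs and the SAMPLE constant are made up for illustration, and the regular expression has only been checked against this one sample.

```python
import re

# Sample text in the layout otool prints for shared libraries (copied from
# the transcript above). Real output may differ between OS X releases.
SAMPLE = (
    "/bin/ls:\n"
    "\t/usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)\n"
    "\t/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.0.0)\n"
)

LINE = re.compile(
    r"^\s*(?P<path>\S.*?) \(compatibility version (?P<compat>[\d.]+), "
    r"current version (?P<current>[\d.]+)\)\s*$",
    re.IGNORECASE,
)


def parse_otool_libs(text):
    """Return (path, compatibility_version, current_version) tuples."""
    libs = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match:  # the "/bin/ls:" banner line does not match and is skipped
            libs.append((match.group("path"),
                         match.group("compat"),
                         match.group("current")))
    return libs


if __name__ == "__main__":
    for path, compat, current in parse_otool_libs(SAMPLE):
        print(path, compat, current)
```

On a real system the text would come from running otool itself (for example through subprocess) instead of the hard-coded sample.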

I suspect that many UNIX and Linux users have had similar experiences on OS X, so I decided to write down what I currently know about OS X executables.

The OS X Runtime Architecture

A runtime environment is a framework for executing code on OS X. It consists of a set of conventions that define how code is loaded, managed, and executed. When an application is launched, the appropriate runtime environment loads the program into memory, resolves references to external libraries, and prepares the code for execution.

OS X supports three runtime environments:

dyld runtime environment: the recommended environment, based on the dyld library manager.

CFM runtime environment: the legacy environment inherited from OS 9. It is intended for applications that want to use new OS X features but have not yet been fully ported to dyld.

Classic environment: runs OS 9 (9.1 or 9.2) programs directly on OS X without modification.

This article focuses on the dyld runtime environment.

The Mach-O Executable File Format

On OS X, almost every file that contains executable code (applications, frameworks, libraries, kernel extensions, and so on) is stored as a Mach-O file. Mach-O is both a file format and an ABI (application binary interface) that describes how an executable is loaded and run by the kernel. More precisely, it tells the system:

Which dynamic library loader to use.

Which shared libraries to load.

How to organize the process address space.

The address of the function entry point, and so on.

Mach-O is not new. It was originally designed by the Open Software Foundation (OSF) for its OSF/1 operating system, which was based on the Mach microkernel. It was later carried over to x86 systems with OpenStep.

To work with the dyld runtime environment, all files must be built in the Mach-O executable format.

Organization of a Mach-O File

A Mach-O file is divided into three regions: the header, the load command area, and the raw segment data. The header and the load commands describe the file's features, layout, and other characteristics; the raw segment data contains the byte sequences that the load commands refer to. To inspect the various parts of a Mach-O file, OS X ships with a very useful program called otool, located in /usr/bin.

The following sections use otool to explore how a Mach-O file is organized.

Header

To view the Mach-O header of a file, use otool's -h switch.

evil:~ mohit$ otool -h /bin/ls
/bin/ls:
Mach header
      magic cputype cpusubtype filetype ncmds sizeofcmds      flags
 0xfeedface      18          0        2    11       1608 0x00000085

The header begins with the magic number, which identifies the file as either a 32-bit or a 64-bit Mach-O file and also reveals the byte order of the CPU it was built for. The magic values are defined in /usr/include/mach-o/loader.h.

The header also specifies the target architecture of the file. This lets the kernel make sure that code is never run on a processor it was not built for. In the output above, for example, cputype is 18, which stands for CPU_TYPE_POWERPC as defined in /usr/include/mach/machine.h.

From these two pieces of information we can infer that this binary is meant for 32-bit PowerPC-based systems.
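The same inference can be made programmatically by reading the seven 32-bit fields at the start of the file. The sketch below is illustrative only: parse_mach_header and the MachHeader tuple are hypothetical names, the sample bytes are built from the header values discussed above rather than read from a real /bin/ls, and only the two thin-file magic numbers from loader.h are handled.

```python
import struct
from collections import namedtuple

MH_MAGIC = 0xFEEDFACE      # 32-bit Mach-O
MH_MAGIC_64 = 0xFEEDFACF   # 64-bit Mach-O
CPU_TYPE_POWERPC = 18

MachHeader = namedtuple(
    "MachHeader",
    "magic cputype cpusubtype filetype ncmds sizeofcmds flags is_64 byte_order",
)


def parse_mach_header(data):
    """Decode the fixed-size header at the start of a thin Mach-O image."""
    # Try both byte orders: the magic number only reads correctly in one.
    for order in (">", "<"):
        (magic,) = struct.unpack_from(order + "I", data, 0)
        if magic in (MH_MAGIC, MH_MAGIC_64):
            fields = struct.unpack_from(order + "7I", data, 0)
            return MachHeader(*fields,
                              magic == MH_MAGIC_64,
                              "big" if order == ">" else "little")
    raise ValueError("not a thin Mach-O file")


# Synthetic header using the values discussed above (big-endian PowerPC).
SAMPLE = struct.pack(">7I", 0xFEEDFACE, 18, 0, 2, 11, 1608, 0x85)

if __name__ == "__main__":
    header = parse_mach_header(SAMPLE)
    print(header)
    print("PowerPC:", header.cputype == CPU_TYPE_POWERPC)
```

To try it on a real binary, pass the first 28 bytes of the file instead of SAMPLE. A universal binary starts with a different magic number and would raise ValueError here.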

Sometimes a binary file contains code for more than one architecture. Such files, commonly known as Universal Binaries, start with an additional header called fat_header. To examine the contents of the fat_header, use otool's -f switch.
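As a rough illustration of what that extra header holds, the sketch below decodes a fat_header followed by its fat_arch entries (cputype, cpusubtype, offset, size, align), all stored big-endian. The function name parse_fat_archs is hypothetical, and the sample image is fabricated: its offsets and sizes are made-up values, not taken from a real Universal Binary.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # the fat header is always stored big-endian


def parse_fat_archs(data):
    """Return one dict per architecture slice listed in a fat header."""
    magic, nfat_arch = struct.unpack_from(">2I", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("no fat header present")
    archs = []
    for i in range(nfat_arch):
        # Each fat_arch entry is five 32-bit fields (20 bytes).
        cputype, cpusubtype, offset, size, align = struct.unpack_from(
            ">5I", data, 8 + i * 20)
        archs.append({"cputype": cputype, "cpusubtype": cpusubtype,
                      "offset": offset, "size": size, "align": align})
    return archs


# Hypothetical two-slice image: PowerPC (cputype 18) and i386 (cputype 7).
SAMPLE = (struct.pack(">2I", FAT_MAGIC, 2)
          + struct.pack(">5I", 18, 0, 4096, 32768, 12)
          + struct.pack(">5I", 7, 3, 40960, 32768, 12))

if __name__ == "__main__":
    for arch in parse_fat_archs(SAMPLE):
        print(arch)
```

Each offset points at a complete thin Mach-O image inside the file, so a header parser for thin files can be applied to the slice that starts there.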

The cpusubtype field specifies the exact CPU model and is usually set to CPU_SUBTYPE_POWERPC_ALL or CPU_SUBTYPE_I386_ALL.

The filetype field indicates how the file is laid out and how it is meant to be used. It tells you whether the file is a library, a statically linked executable, a core file, and so on. The filetype above equals MH_EXECUTE, which denotes a demand-paged executable file. The following fragment from /usr/include/mach-o/loader.h lists the different file types.

#define MH_OBJECT      0x1   /* relocatable object file */
#define MH_EXECUTE     0x2   /* demand paged executable file */
#define MH_FVMLIB      0x3   /* fixed VM shared library file */
#define MH_CORE        0x4   /* core file */
#define MH_PRELOAD     0x5   /* preloaded executable file */
#define MH_DYLIB       0x6   /* dynamically bound shared library */
#define MH_DYLINKER    0x7   /* dynamic link editor */
#define MH_BUNDLE      0x8   /* dynamically bound bundle file */
#define MH_DYLIB_STUB  0x9   /* shared library stub for static */
                             /* linking only, no section contents */

The next two fields describe the load command area: ncmds gives the number of load commands and sizeofcmds their total size in bytes.

Finally, the flags field holds status bits that the kernel may use when loading and executing the file.
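To make the filetype and flags values less opaque, here is a small sketch that turns both into the symbolic names used in loader.h. The describe function is a hypothetical helper, and the flag table is only a subset of the bits that header defines; any bit outside the table is reported as a hexadecimal remainder.

```python
# File types as listed in the loader.h fragment above.
FILETYPES = {
    0x1: "MH_OBJECT", 0x2: "MH_EXECUTE", 0x3: "MH_FVMLIB", 0x4: "MH_CORE",
    0x5: "MH_PRELOAD", 0x6: "MH_DYLIB", 0x7: "MH_DYLINKER", 0x8: "MH_BUNDLE",
    0x9: "MH_DYLIB_STUB",
}

# A subset of the flag bits defined in <mach-o/loader.h>.
FLAG_BITS = {
    0x1: "MH_NOUNDEFS", 0x2: "MH_INCRLINK", 0x4: "MH_DYLDLINK",
    0x8: "MH_BINDATLOAD", 0x10: "MH_PREBOUND", 0x20: "MH_SPLIT_SEGS",
    0x40: "MH_LAZY_INIT", 0x80: "MH_TWOLEVEL", 0x100: "MH_FORCE_FLAT",
}


def describe(filetype, flags):
    """Return (filetype name, list of flag names) for raw header values."""
    names = [name for bit, name in sorted(FLAG_BITS.items()) if flags & bit]
    unknown = flags & ~sum(FLAG_BITS)  # bits not covered by the table
    if unknown:
        names.append(hex(unknown))
    return FILETYPES.get(filetype, "unknown"), names


if __name__ == "__main__":
    # Values from the otool -h output for /bin/ls.
    print(describe(2, 0x00000085))
```

For the /bin/ls header above, the flags value 0x85 breaks down into MH_NOUNDEFS, MH_DYLDLINK, and MH_TWOLEVEL.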

Load Commands

The load command area contains a list of commands that tell the kernel how to load the individual pieces of raw segment data in the file. Typically they describe how each segment, and each section within it, is aligned, protected, and laid out in memory.

To list the load commands in a file, use otool's -l switch.

evil:~/temp mohit$ otool -l /bin/ls
/bin/ls:
Load command 0
      cmd LC_SEGMENT
  cmdsize 56
  segname __PAGEZERO
   vmaddr 0x00000000
   vmsize 0x00001000
  fileoff 0
 filesize 0
  maxprot 0x00000000
 initprot 0x00000000
   nsects 0
    flags 0x4
Load command 1
      cmd LC_SEGMENT
  cmdsize 600
  segname __TEXT
   vmaddr 0x00001000
   vmsize 0x00006000
  fileoff 0
 filesize 24576
  maxprot 0x00000007
 initprot 0x00000005
   nsects 8
    flags 0x0
Section
  sectname __text
   segname __TEXT
      addr 0x00001ac4
      size 0x000046e8
    offset 2756
     align 2^2 (4)
    reloff 0
    nreloc 0
     flags 0x80000400
 reserved1 0
 reserved2 0
[___snipped for brevity___]
Load command 4
          cmd LC_LOAD_DYLINKER
      cmdsize 28
         name /usr/lib/dyld (offset 12)
Load command 5
          cmd LC_LOAD_DYLIB
      cmdsize 56
         name /usr/lib/libncurses.5.4.dylib (offset 24)
   time stamp 1111407638 Mon Mar 21 07:20:38 2005
      current version 5.4.0
compatibility version 5.4.0
Load command 6
          cmd LC_LOAD_DYLIB
      cmdsize 52
         name /usr/lib/libSystem.B.dylib (offset 24)
   time stamp 1111407267 Mon Mar 21 07:14:27 2005
      current version 88.0.0
compatibility version 1.0.0
Load command 7
     cmd LC_SYMTAB
 cmdsize 24
  symoff 28672
   nsyms 101
  stroff 31020
 strsize 1440
Load command 8
            cmd LC_DYSYMTAB
        cmdsize 80
      ilocalsym 0
      nlocalsym 0
     iextdefsym 0
     nextdefsym 18
      iundefsym 18
      nundefsym 83
         tocoff 0
           ntoc 0
      modtaboff 0
        nmodtab 0
   extrefsymoff 0
    nextrefsyms 0
 indirectsymoff 30216
  nindirectsyms 201
      extreloff 0
        nextrel 0
      locreloff 0
        nlocrel 0
Load command 9
     cmd LC_TWOLEVEL_HINTS
 cmdsize 16
  offset 29884
  nhints 83
Load command 10
        cmd LC_UNIXTHREAD
    cmdsize 176
     flavor PPC_THREAD_STATE
      count PPC_THREAD_STATE_COUNT
    r0  0x00000000 r1  0x00000000 r2  0x00000000 r3   0x00000000 r4   0x00000000
    r5  0x00000000 r6  0x00000000 r7  0x00000000 r8   0x00000000 r9   0x00000000
    r10 0x00000000 r11 0x00000000 r12 0x00000000 r13  0x00000000 r14  0x00000000
    r15 0x00000000 r16 0x00000000 r17 0x00000000 r18  0x00000000 r19  0x00000000
    r20 0x00000000 r21 0x00000000 r22 0x00000000 r23  0x00000000 r24  0x00000000
    r25 0x00000000 r26 0x00000000 r27 0x00000000 r28  0x00000000 r29  0x00000000
    r30 0x00000000 r31 0x00000000 cr  0x00000000 xer  0x00000000 lr   0x00000000
    ctr 0x00000000 mq  0x00000000 vrsave 0x00000000 srr0 0x00001ac4 srr1 0x00000000

The file above has 11 load commands, numbered 0 to 10, located directly after the header.

The first four commands (LC_SEGMENT), numbered 0 to 3, define how the segments in the file are mapped into memory. A segment defines a sequence of bytes in a Mach-O binary and can contain zero or more sections. We will discuss sections later.

Load command 4 (LC_LOAD_DYLINKER) specifies which dynamic linker to use. It is almost always set to the default OS X dynamic linker, /usr/lib/dyld.

Commands 5 and 6 (LC_LOAD_DYLIB) specify the shared libraries that the file needs to be linked against. They are loaded by the dynamic linker named in command 4.

Commands 7 and 8 (LC_SYMTAB and LC_DYSYMTAB) specify the symbol tables used by the file and by the dynamic linker, respectively. Command 9 (LC_TWOLEVEL_HINTS) contains the hint table for the two-level namespace. Finally, command 10 (LC_UNIXTHREAD) defines the initial state of the process's main thread. This command appears only in executable files.
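Every load command starts with the same two 32-bit fields, cmd and cmdsize, which is what lets a tool step from one command to the next without understanding each type. The sketch below walks that list for a 32-bit image. It is a simplified illustration: iter_load_commands and fake_image are hypothetical names, the name table covers only the command types seen in the listing above, and the test image is synthetic rather than a real binary.

```python
import struct

# Command numbers from <mach-o/loader.h> for the types shown above.
LC_NAMES = {
    0x1: "LC_SEGMENT", 0x2: "LC_SYMTAB", 0x5: "LC_UNIXTHREAD",
    0xB: "LC_DYSYMTAB", 0xC: "LC_LOAD_DYLIB", 0xE: "LC_LOAD_DYLINKER",
    0x16: "LC_TWOLEVEL_HINTS",
}
HEADER_SIZE = 28  # seven 32-bit fields in a 32-bit mach_header


def iter_load_commands(data, order=">"):
    """Yield (index, name, cmdsize) for each load command after the header."""
    ncmds, sizeofcmds = struct.unpack_from(order + "2I", data, 16)
    offset = HEADER_SIZE
    end = HEADER_SIZE + sizeofcmds
    for index in range(ncmds):
        cmd, cmdsize = struct.unpack_from(order + "2I", data, offset)
        if cmdsize < 8 or offset + cmdsize > end:
            raise ValueError("corrupt load command %d" % index)
        yield index, LC_NAMES.get(cmd, hex(cmd)), cmdsize
        offset += cmdsize  # cmdsize covers the whole command, header included


def fake_image():
    """Build a tiny synthetic image with two load commands."""
    dylinker = (struct.pack(">3I", 0xE, 28, 12)
                + b"/usr/lib/dyld".ljust(16, b"\0"))
    symtab = struct.pack(">6I", 0x2, 24, 28672, 101, 31020, 1440)
    cmds = dylinker + symtab
    header = struct.pack(">7I", 0xFEEDFACE, 18, 0, 2, 2, len(cmds), 0x85)
    return header + cmds


if __name__ == "__main__":
    for index, name, size in iter_load_commands(fake_image()):
        print(index, name, size)
```

The bounds check against sizeofcmds is a defensive choice for untrusted input; it is not meant to mirror how the kernel validates a file.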

Segments and Sections

Most of the load commands discussed above refer to segments in the file. A segment is a sequence of bytes in a Mach-O file that the kernel and the dynamic linker map directly into virtual memory. The header and the load command area are considered part of the first segment of the file. A typical OS X executable consists of the following five segments:

__PAGEZERO: located at virtual address 0 with no access permissions at all. This segment takes up no space in the file, and it causes any access through a NULL pointer to crash immediately.

__TEXT: contains read-only data and executable code.

__DATA: contains writable data. Its sections are typically marked copy-on-write by the kernel.

__OBJC: contains data used by the Objective-C language runtime.

__LINKEDIT: contains raw data used by the dynamic linker.

The __TEXT and __DATA segments may contain zero or more sections. Each section holds data of a particular type, such as executable code, constants, C strings, and so on.
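The maxprot and initprot values shown for each segment in the load command listing are bit masks built from the Mach VM protection constants (read 1, write 2, execute 4). A minimal sketch of decoding them, with prot_string as a hypothetical helper name:

```python
VM_PROT_READ, VM_PROT_WRITE, VM_PROT_EXECUTE = 0x1, 0x2, 0x4


def prot_string(prot):
    """Render a protection mask in the familiar rwx notation."""
    return "".join(
        letter if prot & bit else "-"
        for bit, letter in ((VM_PROT_READ, "r"),
                            (VM_PROT_WRITE, "w"),
                            (VM_PROT_EXECUTE, "x"))
    )


if __name__ == "__main__":
    # initprot and maxprot of __PAGEZERO, then of __TEXT, from the listing.
    for value in (0x0, 0x5, 0x7):
        print(hex(value), prot_string(value))
```

Read this way, __PAGEZERO (0x0) grants no access at all, while __TEXT starts out readable and executable (0x5) with a maximum of 0x7.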

To view the contents of a section, use otool's -s option.

evil:~/temp mohit$ otool -sv __TEXT __cstring /bin/ls
/bin/ls:
Contents of (__TEXT,__cstring) section
00006320 00000000 5f5f6479 6c645f6d 6f645f74
00006330 65726d5f 66756e63 73000000 5f5f6479
00006340 6c645f6d 616b655f 64656c61 7965645f
00006350 6d6f6475 6c655f69 6e697469 616c697a
__snip__
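The hex words in that dump are just the bytes of NUL-terminated C strings. The sketch below decodes the four lines shown above; strings_from_dump is a hypothetical helper, and it assumes the words are printed in file byte order, which holds for this big-endian PowerPC dump. The second string comes out truncated because the listing itself is cut off.

```python
# The four dump lines from the (__TEXT,__cstring) listing above.
DUMP = """\
00006320 00000000 5f5f6479 6c645f6d 6f645f74
00006330 65726d5f 66756e63 73000000 5f5f6479
00006340 6c645f6d 616b655f 64656c61 7965645f
00006350 6d6f6475 6c655f69 6e697469 616c697a
"""


def strings_from_dump(dump):
    """Extract the C strings contained in an otool-style hex dump."""
    raw = b""
    for line in dump.splitlines():
        words = line.split()[1:]          # drop the leading address column
        raw += bytes.fromhex("".join(words))
    # Strings are NUL-terminated; empty pieces are just padding bytes.
    return [piece.decode("ascii") for piece in raw.split(b"\0") if piece]


if __name__ == "__main__":
    for text in strings_from_dump(DUMP):
        print(text)
```

The first string decoded from the dump is __dyld_mod_term_funcs, a name used by the runtime startup code rather than by ls itself.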

To disassemble the __text section, use the -tv switches.

evil:~/temp mohit$ otool -tv /bin/ls
/bin/ls:
(__TEXT,__text) section
00001ac4 or r26,r1,r1
00001ac8 addi r1,r1,0xfffc
00001acc rlwinm r1,r1,0,0,26
00001ad0 li r0,0x0
00001ad4 stw r0,0x0(r1)
00001ad8 stwu r1,0xffc0(r1)
00001adc lwz r3,0x0(r26)
00001ae0 addi r4,r26,0x4
__snip__

The __TEXT segment has four main sections:

__text: the compiled machine code.

__const: general-purpose constant data.

__cstring: literal string constants.

__picsymbol_stub: position-independent code stub routines used by the dynamic linker.

This keeps executable and non-executable content clearly separated within the segment.

Running an Application

Now that you know the format of a Mach-O file, let's see how OS X loads and runs an application. When you launch an application, the shell first calls the fork() system call. fork() creates a logical copy of the calling process (the shell) that is ready to run. The child process then calls the execve() system call, passing it the path of the program to execute.
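The fork-then-exec sequence is the same on any UNIX-like system, so it can be tried from a short script. The following is a minimal sketch of what a shell does, not a model of any particular shell: spawn is a hypothetical helper, and the exit status 127 for a failed exec is a convention borrowed from common shells.

```python
import os
import sys


def spawn(path, argv):
    """Fork, replace the child image with `path`, and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: execve() replaces this process and only returns on failure.
        try:
            os.execve(path, argv, os.environ)
        finally:
            os._exit(127)  # reached only if the exec failed
    # Parent: wait for the child, as a shell does for a foreground job.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    code = spawn(sys.executable,
                 [sys.executable, "-c", "import sys; sys.exit(3)"])
    print("child exited with", code)
```

Everything described in the rest of this section (header validation, segment mapping, running the dynamic linker) happens inside the kernel during the execve() call in the child.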

The kernel loads the specified file and examines its header to verify that it is a valid Mach-O file. It then interprets the load commands and replaces the child process's address space with the segments from the file. At the same time, the kernel runs the dynamic linker named in the binary, which loads and links all dependent libraries. Once the symbols required to start running have been bound, the entry-point function is called.

When an application is built, the entry-point function is usually statically linked in from /usr/lib/crt1.o. This function initializes the kernel environment and calls the executable's main() function.

The application is now running.

Dynamic linker

The OS X dynamic linker, /usr/lib/dyld, is responsible for loading dependent shared libraries, importing variable and function symbols, and binding them into the current process. When the process first starts, the linker maps the shared libraries into the process's address space. Depending on how the program was built, the actual binding is then performed in one of the following ways:

Load-time binding: symbols are bound immediately when the library is loaded.

Just-in-time binding: a symbol is bound when it is first referenced.

Prebinding.

If no binding type is specified, just-in-time binding is used.
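The difference between binding at load time and binding just in time can be pictured with a toy model. The class below is purely conceptual and says nothing about how dyld is implemented: LazySymbolTable is a made-up name, and Python's math module stands in for a shared library whose symbols are looked up by name.

```python
import math


class LazySymbolTable:
    """Toy model: resolve a symbol on first use, then reuse the binding."""

    def __init__(self, resolver):
        self._resolver = resolver   # stands in for the linker's symbol lookup
        self._bound = {}
        self.lookups = 0            # how many times the resolver was consulted

    def _bind(self, name):
        if name not in self._bound:
            self.lookups += 1
            self._bound[name] = self._resolver(name)
        return self._bound[name]

    def call(self, name, *args):
        """Just-in-time style: bind on first reference, then call."""
        return self._bind(name)(*args)

    def bind_all(self, names):
        """Load-time style: bind every listed symbol up front."""
        for name in names:
            self._bind(name)


if __name__ == "__main__":
    table = LazySymbolTable(lambda name: getattr(math, name))
    print(table.call("sqrt", 9), table.call("sqrt", 16), table.lookups)
```

With just-in-time binding the lookup cost is paid only for symbols that are actually used; with load-time binding it is paid once at startup for all of them.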

The application can continue running only once all required symbols and segments from the various object files have been resolved. To locate libraries and frameworks, the standard dynamic linker /usr/lib/dyld searches a predefined set of directories. To override these directories, or to supply fallback paths, set the DYLD_LIBRARY_PATH or DYLD_FALLBACK_LIBRARY_PATH environment variable.
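The effect of the two variables can be approximated with a short search routine. This is a deliberately simplified model and an assumption about ordering, not a description of dyld's actual algorithm: it checks the directories in DYLD_LIBRARY_PATH first, then the path recorded in the binary, then the directories in DYLD_FALLBACK_LIBRARY_PATH, and it ignores frameworks and built-in default directories. The function name find_library is hypothetical.

```python
import os


def find_library(install_path, env=None):
    """Return the first existing candidate for a library, or None."""
    env = os.environ if env is None else env
    leaf = os.path.basename(install_path)

    def dirs(variable):
        # Colon-separated directory list; empty entries are ignored.
        return [d for d in env.get(variable, "").split(":") if d]

    candidates = [os.path.join(d, leaf) for d in dirs("DYLD_LIBRARY_PATH")]
    candidates.append(install_path)   # the path stored in the load command
    candidates += [os.path.join(d, leaf)
                   for d in dirs("DYLD_FALLBACK_LIBRARY_PATH")]

    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    print(find_library("/usr/lib/libSystem.B.dylib"))
```

Passing a dictionary as env makes it possible to experiment with different settings without touching the real environment.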
