Linux-Android啟動之Init進程前傳

來源:互聯網
上載者:User

      對Linux-Android系統的啟動做了一些分析,下面的一篇文章側重講述Linux啟動過程中函數Start_kernel()被調用之前的一些分析,同時也對函數Start_kernel()之後的代碼流程作了概述,我希望關於Linux-Android系統的啟動的專題能夠繼續地寫下去,哈哈。如果有不正確或者不完善的地方,歡迎前來拍磚留言或者發郵件到guopeixin@126.com進行討論,現行謝過。

 

一. 核心自引導程式

1. 核心zimage自解壓

 這部分代碼在arch/${arch}/boot/compressed/head.S中,該檔案的代碼在zimage的產生過程中,將會被打包到zimage中。
 head.S會首先初始化自解壓相關的如記憶體等環境,接下來就去調用decompress_kernel去解壓,並調用call_kernel函數去啟動vmlinux。
 去下面僅僅列舉一下head.S檔案中最重要的部分:
----------------------------------------------------------------
/*
 * We're not in danger of overwriting ourselves.  Do this the simple way.
 *
 * r4     = kernel execution address
 * r7     = architecture ID
 */
wont_overwrite: mov r0, r4
  mov r3, r7
  bl decompress_kernel
  b call_kernel
...
call_kernel: bl cache_clean_flush
  bl cache_off
  mov r0, #0   @ must be zero
  mov r1, r7   @ restore architecture number
  mov r2, r8   @ restore atags pointer
  mov pc, r4   @ call kernel
----------------------------------------------------------------
 其中函數decompress_kernel在arch/${arch}/boot/compressed/misc.c中實現,功能就是完成zimage鏡像的自解壓,顯然該自解壓的過程需

要配置相應的解壓地址等,這部分代碼如下:
----------------------------------------------------------------
ulg
decompress_kernel(ulg output_start, ulg free_mem_ptr_p, ulg free_mem_ptr_end_p,
    int arch_id)
{
 output_data  = (uch *)output_start; /* Points to kernel start */
 free_mem_ptr  = free_mem_ptr_p; /* 顯然,這個地址是從通過寄存器傳進來的 */
 free_mem_end_ptr = free_mem_ptr_end_p;
 __machine_arch_type = arch_id;

 arch_decomp_setup();

 makecrc();
 putstr("Uncompressing Linux...");
 gunzip();
 putstr(" done, booting the kernel./n");
 return output_ptr;
}
----------------------------------------------------------------
 調用call_kernel後首先關閉cache,然後就跳轉到vmlinux入口去執行並將系統的控制權交給了vmlinux。

2. 核心vmlinux入口

>> vmlinux的編譯簡單描述
 因為這裡會牽扯到兩個檔案head.S和head-nommu.S,所以下面簡單的描述一下vmlinux的產生過程。來看一下/arch/${arch}/kernel/makefile

,在該檔案的最後指令碼如下:
----------------------------------------------------------------
#
# Makefile for the linux kernel.
#

AFLAGS_head.o := -DTEXT_OFFSET=$(TEXT_OFFSET)

ifdef CONFIG_DYNAMIC_FTRACE
CFLAGS_REMOVE_ftrace.o = -pg
endif

# Object file lists.

obj-y  := compat.o elf.o entry-armv.o entry-common.o irq.o /
     process.o ptrace.o setup.o signal.o /
     sys_arm.o stacktrace.o time.o traps.o

obj-$(CONFIG_ISA_DMA_API) += dma.o
obj-$(CONFIG_ARCH_ACORN) += ecard.o
obj-$(CONFIG_FIQ)  += fiq.o
obj-$(CONFIG_MODULES)  += armksyms.o module.o
obj-$(CONFIG_ARTHUR)  += arthur.o
obj-$(CONFIG_ISA_DMA)  += dma-isa.o
obj-$(CONFIG_PCI)  += bios32.o isa.o
obj-$(CONFIG_SMP)  += smp.o
obj-$(CONFIG_DYNAMIC_FTRACE) += ftrace.o
obj-$(CONFIG_KEXEC)  += machine_kexec.o relocate_kernel.o
obj-$(CONFIG_KPROBES)  += kprobes.o kprobes-decode.o
obj-$(CONFIG_ATAGS_PROC) += atags.o
obj-$(CONFIG_OABI_COMPAT) += sys_oabi-compat.o
obj-$(CONFIG_ARM_THUMBEE) += thumbee.o
obj-$(CONFIG_KGDB)  += kgdb.o

obj-$(CONFIG_CRUNCH)  += crunch.o crunch-bits.o
AFLAGS_crunch-bits.o  := -Wa,-mcpu=ep9312

obj-$(CONFIG_CPU_XSCALE) += xscale-cp0.o
obj-$(CONFIG_CPU_XSC3)  += xscale-cp0.o
obj-$(CONFIG_IWMMXT)  += iwmmxt.o
AFLAGS_iwmmxt.o   := -Wa,-mcpu=iwmmxt

ifneq ($(CONFIG_ARCH_EBSA110),y)
  obj-y  += io.o
endif

head-y   := head$(MMUEXT).o
obj-$(CONFIG_DEBUG_LL) += debug.o

extra-y := $(head-y) init_task.o vmlinux.lds
----------------------------------------------------------------
 可以看到,檔案的結束位置有一行代碼“head-y   := head$(MMUEXT).o”,其中MMUEXT在/arch/${arch}/makefile中

定義,實際上對於沒有mmu的處理器,MMUEXT就是nommu,而對於包含mmu的處理器,它的值是空,參照MMUEXT在/arch/${arch}/makefile中的相關代碼

如下:
----------------------------------------------------------------
# defines filename extension depending memory manement type.
ifeq ($(CONFIG_MMU),)
MMUEXT  := -nommu
endif
----------------------------------------------------------------
 所以對於諸如S3C6410之類的包含MMU的處理器,實際上最終vmlinux開始位置的代碼就是/arch/${arch}/kernel/head.S.

>> head.S檔案的分析
 需要注意的是,對於該檔案的描述,一般的書籍上可能是僅僅對老版本的linux系統進行了分析,就是說該檔案結束位置直接調用了

start_kernel 函數,至此開始執行c代碼。其實,並不是這樣的。
 下面簡單的列寫一下head.S的內容:
----------------------------------------------------------------/*
 * Kernel startup entry point.
 * ---------------------------
 *
 * This is normally called from the decompressor code.  The requirements
 * are: MMU = off, D-cache = off, I-cache = dont care, r0 = 0,
 * r1 = machine nr, r2 = atags pointer.
 *
 * This code is mostly position independent, so if you link the kernel at
 * 0xc0008000, you call this at __pa(0xc0008000).
 *
 * See linux/arch/arm/tools/mach-types for the complete list of machine
 * numbers for r1.
 *
 * We're trying to keep crap to a minimum; DO NOT add any machine specific
 * crap here - that's what the boot loader (or in extreme, well justified
 * circumstances, zImage) is for.
 */
 .section ".text.head", "ax"
ENTRY(stext)
 msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE @ ensure svc mode
      @ and irqs disabled
 mrc p15, 0, r9, c0, c0  @ get processor id
 bl __lookup_processor_type  @ r5=procinfo r9=cpuid
 movs r10, r5    @ invalid processor (r5=0)?
 beq __error_p   @ yes, error 'p'
 bl __lookup_machine_type  @ r5=machinfo
 movs r8, r5    @ invalid machine (r5=0)?
 beq __error_a   @ yes, error 'a'
 bl __vet_atags
 bl __create_page_tables

 /*
  * The following calls CPU specific code in a position independent
  * manner.  See arch/arm/mm/proc-*.S for details.  r10 = base of
  * xxx_proc_info structure selected by __lookup_machine_type
  * above.  On return, the CPU will be ready for the MMU to be
  * turned on, and r0 will hold the CPU control register value.
  */
 ldr r13, __switch_data  @ address to jump to after
      @ mmu has been enabled
 adr lr, __enable_mmu  @ return (PIC) address
 add pc, r10, #PROCINFO_INITFUNC
ENDPROC(stext)

#if defined(CONFIG_SMP)
ENTRY(secondary_startup)
 /*
  * Common entry point for secondary CPUs.
  *
  * Ensure that we're in SVC mode, and IRQs are disabled.  Lookup
  * the processor type - there is no need to check the machine type
  * as it has already been validated by the primary processor.
  */
 msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE
 mrc p15, 0, r9, c0, c0  @ get processor id
 bl __lookup_processor_type
 movs r10, r5    @ invalid processor?
 moveq r0, #'p'   @ yes, error 'p'
 beq __error

 /*
  * Use the page tables supplied from  __cpu_up.
  */
 adr r4, __secondary_data
 ldmia r4, {r5, r7, r13}  @ address to jump to after
 sub r4, r4, r5   @ mmu has been enabled
 ldr r4, [r7, r4]   @ get secondary_data.pgdir
 adr lr, __enable_mmu  @ return address
 add pc, r10, #PROCINFO_INITFUNC @ initialise processor
      @ (return control reg)
ENDPROC(secondary_startup)

 /*
  * r6  = &secondary_data
  */
ENTRY(__secondary_switched)
 ldr sp, [r7, #4]   @ get secondary_data.stack
 mov fp, #0
 b secondary_start_kernel
ENDPROC(__secondary_switched)

 .type __secondary_data, %object
__secondary_data:
 .long .
 .long secondary_data
 .long __secondary_switched
#endif /* defined(CONFIG_SMP) */

/*
 * Setup common bits before finally enabling the MMU.  Essentially
 * this is just loading the page table pointer and domain access
 * registers.
 */
__enable_mmu:
#ifdef CONFIG_ALIGNMENT_TRAP
 orr r0, r0, #CR_A
#else
 bic r0, r0, #CR_A
#endif
#ifdef CONFIG_CPU_DCACHE_DISABLE
 bic r0, r0, #CR_C
#endif
#ifdef CONFIG_CPU_BPREDICT_DISABLE
 bic r0, r0, #CR_Z
#endif
#ifdef CONFIG_CPU_ICACHE_DISABLE
 bic r0, r0, #CR_I
#endif
 mov r5, #(domain_val(DOMAIN_USER, DOMAIN_MANAGER) | /
        domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | /
        domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | /
        domain_val(DOMAIN_IO, DOMAIN_CLIENT))
 mcr p15, 0, r5, c3, c0, 0  @ load domain access register
 mcr p15, 0, r4, c2, c0, 0  @ load page table pointer
 b __turn_mmu_on
ENDPROC(__enable_mmu)

/*
 * Enable the MMU.  This completely changes the structure of the visible
 * memory space.  You will not be able to trace execution through this.
 * If you have an enquiry about this, *please* check the linux-arm-kernel
 * mailing list archives BEFORE sending another post to the list.
 *
 *  r0  = cp#15 control register
 *  r13 = *virtual* address to jump to upon completion
 *
 * other registers depend on the function called upon completion
 */
 .align 5
__turn_mmu_on:
 mov r0, r0
 mcr p15, 0, r0, c1, c0, 0  @ write control reg
 mrc p15, 0, r3, c0, c0, 0  @ read id reg
 mov r3, r3
 mov r3, r3
 mov pc, r13
ENDPROC(__turn_mmu_on)

#include "head-common.S"
----------------------------------------------------------------
 可能大家注意到,上面有大段的文字是secondary_startup以及CONFIG_SMP等,其實這個是對於SMP系統才會採用的代碼。眾所周知,SMP是對

稱多處理的簡稱,是指系統中使用了一組處理器,各CPU之間共用記憶體子系統和匯流排結構,對應的有非對稱式多處理,嵌入式裝置上我們並不會使用到

SMP的功能。
 乍一看,無論如何也調用不到網上所謂的start_kernel函數中,大家注意看“ldr r13, __switch_data”,這裡就是將函數__switch_data的

地址儲存到r13,並在函數__enable_mmu-->__turn_mmu_on結束位置的“mov pc, r13”中將__switch_data調用起來。而函數__switch_data是實

現在/arch/${arch}/kernel/head-common.S中的一個函數,而函數start_kernel就是由__switch_data調用起來的。
 你一定在奇怪,那麼函數__enable_mmu是怎麼調用起來的呢,呵呵,你簡直是太聰明、太細心了。那趕緊聽我跟你說吧,代碼“add pc,

r10, #PROCINFO_INITFUNC”將會跳轉到/arch/${arch}/mm/proc-arn-926.S中的初始化函數__arm926_setup中,並在該函數結束的位置以“mov pc,

lr”的方式調用__enable_mmu,千萬別告訴我你忘記了前面提到的__enable_mmu的值儲存在lr中哦。
 至於為什麼代碼“add pc, r10, #PROCINFO_INITFUNC”將會跳轉到/arch/${arch}/mm/proc-arn-926.S中的初始化函數__arm926_setup

中,我這裡就不列舉了。可以參照後面我轉載的一篇文章。
----------------------------------------------------------------
 .type __arm926_setup, #function
__arm926_setup:
 mov r0, #0
 mcr p15, 0, r0, c7, c7  @ invalidate I,D caches on v4
 mcr p15, 0, r0, c7, c10, 4  @ drain write buffer on v4
#ifdef CONFIG_MMU
 mcr p15, 0, r0, c8, c7  @ invalidate I,D TLBs on v4
#endif

#ifdef CONFIG_CPU_DCACHE_WRITETHROUGH
 mov r0, #4    @ disable write-back on caches explicitly
 mcr p15, 7, r0, c15, c0, 0
#endif

 adr r5, arm926_crval
 ldmia r5, {r5, r6}
 mrc p15, 0, r0, c1, c0  @ get control register v4
 bic r0, r0, r5
 orr r0, r0, r6
#ifdef CONFIG_CPU_CACHE_ROUND_ROBIN
 orr r0, r0, #0x4000   @ .1.. .... .... ....
#endif
 mov pc, lr
----------------------------------------------------------------
 好了,終於調用到start_kernel了,這是任何版本的linux核心通用的初始化函數。

3. Linux系統初始化

 前面已經提到,函數start_kernel是任何版本的linux核心通用的初始化函數,也是彙編代碼執行結束後的第一個c函數,它實現在

init/main.c中。
 有關start_kernel的代碼很長,初始化了很多東西,比如調用了setup_arch()、timer_init()、init_IRQ、console_init()、

pgtable_cache_init()、security_init()、signals_init()和rest_init()等,這裡只對rest_init()做簡單的分析。
 下面首先列寫一下rest_init()的代碼:
----------------------------------------------------------------
/*
 * We need to finalize in a non-__init function or else race conditions
 * between the root thread and the init thread may cause start_kernel to
 * be reaped by free_initmem before the root thread has proceeded to
 * cpu_idle.
 *
 * gcc-3.4 accidentally inlines this function, so use noinline.
 */

static noinline void __init_refok rest_init(void)
 __releases(kernel_lock)
{
 int pid;

 kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);
 numa_default_policy();
 pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);
 kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns);
 unlock_kernel();

 /*
  * The boot idle thread must execute schedule()
  * at least once to get things moving:
  */
 init_idle_bootup_task(current);
 rcu_scheduler_starting();
 preempt_enable_no_resched();
 schedule();
 preempt_disable();

 /* Call into cpu_idle with preempt disabled */
 cpu_idle();
}
----------------------------------------------------------------
 可以看到,函數rest_init()首先會去建立線程kernel_init(注意:這裡和網上或者相關書籍中描述的也不一樣,可能是Linux版本的問題)

,有些文檔中描述這裡建立的是Init線程,雖然名字不一致,但是具體的實現是基本一致的,基本上都是完成根檔案系統的掛載、初始化所有Linux的

裝置驅動(就是調用驅動的初始化函數,類似於CE/Mobile中的Device Manager對裝置驅動的初始化)以及啟動使用者空間Init進程。
 由於手中的rest_init進程和網上描述的都是不一致的,所以這裡也進行了簡要的列舉,代碼如下:
----------------------------------------------------------------
static int __init kernel_init(void * unused)
{
 lock_kernel();
 /*
  * init can run on any cpu.
  */
 set_cpus_allowed_ptr(current, CPU_MASK_ALL_PTR);
 /*
  * Tell the world that we're going to be the grim
  * reaper of innocent orphaned children.
  *
  * We don't want people to have to make incorrect
  * assumptions about where in the task array this
  * can be found.
  */
 init_pid_ns.child_reaper = current;

 cad_pid = task_pid(current);

 smp_prepare_cpus(setup_max_cpus);

 do_pre_smp_initcalls();
 start_boot_trace();

 smp_init();
 sched_init_smp();

 cpuset_init_smp();

 do_basic_setup();

 /*
  * check if there is an early userspace init.  If yes, let it do all
  * the work
  */

 if (!ramdisk_execute_command)
  ramdisk_execute_command = "/init";

 if (sys_access((const char __user *) ramdisk_execute_command, 0) != 0) {
  ramdisk_execute_command = NULL;
  prepare_namespace();
 }

 /*
  * Ok, we have completed the initial bootup, and
  * we're essentially up and running. Get rid of the
  * initmem segments and start the user-mode stuff..
  */

 init_post();
 return 0;
}

static noinline int init_post(void)
{
 /* need to finish all async __init code before freeing the memory */
 async_synchronize_full();
 free_initmem();
 unlock_kernel();
 mark_rodata_ro();
 system_state = SYSTEM_RUNNING;
 numa_default_policy();

 if (sys_open((const char __user *) "/dev/console", O_RDWR, 0) < 0)
  printk(KERN_WARNING "Warning: unable to open an initial console./n");

 (void) sys_dup(0);
 (void) sys_dup(0);

 current->signal->flags |= SIGNAL_UNKILLABLE;

 if (ramdisk_execute_command) {
  run_init_process(ramdisk_execute_command);
  printk(KERN_WARNING "Failed to execute %s/n",
    ramdisk_execute_command);
 }

 /*
  * We try each of these until one succeeds.
  *
  * The Bourne shell can be used instead of init if we are
  * trying to recover a really broken machine.
  */
 if (execute_command) {
  run_init_process(execute_command);
  printk(KERN_WARNING "Failed to execute %s.  Attempting "
     "defaults.../n", execute_command);
 }
 run_init_process("/sbin/init");
 run_init_process("/etc/init");
 run_init_process("/bin/init");
 run_init_process("/bin/sh");

 panic("No init found.  Try passing init= option to kernel.");
}
----------------------------------------------------------------
 可以看到,和網路上相關的描述不一樣的是,這裡首先會去初始化裝置驅動,而不是像網上或者資料上所描述的一樣,首先去載入跟檔案系

統,難道不存在初始化的時候需要訪問檔案的驅動了?或者以前的做法純屬一種安全的考慮?
 這些問題就留到以後對Linux&Android有深入地瞭解之後再去考慮吧!
 好了,搞定!!!

相關文章

聯繫我們

該頁面正文內容均來源於網絡整理,並不代表阿里雲官方的觀點,該頁面所提到的產品和服務也與阿里云無關,如果該頁面內容對您造成了困擾,歡迎寫郵件給我們,收到郵件我們將在5個工作日內處理。

如果您發現本社區中有涉嫌抄襲的內容,歡迎發送郵件至: info-contact@alibabacloud.com 進行舉報並提供相關證據,工作人員會在 5 個工作天內聯絡您,一經查實,本站將立刻刪除涉嫌侵權內容。

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.