PowerPC-based Linux kernel tour: 2nd station - __secondary_start (start_here), Part 1


After early_init ran in the previous article, the basic initialization of the early CPU startup stage is complete. The kernel then relocates itself: it copies its image to the right place and continues running from there. The code is as follows:

        bl      reloc_offset
        mr      r26,r3
        addis   r4,r3,KERNELBASE@h      /* current address of _start */
        lis     r5,PHYSICAL_START@h
        cmplw   0,r4,r5                 /* already running at PHYSICAL_START? */
        bne     relocate_kernel         /* relocate the kernel; needed on the classic boot path */
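The comparison made by these instructions can be modelled in plain C. This is only a sketch: the values of KERNELBASE and PHYSICAL_START below are assumed examples (the real ones come from the kernel configuration), and the helper names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed example values; the real ones come from the kernel configuration. */
#define KERNELBASE     0xC0000000u  /* link-time (virtual) address of _start */
#define PHYSICAL_START 0x00000000u  /* physical address the kernel should run from */

/* Run-time address minus link-time address, modulo 2^32 (what ends up in r3/r26). */
static uint32_t reloc_offset_model(uint32_t runtime_addr_of_start)
{
    return runtime_addr_of_start - KERNELBASE;
}

/* Mirrors addis/lis/cmplw/bne: is _start currently somewhere other than PHYSICAL_START? */
static bool needs_relocation(uint32_t offset)
{
    /* The @h operators only keep the upper 16 bits, so this sketch assumes
     * both addresses are 64 KB aligned. */
    assert((KERNELBASE & 0xFFFFu) == 0 && (PHYSICAL_START & 0xFFFFu) == 0);
    uint32_t current = offset + (KERNELBASE & 0xFFFF0000u);  /* addis r4,r3,KERNELBASE@h */
    uint32_t target  = PHYSICAL_START & 0xFFFF0000u;         /* lis r5,PHYSICAL_START@h */
    return current != target;
}
```

With these assumed values, a kernel loaded at physical address 0 gives an offset of 0x40000000 (that is, -0xC0000000 modulo 2^32) and no relocation is needed, while one loaded at 0x00400000 would branch to relocate_kernel.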

In the assembly above, the mr instruction saves the current relocation offset in r26, which relocate_kernel uses later. The kernel then decides whether it needs to be relocated: KERNELBASE is the virtual start address of the kernel, PHYSICAL_START is the physical address it is meant to run from, and the startup code must execute from that physical address. The detailed code of relocate_kernel is as follows:

    relocate_kernel:
        addis   r9,r26,klimit@ha        /* fetch klimit */
        lwz     r25,klimit@l(r9)        /* r25 = value of klimit, read at (klimit + offset) */
        addis   r25,r25,-KERNELBASE@h   /* r25 now holds the kernel size */
        lis     r3,PHYSICAL_START@h     /* destination base address of the copy */
        li      r6,0                    /* destination offset: none */
        li      r5,0x4000               /* copy the first 16 KB */
        bl      copy_and_flush
        addi    r0,r3,4f@l              /* address of 4f in the copy */
        mtctr   r0                      /* in copy and do the rest. */
        bctr                            /* jump to the copy */
    4:  mr      r5,r25
        bl      copy_and_flush          /* copy the rest */
        b       turn_on_mmu             /* turn on the MMU */

The mechanism is simple: once the kernel size is known, copy the first 16 KB, jump into the copy, copy the rest, and then turn on the MMU. The code that enables the MMU closely mirrors the code that disables it, so it is not listed here. Let's look instead at copy_and_flush, which copies the kernel to its physical start address and flushes the caches for the copied lines (the data cache line is written back and the instruction cache line is invalidated). The code is as follows:

    _ENTRY(copy_and_flush)
        addi    r5,r5,-4
        addi    r6,r6,-4
    4:  li      r0,L1_CACHE_BYTES/4     /* words per cache line (4 if L1_CACHE_BYTES = 0b10000 = 16; the value depends on the configuration) */
        mtctr   r0
    3:  addi    r6,r6,4                 /* copy a cache line */
        lwzx    r0,r6,r4                /* load one word (4 bytes) from the source at r4 + r6 */
        stwx    r0,r6,r3                /* store it, through the cache, to r3 + r6 */
        bdnz    3b                      /* decrement the counter and loop until the whole line is copied */
        dcbst   r6,r3                   /* data cache block store: write the line at r3 + r6 back to memory */
        sync
        icbi    r6,r3                   /* instruction cache block invalidate: drop any stale copy of the line */
        cmplw   0,r6,r5
        blt     4b                      /* keep copying lines until r6 >= r5 */
        sync                            /* additional sync needed on G4 */
        isync
        addi    r5,r5,4
        addi    r6,r6,4
        blr
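To make the control flow easier to follow, here is a hedged user-space model in C of copy_and_flush and of the two-phase copy done by relocate_kernel. The function names, the LINE_BYTES value, and the use of memcpy are assumptions for illustration; the cache maintenance instructions have no portable equivalent and appear only as a comment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_BYTES  32u      /* assumed cache-line size, standing in for L1_CACHE_BYTES */
#define FIRST_CHUNK 0x4000u  /* the first 16 KB copied by relocate_kernel */

/*
 * Copy whole cache lines from src+offset to dst+offset until offset reaches
 * limit, and return the new offset. Like the assembly, it always copies at
 * least one line and rounds the end up to a line boundary.
 */
static uint32_t copy_and_flush_model(uint8_t *dst, const uint8_t *src,
                                     uint32_t limit, uint32_t offset)
{
    assert(offset % LINE_BYTES == 0);
    do {
        memcpy(dst + offset, src + offset, LINE_BYTES);
        /* the real code runs dcbst / sync / icbi on the line just written */
        offset += LINE_BYTES;
    } while (offset < limit);
    return offset;
}

/* Two-phase copy as in relocate_kernel: first 16 KB, then everything else. */
static uint32_t relocate_model(uint8_t *dst, const uint8_t *src, uint32_t size)
{
    uint32_t offset = copy_and_flush_model(dst, src, FIRST_CHUNK, 0);
    return copy_and_flush_model(dst, src, size, offset);
}
```

The returned offset plays the role of r6: because the first call leaves it at 0x4000, the second call continues exactly where the first one stopped. The buffers passed in must be large enough for the size rounded up to a full line.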

In copy_and_flush, r4 holds the value set before relocate_kernel was entered: KERNELBASE plus the offset (the offset is negative, remember?), that is, the source address of the copy. After the copy, the kernel jumps to turn_on_mmu. This function writes the address of start_here to SRR0 and an MSR value with the MMU enabled to SRR1. The interrupt-return instruction (rfi) then loads MSR from SRR1 and the program counter from SRR0, which performs an absolute jump under the new MSR, so the processor lands at start_here. From then on the link address and the run-time address are the same, so reloc_offset no longer has to be added when accessing variables.
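The jump through rfi can be pictured with a tiny register model in C. This is a simplified sketch, not the kernel's code: the struct and function names are invented, real hardware masks some MSR bits on rfi, and it is assumed that turn_on_mmu takes the current MSR and sets the two translation-enable bits.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_IR 0x20u  /* instruction address translation enable */
#define MSR_DR 0x10u  /* data address translation enable */

struct cpu_model {
    uint32_t pc, msr;     /* current program counter and machine state register */
    uint32_t srr0, srr1;  /* save/restore registers consumed by rfi */
};

/* rfi: MSR <- SRR1 and PC <- SRR0 in a single step (simplified). */
static void rfi_model(struct cpu_model *cpu)
{
    cpu->msr = cpu->srr1;
    cpu->pc  = cpu->srr0;
}

/* The effect of turn_on_mmu, expressed on the model. */
static void turn_on_mmu_model(struct cpu_model *cpu, uint32_t start_here_addr)
{
    assert((cpu->msr & (MSR_IR | MSR_DR)) == 0);  /* translation is off on entry */
    cpu->srr1 = cpu->msr | MSR_IR | MSR_DR;       /* MSR to use after the jump */
    cpu->srr0 = start_here_addr;                  /* virtual address to continue at */
    rfi_model(cpu);
}
```

Doing both updates in one instruction matters: if translation were switched on first and the jump taken afterwards, the next instruction fetch would already use the new address space while the program counter still held a physical address.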

After all this preparation, it is finally time to execute the kernel proper! The entry point is called start_here. The code is long, so we analyze it in two parts. Here is the first part:

    start_here:
        /* ptr to current */
        lis     r2,init_task@h
        ori     r2,r2,init_task@l       /* the statically initialized task_struct */
        /* set up for using our exception vectors */
        tophys(r4,r2)                   /* get the physical address */
        addi    r4,r4,THREAD            /* CPU-specific state of the init thread; THREAD is the offset of thread in task_struct */
        CLR_TOP32(r4)                   /* clear the upper 32 bits (apparently a no-op on 32-bit) */
        mtspr   SPRN_SPRG_THREAD,r4     /* store the current thread pointer in SPRG3 */
        li      r3,0
        mtspr   SPRN_SPRG_RTAS,r3       /* write 0 to SPRG2: not in RTAS */

        /* stack initialization */
        lis     r1,init_thread_union@ha
        addi    r1,r1,init_thread_union@l
        li      r0,0
        stwu    r0,THREAD_SIZE-STACK_FRAME_OVERHEAD(r1)

        /* platform-specific initialization and MMU setup */
        mr      r3,r31
        mr      r4,r30
        bl      machine_init
        bl      __save_cpu_setup
        bl      MMU_init

This part first sets up the initial thread and its stack. After the MMU and the interrupt vectors have been initialized, Linux runs its startup code on the init_thread_union stack, which lies in the first part of the linear mapping. The very first job is to prepare init_task: the code obtains the address of its task_struct and then stores the (physical) pointer to its thread structure in SPRG3, a special-purpose register reserved for system software. Note that on ppc32 this data structure must be aligned to 8 KB (1 << 13), because the stack size is 8 KB.
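The stack arithmetic and the reason for the 8 KB alignment can be sketched in C. The constants below are assumptions for illustration: THREAD_SHIFT of 13 follows the 8 KB figure given in the text, and STACK_FRAME_OVERHEAD is taken to be 16 bytes, the minimum stack frame on 32-bit PowerPC. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SHIFT         13u                   /* 8 KB stacks, as stated in the text */
#define THREAD_SIZE          (1u << THREAD_SHIFT)
#define STACK_FRAME_OVERHEAD 16u                   /* assumed minimum frame size on ppc32 */

/* Initial stack pointer: one empty frame at the top of the init_thread_union area. */
static uint32_t initial_sp(uint32_t thread_union_base)
{
    assert((thread_union_base & (THREAD_SIZE - 1)) == 0);  /* must be 8 KB aligned */
    return thread_union_base + THREAD_SIZE - STACK_FRAME_OVERHEAD;
}

/* Thanks to the alignment, masking any stack pointer recovers the base of the area. */
static uint32_t stack_base_from_sp(uint32_t sp)
{
    return sp & ~(THREAD_SIZE - 1);
}
```

The second helper shows why the alignment is required: as long as the area is THREAD_SIZE aligned, the per-thread data at its base can be found from the stack pointer alone, with a single mask operation.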

Next comes the board- and platform-specific initialization. First, machine_init. It mainly does two things:

1. It probes for the current board type and selects the matching ppc_md structure.
2. It saves the early boot data, parses the device tree, determines the memory layout and builds the memblock structures, and reads further hardware information from the device tree, such as the CPU frequency, the base address of the internal registers, and the interrupt system. Note that only a small amount of memory is usable at this point.

The function itself is just a short sequence of calls, so its code is not reproduced here. It starts with lockdep_init and udbg_early_init, both of which are simple. The former starts the lock dependency validator (which tracks lock dependencies in the kernel): in essence it creates two hash tables, classhash_table and chainhash_table, and sets the global variable lockdep_initialized to indicate that initialization is done. The latter sets up early debug output. One backend can be selected in the kernel configuration, typically an NS16550 serial port. I do not fully understand this part yet and will study it later.

Then comes early_init_devtree, which sets up the flattened device tree (FDT) at boot and extracts the boot parameters needed for kernel initialization, such as cmd_line; unflatten_device_tree is called later to expand the tree. Let's look at the annotated code of early_init_devtree (in prom.c):

    void __init early_init_devtree(void *params)
    {
        phys_addr_t limit;      /* physical address limit */

        /* params is passed in by machine_init: the virtual address of the device tree */
        initial_boot_params = params;

        /* Retrieve information from the /chosen node of the device tree, including
         * the platform type, initrd location and size, TCE reserve, ... */
        of_scan_flat_dt(early_init_dt_scan_chosen, NULL);

        /* Initialize memblock and scan the memory nodes of the device tree */
        memblock_init();
        of_scan_flat_dt(early_init_dt_scan_root, NULL);
        of_scan_flat_dt(early_init_dt_scan_memory_ppc, NULL);

        /* Save the command line passed by the bootloader in boot_command_line */
        strlcpy(boot_command_line, cmd_line, COMMAND_LINE_SIZE);
        parse_early_param();    /* parse the command line parameters */

        /* Reserve the memblock regions used by the kernel and the initrd */
        memblock_reserve(PHYSICAL_START, __pa(klimit) - PHYSICAL_START);
        /* If relocatable, reserve the first 32 KB of memory for the interrupt vectors */
        if (PHYSICAL_START > MEMORY_START)
            memblock_reserve(MEMORY_START, 0x8000);
        /* Reserve 64 KB for kdump, i.e. memblock_reserve(0, 0x10000); */
        reserve_kdump_trampoline();
        /* Reserve space for the crash kernel. The code is long, but it only
         * computes a start address and a length */
        reserve_crashkernel();
        early_reserve_mem();
        phyp_dump_reserve_mem();

        limit = memory_limit;
        if (!limit) {
            phys_addr_t memsize;

            /* Make sure the total memory size is page aligned; otherwise
             * mark_bootmem() will fail */
            memblock_analyze();
            memsize = memblock_phys_mem_size();
            if ((memsize & PAGE_MASK) != memsize)
                limit = memsize & PAGE_MASK;
        }
        /* Trim the memblock regions according to the memory limit */
        memblock_enforce_memory_limit(limit);

        memblock_analyze();
        memblock_dump_all();

        DBG("Phys. mem: %llx\n", memblock_phys_mem_size());

        /* If the device tree lies beyond the memory limit or inside the crash
         * kernel area, move it */
        move_device_tree();

        /* Used on ppc64; an empty function on 32-bit */
        allocate_pacas();

        /* Get the number of CPUs in the system and determine which one
         * is the boot processor */
        of_scan_flat_dt(early_init_dt_scan_cpus, NULL);
    }
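The memory-limit step in the middle of this function is easy to misread, so here is a minimal stand-alone sketch of it in C. The 4 KB page size and the function name are assumptions; a return value of 0 means that no limit needs to be enforced.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                  /* assumed 4 KB pages */
#define PAGE_SIZE  (1ull << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/*
 * Returns the limit to enforce: the user-supplied limit if there is one,
 * otherwise the memory size rounded down to a page boundary when it is not
 * already aligned, otherwise 0.
 */
static uint64_t effective_memory_limit(uint64_t memory_limit, uint64_t memsize)
{
    uint64_t limit = memory_limit;

    if (!limit && (memsize & PAGE_MASK) != memsize)
        limit = memsize & PAGE_MASK;

    /* A limit derived from memsize is always page aligned. */
    assert(limit == memory_limit || (limit & ~PAGE_MASK) == 0);
    return limit;
}
```

For example, with no user limit and a memory size of 0x08000123 bytes, the sketch yields a limit of 0x08000000, so the trailing partial page is dropped.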

Its main job is to scan the chosen node of the device tree for basic information about the machine, initialize memblock, and reserve the required regions. Next is probe_machine. As its name suggests, it loops over all ppc_md structures until it finds the one that matches the current board type. It is defined in setup-common.c; here is the code:

    void probe_machine(void)
    {
        extern struct machdep_calls __machine_desc_start;
        extern struct machdep_calls __machine_desc_end;

        /* Iterate over all ppc_md structures */
        DBG("Probing machine type ...\n");

        for (machine_id = &__machine_desc_start;
             machine_id < &__machine_desc_end;
             machine_id++) {
            DBG("  %s ...", machine_id->name);
            memcpy(&ppc_md, machine_id, sizeof(struct machdep_calls));
            if (ppc_md.probe()) {
                DBG(" match !\n");
                break;
            }
            DBG("\n");
        }
        /* Loop forever if nothing was found */
        if (machine_id >= &__machine_desc_end) {
            DBG("No suitable machine found !\n");
            for (;;);
        }

        printk(KERN_INFO "Using %s machine description\n", ppc_md.name);
    }

The two external symbols __machine_desc_start and __machine_desc_end are defined in vmlinux.lds.S. On the mpc83xx platform, the structure is implemented as follows (in platforms/83xx/mpc831x_rdb.c):

    define_machine(mpc831x_rdb) {
        .name           = "MPC831x RDB",
        .probe          = mpc831x_rdb_probe,
        .setup_arch     = mpc831x_rdb_setup_arch,
        .init_IRQ       = mpc831x_rdb_init_IRQ,
        .get_irq        = ipic_get_irq,
        .restart        = mpc83xx_restart,
        .time_init      = mpc83xx_time_init,
        .calibrate_decr = generic_calibrate_decr,
        .progress       = udbg_progress,
    };
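The probing scheme (a table of descriptors, each with its own probe callback, scanned until one accepts the board) can be tried out in user space. The sketch below is a simplified model under stated assumptions: the board names and compatible strings are hypothetical, the table is an ordinary array rather than a linker section, and probe() receives the root compatible string as an argument, whereas the kernel's probe callbacks take no arguments and read the flat device tree themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct machine_desc {
    const char *name;
    int (*probe)(const char *root_compatible);
};

/* Hypothetical boards and compatible strings, for illustration only. */
static int probe_board_a(const char *c) { return strcmp(c, "vendor,board-a") == 0; }
static int probe_board_b(const char *c) { return strcmp(c, "vendor,board-b") == 0; }

static const struct machine_desc machine_table[] = {
    { "Board A", probe_board_a },
    { "Board B", probe_board_b },
};

/* Return the first descriptor whose probe() accepts the board, or NULL. */
static const struct machine_desc *probe_machine_model(const char *root_compatible)
{
    size_t n = sizeof machine_table / sizeof machine_table[0];

    assert(root_compatible != NULL);
    for (size_t i = 0; i < n; i++)
        if (machine_table[i].probe(root_compatible))
            return &machine_table[i];
    return NULL;  /* the kernel spins forever in this case */
}
```

Calling probe_machine_model("vendor,board-b") returns the "Board B" entry. Keeping the decision inside each descriptor's callback means a new board can be added without touching the loop.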

The probe function mpc831x_rdb_probe is very simple: it compares against the compatible property of the device tree root node, which early_init_devtree saved from the boot parameters. If they match, the machine has been found. At this point the device tree has not been unflattened yet. The program then reaches setup_kdump_trampoline, which sets up the trampoline instructions for kdump. kdump is a newer, more reliable kernel crash dump mechanism: the crash dump is captured from the context of a freshly booted kernel rather than from the context of the crashed one. When the system crashes, kdump uses kexec to boot a second kernel, usually called the capture kernel, which then captures the dump image. The first kernel reserves part of memory for the second kernel to boot from. In effect, a whole kernel is kept as a hot standby. Besides improving server reliability, kdump can also help track down unexpected resets on ordinary Linux systems: because a core dump is saved when the first kernel dies, you can inspect it after the switch to find the cause of the reset. This feature has to be enabled with CONFIG_CRASH_DUMP; the details are not covered here.

Finally, cpu_has_feature is an inline function that checks whether the current CPU has a given feature. The code is as follows:

    static inline int cpu_has_feature(unsigned long feature)
    {
        return (CPU_FTRS_ALWAYS & feature) ||
               (CPU_FTRS_POSSIBLE
                & cur_cpu_spec->cpu_features
                & feature);
    }

Here it is used to check whether the CPU supports a sleep mode; if so, ppc6xx_idle is installed as the power-save routine. CPU_FTRS_ALWAYS and CPU_FTRS_POSSIBLE enumerate CPU features at build time, and the check combines them with the cpu_features field of cur_cpu_spec. As for ppc6xx_idle, it is a companion of init_idle_6xx, which appeared earlier in __after_mmu_off during the early_init stage. The three functions init_idle_6xx, ppc6xx_idle, and power_save_ppc32_restore in the same file respectively initialize and save the relevant registers, put the CPU to sleep, and wake it up from sleep.
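How the two build-time masks interact with the run-time feature word can be shown with a small model. The feature bits, mask contents, and names below are hypothetical; only the shape of the expression follows cpu_has_feature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature bits, for illustration only. */
#define FTR_A 0x1ul
#define FTR_B 0x2ul
#define FTR_C 0x4ul

#define FTRS_ALWAYS   (FTR_A)          /* present on every CPU this build supports */
#define FTRS_POSSIBLE (FTR_A | FTR_B)  /* present on at least one supported CPU */

struct cpu_spec_model {
    unsigned long cpu_features;        /* what the running CPU actually reports */
};

static int has_feature(const struct cpu_spec_model *cur, unsigned long feature)
{
    assert(cur != NULL);
    return (FTRS_ALWAYS & feature) ||
           (FTRS_POSSIBLE & cur->cpu_features & feature);
}
```

With a constant argument the compiler can often resolve the test at build time: a feature in FTRS_ALWAYS is true without looking at the CPU, and a feature outside FTRS_POSSIBLE is false even if the CPU reports it (FTR_C in this model), so only the remaining cases need the run-time lookup.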

That concludes machine_init. Execution then jumps to __save_cpu_setup. This function is defined in cpu_setup_6xx.S and is only meaningful on 6xx processors; for other processor types it is just a blr instruction that does nothing. It backs up the configuration state of CPU 0 and is also called when going to sleep. It does not cover the cache and MMU configuration; it stores the values of registers such as HIDx and MSSCR0. The code is as follows:

    _GLOBAL(__save_cpu_setup)
        /* Some CR fields are volatile, we back it up all */
        /* CR is 32 bits wide, split into 8 fields of 4 bits each; the bits of a
         * field mean LT (less than), GT (greater than), EQ (equal) and SO (summary overflow) */
        mfcr    r7

        /* Get storage ptr */
        lis     r5,cpu_state_storage@h
        ori     r5,r5,cpu_state_storage@l   /* r5 = pointer to the storage array */

        /* Save HID0 (common to all CONFIG_6xx cpus) */
        mfspr   r3,SPRN_HID0
        stw     r3,CS_HID0(r5)              /* save HID0 into the array */

        /* Now deal with CPU type dependent registers */
        mfspr   r3,SPRN_PVR                 /* PVR: processor version register */
        srwi    r3,r3,16                    /* shift right by 16 bits to keep the version field; 0x8086 on the author's 603-class core */
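The PVR handling at the end of this fragment amounts to splitting a 32-bit register into two 16-bit fields. A hedged C sketch, with invented type and function names and an arbitrary example value used only to exercise the code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Upper 16 bits of the PVR: processor version (what srwi r3,r3,16 leaves in r3). */
static uint16_t pvr_version(uint32_t pvr)  { return (uint16_t)(pvr >> 16); }

/* Lower 16 bits: revision of that processor. */
static uint16_t pvr_revision(uint32_t pvr) { return (uint16_t)(pvr & 0xFFFFu); }

/* Hypothetical stand-in for the cpu_state_storage array. */
struct cpu_state_model {
    uint32_t hid0;     /* register common to all 6xx CPUs */
    uint16_t version;  /* value the CPU-type comparisons work on */
};

static void save_cpu_setup_model(struct cpu_state_model *s, uint32_t hid0, uint32_t pvr)
{
    assert(s != NULL);
    s->hid0 = hid0;                 /* corresponds to stw r3,CS_HID0(r5) */
    s->version = pvr_version(pvr);  /* corresponds to mfspr + srwi */
}
```

The CPU-type comparisons that follow in the real function only look at the version field, which is why the revision half is shifted out before they start.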

A long stretch of code that compares specific CPU models follows, so it is not reproduced here. The handling after each comparison is similar to the above: MSSCR0 is saved first where present, then HID1 and HID2. The next function is MMU_init, which creates the basic memory mappings for the kernel, including RAM and some I/O areas, sets up the page tables, and prepares the MMU hardware. It is defined in mm/init_32.c. An abridged version of the code follows:

    void __init MMU_init(void)
    {
        if (ppc_md.progress)    /* in practice: print progress on the serial port */
            ppc_md.progress("MMU:enter", 0x111);

        /* Set the address range reachable during early MMU setup:
         * 8 MB on 8xx, 16 MB on 601, 256 MB on everything else */
        if (PVR_VER(mfspr(SPRN_PVR)) == 1)
            __initial_memory_limit_addr = 0x01000000;
        if (PVR_VER(mfspr(SPRN_PVR)) == 0x50)
            __initial_memory_limit_addr = 0x00800000;

        /* Parse the boot command line options nobats and noltlbs */
        MMU_setup();

        /* Check how many memory regions the system has */
        if (memblock.memory.cnt > 1) {
    #ifndef CONFIG_WII
            memblock.memory.cnt = 1;
            memblock_analyze();
            /* Use only the first physically contiguous memory region */
            printk(KERN_WARNING "Only using first contiguous memory region");
    #else
            wii_memory_fixups();
    #endif
        }

        /* Record the size of the first contiguous region in total_lowmem */
        total_lowmem = total_memory = memblock_end_of_DRAM() - memstart_addr;
        lowmem_end_addr = memstart_addr + total_lowmem;

    #ifdef CONFIG_FSL_BOOKE
        /* Only for Freescale Book-E; not relevant on 83xx */
        adjust_total_lowmem();
    #endif /* CONFIG_FSL_BOOKE */

        if (total_lowmem > __max_low_memory) {
            total_lowmem = __max_low_memory;
            lowmem_end_addr = memstart_addr + total_lowmem;
    #ifndef CONFIG_HIGHMEM
            total_memory = total_lowmem;
            memblock_enforce_memory_limit(lowmem_end_addr);
            memblock_analyze();
    #endif /* CONFIG_HIGHMEM */
        }

        /* Initialize the MMU hardware (in ppc_mmu_32.c) */
        MMU_init_hw();

        /* Map all of RAM starting at KERNELBASE */
        mapin_ram();

        /* Initialize the early top-down ioremap allocator */
        ioremap_bot = IOREMAP_TOP;
    }
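The low-memory bookkeeping in the middle of MMU_init can be isolated into a small pure function. The sketch below uses invented names and passes everything in as parameters; the highmem flag stands in for CONFIG_HIGHMEM, and the memblock trimming that accompanies the clamp is left out.

```c
#include <assert.h>
#include <stdint.h>

struct lowmem_model {
    uint32_t total_memory;     /* RAM the kernel will use at all */
    uint32_t total_lowmem;     /* RAM that is permanently mapped */
    uint32_t lowmem_end_addr;  /* physical end of the permanently mapped RAM */
};

static struct lowmem_model clamp_lowmem(uint32_t memstart_addr, uint32_t end_of_dram,
                                        uint32_t max_low_memory, int highmem)
{
    struct lowmem_model m;

    assert(end_of_dram >= memstart_addr);
    m.total_lowmem = m.total_memory = end_of_dram - memstart_addr;
    m.lowmem_end_addr = memstart_addr + m.total_lowmem;

    if (m.total_lowmem > max_low_memory) {
        m.total_lowmem = max_low_memory;
        m.lowmem_end_addr = memstart_addr + m.total_lowmem;
        if (!highmem)
            m.total_memory = m.total_lowmem;  /* RAM above the limit is not used */
    }
    return m;
}
```

For instance, with 1 GB of RAM starting at 0 and an assumed low-memory limit of 768 MB, the model reports 768 MB of low memory in both configurations, but keeps the full 1 GB in total_memory only when highmem is set.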

The memblock operations are essentially the same as those already seen in early_init_devtree. For 83xx processors, the MMU hardware initialization function MMU_init_hw is relatively complex: besides the usual instruction cache flush, it also sets up the hash table and completes the instructions in hash_low_32.S.

Initializing the MMU hardware on the e300 core is a long process. I will cover it together with the final step of turning on the MMU and analyze it in detail.
