ARM startup code analysis

Source: Internet
Author: User

Understanding startup code (ADS)
Startup code is the code the processor executes first after power-up or reset. Its main tasks are to set the processor mode, set up the stacks, and initialize variables. Because these operations depend closely on the processor architecture and the system configuration, startup code is generally written in assembly language.
For the S64 specifically, the startup code falls into two parts. The first is tied to the ARM7TDMI core: it sets up the processor's exception vectors and the stack for each processor mode, copies the vectors to RAM if necessary so that exceptions are still handled correctly after a remap, initializes data (both RW and ZI), and finally jumps to main. The second part concerns the on-chip peripherals and is therefore vendor-specific: although many parts share the ARM7TDMI core, each manufacturer integrates different peripherals that need different initialization. The important steps here are initializing the watchdog timer (WDT), setting up the clock of each subsystem, and remapping if necessary. This part resembles the initialization of any ordinary microcontroller, so this article does not dwell on it.
Before the analysis, note the following facts:
On-chip flash on the S64 starts at 0x100000 and is 64 KB in size; on-chip RAM starts at 0x200000 and is 16 KB in size.
After reset the S64 starts executing at address 0, where flash is mapped, so it can fetch and execute instructions. Flash itself also remains accessible at 0x100000. After a remap, RAM is mapped to address 0 instead; the contents at address 0 are then only an alias of RAM.
In the worst case the S64 flash guarantees single-cycle access up to 30 MHz, while RAM guarantees single-cycle access at the maximum clock speed.
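The memory map described above can be modeled on a host machine. The following is a minimal C sketch, not vendor code: the function name resolve_address, the UNMAPPED marker, and the assumption that the alias window at address 0 is exactly as large as the aliased memory are all illustrative choices that do not come from the original text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLASH_BASE 0x00100000u /* from the text: on-chip flash start */
#define FLASH_SIZE 0x00010000u /* from the text: 64 KB */
#define RAM_BASE   0x00200000u /* from the text: on-chip RAM start */
#define RAM_SIZE   0x00004000u /* from the text: 16 KB */
#define UNMAPPED   0xFFFFFFFFu /* hypothetical marker for "nothing here" */

/* Hypothetical helper: return the address inside flash or RAM that a CPU
 * address refers to. Before the remap, low addresses alias flash; after
 * the remap, they alias RAM. The alias window size is an assumption. */
uint32_t resolve_address(uint32_t addr, bool remapped)
{
    if (addr >= FLASH_BASE && addr - FLASH_BASE < FLASH_SIZE)
        return addr;                 /* flash at its real address */
    if (addr >= RAM_BASE && addr - RAM_BASE < RAM_SIZE)
        return addr;                 /* RAM at its real address */
    if (!remapped && addr < FLASH_SIZE)
        return FLASH_BASE + addr;    /* alias of flash at 0 */
    if (remapped && addr < RAM_SIZE)
        return RAM_BASE + addr;      /* alias of RAM at 0 */
    return UNMAPPED;
}
```

For example, resolve_address(0x20, false) yields 0x100020, while the same address after a remap yields 0x200020.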
With that in place, the startup code is analyzed below.

1. Processor exceptions
The S64 places the exception vectors at the addresses starting from 0, and each of them must be handled. Because the reset vector is also located at 0, it needs a branch instruction as well. The code is as follows:
reset
        B       sysinit                 ; reset
        B       udfhandler              ; undefined instruction
        B       swihandler              ; SWI
        B       pabthandler             ; prefetch abort
        B       dabthandler             ; data abort
        B       .                       ; reserved
        B       vectored_irq_handler    ; IRQ
        B       .                       ; add FIQ code here

udfhandler
        B       .

swihandler
        B       .

pabthandler
        B       .

dabthandler
        B       .
 
Note that the assembler encodes a B instruction as a signed offset relative to the current PC. The instruction is therefore position-independent: whether it executes at 0 or at 0x100000, it reaches the intended location. In contrast, LDR pc, =label loads the label's absolute value into the PC. That value is fixed at link time from the RO base address, so the instruction always jumps to the same absolute address no matter where it executes. The following example illustrates the difference.
Assume the following program:
reset
        B       init                    ; or: LDR pc, =init
        ...

init
        ...
Here reset is the first code in the image, so its offset is 0; let the offset of init be offset. If the program is linked with RO = 0x100000, B init can be read as ADD pc, pc, #offset, and LDR pc, =init as MOV pc, #(RO + offset). After reset the processor starts executing at 0, where flash is aliased. Executing B init moves the PC to init within the alias at address 0, whereas LDR pc, =init moves the PC to init in flash proper, at 0x100000 + offset. Both variants therefore run correctly.
Now link the code with RO = 0x200000 and program it into flash at 0x100000. After reset, execution again starts at 0, in the alias of flash. B init jumps to init within the alias, where the code exists. LDR pc, =init, however, loads 0x200000 + offset into the PC and jumps into RAM. Because the code has not been copied there yet, RAM holds no code at that address and the program cannot run.
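The arithmetic behind this comparison can be checked on a host. The sketch below is an illustration under stated assumptions, not part of the original startup code: the function names are hypothetical, and the model uses the ARM-state rule that a B instruction branches to its own address plus 8 plus the stored word offset times 4.

```c
#include <assert.h>
#include <stdint.h>

/* Word offset the assembler stores in a B instruction, computed from the
 * link-time addresses of the instruction and of its target. */
int32_t b_encode_offset(uint32_t insn_link_addr, uint32_t target_link_addr)
{
    return ((int32_t)(target_link_addr - insn_link_addr) - 8) / 4;
}

/* Address a B instruction branches to when it executes at insn_run_addr.
 * Only the run-time address matters, not the link address. */
uint32_t b_target(uint32_t insn_run_addr, int32_t word_offset)
{
    return insn_run_addr + 8u + (uint32_t)(word_offset * 4);
}

/* Address loaded by LDR pc, =label: the label's link-time address,
 * independent of where the instruction executes. */
uint32_t ldr_pc_target(uint32_t ro_base, uint32_t label_offset)
{
    return ro_base + label_offset;
}
```

With RO = 0x200000 and init at a hypothetical offset of 0x40, b_target(0x0, b_encode_offset(0x200000, 0x200040)) gives 0x40, which lies inside the alias of flash, whereas ldr_pc_target(0x200000, 0x40) gives 0x200040, which lies in RAM that has not been filled yet.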

2. Processor modes
An ARM processor can operate in several modes, and most modes have their own banked stack pointer and therefore their own stack. The following shows how to enter each mode and set up its stack.
First define some constants:
modusr          EQU     0x10
modsys          EQU     0x1F
modsvc          EQU     0x13
modabt          EQU     0x17
modudf          EQU     0x1B
modirq          EQU     0x12
modfiq          EQU     0x11

irqbit          EQU     0x80            ; CPSR I bit: IRQ disabled when set
fiqbit          EQU     0x40            ; CPSR F bit: FIQ disabled when set

ramend          EQU     0x00204000      ; S64: 16 KB RAM

vectsize        EQU     0x100

usrstksz        EQU     8               ; size of USR stack
sysstksz        EQU     128             ; size of SYS stack
svcstksz        EQU     8               ; size of SVC stack
udfstksz        EQU     8               ; size of UDF stack
abtstksz        EQU     8               ; size of ABT stack
irqstksz        EQU     128             ; size of IRQ stack
fiqstksz        EQU     16              ; size of FIQ stack

Change these values to change the stack size of the corresponding mode.
The code that sets up each mode is as follows:
sysinit
        MRS     r0, CPSR
        BIC     r0, r0, #0x1F           ; clear the mode bits

        MOV     r2, #ramend             ; stacks grow down from the end of RAM
        ORR     r1, r0, #(modsvc :OR: irqbit :OR: fiqbit)
        MSR     CPSR_cxsf, r1           ; enter SVC mode
        MOV     sp, r2
        SUB     r2, r2, #svcstksz

        ORR     r1, r0, #(modfiq :OR: irqbit :OR: fiqbit)
        MSR     CPSR_cxsf, r1           ; enter FIQ mode
        MOV     sp, r2
        SUB     r2, r2, #fiqstksz

        ORR     r1, r0, #(modirq :OR: irqbit :OR: fiqbit)
        MSR     CPSR_cxsf, r1           ; enter IRQ mode
        MOV     sp, r2
        SUB     r2, r2, #irqstksz

        ORR     r1, r0, #(modudf :OR: irqbit :OR: fiqbit)
        MSR     CPSR_cxsf, r1           ; enter UDF mode
        MOV     sp, r2
        SUB     r2, r2, #udfstksz

        ORR     r1, r0, #(modabt :OR: irqbit :OR: fiqbit)
        MSR     CPSR_cxsf, r1           ; enter ABT mode
        MOV     sp, r2
        SUB     r2, r2, #abtstksz

        ; ORR   r1, r0, #(modusr :OR: irqbit :OR: fiqbit)
        ; MSR   CPSR_cxsf, r1           ; enter USR mode
        ; MOV   sp, r2
        ; SUB   r2, r2, #usrstksz

        ORR     r1, r0, #(modsys :OR: irqbit :OR: fiqbit)
        MSR     CPSR_cxsf, r1           ; enter SYS mode
        MOV     sp, r2
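To see where each stack ends up, the subtraction chain in sysinit can be replayed on a host. The following C sketch assumes the sizes and the ordering shown in the listing (SVC, FIQ, IRQ, UDF, ABT, then SYS); the enum and the function name stack_top are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Modes in the order in which sysinit assigns their stacks. */
enum stack_id { STK_SVC, STK_FIQ, STK_IRQ, STK_UDF, STK_ABT, STK_SYS, STK_COUNT };

/* Stack sizes in bytes, taken from the EQU definitions in the text. */
static const uint32_t stack_size[STK_COUNT] = {
    [STK_SVC] = 8,  [STK_FIQ] = 16, [STK_IRQ] = 128,
    [STK_UDF] = 8,  [STK_ABT] = 8,  [STK_SYS] = 128
};

/* Initial stack pointer of one mode: start at the end of RAM and subtract
 * the sizes of all stacks that were assigned before it. */
uint32_t stack_top(uint32_t ram_end, enum stack_id id)
{
    assert(id < STK_COUNT);
    uint32_t sp = ram_end;
    for (int i = 0; i < (int)id; i++)
        sp -= stack_size[i];
    return sp;
}
```

With ramend = 0x00204000, the SVC stack starts at 0x204000, the IRQ stack at 0x203FE8, and the SYS stack at 0x203F58. Since the USR block is commented out and USR and SYS modes share the same stack pointer, the SYS stack also serves code running in USR mode.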

3. Initialize variables
The linker produces three basic regions, RO, RW, and ZI, which are placed one after another in the image. At startup, RW and ZI are clearly not yet at their execution addresses in RAM, so they must be initialized: the RW data is copied from the image, and the ZI region is filled with zeros.
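        LDR     r0, =|Image$$RO$$Limit| ; source: RW initial values follow RO
        LDR     r1, =|Image$$RW$$Base|  ; destination: RW execution address
        LDR     r2, =|Image$$ZI$$Base|  ; end of RW, start of ZI
1
        CMP     r1, r2
        LDRLO   r3, [r0], #4
        STRLO   r3, [r1], #4
        BLO     %B1                     ; copy until RW is complete

        MOV     r3, #0
        LDR     r1, =|Image$$ZI$$Limit|
2
        CMP     r2, r1
        STRLO   r3, [r2], #4
        BLO     %B2                     ; zero until the end of ZI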
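The two loops above correspond to a word-wise copy followed by a word-wise fill. The following C sketch mirrors them on a host, with ordinary arrays standing in for the load image and for RAM; the function names and the test data are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Copy the RW initial values from the load image to [rw_base, zi_base),
 * then zero [zi_base, zi_limit). The ZI region is assumed to follow the
 * RW region directly, as in the assembly loops. */
void init_rw_zi(const uint32_t *load, uint32_t *rw_base,
                uint32_t *zi_base, uint32_t *zi_limit)
{
    const uint32_t *src = load;
    uint32_t *dst = rw_base;

    while (dst < zi_base)       /* loop 1: copy RW data */
        *dst++ = *src++;
    while (dst < zi_limit)      /* loop 2: clear ZI data */
        *dst++ = 0;
}

/* Small demonstration: three RW words and two ZI words in a fake RAM that
 * starts out filled with junk. Returns the sum of all RAM words. */
uint32_t rw_zi_demo(void)
{
    const uint32_t load[3] = { 0x11, 0x22, 0x33 };
    uint32_t ram[5] = { 0xAA, 0xAA, 0xAA, 0xAA, 0xAA };
    uint32_t sum = 0;

    init_rw_zi(load, &ram[0], &ram[3], &ram[5]);
    for (int i = 0; i < 5; i++)
        sum += ram[i];
    return sum;
}
```

After the call, the first three words of the fake RAM hold the load values and the last two are zero, so the sum is 0x66.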

4. Copy the exception vectors
Because code runs noticeably faster from RAM, and because vectors held in RAM can be changed at run time, you can map RAM to address 0 with a remap so that the ARM core fetches its exception vectors from RAM.
        IMPORT  |Image$$RO$$Base|
        IMPORT  |Image$$RO$$Limit|
        IMPORT  |Image$$RW$$Base|
        IMPORT  |Image$$RW$$Limit|
        IMPORT  |Image$$ZI$$Base|
        IMPORT  |Image$$ZI$$Limit|

copy_vect_to_ram
        LDR     r0, =|Image$$RO$$Base|  ; source: start of the image
        LDR     r1, =sysinit            ; end of the vectors and handlers
        LDR     r2, =0x200000           ; destination: RAM start
0
        CMP     r0, r1
        LDRLO   r3, [r0], #4
        STRLO   r3, [r2], #4
        BLO     %B0

This routine copies all code located before sysinit, that is, the vector table and the exception handlers, to the start of RAM. Consequently the RW base must not be set to 0x200000; otherwise the RW data would overwrite the vectors.

4. Run in RAM
If necessary, and if the code is small enough, the code can run from RAM. Since RAM initially holds no code, the code must first be copied there:
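copy_begin
        LDR     r0, =0x200000           ; RAM start
        LDR     r1, =reset              ; = |Image$$RO$$Base|
        CMP     r1, r0
        BLO     copy_end                ; linked below RAM: nothing to copy

        ADR     r0, reset               ; run-time address of reset
        ADR     r2, copy_end            ; run-time address of copy_end
        SUB     r0, r2, r0              ; offset of copy_end within the image
        ADD     r1, r1, r0              ; destination = RO base + offset

        LDR     r3, =|Image$$RO$$Limit|
3
        CMP     r1, r3
        LDRLO   r4, [r2], #4
        STRLO   r4, [r1], #4
        BLO     %B3

        LDR     pc, =copy_end           ; continue in the RAM copy

copy_end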
The routine first loads the link address of reset and compares it with the RAM start address to decide whether the program is meant to run in RAM. If the link address is lower than the RAM start address, the copy is skipped.
Note that the code before the end of this routine does not need to be copied, because it has already been executed. The routine therefore takes copy_end as the source start address, computes its offset from reset, and adds that offset to the RO base to obtain the destination start address. Only then does the copy begin.
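The address arithmetic of this routine can be verified with a few lines of host code. The sketch below is an illustration only: the function names are hypothetical, and the offset 0x1A0 used in the example afterwards is an invented value, not one taken from a real image.

```c
#include <assert.h>
#include <stdint.h>

/* The copy is needed only when the image is linked to run at or above
 * the RAM start address (mirrors CMP r1, r0 / BLO copy_end). */
int needs_copy(uint32_t ro_base, uint32_t ram_start)
{
    return ro_base >= ram_start;
}

/* Destination of the first copied word: the RO base plus the offset of
 * copy_end within the image. The offset is derived from the run-time
 * addresses that the two ADR instructions produce, so it is the same
 * whether the code executes at 0 or at 0x100000. */
uint32_t copy_dest(uint32_t ro_base, uint32_t reset_run_addr,
                   uint32_t copy_end_run_addr)
{
    return ro_base + (copy_end_run_addr - reset_run_addr);
}

/* Number of bytes the loop copies, from the destination up to the RO limit. */
uint32_t copy_length(uint32_t dest, uint32_t ro_limit)
{
    return ro_limit - dest;
}
```

For an image linked at 0x200000 with copy_end at offset 0x1A0, copy_dest returns 0x2001A0 both when the code executes in the alias at 0 and when it executes in flash at 0x100000.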

5. Start the main program
Once the preceding steps are complete, the code can jump to main:
        IMPORT  main

        LDR     pc, =main
        B       .

6. Device initialization
The main program first needs to initialize the devices. On the S64 the WDT should be handled first, because it is enabled by default; then the clocks of the individual devices are configured, and finally the remap is performed.

The steps above are the required ones. You can add further code as needed.
