ARM Assembly Language Subroutine Design Method

Source: Internet
Author: User

In embedded software development, most application code is written in C to improve development efficiency. At the same time, a system often contains a few key modules that determine its overall performance. To get the best performance, these modules are often written in assembly language. Assembly language is also required in some special situations, such as direct hardware manipulation.

Functions are a central concept in C. Assembly languages usually express the same concept as a subroutine or procedure; this article uses the term subroutine. The article first introduces the general method of designing ARM assembly language subroutines, then proposes a new design method based on stack frames, and finally describes how such subroutines interact with C.

1 General Method

In ARM assembly language, a subroutine is normally called with the BL (branch and link) instruction. BL first saves the return address in the link register R14 (also called LR) and then jumps to the target address. When the subroutine finishes, it copies the content of R14 into the PC to return to the caller.

        ...
        BL      subr            ; call subr
        ...                     ; return here
subr
        ...                     ; subroutine body
        MOV     PC, LR          ; return from subr

This method is sufficient for leaf routines (routines that do not call other subroutines), but it cannot handle nested or recursive calls. If subr uses BL to call another subroutine, LR is overwritten with the return address of that second call. When subr later executes MOV PC, LR, it jumps back to the instruction after its own BL instead of returning to its caller, which results in an endless loop. To solve this problem, subr must save LR before calling the second subroutine. More generally, to let subroutines call one another to any depth, there must be a way to save any number of return addresses. The most common method is to save the return address on the stack, as shown in the following example:

subr
        STMFD   SP!, {R4-R12, LR}   ; save all working registers and the return address, and update the stack pointer
        ...                         ; subroutine body
        LDMFD   SP!, {R4-R12, PC}   ; restore all working registers, load the PC with the saved return address,
                                    ; and update the stack pointer

At its entry point, subr pushes LR and any working registers it uses onto the stack, and it pops them again at its exit point. Other subroutines can then be called safely, without the risk that the return address is overwritten and subr can no longer return. Note that the exit sequence loads the saved return address directly into the PC. It is equivalent to the following two instructions:

        LDMFD   SP!, {R4-R12, LR}
        MOV     PC, LR
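
The effect of saving LR can be tried on a host machine with a small C model. This is only a sketch under stated assumptions: the variable lr stands in for R14, an array stands in for a full-descending stack, and the addresses 0x100 and 0x200 as well as all function names are hypothetical values chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

static uint32_t lr;                      /* models the link register R14 */
static uint32_t stack_mem[STACK_WORDS];  /* models a full-descending stack */
static int sp = STACK_WORDS;             /* index of the most recently pushed word */

/* BL: store the return address in LR, overwriting whatever was there. */
void bl(uint32_t return_address)
{
    lr = return_address;
}

/* STMFD SP!, {LR}: push LR onto the stack. */
void push_lr(void)
{
    assert(sp > 0);                      /* host-side overflow check */
    stack_mem[--sp] = lr;
}

/* LDMFD SP!, {PC}: pop the saved return address. */
uint32_t pop_pc(void)
{
    assert(sp < STACK_WORDS);            /* host-side underflow check */
    return stack_mem[sp++];
}

/* subr without entry and exit sequences.
 * Returns the address that MOV PC, LR would jump to. */
uint32_t subr_without_save(void)
{
    bl(0x200);      /* nested BL overwrites LR with the inner return address */
    return lr;      /* MOV PC, LR now jumps back into subr itself */
}

/* subr that saves LR on entry and pops it into the PC on exit. */
uint32_t subr_with_save(void)
{
    push_lr();
    bl(0x200);      /* LR is overwritten, but the saved copy is on the stack */
    return pop_pc();
}
```

If the caller's BL is modeled as bl(0x100), subr_without_save() yields 0x200 (an address inside subr), while subr_with_save() yields 0x100, the caller's return address.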

2 Stack Frame-Based Subroutines

Although the method above meets the design requirements, it can feel unfamiliar to programmers who are used to x86 assembly language. An x86 assembly language subroutine has a standard stack structure, as shown in Figure 1. A notable feature of this structure is that the EBP register serves as a reference point for accessing parameters and local variables. For example, the first parameter is located at address [EBP + 8]. The advantage of a stack frame is that it unifies the programming style of assembly subroutines: parameters, return addresses, working registers, and local variables all have fixed positions, which improves code readability and makes the code easier to maintain.

Based on these considerations, the stack frame concept is introduced into the design of ARM assembly language subroutines, as shown in the following example. For illustration, assume that the prototype of subr is int subr(int a, int b, int c, int d, int e, int f);. According to APCS (the ARM Procedure Call Standard), parameters a-d are passed in registers R0-R3, and the remaining two parameters e and f are passed on the stack. The resulting stack frame structure is shown in Figure 2. Compared with the x86 frame structure in Figure 1, the only difference is that the positions of the local variables and the saved working registers are swapped. The reason for this difference is to take full advantage of ARM's multi-register load and store instructions.

caller
        ...                         ; code that passes parameters a-d is omitted
        MOV     R4, #2
        STR     R4, [SP, #-4]!      ; 1) push parameter f onto the stack
        MOV     R4, #1
        STR     R4, [SP, #-4]!      ; push parameter e onto the stack
        BL      subr                ; 2) call the subroutine subr
        ADD     SP, SP, #8          ; 8) balance the stack; subr returns here with its return value in R0

 

subr
        STMFD   SP!, {R4-R7, FP, LR}    ; 3) save the working registers, FP, and LR
        ADD     FP, SP, #16             ; 4) calculate the frame pointer
        SUB     SP, SP, #8              ; 5) allocate space for local variables
        LDR     R4, [FP, #8]            ; load parameter e
        LDR     R5, [FP, #12]           ; load parameter f
        ...                             ; subr subroutine body
        ADD     SP, SP, #8              ; 6) release the local variable space
        LDMFD   SP!, {R4-R7, FP, PC}    ; 7) restore the registers and return

Figure 1 x86 stack frame structure

Figure 2 Stack frame structure in ARM

The following steps describe in detail how the stack frame is built. The step numbers correspond to the numbers in the comments of the sample code:

1) The parameters required by the subroutine are normally pushed onto the stack with the instruction STR Rn, [SP, #-4]!. Note that under APCS, the first four parameters are passed in registers R0-R3, and the remaining parameters are pushed onto the stack in reverse order. This step can be omitted if all parameters fit in R0-R3.

2) The BL instruction saves the return address in LR (unlike the x86 CALL instruction, it does not push it onto the stack) and jumps to the specified subroutine. From this point on, all stack manipulation is done by the subroutine.

3) If the subroutine uses any of the working registers R4-R11, it must push them onto the stack. The old frame pointer FP and the link register LR are pushed as well. A single STMFD instruction does all of this efficiently.

4) Set the frame pointer FP so that it can be used to reference stack parameters and local variables. In this example, LDR R0, [FP, #8] references parameter e, and LDR R0, [FP, #-20] references the first local variable.

5) Allocate 8 bytes of stack space for the local variables of the subroutine. Skip this step if no local variables are needed. Unlike x86 processors with their CISC architecture, ARM processors with their RISC architecture have a large number of general-purpose registers, such as R0-R7 and LR in this example, so in most cases there is no need to allocate stack space for local variables.

6) If stack space was allocated for local variables, release it to keep the stack balanced.

7) Restore the registers saved in step 3 and return from the subroutine by loading the saved return address directly into the PC.

8) Execution resumes here after subr returns. This step is very important. Because caller pushed parameters e and f onto the stack before calling subr, it must remove them after subr returns to keep the stack balanced. If the subroutine is called from C, the compiler takes care of balancing the stack.
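
The offsets used in steps 1 to 5 can be checked with a small host-side C model. This is a sketch, not part of the original example: the 64-byte memory array, the helper names (store_word, load_word, push, build_frame), and the frame_t type are all assumptions made for illustration. It relies on the fact that STMFD stores the lowest-numbered register at the lowest address.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 64

/* Word-addressed memory standing in for the stack region (hypothetical). */
static uint32_t mem[STACK_BYTES / 4];

void store_word(uint32_t addr, uint32_t value)
{
    assert(addr % 4 == 0 && addr < STACK_BYTES);
    mem[addr / 4] = value;
}

uint32_t load_word(uint32_t addr)
{
    assert(addr % 4 == 0 && addr < STACK_BYTES);
    return mem[addr / 4];
}

/* STR Rn, [SP, #-4]!: push one word on a full-descending stack. */
uint32_t push(uint32_t sp, uint32_t value)
{
    sp -= 4;
    store_word(sp, value);
    return sp;
}

typedef struct {
    uint32_t sp;    /* SP as seen by the body of subr */
    uint32_t fp;    /* FP as seen by the body of subr */
} frame_t;

/* Performs steps 1-5 of the walkthrough on the model memory. */
frame_t build_frame(uint32_t e, uint32_t f, uint32_t caller_fp,
                    uint32_t return_address, const uint32_t r4_r7[4])
{
    frame_t fr;
    uint32_t sp = STACK_BYTES;          /* empty stack */
    int i;

    sp = push(sp, f);                   /* 1) f is pushed first ... */
    sp = push(sp, e);                   /*    ... so e ends up at the lower address */
    /* 2) BL places return_address in LR; nothing is pushed here. */
    /* 3) STMFD SP!, {R4-R7, FP, LR}: R4 lowest, LR highest. */
    sp = push(sp, return_address);      /* LR (R14) */
    sp = push(sp, caller_fp);           /* FP (R11) */
    for (i = 3; i >= 0; i--)
        sp = push(sp, r4_r7[i]);        /* R7 down to R4 */
    fr.fp = sp + 16;                    /* 4) ADD FP, SP, #16 */
    fr.sp = sp - 8;                     /* 5) SUB SP, SP, #8 */
    return fr;
}
```

In this model, FP points at the saved caller FP, [FP, #4] holds the return address, [FP, #8] and [FP, #12] hold e and f, [FP, #-16] holds the saved R4, and the two local variable slots are at [FP, #-20] and [FP, #-24].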

3 Interaction Between Assembly Language and C

Once the assembly subroutines are written, the next question is how to call them from C. Whatever language the code is written in, routines in different modules that call one another must follow a common convention for passing parameters and results. For ARM, this convention is the ARM Procedure Call Standard (APCS), which defines:

• The specific usage of the general-purpose registers
• The type of stack used
• The mechanism for passing parameters and results
• Support for the ARM shared library mechanism

Because compiler-generated code always follows APCS strictly, you only need to make sure that the hand-written assembly code also complies with APCS. The following example shows how to call a subroutine written in assembly language from C to implement a memory copy function. The development environment is RealView MDK3.22a.

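The file mymemcpy.s defines and exports mymemcpy:

; R0 = destination address; R1 = source address; R2 = number of bytes to copy
        AREA    Demo, CODE, READONLY
        EXPORT  mymemcpy
mymemcpy
        STMFD   SP!, {R4, LR}       ; save R4 and LR (LR is reused below as a scratch register)
        MOV     R3, R0              ; R3 = destination pointer; R0 is left intact as the return value
        MOV     R12, R1             ; R12 = source pointer
copy
        CMP     R2, #0              ; exit if the length is less than or equal to 0
        BLE     exit
        SUB     R2, R2, #0x1        ; one byte fewer left to copy
        LDRB    LR, [R12], #0x1     ; load a byte and advance the source pointer
        STRB    LR, [R3], #0x1      ; store the byte and advance the destination pointer
        B       copy
exit
        LDMFD   SP!, {R4, PC}       ; restore R4 and return
        END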

// main.c: test program
#include <string.h>

extern void *mymemcpy(void *dst, const void *src, size_t size);

int main(int argc, char **argv)
{
    const char *src = "First string-source ";
    char dst[] = "Second string-destination ";

    mymemcpy(dst, src, strlen(src) + 1);
    return (0);
}
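
To try the test program on a host machine without an ARM toolchain, the assembly routine can be replaced by a C stand-in. The sketch below is an assumption-laden illustration: mymemcpy_ref and copy_string are hypothetical names, and the NULL check exists only in this host version. The loop follows the register usage of the assembly code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C stand-in for mymemcpy: same arguments, same
 * byte-by-byte loop, and dst is returned just as R0 is left intact. */
void *mymemcpy_ref(void *dst, const void *src, size_t size)
{
    unsigned char *d = dst;             /* plays the role of R3 */
    const unsigned char *s = src;       /* plays the role of R12 */

    assert(dst != NULL && src != NULL); /* host-side check, not in the assembly */
    while (size > 0) {                  /* CMP R2, #0 and BLE exit */
        size--;                         /* SUB R2, R2, #1 */
        *d++ = *s++;                    /* LDRB and STRB with post-increment */
    }
    return dst;
}

/* Mirrors the call in main.c: copies the string including its terminator.
 * The caller must make sure that dst is large enough. */
char *copy_string(char *dst, const char *src)
{
    return mymemcpy_ref(dst, src, strlen(src) + 1);
}
```

One difference from the assembly version is worth noting: size_t is unsigned, so the C loop stops only at zero, whereas the BLE in the assembly treats the length as a signed value.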

The key to calling a C function from assembly language is passing the parameters correctly according to the C function prototype. The following example calls the C library function strcmp, whose prototype is int strcmp(const char *s1, const char *s2). It takes two pointer parameters, so R0 and R1 point to the first and second strings respectively. Because a C library function is used, select the Use MicroLIB option on the Target tab of the project options dialog box.

        AREA    |.text|, CODE, READONLY
        EXPORT  main                ; export main
        IMPORT  __main
        IMPORT  strcmp              ; import the strcmp function
main
        STMFD   SP!, {R4, LR}       ; save LR
        ADR     R0, big             ; pass parameter 1 in R0
        ADR     R1, small           ; pass parameter 2 in R1
        BL      strcmp              ; call the strcmp library function
        LDMFD   SP!, {R4, PC}       ; return; the result of strcmp is still in R0
big
        DCB     "big", 0
small
        DCB     "small", 0
        END
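
For comparison, the same call can be written in C, where the compiler places s1 in R0 and s2 in R1 and reads the result from R0. This is a minimal sketch; sign_of_compare is a hypothetical helper that reduces the result to -1, 0, or 1, because the C standard only defines the sign of the value returned by strcmp.

```c
#include <assert.h>
#include <string.h>

/* Calls strcmp the way the assembly main does and normalizes the result.
 * Returns -1 if s1 sorts before s2, 0 if they are equal, and 1 otherwise. */
int sign_of_compare(const char *s1, const char *s2)
{
    int result;

    assert(s1 != NULL && s2 != NULL);   /* strcmp requires valid strings */
    result = strcmp(s1, s2);            /* R0 = s1, R1 = s2, result in R0 */
    return (result > 0) - (result < 0);
}
```

For the strings in the assembly example, sign_of_compare("big", "small") is -1, because 'b' sorts before 's'.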
