MIPS compilation tips

Source: Internet
Author: User
Tags: MIPS instruction set

Instruction length and number of registers
All MIPS instructions are 32 bits long, and the instruction formats are simple. x86 is different: its instruction length is not fixed. On the 80386, for example, an instruction can be anywhere from 1 byte (for example, push) to 17 bytes long. Variable-length code is denser, so a MIPS binary is about 20% to 30% larger than the x86 equivalent. The advantage of fixed-length instructions and simple formats is that they are easy to decode and fit pipelined execution well. Because the register fields sit at fixed positions in the instruction, decoding and reading the registers can proceed at the same time; this is called fixed-field decoding.
MIPS has 32 general-purpose registers. The number of registers is tied to what the compiler needs. Register allocation is one of the most important compiler optimizations (perhaps the most important). Current register allocation algorithms are based on graph coloring. The basic idea is to build a graph that represents the candidates for register allocation and then use the graph to assign registers. Roughly speaking, the problem is to color the graph with a limited number of colors so that adjacent nodes get different colors. Solving the coloring problem exactly takes time exponential in the size of the graph, but some heuristic algorithms produce an allocation in nearly linear time. For global allocation, graph coloring works well when at least 16 general-purpose registers are available for integer variables, with additional registers for floating point. Graph coloring does not work well when there are few registers.
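The graph coloring idea described above can be sketched as follows. This is a minimal greedy heuristic written for illustration, not the allocator of any particular compiler; the function name color_interference_graph and the sample graph are assumptions. Nodes are variables, an edge means two variables are live at the same time, and colors are register numbers.

```python
def color_interference_graph(interference, num_registers):
    """Greedy coloring sketch: returns (assignment, spilled)."""
    assignment = {}
    spilled = []
    # Visit the most constrained variables (highest degree) first.
    order = sorted(interference, key=lambda v: (-len(interference[v]), v))
    for var in order:
        # Registers already taken by neighbours that have been colored.
        used = {assignment[n] for n in interference[var] if n in assignment}
        free = [r for r in range(num_registers) if r not in used]
        if free:
            assignment[var] = free[0]
        else:
            # No color left: this variable would live in memory (a spill).
            spilled.append(var)
    return assignment, spilled


# Hypothetical interference graph for four variables.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

assignment, spilled = color_interference_graph(graph, num_registers=3)
print(sorted(assignment.items()), spilled)
```

With three registers every variable gets one; with only two, one variable has to be spilled, which mirrors the remark that graph coloring suffers when registers are scarce.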
Q: If fewer than 16 registers will not do, why not 64?
A: Using 64 or more registers needs more instruction bits to encode the register numbers, and it also increases the cost of a context switch. Except for very complex functions, 32 registers are enough to hold the frequently used data, so more registers are not necessary. Computer design also has a principle called "smaller is faster", but that does not mean 31 registers would be better than 32; 32 general-purpose registers is the common choice.
Instruction format
All MIPS instructions have the same length, 32 bits. To keep that fixed length while fitting different kinds of operations, the designers made a compromise: every instruction has the same length, but different instructions use different formats. There are three MIPS instruction formats: R, I, and J. Each format is made up of several fields, as shown below:
I-type instructions
    6      5     5           16
+------+-----+-----+------------------+
|  op  | rs  | rt  |    immediate     |
+------+-----+-----+------------------+
Loads and stores of bytes, halfwords, words, and doublewords
Conditional branches; ALU operations with an immediate operand
R-type instructions
    6      5     5     5      5       6
+------+-----+-----+-----+-------+-------+
|  op  | rs  | rt  | rd  | shamt | funct |
+------+-----+-----+-----+-------+-------+
Register-register ALU operations
Reads and writes of special registers
Jump register, jump and link register
J-type instructions
    6                 26
+------+------------------------------+
|  op  |         jump address         |
+------+------------------------------+
Jump, jump and link
(In MIPS, traps and returns from exceptions have their own encodings and are not J-type.)
Meaning of each field:
op: the operation code.
rs: the first source operand register.
rt: the second source operand register.
rd: the destination register, which receives the result of the operation.
shamt: the shift amount.
funct: the function code. This field selects the specific variant of the operation named by op.
Every instruction is encoded in one of these three formats. Fields that are common to several formats sit at the same position in each.
Because the instructions are fixed-length and the formats are simple, the encoding is very regular and the machine code is easy to read off. For example:
add $t0, $s0, $s1
means $t0 = $s0 + $s1: the contents of register 16 ($s0) are added to the contents of register 17 ($s1), and the result is placed in register 8 ($t0).
The fields in decimal:
+------+------+------+------+-------+-------+
|  op  |  rs  |  rt  |  rd  | shamt | funct |
+------+------+------+------+-------+-------+
|  0   |  16  |  17  |  8   |   0   |  32   |
+------+------+------+------+-------+-------+
op = 0 and funct = 32 identify the instruction as an addition; 16 = $s0 says the first source operand (rs) is in register 16; 17 = $s1 says the second source operand (rt) is in register 17; and 8 = $t0 says the destination operand (rd) is register 8.
The same fields in binary:
+--------+-------+-------+-------+-------+--------+
| 000000 | 10000 | 10001 | 01000 | 00000 | 100000 |
+--------+-------+-------+-------+-------+--------+
This is the machine code of the instruction above, and its regularity is plain to see.
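The field packing worked through above can be checked with a few lines of Python. This is only an illustrative sketch; the helper name encode_r_type is an assumption, and the field widths are the R-type widths 6, 5, 5, 5, 5, 6 given earlier.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the six R-type fields into one 32-bit instruction word."""
    fields = [(op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        # Shift what we have so far left and append the next field.
        word = (word << width) | value
    return word


# add $t0, $s0, $s1  ->  op=0, rs=16 ($s0), rt=17 ($s1), rd=8 ($t0), shamt=0, funct=32
word = encode_r_type(op=0, rs=16, rt=17, rd=8, shamt=0, funct=32)
print(f"{word:032b}")   # 00000010000100010100000000100000
print(f"0x{word:08x}")  # 0x02114020
```

The binary string is the six fields of the table above written one after another.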
General-purpose registers (GPR)
There are 32 general-purpose registers, from $0 to $31:
$0: that is, $zero. This register always reads as zero, which gives a compact encoding for the useful constant 0. The MIPS compiler combines instructions such as slt, beq, and bne with the 0 supplied by $0 to produce every comparison condition: equal, not equal, less than, less than or equal, greater than, greater than or equal. The add instruction can also be used to build the move pseudo-instruction, that is,
move $t0, $t1
is actually
add $t0, $0, $t1
Jiao Lin, a senior colleague, mentioned that he ran into an error with the move instruction when porting FPC and used add instead.
Pseudo-instructions simplify the job and give assembly programs a richer instruction set than the hardware provides.
$1: that is, $at. This register is reserved for the assembler. As just mentioned, pseudo-instructions simplify the job, but the price is one register reserved for the assembler, namely $at. Because the immediate field of an I-type instruction is only 16 bits, the compiler or assembler has to split a large constant into pieces and reassemble it in a register when loading it. For example, loading a 32-bit immediate takes a lui followed by an ori (or an addi, with care for sign extension). In MIPS, splitting and reassembling large constants is done by the assembler, which needs a temporary register to do so. This is one of the reasons $at is reserved.
$2..$3: ($v0-$v1) hold the non-floating-point results or return values of a subroutine. There is a set of conventions for how subroutines pass parameters and return results: the contents of a few stack locations are loaded into CPU registers, and the corresponding memory locations are left undefined. When these two registers are not enough to hold the return value, the compiler returns it through memory.
$4..$7: ($a0-$a3) pass the first four arguments to a subroutine; any further arguments go on the stack. $a0-$a3, $v0-$v1, and $ra together support subroutine calls: they pass the arguments, return the results, and hold the return address respectively. When more registers are needed, the stack is used. The MIPS compiler always reserves stack space for the arguments in case they need to be stored.
$8..$15: ($t0-$t7) temporary registers. A subroutine may use them without preserving them.
$16..$23: ($s0-$s7) saved registers. Their values must be preserved across a procedure call: a callee that uses them saves and restores them, and the same holds for $fp and $ra. MIPS provides both temporary and saved registers, which reduces register spilling (spilling is moving less frequently used variables to memory). When compiling a leaf procedure (one that calls no other procedure), the compiler uses the saved registers only after the temporary registers have all been allocated.
$24..$25: ($t8-$t9) same as $t0-$t7.
$26..$27: ($k0, $k1) reserved for the operating system and exception handling; at least one must be reserved. An exception (or interrupt) is a procedure call that does not appear explicitly in the program. MIPS has an exception program counter (EPC) register, which belongs to coprocessor 0 (CP0) and holds the address of the instruction that caused the exception. The only way to examine a control register is to copy it into a general-purpose register. mfc0 (move from system control) copies the address in EPC into a general-purpose register, and a jump register instruction (jr) then returns to the instruction that caused the exception so that execution can continue.
A closer look reveals something interesting. To read EPC and jump back to the instruction that caused the exception (with jr), the EPC value must sit in a general-purpose register. So when the handler returns to the interrupted program, it cannot restore every register to its original value. If it restores all the registers first, the value copied from EPC is lost and jr cannot return to the interrupted program. If it restores every register except the one holding the return address copied from EPC, then a register changes for no apparent reason whenever the program takes an exception, which is unacceptable. To escape this dilemma, MIPS programmers must reserve two registers, $k0 and $k1, for the operating system. Their values are not restored after an exception, and the compiler never uses them. The exception handler can place the return address in one of these two registers and then use jr to jump back to the instruction that caused the exception and continue execution.
$28: ($gp) C has two storage classes: automatic and static. Automatic variables are local to a procedure. Static variables persist across procedure entry and exit. To simplify access to static data, MIPS reserves a register, the global pointer ($gp). Without a global pointer, loading from static data takes two instructions: one to form the significant bits of the 32-bit address constant computed by the compiler and linker, and one to actually load the data. The global pointer holds an address in the static data area that is fixed at run time, so data within 32 KB on either side of the $gp value can be accessed with a single $gp-relative instruction. Such data must therefore lie within the 64 KB window around $gp.
$29: ($sp) MIPS hardware does not support a stack directly; for example, it has nothing like the x86 SS, SP, and BP registers. Although MIPS designates $29 as the stack pointer, it is an ordinary general-purpose register with a conventional role. You could use it for something else, but to use other people's programs or let others use yours you must follow the convention; it has nothing to do with the hardware. x86 has dedicated push and pop instructions and MIPS does not, but this does not stop MIPS from using a stack. On a procedure call, the caller pushes onto the stack any registers it will need after the call, and the callee pushes the return address register $ra and the saved registers it uses, adjusting the stack pointer as it does so. On return, the registers are restored from the stack and the stack pointer is readjusted.
$30: ($fp) The GNU MIPS C compiler uses this register as a frame pointer. The SGI C compiler does not use a frame pointer and treats the register as a saved register ($s8), which saves call and return overhead but makes code generation more complex.
$31: ($ra) holds the return address. MIPS has a jal (jump-and-link) instruction, which puts the address of the next instruction in $ra as it jumps to the target. For example, the caller puts the arguments in $a0-$a3, and then jal X jumps to procedure X. When the procedure finishes, it puts the results in $v0 and $v1 and returns with jr $ra.
The registers that may need to be saved around a call are $a0-$a3, $s0-$s7, $gp, $sp, $fp, and $ra.
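As the $at entry above notes, a 32-bit constant does not fit in a 16-bit immediate and has to be split and reassembled. The arithmetic can be sketched in Python as below; the helper names split_constant and lui_ori are assumptions made for illustration, and the sketch models only the values involved, not a real assembler.

```python
def split_constant(value):
    """Split a 32-bit constant into the 16-bit halves used by a lui/ori pair."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("expected an unsigned 32-bit value")
    upper = value >> 16      # immediate for lui
    lower = value & 0xFFFF   # immediate for ori
    return upper, lower


def lui_ori(upper, lower):
    """Model the register contents after lui (upper) and then ori (lower)."""
    reg = (upper << 16) & 0xFFFFFFFF  # lui: upper half set, low 16 bits cleared
    reg |= lower & 0xFFFF             # ori: immediate is zero-extended
    return reg


upper, lower = split_constant(0xABCDEF00)
print(hex(upper), hex(lower))        # 0xabcd 0xef00
print(hex(lui_ori(upper, lower)))    # 0xabcdef00
```

ori is used here because its immediate is zero-extended; with addi the lower half is sign-extended, so the upper half would need adjusting whenever bit 15 of the lower half is set.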
Jump range
The address field of a J-type instruction is 26 bits and gives the jump target. Instructions are 4-byte aligned in memory, so the two least significant bits need not be stored. In MIPS, the two low-order bits of every address select a byte within a word, and cache index mapping does not use them. The field can therefore express a 28-bit byte address, which allows an address range of 256 MB. The PC is 32 bits, so where do the other four bits come from? A MIPS jump instruction replaces only the low 28 bits of the PC and leaves the high 4 bits unchanged. The loader and linker must therefore keep a program from straddling a 256 MB boundary. Within a 256 MB segment, the jump address is treated as an absolute address, independent of the PC. A jump beyond 256 MB (out of the segment) uses the jump register instruction.
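The address arithmetic for a J-type jump can be sketched as follows. The function names are assumptions made for illustration. The sketch takes the high 4 bits from PC + 4, the address of the instruction after the jump, which is how the MIPS architecture defines it; for almost every jump this is the same 256 MB segment as the jump itself.

```python
def jump_field(address):
    """26-bit field stored in the instruction: the byte address without its two low zero bits."""
    if address & 0x3:
        raise ValueError("jump targets must be 4-byte aligned")
    return (address >> 2) & 0x03FFFFFF


def jump_target(pc, field26):
    """Target address: high 4 bits from PC + 4, low 28 bits from the field shifted left by 2."""
    return ((pc + 4) & 0xF0000000) | ((field26 & 0x03FFFFFF) << 2)


def same_segment(pc, address):
    """True if the address can be reached from pc with a plain J-type jump."""
    return ((pc + 4) & 0xF0000000) == (address & 0xF0000000)


pc = 0x00400000
target = 0x00400100
field = jump_field(target)
print(hex(field))                        # 0x100040
print(hex(jump_target(pc, field)))       # 0x400100
print(same_segment(pc, 0x10000000))      # False: needs jump register instead
```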
Similarly, the 16-bit immediate in a conditional branch instruction is too small to hold a full address, so PC-relative addressing is used: the branch offset in the instruction is added to (PC + 4) to give the branch target. This works well because loops and if statements are generally much smaller than 2^16 words (2 to the power of 16).
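The PC-relative calculation can be sketched in Python as below. The helper names are assumptions made for illustration; the 16-bit field is treated as a signed count of words (instructions) relative to PC + 4.

```python
def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def branch_target(pc, imm16):
    """Branch target = (PC + 4) + signed word offset * 4, wrapped to 32 bits."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


pc = 0x00400010
print(hex(branch_target(pc, 3)))        # 0x400020: three instructions past PC + 4
print(hex(branch_target(pc, 0xFFFF)))   # 0x400010: offset -1 branches back to itself
```

Because the offset counts words, a 16-bit field reaches 2^15 instructions backward and 2^15 - 1 instructions forward from PC + 4.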

 

Number  Name    Usage
0       zero    always returns 0
1       at      temporary used by the assembler
2-3     v0-v1   results returned from a subroutine call
4-7     a0-a3   arguments of a subroutine call
8-15    t0-t7   temporaries; a subroutine may use them without saving and restoring them
24-25   t8-t9   temporaries, same as t0-t7
16-23   s0-s7   subroutine register variables; a subroutine must save the ones it uses and restore them before it returns, so that the calling function knows their values have not changed
26-27   k0-k1   usually used by interrupt or exception handlers to hold some system parameters
28      gp      global pointer; some run-time systems maintain it to make "static" and "extern" variables easy to access
29      sp      stack pointer
30      s8/fp   ninth register variable; a subroutine may use it as a frame pointer
31      ra      return address of a subroutine

The use of these registers follows a set of conventions. The conventions have nothing to do with the hardware, but if you want to work with other people's code, compilers, and operating systems, you had better follow them.

Register name conventions and usage

* at: this register is used by some of the synthesized instructions the assembler generates. If you need to use it explicitly (for example, to save and restore registers in an exception handler), there is an assembler directive that stops the assembler from using at after the directive (but some assembler macro instructions then become unavailable).

* v0, v1: hold the non-floating-point results or return values of a subroutine (function). If these two registers are not enough to hold the value to be returned, the compiler returns it through memory. For details, see section 10.1.

* a0-a3: pass the first four non-floating-point arguments of a subroutine call. In some cases this is not exactly true. For details, see section 10.1.

* t0-t9: by convention, a subroutine may use these registers without saving them. They make very good temporaries during expression evaluation. The compiler or programmer must keep in mind that a called subroutine may destroy the values in these registers.

* s0-s8: by convention, a subroutine must guarantee that when it returns, these registers hold the values they had before the call. It either leaves them untouched or saves them on the stack and restores them on exit. This convention makes these registers well suited to register variables or to values that must survive a function call.

* k0, k1: used by the OS exception or interrupt handler. Their original values are not restored after use, so they are rarely used anywhere else.

* gp: if a global pointer is present, it points into the run-time static data area. This means that, with gp as the base pointer, a single instruction can access data within 32 KB on either side of gp. Without a global pointer, accessing a value in the static data area takes two instructions: one to obtain the 32-bit address constant determined by the compiler and loader, and one to actually access the data. To use gp, the compiler must know at compile time whether a datum lies within the 64 KB range of gp. In general it cannot know this and can only guess. The usual approach is to place small global data (for example, variables of 8 bytes or less) within the range gp covers, and to have the linker report an error if the small global data is still too large and exceeds the range gp can reach as a base pointer.

Not every compiler and run-time system supports the use of gp.

* sp: the stack pointer must be moved up and down by explicit instructions. MIPS therefore usually adjusts the stack pointer only on subroutine entry and exit, and the called subroutine does it. sp is usually adjusted to the lowest stack address the called subroutine needs, so that the compiler can address stack variables by an offset from sp. For details, see section 10.1 on stack usage.

* fp: the other conventional name for fp is s8. If a subroutine wants to grow its stack dynamically at run time, fp can serve as a frame pointer that records the state of the stack. Some programming languages support this explicitly. Assembly programmers often use fp this way. The C library function alloca() uses fp to adjust the stack dynamically.

If the bottom of the stack cannot be determined at compile time, stack variables cannot be accessed through sp. fp is therefore initialized to a constant position relative to the function's stack frame. This usage is invisible to other functions.

* ra: whenever a subroutine is called, the return address is stored in ra, so the last instruction of a subroutine is generally jr ra.

A subroutine that calls other subroutines must save the value of ra, usually on the stack.

There is also a standard convention for the use of the floating-point registers, which is covered in section 7.5. This completes the introduction to the registers that MIPS provides.
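The 32 KB and 64 KB figures in the gp entry above follow from the signed 16-bit offset of a load or store instruction. A minimal sketch of the range check a linker might apply is shown below; the function name gp_reachable and the sample gp value are assumptions chosen for illustration.

```python
GP_OFFSET_MIN = -(1 << 15)      # -32768, the most negative signed 16-bit offset
GP_OFFSET_MAX = (1 << 15) - 1   # +32767, the most positive signed 16-bit offset


def gp_reachable(address, gp):
    """True if address can be accessed with a single gp-relative load or store."""
    offset = address - gp
    return GP_OFFSET_MIN <= offset <= GP_OFFSET_MAX


gp = 0x10008000  # hypothetical gp value placed in the middle of a 64 KB data area
print(gp_reachable(0x10000000, gp))   # True: offset -32768
print(gp_reachable(0x1000FFFF, gp))   # True: offset +32767
print(gp_reachable(0x10010000, gp))   # False: offset +32768 is out of range
print(GP_OFFSET_MAX - GP_OFFSET_MIN + 1)  # 65536 bytes, the 64 KB window
```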

 

1. The MIPS instruction set really is very good. Data movement has only three classes: load, store, and move. Of course, each is split by operand size into many variants such as lw and lh, but in essence there are three. Each completes only a basic function and is divided into sub-instructions by operand size. There are even fewer jump classes: a jump is either unconditional or conditional on the operands. These instructions really are the most commonly used 80%. By comparison, on Intel I rarely use instructions such as lea out of personal habit, and I have almost never used instructions such as aad and aaa.

2. MIPS has few instructions, but the assembler defines many pseudo-instructions, such as li and ror, which are ultimately expanded into several real instructions. The advantage is that this saves effort. The disadvantages are that it demands more of the assembler and that it is hard to recover pseudo-instructions from machine instructions after disassembly (faced with lui $at, 0xabcd and ori r, $at, 0xef00, a disassembler does not seem able to recognize them as li r, 0xabcdef00). Disassembly also produces a large number of instructions, which makes hacking harder (though perhaps that is a good thing).

3. MIPS addressing is the simplest possible. There is only register-plus-offset addressing (not counting embedded 16-bit immediates), which is a great relief for anyone who has suffered under Intel's eight addressing modes.

4. MIPS has no stack instructions, although by convention $sp is the stack pointer. The stack must be managed by hand for recursive calls. There is no call instruction that pushes the return address automatically when calling a subroutine; only jal is available. This is another nightmare for people used to Intel's push and pop.

5. MIPS implements memory mapping, interrupts, and similar functions in coprocessor 0, and floating-point operations in coprocessor 1.

6. MIPS has many registers, which helps expression evaluation but complicates the allocation algorithm. In addition, although the registers have conventional uses, nothing actually enforces them.

7. MIPS instructions are fixed-length and uniform, which I find very pleasing.

In the end, my personal experience is that the MIPS architecture calls for a different way of thinking. Because the stack is managed by hand, you need not worry about whether pushes and pops match or about operand sizes, but manual stack management demands a very clear head. Because there are more registers, register allocation and making full use of every register matter more. Addressing is nothing to worry about. MIPS offers more flexibility, so design can be freer, but that also makes communication and learning harder, which is completely different from Intel's strict architecture.


As for where MIPS is headed: because the MIPS instruction set is simple, easy to design and implement, and can be made small, MIPS is suited not only to embedded systems but also to multi-core designs that raise parallelism for high-concurrency applications such as servers. On the desktop it currently has no clear advantage over x86: for one thing there are few MIPS applications, and the instruction set is so spare that it is not very friendly to programmers.
