OR1200 Instruction Cache Usage Example

Source: Internet
Author: User

The following is an excerpt from the book "Step-by-step core-Soft core processor internal design analysis".

12.4 Special Registers in the ICache

The ICache interface shows that the ICache has a special register and that this register is not readable. The special registers implemented by the ICache in the OR1200 processor are listed in Table 12.1.

As the table shows, only one special register is implemented: ICBIR (Instruction Cache Block Invalidate Register), which is not readable. In general, a special register address is 16 bits wide: the high 5 bits are the group number, and the low 11 bits are the index of the special register within the group. The index of ICBIR given in Table 12.1, however, can be arbitrary, because ICBIR is the only special register implemented in group 4. As long as the high 5 bits of the special register address are 0x4, the register must be ICBIR, so there is no need to check the index. For the same reason, Figure 12.5 shows that the ICache has no spr_addr interface.
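The address split described above can be sketched in a few lines of Python. This is only an illustration: the function names `split_spr_address` and `is_icbir` are made up for this example and do not come from the OR1200 source.

```python
from typing import Tuple


def split_spr_address(spr_addr: int) -> Tuple[int, int]:
    """Split a 16-bit special register address into (group, index)."""
    if not 0 <= spr_addr <= 0xFFFF:
        raise ValueError("SPR address must fit in 16 bits")
    group = spr_addr >> 11       # high 5 bits: group number
    index = spr_addr & 0x7FF     # low 11 bits: index within the group
    return group, index


def is_icbir(spr_addr: int) -> bool:
    """Group 4 implements only ICBIR here, so the index is ignored."""
    group, _ = split_spr_address(spr_addr)
    return group == 0x4


if __name__ == "__main__":
    # 0x2000, 0x1200 and 0x11 are the SPR addresses used by the demo program.
    for addr in (0x2000, 0x1200, 0x1280, 0x11):
        group, index = split_spr_address(addr)
        print(f"0x{addr:04x}: group={group} index=0x{index:03x} "
              f"icbir={is_icbir(addr)}")
```

Any address whose high 5 bits are 0x4 is treated as ICBIR in this sketch, which is why the demo program can simply use 0x2000.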

The format of ICBIR is shown in Table 12.2.

Suppose an address addr is written to ICBIR. Then V of line addr[12:4] in the ICache directory table is set to 0, indicating that the line is invalid. In the actual implementation, V of entry addr[12:4] in ic_tag is set to 0.
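As a rough software model of this behavior (not the OR1200 Verilog), the sketch below keeps one valid bit per ic_tag entry and clears the one selected by addr[12:4]. The class name `IcTagModel` and the use of `None` for an indeterminate bit are assumptions made for illustration.

```python
NUM_ENTRIES = 512   # number of ic_tag entries, as stated in the text


def line_index(addr: int) -> int:
    """Return addr[12:4], the ic_tag entry selected by an address."""
    return (addr >> 4) & 0x1FF


class IcTagModel:
    """Hypothetical model that tracks only the V bit of each ic_tag entry."""

    def __init__(self) -> None:
        # None stands for "indeterminate", like RAM contents after power-up.
        self.valid = [None] * NUM_ENTRIES

    def write_icbir(self, addr: int) -> None:
        """Model a write of addr to ICBIR: clear V of entry addr[12:4]."""
        self.valid[line_index(addr)] = 0


if __name__ == "__main__":
    tags = IcTagModel()
    tags.write_icbir(0x140)
    print(hex(line_index(0x140)), tags.valid[line_index(0x140)])
```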

12.5 Icache Usage Scenarios

As with the analysis of the MMU in Chapter 10, this chapter uses scenario-based analysis: the ICache module is analyzed by examining the scenarios in which the ICache is used.

The scenarios in which the ICache is used are as follows:

(1) The l.mtspr instruction writes ICBIR

(2) An ICache miss occurs during instruction fetch

(3) An ICache hit occurs during instruction fetch

(4) During instruction fetch, caching is inhibited for the memory page that holds the target instruction

Of these four scenarios, scenario (1) uses the ICache in the execute stage of the pipeline, while scenarios (2), (3), and (4) use the ICache in the instruction fetch stage.

The next section gives a demo program that involves all of the ICache usage scenarios. The working process of the ICache in each scenario is then analyzed with the help of this program, in order to understand the ICache code and its principles.

12.6 Analysis Use Cases

This section presents a demo program that covers all of the ICache usage scenarios. The program runs on the simple SOPC built in Chapter 11. The code is as follows:

    .section .text, "ax"
    .global _start
    .org 0x100

    ####################### Step 1 ########################
    _start:
        # Initialize r0-r3, all zeroed. This also makes it possible to observe
        # the number of clock cycles needed to execute an l.addi instruction.
        l.movhi r0,0x0
        l.addi  r1,r0,0x0
        l.addi  r2,r0,0x0
        l.addi  r3,r0,0x0

    ####################### Step 2 ########################
    _ic_init:
        # Initialize the ICache: set V of all 512 entries in ic_tag to 0
        # by writing 0x0, 0x10, 0x20 ... 0x2000 to ICBIR in turn.
        l.mtspr r0,r1,0x2000     # The in-group index of ICBIR is arbitrary; only the
                                 # high 5 bits of the address must be 0x4, so 0x2000 is used
        l.sfeqi r1,0x2000        # r1 == 0x2000 means all 512 entries of ic_tag are done
        l.bnf   _ic_init         # If r1 != 0x2000, setup is not finished; keep looping
        l.addi  r1,r1,0x10       # Delay slot instruction: r1 = r1 + 0x10
        l.movhi r1,0x0           # After ICache initialization, clear r1 because
                                 # a later step uses r1

    ####################### Step 3 ########################
        l.ori   r3,r0,0x1        # r3 = 0x1
        l.mtspr r0,r3,0x1200     # Write 0x1 to the SPR at address 0x1200, which is
                                 # ITLBW0MR0, so entry 0 of the MR table holds 0x1:
                                 # VPN is 0, V is 1
        l.ori   r3,r0,0x00c0     # r3 = 0x00C0
        l.mtspr r0,r3,0x1280     # Write 0x00C0 to the SPR at address 0x1280, which is
                                 # ITLBW0TR0, so entry 0 of the TR table holds 0x00C0:
                                 # PPN is 0x0, SXE is 1, UXE is 1, CI is 0
        # With these settings, effective addresses 0x0-0x1fff are translated to
        # physical addresses 0x0-0x1fff; the two are equal.

    ####################### Step 4 ########################
        l.ori   r3,r0,0x8051     # r3 = 0x8051
        l.mtspr r0,r3,0x11       # Set SR to 0x8051, i.e. SR[IME] and SR[ICE] are 1,
                                 # which enables the IMMU and the ICache
        l.nop

    ####################### Step 5 ########################
    _loop:
        # The ICache is now enabled. The first time this loop runs, an ICache miss
        # occurs when the 1st instruction is fetched, and the memory block holding
        # it (16 bytes, 4 instructions) is read in. The 2nd, 3rd, and 4th
        # instructions then hit in the ICache, as do all later iterations.
        l.addi  r1,r1,0x1        # r1 = r1 + 1 on every iteration
        l.sfeqi r1,0x10          # Check whether the loop has run 16 times
        l.bnf   _loop            # Exit the loop after 16 iterations
        l.addi  r2,r2,0x1        # r2 records the total number of iterations

    ####################### Step 6 ########################
        l.addi  r1,r0,0x0        # Clear r1
        l.ori   r3,r0,0x00c2
        l.mtspr r0,r3,0x1280     # Write 0x00C2 to the SPR at address 0x1280, which is
                                 # ITLBW0TR0, so entry 0 of the TR table holds 0x00C2:
                                 # PPN is 0x0, SXE is 1, UXE is 1, CI is 1, i.e. caching
                                 # is inhibited for the page at 0x0-0x1fff
        l.j     _loop            # Caching is now inhibited for the page at 0x0-0x1fff;
                                 # run the loop of step 5 again and observe the effect

The program above can be divided into 6 steps. The main work of each step is as follows:

Step 1: Initialize registers r0-r3, all zeroed.

Step 2: Initialize the ICache. The main body of ic_tag in the ICache is a single-port RAM, and the contents of this RAM are indeterminate when the system starts. The ICache therefore has to be initialized, which is done by setting V of every entry in ic_tag to 0 to mark the entry as invalid. The l.mtspr instruction writes 0x0, 0x10, 0x20 ... 0x2000 to the special register ICBIR in sequence, and each write sets V of one ic_tag entry to 0, as described earlier in the introduction of ICBIR.
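The initialization loop of step 2 can be traced with a small Python model. It assumes, as in the assembly listing, that the delay slot instruction runs on every iteration and that each ICBIR write clears V of entry addr[12:4]; the function name `run_ic_init` is hypothetical.

```python
def run_ic_init():
    """Trace the _ic_init loop and return (valid bits, ICBIR writes, final r1)."""
    valid = [None] * 512              # None = indeterminate RAM contents
    r1 = 0
    writes = 0
    while True:
        valid[(r1 >> 4) & 0x1FF] = 0  # l.mtspr r0,r1,0x2000 (write r1 to ICBIR)
        writes += 1
        flag = (r1 == 0x2000)         # l.sfeqi r1,0x2000
        r1 += 0x10                    # l.addi in the delay slot, always executed
        if flag:                      # l.bnf is not taken once the flag is set
            break
    return valid, writes, r1


if __name__ == "__main__":
    valid, writes, r1 = run_ic_init()
    print(f"writes={writes} final r1=0x{r1:x} "
          f"all invalid={all(v == 0 for v in valid)}")
```

In this model the loop performs 513 writes: the values 0x0 through 0x1ff0 cover all 512 entries, and the final write of 0x2000 selects entry 0 again because only addr[12:4] is used.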

Step 3: Set the first entry of the ITLB so that effective addresses 0x0-0x1fff are translated to physical addresses 0x0-0x1fff, that is, the effective address equals the physical address. The demo program is very short: it starts at address 0x100 and does not go beyond 0x1fff, so while it runs, IMMU address translation always yields a physical address equal to the effective address. The role played by the IMMU shows up in step 6.
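The two values written in step 3 can be decoded with the sketch below. The bit positions used (V in bit 0 of the match register; CI in bit 1, SXE in bit 6, UXE in bit 7 of the translate register; page number in bits 31:13) are assumptions based on the OR1K register layout as I understand it, and they agree with the field values quoted in the program comments. The helper names are hypothetical, and the single-entry lookup is a simplification of a real ITLB.

```python
PAGE_SHIFT = 13   # 0x0-0x1fff is one page, so the page offset is 13 bits


def decode_itlb_mr(value: int) -> dict:
    """Decode the fields of ITLBW0MR0 that the demo program uses."""
    return {"V": value & 1, "VPN": value >> PAGE_SHIFT}


def decode_itlb_tr(value: int) -> dict:
    """Decode the fields of ITLBW0TR0 that the demo program uses."""
    return {
        "CI": (value >> 1) & 1,
        "SXE": (value >> 6) & 1,
        "UXE": (value >> 7) & 1,
        "PPN": value >> PAGE_SHIFT,
    }


def translate(ea: int, mr: int, tr: int):
    """Translate an effective address with one TLB entry; None means no match."""
    match = decode_itlb_mr(mr)
    if not match["V"] or (ea >> PAGE_SHIFT) != match["VPN"]:
        return None
    offset = ea & ((1 << PAGE_SHIFT) - 1)
    return (decode_itlb_tr(tr)["PPN"] << PAGE_SHIFT) | offset


if __name__ == "__main__":
    print(decode_itlb_mr(0x1), decode_itlb_tr(0x00C0))
    print(hex(translate(0x140, 0x1, 0x00C0)))
```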

Step 4: Set the SR register to enable the IMMU and the ICache.
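To see which bits the value 0x8051 sets, the sketch below decodes a few SR flags. The bit positions (SM 0, DCE 3, ICE 4, DME 5, IME 6, FO 15) are my understanding of the OR1K SR layout rather than something stated in the text, which only names IME and ICE; treat them as assumptions.

```python
# Assumed SR bit positions; only IME and ICE are named in the text.
SR_BITS = {"SM": 0, "DCE": 3, "ICE": 4, "DME": 5, "IME": 6, "FO": 15}


def sr_flags(sr: int) -> dict:
    """Return the value of each known flag in an SR value."""
    return {name: (sr >> bit) & 1 for name, bit in SR_BITS.items()}


if __name__ == "__main__":
    for name, value in sr_flags(0x8051).items():
        print(f"SR[{name}] = {value}")
```

Under these assumptions the instruction cache and instruction MMU are enabled, while their data-side counterparts stay disabled.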

Step 5: This is a loop body with only 4 instructions. Their addresses work out to 0x140, 0x144, 0x148, and 0x14c, so all 4 instructions lie in the same memory block.
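The claim that the four loop instructions share one memory block can be checked with a little address arithmetic, assuming 16-byte blocks and the addr[12:4] indexing described for ICBIR. The helper names are made up for this example.

```python
BLOCK_BYTES = 16   # one memory block holds 4 instructions of 4 bytes each


def block_base(addr: int) -> int:
    """Return the address of the first byte of the block containing addr."""
    return addr & ~(BLOCK_BYTES - 1)


def tag_index(addr: int) -> int:
    """Return addr[12:4], the ic_tag entry that covers addr."""
    return (addr >> 4) & 0x1FF


LOOP_ADDRESSES = (0x140, 0x144, 0x148, 0x14C)

if __name__ == "__main__":
    for addr in LOOP_ADDRESSES:
        print(f"0x{addr:03x}: block base 0x{block_base(addr):03x}, "
              f"ic_tag entry 0x{tag_index(addr):02x}")
```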

The first time the loop body runs, an ICache miss occurs when the 1st instruction of the loop body is fetched, and all four instructions (because they are in the same memory block) are read into the ICache. The data held in the ICache at this point is shown in Figure 12.9. As a result, fetching the 2nd, 3rd, and 4th instructions of the loop body hits in the ICache, and so does every fetch in the following iterations. Each instruction in the loop body therefore needs only one clock cycle. The loop runs 16 times and then exits.

Step 6: Set the first entry of the ITLB again so that the page attribute flag CI is 1, that is, caching is inhibited, and then jump to the loop body of step 5. The expected behavior of the loop body at this point is that, although the required instructions are already in the ICache, they are still fetched from RAM because caching is inhibited. As in a system without an ICache, each instruction then takes multiple clock cycles.
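The miss, hit, and cache-inhibited cases of steps 5 and 6 can be illustrated with a deliberately simple behavioral model. It tracks only which 16-byte blocks are present, ignores tags, timing, and the fetches of the instructions outside the loop, and uses names (`ICacheModel`, `run_pass`) invented for this sketch.

```python
class ICacheModel:
    """Toy model: remembers which 16-byte blocks have been read in."""

    def __init__(self) -> None:
        self.valid_blocks = set()

    def fetch(self, addr: int, ci: int) -> str:
        """Classify one instruction fetch as 'uncached', 'hit' or 'miss'."""
        if ci:
            # Caching inhibited: fetch from RAM even if the block is cached.
            return "uncached"
        block = addr >> 4
        if block in self.valid_blocks:
            return "hit"
        self.valid_blocks.add(block)   # a miss reads the whole block in
        return "miss"


def run_pass(cache: ICacheModel, ci: int, iterations: int = 16) -> dict:
    """Fetch the four loop instructions `iterations` times and count outcomes."""
    counts = {"hit": 0, "miss": 0, "uncached": 0}
    for _ in range(iterations):
        for addr in (0x140, 0x144, 0x148, 0x14C):
            counts[cache.fetch(addr, ci)] += 1
    return counts


if __name__ == "__main__":
    cache = ICacheModel()
    print("CI=0:", run_pass(cache, ci=0))
    print("CI=1:", run_pass(cache, ci=1))
```

In this model the first pass sees one miss followed by hits only, while the second pass bypasses the cache for every fetch even though the block is still present.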

r1 is cleared before step 6 jumps to the loop body of step 5, so r1 in the loop body again counts from 0x1 to 0x10. r2, however, keeps incrementing from its previous value; that is, r2 records the total number of iterations.

So when r2 is in the range 0x0-0x10, the loop body is being run for the first time and CI is 0. When r2 has any other value, the loop body is not being run for the first time and CI has already been set to 1. When viewing the ModelSim simulation waveform, the value of r2 therefore tells which pass through the loop body is in progress, and hence whether CI is 1.

In Ubuntu, create a new file Example.s containing the code above. Copy Ram.ld, Makefile, and Bin2Mem.exe into the folder that holds Example.s. The Makefile to use is the one modified in Chapter 10, that is, the one that does not use or1ksim for simulation. Then open a terminal, change to the folder that holds these files, and enter "make all" to obtain the memory initialization file Mem.data, which can be used in the ModelSim simulation.

The simple SOPC uses this file to initialize its RAM. To make it easier to identify the instructions that correspond to IF_INSN, ID_INSN, EX_INSN, and so on in the simulation waveform, the instructions and their binary encodings are listed below in three columns: the instruction address, the instruction, and the binary encoding of the instruction.

    ############################## Step 1 #####################
               Address   Instruction              Binary
    _start:    0x100     l.movhi r0,0x0           0x18000000
               0x104     l.addi  r1,r0,0x0        0x9c200000
               0x108     l.addi  r2,r0,0x0        0x9c400000
               0x10c     l.addi  r3,r0,0x0        0x9c600000
    ############################## Step 2 #####################
               Address   Instruction              Binary
    _ic_init:  0x110     l.mtspr r0,r1,0x2000     0xc0800800
               0x114     l.sfeqi r1,0x2000        0xbc012000
               0x118     l.bnf   _ic_init         0x0ffffffe
               0x11c     l.addi  r1,r1,0x10       0x9c210010
               0x120     l.movhi r1,0x0           0x18200000
    ############################## Step 3 #####################
               Address   Instruction              Binary
               0x124     l.ori   r3,r0,0x1        0xa8600001
               0x128     l.mtspr r0,r3,0x1200     0xc0401a00
               0x12c     l.ori   r3,r0,0x00c0     0xa86000c0
               0x130     l.mtspr r0,r3,0x1280     0xc0401a80
    ############################## Step 4 #####################
               Address   Instruction              Binary
               0x134     l.ori   r3,r0,0x8051     0xa8608051
               0x138     l.mtspr r0,r3,0x11       0xc0001811
               0x13c     l.nop                    0x15000000
    ############################## Step 5 #####################
               Address   Instruction              Binary
    _loop:     0x140     l.addi  r1,r1,0x1        0x9c210001
               0x144     l.sfeqi r1,0x10          0xbc010010
               0x148     l.bnf   _loop            0x0ffffffe
               0x14c     l.addi  r2,r2,0x1        0x9c420001
    ############################## Step 6 #####################
               Address   Instruction              Binary
               0x150     l.addi  r1,r0,0x0        0x9c200000
               0x154     l.ori   r3,r0,0x00c2     0xa86000c2
               0x158     l.mtspr r0,r3,0x1280     0xc0401a80
               0x15c     l.j     _loop            0x03fffff9
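Some of the encodings in the listing can be reproduced with a short Python sketch, which may help when matching waveform values to instructions. The field layout used here (opcode 0x30 for l.mtspr with the 16-bit immediate split into bits 25:21 and 10:0, opcode 0x03 for l.bnf, opcode 0x00 for l.j, and a 26-bit word offset for jumps) is my understanding of the OR1K encoding and was checked only against the values in the listing; the function names are hypothetical.

```python
def encode_mtspr(ra: int, rb: int, k: int) -> int:
    """Encode l.mtspr rA,rB,K; K is split around the register fields."""
    return ((0x30 << 26)
            | (((k >> 11) & 0x1F) << 21)   # K[15:11]
            | (ra << 16)
            | (rb << 11)
            | (k & 0x7FF))                 # K[10:0]


def encode_jump(opcode: int, pc: int, target: int) -> int:
    """Encode a PC-relative jump or branch with a 26-bit word offset."""
    offset_words = (target - pc) >> 2      # signed distance in instructions
    return (opcode << 26) | (offset_words & 0x3FFFFFF)


if __name__ == "__main__":
    print(f"0x{encode_mtspr(0, 1, 0x2000):08x}")     # l.mtspr r0,r1,0x2000
    print(f"0x{encode_jump(0x03, 0x118, 0x110):08x}")  # l.bnf _ic_init
    print(f"0x{encode_jump(0x00, 0x15C, 0x140):08x}")  # l.j _loop
```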

The ModelSim simulation waveforms are shown in Figures 12.10-12.14. The simulation results show that the ICache behaves as expected. The ModelSim simulation project is in the Chapter12 folder of the CD that accompanies this book, and the Chapter12/code folder contains the source code of the demo program.

The rest of this chapter uses this program to analyze in detail how the ICache works in each usage scenario.
