Write CPU by Myself, Step 7 (7) -- Implementing the multiply-accumulate and multiply-subtract instructions

Source: Internet
Author: User

I am serializing my new book, "Write CPU by myself", online. This is the 30th article; I try to post one every Thursday.

The book's sales page on Amazon is below. You are welcome to take a look!

http://www.amazon.cn/dp/b00mqkrlg8/ref=cm_sw_r_si_dp_5kq8tb1gyhja4

The sales page on China-pub:

http://product.china-pub.com/3804025

The sales page on beifa:

http://book.beifabook.com/Product/BookDetail.aspx?Plucode=712123950&extra=0_s25960657


7.8 Modifying OpenMIPS to implement the multiply-accumulate and multiply-subtract instructions

7.8.1 Modifying the ID module of the decode stage

The ID module of the decode stage must be extended to decode the multiply-accumulate and multiply-subtract instructions. According to the instruction formats given in Figure 7-11, all four instructions belong to the SPECIAL2 class, so the specific instruction can be identified from its function code, as shown in Figure 7-13.


The macro definitions below are the function codes of the instructions in Figure 7-13. You can find them in the defines.v file in the Code\chapter7_2 directory on the CD attached to this book.

    `define EXE_MADD   6'b000000
    `define EXE_MADDU  6'b000001
    `define EXE_MSUB   6'b000100
    `define EXE_MSUBU  6'b000101

The main changes to the ID module of the decode stage are as follows. For the complete code, see the id.v file in the Code\chapter7_2 directory.

    module id(
        ......
    );
        ......
        assign stallreq = `NoStop;

        always @ (*) begin
            if (rst == `RstEnable) begin
                ......
            end else begin
                aluop_o     <= `EXE_NOP_OP;
                alusel_o    <= `EXE_RES_NOP;
                wd_o        <= inst_i[];        // default destination register address wd_o
                wreg_o      <= `WriteDisable;
                instvalid   <= `InstInvalid;
                reg1_read_o <= 1'b0;
                reg2_read_o <= 1'b0;
                reg1_addr_o <= inst_i[25:21];   // default reg1_addr_o
                reg2_addr_o <= inst_i[20:16];   // default reg2_addr_o
                imm         <= `ZeroWord;
                case (op)
                    ......
                    `EXE_SPECIAL2_INST: begin   // SPECIAL2 class instructions
                        case (op3)
                            ......
                            `EXE_MADD: begin    // madd instruction
                                wreg_o      <= `WriteDisable;
                                aluop_o     <= `EXE_MADD_OP;
                                alusel_o    <= `EXE_RES_MUL;
                                reg1_read_o <= 1'b1;
                                reg2_read_o <= 1'b1;
                                instvalid   <= `InstValid;
                            end
                            `EXE_MADDU: begin   // maddu instruction
                                wreg_o      <= `WriteDisable;
                                aluop_o     <= `EXE_MADDU_OP;
                                alusel_o    <= `EXE_RES_MUL;
                                reg1_read_o <= 1'b1;
                                reg2_read_o <= 1'b1;
                                instvalid   <= `InstValid;
                            end
                            `EXE_MSUB: begin    // msub instruction
                                wreg_o      <= `WriteDisable;
                                aluop_o     <= `EXE_MSUB_OP;
                                alusel_o    <= `EXE_RES_MUL;
                                reg1_read_o <= 1'b1;
                                reg2_read_o <= 1'b1;
                                instvalid   <= `InstValid;
                            end
                            `EXE_MSUBU: begin   // msubu instruction
                                wreg_o      <= `WriteDisable;
                                aluop_o     <= `EXE_MSUBU_OP;
                                alusel_o    <= `EXE_RES_MUL;
                                reg1_read_o <= 1'b1;
                                reg2_read_o <= 1'b1;
                                instvalid   <= `InstValid;
                            end
                            default: begin
                            end
                        endcase                 // EXE_SPECIAL2_INST case
                    ......
    endmodule

The four instructions are decoded in much the same way. A brief explanation follows.

(1) Because the final result is written to the HI and LO registers rather than to a general-purpose register, wreg_o is set to WriteDisable.

(2) Because two register values must be read, reg1_read_o and reg2_read_o are both set to 1'b1. By default, the address reg1_addr_o presented to read port 1 of the Regfile module is bits 25:21 of the instruction, which is exactly the rs field; the address reg2_addr_o presented to read port 2 is bits 20:16, which is the rt field. The decode stage therefore outputs reg1_o as the value of the register at address rs, and reg2_o as the value of the register at address rt.

(3) The operation type alusel_o is set to EXE_RES_MUL. However, because no general-purpose register is written, the value of alusel_o has no effect here and could equally be set to EXE_RES_NOP.

(4) The operation subtype aluop_o is set to the value that corresponds to the specific instruction.
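The decode rules above can be condensed into a small stand-alone module. This is only a sketch under stated assumptions, not the book's id.v: the module name madd_msub_decode_sketch and its three flag outputs are hypothetical, and it assumes the standard MIPS32 SPECIAL2 opcode 6'b011100 together with the four function codes listed earlier. It does not check that the remaining instruction fields are zero.

```verilog
// Hypothetical stand-alone sketch; not the id.v shipped with the book.
module madd_msub_decode_sketch (
    input  wire [31:0] inst_i,       // instruction word
    output wire [4:0]  reg1_addr_o,  // rs field, read through port 1
    output wire [4:0]  reg2_addr_o,  // rt field, read through port 2
    output reg         is_mac_o,     // 1 for madd/maddu/msub/msubu
    output reg         is_signed_o,  // 1 for madd/msub
    output reg         is_sub_o      // 1 for msub/msubu
);
    localparam [5:0] SPECIAL2 = 6'b011100;  // MIPS32 SPECIAL2 opcode (assumed)
    localparam [5:0] F_MADD   = 6'b000000;
    localparam [5:0] F_MADDU  = 6'b000001;
    localparam [5:0] F_MSUB   = 6'b000100;
    localparam [5:0] F_MSUBU  = 6'b000101;

    wire [5:0] op  = inst_i[31:26];  // opcode
    wire [5:0] op3 = inst_i[5:0];    // function code

    assign reg1_addr_o = inst_i[25:21];
    assign reg2_addr_o = inst_i[20:16];

    always @(*) begin
        // Defaults: not one of the four instructions
        is_mac_o    = 1'b0;
        is_signed_o = 1'b0;
        is_sub_o    = 1'b0;
        if (op == SPECIAL2) begin
            case (op3)
                F_MADD:  begin is_mac_o = 1'b1; is_signed_o = 1'b1; end
                F_MADDU: begin is_mac_o = 1'b1; end
                F_MSUB:  begin is_mac_o = 1'b1; is_signed_o = 1'b1; is_sub_o = 1'b1; end
                F_MSUBU: begin is_mac_o = 1'b1; is_sub_o = 1'b1; end
                default: ;
            endcase
        end
    end
endmodule
```

For example, the word 32'h70220000 encodes madd $1, $2, so this sketch would report reg1_addr_o = 1, reg2_addr_o = 2, is_mac_o = 1, is_signed_o = 1 and is_sub_o = 0.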

7.8.2 Modifying the EX module of the execute stage

As shown in Figure 7-12, four interfaces are added to the EX module; they are listed in Table 7-2.


The EX module is modified as follows. For the complete code, see the ex.v file in the Code\chapter7_2 directory on the CD.

    module ex(
        ......
        // added input ports
        input wire[`DoubleRegBus]   hilo_temp_i,
        input wire[1:0]             cnt_i,
        ......
        // added output ports
        output reg[`DoubleRegBus]   hilo_temp_o,
        output reg[1:0]             cnt_o,
        output reg                  stallreq
    );
        ......
        wire[`RegBus]        opdata1_mult;
        wire[`RegBus]        opdata2_mult;
        wire[`DoubleRegBus]  hilo_temp;
        reg[`DoubleRegBus]   hilo_temp1;
        reg ;
        ......

        /*******************************************************************
        **                          Section 1                             **
        *******************************************************************/

        // (1) Obtain the multiplicand. madd and msub are signed multiplications:
        //     if the first operand reg1_i is negative, use the two's complement
        //     of reg1_i as the multiplicand; otherwise use reg1_i directly.
        assign opdata1_mult = (((aluop_i == `EXE_MUL_OP)  ||
                                (aluop_i == `EXE_MULT_OP) ||
                                (aluop_i == `EXE_MADD_OP) ||
                                (aluop_i == `EXE_MSUB_OP)) &&
                               (reg1_i[31] == 1'b1)) ? (~reg1_i + 1) : reg1_i;

        // (2) Obtain the multiplier. madd and msub are signed multiplications:
        //     if the second operand reg2_i is negative, use the two's complement
        //     of reg2_i as the multiplier; otherwise use reg2_i directly.
        assign opdata2_mult = (((aluop_i == `EXE_MUL_OP)  ||
                                (aluop_i == `EXE_MULT_OP) ||
                                (aluop_i == `EXE_MADD_OP) ||
                                (aluop_i == `EXE_MSUB_OP)) &&
                               (reg2_i[31] == 1'b1)) ? (~reg2_i + 1) : reg2_i;

        // (3) Compute the temporary product and keep it in hilo_temp.
        assign hilo_temp = opdata1_mult * opdata2_mult;

        // (4) Correct the temporary product; the final product is stored in mulres.
        //     A. For the signed operations madd and msub:
        //        A1. If the multiplicand and multiplier have opposite signs, the
        //            two's complement of hilo_temp is the final product and is
        //            assigned to mulres.
        //        A2. If they have the same sign, hilo_temp is used as mulres.
        //     B. For the unsigned operations maddu and msubu, hilo_temp is
        //        assigned to mulres as the final product.
        always @ (*) begin
            if (rst == `RstEnable) begin
                mulres <= {`ZeroWord, `ZeroWord};
            end else if ((aluop_i == `EXE_MULT_OP) || (aluop_i == `EXE_MUL_OP) ||
                         (aluop_i == `EXE_MADD_OP) || (aluop_i == `EXE_MSUB_OP)) begin
                if ((reg1_i[31] ^ reg2_i[31]) == 1'b1) begin
                    mulres <= ~hilo_temp + 1;
                end else begin
                    mulres <= hilo_temp;
                end
            end else begin
                mulres <= hilo_temp;     // case B: unsigned, no correction needed
            end
        end

        /*******************************************************************
        **        Section 2: multiply-accumulate, multiply-subtract       **
        *******************************************************************/

        // madd, maddu, msub, msubu instructions
        always @ (*) begin
            if (rst == `RstEnable) begin
                hilo_temp_o            <= {`ZeroWord, `ZeroWord};
                cnt_o                  <= 2'b00;
                stallreq_for_madd_msub <= `NoStop;
            end else begin
                case (aluop_i)
                    `EXE_MADD_OP, `EXE_MADDU_OP: begin       // madd, maddu
                        if (cnt_i == 2'b00) begin            // first clock cycle of the execute stage
                            hilo_temp_o            <= mulres;
                            cnt_o                  <= 2'b01;
                            hilo_temp1             <= {`ZeroWord, `ZeroWord};
                            stallreq_for_madd_msub <= `Stop;
                        end else if (cnt_i == 2'b01) begin   // second clock cycle of the execute stage
                            hilo_temp_o            <= {`ZeroWord, `ZeroWord};
                            cnt_o                  <= 2'b10;
                            hilo_temp1             <= hilo_temp_i + {HI, LO};
                            stallreq_for_madd_msub <= `NoStop;
                        end
                    end
                    `EXE_MSUB_OP, `EXE_MSUBU_OP: begin       // msub, msubu
                        if (cnt_i == 2'b00) begin            // first clock cycle of the execute stage
                            hilo_temp_o            <= ~mulres + 1;
                            cnt_o                  <= 2'b01;
                            stallreq_for_madd_msub <= `Stop;
                        end else if (cnt_i == 2'b01) begin   // second clock cycle of the execute stage
                            hilo_temp_o            <= {`ZeroWord, `ZeroWord};
                            cnt_o                  <= 2'b10;
                            hilo_temp1             <= hilo_temp_i + {HI, LO};
                            stallreq_for_madd_msub <= `NoStop;
                        end
                    end
                    default: begin
                        hilo_temp_o            <= {`ZeroWord, `ZeroWord};
                        cnt_o                  <= 2'b00;
                        stallreq_for_madd_msub <= `NoStop;
                    end
                endcase
            end
        end

        /*******************************************************************
        **                 Section 3: stall the pipeline                  **
        *******************************************************************/

        // At present only the multiply-accumulate and multiply-subtract
        // instructions stall the pipeline, so stallreq simply equals
        // stallreq_for_madd_msub.
        always @ (*) begin
            stallreq = stallreq_for_madd_msub;
        end

        ......

        /*******************************************************************
        **    Section 4: write information for the HI and LO registers    **
        *******************************************************************/

        always @ (*) begin
            if (rst == `RstEnable) begin
                whilo_o <= `WriteDisable;
                hi_o    <= `ZeroWord;
                lo_o    <= `ZeroWord;
            end else if ((aluop_i == `EXE_MSUB_OP) || (aluop_i == `EXE_MSUBU_OP)) begin
                whilo_o <= `WriteEnable;
                hi_o    <= hilo_temp1[63:32];
                lo_o    <= hilo_temp1[31:0];
            end else if ((aluop_i == `EXE_MADD_OP) || (aluop_i == `EXE_MADDU_OP)) begin
                whilo_o <= `WriteEnable;
                hi_o    <= hilo_temp1[63:32];
                lo_o    <= hilo_temp1[31:0];
            ......
    endmodule


The code above can be read in four sections.

(1) Section 1: computes the product of the two values read from the general-purpose registers and stores it in mulres.

(2) Section 2: handles the multiply-accumulate and multiply-subtract instructions. Multiply-accumulate is described here; multiply-subtract is similar, except that the first cycle passes on the two's complement of mulres, so the addition in the second cycle performs the subtraction.

  • If cnt_i is 2'b00, this is the first execute cycle of the multiply-accumulate instruction. The product mulres is sent through the hilo_temp_o port to the EX/MEM module for use in the next clock cycle. At the same time, stallreq_for_madd_msub is set to Stop, meaning that the multiply-accumulate instruction requests a pipeline stall.
  • If cnt_i is 2'b01, this is the second execute cycle of the multiply-accumulate instruction. The EX module's input hilo_temp_i now holds the product computed in the previous clock cycle, so hilo_temp_i is added to the value of the HI and LO registers to obtain the final result, which is stored in hilo_temp1. At the same time, stallreq_for_madd_msub is set to NoStop, meaning that the multiply-accumulate instruction has finished and no longer requests a stall. Finally, cnt_o is set to 2'b10 rather than 2'b00. The reason is that if the pipeline is stalled for some other cause, cnt_o being 2'b10 keeps the EX stage from computing again, which prevents the multiply-accumulate instruction from being executed repeatedly.

(3) Section 3: produces the value of stallreq. At present only the multiply-accumulate and multiply-subtract instructions stall the pipeline, so stallreq simply equals stallreq_for_madd_msub.

(4) Section 4: because the multiply-accumulate and multiply-subtract instructions write their final result to the HI and LO registers, Section 4 sets the write information for HI and LO.
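The sign handling of Section 1 and the add/subtract step of Section 2 can be checked in isolation with a small self-checking testbench. This is a sketch, not part of the book's code: the module name madd_arith_sketch_tb and all signal names other than mulres are made up for illustration, the two execute cycles are collapsed into plain combinational arithmetic, and Verilog's own $signed multiplication is assumed as the reference.

```verilog
// Hypothetical self-checking testbench; all names are illustrative.
module madd_arith_sketch_tb;
    reg [31:0] a, b;   // operand values read from rs and rt
    reg [63:0] hilo;   // current {HI, LO}

    // Section 1: multiply the magnitudes, then restore the sign
    wire [31:0] a_mag    = a[31] ? (~a + 1) : a;
    wire [31:0] b_mag    = b[31] ? (~b + 1) : b;
    wire [63:0] prod_mag = a_mag * b_mag;
    wire [63:0] mulres   = (a[31] ^ b[31]) ? (~prod_mag + 1) : prod_mag;

    // Section 2: madd adds the product, msub adds its two's complement
    wire [63:0] madd_result = hilo + mulres;
    wire [63:0] msub_result = hilo + (~mulres + 1);

    // Reference product from Verilog's signed arithmetic
    wire signed [63:0] ref_prod = $signed(a) * $signed(b);

    task check;
        input [31:0] ta;
        input [31:0] tb;
        input [63:0] thilo;
        begin
            a = ta; b = tb; hilo = thilo;
            #1;  // let the continuous assignments settle
            if (mulres      !== ref_prod           ||
                madd_result !== (thilo + ref_prod) ||
                msub_result !== (thilo - ref_prod))
                $display("FAIL a=%h b=%h mulres=%h", a, b, mulres);
            else
                $display("ok   a=%h b=%h madd=%h msub=%h",
                         a, b, madd_result, msub_result);
        end
    endtask

    initial begin
        check(32'd3,        32'd4,        64'd10);   // both positive
        check(32'hFFFFFFFE, 32'd5,        64'd100);  // -2 * 5
        check(32'h80000000, 32'hFFFFFFFF, 64'd0);    // most negative value * -1
        $finish;
    end
endmodule
```

In the second case the product is -10, so the signed multiply-accumulate result is 100 + (-10) = 90 and the multiply-subtract result is 100 - (-10) = 110. The unsigned variants maddu and msubu skip the magnitude and sign steps and are not covered by this sketch.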

7.8.3 Modifying the EX/MEM module

As shown in Figure 7-12, four interfaces are added to the EX/MEM module; they are listed in Table 7-3.

The EX/MEM module is modified as follows. The complete code is in the ex_mem.v file in the Code\chapter7_2 directory on the CD attached to this book.

    module ex_mem(
        ......
        // from the control module
        input wire[]                stall,
        ......
        // added input ports
        input wire[`DoubleRegBus]   hilo_i,
        input wire[1:0]             cnt_i,
        ......
        // added output ports
        output reg[`DoubleRegBus]   hilo_o,
        output reg[1:0]             cnt_o
    );

        // While the execute stage of the pipeline is stalled, the input hilo_i
        // is passed on through the output hilo_o and the input cnt_i through
        // the output cnt_o. At all other times hilo_o is 0 and cnt_o is 0.
        always @ (posedge clk) begin
            if (rst == `RstEnable) begin
                ......
                hilo_o <= {`ZeroWord, `ZeroWord};
                cnt_o  <= 2'b00;
            end else if (stall[3] == `Stop && stall[4] == `NoStop) begin
                ......
                hilo_o <= hilo_i;
                cnt_o  <= cnt_i;
            end else if (stall[3] == `NoStop) begin
                ......
                hilo_o <= {`ZeroWord, `ZeroWord};
                cnt_o  <= 2'b00;
            end else begin
                hilo_o <= hilo_i;
                cnt_o  <= cnt_i;
            end
        end
    endmodule

7.8.4 Modifying the OpenMIPS module

Because the interfaces above were added to the EX and EX/MEM modules, the OpenMIPS module must be modified to connect them. The connections are shown in Figure 7-12. The code is not listed in the book; readers can refer to the openmips.v file in the Code\chapter7_2 directory on the CD attached to this book.
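To illustrate the kind of connection involved, here is a miniature of the feedback path between the two modules. It is a sketch only and is not the book's openmips.v: ex_stub, ex_mem_stub and every wire name are hypothetical, the stall vector is reduced to a single ex_stalled_i input, the product and the {HI, LO} value are fixed constants, and the cnt_i == 2'b10 hold case is left out.

```verilog
// Hypothetical miniature of the EX <-> EX/MEM feedback path.
// Stub modules for illustration; not the book's ex.v, ex_mem.v or openmips.v.
module ex_stub (
    input  wire        is_madd_i,    // a madd-like instruction occupies EX
    input  wire [63:0] mulres_i,     // product from Section 1
    input  wire [63:0] hilo_i,       // current {HI, LO}
    input  wire [63:0] hilo_temp_i,  // fed back from ex_mem_stub
    input  wire [1:0]  cnt_i,        // fed back from ex_mem_stub
    output reg  [63:0] hilo_temp_o,
    output reg  [1:0]  cnt_o,
    output reg         stallreq,
    output reg         done_o,       // result_o is valid in this cycle
    output reg  [63:0] result_o
);
    always @(*) begin
        hilo_temp_o = 64'd0;
        cnt_o       = 2'b00;
        stallreq    = 1'b0;
        done_o      = 1'b0;
        result_o    = 64'd0;
        if (is_madd_i && cnt_i == 2'b00) begin          // first EX cycle
            hilo_temp_o = mulres_i;
            cnt_o       = 2'b01;
            stallreq    = 1'b1;
        end else if (is_madd_i && cnt_i == 2'b01) begin // second EX cycle
            cnt_o       = 2'b10;
            done_o      = 1'b1;
            result_o    = hilo_temp_i + hilo_i;
        end
    end
endmodule

module ex_mem_stub (
    input  wire        clk,
    input  wire        rst,
    input  wire        ex_stalled_i,  // stands in for the stall[3]/stall[4] test
    input  wire [63:0] hilo_i,
    input  wire [1:0]  cnt_i,
    output reg  [63:0] hilo_o,
    output reg  [1:0]  cnt_o
);
    always @(posedge clk) begin
        if (rst || !ex_stalled_i) begin
            hilo_o <= 64'd0;
            cnt_o  <= 2'b00;
        end else begin
            hilo_o <= hilo_i;   // keep the product for the next EX cycle
            cnt_o  <= cnt_i;
        end
    end
endmodule

module feedback_sketch_tb;
    reg clk     = 1'b0;
    reg rst     = 1'b1;
    reg is_madd = 1'b0;
    wire [63:0] ex_hilo_temp, fb_hilo_temp, result;
    wire [1:0]  ex_cnt, fb_cnt;
    wire        stallreq, done;

    ex_stub u_ex (
        .is_madd_i(is_madd), .mulres_i(64'd12), .hilo_i(64'd100),
        .hilo_temp_i(fb_hilo_temp), .cnt_i(fb_cnt),
        .hilo_temp_o(ex_hilo_temp), .cnt_o(ex_cnt),
        .stallreq(stallreq), .done_o(done), .result_o(result));

    ex_mem_stub u_ex_mem (
        .clk(clk), .rst(rst), .ex_stalled_i(stallreq),
        .hilo_i(ex_hilo_temp), .cnt_i(ex_cnt),
        .hilo_o(fb_hilo_temp), .cnt_o(fb_cnt));

    always #5 clk = ~clk;

    // The instruction leaves EX once its second cycle has completed
    always @(posedge clk) if (done) is_madd <= 1'b0;

    initial begin
        $monitor("t=%0t cnt_i=%b stallreq=%b done=%b result=%0d",
                 $time, fb_cnt, stallreq, done, result);
        #12 rst = 1'b0; is_madd = 1'b1;
        #30 $finish;
    end
endmodule
```

With the constants used here (a product of 12 and an accumulator value of 100), the first execute cycle should raise stallreq, and the second should report done with a result of 112. The point of the sketch is that the outputs hilo_temp_o and cnt_o of the execute stage return to its own inputs only through the clocked EX/MEM register, so there is no combinational loop.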


Code download: http://download.csdn.net/detail/leishangwen/7858701
