http://bbs.ednchina.com/BLOG_ARTICLE_3003247.HTM
Topic Five: Square Root Operation
Although the square root operation is less common than addition and multiplication, it has its uses. In a project last year, the author was responsible for a module that relied on square roots. At first the module used an Altera IP core and was verified to work without problems. Because the project then moved to a Xilinx platform, many IP cores had to be ported as well, and in the end the author simply wrote them by hand, including the multipliers and dividers covered in the previous topics.
Like the divider, the square root module uses a non-restoring algorithm. Its input is the radicand d, and its outputs are the root q and the remainder r. In an FPGA the result is approached iteratively; the iteration at each level is $t_{i+1} = t_i\,(3 - d\,t_i^2)/2$, where $t_i$ is an approximation of $1/q$.
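Read stage by stage, the listing that follows behaves like a bit-serial non-restoring recurrence rather than a multiply-based update. As a sketch only, and assuming the usual non-restoring scheme, one step can be written as below. Here $d_k$ is the next pair of bits of d taken from the top, and q and r both start at 0; this notation is introduced for illustration and is not from the original post.

```latex
\begin{aligned}
r &\leftarrow
  \begin{cases}
    4r + d_k - (4q + 1), & r \ge 0 \\
    4r + d_k + (4q + 3), & r < 0
  \end{cases} \\
q &\leftarrow 2q + [\, r \ge 0 \,] \qquad \text{(using the updated } r\text{)} \\
\text{after the last pair:}\quad
r &\leftarrow r + (2q + 1) \quad \text{if } r < 0
\end{aligned}
```

For example, with the 4-bit input d = 8 (pairs 10 and 00): the first step gives r = 0 + 2 - 1 = 1 and q = 1; the second gives r = 4 + 0 - 5 = -1 and q = 2; the final correction gives r = -1 + 5 = 4, and indeed 2*2 + 4 = 8.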
The Verilog HDL code is as follows:
module sqrt
#(parameter d_width = 32,
  parameter q_width = d_width/2,
  parameter r_width = q_width + 1)
(
  input                    clk,
  input                    rst,      // synchronous reset, active high
  input      [d_width-1:0] d,        // radicand
  output reg [q_width-1:0] q,        // root, floor(sqrt(d))
  output reg [r_width-1:0] r,        // remainder, d - q*q
  input                    ivalid,
  output reg               ovalid
);
  // The signed partial remainder is one bit wider than r: the last stage
  // can reach 2*q, which needs r_width magnitude bits plus a sign bit.
  localparam t_width = q_width + 2;

  // Pipeline registers. Stage q_width is the first stage, stage 1 the last.
  reg        [d_width-1:0] d_t[q_width:1];
  reg        [q_width-1:0] q_t[q_width:1];
  reg signed [t_width-1:0] r_t[q_width:1];
  reg                      ivalid_t[q_width:1];

  // First stage: bring down the top two bits of d and subtract 1.
  // The partial remainder starts from zero, not from the output register r.
  always @(posedge clk)
  begin
    if (rst)
    begin
      r_t[q_width]      <= {t_width{1'b0}};
      d_t[q_width]      <= {d_width{1'b0}};
      q_t[q_width]      <= {q_width{1'b0}};
      ivalid_t[q_width] <= 1'b0;
    end
    else
    begin
      if (ivalid)
      begin
        r_t[q_width]      <= {{t_width-2{1'b0}}, d[d_width-1:d_width-2]} - {{t_width-2{1'b0}}, 2'b01};
        d_t[q_width]      <= d;
        q_t[q_width]      <= {q_width{1'b0}};
        ivalid_t[q_width] <= 1'b1;
      end
      else
      begin
        r_t[q_width]      <= {t_width{1'b0}};
        d_t[q_width]      <= {d_width{1'b0}};
        q_t[q_width]      <= {q_width{1'b0}};
        ivalid_t[q_width] <= 1'b0;
      end
    end
  end

  // Middle stages: the sign of the previous remainder decides the next root
  // bit and whether to subtract {q,1,01} or add {q,0,11}.
  generate
    genvar i;
    for (i = q_width-1; i >= 1; i = i-1)
    begin : u
      always @(posedge clk)
      begin
        if (rst)
        begin
          q_t[i]      <= {q_width{1'b0}};
          r_t[i]      <= {t_width{1'b0}};
          d_t[i]      <= {d_width{1'b0}};
          ivalid_t[i] <= 1'b0;
        end
        else
        begin
          if (ivalid_t[i+1])
          begin
            if (r_t[i+1] >= 0)
            begin
              q_t[i] <= {q_t[i+1][q_width-2:0], 1'b1};
              r_t[i] <= {r_t[i+1][t_width-3:0], d_t[i+1][2*i-1:2*i-2]} - {q_t[i+1][q_width-2:0], 1'b1, 2'b01};
            end
            else
            begin
              q_t[i] <= {q_t[i+1][q_width-2:0], 1'b0};
              r_t[i] <= {r_t[i+1][t_width-3:0], d_t[i+1][2*i-1:2*i-2]} + {q_t[i+1][q_width-2:0], 1'b0, 2'b11};
            end
            d_t[i]      <= d_t[i+1];
            ivalid_t[i] <= 1'b1;
          end
          else
          begin
            q_t[i]      <= {q_width{1'b0}};
            r_t[i]      <= {t_width{1'b0}};
            d_t[i]      <= {d_width{1'b0}};
            ivalid_t[i] <= 1'b0;
          end
        end
      end
    end
  endgenerate

  // Output stage: shift in the last root bit and, if the remainder is
  // negative, restore it by adding 2*q + 1.
  always @(posedge clk)
  begin
    if (rst)
    begin
      q      <= {q_width{1'b0}};
      r      <= {r_width{1'b0}};
      ovalid <= 1'b0;
    end
    else
    begin
      if (ivalid_t[1])
      begin
        if (r_t[1] >= 0)
        begin
          q <= {q_t[1][q_width-2:0], 1'b1};
          r <= r_t[1][r_width-1:0];
        end
        else
        begin
          q <= {q_t[1][q_width-2:0], 1'b0};
          // The sum is non-negative and fits in r_width bits.
          r <= r_t[1] + {1'b0, q_t[1][q_width-2:0], 1'b0, 1'b1};
        end
        ovalid <= 1'b1;
      end
      else
      begin
        q      <= {q_width{1'b0}};
        r      <= {r_width{1'b0}};
        ovalid <= 1'b0;
      end
    end
  end
endmodule
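A quick way to exercise the module is a self-checking testbench. The sketch below is not from the original post: it assumes the module is named sqrt with parameter d_width and ports clk, rst, d, q, r, ivalid and ovalid as in the listing, while sqrt_tb, check and errors are names made up for the example. It sends one value at a time and compares the result against the definition of the integer square root instead of a second implementation.

```verilog
`timescale 1ns/1ps
// Self-checking testbench sketch (hypothetical names: sqrt_tb, check, errors).
module sqrt_tb;
  reg         clk    = 1'b0;
  reg         rst    = 1'b1;
  reg  [31:0] d      = 32'd0;
  reg         ivalid = 1'b0;
  wire [15:0] q;
  wire [16:0] r;
  wire        ovalid;
  integer     errors = 0;
  integer     n;

  // Device under test, assumed to be the sqrt module from the listing.
  sqrt #(.d_width(32)) dut (
    .clk(clk), .rst(rst), .d(d), .q(q), .r(r),
    .ivalid(ivalid), .ovalid(ovalid)
  );

  always #5 clk = ~clk;   // 10 ns simulation clock

  // Send one value, wait for ovalid, then check
  // q*q <= value < (q+1)*(q+1) and r == value - q*q.
  task check;
    input [31:0] value;
    reg   [63:0] q64, lo, hi;
    integer      timeout;
    begin
      @(negedge clk);
      d      = value;
      ivalid = 1'b1;
      @(negedge clk);
      ivalid = 1'b0;
      timeout = 0;
      while (!ovalid && timeout < 64)   // give up after 64 cycles
      begin
        @(negedge clk);
        timeout = timeout + 1;
      end
      q64 = q;
      lo  = q64 * q64;
      hi  = (q64 + 1) * (q64 + 1);
      if (!ovalid || lo > value || value >= hi || r != value - lo)
      begin
        errors = errors + 1;
        $display("MISMATCH d=%0d q=%0d r=%0d", value, q, r);
      end
    end
  endtask

  initial
  begin
    repeat (4) @(negedge clk);   // hold reset for a few cycles
    rst = 1'b0;
    check(32'd0);
    check(32'd1);
    check(32'd10);
    check(32'h4001_0001);        // 32769 squared
    check(32'hFFFF_FFFF);        // largest input
    for (n = 0; n < 100; n = n + 1)
      check($random);
    $display("done, %0d mismatches", errors);
    $finish;
  end
endmodule
```

The two hand-picked values near the top of the input range are worth keeping: they make the last-stage remainder as large as it can get, which is where a remainder register that is too narrow would show up first.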
The synthesis results reported by the author are as follows:
Number of Slice Registers: 677
Number of Slice LUTs: 1105
Minimum period: 3.726 ns (Maximum frequency: 268.384 MHz)
The simulation waveform is shown in Figure 1:
Figure 1