Optimization of Verilog HDL Models

Last Update:2014-12-16 Source: Internet

Author: User


1 Introduction

Every designer develops a personal style when modeling in Verilog. The same circuit can be described by many logically equivalent Verilog models, and most designers focus on whether the code is convenient to write and functionally correct while giving little thought to the structure of the model. This not only increases the burden on logic synthesis and lowers synthesis efficiency, but may also keep the resulting chip from reaching its best area and speed. It is therefore necessary to optimize the model when writing Verilog.

2 Model Optimization Overview

Two main indicators determine the performance of a chip: area and speed. Model optimization adjusts, merges, and streamlines the structure of the model so that the resulting chip is smaller and faster.

The logic produced by synthesis is sensitive to how the model is described. Moving a statement from one place to another, or splitting an expression, can significantly affect the generated logic, changing the number of logic gates synthesized or the timing characteristics of the circuit. Logic can therefore be optimized by restructuring the description. However, the two optimization goals, area and speed, conflict: optimizing one inevitably affects the other, and both cannot be optimal at once. The designer must weigh the two, decide which one the design emphasizes, and choose the optimization starting point accordingly. The following sections introduce model optimization methods for area and for speed in turn.

3 Area Optimization

3.1 Extracting Common Subexpressions

If a common subexpression appears in mutually exclusive branches of a conditional statement, it can be extracted.
The following model contains a common subexpression that can be extracted:

    if (enable)
        p = a & (b + c);
    else
        q = (b + c) | d;

The expression b + c is evaluated in both mutually exclusive branches of the conditional statement, so it should be computed and assigned before the conditional statement, as in the new model:

    tmp = b + c;    // introduce a temporary variable
    if (enable)
        p = a & tmp;
    else
        q = tmp | d;

With this form the synthesis tool synthesizes one adder, whereas the original model may synthesize two. In general, whenever a common subexpression is found in the logic, it can be assigned to a temporary variable that is then used in place of the subexpression. This reduces the number of ALUs and so optimizes area.

3.2 Code Motion

If the value of an expression inside a loop statement does not change from one iteration to the next, the expression can be moved outside the loop. Code motion can be applied to the following model:

    p = ...
    for (i = 1; i <= 5; i = i + 1) begin
        ...
        q = p + 5;    // assume the loop does not assign a new value to p
        ...
    end

The expression on the right side of the assignment "q = p + 5;" does not change with the loop variable, so it should be moved outside the loop, as in the new model:

    p = ...
    tmp = p + 5;    // introduce a temporary variable
    for (i = 1; i <= 5; i = i + 1) begin
        ...
        q = tmp;
        ...
    end

In this way the synthesis tool synthesizes only one adder for "p + 5", whereas the original model produces five adders, one per loop iteration, which is redundant. The optimized model not only reduces the number of ALUs synthesized but also improves simulation efficiency.

3.3 Resource Sharing

Resource sharing means sharing one arithmetic logic unit (ALU) among operations that occur under mutually exclusive conditions, as in the following model:

    if (num > 5)
        p = a + b;
    else
        p = a - c;

Without resource sharing, the operators "+" and "-" are synthesized into two separate ALUs.
With resource sharing, the "+" and "-" operations can be implemented with a single ALU, because the two operators are always used under mutually exclusive conditions. A multiplexer is also generated to select either b or c as the second input of the ALU. In effect, resource sharing is the sharing of operators. Operators can be shared in the following cases:

(1) Same operator, same operands, such as a+b and a+b. This is the "extracting common subexpressions" case and should always be shared.
(2) Same operator, one operand differs, such as a+b and a+c. One multiplexer must be introduced, so area has to be traded against speed.
(3) Same operator, both operands differ, such as a+b and c+d. Two multiplexers must be introduced, so area has to be traded against speed.
(4) Different operators, same operands, such as a+b and a-b. The "+" and "-" can be synthesized into one shared ALU.
(5) Different operators, one operand differs, such as a+b and a-c. One multiplexer must be introduced, so area has to be traded against speed.
(6) Different operators, both operands differ, such as a+b and c-d. Two multiplexers must be introduced, so area has to be traded against speed.

Sharing an ALU introduces a multiplexer at an ALU input, which increases the delay of that path. The designer should therefore decide, based on the actual situation, whether area or speed matters more. In a "timing first" design it is best not to use resource sharing. In addition, for complex operation units, functions and tasks can be used to define the shared data processing blocks, which reduces the consumption of device resources and lowers cost.
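Case (5) above, which is also the num > 5 model from the start of this section, can be written with the sharing made explicit. The following is a minimal sketch, not the only possible form: the module name share_addsub, the WIDTH parameter, the 4-bit width of num, and the internal names sub and op2 are assumptions introduced for illustration. Many synthesis tools can perform this sharing automatically from the original if/else description.

```verilog
// Sketch: one adder shared between "a + b" and "a - c".
// Module name, port widths, and internal names are illustrative assumptions.
module share_addsub #(parameter WIDTH = 8) (
    input  wire [3:0]       num,
    input  wire [WIDTH-1:0] a, b, c,
    output wire [WIDTH-1:0] p
);
    // Subtract when num <= 5, add when num > 5
    wire sub = (num <= 5);

    // Multiplexer on the second ALU input: choose b (add) or c (subtract)
    wire [WIDTH-1:0] op2 = sub ? c : b;

    // Single adder: a - c is computed as a + ~c + 1 (two's complement)
    assign p = a + (op2 ^ {WIDTH{sub}}) + sub;
endmodule
```

The multiplexer and the XOR stage sit in front of the adder, which is the extra path delay that the trade-off discussion above refers to.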
3.4 Eliminating Unnecessary Flip-Flops

For convenience when writing code, some designers like to place all assignment statements controlled by the same condition inside a single clocked block, as in the following model:

    always @(posedge clk)
    begin
        case (state)
            0: begin
                   prestate <= 1;
                   dout <= 'h56;
               end
            1: begin
                   prestate <= 0;
                   dout <= 'h29;
               end
        endcase
    end

The designer's intent is only to store the value of prestate in a flip-flop triggered by the rising clock edge; dout is merely combinational logic that depends on state. Only 1 flip-flop is needed, but the netlist synthesized from the model above contains 17 flip-flops (one for prestate plus one for each bit of dout), which wastes resources. The optimized model is as follows:

    always @(posedge clk)    // infers a flip-flop
    begin
        case (state)
            0: prestate <= 1;
            1: prestate <= 0;
        endcase
    end

    always @(state)    // combinational logic
    begin
        // state is assumed to take only the values 0 and 1;
        // otherwise add a default branch so that no latch is inferred
        case (state)
            0: dout = 'h56;
            1: dout = 'h29;
        endcase
    end

3.5 Eliminating Latches

A latch is inferred under the following rules:

(1) The variable is assigned in a conditional statement (an if or case statement).
(2) The variable is not assigned in every branch of the conditional statement.
(3) The value of the variable must be kept between successive executions of the always statement.

When all three conditions are met, the variable is inferred as a latch. For convenience, some designers do not assign a variable in every conditional branch, so a variable that does not need a latch ends up inferring one, which wastes resources. The best way to eliminate latches is to determine at design time which variables need latches and which do not. A variable that should not infer a latch must be assigned in every conditional branch, or be given a default value before the conditional statement.

4 Speed Optimization

4.1 Using Parentheses

Using parentheses in expressions lets you control the structure of the synthesized logic, shorten the critical path of the circuit, and thereby optimize speed.
For example, for the statement p = a+b-c+d, the synthesis tool evaluates the right-hand expression from left to right and constructs the circuit shown in Figure 1. With parentheses the statement becomes p = (a+b)-(c-d), which synthesizes to the circuit shown in Figure 2.
Figure 1 Circuit synthesized without parentheses
Figure 2 Circuit synthesized with parentheses
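The two forms of p = a+b-c+d can be placed side by side in a small module. This is only a sketch: the module name paren_balance, the WIDTH parameter, and the output names p_chain and p_tree are assumptions introduced for illustration, and a given synthesis tool may rebalance the chained form on its own.

```verilog
// Sketch: chained versus balanced evaluation of a + b - c + d.
// Module, parameter, and output names are illustrative assumptions.
module paren_balance #(parameter WIDTH = 8) (
    input  wire [WIDTH-1:0] a, b, c, d,
    output wire [WIDTH-1:0] p_chain,
    output wire [WIDTH-1:0] p_tree
);
    // Evaluated left to right as ((a + b) - c) + d:
    // three arithmetic units in series
    assign p_chain = a + b - c + d;

    // (a + b) and (c - d) are computed in parallel, then subtracted:
    // two arithmetic units in series
    assign p_tree = (a + b) - (c - d);
endmodule
```

Both outputs carry the same WIDTH-bit value, since a+b-c+d and (a+b)-(c-d) are equal modulo 2^WIDTH; only the arrangement of the operators differs.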
Clearly, the depth of the critical path is 3 without parentheses and 2 with parentheses, so speed is improved.

4.2 Extracting the Critical Path

In circuit design, some signal paths are long, or the signal itself arrives late, leaving the circuit with insufficient setup time. A signal path that causes insufficient setup time is called a critical path. Critical signal paths are usually extracted and treated specially to minimize their delay.

4.2.1 Extracting Repeated Variables

For example, in the statement p = (a&b&c) | (b&d&e), the path of signal b is the critical path. It can be extracted and handled separately, for instance by rewriting the statement as p = b & ((a&c) | (d&e)). Figure 3 shows the circuit models before and after the critical path is extracted.

Figure 3 Comparison of circuit models
As can be seen from the figure, the path of signal b is reduced from two logic levels to one, which increases its setup margin and shortens its delay. One AND gate is also eliminated, so both speed and area improve.

4.2.2 Moving the Critical Signal Forward

Consider the following model:

    always @(a or b or c or d or e or current_out)
    begin
        next_out = current_out;
        if (!a) begin
            if (b & !(d & !e))
                next_out = !c;
            else
                next_out = c;
        end
        else if (d & !e)
            next_out = c;
    end

Here the input signal e of the always block is a critical signal with very tight timing and needs special treatment. The processed model is as follows:

    always @(a or b or c or d or e or current_out)
    begin
        next_out = current_out;
        if (e) begin
            if (!a) begin
                if (b)
                    next_out = !c;
                else
                    next_out = c;
            end
        end
        else begin
            if (!a) begin
                if (b & !d)
                    next_out = !c;
                else
                    next_out = c;
            end
            else if (d)
                next_out = c;
        end
    end

The model above moves the critical signal e forward to the outermost condition, and the rewritten description is logically equivalent to the original always block.

5 Other Optimization Methods

5.1 Referencing Predefined Macro Structures in the Technology Library

Designers can use module instantiation statements to call predefined functional blocks as needed, treating them as components: the blocks are instantiated in the model, and the model containing the instances is then synthesized. For example, to build an adder, an area-efficient ripple-carry adder can be instantiated when area is the constraint, and a fast but larger carry-lookahead adder can be instantiated when delay is the constraint.

5.2 Using Small Designs

Experimental studies show that logic optimizers work best on logic circuits of about 2000 to 5000 gates, so the design should be organized into multiple modules or multiple always blocks wherever possible. Most of the run time of synthesis is spent on logic optimization, and that time grows exponentially with design size, so it is critical to keep each sub-block within a size that the tool can handle.

5.3 Constant Propagation

Using named constants makes a circuit model easier to modify and more portable. If a meaningful constant is referenced in many places in the model, it can be defined as a symbolic constant and the symbol referenced directly, as follows:

    parameter COUNT = 16;
    ...
    p = COUNT * 2;
    for (i = 0; i < COUNT-1; i = i + 1)
    ...

Here COUNT controls the number of loop iterations, and its value can be changed as needed in the statement "parameter COUNT = 16;". Because it is a constant, no hardware is generated for the expressions "COUNT*2" and "COUNT-1" during synthesis; their values are computed at compile time and used directly.
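The parameter fragment above can be placed in a complete module to show where the constant expressions end up. This is a minimal sketch under stated assumptions: the module name const_prop_demo, the ports din, p, and any_adjacent, and the neighboring-bit check performed by the loop are all invented for illustration and are not part of the original model.

```verilog
// Sketch: COUNT*2 and COUNT-1 are resolved at elaboration time.
// Module name, ports, and the loop body are illustrative assumptions.
module const_prop_demo #(parameter COUNT = 16) (
    input  wire [COUNT-1:0] din,
    output wire [7:0]       p,
    output reg              any_adjacent  // 1 if two neighboring bits of din are both 1
);
    // Constant expression: no multiplier is built, p is tied to a fixed value
    assign p = COUNT * 2;

    integer i;
    always @(*) begin
        any_adjacent = 1'b0;
        // The bound COUNT-1 is a constant, so the loop unrolls into
        // COUNT-1 AND gates feeding an OR; no subtractor is built
        for (i = 0; i < COUNT-1; i = i + 1)
            if (din[i] & din[i+1])
                any_adjacent = 1'b1;
    end
endmodule
```

Changing COUNT in one place, or overriding it at instantiation, resizes din and the unrolled loop together, which is the portability benefit described above.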
6 Concluding Remarks

Synthesis tools generally optimize a Verilog model automatically. However, if the designer writes a structurally optimized circuit model in the first place, the run time of the synthesis tool can be greatly reduced, and manual restructuring can sometimes achieve optimizations that the synthesis tool cannot. It is therefore important to develop a good design style: to ensure not only that the design is correct but also that it is efficient, and to avoid unnecessary rework, so as to improve design efficiency and shorten the development cycle.


This article is an English version of an article originally published in Chinese on aliyun.com and is provided for information purposes only.
