SAP Senior Development Engineer Fan Decheng
October 25, 2014
Before writing this article, the first thing I thought about was code quality. Given that the code itself is well written, its comments become another part of its quality. They do not look important at first glance, but they matter more and more as time goes on. Suppose a diligent programmer writes thousands of lines of code for a large project and then moves on to another module. A year later, when he comes back to those thousands of lines, he will be scratching his head if there are no meaningful comments, because he never wrote any.
Why does this happen? Aren't the programming languages we use today, such as Java, C#, and Python, high-level languages? Isn't their code meant not just to be executed by machines but also to be read by people? Indeed, they are high-level languages. The problem, however, is that the code itself only states how to do something and what the result is. What it is for, why it is done this way, and under what circumstances it can be used are not reflected in the code itself.
Therefore, comments have their own unique uses. The quality of code comes from correctness, security, usability, readability, maintainability, and efficiency. Good comments improve the code's readability and maintainability, and ultimately have a positive effect on the other aspects, such as correctness.
So how should comments be written to be effective and to support code quality? Here are some lessons I have learned from my own work over the years. They apply to most imperative programming languages:
1. Write clear comments for complex function interfaces.
2. State important details clearly in comments.
3. Avoid redundant information within the comments themselves.
4. Keep comments up to date.
5. When an implementation is complex or non-intuitive, comment the implementation as well.
6. For simplified, abstract, or abbreviated variable and function names, give the full name and its meaning in a comment.
7. Do not comment self-explanatory code.
8. Do not write redundant comments for frequently changing code.
First, write clear comments for complex function interfaces. The function interface here means the specification of the function name and its parameters, return value, exceptions, and so on. More strictly, the comment on a function interface defines the contract of the function. Even if the programming language we use does not support contract-oriented programming, or we choose not to program that way, we can still use comments to express the contract of a function conceptually.
How should the comment for a function interface be written? Take C# as an example: a function has a name, parameters, and a return value. The parameters and the return value each have a type, and the function may throw exceptions. Below is an imaginary interface for a function that performs merge sort (the interface is deliberately more complex than necessary, in order to demonstrate how to write comments):
bool MergeSort<T>(T[] array, int begin, int len, IComparer<T> comparer)
{
}
C# supports XML documentation comments, which work much like Javadoc. Its XML comments happen to support everything in the contract that we want to document, so we can write the contract with them. For this function, our comments are as follows:
/// <summary>
/// Performs merge sort on a part of an array with a comparer.
/// </summary>
/// <typeparam name="T">The type of members in the array to be sorted</typeparam>
/// <param name="array">The array to sort</param>
/// <param name="begin">The beginning index of the part in the array to be sorted</param>
/// <param name="len">The length of the part in the array to be sorted</param>
/// <param name="comparer">The comparer used to compare elements in the array; see documentation on <see cref="System.Collections.Generic.IComparer{T}">IComparer</see> for more information</param>
/// <returns>
/// Whether the part of the array is already sorted.
/// </returns>
/// <remarks>
/// <para>This method checks whether the specified part of the source array is already sorted. If it is already sorted, the method returns true directly without changing the array. Otherwise, it sorts the part and returns false.</para>
/// <para>This method performs stable sort on the specified part of the array.</para>
/// <para>After calling this method, the range of the array from position <paramref name="begin"/> and of length <paramref name="len"/> is sorted.</para>
/// <para>The time complexity of this method is O(n log n), where n is the length of the part being sorted.</para>
/// </remarks>
/// <exception cref="System.IndexOutOfRangeException">
/// IndexOutOfRangeException is thrown if the indices <paramref name="begin"/> and <paramref name="len"/> are out of range.
/// </exception>
bool MergeSort<T>(T[] array, int begin, int len, IComparer<T> comparer)
C# XML comments also support more tags, such as example. In day-to-day programming, though, those tags are used only occasionally, as needed; the ones shown above are the ones used most often. Let's walk through them. First, the function has a summary (the summary section). Then each parameter and the return value is explained. Typical exceptions need to be explained, but not every exception has to be. For a complex function, when the summary cannot capture everything of interest in one sentence, write remarks (the remarks section).
The summary should strive to state what the function does in one sentence (two at most). Comments on parameters and return values should say what they mean and should point out any special values. A special value is one whose meaning differs from that of ordinary values. For example, for some optional parameters, passing null means the parameter is ignored; null is then a special value and needs to be described. The remarks section adds further details of the function's contract: preconditions (conditions that must hold before the function is called), postconditions (what the data looks like after the call; the sorted state here is a postcondition), time and space complexity (if required), typical uses, special uses (important for business APIs that must be executed in a specific context), and so on.
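The same contract elements (summary, parameters, special values, return value, precondition, postcondition, complexity, typical exception) can also be written as a plain docstring in a language without XML comments. The following is only a sketch in Python: the names merge_sort_range and _merge_sort and the key parameter are assumptions made for illustration, not part of the C# example above.

```python
from typing import Any, Callable, List, TypeVar

T = TypeVar("T")


def merge_sort_range(items: List[T], begin: int, length: int,
                     key: Callable[[T], Any] = lambda x: x) -> bool:
    """Sort items[begin:begin + length] in place with a stable merge sort.

    Parameters:
        items  - the list to sort; modified in place.
        begin  - index of the first element of the range to sort.
        length - number of elements in the range; 0 is a special value
                 that leaves the list untouched.
        key    - function mapping an element to its sort key; the
                 default compares the elements themselves.

    Return value:
        True if the range was already sorted (the list is not changed),
        False if the range had to be sorted.

    Remarks:
        Precondition: 0 <= begin and begin + length <= len(items).
        Postcondition: items[begin:begin + length] is in ascending key
        order, and elements with equal keys keep their relative order.
        Time complexity is O(n log n), where n is `length`.

    Raises:
        IndexError - if the range does not lie inside the list.
    """
    if begin < 0 or length < 0 or begin + length > len(items):
        raise IndexError("range is outside the list")
    end = begin + length
    already_sorted = all(key(items[i]) <= key(items[i + 1])
                         for i in range(begin, end - 1))
    if already_sorted:
        return True
    items[begin:end] = _merge_sort(items[begin:end], key)
    return False


def _merge_sort(seq, key):
    # Split, sort each half recursively, then merge the halves.
    if len(seq) <= 1:
        return seq
    mid = len(seq) // 2
    left, right = _merge_sort(seq[:mid], key), _merge_sort(seq[mid:], key)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        # "<=" takes from the left half on ties, which keeps the sort stable.
        if key(left[i]) <= key(right[j]):
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

A caller who reads only the docstring can tell that merge_sort_range(data, 1, 3) touches three elements starting at index 1, and that a True result means nothing was changed.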
Here I would like to highlight comments on exceptions. Languages differ in their syntactic requirements for exceptions. C# does not support checked exceptions, and its designer Anders Hejlsberg does not recommend using them; Java, by contrast, expects exceptions other than those caused by program bugs to be checked exceptions. Checked exceptions are exception types that a function must declare in its prototype (that is, its interface) if it can throw them. The benefit is that the caller knows which exceptions it may receive. The downside is that applications become inconvenient to extend: when a new exception class is added in a lower layer, the application must either catch that exception where the lower layer is called or declare it in its own function prototypes. Failing to do so results in a compilation error. For the application's sake, some libraries derive all their exception classes from one base class, so that the application only needs to declare that base class. For these reasons, we write comments only for typical exceptions (those that are easily encountered in practice). Exceptions that are rarely seen, or even theoretically impossible, are not documented at all. And where necessary, even though the checked exception declaration may name only a base class, our comments state the specific subclass exceptions that can occur.
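The last point, documenting the specific subclass even when only a base class is exposed, can be sketched as follows. Python has no checked exceptions, so the base class here is only a convention that callers can catch. All names (StorageError, RecordNotFoundError, StorageUnavailableError, load_record, _RECORDS) are hypothetical and exist only for this illustration.

```python
class StorageError(Exception):
    """Base class for every error raised by this hypothetical library."""


class RecordNotFoundError(StorageError):
    """Raised when a requested record does not exist."""


class StorageUnavailableError(StorageError):
    """Raised when the backing store cannot be reached."""


_RECORDS = {"alice": {"age": 30}}  # stand-in for a real data store


def load_record(name):
    """Return the record stored under `name`.

    Raises:
        RecordNotFoundError - if no record is stored under `name`.
            This is the typical failure that callers should handle.

    Other StorageError subclasses are possible in principle but are not
    listed here, because callers rarely meet them in practice.
    """
    try:
        return _RECORDS[name]
    except KeyError:
        # Report the specific subclass so the comment above stays accurate.
        raise RecordNotFoundError(name) from None
```

A caller may still write `except StorageError`, but the comment tells them which concrete failure to expect.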
Second, state important details clearly in comments. The previous example already touched on this: the time complexity and stability of the sorting algorithm are important details.
Third, avoid redundant information within the comments themselves. Comments exist to explain the program. To keep comments complete, some of their content may overlap with what the code already says, which is redundant information, and this is hard to avoid entirely. Redundancy between different parts of the comments, however, can be avoided. There is no fixed method for this: mainly, reread the comments a few times while writing them and delete repeated content; in some places, a reference to another comment can replace a duplicated one.
Fourth, keep comments up to date. This is necessary for high-quality comments. Whenever a function, interface, or contract changes, its comment should be updated accordingly. In real software engineering, updating at exactly the right moment is hard to guarantee, but we should at least make sure comments are current after big changes. And if we find an incorrect comment while reading code, we can investigate the program's actual behavior and update the comment accordingly.
Fifth, when an implementation is complex or non-intuitive, comment the implementation as well. Sometimes a function is brief and its implementation is self-explanatory. At other times the implementation is more complex, because of complex algorithms, complex business logic, and so on. In such cases, comments that mark the execution steps are helpful. The following topological sort function demonstrates this:
def topological_sort(graph, output_func):
    # Time complexity is O(n ^ 2)
    while len(graph) > 0:
        # Output dependency-free nodes
        to_pop_node_name = []
        for node_name in graph:
            # Remove and output all nodes without dependencies
            if len(graph[node_name].dependencies) == 0:
                output_func(node_name)
                to_pop_node_name.append(node_name)
        # Remove the nodes
        for node_name in to_pop_node_name:
            graph.pop(node_name)
        to_pop_node_name = None  # finished using
        # Remove dependency links
        for node_name in graph:
            current_node = graph[node_name]
            to_pop_node_name = []
            for child_node_name in current_node.dependencies:
                if child_node_name not in graph:
                    to_pop_node_name.append(child_node_name)
            for child_node_name in to_pop_node_name:
                current_node.dependencies.pop(child_node_name)
            to_pop_node_name = None  # finished using
Output dependency-free nodes, Remove the nodes, and so on are each comments on a block of code. Such comments mark the steps of a program. A comment of this kind can also cover a larger range; the technique then is to use curly braces, or the words begin and end, to mark its scope, as shown below:
int i;
int max = -1;
int sum = 0;
// Do the first thing {
for (i = 0; i < arr.Length; i++) {
    if (max < 0 || arr[i] > max) {
        max = arr[i];
    }
}
// }
// Do the second thing {
for (i = 0; i < arr.Length; i++) {
    sum += arr[i];
}
// }
For those who do not like to see curly braces in comments, use begin and end instead:
// Begin of "Do the second thing"
for (i = 0; i < arr.Length; i++) {
    sum += arr[i];
}
// End of "Do the second thing"
The advantage of using curly braces or begin and end in comments is that a range is still clearly delimited even when a chunk of code has nested comments. In addition, my personal rule is not to put numbering such as step 1, step 1.1, step 2 in comments. The reason is simple: once a new step is inserted among the existing ones, every step number after it has to be adjusted, which is too much trouble.
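To make the nesting case concrete, here is a small Python sketch in which an outer range contains two inner ranges. The function summarize and the step names are made up for this illustration. Note that the steps carry names rather than numbers, so a new step can be inserted without renumbering anything.

```python
def summarize(values):
    """Return (largest, total) for a non-empty list of numbers."""
    # Begin of "Scan the list"
    largest = values[0]
    total = 0
    for value in values:
        # Begin of "Track the maximum"
        if value > largest:
            largest = value
        # End of "Track the maximum"

        # Begin of "Accumulate the sum"
        total += value
        # End of "Accumulate the sum"
    # End of "Scan the list"
    return largest, total
```

For a function this short the markers are more than it needs; they pay off when each range spans many lines.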
In addition, complex algorithms and business requirements also need comments, so that when you look back at the code a few years later you have not forgotten why it was written this way. For an algorithm, explain what its requirements are, how it is designed, what input it has to handle, what principle the processing follows, and so on. For business logic, describe the input conditions to be supported (including global and static variables and environmental data such as databases), the meaning of every processing step in the business process, the effect on data and business state once processing has finished, and so on. As an example for an algorithm, the following comes from a function that checks whether a graph contains a loop before the graph is topologically sorted:
def detect_loop(graph, o_loop):
    """
    detect_loop: Detects any loop in a graph.
    Parameters:
        graph - the graph to test
        o_loop - a list to receive the looping nodes
    Return value:
        True if a loop has been detected.
        False otherwise.
    """
    # We are using depth-first search to find loops.
    #
    # If we do not implement recursive calls, we could use trace-back in a
    # non-recursive manner; Python's default recursion limit is about 1000,
    # which is in general enough here, as the dependency depth we analyze is
    # usually less than 100.
    #
    # Due to the fact that if we traverse a non-tree directed acyclic graph
    # (DAG), we may end up with a time complexity of O(2 ^ n), we make a deep
    # copy of the graph first, and make it into a tree. During the process, we
    # can detect loops. The time complexity is O(n ^ 2) where n is the number
    # of nodes.
    result = False
    copied_graph = graphnode.deep_copy_graph(graph)
    # Cases:
    # Root 1 leads to a loop -- will be detected and the function will return.
    #   The loop will have a link pointing back to an ancestor node or the
    #   current node itself.
    # Root 1 leads to a DAG -- any link to a visited node (cannot be an
    #   ancestor node or the current node itself) will be detected and
    #   removed, and the DAG is made into a tree.
    # Root 1 leads to a DAG (call it DAG1), Root 2 links to DAG1 -- no loop
    #   can involve DAG1. Reason: suppose there is a loop involving DAG1;
    #   then from a node that is a part of the intersection, we can go back
    #   to it through the links, thus making DAG1 not a DAG -- contradiction.
    #   So if Root 2 leads to a loop -- the loop will be detected by checking
    #   the DFS traversal stack.
    # If all of Root 1..n-1 lead to DAGs, and Root n leads to a loop, it will
    #   be detected only there.
    #
    # accessed: used to mark accessed nodes in the DAG. Its members are the
    # names of accessed nodes. When an accessed node is met through a link,
    # the traversal returns and the link is removed, because the link should
    # not be added to the tree.
    accessed = set()  # of node name (string)
    # traversal_stack: used to record the loop to show to the user
    traversal_stack = []
    for node_name in copied_graph:
        result = detect_loop_rec(copied_graph, node_name, accessed,
                                 traversal_stack)
        if result:
            # Loop detected
            o_loop.extend(traversal_stack)
            break
    return result
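The helper detect_loop_rec is not shown in this article. As a rough, self-contained illustration of the depth-first idea described in the comments, the sketch below finds a loop using a traversal stack. It is not the author's implementation: it assumes the graph is a plain dict mapping each node name to the names it depends on, and it uses a visited set instead of copying the graph and pruning links into a tree. The name find_loop is made up for this example.

```python
def find_loop(graph):
    """Return a list of node names forming a loop, or [] if there is none.

    graph maps each node name to an iterable of the names it depends on.
    This is an assumed representation, not the one used in the article.
    """
    visited = set()   # nodes whose whole subtree has been explored
    stack = []        # current DFS path, used to report the loop
    on_stack = set()  # same nodes as `stack`, for fast membership tests

    def visit(name):
        if name in on_stack:
            # A link back to an ancestor (or to the node itself): a loop.
            stack.append(name)
            return True
        if name in visited:
            # Already fully explored from another path; no loop through it.
            return False
        stack.append(name)
        on_stack.add(name)
        for dep in graph.get(name, ()):
            if visit(dep):
                return True
        on_stack.discard(name)
        stack.pop()
        visited.add(name)
        return False

    for name in graph:
        if visit(name):
            # Keep only the path from the first occurrence of the repeated
            # node to its second occurrence.
            start = stack.index(stack[-1])
            return stack[start:]
    return []
```

Because every node is fully explored at most once, this variant visits each node and link a bounded number of times. Like the original, it is recursive, so very deep dependency chains would run into Python's recursion limit.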
Sixth, for simplified, abstract, or abbreviated variable and function names, give the full name and its meaning in a comment. For example, when you use Winnt4wks to stand for Microsoft Windows NT Workstation 4 i386 multiprocessor free, write the full name in a comment to the right of (or above) the declaration:
// Winnt4wks: Microsoft Windows NT Workstation 4 i386 multiprocessor free
object Winnt4wks;
Note that in the above example, where the comment is written above the variable, it starts with the variable name, followed by a colon and then the explanation.
Seventh, do not comment self-explanatory code. This is natural: if everyone knows what a piece of code does, there is no need to comment it. Such a comment only distracts the reader. Worse, it has to be maintained whenever the code changes, and if that maintenance is forgotten, the comment becomes misleading. For example, the comment in the following code is completely unnecessary in production code:
// Loop and print every element of the array
for (i = 0; i < arr.Length; i++) {
    Console.WriteLine(arr[i]);
}
Eighth, do not write redundant comments for frequently changing code. As mentioned before, the meaning of comments and code may overlap somewhat. If a piece of code changes often, its owner generally knows what it means, because it was changed only recently. Necessary comments should still be added, but redundant information that duplicates the meaning of the code is less necessary and can be omitted.
As a programmer, with the above points in mind, you can write good comments that make your code easier to read and maintain.