1. Summary
This is the second article in the bad code series. In it I will discuss how to evaluate code as efficiently and objectively as possible.
After publishing the first article in this series, I found that it was unexpectedly popular, and many people also complained about one problem or another in their own code.
Recently my department ran a bootcamp, and I happened to be responsible for the code quality part of the training. During the course everyone spent a lot of time discussing, improving, and polishing their own code. Although the recent graduates cared a great deal about code quality, the final result still did not reach the "very good" level. The main reason is that they did not understand what good code "should" look like.
2. What is good code
The first step in writing good code is to understand what good code is. When I was preparing the bootcamp course I struggled with this question: I tried to distinguish "excellent", "good", and "bad" code by some precise definition, but in the process of summarizing I found that most descriptions of "what good code is" are not actionable.
2.1. Definition of good code
Search the Internet for "elegant code" and you will find definitions like the following:
Bjarne Stroustrup, the father of C++:
- The logic should be straightforward, so that bugs have nowhere to hide;
- Dependencies should be minimal, to ease maintenance;
- Error handling should be complete and follow a clearly stated strategy;
- Performance should be close to optimal, so that nobody is tempted to clutter the code with unprincipled optimizations;
- Clean code does one thing well.
Grady Booch, author of "Object-Oriented Analysis and Design with Applications":
- Clean code is simple and direct;
- Clean code reads like well-written prose;
- Clean code never obscures the designer's intent, but is full of crisp abstractions and straightforward lines of control.
Michael Feathers, author of "Working Effectively with Legacy Code":
- Clean code always looks as if it was written by someone who cares about its quality;
- There is nothing obvious that you can do to make it better;
- The author of the code seems to have thought of everything.
These all sound very reasonable, but they are hard to use as a reference when actually judging code, especially for newcomers: how should one interpret "simple and direct code" or "nothing obvious that needs improving"?
In practice, many students do run into this problem: they are never quite sure about their own code, or they think it is very good while others think it is terrible. More than once a new colleague and I argued for days over code quality standards, and neither could persuade the other: we each insisted that our own standard for good code was the correct one.
After countless code reviews, I think this picture sums it up a little better:
In a sense, the standard for evaluating code quality is somewhat like that for literary works. The quality of a novel, for example, is judged mainly by its readers, whose individual subjective opinions add up to a relatively objective evaluation. It is not judged by the word count or by which rhetorical devices the author used, measures that look completely objective but are not actually meaningful.
But code differs from a novel in that it actually has two kinds of readers: the computer and the programmer. As I said in the previous article, even if no programmer can understand a piece of code, the computer can still understand and run it.
So I need to define code quality along two dimensions: the subjective one, how well humans understand it, and the objective one, how it behaves when running on a computer.
Since there is a subjective part, there will be individual differences: the evaluation of the same piece of code will differ with the level of the person reading it. This is the problem most newcomers face: they have no evaluation standard they can actually apply, so the quality of the code they write is hard to improve.
Some articles on code quality talk about tendencies or principles. They are quite right, but they offer little practical guidance. So in this article I would like to try to present evaluation standards that (I think) do not depend on the evaluator's actual level.
2.2. Readable code
After weighing it for a long time, I decided to put readability first: would a programmer rather take over a project that has bugs but is readable, or one that has no bugs but cannot be read? If it is the latter, you can simply close this page and go do something more meaningful to you.
2.2.1. Verbatim translation
Many books on code quality emphasize one point: a program is written first for people to read and only second for machines to execute. I agree with this view. To judge whether a piece of code can be understood, I like to have the author translate the code word for word into Chinese, try to form sentences from it, and then read those sentences aloud to someone who has not seen the code. If that person can understand, the readability of the code is basically acceptable.
The reason for judging this way is simple: it is what other people do when they try to understand a piece of code. A reader reads the words one by one and infers the meaning of the statement. If the statement alone cannot be understood, the reader has to look at the context; if the immediate context is not enough, the reader may have to learn more details from other parts of the program to help with the inference. In most cases, the more context it takes to understand what a piece of code does, the worse its quality.
The advantage of verbatim translation is that it lets authors easily discover assumptions and readability traps that only they know about and that are not expressed in the code. Code whose meaning cannot be recovered by literal translation is mostly bad code, such as "ms stands for messageService", "ms.proc() sends a message", or "tmp is the current file".
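As a small illustration of the verbatim-translation test, the sketch below contrasts the two styles. Every name in it (`MessageService`, `sendMessage`, `currentFileName`, and so on) is hypothetical and was chosen only to mirror the `ms` and `tmp` examples above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical service, used only for illustration.
class MessageService {
    private final List<String> sentMessages = new ArrayList<>();

    void sendMessage(String recipient, String text) {
        sentMessages.add(recipient + ": " + text);
    }

    List<String> sentMessages() {
        return sentMessages;
    }
}

class VerbatimTranslationDemo {
    // Hard to translate: the reader must already know that "ms" is a message
    // service and that "tmp" holds the name of the current file.
    static void notifyCryptic(MessageService ms, String tmp) {
        ms.sendMessage("admin", tmp);
    }

    // Translates almost word for word: "message service, send a message to
    // admin containing the current file name".
    static void notifyAdminOfCurrentFile(MessageService messageService, String currentFileName) {
        messageService.sendMessage("admin", currentFileName);
    }

    public static void main(String[] args) {
        MessageService messageService = new MessageService();
        notifyAdminOfCurrentFile(messageService, "report.txt");
        System.out.println(messageService.sentMessages());
    }
}
```

Both methods do exactly the same thing; only the second can be read aloud to someone who has never seen the surrounding code.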
2.2.2. Following the convention
Conventions include how code and documents are organized, how comments are written, coding style, and so on, all of which matter for the future maintenance of the code. There is no mandatory standard for which conventions to follow, but I prefer to follow the ones that more people follow.
Keeping the style consistent with open source projects is generally the more reliable choice, followed by the company's internal coding style. If the company's internal style conflicts seriously with the style of current open source projects, it often means that the company's technology tends to be closed, or that it is already struggling to keep up.
In any case, following a convention is better than inventing rules of your own, because it lowers the cost of understanding, communication, and maintenance. If a project invents some strange rules of its own, it may mean that the author has not read enough code.
Judging whether a project follows conventions usually requires some experience on the reader's part, or the help of static checking tools such as Checkstyle. If you do not know where to start, following Google is fine in most cases: you can refer to the Google style guides, some of which have Chinese versions.
In addition, there is no need to agonize over what exactly is gained by following a convention. It is like asking whether it is better to walk on the left or on the right: even if you reach a conclusion it means little. For most conventions, simply complying is enough.
2.2.3. Documentation and annotations
Documentation and comments are important parts of a program, and they are one of the ways to understand a project. In some scenarios the two overlap or intersect (Javadoc, for example, can actually be considered documentation).
The standard for documentation is simple: it can be found and it can be understood. In general I care most about these kinds of documents:
- An introduction to the project, including its functions, authors, directory structure, and so on. The reader should be able to get a rough idea of what the project does within 3 minutes.
- A quick start for newcomers. The reader should be able to build the code and try simple usage within 1 hour by following the document.
- Detailed documentation for users, such as interface definitions, parameter meanings, and design. The reader should be able to learn from it how the functions (or interfaces) are used.
Some comments are actually documentation, such as the Javadoc mentioned earlier. This keeps the source code and its documentation together, which is clearer for the reader and also simplifies much of the documentation maintenance.
Another kind of comment is not part of the documentation, such as a comment inside a function. Its job is to explain what the code itself cannot express, such as "why XXX is not used here" or "watch out for the XXX problem here".
In general I look first at the number of comments: comments inside functions should be neither numerous nor completely absent. In my personal experience, seeing one or two after scrolling through a few screens is about normal. Too many may mean that the code itself has readability problems, and none at all may mean that some hidden logic is left unexplained, in which case adding a few comments should be considered.
Next comes the quality of the comments: assuming the code is already readable, a comment should provide information beyond what the code says. More documentation and comments is not always better, since they increase maintenance costs. For this discussion, see the section on simplicity (2.4.3).
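The two kinds of comments described above can be sketched as follows. This is only an illustration: the class `MemberPage`, the limit of 200, and the stated reason for the limit are all assumptions made up for the example.

```java
import java.util.Arrays;
import java.util.List;

class MemberPage {
    // Hypothetical limit, chosen only for this illustration.
    static final int MAX_PAGE_SIZE = 200;

    /**
     * Returns the first page of members. (A documentation comment: it tells
     * callers how to use the method.)
     *
     * @param members the full member list; must not be null
     * @return a view of at most {@code MAX_PAGE_SIZE} members from the start of the list
     */
    static List<String> firstPage(List<String> members) {
        // An in-function comment explains why, not what. Assumed reason: the
        // client renders a whole page at once, so an unbounded list would
        // make the page unusable.
        int end = Math.min(members.size(), MAX_PAGE_SIZE);
        return members.subList(0, end);
    }

    public static void main(String[] args) {
        System.out.println(firstPage(Arrays.asList("ann", "bob", "cid")));
    }
}
```

Note that neither comment restates what the code already says; a comment such as "take the minimum of the size and the limit" would add nothing.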
2.2.4. Recommended Reading
- "Clean Code: A Handbook of Agile Software Craftsmanship"
2.3. Code that can be published
Code written by newcomers has a typical characteristic: because they lack experience maintaining projects, their code always leaves many things unconsidered. For example, nothing unusual shows up during testing, but after release many unexpected situations appear; when a problem occurs they do not know where to start investigating; or they can only leave the system in an unstable state, running grudgingly on a series of coincidences.
2.3.1. Handling Exceptions
Novice programmers generally have no awareness of exception handling, but the environment that code actually runs in is full of exceptions: servers crash, networks time out, users do random things, and malicious people attack your system.
My first impression of how well a piece of code handles exceptions comes from its unit test coverage. Most exceptions are hard to reproduce in a development or test environment, and even a professional test team can hardly simulate every anomaly in an integration test environment.
Unit tests, on the other hand, can simulate all kinds of anomalies quite simply. If a module's unit test coverage is below 50%, it is hard to believe that the code has taken exception handling into account. Even if it has, those exception-handling branches have never been verified, so how can they be expected to behave well when something goes wrong in the real running environment?
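To make the point about simulating anomalies concrete, here is a minimal sketch that uses a hand-written stub instead of a test framework. `PriceSource`, `PriceService`, and the fallback value are hypothetical names invented for this example; a real project would more likely use a unit test framework, and would also log the failure.

```java
import java.io.IOException;

// Hypothetical dependency that may fail at runtime (network, disk, ...).
interface PriceSource {
    int fetchPriceInCents(String productId) throws IOException;
}

class PriceService {
    static final int UNKNOWN_PRICE = -1;
    private final PriceSource source;

    PriceService(PriceSource source) {
        this.source = source;
    }

    // The exception-handling branch under test: report "unknown" instead of
    // letting the failure propagate to the caller.
    int priceOrUnknown(String productId) {
        try {
            return source.fetchPriceInCents(productId);
        } catch (IOException e) {
            return UNKNOWN_PRICE;
        }
    }
}

class PriceServiceCheck {
    public static void main(String[] args) {
        // A stub that always times out: trivial here, hard to arrange in an
        // integration environment.
        PriceSource failing = productId -> {
            throw new IOException("simulated timeout");
        };
        PriceSource healthy = productId -> 1999;

        System.out.println(new PriceService(failing).priceOrUnknown("p-1"));
        System.out.println(new PriceService(healthy).priceOrUnknown("p-1"));
    }
}
```

Without the failing stub, the `catch` branch would never run before release, which is exactly the unverified code the coverage number exposes.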
2.3.2. Handling concurrency
Many of the resumes I receive say "proficient in concurrent programming" or "familiar with multi-threading mechanisms", and in interviews the candidates talk fluently: locks, mutexes, thread pools, synchronization, semaphores, a whole stream of terms pours out. But give them a real scenario and ask them to write a very simple concurrent program, and not many can write it well.
In fact, concurrent programming really is difficult. If the difficulty of writing good synchronous code is 5, then the difficulty of concurrent programming can reach 100. This is not alarmism: many seemingly stable programs can still break in concurrent scenarios. For example, we recently ran into a case where the Linux kernel crashed when calling a certain system function because of a synchronization problem.
The key to judging the quality of concurrent code is not whether some synchronization strategy is applied, but whether the code protects its shared resources:
- Any memory access other than to local variables carries a concurrency risk (such as accessing object fields or static variables).
- Access to shared resources also carries a concurrency risk (such as caches and databases).
- If the callee is not declared thread-safe, there is a good chance of concurrency problems (such as Java's HashMap).
- Any operation that depends on ordering can have concurrency problems even if each step is thread-safe (such as deleting a record first and then decrementing the record count).
The first three cases are relatively easy to spot from the code itself, as long as you train your sensitivity to shared-resource access.
The last case, however, is often hard to find just by reading the code. The two calls that conflict may not even be in the same program (for example, two systems reading and writing the same database, or concurrent calls into different modules of one program). Still, whenever code performs a "first do A, then do B" sequence on a shared resource without a lock, you should raise your guard.
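The "first do A, then do B" pattern can be sketched with a small counter. The class and method names below are hypothetical; the only library behavior relied on is that `ConcurrentHashMap.merge` performs its update atomically.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class VisitCounter {
    private final Map<String, Integer> visits = new ConcurrentHashMap<>();

    // Risky: "first read, then write". Each call is thread-safe on its own,
    // but another thread can update the entry between get() and put(),
    // so increments can be lost.
    void recordUnsafe(String page) {
        Integer current = visits.get(page);
        visits.put(page, current == null ? 1 : current + 1);
    }

    // Safer: merge() performs the read-modify-write as one atomic step.
    void recordAtomic(String page) {
        visits.merge(page, 1, Integer::sum);
    }

    int count(String page) {
        return visits.getOrDefault(page, 0);
    }
}

class VisitCounterDemo {
    // Runs recordAtomic from several threads and returns the final count.
    static int runConcurrently(int threads, int callsPerThread) {
        VisitCounter counter = new VisitCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    counter.recordAtomic("home");
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.count("home");
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(4, 10_000));
    }
}
```

With `recordAtomic` the result is always `threads * callsPerThread`. Swapping in `recordUnsafe` may produce a smaller number on some runs, even though the underlying map is thread-safe, which is why this kind of bug is hard to find by reading code.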
2.3.3. Optimizing Performance
Performance is an important indicator for evaluating a programmer's ability, and many programmers love to talk about the performance of their programs. However, performance is hard to see directly from the code; it usually takes performance testing tools, or running in the real environment, to get results.
Looking only at the code, there are two ways to evaluate execution efficiency:
- The time complexity of the algorithm: a program with high time complexity will inevitably run inefficiently.
- The cost of each single step: operations whose single step is expensive, such as database access and I/O, should be done as rarely as possible.
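As an illustration of the first criterion, the sketch below computes the same result in two ways. The class `CommonMembers` and its method names are hypothetical; the complexity figures are the usual ones for `List.contains` and for hash-based lookup.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CommonMembers {
    // O(n * m): List.contains scans the whole second list for every element
    // of the first.
    static List<String> slow(List<String> first, List<String> second) {
        List<String> common = new ArrayList<>();
        for (String name : first) {
            if (second.contains(name)) {
                common.add(name);
            }
        }
        return common;
    }

    // O(n + m) on average: one pass to build the set, one pass of
    // constant-time lookups.
    static List<String> fast(List<String> first, List<String> second) {
        Set<String> lookup = new HashSet<>(second);
        List<String> common = new ArrayList<>();
        for (String name : first) {
            if (lookup.contains(name)) {
                common.add(name);
            }
        }
        return common;
    }
}
```

The same reasoning applies to the second criterion: if each lookup were a database query instead of an in-memory scan, fetching the second list once and reusing it would matter even more. Whether the difference is worth it for a given input size is something only measurement can tell.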
In actual work you will also see some programmers who are too keen on optimizing efficiency, which in turn hurts readability, increases complexity, or lengthens the schedule. A simple way to handle this is to have the author explain where the bottleneck is in the program, why the bottleneck exists, and what benefit the optimization brings.
Of course, whether the question is optimization or over-optimization, the best way to judge performance is to let the data speak rather than just reading the code. Performance testing is beyond the scope of this article, so it is not covered in detail here.
2.3.4. Log
Logs determine how hard it is to troubleshoot a program when something goes wrong. Experienced programmers (that is, those who have fallen into plenty of pits) have probably met this scene: while troubleshooting, one log line is missing, nobody knows the value of a certain variable, and the analysis gets nowhere.
There are three evaluation criteria for logs:
- Whether the logs are sufficient: every exception and every external call needs a log, and logs are needed at the entry, the exit, and the key points along a call path.
- Whether the logs are clearly expressed, including whether they can be understood and whether the style is consistent. The criteria are the same as for code readability and are not repeated here.
- Whether the logs contain enough information, including the call context, the values returned by external calls, keywords for searching, and so on, so that the information can be analyzed.
For online systems, the number of logs can generally be controlled by adjusting the log level, so as long as the logging code does not get in the way of reading, it is basically acceptable.
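The three criteria can be sketched with the standard `java.util.logging` package. `OrderClient`, `InventoryApi`, and the message wording are hypothetical; the point is only where the log lines sit and what they carry.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

class OrderClient {
    private static final Logger LOG = Logger.getLogger(OrderClient.class.getName());

    // Hypothetical external system.
    interface InventoryApi {
        int remaining(String sku) throws IOException;
    }

    static boolean canShip(InventoryApi api, String orderId, String sku, int quantity) {
        // Entry: logged at FINE so it can be switched off by log level.
        LOG.log(Level.FINE, "canShip start orderId={0} sku={1} quantity={2}",
                new Object[] {orderId, sku, quantity});
        try {
            // External call: record the returned value together with the
            // context (orderId, sku) that someone will search for later.
            int remaining = api.remaining(sku);
            LOG.log(Level.INFO, "inventory replied orderId={0} sku={1} remaining={2}",
                    new Object[] {orderId, sku, remaining});
            return remaining >= quantity;
        } catch (IOException e) {
            // Exception: always logged, with the stack trace attached.
            LOG.log(Level.WARNING, "inventory call failed orderId=" + orderId + " sku=" + sku, e);
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canShip(sku -> 5, "o-1", "sku-1", 3));
    }
}
```

Every line uses the same `key=value` style, so the logs stay uniform and easy to search by order id.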
2.3.5. Extended Reading
- "Release It!: Design and Deploy Production-Ready Software" (do not read the Chinese version, the translation is really bad)
- "Numbers Everyone Should Know"
2.4. Maintainable Code
Compared with the previous two kinds of code, the criteria for maintainable code are vaguer, because maintainability concerns the future, and it is hard for the average newcomer to imagine how current practices will affect it. In my experience, though, you can generally keep asking two questions:
- What if the author leaves the company?
- What if the author had not done it this way?
2.4.1. Avoid duplication
Almost all programmers know they should avoid copying code, yet copied code still inevitably turns up and kills the maintainability of programs.
There are two kinds of code duplication: duplication within a module and duplication between modules. Either kind reflects, to some extent, a problem with the programmer's level. Duplication within a module is the more serious: if large amounts of duplicate code can appear in the same file, the author is capable of writing any kind of bizarre code.
There is no need to read the code over and over to detect duplication. Modern IDEs generally provide tools that check for duplicate code with just a few mouse clicks.
Besides code duplication, many programmers who are passionate about code quality are prone to another kind of repetition: duplication of information.
I have seen some newcomers who like to write a comment in front of every line of code, for example:
    // the length of the member list is > 0 and < 200
    if (memberList.size() > 0 && memberList.size() < 200) {
        // return the current member list
        return memberList;
    }
It may look thorough, but a few years later the code becomes:
    // the length of the member list is > 0 and < 200
    if (memberList.size() > 0 && memberList.size() < 200 || (tmp.isOpen() && flag)) {
        // return the current member list
        return memberList;
    }
And later it may be changed again:
    // edit by Axb 2015.07.30
    // the length of the member list is > 0 and < 200
    //if (memberList.size() > 0 && memberList.size() < 200 || (tmp.isOpen() && flag)) {
    //    // return the current member list
    //    return memberList;
    //}
    if (tmp.isOpen() && ...) {
        return memberList;
    }
As the project evolves, useless information piles up, until in the end it is impossible to tell which information is valid and which is not.
If you find several things in a project doing the same job, such as comments that describe what the code is doing, or comments used in place of version control, then the code cannot be called good code.
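One possible way to remove this kind of duplicated information is to let names carry what the comment used to say, and to leave history to the version control system. The sketch below is an assumption about how the member-list check above could be written; `MemberListRules`, `MAX_LIST_SIZE`, and the method name are all hypothetical.

```java
import java.util.List;

class MemberListRules {
    // Hypothetical upper bound, mirroring the 200 in the snippets above.
    static final int MAX_LIST_SIZE = 200;

    // The method name states the rule once. If the rule changes, the name
    // and the condition change together, in one place, and the old version
    // stays in version control instead of in commented-out code.
    static boolean isNonEmptyAndBelowLimit(List<String> memberList) {
        return !memberList.isEmpty() && memberList.size() < MAX_LIST_SIZE;
    }
}
```

A caller then reads `if (MemberListRules.isNonEmptyAndBelowLimit(memberList))`, which needs no comment and cannot silently drift away from one.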
2.4.2. Module partitioning
High cohesion within a module and low coupling between modules are the standards that most designs follow. With proper module partitioning, complex functionality can be broken down into smaller pieces that are easier to maintain.
In general, the soundness of a module can be judged from code length: a class longer than 2000 lines, or a function longer than two screens, is a fairly dangerous signal.
Another place that reflects the quality of module partitioning is dependencies. If a module has a particularly large number of dependencies, or even circular dependencies, it shows that the author planned the modules poorly, and during future maintenance a small change is likely to ripple through the whole project.
There are a number of tools that provide dependency analysis, such as the dependency analysis feature in IntelliJ IDEA, and using them can be a great help in evaluating code quality.
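When such a tool reports a circular dependency, one common way to break it is to have one module depend on a small interface instead of on the other module's concrete class. The sketch below only illustrates that idea; the order and notification "modules" and all names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Owned by the (hypothetical) order module: it states what the module needs
// without naming who provides it.
interface OrderListener {
    void orderPlaced(String orderId);
}

// Order module: depends only on the interface above.
class OrderService {
    private final List<OrderListener> listeners = new ArrayList<>();

    void addListener(OrderListener listener) {
        listeners.add(listener);
    }

    void placeOrder(String orderId) {
        for (OrderListener listener : listeners) {
            listener.orderPlaced(orderId);
        }
    }
}

// Notification module: depends on the order module's interface, never the
// other way round, so the dependency graph has no cycle.
class EmailNotifier implements OrderListener {
    final List<String> outbox = new ArrayList<>();

    @Override
    public void orderPlaced(String orderId) {
        outbox.add("Order " + orderId + " confirmed");
    }
}

class DependencyDemo {
    public static void main(String[] args) {
        OrderService orders = new OrderService();
        EmailNotifier notifier = new EmailNotifier();
        orders.addListener(notifier);
        orders.placeOrder("o-42");
        System.out.println(notifier.outbox);
    }
}
```

A side effect of this shape is that `OrderService` can be unit tested with a trivial stub listener, which ties in with the point about test coverage.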
It is worth mentioning that, in most cases, improper module partitioning goes together with very low unit test coverage: unit tests for a complex module are very hard to write, sometimes practically impossible. So looking directly at unit test coverage is also a fairly reliable way to evaluate.
2.4.3. Simplicity and abstraction
Whenever code quality is mentioned, adjectives such as "concise" and "elegant" inevitably come up. The word "concise" actually covers a lot: code that avoids repetition is concise, a design that is abstract enough is concise, and every attempt to improve maintainability is really an attempt to subtract.
Inexperienced programmers often fail to realize the importance of simplicity and enjoy tinkering with complicated gadgets. But complexity is the natural enemy of maintainability, and also a threshold that tests a programmer's ability.
Programmers who have crossed that threshold should be able to control growing complexity, to summarize and abstract the essence of things, and to reflect it in their own design and coding. The life cycle of a program is itself an iterative process of going from simple to complex and back to simple.
For this part I cannot offer simple evaluation criteria. It is more a way of thinking: beyond understanding it, you also need practice. Look more, think more, communicate more; very often things can be simplified far more than you first expected.
2.4.4. Recommended Reading
- "Refactoring: Improving the Design of Existing Code"
- "Design Patterns: Elements of Reusable Object-Oriented Software"
- "Software Architecture Patterns: Understanding Common Architecture Patterns and When to Use Them"
3. Conclusion
This article mainly introduces some methods for evaluating code quality, some of them more objective and some more subjective. As said before, evaluating code quality is a subjective matter, even though the article lists many evaluation methods. In fact, a lot of code that I consider fine is still criticized by others, so this article is only a first draft, and more content will need to be added and refined in the future.
Although everyone has different tendencies when evaluating code quality, on the whole the ability to evaluate it can be likened to a programmer's "taste", and the accuracy of the evaluation grows with experience. Throughout this process, the spirit of thinking, learning, and criticizing needs to be maintained at all times.
In the next article I will talk about how to improve the quality of your code.
Links: http://kb.cnblogs.com/page/526769/