Things about Rotten Code (Final Part)

Source: Internet
Author: User

Suppose you have read the first two articles of this bad-code series: you know what bad code is and what good code is, yet running into bad code remains unavoidable (as noted before, almost no programmer can completely avoid writing it!). The next question, then, is how to deal with the bad code around you.

1. Improving maintainability

Improving code quality is a big project. To get that project started, maintainability is often a good place to begin, but it is only the beginning.

1.1. The paradox of refactoring

Many people treat refactoring as a one-off campaign: once the code has rotted beyond changing, or there are no new requirements, a group of people is assembled and a dedicated stretch of time is set aside just for refactoring. This can work in traditional enterprise development, but it is hard to fit into Internet development, for two reasons:

    1. Internet development stresses rapid iteration. Large-scale refactoring usually requires suspending feature development, which is basically impossible to arrange.
    2. A project with no new requirements has usually passed its period of active growth, so even a completed refactoring brings hardly any benefit.

This creates a paradox: the systems that change frequently are precisely the ones that need refactoring, yet refactoring delays development and slows down those very changes.

Facing this contradiction, one option is to abandon refactoring altogether: let the code's quality decay naturally until the project reaches the end of its life, then choose to drop it or rewrite it. In some scenarios this approach genuinely works, but I dislike it: rather than letting engineers burn their energy every day on meaningless drudgery, why not do something more worthwhile?

1.2. Refactoring step by step

1.2.1. Before you begin

The first step in improving your code is to bind the IDE's refactoring commands to convenient shortcut keys. This matters: the decision to refactor often hinges not on how great your new design is, but on how much effort the refactoring itself costs.

Taking IntelliJ IDEA as an example, I bind the refactoring menu to a shortcut key:

This way, when I want to refactor I can bring up the menu instantly instead of mousing through it slowly. A shortcut key saves only a few seconds per refactoring, but it markedly lowers the psychological cost for the engineer, and in time small-scale refactoring becomes a natural part of daily development.

I divide refactoring into three categories: module-internal refactoring, module-level refactoring, and project-level refactoring. I draw these lines not out of some classification compulsion, but because, as described below, each category calls for a different approach.

1.2.2. Module-internal refactoring: anytime

The purpose of module-internal refactoring is to straighten out the logic inside a module and split one huge function into maintainable pieces of code. Most IDEs provide support for this type of refactoring, along the lines of:

    • Renaming variables
    • Renaming functions
    • Extracting internal functions
    • Extracting internal constants
    • Extracting variables

This type of refactoring has one defining trait: the changes stay concentrated in one place, the code's logic is barely modified and remains controllable, and the IDE's refactoring tools are robust, so there is almost no risk.

The following example shows how to refactor a lengthy function through the IDE:

In the example we rely almost entirely on the IDE to split one lengthy function into two sub-functions; we can then make further small-scale refactorings to whatever bad code remains inside the sub-functions, and refactoring within those two functions proceeds by the same method. Each small refactoring should take no more than 60 seconds; otherwise it will seriously hurt development efficiency, and the refactoring effort will end up swamped by the endless stream of feature requests.
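As a minimal sketch of what such an IDE-driven split produces (the class and function names here are invented for illustration, not taken from any real project):

```java
// After an IDE "Extract Method" refactoring, a formerly lengthy function
// becomes a short entry point plus two small, readable sub-functions.
class ReportPrinter {

    // Refactored entry point: the original long body is now two calls.
    String print(String rawName, int total) {
        String name = normalizeName(rawName);   // extracted sub-function 1
        return formatLine(name, total);         // extracted sub-function 2
    }

    // Extracted: input cleanup that used to be inlined at the top.
    private String normalizeName(String rawName) {
        if (rawName == null || rawName.trim().isEmpty()) {
            return "(unknown)";
        }
        return rawName.trim();
    }

    // Extracted: output formatting that used to be inlined at the bottom.
    private String formatLine(String name, int total) {
        return name + ": " + total;
    }
}
```

Each extraction is one IDE action and takes seconds; the behavior is unchanged, but each piece can now be read, and further refactored, on its own.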

At this stage you should add some unit tests to the existing modules to ensure that the refactoring is correct. In my experience, though, some simple refactorings, such as renaming a local variable or extracting a variable, are basically reliable even without tests; if I had to choose between quickly finishing a module's internal refactoring and reaching 100% unit-test coverage, I might well choose to finish the refactoring quickly.

The benefit of this kind of refactoring is mainly improved readability at the function level and the elimination of oversized functions; it also lays a good foundation for further module-level splitting.

1.2.3. Module-level refactoring: only one at a time

From here on, refactorings begin to involve multiple modules, for example:

    • Removing useless code
    • Moving functions to other classes
    • Extracting functions into new classes
    • Modifying function logic

The IDE's support for this kind of refactoring is often limited, and it occasionally causes baffling problems, such as a class rename accidentally rewriting a constant string in a configuration file.

This type of refactoring is primarily about optimizing the design of the code and stripping out unrelated, coupled logic. In the course of it you will create many new classes and new unit tests, and at this point unit tests are necessary.

Why write unit tests here?

  • On the one hand, because this kind of refactoring touches concrete code logic, integration tests can hardly cover every case; unit tests can verify that the modifications are correct.
  • More importantly, code for which no unit tests get written usually signals bad design: the module has too many dependencies, or a function carries too much responsibility. Imagine having to mock more than ten input objects just to call one function, with each of those objects mocking its own dependencies in turn... If a module cannot be tested in isolation, then from a design standpoint it is plainly unqualified.

One more thing bears spelling out: by "unit tests" here I mean tests that exercise a single module only. Tests that depend on multiple modules, for example simulating a database in memory and testing the business logic layered on top, do not count; such tests will not improve your design.
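To make the point concrete (the class and interface names below are hypothetical, my own illustration): a module that receives its dependency through an interface can be tested alone with a one-line fake, no database or mocking framework required:

```java
// Hypothetical example: because PriceCalculator depends on an interface
// rather than on a concrete database class, a unit test can exercise it
// in complete isolation.
interface DiscountSource {
    int discountPercent(String customerId);
}

class PriceCalculator {
    private final DiscountSource discounts;

    PriceCalculator(DiscountSource discounts) {
        this.discounts = discounts;
    }

    // The business logic under test: apply the customer's discount.
    int finalPrice(String customerId, int listPrice) {
        int percent = discounts.discountPercent(customerId);
        return listPrice - listPrice * percent / 100;
    }
}
```

In a test, the "mock" is just a lambda, e.g. `new PriceCalculator(id -> 20)`; if setting up such a test for one of your modules requires ten mocks with mocks of their own, that difficulty is itself the design feedback.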

During this period there will also be some temporary transition logic, such as various adapters, proxies, or wrappers, which may live anywhere from a few months to a few years. These can look superfluous, but they are needed to keep the refactoring's scope under control. For example:

Java code

    class Foo {
        String foo() {
            ...
        }
    }

If you want to change the function declaration to

Java code

    class Foo {
        boolean foo() {
            ...
        }
    }

It is better to do this by adding a transition module:

Java code

    class FooAdaptor {
        private Foo foo;
        boolean foo() {
            return foo.foo().isEmpty();
        }
    }

The advantage of this is that the function can be modified without touching all of its callers. One hallmark of rotten code is heavy coupling between modules: a single function often has dozens of call sites. Once a full-scale overhaul begins, a seemingly simple refactoring tends to balloon into a weeks-long project, and such massive refactorings are rarely reliable.

Each module-level refactoring needs careful design: work out in advance what must be modified and where compatibility logic is needed. The actual hands-on change, though, should take no more than a day; if it takes longer, the refactoring is changing too much, and you need to rein in the pace.

1.2.4. Project-level refactoring: never in parallel with anything else

This kind of refactoring touches a relatively large scope, for example:

    • Modifying the project structure
    • Modifying multiple modules

I recommend not using the IDE for this type of operation, or, if you do use the IDE, sticking to the simplest "move" operations. At this scale unit tests no longer help at all; you need the coverage of integration tests. There is no need to be nervous, though: if you restrict yourself to moves, a basic smoke test can mostly guarantee the refactoring's correctness.

The purpose of this refactoring is to split the code by layer or by type, cutting off circular dependencies and other structurally unreasonable spots. If you don't know how to split, you can follow this line of thought:

    1. Split first by deployment scenario: for example, if part of the code is shared and part is used by only one service, consider splitting it into two parts. In other words, a change to service A should no longer be able to affect service B.
    2. Next, split by business type: two unrelated features can be split apart. In other words, a change to feature A should no longer be able to affect feature B.
    3. Beyond that, restrain your own urge for tidiness; don't chop the code into a huge pile of tofu cubes, which only burdens future maintenance with unnecessary cost.
    4. The plan can be reviewed several times in advance, consulting the advice of front-line engineers, to avoid new problems surfacing only once the work has started.

This type of refactoring must never run in parallel with normal feature development: code conflicts are almost unavoidable and will drive everyone to despair. My usual approach is to rehearse first: move the modules around according to the rough plan, let the compiler surface the dependencies, resolve the easy ones during normal working days, then assemble the team's best people, notify everyone to suspend development, and spend at most two or three days concentrating on fixing all the remaining problems. New requirements are then developed on top of the new code.

If the historical burden is too heavy, this kind of refactoring can also be split into several passes: first split roughly into a few large pieces, then split those further. In any case, the scope of change must be kept under control; a severe merge conflict can leave everyone on the team mopping up for weeks.

1.3. The refactoring cycle

A typical refactoring cycle runs something like this:

    1. While developing normal requirements, do module-internal refactoring, and use the time to understand the original project code.
    2. In the gaps between requirements, do module-level refactoring: split large modules into several small ones, add scaffolding classes, supplement unit tests, and so on.
    3. (If necessary, e.g., when an oversized project causes frequent merge conflicts) perform a project-level split. All development work must be halted during this period, and the refactoring should make no changes other than moving modules and whatever those moves strictly require.
    4. Repeat steps 1 and 2.

1.3.1. Some refactoring tips
    1. Refactor only the parts that are modified often. If a piece of code hasn't been touched for a year or two, the benefit of changing it is small: refactoring only improves maintainability, and refactoring code that no longer needs maintenance brings no return.
    2. Restrain the urge to change just a bit more; a failed refactoring can be devastating to a code-quality effort.
    3. Refactoring requires constant practice; refactoring well may be harder than writing code.
    4. Refactoring can take a long time, possibly years (I have kept refactoring one project for over two years), depending mostly on the team's tolerance for risk.
    5. Deleting useless code is the most effective way to improve maintainability. Remember this.
    6. Unit testing is the foundation of refactoring; if the concept is still unclear to you, look into unit testing with the Spock framework.
2. Improving performance and robustness

2.1. The 80%: performance

Performance is a topic more and more people bring up; these days a résumé that doesn't mention familiarity with high concurrency, performance optimization, and the like hardly seems fit to show around.

Here is a true story. A few years ago I worked on a company's ERP project, which included a report-generation feature. One person at the client company used our system this way: every day before leaving work, he opened that report, exported it to Excel, and sent it out by email.

The problem was that the report took two or three minutes to generate each time.

I was young and reckless then. A report that took two minutes to generate piqued my interest, so I dug out that piece of code, author unknown, and found a three-level nested loop that queried the database on every pass, assembled a heap of data, and stuffed it wholesale into a TableView.

Faced with code like this, what else could I do?

    • I immediately killed the three-level loop and produced the data directly through a stored procedure.
    • I slimmed down the SQL's calculation logic and removed some unnecessary outer-join operations.
    • I found a pile of useless controls generated by Ctrl+V (this was Delphi at the time), pasted densely across the interface and hidden behind the big table in front; naturally I deleted them all.
    • The interface also did some odd jobs on opening (e.g., updating a hit counter in the database); I moved these into an async task.
    • I saw no need to load all the data every time the interface opened (the TableView had thousands of rows and hundreds of columns!), so I hacked the default TableView: on open it first computes how much content is actually visible, passes that to the stored procedure, loads only that data at initialization, and loads the rest asynchronously on another thread.

After all this, the interface displayed in under a second. But that is not the point of the story.

Then I went to the client company to show the operator the new module. One click, a swish, and the data appeared. The man stared at me in horror and asked whether the data could really be right.

Later I added a feature: every time the module opened it showed a progress bar titled "Verifying data...", which took about a minute to complete. I told him the verification computation was heavy and would be rather slow. In reality the program did nothing at all during those 60 seconds except advance the progress bar (I even hid an Easter egg: entering a Konami-style key sequence while it ran sped it up tenfold). The customer was very pleased and said the data felt more accurate. He never did find the Easter egg.

I have written all this to drive home one fact: most programs are not performance-sensitive. Of the few that are, more than half can have their performance problems solved by adjusting parameters; the programs that truly need code changes to optimize are a small handful of a small handful.

What counts as cost-effective? Looking back at the example: of all the things I did, which ones actually paid off?

    • Turning the three-level-loop SQL into a stored procedure took me about a day and cut the load time from 3 minutes to 2 seconds: the module now loaded with a swish.
    • The whole pile of other work took me more than a week, especially hacking that TableView, which cost me a weekend. All those optimizations combined saved about 1 second, a figure I got from the logs: even I felt no obvious difference when opening the module.

Many candidates I interview now love to reach for hazy buzzwords when talking about optimization: call stacks, tail recursion, inline functions, GC tuning... But when I ask, "If you turn an ordinary function into an inline function, how much faster does the program actually run?", very few can answer; some mutter that it should be a lot, because the function is called many times. When I ask how many times it is called and how long each call takes, they cannot answer at all.

So on performance optimization I hold two views:

    1. Optimize the dominant part: turning one network IO into an in-memory computation is worth more than any amount of compiler-level micro-optimization. For a feel for the magnitudes, refer to "latency numbers every programmer should know", or write a for loop yourself, an endless i++, and see how many increments fit into one second, to get a sense of CPU and memory speed.
    2. After optimizing, you need quantitative data that states clearly which metrics improved and by how much. If someone writes a pile of incomprehensible code for reasons like "improving performance", make sure he provides performance data: it is quite likely a lump of rotten code with little or no real gain.
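The endless-i++ experiment from point 1 can be written in a few lines (a rough sketch of my own; the count varies enormously by machine, JIT warm-up, and the fact that checking the clock each iteration itself costs time, so treat the result only as an order-of-magnitude feel):

```java
// Crude benchmark: how many times can a long be incremented in one
// second? The per-iteration call to System.nanoTime() adds overhead,
// so this understates raw CPU speed; the point is only to build
// intuition about magnitudes, not to produce a precise number.
class IncrementBenchmark {
    static long incrementsPerSecond() {
        long deadline = System.nanoTime() + 1_000_000_000L; // one second
        long i = 0;
        while (System.nanoTime() < deadline) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println("i++ per second: " + incrementsPerSecond());
    }
}
```

Once you have felt how many millions of increments fit in the time of a single network round-trip, "optimize the dominant part" stops being an abstract slogan.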

As for concrete optimization measures, they fall into several categories:

    1. Move computation closer to the storage
    2. Optimize the algorithm's time complexity
    3. Eliminate useless operations
    4. Compute in parallel
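As one tiny illustration of category 2 (my own example, not from the original project): replacing a linear scan inside a loop with a hash lookup takes a membership count from O(n·m) to O(n+m), with no change in behavior.

```java
import java.util.HashSet;
import java.util.Set;

// Same result, different time complexity: the slow version rescans the
// whole "known" array for every item; the fast version builds a hash
// set once and does O(1) lookups.
class MembershipCount {
    // Slow version: O(n * m).
    static int countKnownSlow(String[] items, String[] known) {
        int count = 0;
        for (String item : items) {
            for (String k : known) {
                if (k.equals(item)) { count++; break; } // linear scan
            }
        }
        return count;
    }

    // Fast version: O(n + m).
    static int countKnownFast(String[] items, String[] known) {
        Set<String> knownSet = new HashSet<>();
        for (String k : known) knownSet.add(k); // built once
        int count = 0;
        for (String item : items) {
            if (knownSet.contains(item)) count++; // O(1) lookup
        }
        return count;
    }
}
```

Note that per the two views above, such a change is only worth making where profiling shows the scan actually dominates; on ten-element arrays the difference is noise.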

There is much more one could say about performance optimization, but it strays from the topic of this article, so I won't expand on it further here.

2.2. The 20%: robustness

A while ago I heard a technology talk in which the speakers considered the effect of sunspots on CPU computation, or of a farmer's pig knocking down a base station, and other special scenarios. If you want to optimize a program's robustness, sometimes you really do have to consider the impact of such extreme situations.

Most people need not consider exotic problems like sunspots, but there are common special scenarios we do need to handle, and most programmers' code is more or less ill-considered in some of them, for example:

    • User input
    • Concurrency
    • Network IO

Conventional approaches do find some bugs in the code, but in a complex production environment there are always problems no one ever imagined. I have thought about this for a long time and, unfortunately, have found no cure-all for robustness, so I can only cautiously offer a little advice:

    • Test more. Tests are designed to ensure code quality, but testing is not the same as quality: tests covering 80% of scenarios still leave the possibility of problems in the remaining 20%. Testing is itself a huge topic that I won't open up here.
    • Be careful about inventing wheels. For UI libraries, concurrency libraries, IO clients, and the like, prefer a mature solution whenever one meets the requirements. "Mature" means having survived far more real-world use, and in most cases that kind of testing works better than anything you can do yourself.
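As a minimal sketch of the "user input" bullet (the helper and its defaults are hypothetical, my own illustration): never trust the raw string; convert bad input into a clear default or error at the boundary, instead of letting an exception escape from deep inside the program.

```java
// Hypothetical boundary helper: defensively parse a user-supplied port
// number. Null, junk, and out-of-range values all yield a safe default
// rather than a NumberFormatException thrown somewhere deep in the stack.
class InputGuard {
    static final int DEFAULT_PORT = 8080; // assumed fallback for this sketch

    static int parsePort(String raw) {
        if (raw == null) {
            return DEFAULT_PORT;
        }
        try {
            int port = Integer.parseInt(raw.trim());
            // Reject values outside the valid TCP/UDP port range.
            return (port >= 1 && port <= 65535) ? port : DEFAULT_PORT;
        } catch (NumberFormatException e) {
            return DEFAULT_PORT;
        }
    }
}
```

Whether to fall back to a default or to report an explicit error is a per-case design decision; the invariant is that malformed input is handled where it enters, not three layers down.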
3. Improving the living environment

Having read all of the above, imagine a scene like this:

You have done a great many things, and code quality seems to have made a qualitative leap. Just when you think you can finally bid farewell to the days of firefighting, a careless glance reveals that some class has grown to several thousand lines.

Angry, you dig through the commit log to find the culprit, only to discover that someone has been committing ten or twenty lines to that file every day. Each individual change looks fine, but day after day, a year passes, and the project you once refactored with enormous effort has turned back into a lump of rotten code...

Any programmer who cares about code may run into this problem: technology keeps updating, requirements keep changing, people keep flowing through the company, and code quality always, in some unguarded moment, quietly gets worse...

So improving code quality often turns into improving the environment the code lives in.

3.1.1. Unified Environment

The team needs a unified coding standard, a unified language version, a unified editor configuration, and a unified file encoding; if conditions allow, a unified operating system is best of all. This avoids a great deal of meaningless work.

For example, when my company recently replaced everyone's development machines with identical MacBooks, many problems stopped being problems overnight: character sets, line endings, and IDE quirks were settled by a single shared configuration file; strange code conflicts and incompatibilities disappeared; and no one suddenly committed files in some bizarre encoding or format any more.

3.1.2. Code repository

A code repository is basically standard at every company by now, and beyond storing code, today's repositories can also take on some of the work of team communication, code review, and even workflow. There are many such systems; excellent open-source tools like GitLab (or GitHub) and Phabricator make code management much easier. I have no intention of debating SVN vs. Git vs. Hg or other code management tools here; even though the currently popular Git has its problems with complexity and centralized management, I look forward to a tool that can someday replace it.

The point of a code repository is to let more people obtain and modify code, thereby extending the code's life cycle; only when code lives long enough is optimizing its quality meaningful at all.

3.1.3. Continuous Feedback

Most rotten code is like cancer: by the time its impact can be felt, it is basically terminal and very hard to cure.

So it is important to discover the rot in advance. This kind of work can rely on static analysis tools such as Checkstyle and FindBugs to spot a declining trend in code quality in time, for example:

    1. Large amounts of new code being generated every day
    2. Test coverage going down
    3. The number of static-analysis findings going up

With a code repository in place, you can hook these tools into the repository's trigger mechanism, running coverage, static checks, and so on for each commit. Jenkins + SonarQube or similar tools can complete the basic pipeline: run the static checks and the various tests as code is committed, generate reports, and give people something to consult.

In practice there are many tools for continuous feedback, but only one or two of them really get used: most people will not visit a "report" page after each commit, or log into yet another system to check whether test coverage has dropped. So a one-stop system usually performs better. Rather than chasing more features, it may be better to integrate a limited set of functions well, such as code management, regression testing, code checking, and review together, like this:

Of course, there is much more to be done around continuous integration, but there is no space to go into it here.

3.1.4. Quality Culture

Team culture has a subtle influence on technology. There is no universally shared culture around code quality; every company has its own set of views, and each seems to make sense.

My own attitude toward code quality is this:

    1. Bad code can't be avoided
    2. Bad code can't be accepted
    3. Bad code can be improved
    4. Good code makes work happier

Getting people to agree about code quality is actually somewhat difficult. Most technical staff hold a neutral, neither-for-nor-against attitude toward it, and code quality is like entropy: left alone, it always evolves toward greater chaos. Meanwhile the cost of writing bad code is absurdly low: an intern can wreck in a week a design that took you half a year.

So when you set out to improve code quality, be sure to try to win over others on the team. "Leading the team to improve code quality" is very hard at the start, but once you have a few supporters and a template to point to, the rest of the work becomes much simpler.

The book "Preaching the Way: Leading the Team to Embrace Technological Innovation" captures most of my thinking on code quality. Shouting slogans rarely gets other people to write high-quality code; letting others on the team experience the benefits of high-quality code for themselves is far more convincing.

4. Finally, a few words

Optimizing code quality is interesting and challenging work. The challenge comes not only from how bad the code is: the problem is never just the code itself, but also the tools, habits, practices, development process, and even every aspect of team culture.

Writing this series has taken more than half a year, much of it spent writing a bit and deleting a bit: my own thoughts and practices around code quality have been changing the whole time. I wanted to write things that can actually be put into practice, rather than shout slogans and be done after dropping a few terms like "agile development" and "test-driven".

But in the course of writing I slowly found that many of these improvement methods really cannot be explained in one or two articles. The problems are interrelated, and expanding on all of them would take more than a book's worth of material, so this article has had to cut a great deal.

I have worked on projects with very good code quality and on projects with very bad quality; I have improved many projects and given up on some; I have gone from changing code single-handedly to leading a team in optimizing its workflow, and been through a great deal along the way. In any case, on the subject of bad code, I will close by citing a line from the book mentioned above:

"Better" is not a destination, but a direction ... There may be a lot of pretty good places between your current position and your future goals. You just have to focus on leaving your present position instead of worrying about where you're going. ”
