Reading notes for cleaning up code

Source: Internet
Author: User
Tags: mercurial

Deep parsing: Cleaning up bad code

2015-10-05 PHP Developer


English: Niklas Frykholm

Bole Online-Tangxiaojuan

Website: http://blog.jobbole.com/28672/

Guess what! You have just "inherited" a pile of messy old code. Congratulations! It's all yours now. Messy code can come from anywhere: middleware, the Internet, or even your own company.

You know that guy in the corner whom nobody had time to check on? Guess what he was doing all that time: working hard, but churning out bad code.

Or remember that module someone wrote a few years ago, just before leaving the company? Since then, 20 different people have added patches and bug fixes to it without really understanding what the code does. Yes, that's the one.

Or take that open source software you downloaded from the Internet. You knew it was scary, but it solved a very specific and very difficult problem that would have taken you years to solve yourself.

Bad code is not necessarily a problem: as long as it does not misbehave, nobody will poke at it. Unfortunately, that state rarely lasts. A bug will be found, a new feature will be needed, or a new system will be released. Now you have to deal with this horrible code and try to clean it up. This article offers some suggestions for that unfortunate situation.

0. Is it worth cleaning up?

The first thing you need to ask yourself is whether the code is worth cleaning up. In my opinion, when it comes to cleaning code, the answer should be all or nothing. Either you take full responsibility for the code and keep working on it until it is something you are willing to maintain and proud to have in your code base.

Or you decide that, even though the code looks scary, it is not worth taking time out of a tight schedule to fix it. In that case you make only the smallest change that solves the urgent problem.

In other words, you regard the code either as your own or as someone else's.

Both choices have pros and cons. Good programmers feel uncomfortable when they see bad code. They bring out their torches and pitchforks and shout: "Too messy, too messy." That is a good instinct.

But cleaning up code is tedious work. It is easy to underestimate the time it takes; sometimes it takes as long as writing the code from scratch. And it does not pay off in the short term.

Two weeks spent cleaning up code add no new functionality, and may well introduce some new bugs.

On the other hand, never cleaning up your code can be disastrous in the long run. Chaos is the killer of code.

How to weigh?

So this is not an easy decision to make. Here are a few things to consider:

How many changes do you expect to make to this code? Do you just need to fix this one small bug, or is it code you will come back to many times to tune it and add new features? If it is just one bug, it is best to let sleeping dogs lie. However, if you will be working on this module for a long time, cleaning it up now will save a lot of trouble later.

Do you need or want to pull in upstream updates? Is it an open source project under active development? If so, and you want to keep pulling the upstream changes, you can't make big changes to the code, or you will face a merge nightmare every time you pull. So be a friendly team player: accept the code's quirks and send patches with your bug fixes to its maintainer.

How much work is it? How many lines of code can you realistically clean up in a day? Let's estimate more than 100 lines and fewer than 1,000, and assume 1,000. So if a module has 30,000 lines of code, you might need a month. Do you have that much time? Is it worth it?

Is it part of your core functionality? If the module does something peripheral, such as font rendering or image rendering, you probably don't care if it's a mess. You might replace it with something else in the future, who knows. But if the code relates to your core functionality, you need to be cautious.

How bad is the code? If it is only slightly bad, you may be able to live with it. If it is incomprehensible and frustrating, then something has to be done.
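The effort estimate in the list above is simple arithmetic. A minimal sketch, assuming a constant cleaning rate (the function name `estimate_cleanup_days` is made up for illustration and does not come from the article):

```cpp
#include <cassert>

// Hypothetical helper: days of work needed to clean `total_lines`
// at a steady rate of `lines_per_day`, rounded up to whole days.
int estimate_cleanup_days(int total_lines, int lines_per_day) {
    assert(total_lines >= 0 && lines_per_day > 0);
    return (total_lines + lines_per_day - 1) / lines_per_day;
}
```

At 1,000 lines a day, a 30,000-line module comes to 30 days of work, which is the "one month" figure. At 100 lines a day the same module would take 300 days, so the answer depends heavily on the rate you assume.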

1. Build Test Cases

Seriously cleaning up a piece of code means messing around with it a lot. You will break things.

If you have a good test case with decent coverage, you will immediately know what broke, and you can quickly figure out what foolish mistake you made. The time this saves over the whole clean-up process is ridiculous. Build a test case. It is the first thing you should do.

Unit tests are best, but not all code lends itself to unit testing. If unit tests are too cumbersome, use an integration test instead. For example, start a game level and run a character through a series of actions related to the code you are cleaning up.
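One way to get a first test case is to record what the code does today, quirks included, before touching it. The sketch below is only an illustration: `legacy_format_score` is a made-up stand-in for whatever function you are about to clean, and the expected strings were chosen for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the legacy function under cleanup.
std::string legacy_format_score(int score) {
    std::string text = "Score: ";
    if (score < 0) {
        text += "0";                      // existing quirk: negatives are clamped
    } else {
        text += std::to_string(score);
    }
    return text;
}

// Pins down the current behavior so that any cleanup step that
// changes it is noticed immediately.
void test_format_score() {
    assert(legacy_format_score(42) == "Score: 42");
    assert(legacy_format_score(0) == "Score: 0");
    assert(legacy_format_score(-5) == "Score: 0");
}
```

Run `test_format_score()` after every small change. The test does not judge whether clamping negative scores is right; it only makes sure the cleanup leaves behavior alone.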

Such tests take more time, so it may not be feasible to run them after every change, although that would be ideal. Since you put every change into version control, this is not so bad. Run the integration test every once in a while (for example, every five changes). When it finds a problem, you can do a binary search through the last few commits to find the one that caused it.
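The binary search over commits can be sketched as follows. This is an illustration of the idea, not any tool's implementation: commits are represented by plain indices, and `passes` is an assumed callback that checks out a commit and runs the test. Mercurial and Git both provide a `bisect` command that automates the same search.

```cpp
#include <cassert>
#include <functional>

// Returns the index of the first commit that fails the test.
// Assumes commit `good` passes, commit `bad` fails, and every commit
// after the first failing one also fails.
int first_bad_commit(int good, int bad,
                     const std::function<bool(int)>& passes) {
    assert(good < bad);
    while (bad - good > 1) {
        const int mid = good + (bad - good) / 2;
        if (passes(mid)) {
            good = mid;   // the breakage was introduced later
        } else {
            bad = mid;    // the breakage is at mid or earlier
        }
    }
    return bad;
}
```

With five commits between test runs, at most three extra test runs are needed to find the commit that broke things.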

If you find a problem that the test did not catch, make sure to add it to the test so that it is caught in the future.

2. Use a version control system

Do people still need to be told to use version control? I hope not.

For clean-up work it is critical. You will make a lot of small changes. If something breaks, you want to be able to look back through the version history and find out where it went wrong.

If you're like me, you will sometimes start a refactoring (such as removing a stupid class) and then realize that it was not a good idea, or that it was a good idea but everything would be easier if you did something else first. So you want to be able to quickly revert everything and start over.

Your company should already have a version control system that lets you make your changes on a separate branch and commit as often as you like without disturbing anyone else.

Even if it doesn't, you should still use version control. Download Mercurial (or Git), create a new repository, and put the code you checked out of your company's stupid system there. Commit your changes in that repository. When you're done, you can merge everything back into the stupid system.

Copying the code into a sensible version control system takes only a few minutes. It's worth it. If you don't know Mercurial, spend an hour learning it. You'll be glad you did. Or, if you prefer, spend 30 hours learning Git. (I'm kidding! It doesn't take that long. Let the nerd fight begin!)

3. Make only one small change at a time

There are two ways to improve bad code: revolution and reform. Revolution means burning everything to the ground and writing it again. Reform means refactoring the code one small change at a time without ever breaking it.

This article is about the reform method. I am not saying that revolution is never necessary. Sometimes the code is so bad that it calls for a revolutionary approach. But those who feel that reform is too slow and advocate revolution often fail to realize the full complexity of the problem, and in the end produce something no better than the existing system.

Joel Spolsky wrote a classic article about this, without falling into the trap of a strained metaphor.

The best way to reform code is to make one small change at a time, test it, and commit it. When a change is very small, it is easier to understand its consequences and to make sure it does not affect existing functionality. If something goes wrong, you only need to check a small amount of code.

If you start making a change and realize that it is bad, you will not lose much work by reverting to the last commit. If you notice after a while that something has gone subtly wrong, you can do a binary search through the version history to find the change that caused the problem.

A common mistake is to make several changes at once. For example, while removing an unnecessary level of class hierarchy, you notice that the API's methods are not organized the way you would like, and you start to rearrange them. Don't! Remove the hierarchy first, commit, and then change the API.

Smart programmers organize their work so that they don't need to be that smart.

Try to find a path that takes the code from what it is now to what you want it to be, one small change at a time. For example, the first step might be to rename the methods so that their names make more sense. Next, you might change some member variables into function parameters. Then you might rearrange an algorithm so that it is clearer, and so on.
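As an illustration of one such step, the sketch below turns a member variable into a function parameter and checks that the result is unchanged. The class and function names (`ShapeBefore`, `scaled_area`) are invented for this example.

```cpp
#include <cassert>

// Before: `scale_` is a member whose only job is to carry a value
// from set_scale() into area().
class ShapeBefore {
public:
    void set_scale(double scale) { scale_ = scale; }
    double area(double width, double height) const {
        return width * height * scale_ * scale_;
    }
private:
    double scale_ = 1.0;
};

// After one small step: the value is an explicit parameter.
double scaled_area(double width, double height, double scale) {
    return width * height * scale * scale;
}

// Test run before committing the step: both versions must agree.
void check_step_preserves_behavior() {
    ShapeBefore before;
    before.set_scale(2.0);
    assert(before.area(3.0, 4.0) == scaled_area(3.0, 4.0, 2.0));
}
```

Once the check passes and the step is committed, the old class can be removed in a separate step of its own.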

If you start making a change and find that it is bigger than you originally thought, don't be afraid to revert and accomplish the same thing in smaller, simpler steps.

4. Do not clean code and fix code at the same time

This is a corollary of (3), but it is important enough to deserve its own point.

This is a common problem. You start looking at a module because you want to add a new feature. Then you find that the code is badly organized, so you start to reorganize it while adding the new feature.

The problem is that cleaning and fixing have completely different goals. When you clean, you want the code to look better without changing its functionality. When you fix, you want to change its functionality. If you clean and fix at the same time, it is hard to make sure that the cleanup did not inadvertently change something.

Clean up the code first, and then add the new functionality on a clean base.

5. Remove features that you do not use

The time it takes to clean up is proportional to the amount of code, its complexity, and how bad it is.

If the code contains functionality that you are not using now and will not use in the foreseeable future, delete it. That reduces both the amount of code you have to go through and its complexity (by removing unnecessary concepts and dependencies).

You will clean up faster, and the final result will be simpler.

Don't keep code just because "who knows, you might need it someday". Code comes at a cost: it needs to be ported, debugged, read, and understood. The less code you have, the better. In the unlikely case that you do need the old code, you can always find it in the repository.

6. Delete most of the comments

Bad code rarely has good comments. Instead, the comments are usually:

Pointless:

// Set x to 3
x = 3;

Incomprehensible:

// Fix for CB (in-loop)
pos += Vector3(0, -0.007, 0);

Sowing fear and doubt:

// Really we shouldn't be doing this
t = get_latest_time();

Downright lying:

// p cannot be NULL here
p->set_speed(0.7);

Read through the code. If a comment doesn't make sense to you and doesn't help you understand the code, delete it. Otherwise you will waste mental energy trying to understand it every time you read the code. (Strongly agree.)

Similarly, delete code that has been commented out. If you still need it, it is still in your repository.

Even when comments are correct and useful, remember that you will be refactoring the code. By the time you are done, the comments may no longer be correct. And no unit test in the world can tell you that you have broken a comment.

Good code needs few comments because the code itself is clear and easy to understand. Variables with good names do not need comments to explain their purpose. Functions with clear inputs and outputs and no special cases need little explanation. A simple, well-written algorithm can be understood without comments. Assertions document expectations and preconditions.
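As a small illustration of the last two points, the "p cannot be NULL here" comment can become an assertion, and a magic number can get a name. The `Enemy` type, the function `slow_down`, and the constant `patrol_speed` are hypothetical names chosen for this sketch.

```cpp
#include <cassert>

// Hypothetical type, only here to make the example self-contained.
struct Enemy {
    double speed = 0.0;
    void set_speed(double new_speed) { speed = new_speed; }
};

void slow_down(Enemy* enemy) {
    // The precondition is checked instead of merely claimed in a comment.
    assert(enemy != nullptr);
    // A named constant says what 0.7 means without a comment.
    const double patrol_speed = 0.7;
    enemy->set_speed(patrol_speed);
}
```

Unlike a comment, the assertion fails loudly in a debug build if the claim ever stops being true. Note that `assert` is compiled out when `NDEBUG` is defined, so it documents and checks the precondition but does not replace error handling in release builds.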

In most cases, it is best to delete all the old comments, focus on making the code clean and readable, and then add comments back where they are needed, so that they reflect the new API and your own understanding of the code.

7. Avoid shared mutable state

Shared mutable state is the biggest obstacle to understanding code, because it allows action at a distance: one piece of code can change the behavior of a completely different piece of code. People often say that multithreading is difficult. In fact, the problem is that the threads share mutable state. If you can avoid that, multithreading is not so complicated.

If your goal is to write high-performance software, you will not be able to avoid all mutable state, but your code can still benefit from reducing it. Work hard to make sure you know exactly what state you are changing, where, and why.

Shared mutable state comes from several different places:

Global variables. The classic example. By now everyone knows the downside of global variables. But be aware (people sometimes forget this) that only shared mutable state causes problems. Global constants are not bad, and sprintf is not bad.

Objects: big bags of fun. An object lets many methods implicitly share a lot of mutable state (the members). If a lazy programmer needs to pass some information between methods, she can just create a new member that they read and write as needed. This is much like a global variable. How fun! The more members an object has, the more serious the problem becomes.

Huge functions. You may have heard of them. These mythical creatures dwell in the deepest recesses of the darkest code caves. Broken programmers talk about them in shady bars, their sanity destroyed by the code they met: "I kept scrolling down. I couldn't believe my eyes. It was 12,000 lines long." When functions are long enough, their local variables are almost as bad as global variables. It becomes impossible to tell what effect a change to a local variable will have 2,000 lines further down.

Reference and pointer parameters. A reference or pointer parameter that is passed into a function without being declared const can act as shared mutable state between the caller, the callee, and anyone else who is passed the same pointer.

Here are some suggestions for avoiding shared mutable state:

Split large functions into smaller ones.

Split large objects into smaller ones by grouping related members together.

Make members private.

Declare methods const and return the result instead of changing state.

Declare methods static and take values from parameters instead of reading them from shared state.

Avoid objects entirely and implement the functionality as pure functions without side effects.

Declare local variables const.

Declare pointer and reference parameters const.
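Several of these suggestions can be seen in one small sketch. The class `StatsBefore` and the free function `mean` are invented for this illustration; the "before" version passes information between methods through a member, and the "after" version is a pure function taking a const reference.

```cpp
#include <cassert>
#include <vector>

// Before: compute() and mean() communicate through the member `sum_`.
class StatsBefore {
public:
    void add(double value) { values_.push_back(value); }
    void compute() {
        sum_ = 0.0;
        for (double value : values_) sum_ += value;
    }
    // Only correct if compute() was called after the last add().
    double mean() { return sum_ / values_.size(); }
private:
    std::vector<double> values_;
    double sum_ = 0.0;   // hidden state shared between methods
};

// After: no hidden state. The input is a const reference and the
// result is returned instead of being stored.
double mean(const std::vector<double>& values) {
    assert(!values.empty());
    double sum = 0.0;
    for (const double value : values) sum += value;
    return sum / static_cast<double>(values.size());
}
```

In the "before" version, calling `mean()` without first calling `compute()` silently returns a stale result. The pure function has no call order to get wrong.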

8. Avoid unnecessary complexity

Unnecessary complexity is often the result of over-engineering, where the supporting structures (serialization, reference counting, virtual interfaces, abstract factories, visitors, and so on) dwarf the code that performs the actual functionality.

Sometimes this happens because a project starts out with bigger ambitions than what actually gets implemented. More often, I think, it is because a programmer has read books on design patterns and the waterfall model and believes that over-engineering produces a more "solid" and "high quality" product.

Often, the resulting cumbersome, rigid, overly complex model cannot adapt to feature requests that the designer did not anticipate. Those features are then implemented as hacks, bolt-ons, and back doors on top of the ivory tower, resulting in an insane hybrid structure.

The cure for over-engineering is YAGNI: you aren't gonna need it! Build only what you know you need. Add more complex things when you need them, not before.

Some practical ways to avoid unnecessary complexity:

Remove anything you don't use (as suggested above).

Simplify the necessary concepts and avoid unnecessary concepts.

Remove unnecessary abstractions and replace them with concrete implementations.

Remove unnecessary virtualization and simplify the object hierarchy.

If only one setting is ever used, remove the option of running the module in other configurations.
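A sketch of what removing an unnecessary abstraction can look like, assuming an interface that has exactly one implementation. All names here (`ILogger`, `PrefixLogger`, `make_logger`, `format_log`) are made up for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Before: an interface and a factory, although only one implementation exists.
class ILogger {
public:
    virtual ~ILogger() = default;
    virtual std::string format(const std::string& message) const = 0;
};

class PrefixLogger : public ILogger {
public:
    std::string format(const std::string& message) const override {
        return "[log] " + message;
    }
};

std::unique_ptr<ILogger> make_logger() {
    return std::unique_ptr<ILogger>(new PrefixLogger());
}

// After: the concrete implementation, called directly.
std::string format_log(const std::string& message) {
    return "[log] " + message;
}

// Test run before deleting the old code: the output must be identical.
void check_same_output() {
    assert(make_logger()->format("hello") == format_log("hello"));
}
```

If a second implementation is ever really needed, the interface can be reintroduced at that point, which is the YAGNI argument in practice.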

9. That is all

Now go clean up your "room"!

Reading inspiration

1. When changing code, make only a small change each time, not a large one. That makes errors easy to spot.

How many places does a change affect? With too many changes at once, it is hard to understand the scope of their impact and hard to find problems.

The lesson is: change a little, then test and commit.

I have caught myself making this mistake: wanting to modify too much at once. I feel that a function is simple, skip the test, and move on to modify other places.

After changing many places, if there is a problem, you have to check many places to track it down.

The author tells us that when too many places are modified at once, the resulting problems are harder to find.

The right way is to modify a little and release a little.

If a problem is found after going live, you can easily roll back.

Think: suppose that a few days later we find that an earlier modification caused a bug, and we want to roll back to the previous version. The trouble is that several versions have been released in the meantime. If we roll back, do we also undo the code modified in those releases? This question deserves consideration!

2. Variables with good names do not need comments to explain their use. They are often self-explanatory.

3. Why remove code instead of leaving it there?

Many programmers feel that the code may be useful later, so they just comment it out temporarily. However, the leftover code makes the system less concise and less clean. It increases the amount of code that others have to read, and it misleads and confuses them.

If you really need it, you can find it again in the repository.

In short, the direction is to make the code cleaner and cleaner and to improve its maintainability.

Summary: remove dead (commented-out) code. Delete comments that are meaningless or obsolete. The purpose is to make the code more concise.

4. The trade-off between cleaning up code and fixing errors.

It is difficult to clean and fix code at the same time. Clean first, then fix.

In addition, clean-up has to be prioritized. If it is not a core module, bad code does not matter much; it may be replaced one day and no longer used. If it affects core functionality, be cautious.

If you will be working on a module for a long time, it is worthwhile to spend some time cleaning up the code to avoid a lot of trouble later on.

Thinking: I have had this experience. Clean up the code first and make it clearer, so that errors become easier to spot and correct. If you correct the error on top of the original chaotic code first, the correction itself may cause new problems, because the structure is not clear.

5. Existing functional code needs to be refactored. Set up test cases to check the impact of the changes. How do I build test cases?
