Excerpt: Some Ideas About Large-Scale Software Refactoring
The refactoring discussed here is a dedicated, large-scale effort, as opposed to the ongoing refactoring done while implementing a feature.
How Far Ahead Can the Architecture Design See?
Simply put, what we need to do is completely separate the UI code of a piece of software from its core functions, and then turn the core into a separate product. Of course, everyone understands the principle of separating the presentation layer from the business layer. The original architecture included these concepts too, but it never enforced them strictly and the core was never run on its own. After nearly ten years of development, the coupling between the UI and the core layer in the code has become severe: there are static dependencies (source-level, compile-time) as well as dynamic dependencies (runtime). At this point, extracting the core functions is difficult and time-consuming (the code base runs to millions of lines). Looking at open-source CAD software on the Internet, many projects had a clear core/UI division from the start and can run in either core-only mode or UI mode. FreeCAD is one example. If the architect had thought of this step and drawn a clear division from the very beginning, two things are certain: 1. there would be no need to spend so much manpower and so many resources later; 2. the quality of the design would be much better.
Of course, this is not always a question of foresight. Once commercial interests come into play, many good designs have to be abandoned. For example, if your product targets only Windows users and the project team consists of Windows programmers, then to ship faster and better you should not spend effort on cross-platform support. But ten years later, the bosses decide to enter the Mac market... So on this point, do what is humanly possible and leave the rest to fate.
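The static, compile-time dependencies mentioned in this section are the kind a script can at least detect. The following is only a minimal sketch, not the project's actual tooling: it assumes a C++ code base, and the `ui/` include prefix, the file suffixes, and the function name `find_ui_includes` are all hypothetical. Runtime dependencies would not be found this way.

```python
import re
from pathlib import Path

# Matches lines such as: #include "ui/dialog.h"  or  #include <vector>
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^">]+)[">]', re.MULTILINE)


def find_ui_includes(core_dir, ui_prefix="ui/"):
    """List (file, header) pairs where a core source file includes a UI header.

    `ui_prefix` is an assumed naming convention for UI headers.
    """
    violations = []
    for path in sorted(Path(core_dir).rglob("*")):
        if not path.is_file() or path.suffix not in (".h", ".hpp", ".cpp"):
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for header in INCLUDE_RE.findall(text):
            # Normalize Windows-style separators before comparing
            if header.replace("\\", "/").startswith(ui_prefix):
                violations.append((path, header))
    return violations
```

Running such a check on the server after every check-in would show whether the number of core-to-UI includes is actually going down as the refactoring proceeds.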
Working Mode
Be sure to open a separate branch. That way you can close the door and "do whatever you want" without affecting other teams. "Do whatever you want" here means:
- Because the code base is very large, a change may cause compile errors somewhere else, or you may make sweeping changes with scripts, and a full build on a local machine can take half a day (yes, even with IncrediBuild). So check in and let the build server do the build while you keep working. A few build errors? It doesn't matter!
- You do not need to run the automated tests before each check-in.
Of course, this kind of freedom is not advisable in many situations, but here it greatly improves efficiency.
In addition, because the volume of refactoring changes is very large, it is necessary to sync regularly with the main or trunk branch so that each merge stays within a manageable size.
How to Ensure Quality
As mentioned above, we can check in without running the automated tests. So how do we ensure quality?
First, you must have automated tests. Code-based unit tests or script-based functional tests will both do, as long as they are automated and the coverage is sufficient. Even ongoing refactoring needs them; for large-scale refactoring, having no automated tests is simply a disaster.
Because we work on our own branch, we only need to ensure that there are no regressions when we merge back to main/trunk. As for the state in between, our requirements are not high. The general practice is:
- Run the smoke tests and the relevant acceptance tests every week to catch major problems early.
- Every time we sync with main/trunk, we spend about four days on "automation triage": run all the automated test cases and, once the report is available, analyze the failures one by one or in groups, since many failures share the same cause.
This approach greatly improves efficiency: keep in mind that it takes dozens of servers three days to run all the cases.
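The observation that many failures are the same suggests grouping them before anyone reads the report. The following is a minimal sketch of that idea, not the team's actual triage tooling: the report format (pairs of test name and failure message), the normalization rules, and the function names are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def failure_signature(message):
    """Normalize a failure message so that similar failures compare equal.

    Assumption: failures that differ only in numbers or addresses
    are likely to share a cause.
    """
    sig = re.sub(r"0x[0-9a-fA-F]+", "<hex>", message)  # memory addresses
    sig = re.sub(r"\d+", "<n>", sig)                   # line numbers, ids
    return sig.strip().lower()


def group_failures(failures):
    """Group (test_name, message) pairs by normalized message.

    Returns (signature, [test names]) tuples, largest group first.
    """
    groups = defaultdict(list)
    for test_name, message in failures:
        groups[failure_signature(message)].append(test_name)
    return sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))
```

Triage can then start with the largest group, where a single fix is most likely to clear many failed cases at once.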
How to Manage Code
Refactoring involves moving and splitting many files. Note the following two points:
- File history must not be broken
The step-by-step record of how a file has changed is very important: it lets you easily find out who changed the file and how. When moving or splitting files, it is easy to lose this history through carelessness. Make sure to use the SCM tool correctly so that the information is kept. For example, in Perforce you should use integrate rather than a plain add.
- Integration from the branch back to main
Suppose you moved a file on the branch and modified it, and someone else also changed the same file on main. During integration you can easily lose the changes made on main: because no correspondence between the old and new locations has been established, no merge takes place. I think most SCM tools provide a solution for this. For example, in Perforce the correspondence can be described in the branch spec.
Refactoring Method
During this work I read Refactoring: Improving the Design of Existing Code. The book describes many good methods and steps for improving a design, but for us it was basically of no use: at the design level we had already investigated which situations needed which changes and had worked out solutions, and the book's steps did not fit well. Visual Assist provides a refactoring module that works on code bases of ordinary size, but it cannot cope with code spread across many solutions. Besides, it only covers source-level refactoring, and we also needed to restructure the projects and DLLs.
We adopted the following method: for each kind of situation, write Perl scripts to automate part of the work. For example, when I rename a method, a script searches all the code, automatically checks out the files that need changing, and replaces the old name with the new one. I remember writing a lot of Perl scripts to automate Perforce calls, Visual Studio calls, code modifications, project file edits, and so on.
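The rename example can be sketched in a few lines. This is only an illustration under stated assumptions, not the original scripts: it is written in Python rather than Perl, the function name `rename_identifier` and the file suffixes are hypothetical, and the source-control checkout is left as a comment because the exact Perforce invocation used is not given in the text. A whole-word text replacement like this does not understand C++ scoping, so the result still needs a build and a review.

```python
import re
from pathlib import Path


def rename_identifier(root, old, new, suffixes=(".h", ".cpp")):
    """Replace whole-word occurrences of `old` with `new` under `root`.

    Returns the list of files that were changed.
    """
    pattern = re.compile(r"\b" + re.escape(old) + r"\b")
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # Work on raw bytes so that line endings and odd encodings survive
        text = path.read_bytes().decode("utf-8", errors="surrogateescape")
        new_text, count = pattern.subn(lambda match: new, text)
        if count:
            # A real script would check the file out of source control here
            path.write_bytes(new_text.encode("utf-8", errors="surrogateescape"))
            changed.append(path)
    return changed
```

Returning the list of changed files makes it easy to review the change set, or to feed it to a later step such as a build.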
Some Takeaways
I learned a lot from refactoring software this large:
- I am now more confident when facing large software systems.
- Develop the habit of automation. Large amounts of manual work are boring, time-consuming, and error-prone, and bring no sense of accomplishment. But change the goal to writing a program that automates the work, and none of those problems remain :)