Refactoring patterns: Part 3
Considerations for applying refactoring
Shi Yizhen (shiyiying@hotmail.com)
Technical Director of Zhejiang University Lingfeng Technology Development Company
December 2001
Continuing from Part 2 of this article, this part discusses further issues to consider when applying refactoring.
Every technology has its own problems. But when we first adopt a new technology, we may not yet be able to see the problems it brings. As Martin Fowler said:
When you learn a new technique that greatly improves your productivity, it is hard to see when it does not apply.
He compares the situation of refactoring with that of object technology:
The situation is just like that of objects ten years ago. It is not that I think there are no limits; it is just that I do not know what the limits are, even though I know the benefits.
Even so, Martin Fowler and others have observed some problems that refactoring can cause. Let's take a look:
Database
The code of many applications may be very tightly bound to the database structure. Changing the code then means changing the database structure and migrating the existing data as well.
O/R mapping can be used to address this problem. Professional O/R mapping tools can help migrate relational databases. Even so, migration still carries an extra cost.
If you use an OO database instead of a relational database, the impact may be smaller.
We therefore recommend using O/R mapping or an OO database for every database application. Enterprise application platforms such as J2EE provide frameworks of this kind.
If your code has no such isolation layer, you must perform the migration by hand or write special-purpose code to do it.
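The isolation layer described above can be sketched roughly as follows. This is a minimal illustration, not a real O/R mapping framework: the names `Customer`, `CustomerStore`, and `InMemoryCustomerStore` are hypothetical, and an in-memory map stands in for the database. The point is only that application code depends on the interface, so a change to the stored structure is absorbed by the implementing class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical domain object; application code works only with this type.
class Customer {
    private final int id;
    private final String name;

    Customer(int id, String name) {
        this.id = id;
        this.name = name;
    }

    int getId() { return id; }
    String getName() { return name; }
}

// The isolation layer: application code depends on this interface,
// never on table or column names.
interface CustomerStore {
    void save(Customer customer);
    Optional<Customer> findById(int id);
}

// Stand-in for a database-backed implementation (an assumption for this
// sketch). A real one would hold the SQL or O/R mapping, and a schema
// change would be confined to this class.
class InMemoryCustomerStore implements CustomerStore {
    // The "column" name is a private detail of the storage layer.
    private static final String NAME_COLUMN = "cust_name";
    // Simulates rows keyed by primary key.
    private final Map<Integer, Map<String, Object>> rows = new HashMap<>();

    @Override
    public void save(Customer customer) {
        Map<String, Object> row = new HashMap<>();
        row.put(NAME_COLUMN, customer.getName());
        rows.put(customer.getId(), row);
    }

    @Override
    public Optional<Customer> findById(int id) {
        Map<String, Object> row = rows.get(id);
        if (row == null) {
            return Optional.empty();
        }
        return Optional.of(new Customer(id, (String) row.get(NAME_COLUMN)));
    }
}

class IsolationLayerDemo {
    public static void main(String[] args) {
        CustomerStore store = new InMemoryCustomerStore();
        store.save(new Customer(1, "Ada"));
        store.findById(1).ifPresent(c -> System.out.println(c.getName()));
    }
}
```

If the column were renamed, only `NAME_COLUMN` and the code around it would change; callers of `CustomerStore` would be unaffected.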
Interface changes and published interfaces
Many refactorings (such as Rename Method) actually change an interface. Object orientation promises that you are free to change the implementation as long as the interface stays the same. But if the interface itself changes, you have to be very careful.
To keep the observable behavior of the system unchanged, you must make sure that changes to these interfaces do not affect code you have no access to. If you have the source code of every class that uses the interface, you only need to change those places at the same time.
However, if you cannot get at all the code that uses the interface, you have to take extra measures. If your code is a library (such as the collections framework in the Sun JDK) or a framework, this is almost inevitable.
To keep the code that depends on your interface working, you must retain the old interface. You now have two sets of interfaces: the old one and the new one produced by refactoring. The old interface must delegate its calls to the new one. Do not copy the entire method body, because that produces a large amount of duplicated code.
This approach solves the problem, but it is cumbersome. Refactoring usually moves state and behavior between classes, so if a method moves from one class to another, the delegating method may need otherwise unnecessary intermediate state or parameters. This makes your code harder to understand and maintain, and to some extent it undermines the benefit of refactoring.
Therefore, this approach should be used only for a transitional period. Give users a fixed period in which to move their code gradually to the new interface; once that period expires, delete the old methods and stop supporting the old interface. This is the purpose of deprecated APIs in Java.
Although interfaces can be protected in this way, doing so is hard work. For some time you must maintain at least two sets of interfaces so that client code written against the old interface keeps working with your new code. Martin Fowler calls these published interfaces. You cannot avoid publishing some interfaces, since otherwise no one could use your code, but publishing interfaces prematurely or unnecessarily causes needless trouble. As Martin Fowler reminds us:
Don't publish interfaces prematurely.
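In Java, the transitional arrangement might look like the sketch below. The class `Account` and the method names `getBal` and `getBalance` are hypothetical, chosen only for illustration. The old method is kept, marked as deprecated, and delegates to the new method instead of duplicating its body.

```java
// Hypothetical library class whose method was renamed by refactoring.
class Account {
    private final long balanceInCents;

    Account(long balanceInCents) {
        this.balanceInCents = balanceInCents;
    }

    // New interface produced by the Rename Method refactoring.
    public long getBalance() {
        return balanceInCents;
    }

    /**
     * Old published interface, kept only for the transition period.
     *
     * @deprecated use {@link #getBalance()} instead.
     */
    @Deprecated
    public long getBal() {
        // Delegate to the new method; do not copy its body.
        return getBalance();
    }
}

class DeprecationDemo {
    public static void main(String[] args) {
        Account account = new Account(1250);
        // Existing client code still compiles (with a deprecation warning)
        // and sees the same value as code using the new method.
        System.out.println(account.getBal() == account.getBalance());
    }
}
```

Because the old method contains no logic of its own, removing it at the end of the transition period is a one-line deletion.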
Arm your design with refactoring ideas
If you do not understand the ideas behind OO, you cannot really use an OO language. Similarly, if you do not make refactoring part of your development process, you cannot use refactoring properly.
Refactoring cuts both ways. On the one hand, it tells you that you can start with a simple design, because even after the code has been written you can still refactor to improve the design. On the other hand, this does not mean you can scribble code carelessly and trust it to work out. My advice is:
Start simple, but not stupid.
Suppose you design a stupid, or even wrong, interface at the beginning. As the program evolves, this part may become the core of the system. Refactoring it may then take a great deal of effort, most of it spent on operations that change interfaces and classes. Changes to core interfaces can quickly ripple through every level of the system. If your overall structure is good, the ripple may die out at some layer (as in a ring or layered architecture). If you have no such abstraction and protection mechanisms, a modification to a core class leads directly to changes across the entire system, which is unacceptable.
Therefore, when designing a class, ask yourself a few questions. If this situation arises, how would I modify the design to adapt? How would I adapt to that change? If you can think of a possible refactoring, your design is workable. This does not mean you should implement that design now; it only ensures that your design will not paint you into a corner. If you find that your code could hardly be refactored to meet new requirements, consider other approaches carefully.
Whenever a programmer at my company asks me whether a design is reasonable, I ask a few questions in return: how would you adapt to this change, and to that possible change? I also point out that there is no need to implement these changes now. I rarely give a direct answer, but after thinking about the questions, programmers can almost always judge their own design well and find a good solution. So use the idea of refactoring to examine your design.
Programming Language
Although refactoring is a technique independent of any programming language, the language you use affects, to a greater or lesser degree, how efficiently you can refactor, and therefore how willing you are to do it.
Refactoring started in Smalltalk. After its great success there, more of the object-oriented community began to extend refactoring to other language environments. However, the features of each language sometimes help the application of refactoring and sometimes get in its way.
Language features and programming styles that support refactoring
Static type checking and access protection
Static type checking narrows down the part of the program you must search for references during a refactoring. For example, to rename a member function of a class, you must change the declaration of the function and every reference to it. In a large program, finding and changing those references is difficult.
Unlike dynamic languages such as Smalltalk, statically typed languages (C++, Java, Delphi, and so on) usually combine class inheritance with access protection (private, protected, public), which makes references to a function easier to find. If the renamed function is declared private, references to it can appear only in its own class or, in C++, in that class's friend classes. If it is declared protected, only the class itself, its subclasses, and its friend classes (or, in Java, classes in the same package) can reference it. If it is declared public, you only need to search the class, its subclasses, its friend classes, and the other classes that explicitly bring the class in through include or import.
I want to raise one more point that everyone should pay attention to. Applying sound design principles from the start of development and throughout the whole process is an important factor in the success of a software project. From the perspective of both encapsulation and refactoring, member variables and member functions should start at the most restrictive protection level. Apart from the obvious exceptions, it is best to declare member variables and functions private at first. As the software develops and other classes make "extra" demands on the class, relax the protection gradually. The principle is: if a member can be private, do not make it protected; if it can be protected, do not make it public.
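The principle can be illustrated with a short Java sketch. The class `Invoice` and its members are hypothetical names used only for this example. Everything starts as private, and only the one method that other classes actually need has been opened up.

```java
// Hypothetical class that starts at the most restrictive access level.
class Invoice {
    // Private: only code inside Invoice can reference these fields, so
    // renaming them requires searching this class alone.
    private final long netInCents;
    private final int taxPercent;

    Invoice(long netInCents, int taxPercent) {
        this.netInCents = netInCents;
        this.taxPercent = taxPercent;
    }

    // Private helper: free to be renamed or restructured without
    // affecting any other class.
    private long taxInCents() {
        return netInCents * taxPercent / 100;
    }

    // Made public only because other classes need the total.
    public long totalInCents() {
        return netInCents + taxInCents();
    }
}

class AccessLevelDemo {
    public static void main(String[] args) {
        Invoice invoice = new Invoice(10_000, 20);
        System.out.println(invoice.totalInCents());
    }
}
```

If a subclass later needs the tax calculation, `taxInCents` can be relaxed to protected at that point, and the set of places to search during a later rename grows only as far as the new access level allows.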
Language features and programming styles that complicate refactoring
Preprocessor directives
Some language environments, such as C++, provide preprocessor directives. Because preprocessing is not part of the C++ language proper, it makes refactoring tools harder to implement. Some studies point out that program structure is usually best analyzed after preprocessing, but by then the information in the preprocessor directives is gone. Once the result of a refactoring no longer has a direct connection to the source code, programmers are unlikely to understand it.
Code that depends on object size and implementation format
C++ inherits from C, which has made C++ very popular and easier for programmers to learn. But this is a double-edged sword. C++ supports many programming styles, and some of them violate the basic principles of good design.
Code that relies on object size and implementation format, through C++ pointers, cast operations, and sizeof(object), is hard to refactor. Pointers and casts introduce aliasing, which makes it very difficult to find all the code that references an object. What these features have in common is that they expose the internal representation of objects and so violate the basic principle of abstraction.
For example, a typical C++ implementation lays out the member variables of an object in a fixed order in the executable program: inherited member variables come first, followed by those defined in the class itself. A refactoring that we use often and regard as safe is Pull Up Field, which moves a member variable from a subclass to its parent class. Because the variable is now inherited from the parent class rather than defined in the subclass, its actual location in the executable program changes after the refactoring.
If every reference to the variable goes through the class interface, such a change causes no problem. However, if the variable is accessed through pointer arithmetic (for example, a programmer holds a pointer to an object, knows that the variable starts at the 9th byte of the object, and assigns to that byte through the pointer), the refactoring above changes the behavior of the program. Similarly, if the programmer writes a condition such as if (sizeof(object) == 15), the refactoring may change the size of the object and therefore become unsafe.
Language complexity
The more complex a language is, the harder it is to formalize its semantics. Compared with Smalltalk and the slightly more complex Java, C++ is a very complex language, so research on refactoring tools for C++ programs lags far behind that for Smalltalk and Java.
How references are resolved
Because C++ resolves most references at compile time, after refactoring a program we usually have to recompile at least part of it and relink the executable before we can test the effect of the refactoring. In contrast, Smalltalk and CLOS provide interpreted execution and incremental compilation. Although Java is not interpreted in this way, it explicitly places each public class in its own compilation unit, which reduces the cost of carrying out a series of refactorings. Because refactoring proceeds in small steps, each followed by testing, the cost of each iteration is relatively high in C++, so programmers become reluctant to make these small changes.
Reflection and meta-level program analysis and change
This may concern researchers more than practitioners. C++ offers little support for meta-level program analysis and change. It has nothing like the metaobject protocol of CLOS. Such protocols are sometimes very useful for refactoring. For example, we can change a selected instance of one class into an instance of another class, and the reflection protocol automatically redirects all references from the old object to the new instance.
Although Java does not have meta-level facilities as powerful as those of CLOS, the evolution of the JDK shows that Java is becoming quite capable in this respect. Something like the example above can also be done in Java.
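As a small, hedged illustration of meta-level analysis in Java, the sketch below uses the standard `java.lang.reflect` API to list the private methods declared in a class, which are the members whose references are confined to that class. It does not redirect any references the way the CLOS example does; it only shows the kind of run-time inspection Java offers. The names `Shape` and `MemberInspector` are hypothetical.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class to inspect.
class Shape {
    private double area() { return 0.0; }
    protected void draw() { }
    public String describe() { return "shape"; }
}

class MemberInspector {
    // Returns the names of the private, non-synthetic methods declared
    // directly in the given class.
    static List<String> privateMethodNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (Modifier.isPrivate(method.getModifiers()) && !method.isSynthetic()) {
                names.add(method.getName());
            }
        }
        return names;
    }
}

class ReflectionDemo {
    public static void main(String[] args) {
        System.out.println(MemberInspector.privateMethodNames(Shape.class));
    }
}
```

A refactoring tool would work on source code rather than on loaded classes, so this is only an analogy for the meta-level information such a tool needs.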
A summary
Based on the comparison above, we think that, of the languages discussed here, Java is the best suited to refactoring. Recent observations also confirm this [Lance Tokuda].
From the practitioner's point of view, the most popular refactoring literature currently uses Java for its examples, including Martin Fowler's Refactoring. There are now several refactoring tools for Java and Smalltalk, but almost none for C++. The complexity of the language has a great deal to do with this.
Of course, this does not mean that C++ programmers should not use refactoring, only that it takes more effort. Refactoring has proved itself to be one of the best ways to evolve OO systems. Don't give up.