Every Java user has presumably used the JDK collections: List, Set, and Map. Every day this code runs in thousands of JVMs around the world, and every day programmers use these classes. Do you know who wrote this cool code? Joshua Bloch. He used to work at Sun. He has since moved to Google, and Google now seems invincible. He is a master. Master though he is, he still writes code himself, so what he has to say is bound to be down-to-earth. By contrast, some people are all talk: in meetings they draw a bunch of boxes and lines on the whiteboard and then tell others to "go code it". Although I have written Java for many years and designed many things, I have still benefited greatly from his experience and design ideas. Although the interview was given some time ago, the content is still worth reading. I translated this interview (Joshua Bloch: A conversation about design, JavaWorld). Now let's hear what he has to say about design. As a Java developer, I have added my own comments along the way, inside /* */ :)
Bill Venners: In the preface to your book, Effective Java Programming Language Guide, you talk about API design. I work this way too. If I manage a large software project, I break the whole system into subsystems and have developers design the interfaces of those subsystems. Those interfaces are APIs. How does this API-centered approach compare with the popular Extreme Programming (XP), and where does API design fit in software development?
Joshua Bloch: In my own development experience, I have seen too many monolithic structures. When someone sets out to design, say, a record-oriented file system, they build it as one piece instead of breaking the whole system down as you just described. /* Every programmer has written this kind of "stubborn stone" code */ Breaking the system down is very important, but designing each subsystem as a good freestanding abstraction is equally important.
In practice, rather than decomposing a system into subsystems, people more easily end up with software full of reverse dependencies. That is, the low-level system is designed only for its initial high-level client. This is especially true of inexperienced programmers. The variable names used by that initial high-level client leak into the low-level module. By the end of development, no module can be reused. The result is the "stubborn stone" I mentioned earlier.
In my book, I describe the reason for decomposing a system into subsystems: so that the subsystems are decoupled. Code you write for one purpose can often be used well elsewhere, provided the subsystem is a well-designed, freestanding abstraction.
Venners: Early on, OO was touted as a way to improve software reuse. In practice, reuse is hard to achieve. Because each person's requirements always differ somewhat from what existing software provides, people write new software from scratch. Or perhaps the software was simply not designed to be reused, so people write their own code anyway. How important do you think software reuse is?
Bloch: I think reuse is extremely important, but it is also very difficult to achieve. At Sun I have developed the Java Collections Framework, java.math, and so on. These are reusable components, since thousands of people already use these APIs.
In my previous job, I found that 75% of the code I wrote was used by other systems. To achieve this degree of reuse, I had to design the code very carefully and spend a lot of time separating the subsystems cleanly. I wrote unit tests for these subsystems so that each could be debugged independently.
However, many developers do not work this way. The philosophy of Extreme Programming is to solve a problem in the simplest way that works. This is a good idea, but it is easy to misunderstand.
In fact, the Extreme Programming masters do not advocate solving a problem with whatever code is quickest to write, and they do not tell everyone to give up design. They advocate leaving out features you do not yet need and adding them later. This is a very important idea, because new features can be added at any time, but existing features cannot be removed at will. You can't say, "Sorry, we screwed up this feature. We need to cut it," because other code depends on it. People would be furious. So if you have any doubt about whether a feature is needed, leave it out for now.
Extreme Programming also emphasizes refactoring. Much of the time spent refactoring goes into cleaning up code and APIs and extracting modules. This work is very important, and I recommend not freezing these APIs too early. That said, if you design and divide the modules carefully early in development, you save yourself work at the end.
Venners: Why?
Bloch: Practice shows that it is hard to refactor a large amount of code. /* I often see people write a lot of messy code and use refactoring as an excuse */ Faced with a tightly coupled system, you have to hunt for duplicated code in every part of it. Refactoring that much code properly is a huge job. If the system is loosely coupled, on the other hand, you can easily adjust the division between modules when you find a problem.
However, it would be wrong to see the API design approach I advocate as contradicting Extreme Programming. If you discuss programming with an Extreme Programming master like Kent Beck, you will find that he does much of what I advocate. /* Ha, perhaps this refers to Kent Beck's design of JUnit */
Now let's go back to the first question. If you are a manager, of course you should give your team members a good design before they start writing code. At the same time, you should not have them design every detail up front. Start with a minimal design that is enough for them to complete the whole job.
Venners: So it sounds like Extreme Programming recommends building the system from the simplest set of features that works, rather than hacking code together just to get the system running.
Bloch: That's right. In fact, people who hack code together often spend more time than those who design their modules carefully. Of course, API design also takes time.
If hacked-together code is published as a public API, maintaining that low-quality API becomes a huge burden and leads to strong customer dissatisfaction.
Bill Venners: In your book, you say that thinking in terms of APIs improves code quality. Can you explain why you think so?
Josh Bloch: I am talking about large-scale programs here. If you only face small-scale problems, it is relatively easy to write high-quality code. If you can break a problem down into separate functions, you only need to concentrate on one thing at a time, and you do it better. In this way, you turn writing a large program into writing small programs.
In addition, modular decomposition is a key factor in software quality. In a tightly coupled system, modifying one module can stop the whole system from working. But if you use APIs to draw a clear division between modules, you can maintain and improve one module without affecting the others.
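As a rough illustration of that point, here is a minimal Java sketch. All names in it (Sorter, InsertionSorter, LibrarySorter, Report) are hypothetical and do not come from the interview. The client class depends only on a small interface, so one implementation can be replaced by another without touching the client.

```java
import java.util.Arrays;

// Hypothetical module boundary: clients see only this interface.
interface Sorter {
    float[] sorted(float[] input);
}

// First implementation: a simple insertion sort on a copy of the input.
final class InsertionSorter implements Sorter {
    public float[] sorted(float[] input) {
        float[] a = input.clone();
        for (int i = 1; i < a.length; i++) {
            float key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
        return a;
    }
}

// Later replacement: delegates to the JDK. Client code does not change.
final class LibrarySorter implements Sorter {
    public float[] sorted(float[] input) {
        float[] a = input.clone();
        Arrays.sort(a);
        return a;
    }
}

// Client module: knows the Sorter API, not any implementation.
final class Report {
    private final Sorter sorter;

    Report(Sorter sorter) {
        this.sorter = sorter;
    }

    // Assumes a non-empty array; kept minimal for the sketch.
    float smallest(float[] values) {
        return sorter.sorted(values)[0];
    }
}

class SorterDemo {
    public static void main(String[] args) {
        float[] data = {3f, 1f, 2f};
        System.out.println(new Report(new InsertionSorter()).smallest(data)); // 1.0
        System.out.println(new Report(new LibrarySorter()).smallest(data));   // 1.0
    }
}
```

Swapping InsertionSorter for LibrarySorter changes nothing in Report, which is the kind of isolation the paragraph above describes.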
Bill Venners: Can you explain the "large-scale programs" and "small-scale programs" you just mentioned?
Josh Bloch: A large-scale program is one that solves a problem too big for a single small program, so the problem must be broken down into sub-problems. Large-scale programs carry the inherent complexity of big problems. In contrast, a "small-scale program" is like the question of "how best to sort a float array."
Venners: Next, let's discuss trusting client code. To what extent should we trust client code? In your book, you talk about making defensive copies of objects passed as parameters. Defensive copying is aimed at untrusted client code. Doesn't this sacrifice efficiency in exchange for robustness? For example, copying a large object affects efficiency. /* A defensive copy here means creating a new object and copying the original's values into it, rather than passing along the same reference, much like a deep copy */
Bloch: Obviously, this is a trade-off. On the one hand, I do not optimize for performance prematurely. If performance is not yet a problem, there is no need to worry about it. The other consideration is immutability. /* For example, the String class */ If an object cannot be modified, there is no need to copy it.
Of course, in some cases the client code can be trusted, and we can be sure our code will not be called incorrectly. In that case, we can trade robustness for efficiency. In C code we often see comments such as "Please make sure the caller of this function will not change this object". Every C or C++ programmer has written such comments. But sometimes you forget that these objects must not be modified. Even worse, although you know they must not be modified, you accidentally pass them to other parts of the program. At that point you no longer control them and cannot guarantee that they will not be modified.
Rather than relying on client code being safe, it is simpler to make defensive copies or to use immutable objects. Unless the program demands high performance, the most straightforward approach is not to hand out these objects directly. The recommended approach is to write the code first and then measure whether it runs fast enough. Only if it does not meet the required performance should you carefully consider relaxing the safety restrictions.
In general, we should not allow faulty client code to break the programs we write. We want to contain failures within modules so that they do not propagate from one module to another. This includes guarding against deliberate attacks. More broadly, defensive programming also defends against sloppy code and poor documentation, because some authors of client code do not know whether they are allowed to modify these objects.
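The defensive copying discussed above can be sketched in Java as follows. This is only an illustration under stated assumptions: the Period class and its fields are hypothetical names chosen here, not code quoted from the interview. The constructor copies its mutable java.util.Date parameters, and the accessors return copies, so client code cannot change the object's internal state.

```java
import java.util.Date;

// Hypothetical class for illustration; not taken from the interview.
final class Period {
    private final Date start;
    private final Date end;

    Period(Date start, Date end) {
        // Copy first, then validate the copies, so the caller cannot
        // change the values between the check and the assignment.
        this.start = new Date(start.getTime());
        this.end = new Date(end.getTime());
        if (this.start.after(this.end)) {
            throw new IllegalArgumentException("start is after end");
        }
    }

    // Return copies so callers cannot reach the internal Date objects.
    Date start() {
        return new Date(start.getTime());
    }

    Date end() {
        return new Date(end.getTime());
    }
}

class PeriodDemo {
    public static void main(String[] args) {
        Date s = new Date(1000L);
        Date e = new Date(2000L);
        Period p = new Period(s, e);

        e.setTime(0L);                          // mutate the caller's object afterwards
        System.out.println(p.end().getTime());  // 2000: the Period is unaffected
    }
}
```

If the fields were of an immutable type such as java.time.Instant, no copying would be needed at all, which is the immutability alternative mentioned in the answer.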
- Defensive copying and contracts
Venners: If I make a defensive copy of an object passed to a constructor, should I document that? If I do not, I keep the option of removing the defensive copy later for efficiency. But without documentation, client programmers cannot know whether the constructor makes a defensive copy. They may make a defensive copy themselves before passing the object to the constructor. The result is two defensive copies.
Bloch: If we do not document defensive copies, will client programmers modify these objects? The answer is clearly that only overly cautious programmers will refrain from modifying them for fear of breaking your code. In reality programmers are not that cautious; of course they will modify them. If the documentation does not forbid something, programmers assume they may do it as they see fit. So we should document defensive copies, so that even an overly cautious client programmer can handle the input parameters sensibly according to the documentation.
Ideally, we should document defensive copies. But if you look at the code I wrote, you will find that I did not write this documentation myself. I did defend against sloppy client code in the code, but I did not document it.
For code with a wide range of clients, there is a problem: people look at your code and find that the documentation and the code are inconsistent. My usual answer is: well, I have learned something since then, and I will correct it. So the question comes down to how carefully you write your documentation. Maybe I was not careful enough, maybe I was not rigorous enough...
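One way to put the defensive copy into the documented contract is a Javadoc comment on the constructor. The sketch below assumes a hypothetical Roster class; the class, its method names, and the wording of the comment are illustrative only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical example class; the names are illustrative only. */
final class Roster {
    private final List<String> names;

    /**
     * Creates a roster holding the given names.
     *
     * <p>The list is copied defensively: later changes to {@code names}
     * do not affect this roster, so callers need not copy it themselves.
     *
     * @param names the initial names; must not be null
     */
    Roster(List<String> names) {
        this.names = new ArrayList<>(names);
    }

    /**
     * Returns an unmodifiable view of the names. Attempts to modify the
     * returned list throw {@link UnsupportedOperationException}.
     */
    List<String> names() {
        return Collections.unmodifiableList(names);
    }
}
```

With the copy stated in the contract, a client no longer has to guess and make a second copy, at the cost that the copy can no longer be removed silently later.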
- API design and refactoring
Venners: How should we approach API design during refactoring? You said that developers should not need to do a lot of refactoring.
Bloch: That is my view. In addition, refactoring is often API design after the fact. When you write code and find repeated code in many places, you design carefully and pull that code out into a module. Seen this way, it is the same as designing up front. In fact, the two approaches always coexist, because development is an iterative process. When you design software from scratch, you can verify that the design is right only when you actually use it. Even people with many years of development experience cannot get a design right in one attempt. /* I thought a master like him could work out the whole design at the start */
Doug Lea and I often talk about this. When we use the code we write together, we find that it does not always work. Then we look back and find the problems in the API design. Does this mean that Doug and I are both stupid? Of course not. No one can predict exactly what an API needs before using it. This is why, when writing interfaces and abstract classes, it is important to write as many implementations of them as possible before committing to the API. It is very hard to change an API after it is released, so it is best to get it right beforehand.
Venners: Should I trust that objects passed to me honor their contracts? Recently I wrote a class that implements the Set interface so that it can be used across different virtual machines after serialization. The class extends AbstractSet, which you wrote, and I named it ConsistentSet. I gave it a constructor that accepts a Set and stores that Set's elements in an internal array.
While writing the ConsistentSet class, I wondered whether the Set passed to the constructor might contain duplicate elements, which would violate the contract of the Set interface. Should I check the incoming Set for duplicates? From an OO perspective, though, such a check goes against the basic idea that each object is responsible for itself.
Bloch: I don't think you have a choice. You can only trust objects that implement the interface. Once someone violates these contracts, the whole system falls into chaos. A simple example: equal objects must have equal hash codes. If someone violates this rule, neither hash tables nor hash sets work properly.
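The equals and hashCode rule can be seen in a small Java sketch. BrokenPoint and GoodPoint are hypothetical classes written for this illustration. BrokenPoint overrides equals but inherits hashCode from Object, so two equal points normally get different hash codes and a HashSet fails to find an element that is "in" it.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class that overrides equals but NOT hashCode,
// which violates the contract between the two methods.
final class BrokenPoint {
    final int x;
    final int y;

    BrokenPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof BrokenPoint)) {
            return false;
        }
        BrokenPoint p = (BrokenPoint) o;
        return x == p.x && y == p.y;
    }
    // hashCode() is inherited from Object, so equal points
    // usually hash to different values.
}

// Same class with the contract honored.
final class GoodPoint {
    final int x;
    final int y;

    GoodPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof GoodPoint)) {
            return false;
        }
        GoodPoint p = (GoodPoint) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}

class PointDemo {
    public static void main(String[] args) {
        Set<BrokenPoint> broken = new HashSet<>();
        broken.add(new BrokenPoint(1, 2));
        // Almost always false: the lookup goes to a different hash bucket.
        System.out.println(broken.contains(new BrokenPoint(1, 2)));

        Set<GoodPoint> good = new HashSet<>();
        good.add(new GoodPoint(1, 2));
        System.out.println(good.contains(new GoodPoint(1, 2))); // true
    }
}
```

The HashSet itself is correct in both cases; it simply relies on its elements keeping their side of the contract.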
In summary, once an object violates its contract, the objects that work with it will misbehave, as in the example I just mentioned. I know this bothers you, but I think you still have to trust these interface contracts. If you are really uneasy about it, you can take a cue from others and adopt the "trust but verify" strategy. The best way to do that is with assertions /* You can turn this feature on and off with a Java option */, because you can enable and disable assertions at any time. You can use assertions to verify that objects honor their contracts. If the program misbehaves, you can enable assertions to find out what went wrong.
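A "trust but verify" check for the ConsistentSet from the question might look like the Java sketch below. The internals shown here are assumed, since the interview only says that the class extends AbstractSet and keeps the elements in an internal array; serialization is left out. The duplicate check is an assert, so it runs only when the JVM is started with assertions enabled (for example with the -ea option) and costs nothing otherwise.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Objects;
import java.util.Set;

// A sketch of the ConsistentSet from the question; the internals are assumed.
final class ConsistentSet<E> extends AbstractSet<E> {
    private final Object[] elements;

    ConsistentSet(Set<? extends E> source) {
        elements = source.toArray();
        // Trust but verify: evaluated only when assertions are enabled.
        assert noDuplicates(elements) : "source violates the Set contract";
    }

    // Quadratic, equals-based check, so it does not depend on hashCode.
    private static boolean noDuplicates(Object[] items) {
        for (int i = 0; i < items.length; i++) {
            for (int j = i + 1; j < items.length; j++) {
                if (Objects.equals(items[i], items[j])) {
                    return false;
                }
            }
        }
        return true;
    }

    @Override
    public Iterator<E> iterator() {
        return new Iterator<E>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < elements.length;
            }

            @Override
            @SuppressWarnings("unchecked")
            public E next() {
                if (next >= elements.length) {
                    throw new NoSuchElementException();
                }
                return (E) elements[next++];
            }
        };
    }

    @Override
    public int size() {
        return elements.length;
    }
}

class ConsistentSetDemo {
    public static void main(String[] args) {
        Set<String> source = new HashSet<>(Arrays.asList("a", "b", "c"));
        ConsistentSet<String> copy = new ConsistentSet<>(source);
        System.out.println(copy.size());         // 3
        System.out.println(copy.equals(source)); // true
    }
}
```

In normal runs the constructor simply trusts the incoming Set. When something goes wrong, rerunning with assertions enabled shows whether a misbehaving Set implementation is the cause.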
Lu Shengyuan <michaellufhl@yahoo.com.cn>