Dream Code: A Programmer's Confession (5)

Source: Internet
Author: User

Reprinting of this article is not permitted. http://www.weibo.com/0x2b


In his reply, besides stressing memory usage, he expected arrays to be supported by extending the implementation of string. He really overestimated what I had built. At the same time, he put forward positions that I found hard to understand and accept: he did not like exposing new interfaces, data had to be in char * form, and templates could not be allocated and destroyed across boundaries. Starting from his preset position that a string must be a char *, he raised a number of questions about string which, in my opinion, were rather ridiculous:


1. Does this mean that when data containing many strings is loaded, a large number of string objects will be created?
2. If there are 1000 strings that each contain only a single character, do you want to create 1000 string objects? The implication is that string is too wasteful.
3. Why do I keep the content of the string separate from the string object instead of in the same memory block?
4. Any program will have its own string type and will only access the char * buffer of ADP.
5. Stick to the user's own string, so that the string lives outside our DLL. The user should store it back into the ADP runtime when it is modified.


Question 1 is meaningless. Question 2 does not match the actual situation, and we have plenty of room for optimization anyway. Question 3 baffles me. It can certainly be done with a suitable allocator, but what is the point? What are the benefits? Question 4 is hypothetical. Question 5 can only be put down to weak fundamentals. I doubt he understands what he is saying when he puts the string outside the DLL.


On objects that cross the DLL boundary, I found that O was not the only one with a problem; many people have trouble understanding it. They do not understand why an object cannot cross the boundary, or how the problematic practices actually lead to failures. Much later, around the time stringpointer was implemented, M was also advising a Chinese colleague, W, on this issue, and I replied that he was wrong. He argued with me and wrote a code snippet to prove that he was right. In fact, whether an object can cross the boundary is not hard to judge. You only need to check whether the code on both sides of the boundary makes the same assumptions about the memory layout of the same object (or follows a shared convention, but there was no such agreement). If the assumptions are consistent, the object can cross; otherwise it cannot. Whether the code of the object's methods is identical on both sides does not matter at all. That a debug-build DLL can sometimes be linked with a release-build DLL is one example. Heap memory is one of the most common problems. The reason is that the two sides of the boundary are not using the same heap manager (which is really just a set of data structures). Nowadays, however, the heap is no longer a problem in many cases, for example with the VC runtime library. In fact, the C++ standard library has been passing objects, not object pointers, across boundaries for many years.
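The heap problem described above is commonly avoided by making sure that memory is released by the same module that allocated it, and by keeping the shared type's layout simple and agreed upon. The following is only a minimal sketch of that pattern. The names `Buffer`, `adp_create_buffer`, and `adp_destroy_buffer` are hypothetical and do not come from ADP, and everything lives in one translation unit here, so the "boundary" is simulated rather than real.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>

// Hypothetical type seen by both sides of the boundary. Keeping it a plain
// standard-layout struct makes the shared layout assumption explicit.
struct Buffer {
    char*       data;
    std::size_t size;
};
static_assert(std::is_standard_layout<Buffer>::value,
              "both sides must agree on this layout");

// "Library side" (hypothetical). In a real build these two functions would be
// exported from the same DLL, so allocation and release use the same heap.
Buffer* adp_create_buffer(const char* text) {
    Buffer* b = new Buffer;
    b->size = std::strlen(text);
    b->data = new char[b->size + 1];
    std::memcpy(b->data, text, b->size + 1);  // copy including the terminator
    return b;
}

void adp_destroy_buffer(Buffer* b) {
    if (b == nullptr) return;
    delete[] b->data;
    delete b;
}

// "Client side": never deletes the object itself. The deleter forwards the
// release to the module that performed the allocation.
using BufferPtr = std::unique_ptr<Buffer, void (*)(Buffer*)>;

BufferPtr make_buffer(const char* text) {
    return BufferPtr(adp_create_buffer(text), &adp_destroy_buffer);
}

// Small usage check: the buffer is released through adp_destroy_buffer
// when p goes out of scope.
void example_buffer_roundtrip() {
    BufferPtr p = make_buffer("ADP");
    assert(p->size == 3);
    assert(std::strcmp(p->data, "ADP") == 0);
}
```

The client only ever reads fields whose layout both sides agree on, and ownership always returns to the allocating side, which are the two conditions the paragraph above singles out.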


O redid the string support and decided to store strings in fixed-length buffers: 32, 64, ..., 512 and so on. I was totally defeated. What was even more annoying is that O always discussed his runtime design together with the way the result is stored in XML. What does your runtime structure have to do with the final storage? It is enough to describe clearly what the runtime looks like. We still cannot understand your runtime even if we read the stored XML 100 times. The ones who had it worst at that time were the TD, who had to read the sample XML in order to write test cases for the runtime. I don't know how the TD managed to read it!


Only many years later did I gradually come to understand O's idea and where it went wrong. O wanted a very "low level" runtime, with a nice wrapper added on top for users. The bottom layer would be responsible for performance, and the wrapper for ease of use. I seem to have had a similar idea when I first started writing programs. It is true that software should be layered, but not divided like this. If layering were that simple, software would be far too easy to get right. In addition, this way of layering can hardly escape the abstraction penalty, because the layers separate not only concepts but also the runtime. The more layers, the heavier the penalty. It suddenly occurs to me: is much of the abstraction penalty we see just this kind of meaningless self-abuse? But I no longer have any interest in looking at such code.


To be honest, the attitude in O's emails angered me. We would work overtime to discuss his obscure designs, but O made no real effort to understand the designs and explanations we gave. Just because I constructed an object with placement new, he could say "the code logic is not clear to me" and reject several days of our work. That was the first time I learned that the logic of placement new is confusing.


Meanwhile, the other ADP colleagues were not idle. What impressed me was their research on the ZIP format. They wanted to support two things: zip64 and in-place editing. These two items are actually a signal: zip64 means that ADP wants to handle large data, and in-place editing means that ADP wants to provide high-performance support while the program is running.


Barely half a year after ADP started, failure was already showing its first signs. I think both of ADP's main goals had failed. One goal was the interchangeable file format, and the other was the unified object model within the company. On the first goal, I wrote a long email pointing out what should be done and rejecting the use of memcpy by O and his colleagues to store data. It is a pity that the US side paid no attention at all, even though my manager forwarded my email again in the hope that it would be taken seriously. In my view, ADP could have stopped and reflected at this point: either change the goals or correct the direction. The secondary goal, high-performance support at run time, was already doomed.


Although I was already disappointed with ADP at the time, I was not desperate and thought there was still time to correct the mistakes. I mentioned DBC earlier. Out of a series of misunderstandings came an error-reporting mechanism, which is really a log. Personally I do not like logs, and I disagree with the way many people use them. I think a log is for recording the checkpoints of a workflow, not for verifying the correctness of the program. Program correctness must be guaranteed by UT. A log can also help with error diagnosis, but that is only a by-product. It is like someone who loves shooting videos: the videos are obviously not made to serve as evidence for solving a case, although they may end up playing that role. I wrote a log design for O that provides three interfaces, logger, formatter, and device, used respectively to filter by log level, to format the record, and to specify the output device. The interfaces are designed for customization and replacement. In addition, there is a component that assembles them. I also stressed that the decision to initialize the log system should be left to the end user. It was not a bad design, and O thought it was good. However, it was precisely this thing that finally made me decide to stay away from the main work of ADP, so as to avoid damage to my reputation.
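The three-interface split described above might look roughly like the following. This is a minimal sketch under assumptions, not the original ADP code: the level names, the method signatures, and the helper classes `PlainFormatter` and `MemoryDevice` are all invented for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical log levels, ordered from least to most severe.
enum class Level { Debug = 0, Info = 1, Warning = 2, Error = 3 };

// Formatter: turns a level and a message into one output line.
struct Formatter {
    virtual ~Formatter() = default;
    virtual std::string format(Level level, const std::string& msg) const = 0;
};

// Device: the output target. Anything that can accept a line can be a device.
struct Device {
    virtual ~Device() = default;
    virtual void write(const std::string& line) = 0;
};

// Logger: filters by level, then hands the record to formatter and device.
// The end user decides which formatter and device to plug in, and when.
class Logger {
public:
    Logger(Level threshold, std::shared_ptr<Formatter> f, std::shared_ptr<Device> d)
        : threshold_(threshold), formatter_(std::move(f)), device_(std::move(d)) {}

    void log(Level level, const std::string& msg) {
        if (level < threshold_) return;  // filtered out before any formatting
        device_->write(formatter_->format(level, msg));
    }

private:
    Level threshold_;
    std::shared_ptr<Formatter> formatter_;
    std::shared_ptr<Device> device_;
};

// Example implementations a user might supply.
struct PlainFormatter : Formatter {
    std::string format(Level level, const std::string& msg) const override {
        static const char* names[] = {"DEBUG", "INFO", "WARNING", "ERROR"};
        return std::string("[") + names[static_cast<int>(level)] + "] " + msg;
    }
};

struct MemoryDevice : Device {
    std::vector<std::string> lines;
    void write(const std::string& line) override { lines.push_back(line); }
};

// Explicit initialization by the user, followed by two log calls.
void example_logger_usage() {
    auto device = std::make_shared<MemoryDevice>();
    Logger logger(Level::Warning, std::make_shared<PlainFormatter>(), device);
    logger.log(Level::Info, "checkpoint reached");  // below threshold, dropped
    logger.log(Level::Error, "load failed");
    assert(device->lines.size() == 1);
}
```

Because each piece sits behind its own interface, replacing the output target or the line format does not require touching the logger itself.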

At that point colleague G suddenly stepped in, saying that producer/consumer might be better: the logger implementation could then be very simple, as it only has to package the data and send it to a queue. In this way the producer does not need to consider synchronization, and the consumer side is responsible for it. The producer is user code; the consumer side packages the data into an internal queue that is serviced by a worker thread, and the worker thread is responsible for formatting the data and writing it to the device. The advantage is that log calls do not block. In this way, only one logger implementation would be needed.

At that time, multi-core and parallel computing were hot topics, so it was natural for G to have this idea. The overall idea is not bad, but it is far too immature. The first thing that makes people uncomfortable is the terminology. In all the discussions of multithreading I have seen, nothing that needs no synchronization has ever been called producer/consumer. I can only assume that he had been bombarded with multithreading material a few days earlier. The second is the claim that a single logger implementation is enough. Looking back at the emails today, the difference between G and me in designing software is clear. I think about what the logger is for, how users will use it, and where and how users will need to extend it. G focuses on how the logger should be implemented, how it can be completed, and how to use advanced technology. I think one fundamental difference between us is that I believe libraries and products coexist, so libraries must follow the open-closed principle, whereas in G's eyes components are closed.

In fact, G's so-called better choice was already covered by my design. My device interface has only one write function, which makes it very easy to implement and therefore very easy to extend. This is intentional. To output data to multiple devices at the same time, you only need to implement a pseudo-device class that forwards the data to several other device objects. As for whether output blocks, whether it goes into a queue, whether a thread has to be started: does the logger design need to consider any of this? Leave it to extensions. The logger is also responsible for filtering by log level. Filtering at the earliest possible point is of course the cheapest. A log usually needs to collect some execution data at the call site, which means the record must be formatted before it can be packaged into the queue. This is costly. If there are many logs, performance becomes a real problem. Even if you move the formatting to the worker thread, packaging the data into the queue is still costly. It also limits the data types that can be collected, because the data must be packageable. Obviously, log level filtering must be performed immediately, and it can even be optimized: in theory, down to the overhead of a single flag check.
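The two points above, a pseudo-device that fans out to several devices and level filtering that happens before any formatting work, can be sketched as follows. This is an illustrative sketch only. `TeeDevice`, `MemoryDevice`, `FilteringLogger`, and the integer levels are hypothetical names, and passing the message as a callable is one assumed way to defer formatting, not necessarily what the original design did.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Device with a single write function, as described in the text.
struct Device {
    virtual ~Device() = default;
    virtual void write(const std::string& line) = 0;
};

// Pseudo-device: forwards each line to every registered device.
class TeeDevice : public Device {
public:
    void add(std::shared_ptr<Device> d) { targets_.push_back(std::move(d)); }
    void write(const std::string& line) override {
        for (auto& t : targets_) t->write(line);
    }

private:
    std::vector<std::shared_ptr<Device>> targets_;
};

// Simple in-memory device used here to observe the output.
struct MemoryDevice : Device {
    std::vector<std::string> lines;
    void write(const std::string& line) override { lines.push_back(line); }
};

// Logger that checks the level first. The message is produced by a callable,
// so a filtered-out record costs one comparison and no formatting.
class FilteringLogger {
public:
    FilteringLogger(int threshold, std::shared_ptr<Device> device)
        : threshold_(threshold), device_(std::move(device)) {}

    bool enabled(int level) const { return level >= threshold_; }

    template <class MakeMessage>
    void log(int level, MakeMessage&& make_message) {
        if (!enabled(level)) return;     // the single cheap check
        device_->write(make_message());  // formatting happens only here
    }

private:
    int threshold_;
    std::shared_ptr<Device> device_;
};

// Usage: one logger, two outputs, and a dropped record that is never built.
void example_tee_and_filter() {
    auto first = std::make_shared<MemoryDevice>();
    auto second = std::make_shared<MemoryDevice>();
    auto tee = std::make_shared<TeeDevice>();
    tee->add(first);
    tee->add(second);

    FilteringLogger logger(2, tee);
    int built = 0;
    logger.log(1, [&] { ++built; return std::string("verbose detail"); });
    logger.log(3, [&] { ++built; return std::string("write failed"); });
    assert(built == 1);
    assert(first->lines.size() == 1 && second->lines.size() == 1);
}
```

A queueing or asynchronous output would be just another `Device` implementation in this arrangement, which is the sense in which such decisions can be left to extensions.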

Originally, this log was implemented in a DLL. G once asked for it to be put into a static library called core. O ignored G's suggestion to redesign it, but did move the code into core. As a result, we hit a bug: two DLLs linked core, and two log manager instances appeared. In fact, my original design already provided a solution. As long as the log system is initialized explicitly, the two log systems can share the same logger, formatter, and device objects. G was very dissatisfied with this and thought that we would not have had this problem if we had followed his advice. In fact, implementing it his way would only have caused bigger problems. How to start the worker thread alone causes a lot of trouble. And so it proved. In any case, G later produced his own design and had another colleague implement it. From then on, this thing kept tormenting us. One moment the performance went wrong; the next, it deadlocked during DLL loading or at program exit. Some products required that the service not be started by default, and others required that it be started. Some products required automatic start, and later did not. The program would crash, or fail to exit. This feature brought us many crash reports. Of course, fixing such bugs shows off the presence of ADP. How cool. From then on, I had no motivation to think about the direction of ADP. I simply watched from the sidelines, though I was not reconciled to it.

In the days that followed, I found that I had accomplished nothing, and ADP had nothing worth looking at. Someone wrote an iterator (not STL style, but first/next/valid style) whose constructor copies all element pointers of the corresponding container into a container member variable of the iterator. The advantage is that the traversal is thread-safe, and the complexity "improves" from O(1) to O(n). As for the iterator that I was forced to assemble from components using policy-based design, no one could change the code once it was written. People who did not understand the memory model dared to use lock-free all the same. Create a lock for each element in the container, using lock-free technology of course, and the locks must be created on demand. Replace the mutex with an atomic flag, and yield the CPU instead of waiting on a lock. So unrestrained and passionate. It is a pity that no phoenix was reborn from the fire, only ashes.
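To make the cost of the snapshot iterator concrete, here is a minimal sketch of a first/next/valid style iterator that copies every element pointer in its constructor. The class name `SnapshotIterator` and its members are hypothetical; the original ADP code is not available, so this only illustrates the technique as described.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// first/next/valid style iterator that snapshots the element pointers of a
// std::vector when it is constructed.
template <class T>
class SnapshotIterator {
public:
    explicit SnapshotIterator(std::vector<T>& container) {
        snapshot_.reserve(container.size());
        // O(n) work and one allocation before the traversal even starts.
        for (T& item : container) snapshot_.push_back(&item);
    }

    void first() { pos_ = 0; }
    void next() { ++pos_; }
    bool valid() const { return pos_ < snapshot_.size(); }
    T& current() const { return *snapshot_[pos_]; }
    std::size_t snapshot_size() const { return snapshot_.size(); }

private:
    std::vector<T*> snapshot_;  // the copied pointers, not copies of elements
    std::size_t pos_ = 0;
};

// Usage: walk the snapshot and sum the elements.
int example_sum(std::vector<int>& values) {
    int sum = 0;
    SnapshotIterator<int> it(values);
    for (it.first(); it.valid(); it.next()) sum += it.current();
    assert(it.snapshot_size() == values.size());
    return sum;
}
```

Note that the snapshot holds pointers to the live elements. If another thread erases elements or causes the vector to reallocate during the traversal, those pointers dangle, so in this sketch the copy pays the O(n) cost without by itself making the traversal safe.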


At that time, I already had several years of exposure to multithreading, parallel computing, and lock-free programming, at least from textbooks and articles. Although I had not yet chewed through CSP, I knew the direction. But what was the use? I tried to repeat Andrei Alexandrescu's warning to G, who was swinging the lock-free sledgehammer: "generic programming is very difficult, and exception-safe code is also very difficult, but compared with multithreading, the two are nothing more than baby's milk". Looking back, not telling them that there was such a thing as MPI was probably the only correct choice I made in the ADP project. Gradually, when other team members asked me questions about ADP, I became reluctant to explain the junk design. I felt helpless when someone asked me, in all humility, why it had been done this way. I could not explain it and could only excuse myself: this code was not written by me, and it was not designed by me. What a sad thing.

ADP also pursued performance in a pathological way: using memcpy, avoiding the traditional mutex, replacing boost.thread with TBB, as if all of this were really demanded by performance. Yet to read a single integer from the ADP runtime, you must first look up a map by string name to determine the offset, and then get back a shared_ptr to a newly created object. To read just one integer, the whole process needs an O(log n) lookup and three memory allocations! In runtime-related tests, memory allocation and release at one point took 70% of the execution time. What is the point of such a result? Efficient memory management? G was not to be outdone either and implemented a graph with locally exclusive locks. The intent was that when multiple threads access the same graph and the parts they operate on do not overlap, lock waits are avoided. I never understood the implementation of that graph from beginning to end, but I could test it. The result: the fitted complexity of the test data was O(n^4)! I cannot imagine how anyone could have the nerve to sell such a thing to other teams. Something is obviously a local variable, and stack allocation would be enough, yet it has to be created with new. Do you think pointers are efficient? There is enough material here for a whole book on C++.
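The access pattern criticized above can be illustrated with a small sketch. The `Record` class and its methods are hypothetical and are not the ADP API. The sketch only contrasts a per-read lookup by string name that returns a freshly allocated `shared_ptr` with resolving the offset once and reading directly afterwards. The slow path below performs one heap allocation per read; the text reports three in the real code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical record that stores int fields in a byte buffer and keeps a
// name-to-offset map, loosely following the description in the text.
class Record {
public:
    // Registers an int field and returns its byte offset in the buffer.
    std::size_t add_int(const std::string& name, int value) {
        std::size_t offset = bytes_.size();
        bytes_.resize(offset + sizeof(int));
        std::memcpy(&bytes_[offset], &value, sizeof(int));
        offsets_[name] = offset;
        return offset;
    }

    // Costly path: an O(log n) string-keyed lookup plus a heap allocation
    // for every single read. Returns an empty pointer for unknown names.
    std::shared_ptr<int> get_int(const std::string& name) const {
        auto it = offsets_.find(name);
        if (it == offsets_.end()) return nullptr;
        return std::make_shared<int>(read_at(it->second));
    }

    // Cheaper path, step 1: resolve the name to an offset once.
    bool find_offset(const std::string& name, std::size_t& out) const {
        auto it = offsets_.find(name);
        if (it == offsets_.end()) return false;
        out = it->second;
        return true;
    }

    // Cheaper path, step 2: read by offset with no lookup and no allocation.
    int read_at(std::size_t offset) const {
        int value = 0;
        std::memcpy(&value, &bytes_[offset], sizeof(int));
        return value;
    }

private:
    std::map<std::string, std::size_t> offsets_;
    std::vector<unsigned char> bytes_;
};

// Usage: both paths return the same value, at very different cost per read.
void example_record_reads() {
    Record r;
    r.add_int("width", 640);
    std::size_t offset = 0;
    assert(r.find_offset("width", offset));
    assert(r.read_at(offset) == 640);
    assert(*r.get_int("width") == 640);
}
```

In a loop that reads the same field many times, the second path pays for the name lookup once, which is the kind of saving the lookup-per-read design gives up.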


Gradually, I found myself far from the core of development. I started doing all sorts of odd jobs: using scons on Linux, using swig to export APIs (I have to complain that swig does not support nested classes, which nearly killed me), writing Python/C# samples, fixing memory leaks, performance tuning, and upgrading the compiler. I even made a strange record and playback tool for API call sequences (here I was tortured by dynamic_cast: there were too many of them, and they could not simply be searched and replaced, because some did not need to be processed). Or I did things that look a little difficult, such as writing a memory pool, because boost's was too slow, and optimizing the heap. There was one more task that drove me crazy: a replacement for boost's shared_ptr/weak_ptr with only one additional requirement, that it also support an intrusive counter. shared_from_this does not meet this requirement, because it still needs a separate counter. If you are interested, you can take up the challenge. I could not do it anyway: I had to create a counter no matter what. Yet none of the people in the United States who reviewed the code made any comment.

What made me happy, though, was that my baby was born before the Spring Festival of 2008. Shanghai had heavy snow that year. The project, however, kept getting worse. I am not sure whether anyone else felt the same as I did. Perhaps some felt it but would not admit it. I did very little work throughout 2008. Should I have pushed ADP so that it would collapse sooner, or would that only have made it last longer? Contrary to my pessimism, ADP was still receiving a lot of resources. My mood got worse and worse, and I wanted to kill ADP. I even thought: why not write an ADP prototype of my own, even if I was the only one working on it? However, I had almost no spare time. Sometimes I wrote a few lines, could not see what the point of doing so was, and gave up again.

By the end of, three ADP colleagues moved to protein, and I continued to work on ADP. Protein is also a library, for processing 3D material packages. By the beginning of, ADP began to be integrated into protein. At this time, the company began to lay off workers. I was waiting to be cut, to take a sum of money and leave. I did not want to waste any more time on ADP. Strangely, however, I was not among those cut. Was it because I was still cheap? In any case, since I had not been laid off, the work would continue.


PS: I know that the terms and content of this installment are hard to understand. I want to reflect, at the technical level, on the various failures of the past, so I inevitably talk more about what I consider to be technical mistakes, rather than just complaining like someone nursing a grievance, which is not what I want. However, as I mentioned earlier, it is hard to communicate even with technical people, let alone with outsiders. It is not that I do not want to write more clearly; I really lack the ability. If specific problems are pointed out, I will revise as appropriate. General complaints leave me at a loss. If you can understand it, I am glad; if not, so be it.
