A is my code that processes a C/C++ static library (lib) file: it loads the file into memory, parses its binary format, and extracts the data it contains (OBJs, sections, symbols, relocations, and string tables).
B is my code that reads a lib file, modifies and adjusts some of its data, and then writes a new lib file. Before it can rewrite anything it has to load and parse the data, so it naturally reuses A.
A has been tested and used for a long time without problems. B's code has been carefully checked many times and should not be the problem either. So why is the generated lib file invalid (not recognized by the linker)? After a long and painful troubleshooting process, I found the root cause: A modifies individual bytes of the data it loads into memory (so the in-memory lib data is no longer valid), and B processes and writes its output based on the memory that A has modified.
In fact, from A's point of view, modifying the lib file data in memory is entirely reasonable. Its job is to extract the information in the lib; how it does so is an internal detail, and it never modifies the lib file itself, which is its input. When A was written there was no B, and A is not responsible for how B uses it. From B's point of view, it does a lot of work and makes sure that work contains no mistakes. That should be enough; its code is fine.
Neither A nor B is wrong on its own, but putting the two together produces a bug. It is a frustrating outcome. On careful analysis, both A and B bear responsibility, and B bears more. B does not check and confirm its input data; it simply assumes the input is valid. A's fault is that it modifies the input data in memory, which is an obscure and fragile practice. It is a latent source of errors and should be deliberately avoided.
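To make the point about B checking its input concrete, here is a minimal sketch in C. The function and macro names are hypothetical and are not taken from my actual code. It assumes the usual archive layout: an 8-byte `!<arch>\n` signature at the start of the file, and 60-byte member headers whose first 16 bytes are the space-padded name and whose last 2 bytes are the end-of-header marker. Under that assumption, a NUL byte inside a name field is a sign that something has touched the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AR_SIGNATURE     "!<arch>\n"  /* archive signature at file start */
#define AR_SIGNATURE_LEN 8
#define AR_HEADER_LEN    60           /* size of one member header */
#define AR_NAME_LEN      16           /* size of the name field */

/* Hypothetical check: returns 1 if buf starts with the archive signature. */
static int lib_has_signature(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    return len >= AR_SIGNATURE_LEN
        && memcmp(buf, AR_SIGNATURE, AR_SIGNATURE_LEN) == 0;
}

/* Hypothetical check: returns 1 if the 60-byte member header at hdr looks
   untouched, i.e. the name field contains no NUL byte and the last two
   bytes are the end-of-header marker "`\n". */
static int lib_header_is_valid(const unsigned char *hdr)
{
    assert(hdr != NULL);
    if (memchr(hdr, '\0', AR_NAME_LEN) != NULL)
        return 0;  /* a space-padded name field should never contain NUL */
    return hdr[AR_HEADER_LEN - 2] == '`' && hdr[AR_HEADER_LEN - 1] == '\n';
}
```

Had B run a check like this on every member header before writing, it would have rejected the buffer as soon as A had written its first '\0', instead of producing a lib that only the linker could expose as broken.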
Afterwards I re-read A's code and found comments stating clearly that the in-memory input data is modified at this point. So I knew at the time that this code was a risk, yet I wrote it that way anyway. Why? The story starts at the beginning.
In the lib file format (the UNIX archive format), each OBJ has a corresponding archivememberheader struct, and the three special members (first linker member, second linker member, longnames member) each have one as well. The struct stores the basic information about its member. Its first field (name), a 16-byte char array, stores the member's name. According to the documentation, this array takes one of the following forms: /, //, xxxx/, /N, padded with trailing spaces when shorter than 16 bytes. Note that there is no terminating '\0' as in a C string. To return the name, I either have to allocate memory with malloc and copy the name into it, or write a '\0' into the field and return its start address. To use less memory and run faster, I chose the latter.
Consider each form of name. The first two have a fixed length and are always followed by several spaces, so overwriting the first space with '\0' is enough. In the third, the trailing / is a terminator and not part of the name, so overwriting it with '\0' is enough, and there is no risk of writing past the 16-byte boundary. In the fourth, N is the offset of the name within the string table of the longnames member. The name there is already terminated by '\0' and needs no special handling, so I only have to compute its start address from N. In short, this approach avoids memory allocation and data copying, which makes it a reasonable choice, if a debatable one.
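The four cases can be sketched in C roughly as follows. This is an illustration of the technique, not my actual code: the function name `member_name_in_place`, the `longnames` parameter, and the NULL return for malformed fields are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define AR_NAME_LEN 16  /* size of the name field in the member header */

/* Hypothetical helper: returns the member name as a C string WITHOUT
   allocating, by writing a '\0' into the 16-byte field itself.
   longnames points at the longnames string table (may be NULL if the
   archive has none). Returns NULL if the field is malformed.
   Not idempotent: a second call on the same field sees the '\0'. */
static char *member_name_in_place(char *field, char *longnames)
{
    assert(field != NULL);

    if (field[0] == '/') {
        if (field[1] == ' ') {                     /* case 1: "/" */
            field[1] = '\0';                       /* first space -> NUL */
            return field;
        }
        if (field[1] == '/' && field[2] == ' ') {  /* case 2: "//" */
            field[2] = '\0';                       /* first space -> NUL */
            return field;
        }
        /* case 4: "/N", N is a decimal offset into longnames */
        if (longnames == NULL || field[1] < '0' || field[1] > '9')
            return NULL;
        size_t off = 0;
        for (int i = 1; i < AR_NAME_LEN
                        && field[i] >= '0' && field[i] <= '9'; i++)
            off = off * 10 + (size_t)(field[i] - '0');
        return longnames + off;                    /* field is not modified */
    }

    /* case 3: "xxxx/", skip the padding, then overwrite the trailing '/' */
    int end = AR_NAME_LEN - 1;
    while (end > 0 && field[end] == ' ')
        end--;
    if (field[end] != '/')
        return NULL;
    field[end] = '\0';                             /* stays inside 16 bytes */
    return field;
}
```

Only the fourth case leaves the header untouched. In the other three, exactly one byte of the field changes, which is all it takes to make the in-memory image differ from a valid lib file.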
Finally, how did I solve the problem? I kept A's approach and had B change back the data that A had changed. I would rather restore the data afterwards than give up the earlier design. After all, I made that decision after weighing its merits, and there is no need to reject it in hindsight. Besides, restoring the data in B is no more trouble than rewriting A's code would have been.
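A restore step of this kind might look like the sketch below. The helper name `restore_name_field` is hypothetical, and the logic assumes that A overwrote at most one byte per name field: the first padding space for the / and // forms, or the trailing / for the xxxx/ form.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AR_NAME_LEN 16  /* size of the name field in the member header */

/* Hypothetical helper for B: puts back the single byte that A replaced
   with '\0' in a 16-byte name field. Returns 1 if a byte was restored,
   0 if the field contains no NUL and was left alone. */
static int restore_name_field(char *field)
{
    assert(field != NULL);

    char *nul = memchr(field, '\0', AR_NAME_LEN);
    if (nul == NULL)
        return 0;                       /* "/N" form, or never touched */

    size_t pos = (size_t)(nul - field);
    /* "/" and "//" lost their first padding space; "xxxx/" lost its '/' */
    int is_special = (pos == 1 && field[0] == '/')
                  || (pos == 2 && field[0] == '/' && field[1] == '/');
    *nul = is_special ? ' ' : '/';
    return 1;
}
```

B would call this on every member header just before writing the new lib file. It depends on knowing exactly what A does to the buffer, which is the price of keeping A unchanged.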