Yesterday the server went down, and the log it left behind was very strange: it crashed on a pure virtual function call.
The log shows that virtual calls on the combat object had been working normally, but some time later the object "lost its polymorphism": the call went straight to the parent's virtual function and the process died with a pure virtual function call error.
It ran fine on Windows but crashed on Linux. The old project's Linux tooling is incomplete and the debug build does not compile, so the log was all I had; and the crash could not be reproduced on Windows either.
The process of tracking down this bug was quite interesting, so I am writing it down (*^__^*)
I had never run into a bug like this before, so of course I had no clue at first.
My first thought was a memory overwrite, which is common in C++. But the ordinary (non-pure) virtual functions could still be called normally, so the object had probably not been trashed by something like a memset.
Analyzing further: for the call to land correctly in the parent's virtual function, the object's virtual pointer must be pointing at a valid parent virtual table.
Recall how C++ constructs an object:
1. Suppose CBase has a few pure virtual functions and CObj inherits from it.
2. CObj is constructed in this order: the CBase part is constructed first, and at that point the virtual pointer at the start of the object points to CBase's virtual table ... Then the new CObj part is constructed, which rewrites that virtual pointer to point to CObj's virtual table.
If destruction runs in the reverse order, then after an object is destroyed, a CBase* pointer still stored in a container points to an object whose virtual pointer has been rewritten to point to CBase's virtual table.
That would produce exactly what the log shows.
But I was not sure whether a destructor really changes the virtual pointer. It was only a guess, based on the idea that destruction should mirror construction.
I could not find any information online either, so I decided to write some code and experiment ~~ The result: the vptr did not change.
....... No clue again.
That night it occurred to me that compilers optimize copy constructors: a default-generated copy constructor has no side effects, so it is never actually called and is reduced to a plain byte copy.
My test code did not explicitly declare a destructor, so could the compiler have skipped the destructor in the same way? That would explain why the vptr at the start of the object was still unchanged after the delete.
Today I changed the test code right away, declaring and implementing a destructor in the parent class ... Sure enough, the content at the start of the object was rewritten after destruction.
Obj* pb = new Obj();
printf("addr (%p)\n", *((void**)pb));  // vptr while the object is alive
delete pb;
printf("addr (%p)\n", *((void**)pb));  // reads freed memory: undefined behavior, experiment only
At this point I could confirm the cause of the crash: the object had been destroyed, its virtual pointer had been rewritten to point to the parent's virtual table, and the business layer crashed when it later called through it.
(Because a memory pool is used, the memory behind the dangling pointer is still valid, so the stale pointer alone does not cause a crash.)
The rest was easy to check: when the object was deleted, some business module still held a pointer to it and had not cleaned it up. I searched the references to the combat object and found the problem in a few minutes.
The battle castle keeps a list of guards. When an NPC enters, it puts its own pointer in the list, and when it dies the pointer is not removed. When other players come to attack this castle, the battle code calls a pure virtual function through that pointer and the server goes down.
Epilogue:
I feel this bug runs deep; there are many places worth digging into.
For example, why was there no crash on Windows? The combat objects in the project have no explicit destructors either, so presumably the VS compiler optimized the destructor away and the Linux compiler did not.
For example, without the memory pool, both platforms would have hit a dangling pointer and crashed immediately ... The problem would have been exposed early, and the bug would have been easier to locate.
Also, on Windows there was no pure virtual function crash, but that only hides the bug deeper. Later the business logic takes a pointer from the memory pool, the holder of the old pointer modifies it at random, and by the time something goes wrong all you see is a lump of shit; who knows where it actually got corrupted ( ̄" ̄) ~
Or, as our boss put it well:
"If this were a new project, I probably would not use a memory pool; I would use tcmalloc. I still want engineering problems to be solved with engineering, and C++ development should follow a library mindset. Otherwise we will keep digging pits for ourselves."
PS: While I had no clue, I did three things before leaving work:
I described the problem in my former C++ project group and asked "has anyone seen a server go down from calling a pure virtual function partway through running?";
I asked in the technical groups I have joined;
I posted the question online and invited Wheel Brother and R Big to answer.
The next day I saw that someone had replied: when the subclass is destructed, the virtual table pointer is rewritten to IObj's virtual table; through the destructed pointer you can still call IObj's own virtual functions, but the other virtual functions will crash.
Even if I had not thought of "the destructor may be optimized away by the compiler", I could have found the problem under their guidance.
It pays to make use of other people's experience, ha b( ̄▽ ̄)d
Bug: C++ runtime call to a pure virtual function