Debugging a core dump on Linux: a case study where the function address at the top of the call stack is 0

Source: Internet
Author: User

A few days ago we received a core dump report. The call stack was as follows:

(gdb) bt
#0 0x0000000000000000 in ?? ()
#1 0x0000000000432bb4 in chargingnode::canprocessed (this=0x7f87b40118e0, maxtimestamp=9000000000) at src/sl/chargingfile.c:406
#2 0x0000000000445de4 in Bucketfileadapter::checkin (this=0x2192b98, Starttime=<value optimized out>) at src/sl/bucketfileadapter.c:118
#3 0x0000000000446114 in Bucketfileadapter::start (this=0x2192b98) at src/sl/bucketfileadapter.c:87
#4 0x000000000043560e in File_reader_run (arg=0x2192b98) at src/sl/chargingfileadapter.c:234
#5 0x0000003657607851 in start_thread () from /lib64/libpthread.so.0
#6 0x0000003656ee890d in clone () from /lib64/libc.so.6

The address of the function at the top of the stack is 0x0.

This is a very interesting symptom, and one I had not run into before.

First, the C++ source at the failing line:

if (chargingfile && (chargingfile->candecoded() == false))

Printing chargingfile in GDB shows that its value has been optimized out (the build uses -O2):

(gdb) p chargingfile
$ $ = <value optimized out>

Now consider the C++ code for frame 1. If chargingfile were NULL, the && would short-circuit and the call would never happen, so a NULL pointer cannot cause this crash. The only remaining possibility is that chargingfile is non-null and the call to candecoded() itself went wrong.
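The short-circuit argument can be checked with a small sketch. The Decodable type and the should_skip function are hypothetical stand-ins, not the original classes; the only point is that && never evaluates its right-hand side when the pointer is null.

```cpp
// Hypothetical stand-in for the real class; only the shape of the call matters.
struct Decodable {
    bool decoded;
    int calls = 0;                          // counts how often candecoded() ran
    explicit Decodable(bool d) : decoded(d) {}
    virtual ~Decodable() = default;
    virtual bool candecoded() { ++calls; return decoded; }
};

// Mirrors the condition on the failing line.
bool should_skip(Decodable* f) {
    // If f is null, && short-circuits and candecoded() is never called,
    // so a null pointer cannot be what crashes here.
    return f && (f->candecoded() == false);
}
```

Calling should_skip(nullptr) simply returns false without touching the object, which is why the crash has to come from the virtual call on a non-null pointer.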

The corresponding disassembly:

(gdb) disas
Dump of assembler code for function chargingnode::canprocessed(long):
0x0000000000432b20 <+0>: mov %rbx,-0x18(%rsp)
0x0000000000432b25 <+5>: lea 0x78(%rdi),%rbx
0x0000000000432b29 <+9>: mov %rbp,-0x10(%rsp)
0x0000000000432b2e <+14>: mov %r12,-0x8(%rsp)
0x0000000000432b33 <+19>: mov %rdi,%rbp
0x0000000000432b36 <+22>: sub $0x18,%rsp
0x0000000000432b3a <+26>: mov %rbx,%rdi
0x0000000000432b3d <+29>: mov %rsi,%r12
0x0000000000432b40 <+32>: callq 0x407820 <[email protected]>
0x0000000000432b45 <+37>: mov 0x18(%rbp),%rcx
0x0000000000432b49 <+41>: mov 0x28(%rbp),%rdx
0x0000000000432b4d <+45>: mov 0x38(%rbp),%rax
0x0000000000432b51 <+49>: sub 0x40(%rbp),%rax
0x0000000000432b55 <+53>: sub %rcx,%rdx
0x0000000000432b58 <+56>: sar $0x3,%rdx
0x0000000000432b5c <+60>: sar $0x3,%rax
0x0000000000432b60 <+64>: add %rax,%rdx
0x0000000000432b63 <+67>: mov 0x50(%rbp),%rax
0x0000000000432b67 <+71>: sub 0x30(%rbp),%rax
0x0000000000432b6b <+75>: sar $0x3,%rax
0x0000000000432b6f <+79>: shl $0x6,%rax
0x0000000000432b73 <+83>: lea -0x40(%rax,%rdx,1),%rax
0x0000000000432b78 <+88>: test %rax,%rax
0x0000000000432b7b <+91>: je 0x432b83 <chargingnode::canprocessed(long)+99>
0x0000000000432b7d <+93>: cmpb $0x0,0x70(%rbp)
0x0000000000432b81 <+97>: je 0x432ba0 <chargingnode::canprocessed(long)+128>
0x0000000000432b83 <+99>: mov %rbx,%rdi
0x0000000000432b86 <+102>: callq 0x407250 <[email protected]>
0x0000000000432b8b <+107>: xor %eax,%eax
0x0000000000432b8d <+109>: mov (%rsp),%rbx
0x0000000000432b91 <+113>: mov 0x8(%rsp),%rbp
0x0000000000432b96 <+118>: mov 0x10(%rsp),%r12
0x0000000000432b9b <+123>: add $0x18,%rsp
0x0000000000432b9f <+127>: retq
0x0000000000432ba0 <+128>: cmp %r12,0x68(%rbp)
0x0000000000432ba4 <+132>: jg 0x432bd0 <chargingnode::canprocessed(long)+176>
0x0000000000432ba6 <+134>: mov (%rcx),%rdi
0x0000000000432ba9 <+137>: test %rdi,%rdi
0x0000000000432bac <+140>: je 0x432bb8 <chargingnode::canprocessed(long)+152>
0x0000000000432bae <+142>: mov (%rdi),%rax
0x0000000000432bb1 <+145>: callq *0x20(%rax)
---Type <return> to continue, or q <return> to quit---
=> 0x0000000000432bb4 <+148>: test %al,%al

......

See the instruction at <+145>, callq *0x20(%rax) (highlighted in red in the original)? callq is the function call instruction on x86, and here its operand is an indirect address read from memory, not a direct function address.

I have seen this form of call in only two situations: (1) kernel boot code, where it is used to defeat compiler optimization and the like, and (2) virtual function calls.

So, is candecoded() a virtual function? Checking the source, sure enough: virtual bool candecoded();

That means %rax holds the object's vptr.
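To make the indirect call concrete, here is a hand-rolled sketch of what a compiler typically emits for a virtual call on x86_64 Linux: load the vptr from the first word of the object, then call through a slot in the table. All names (FakeObject, Slot, call_slot, the two tables) are made up for illustration. Placing the function in slot 4 is my inference from the 0x20 offset with 8-byte pointers, not something the original code confirms.

```cpp
#include <cassert>
#include <cstddef>

struct FakeObject;                              // hypothetical object layout
using Slot = bool (*)(FakeObject*);             // one table entry: a function pointer

struct FakeObject {
    const Slot* vptr;                           // first word of the object, like _vptr
    bool decoded;
};

constexpr std::size_t kPointerSize = 8;         // assumption: x86_64, 8-byte pointers

// Rough equivalent of:  mov (%rdi),%rax ; callq *0x20(%rax)
// Instead of jumping to address 0, an empty slot is reported through `ok`.
bool call_slot(FakeObject* self, std::size_t byte_offset, bool& ok) {
    assert(byte_offset % kPointerSize == 0);    // offsets are whole slots
    const Slot* table = self->vptr;             // mov (%rdi),%rax
    Slot target = table[byte_offset / kPointerSize];  // read the slot at 0x20(%rax)
    ok = (target != nullptr);
    return ok ? target(self) : false;           // callq through the slot
}

bool candecoded_impl(FakeObject* self) { return self->decoded; }

// A healthy table with the function in slot 4, and an all-zero one.
const Slot good_table[5]   = {nullptr, nullptr, nullptr, nullptr, candecoded_impl};
const Slot zeroed_table[5] = {nullptr, nullptr, nullptr, nullptr, nullptr};
```

With an object whose vptr is good_table, call_slot(&obj, 0x20, ok) reaches candecoded_impl. With zeroed_table the slot is null, which in the real process means a jump to address 0 and a frame #0 at 0x0.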

To verify:

(gdb) i r
rax 0x7f87b40c4670 140220818015856
rbx 0x7f87b4011958 140220817283416
rcx 0x7f87b4054478 140220817556600
rdx 0x4c 76
rsi 0x0 0
rdi 0x7f87b40cd480 140220818052224

Under the x86_64 calling convention the first argument is passed in rdi. For a member function call that argument is the this pointer, so rdi holds the value of chargingfile.

(gdb) p *(chargingfile*) 0x7f87b40cd480
$7 = {_vptr.Chargingfile = 0x7f87b40c4670, NodeName = "", HostName = "Bucket", FileName =
"/incoming4cdrsch/reported/acr/bucket/bucket12/mas2_-_0000001709.20130704_-_2126+0800.inc", FileType = CDR_FILE_TYPE_ACR, qid = {id =
-1}, timestamp = 1372944386, Bufferedchargingrec = false, Chargingnode = 0x7f87b40118e0, static Filestatusdir =
"/incoming4cdrsch//status", static Processedrecnumlimit = 300000, static acrfilestotalsize = 0, Decodeflag = false, LocalFileflag =
false, Fpfile = 0x0, Fpstatusfile = 0x0, Stopflag = false, statusfilename = Keyboardinterrupt:quit
, offset = 4184212, Recordnum = 5695, Totalrecordnum = 5695, Accumnum = 0, static batchsize = 1000}

The _vptr value in the output above (0x7f87b40c4670, red in the original) is the same as the value in rax, so the analysis so far is correct.

Next, let's see what is stored at 0x20(%rax), the slot used by the instruction at 0x0000000000432bb1 <+145>: callq *0x20(%rax):

%rax = 0x7f87b40c4670

%rax + 0x20 = 0x7f87b40c4690

Look at the memory:

(gdb) x/40x 0x7f87b40c4670
0x7f87b40c4670: 0x4100434e 0x2d5f3253 0x00000211 0x00000000
0x7f87b40c4680: 0xb4024f10 0x00007f87 0xb40cd470 0x00007f87
0x7f87b40c4690: 0x00000000 0x00000000 0x00000000 0x00000000
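Reading the dump by eye is error-prone, so here is a small sketch that recomputes the slot address and rebuilds the 64-bit value stored there. The words are copied from the three rows shown above; the helper names (qword_at, bytes_of) are mine. As a side observation, the bytes of the second word spell "S2_-", which looks like text rather than a code pointer, but I would treat that only as a hint.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Base address and 32-bit words copied from the x/40x output (first three rows).
constexpr std::uint64_t kDumpBase = 0x7f87b40c4670ULL;
constexpr std::size_t kWordCount = 12;
constexpr std::uint32_t kDumpWords[kWordCount] = {
    0x4100434e, 0x2d5f3253, 0x00000211, 0x00000000,
    0xb4024f10, 0x00007f87, 0xb40cd470, 0x00007f87,
    0x00000000, 0x00000000, 0x00000000, 0x00000000,
};

// Rebuild the 64-bit value stored at `addr` from two little-endian 32-bit words.
// Precondition: addr is 8-byte aligned and inside the 48 bytes copied above.
std::uint64_t qword_at(std::uint64_t addr) {
    assert(addr >= kDumpBase && (addr - kDumpBase) % 8 == 0);
    const std::size_t i = static_cast<std::size_t>((addr - kDumpBase) / 4);
    assert(i + 1 < kWordCount);
    const std::uint64_t low = kDumpWords[i];      // word at the lower address
    const std::uint64_t high = kDumpWords[i + 1]; // word at the higher address
    return low | (high << 32);
}

// Spell out the bytes of one word in memory order (lowest address first).
std::string bytes_of(std::uint32_t word) {
    std::string out;
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<char>((word >> (8 * i)) & 0xff));
    }
    return out;
}
```

qword_at(kDumpBase + 0x20) is the value callq *0x20(%rax) jumps to, and with the data above it comes out as 0.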

Look at the row at 0x7f87b40c4690 (blue in the original): the address stored there is 0x0. What does that mean? Folks, the object has already been destroyed, and the memory its vptr points to no longer holds a valid virtual function table.

Going back to the code, the root cause turned out to be improper use of a mutex in multithreaded code, which let the object be released while it was still in use.
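The original fix is not shown here, so the following is only a sketch of one common way to avoid this kind of premature release: copy a std::shared_ptr while the lock is held, so the object stays alive for as long as the caller uses it. File, Node and their members are hypothetical names, and the use of std::deque and std::mutex is my assumption, not the real code.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <mutex>
#include <utility>

struct File {                                   // hypothetical stand-in
    virtual ~File() = default;
    virtual bool candecoded() { return false; }
};

class Node {                                    // hypothetical owner of the files
public:
    void add(std::shared_ptr<File> f) {
        assert(f != nullptr);                   // null entries are not accepted
        std::lock_guard<std::mutex> lock(mu_);
        files_.push_back(std::move(f));
    }

    // Another thread may call this at any time to drop the oldest file.
    void remove_front() {
        std::lock_guard<std::mutex> lock(mu_);
        if (!files_.empty()) files_.pop_front();
    }

    // Copy the shared_ptr while the lock is held. The copy keeps the object
    // alive after the lock is released, even if remove_front() runs next.
    std::shared_ptr<File> front() {
        std::lock_guard<std::mutex> lock(mu_);
        if (files_.empty()) return nullptr;
        return files_.front();
    }

private:
    std::mutex mu_;
    std::deque<std::shared_ptr<File>> files_;
};
```

A caller does auto f = node.front(); and then f->candecoded(). Even if another thread removes the entry in between, the object is destroyed only when the last copy of the pointer goes away. The trade-off is reference-counting overhead in exchange for not having to hold the lock across the whole call.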

Summary:

1. When the function address at the top of the call stack is 0, consider that the virtual function table may be corrupted, that is, the object may already have been destroyed.

2. Know the usual x86_64 calling convention: rdi carries the first function argument (the this pointer for a member function).

3. In multithreaded code, pay attention to the scope of each lock: too narrow a critical section may not protect enough, and too wide a one hurts performance.
