[Repost] Using dbx and kdb to analyze core dumps on AIX

Tags: dbx

Source: https://www.cnblogs.com/fifteen/archive/2012/03/20/2407449.html

My recent work involved analyzing core dump files. I found this good post and am reposting it on my blog O(∩_∩)O~

P.S.:

Where can you get dbx?
It is part of the bos.adt.debug fileset:

# lslpp -w /usr/bin/dbx
  File                                        Fileset               Type
  ----------------------------------------------------------------------------
  /usr/bin/dbx                                bos.adt.debug         Symlink

The following is reposted from: http://www.aixchina.net/?6141/viewspace-18882

I. Introduction to core dump analysis

Environment variable settings

Basic per-user configuration parameters, including the core file size, can be limited through the /etc/security/limits file. Alternatively, change the core size limit for the current environment with ulimit.
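
A process can also adjust its own core size limit at startup. The following is a minimal sketch using the POSIX getrlimit/setrlimit calls; the function name raise_core_limit is hypothetical, and the sketch assumes the hard limit (for example the one configured in /etc/security/limits) is already large enough for the cores you expect.

```cpp
#include <sys/resource.h>

// Raise the soft core-size limit of the calling process to its hard limit.
// This is the in-process counterpart of adjusting the limit with ulimit.
// Returns true on success.
bool raise_core_limit()
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return false;
    lim.rlim_cur = lim.rlim_max;   // an unprivileged process may go up to the hard limit
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

Calling raise_core_limit() early in main affects only the current process and its children; it cannot exceed the hard limit.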

By default, an application process uses the file name core when it generates a core dump. To prevent cores from processes in the same working directory from overwriting each other, define the environment variable CORE_NAMING=true before starting the process, so that a file named core.pid.ddhhmmss is generated. You can use the file core command to see which program generated the core.
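
When many uniquely named cores accumulate, it can help to split the name back into its parts. The sketch below parses the core.pid.ddhhmmss pattern described above; the struct CoreName, the function parse_core_name and the sample file name in the usage note are hypothetical.

```cpp
#include <cstdio>

// Fields encoded in a CORE_NAMING-style file name: core.pid.ddhhmmss
struct CoreName {
    long pid;
    int day, hour, minute, second;
};

// Parse "core.<pid>.<ddhhmmss>"; returns false if the name does not match.
bool parse_core_name(const char* name, CoreName& out)
{
    char extra;
    int n = std::sscanf(name, "core.%ld.%2d%2d%2d%2d%c",
                        &out.pid, &out.day, &out.hour,
                        &out.minute, &out.second, &extra);
    return n == 5;   // exactly five fields and no trailing characters
}
```

For a made-up name such as core.352276.14142022, this yields pid 352276 and a timestamp of day 14, 14:20:22.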

By default, an application process dump contains all of its shared memory. To exclude shared memory content from the dump, set the environment variable CORE_NOSHM=true before starting the process.

The system parameter fullcore controls whether a complete core is generated when a program dumps core. To avoid losing information, it is recommended to enable fullcore. Use lsattr -El sys0 to check whether fullcore is enabled, and chdev -l sys0 -a fullcore=true to enable it. You can also request a full dump from within the program by calling the sigaction routine with the SA_FULLDUMP flag; refer to the following test program:


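fullcore setup example

test.C
#include <iostream>
#include <signal.h>

int main(int argc, char* argv[])
{
    char str[10];
    struct sigaction s;

    // Keep the default action for SIGSEGV, but request a full core dump.
    s.sa_handler = SIG_DFL;
    s.sa_mask.losigs = 0;
    s.sa_mask.hisigs = 0;
    s.sa_flags = SA_FULLDUMP;
    sigaction(SIGSEGV, &s, (struct sigaction *) NULL);

    std::cout << "Input str!\n" << std::endl;
    // Deliberately unbounded read: input longer than 9 characters overflows str.
    std::cin >> str;
    return 0;
}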

Finding the core dump

The core of an application process is generated in its current working directory; within the application, the current working directory can be changed with the chdir function. Use the procwdx command to view the current working directory of a process. A system dump is written to lg_dumplv and is copied to the /var/adm/ras/ directory on reboot if there is enough space; otherwise it remains in lg_dumplv, where it may be overwritten at any time.
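
Because the core lands in the current working directory, a program can switch to a known, writable directory early on so its cores are easy to find. A minimal sketch using the POSIX chdir and getcwd calls follows; the function name enter_core_dir is hypothetical.

```cpp
#include <unistd.h>
#include <limits.h>
#include <string>

// Switch the process's working directory so that any core file it produces
// lands in core_dir. Returns the resulting working directory, or an empty
// string if chdir or getcwd fails.
std::string enter_core_dir(const char* core_dir)
{
    if (chdir(core_dir) != 0)
        return std::string();
    char buf[PATH_MAX];
    if (getcwd(buf, sizeof buf) == NULL)
        return std::string();
    return std::string(buf);
}
```

The directory must exist and be writable by the user running the application, otherwise no core can be written there.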

You can use errpt -a to view detailed error messages. The entries identified as C0AA5338 SYSDUMP (system dump) and B6048838 CORE_DUMP (process core) show which process generated the core and where the core file is located. Use snap -ac to collect the system's dump information.

Core dump information collection

If possible, analyzing the core with dbx directly on the machine where the core dump occurred is the most convenient approach. In this case, do not log in as root and run dbx; run it as the user that owns the application. The core may depend on libraries from the application's runtime environment, and running as that user ensures the application's environment variables are used.

If you need to take the core from a production machine back to the lab for analysis, you need to collect some related information. Process core analysis typically requires at least the application's executable, and sometimes the dynamic libraries it used at run time. To collect the complete set of core-related information, run snapcore <core path and file name> <executable path and file name>, for example snapcore ./core ./a.out, and then retrieve the corresponding .pax.Z file from /tmp/snapcore.

The normal collection process should be as follows:


snapcore collection process

# snapcore ./core ./a.out
Core file "./core" created by "a.out"

pass1() in progress ....

Calculating space required.

Total space required is 14130 kbytes.

Checking for available space ...

Available space is 807572 kbytes

pass1 complete.

pass2() in progress ....

Collecting fileset information.

Collecting error report of CORE_DUMP errors.

Creating readme file.

Creating archive file ...

Compressing archive file ....

pass2 completed.

Snapcore completed successfully. Archive created in /tmp/snapcore.

# cd /tmp/snapcore
# ls
snapcore_352276.pax.Z
# uncompress snapcore_352276.pax.Z
# ls
snapcore_352276.pax
# pax -r -f snapcore_352276.pax
# ls        (note: make sure files similar to the following exist: the executable, core, errpt.out, lslpp.out, the usr directory, etc.)
README                errpt.out             usr
a.out                 lslpp.out
core                  snapcore_352276.pax
#

II. Example of using dbx to analyze a core dump

dbx is a command-line, source-level debugging tool on AIX. This document only covers some basic dbx analysis commands; for more information, refer to the description of dbx in "General Programming Concepts: Writing and Debugging Programs".


Preliminary analysis

# dbx <program name> core

Example:

# dbx ./test core
Type 'help' for help.
warning: The core file is not a fullcore. Some info may
not be available.
[using memory image in core]
reading symbolic information ...warning: no source compiled with -g

Segmentation fault in raise at 0xd022e1e4
0xd022e1e4 (raise+0x40) 80410014        lwz   r2,0x14(r1)

The where command shows the call stack at the point where the process was executing when the core was generated (if the program was compiled with -g, the specific source line is shown):

(dbx) where
raise(??) at 0xd022e1e4
main(0x1, 0x2ff22d48) at 0x100019c4

Note:

If you are analyzing a core file from another machine, use snapcore to collect the related core information. For the dependent libraries, add the -p oldpath=newpath:... option to remap the library paths (the core dump scene can be fully reproduced only when all dependent libraries are resolved); refer to the dbx help documentation for more information.

# cd /tmp/snapcore
# dbx -p /=./ a.out core
Type 'help' for help.
[using memory image in core]
reading symbolic information ...warning: no source compiled with -g


IOT/Abort trap in raise at 0xd01f4f60

Listing source information

List the program source code with list (this requires compiling with -g and starting dbx with -I to specify the source search path) or the assembly code with listi:

(dbx) listi main
0x10001924 (main)       7c0802a6       mflr   r0
0x10001928 (main+0x4)   bfa1fff4       stmw   r29,-12(r1)
0x1000192c (main+0x8)   90010008        stw   r0,0x8(r1)
0x10001930 (main+0xc)   9421ffa0       stwu   r1,-96(r1)
0x10001934 (main+0x10)  83e20064        lwz   r31,0x64(r2)
0x10001938 (main+0x14)  90610078        stw   r3,0x78(r1)
0x1000193c (main+0x18)  9081007c        stw   r4,0x7c(r1)
0x10001940 (main+0x1c)  83a20068        lwz   r29,0x68(r2)

Listing variable contents

Example code:

#include <iostream>
#include <signal.h>
#include <stdlib.h>
int g_test = 0;

// Increments its argument through a reference.
int testfunc(int &para)
{
    para++;
    return 0;
}

int main(int argc, char* argv[])
{
    // Request a full core dump for SIGSEGV.
    struct sigaction s;
    s.sa_handler = SIG_DFL;
    s.sa_mask.losigs = 0;
    s.sa_mask.hisigs = 0;
    s.sa_flags = SA_FULLDUMP;
    sigaction(SIGSEGV, &s, (struct sigaction *) NULL);

    char str[10];
    g_test = 0;

    testfunc(g_test);   // g_test becomes 1
    abort();            // force a core dump
}
# xlC test.C -g

Taking the global variable g_test as an example:

print g_test            shows the value of g_test

print sizeof(g_test)    shows the size of g_test

whatis g_test           shows the type of g_test

print &g_test           shows the address of g_test

&g_test/16x             displays 16 consecutive 2-byte halfwords in hexadecimal, starting at the address of g_test

If the program was not compiled with -g, you cannot obtain information such as the type and size of g_test, but you can still get the address of g_test and examine the memory stored at that address.

For example:

# ./a.out
IOT/Abort trap(coredump)
# dbx ./a.out core
Type 'help' for help.
[using memory image in core]
reading symbolic information ...

IOT/Abort trap in raise at 0xd03365bc
0xd03365bc (raise+0x40) 80410014        lwz   r2,0x14(r1)
(dbx) print g_test
1
(dbx) whatis g_test
int g_test;
(dbx) print sizeof(g_test)
4
(dbx) print &g_test
0x20000428
(dbx) &g_test/16x
0x20000428:  0000 0001 0000 0000 0000 0000 0000 0000
0x20000438:  0000 0000 0000 0000 0000 0000 0000 0000

Listing the contents of registers

List the contents of the registers:

(dbx) registers

The following simulates a simple core dump, in which assigning a value to address 0 triggers the dump:

# dbx ./a.out core
Type 'help' for help.
warning: The core file is not a fullcore. Some info may
not be available.
[using memory image in core]
reading symbolic information ...warning: no source compiled with -g


Segmentation fault in main at 0x10000348
0x10000348 (main+0x18) 90640000        stw   r3,0x0(r4)
(dbx) where
main(0x1, 0x2ff22ccc) at 0x10000348
(dbx) registers
  $r0:0x00000000  $stkp:0x2ff22bf0   $toc:0x20000414    $r3:0x00000012
  $r4:0x00000000    $r5:0x2ff22cd4    $r6:0xdeadbeef    $r7:0x2ff22ff8
  $r8:0x00000000    $r9:0x04030000   $r10:0xf0577538   $r11:0xdeadbeef
 $r12:0xdeadbeef   $r13:0xdeadbeef   $r14:0x00000001   $r15:0x2ff22ccc
 $r16:0x2ff22cd4   $r17:0x00000000   $r18:0xdeadbeef   $r19:0xdeadbeef
 $r20:0xdeadbeef   $r21:0xdeadbeef   $r22:0xdeadbeef   $r23:0xdeadbeef
 $r24:0xdeadbeef   $r25:0xdeadbeef   $r26:0xdeadbeef   $r27:0xdeadbeef
 $r28:0xdeadbeef   $r29:0xdeadbeef   $r30:0xdeadbeef   $r31:0xdeadbeef
 $iar:0x10000348   $msr:0x0000d0b2    $cr:0x22282489  $link:0x100001b4
 $ctr:0xdeadbeef   $xer:0x20000020
        Condition status = 0:e 1:e 2:e 3:l 4:e 5:g 6:l 7:lo
        [unset $noflregs to view floating point registers]
        [unset $novregs to view vector registers]
in main at 0x10000348
0x10000348 (main+0x18) 90640000        stw   r3,0x0(r4)
(dbx) print $r3
0x00000012
(dbx) print $r4
(nil)

This example is relatively simple: from the final assembly instruction "stw r3,0x0(r4)" you can see that the core dump was caused by storing the value of register r3 to address 0 (0 + r4, where r4 is 0).
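
The raw instruction word that dbx prints next to the disassembly (90640000 here) can be decoded by hand to double-check which registers are involved. The sketch below splits a PowerPC D-form instruction into its fields; the names DForm and decode_dform are hypothetical, and only D-form loads and stores such as lwz (primary opcode 32), stw (36) and stwu (37) follow this layout.

```cpp
#include <stdint.h>

// Fields of a PowerPC D-form instruction (used by lwz, stw, stwu, ...).
struct DForm {
    unsigned opcode;   // primary opcode, top 6 bits
    unsigned rs;       // source or target register, next 5 bits
    unsigned ra;       // base address register, next 5 bits
    int      disp;     // signed 16-bit displacement
};

DForm decode_dform(uint32_t word)
{
    DForm f;
    f.opcode = word >> 26;
    f.rs     = (word >> 21) & 0x1f;
    f.ra     = (word >> 16) & 0x1f;
    f.disp   = (int16_t)(word & 0xffff);   // sign-extend the low 16 bits
    return f;
}
```

For 0x90640000 this gives opcode 36, rs 3, ra 4 and displacement 0, matching "stw r3,0x0(r4)"; the effective address is the value of r4 plus the displacement, which is 0 in the core above.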

Viewing multithreading-related information

If the following environment variables are left at their default value of OFF, the system completely disables the corresponding debug lists, which means that the dbx commands will not show any objects:

AIXTHREAD_MUTEX_DEBUG

AIXTHREAD_COND_DEBUG

AIXTHREAD_RWLOCK_DEBUG

You can use

export AIXTHREAD_MUTEX_DEBUG=ON

to turn on AIXTHREAD_MUTEX_DEBUG.

  • Viewing thread information

    (dbx) print $t1    // print basic information for thread t1

    (dbx) attribute

    (dbx) condition

    (dbx) mutex

    (dbx) rwlock

    (dbx) thread

    For example:

    (thread_id = 1, state_u = 4, priority = $, policy = other, attributes = 0x20001078)

  • Switching the current thread (by default, the current thread is the one that received the signal that triggered the core)

    (dbx) thread current [tid]

    For example (> marks the current thread at the time of the core dump):

    (dbx) thread
     thread  state-k   wchan       state-u   k-tid   mode held scope function
     $t1     wait      0x31bbb558  running   10321   k    no   pro   _ptrgl
     $t2     wait      0x311fb958  running    6275   k    no   pro   _ptrgl
    >$t3     run                   running    6985   k    no   pro   _p_nsleep
     $t4     wait      0x31bbbb18  running    6571   k    no   pro   _ptrgl
     $t5     wait      0x311fb9d8  running    7999   k    no   pro   _ptrgl
     $t6     wait      0x31bf8f98  running    8257   k    no   pro   _ptrgl
     $t7     wait      0x311fba18  running    8515   k    no   pro   _ptrgl
     $t8     wait      0x311fb7d8  running    8773   k    no   pro   _ptrgl
     $t9     wait      0x311fbb18  running    9031   k    no   pro   _ptrgl
     $t10    wait      0x311fb898  running    9547   k    no   pro   _ptrgl
     $t11    wait      0x311fb818  running    9805   k    no   pro   _ptrgl
     $t12    wait      0x311fba58  running   10063   k    no   pro   _ptrgl
     $t13    wait      0x311fb8d8  running   10579   k    no   pro   _ptrgl
    (dbx) thread current 3
    (dbx) where
    _p_nsleep(??, ??) at 0xd005f740
    raise.nsleep(??, ??) at 0xd022de3c
    sleep(??) at 0xd0260344
    helper(??) at 0x100005ac
    (dbx) thread current 4
    warning: Thread is in kernel mode, not all registers can be accessed.
    (dbx) where
    ptrgl._ptrgl() at 0xd020e470
    raise.nsleep(??, ??) at 0xd022de3c
    raise.nsleep(??, ??) at 0xd022de3c
    sleep(??) at 0xd0260344
    helper(??) at 0x100005ac
    (dbx)

Limitations of core dump analysis

Do not expect core dump analysis to solve every problem. Here is a simple buffer overflow example in which the overflow overwrites the call stack information, so the basis for locating the fault is completely lost:

/tmp#>xlC test.C -g -o test2
/tmp#>
/tmp#>./test2
Input str!

012345678901234567890123456789
Segmentation fault(coredump)
/tmp#>dbx ./test2 core
Type 'help' for help.
[using memory image in core]
reading symbolic information ...

Segmentation fault in test2. at 0x34353634
0x34353634 (???) warning: Unable to access address 0x34353634 from core
(dbx) where
warning: Unable to access address 0x34353634 from core
warning: Unable to access address 0x34353634 from core
warning: Unable to access address 0x34353630 from core
warning: Unable to access address 0x34353630 from core
warning: Unable to access address 0x34353634 from core
warning: Unable to access address 0x34353634 from core
warning: Unable to access address 0x34353630 from core
warning: Unable to access address 0x34353630 from core
warning: Unable to access address 0x34353634 from core
warning: Unable to access address 0x36373841 from core
test2.() at 0x34353634
warning: Unable to access address 0x36373839 from core
warning: Unable to access address 0x36373839 from core
(dbx)
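
Even when the stack is destroyed, the bogus addresses themselves can be a clue. In the session above they consist almost entirely of bytes in the ASCII digit range (0x30 to 0x39), which suggests that the typed input string overwrote the saved return address. The sketch below renders a 32-bit address as four characters to make this visible; the function name address_as_text is hypothetical.

```cpp
#include <stdint.h>
#include <string>

// Interpret a 32-bit value as four big-endian bytes and return them as text.
// Bytes that are not printable ASCII are shown as '.'.
std::string address_as_text(uint32_t addr)
{
    std::string out;
    for (int shift = 24; shift >= 0; shift -= 8) {
        unsigned char c = (addr >> shift) & 0xff;
        out += (c >= 0x20 && c < 0x7f) ? (char)c : '.';
    }
    return out;
}
```

For example, address_as_text(0x36373839) returns "6789", a fragment of the typed input, and 0x34353634 maps to "4564". The latter differs from the input fragment "4567" only in its last character, possibly because instruction addresses are word-aligned, though that is a guess. A genuine code address such as 0x10000348 yields mostly unprintable bytes instead.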

System Dump Analysis

Environment variable settings

The system's current dump configuration can be viewed with "sysdumpdev -l":

/#>sysdumpdev -l
primary              /dev/hd6
secondary            /dev/sysdumpnull
copy directory       /var/adm/ras
forced copy flag     TRUE
always allow dump    FALSE
dump compression     ON

Note that on older versions of AIX, "always allow dump" may be turned off by default. It is recommended to turn it on so that a dump can be taken when the system crashes. You can use the command sysdumpdev -K, or use smitty: System Environments -> Change / Show Characteristics of System Dump.

sysdumpdev -L shows statistics about the most recent dump generated by the system:

#>sysdumpdev -L
0453-039
Device name:         /dev/hd6
Major device number: 10
Minor device number: 2
Size:                18885120 bytes
Uncompressed Size:   113724523 bytes
Date/Time:           Sat Jul 14:20:22 BEIST 2007
Dump status:         0
dump completed successfully
Dump copy filename: /var/adm/ras/vmcore.2.Z

To ensure that the dump device can hold the dump when the system crashes, the dump device must be sized appropriately. You can use sysdumpdev -e to estimate the space required for a system dump. The generally recommended dump device size is 1.5 times the sysdumpdev -e estimate.
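
The sizing rule is simple arithmetic, sketched below for use in a capacity check. The function name recommended_dump_bytes is hypothetical, and the input is assumed to be the byte count taken from the sysdumpdev -e output.

```cpp
#include <stdint.h>

// Recommended dump device size: 1.5 times the sysdumpdev -e estimate,
// rounded up to a whole byte.
uint64_t recommended_dump_bytes(uint64_t estimate_bytes)
{
    return estimate_bytes + (estimate_bytes + 1) / 2;
}
```

For instance, an estimate of 1000 bytes gives a recommended size of 1500 bytes; the result can then be compared with the size of the configured dump logical volume.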

Using kdb to analyze a system dump

This document only covers some basic dump analysis commands; for more information, refer to "KDB kernel debugger and kdb command".

Preliminary analysis

Analyzing a dump file with kdb requires the kernel file /unix from the system that generated the dump, which is typically collected by snap -ac. The initial command is as follows:

# kdb ./dump ./unix

Example:

# kdb ./dump ./unix
The specified kernel file is a 64-bit kernel
./dump mapped from @ 700000000000000 to @ 70000007da53bd5
Preserving 1317350 bytes of symbol table
First symbol __mulh
Component Names:
 1) minidump [2 entries]
 2) dmp_minimal [9 entries]
 3) proc [481 entries]
 4) thrd [1539 entries]
 5) rasct [1 entries]
 6) ldr [2 entries]
 7) errlg [3 entries]
 8) mtrc [entries]
 9) lfs [1 entries]
10) bos [2 entries]
11) ipc [7 entries]
12) vmm [entries]
13) alloc_kheap [entries]
14) alloc_other [entries]
15) rtastrc [8 entries]
16) efcdd [entries]
17) eidedd [1 entries]
18) sisraid [2 entries]
19) aixpcm [5 entries]
20) scdisk [entries]
21) lvm [2 entries]
22) jfs2 [1 entries]
23) tty [4 entries]
24) netstat [10 entries]
25) goent_dd [7 entries]
26) scsidisk [entries]
27) efscsi [5 entries]
28) dump_statistics [1 entries]
Component Dump Table has 2456 entries
           START              END <name>
0000000000001000 0000000003BBA050 start+000FD8
F00000002FF47600 F00000002FFDC920 __ublock+000000
000000002FF22FF4 000000002FF22FF8 environ+000000
000000002FF22FF8 000000002FF22FFC errno+000000
F100070F00000000 F100070F10000000 pvproc+000000
F100070F10000000 F100070F18000000 pvthread+000000
PFT:
PVT:
id....................0002
raddr.....0000000000686000 eaddr.....F200800030000000
size..............00040000 align.............00001000
valid..1 ros....0 fixlmb.1 seg....0 wimg...2
Dump analysis on CHRP_SMP_PCI POWER_PC POWER_5 machine with 8 available CPU(s)
(64-bit registers)
Processing symbol table...
.......................done

Analysis command examples

status shows the process each CPU was running at the time of the dump, for example:

(0)> status
CPU TID TSLOT PID PSLOT PROC_NAME
0 2580F5 14C0F6 332 cron
1 12025 D01A
2 1020BB 258 1580C6 344 expr
3 1502B F01E wait

The cpu <id> command switches the current CPU; the default current CPU is CPU 0:

(0)> cpu 1

(1)>

Print the basic status and related information of the system:

(0)> stat

Print the kernel stack at the time of the system dump:

(0)> f

lke lists the kernel extension (module file) information for a kernel code address:

(0)> lke 003de9cc

Show the last instruction executed when the system dumped:

(0)> dr iar

Display the virtual memory manager error log; an Exception value of 0000001C means that paging space was exhausted:

(0)> vmlog

Display the process table:

(0)> proc

Display the thread table:

(0)> th

Display the system's errpt information:

(0)> errpt

ERRORS NOT READ BY ERRDEMON (ORDERED CHRONOLOGICALLY):

Error Record:
erec_flags .............. 1
erec_len ................ 54
erec_timestamp .......... 46DCDD9D
erec_rec_len ............ 34
erec_dupcount ........... 0
erec_duptime1 ........... 0
erec_duptime2 ........... 0
erec_rec.error_id ....... DD11B4AF
erec_rec.resource_name .. SYSPROC
00007FFF FFFFD000 00000000 003de9cc ... =......
00000000 00020000 80000000 000290b2 ....... .....

< END >

