[WebKit] Data Structure of C++ Classes and Its Application in Disassembly

Source: Internet
Author: User

When neither debugging information nor source code is available during disassembly, you can still inspect the data in memory. Plain data structures are easy to interpret. C++ classes take more care, so this article summarizes how they are laid out.


Basics: POD?

The key to how C++ member variables are arranged is whether the class is a POD (plain old data) type, and in particular whether it needs a vptr. A POD type keeps the same member order as a C struct, but its definition must not contain virtual functions, a user-defined destructor, or a user-defined copy assignment operator. Once a class declares a virtual function, the compiler adds a virtual table pointer (vptr) to every object. The following two figures show the order of member variables in an inheritance relationship.
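As a rough illustration of the rule above, the following sketch compares a POD struct with a class that has a virtual destructor. PodPoint, VirtualPoint, and offsetOfXInVirtualPoint are hypothetical names introduced here. The placement of the vptr at offset 0 is an assumption that holds for the common Itanium and MSVC C++ ABIs; the C++ standard itself does not guarantee it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical POD type: members are laid out exactly as in a C struct.
struct PodPoint {
    std::int32_t x;
    std::int32_t y;
};

// Hypothetical polymorphic type: the virtual destructor makes the
// compiler store a vptr in every object.
struct VirtualPoint {
    virtual ~VirtualPoint() {}
    std::int32_t x;
    std::int32_t y;
};

// Measures, at run time, how far member x sits from the start of the object.
// Assumption: with the vptr stored first, this equals sizeof(void*).
std::size_t offsetOfXInVirtualPoint() {
    VirtualPoint p;
    p.x = 0;
    p.y = 0;
    const char* base = reinterpret_cast<const char*>(&p);
    const char* member = reinterpret_cast<const char*>(&p.x);
    return static_cast<std::size_t>(member - base);
}
```

In PodPoint the first member starts at offset 0, while in VirtualPoint it is pushed back by the size of one pointer. This is the shift to look for when matching a memory dump against a class definition.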

When the class definition is known (here, from the open-source code behind Safari), the members of a class can be printed accurately during disassembly.

iOS Safari disassembly example 1

1. This is a simple example, based on a C function:

JS_EXPORT JSValueRef JSEvaluateScript(JSContextRef ctx, JSStringRef script, JSObjectRef thisObject, JSStringRef sourceURL, int startingLineNumber, JSValueRef* exception);

The goal is to view the content of the second parameter, script, to see whether Safari has executed a particular script.

2. According to Apple's open-source JavaScriptCore code, script is an OpaqueJSString, defined as follows:

typedef struct OpaqueJSString* JSStringRef;
The member variables of each class are listed below:

class ThreadSafeRefCountedBase {
private:
    int m_refCount;
};

struct OpaqueJSString : public ThreadSafeRefCounted<OpaqueJSString> {
    UChar* m_characters;
    unsigned m_length;
};


The combined data layout on this 32-bit target is: m_refCount at offset 0, m_characters at offset 4, m_length at offset 8.

3. Set a breakpoint at the function entry and inspect the data.
* For details on how to retrieve the parameters, see "[iOS reverse engineering] Getting the current instance handle in assembly-language debugging".

(lldb) p/x `*(int *)($ebp + 12)`
(int) $1 = 0x19aa1df0    <- this is the value of the script pointer
(lldb) mem read `$1`
0x19aa1df0: 01 00 00 00 00 10 84 18 94 08 01 00 00 00 00 00 ................

This is the memory that the script pointer refers to. The fields can be read off in order:
* m_refCount = 1
* m_characters = 0x18841000
* m_length = 0x010894
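The field values above can be checked by decoding the dump as three little-endian 32-bit words. The following is a minimal sketch of that step. OpaqueJSStringFields, readLE32, and decodeOpaqueJSString are hypothetical helper names, and the 4-byte width of every field is an assumption that matches the 32-bit process being debugged here.

```cpp
#include <cassert>
#include <cstdint>

// First 12 bytes of the dump at 0x19aa1df0, copied from the lldb output.
static const std::uint8_t kScriptDump[12] = {
    0x01, 0x00, 0x00, 0x00,   // m_refCount
    0x00, 0x10, 0x84, 0x18,   // m_characters
    0x94, 0x08, 0x01, 0x00    // m_length
};

// Hypothetical view of the object as seen in a 32-bit process:
// every field, including the pointer, occupies 4 bytes.
struct OpaqueJSStringFields {
    std::uint32_t refCount;
    std::uint32_t characters;  // address of the UTF-16 data
    std::uint32_t length;      // length in UTF-16 code units
};

// Assembles a 32-bit value from 4 bytes stored least significant byte first.
std::uint32_t readLE32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Decodes the three fields at offsets 0, 4 and 8.
OpaqueJSStringFields decodeOpaqueJSString(const std::uint8_t* bytes) {
    OpaqueJSStringFields f;
    f.refCount = readLE32(bytes);
    f.characters = readLE32(bytes + 4);
    f.length = readLE32(bytes + 8);
    return f;
}
```

Calling decodeOpaqueJSString(kScriptDump) yields the same three values as the manual reading of the dump.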
The script content is shown below:

(lldb) mem read `*(int *)(*(int *)($ebp + 12) + 4)` -c 64
0x18841000: 2f 00 2a 00 0a 00 20 00 2a 00 20 00 43 00 6f 00  /.*... .*. .C.o.
0x18841010: 70 00 79 00 72 00 69 00 67 00 68 00 74 00 20 00  p.y.r.i.g.h.t. .
0x18841020: a9 00 20 00 32 00 30 00 31 00 30 00 20 00 41 00  .. .2.0.1.0. .A.
0x18841030: 70 00 70 00 6c 00 65 00 20 00 49 00 6e 00 63 00  p.p.l.e. .I.n.c.

If you pass m_length as the byte count (the argument after -c, multiplied by 2 because the data is UTF-16), you can display the whole content:

(lldb) mem read `*(int *)(*(int *)($ebp + 12) + 4)` -c `*(int *)(*(int *)($ebp + 12) + 8) * 2`

Add the -o option to have lldb write the output to a file, then use a script to convert the dump into the actual script text:

(lldb) mem read `*(int *)(*(int *)($ebp + 12) + 4)` -c `*(int *)(*(int *)($ebp + 12) + 8) * 2` -o /xxxx/captured.txt
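The conversion script itself is not given in the article. A minimal sketch of one way to do it is shown below. It assumes the captured file contains lldb's default text hex dump, in which each line has the form "0xADDR: b0 b1 ... bN" followed by two spaces and an ASCII column. parseDumpLine, appendUtf8, and utf16leToUtf8 are hypothetical helper names.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Extracts the data bytes from one hex-dump line.
// Assumption: the ASCII column is separated from the bytes by two spaces.
std::vector<std::uint8_t> parseDumpLine(const std::string& line) {
    std::vector<std::uint8_t> bytes;
    std::size_t colon = line.find(": ");
    if (colon == std::string::npos) return bytes;
    std::string rest = line.substr(colon + 2);
    std::size_t ascii = rest.find("  ");
    if (ascii != std::string::npos) rest.erase(ascii);
    std::istringstream in(rest);
    std::string tok;
    while (in >> tok) {
        if (tok.size() != 2 ||
            !std::isxdigit(static_cast<unsigned char>(tok[0])) ||
            !std::isxdigit(static_cast<unsigned char>(tok[1]))) break;
        bytes.push_back(static_cast<std::uint8_t>(std::stoul(tok, nullptr, 16)));
    }
    return bytes;
}

// Appends one Unicode code point to out, encoded as UTF-8.
void appendUtf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Decodes UTF-16LE bytes to UTF-8. Surrogate pairs are combined;
// a lone surrogate is replaced by '?'.
std::string utf16leToUtf8(const std::vector<std::uint8_t>& b) {
    std::string out;
    for (std::size_t i = 0; i + 1 < b.size(); i += 2) {
        std::uint32_t cp = static_cast<std::uint32_t>(b[i]) |
                           (static_cast<std::uint32_t>(b[i + 1]) << 8);
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 3 < b.size()) {
            std::uint32_t lo = static_cast<std::uint32_t>(b[i + 2]) |
                               (static_cast<std::uint32_t>(b[i + 3]) << 8);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                i += 2;
            }
        }
        if (cp >= 0xD800 && cp <= 0xDFFF) cp = '?';
        appendUtf8(out, cp);
    }
    return out;
}
```

To convert a whole capture, read the file line by line with std::getline, append the result of parseDumpLine for each line to one byte vector, and pass that vector to utf16leToUtf8.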
iOS Safari disassembly example 2

This example is a little more complex. The class not only has several layers of inheritance, it is also not a POD type, so a vptr is present. Take the function

JavaScriptCore`JSC::evaluate(JSC::ExecState*, JSC::ScopeChainNode*, JSC::SourceCode const&, JSC::JSValue, JSC::JSValue*);

The goal is to print the content of the JSC::SourceCode parameter. This function is in fact called by the function in example 1. The steps for collecting the class definitions are the same as before. The merged structure is as follows:

m_provider
    m_ptr
        UString m_url;
        TextPosition m_startPosition;
        bool m_validated;
        SourceProviderCache* m_cache;
        bool m_cacheOwned;
        String m_source;

m_url and m_source are the key data. Note that not every SourceCode's m_provider contains an m_source member. That depends on the concrete provider type (here, StringSourceProvider), and the other cases are not considered here.
The analysis data follows:

(lldb) p/x `*(int *)($ebp + 16)`
(int) $66 = 0xb015bc04
(lldb) mem read 0xb015bc04
0xb015bc04: f8 42 98 18 00 00 00 00 df 9c 02 00 01 00 00 00  .B..............
0xb015bc14: 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

m_provider {m_ptr} = 0x189842f8
m_startChar = 0
m_endChar = 0x029cdf
m_firstLine = 0x01
Next, read m_provider:

(lldb) mem read 0x189842f8 -c 64
0x189842f8: 68 a3 c9 03 02 00 00 00 c8 f3 5f 19 00 00 00 00  h........._.....
0x18984308: 00 00 00 00 00 2b 03 53 80 7d 5f 19 00 00 00 00  .....+.S.}_.....
0x18984318: 8c a3 c9 03 80 47 6e 18 06 00 00 00 08 00 00 00  .....Gn.........
0x18984328: 34 43 98 18 00 00 00 00 10 16 90 6c 64 00 72 00  4C.........ld.r.


vptr = 0x03c9a368

==> RefCountedBase
int m_refCount = 0x02

==> SourceProvider
UString m_url = 0x195ff3c8
TextPosition m_startPosition: m_line = 0x0, m_column = 0x0
bool m_validated = 0x53032b00 // only the low byte (0x00, false) is the bool; the other three bytes are padding
SourceProviderCache* m_cache = 0x195f7d80
bool m_cacheOwned = 0; // false

==> StringSourceProvider
UString m_source {m_impl {m_ptr}} = 0x03c9a38c
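The same annotation can be sketched as a mirror struct that is overlaid on the dumped bytes. SourceProviderMirror32, kProviderDump, and loadProvider are hypothetical names. The sketch assumes a little-endian host and the 32-bit field sizes seen in this session, with the byte values aligned to the field values listed above. It is not the real WebKit class definition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical mirror of the merged object as it appears in a 32-bit process.
// Pointers are held as 32-bit integers so the layout is host-independent.
struct SourceProviderMirror32 {
    std::uint32_t vptr;         // offset 0:  virtual table pointer
    std::uint32_t refCount;     // offset 4:  RefCountedBase::m_refCount
    std::uint32_t url;          // offset 8:  m_url (StringImpl pointer)
    std::uint32_t startLine;    // offset 12: m_startPosition.m_line
    std::uint32_t startColumn;  // offset 16: m_startPosition.m_column
    std::uint8_t validated;     // offset 20: m_validated
    std::uint8_t pad0[3];       // padding up to the next 4-byte boundary
    std::uint32_t cache;        // offset 24: m_cache
    std::uint8_t cacheOwned;    // offset 28: m_cacheOwned
    std::uint8_t pad1[3];
    std::uint32_t source;       // offset 32: m_source
};

static_assert(offsetof(SourceProviderMirror32, url) == 8, "m_url at offset 8");
static_assert(offsetof(SourceProviderMirror32, cache) == 24, "m_cache at offset 24");
static_assert(offsetof(SourceProviderMirror32, source) == 32, "m_source at offset 32");
static_assert(sizeof(SourceProviderMirror32) == 36, "mirror covers 36 bytes");

// First 36 bytes of the dump at 0x189842f8.
static const std::uint8_t kProviderDump[36] = {
    0x68, 0xa3, 0xc9, 0x03, 0x02, 0x00, 0x00, 0x00,
    0xc8, 0xf3, 0x5f, 0x19, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x2b, 0x03, 0x53,
    0x80, 0x7d, 0x5f, 0x19, 0x00, 0x00, 0x00, 0x00,
    0x8c, 0xa3, 0xc9, 0x03
};

// Copies the raw bytes into the mirror struct (little-endian host assumed).
SourceProviderMirror32 loadProvider(const std::uint8_t* bytes) {
    SourceProviderMirror32 m;
    std::memcpy(&m, bytes, sizeof m);
    return m;
}
```

The explicit padding arrays make it visible why m_validated appears as 0x53032b00 when read as a 4-byte word: only its first byte belongs to the bool.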
Read the URL. Here m_url is a StringImpl pointer:

(lldb) mem read 0x195ff3c8
0x195ff3c8: 06 00 00 00 3d 00 00 00 dc f3 5f 19 00 00 00 00  ....=....._.....
0x195ff3d8: 40 00 00 00 68 74 74 70 3a 2f 2f 62 31 2e 62 73  @...http://b1.bs

The string content is located at the address stored at offset 8. Where:
m_refCount = 0x06 (offset: 0 bytes)
m_length = 0x3d (offset: 4 bytes)
m_data8 or m_data16 = 0x195ff3dc (offset: 8 bytes)

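Use the following command to output the complete URL. Here *(int *)($ebp + 16) is the address of the SourceCode object, whose first word is m_provider. m_url is at offset 8 of the provider, and the data pointer and length are at offsets 8 and 4 of the StringImpl:

(lldb) mem read `*(int *)(*(int *)(*(int *)(*(int *)($ebp + 16)) + 8) + 8)` -c `*(int *)(*(int *)(*(int *)(*(int *)($ebp + 16)) + 8) + 4)`
0x195ff3dc: 68 74 74 70 3a 2f 2f 62 31 2e 62 73 74 2e 31 32  http://b1.bst.12
0x195ff3ec: 36 2e 6e 65 74 2f 6e 65 77 70 61 67 65 2f 72 2f  6.net/newpage/r/
0x195ff3fc: 6a 2f 6d 2f 6d 2d 33 2f 70 6d 2e 6a 73 3f 76 3d  j/m/m-3/pm.js?v=
0x195ff40c: 31 33 36 39 38 30 39 35 32 36 34 33 31           1369809526431

Use the following command to read the complete script content (m_source is at offset 32 of the provider):

(lldb) mem read `*(int *)(*(int *)(*(int *)(*(int *)($ebp + 16)) + 32) + 8)` -c `*(int *)(*(int *)(*(int *)(*(int *)($ebp + 16)) + 32) + 4) * 2`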
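The nested dereferences in these commands follow a chain of pointers: argument slot, SourceCode object, provider, StringImpl, character data. The sketch below replays that chain over a small simulated 32-bit memory filled with the values from the dumps in this example. Memory32, followChain, and makeArticleMemory are hypothetical names, and the address passed as the argument slot stands in for $ebp + 16, whose real value the article does not show.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Simulated 32-bit memory: maps an address to the 32-bit word stored there.
typedef std::map<std::uint32_t, std::uint32_t> Memory32;

// Reads the word in `slot`, then for each offset reads the word at
// (previous value + offset). Throws std::out_of_range for unmapped addresses.
std::uint32_t followChain(const Memory32& mem, std::uint32_t slot,
                          const std::vector<std::uint32_t>& offsets) {
    std::uint32_t value = mem.at(slot);
    for (std::size_t i = 0; i < offsets.size(); ++i)
        value = mem.at(value + offsets[i]);
    return value;
}

// Fills the simulated memory with the words observed in the lldb session.
// argSlot is a stand-in for the stack address $ebp + 16.
Memory32 makeArticleMemory(std::uint32_t argSlot) {
    Memory32 mem;
    mem[argSlot] = 0xb015bc04u;          // address of the SourceCode object
    mem[0xb015bc04u] = 0x189842f8u;      // SourceCode::m_provider
    mem[0x189842f8u + 8] = 0x195ff3c8u;  // provider: m_url (StringImpl pointer)
    mem[0x195ff3c8u + 4] = 0x3du;        // StringImpl: m_length
    mem[0x195ff3c8u + 8] = 0x195ff3dcu;  // StringImpl: data pointer
    return mem;
}
```

With the offsets 0, 8, 8 the chain ends at the URL's character data, and with 0, 8, 4 it ends at the URL's length, which mirrors the two expressions passed to mem read.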

When reprinting, please credit the source: http://blog.csdn.net/horkychen

For more information about POD, see: POD types

For more information about virtual tables, see: Virtual method table

For more information about obtaining parameters during disassembly, see: Getting the current instance handle in assembly-language debugging
