Memory consistency models for shared-memory multiprocessors: learning notes (1)

Last Update:2018-12-04 Source: Internet

Author: User


Recently my work has involved the semantics and usage of memory barriers in the Linux kernel, so I read up on the topic in depth and recorded my notes below.

1. Sequential consistency model

For programmers, the most intuitive memory consistency model on an SMP system is sequential consistency (SC), because it is the closest to the ordering model of a uniprocessor (UP) system. (The notion of a memory consistency model can in fact be extended to consistency models in distributed systems.) Lamport defines SC as follows:

A multiprocessor is sequentially consistent if the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program.

This definition has two parts: 1) from the perspective of any processor, all memory operations issued by all processors in the system appear to complete one at a time in the same sequential order; 2) within that sequence, the operations issued by any single processor complete in the order specified by the code that processor runs (program order). In effect, the definition specifies the two requirements that the SC model must maintain: atomicity and program order. The following code illustrates the model. Uppercase letters are variables in shared memory, lowercase letters are variables private to a single processor, and all variables are initialized to 0 unless otherwise specified.

P1            P2

A1: A = 1;    A2: u = B;

B1: B = 2;    B2: v = A;

Code 1

If some execution of Code 1 yields u = 2, we take it for granted that v = 1. Intuitively, u = 2 means that read A2 completed after write B1. Since A1 precedes B1 and A2 precedes B2 in program order, read B2 must have completed after write A1, which gives v = A = 1. In other words, the system completed the operations in the order (A1, B1, A2, B2). This is the SC model: the simplest, most intuitive, and strictest model.
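The reasoning above can be checked by brute force. The following is a minimal sketch, not taken from the references: it models SC as "every total order that preserves each processor's program order, executed against one shared memory" and enumerates those orders for Code 1. The tuple encoding of operations and the helper names `sc_orders` and `run` are assumptions made for this illustration.

```python
from itertools import permutations

# Each operation is (label, kind, shared variable, value or private register).
# This encoding is an assumption of the sketch, not notation from the source.
P1 = [("A1", "write", "A", 1), ("B1", "write", "B", 2)]
P2 = [("A2", "read", "B", "u"), ("B2", "read", "A", "v")]

def sc_orders(*programs):
    """Yield every total order that preserves each processor's program order."""
    ops = [op for prog in programs for op in prog]
    for order in permutations(ops):
        if all([op for op in order if op in prog] == prog for prog in programs):
            yield order

def run(order):
    """Execute the operations one at a time against a single shared memory."""
    memory = {"A": 0, "B": 0}
    regs = {}
    for _label, kind, var, arg in order:
        if kind == "write":
            memory[var] = arg
        else:
            regs[arg] = memory[var]
    return regs["u"], regs["v"]

outcomes = {run(order) for order in sc_orders(P1, P2)}
print(sorted(outcomes))
```

In this sketch the only outcome with u = 2 is (2, 1), which matches the intuition that u = 2 forces v = 1 under SC.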

The reasoning about Code 1 relies on the second property of the SC model, program order, which requires that each processor's operations complete in the order specified by its code. Unfortunately, because of various hardware and compiler optimizations, almost no real system satisfies this model "directly". These optimizations can cause a system to violate either of the two SC requirements: atomicity or program order. Let us first look at an example that violates atomicity. According to the first part of Lamport's definition, atomicity means that writes to the same address are seen to complete in the same order by every processor.

P1            P2            P3            P4

A1: A = 1;    A2: u = A;    A3: w = A;    A4: A = 2;

              B2: v = A;    B3: x = A;

Code 2

Assume that the system running Code 2 uses caches, and that P2 and P3 each hold a copy of A. After its write, P1 sends the new value 1 of A to P2 and P3. Similarly, P4 sends the new value 2 of A to P2 and P3. Because of various delays in the system, the update messages may arrive at P2 and P3 in different orders. For example, P2 may see P1's update (1) before P4's update (2), while P3 sees P4's update before P1's. In that case the result (u, v, w, x) = (1, 2, 2, 1) is produced. This result is not allowed by the SC model because it violates atomicity: P2 and P3 disagree on the order of the two writes to A. The models described below are all derived from the SC model by relaxing its requirements.
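As a rough illustration of the scenario above, the sketch below does two things: it enumerates every SC execution of Code 2, and it replays the out-of-order delivery of update messages to two cached copies of A. The helper `cached_reads` is hypothetical and assumes, for simplicity, that each reader performs exactly one read after each update message arrives.

```python
from itertools import permutations

# Each operation is (kind, shared variable, value or private register).
P1 = [("write", "A", 1)]
P2 = [("read", "A", "u"), ("read", "A", "v")]
P3 = [("read", "A", "w"), ("read", "A", "x")]
P4 = [("write", "A", 2)]
PROGRAMS = [P1, P2, P3, P4]

def sc_outcomes():
    """Collect (u, v, w, x) over every program-order-preserving total order."""
    ops = [op for prog in PROGRAMS for op in prog]
    results = set()
    for order in permutations(ops):
        if not all([op for op in order if op in prog] == prog for prog in PROGRAMS):
            continue
        a, regs = 0, {}
        for kind, _var, arg in order:
            if kind == "write":
                a = arg
            else:
                regs[arg] = a
        results.add((regs["u"], regs["v"], regs["w"], regs["x"]))
    return results

def cached_reads(arrivals):
    """Hypothetical reader: one read of its cached copy after each update arrives."""
    cached_a, seen = 0, []
    for value in arrivals:
        cached_a = value          # apply the update message to the local copy
        seen.append(cached_a)     # the next read returns the local copy
    return tuple(seen)

# P2 receives P1's update first; P3 receives P4's update first.
non_atomic = cached_reads([1, 2]) + cached_reads([2, 1])
print(non_atomic, non_atomic in sc_outcomes())
```

Under these assumptions the replay produces (1, 2, 2, 1), and the enumeration confirms that no SC execution produces it.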

2. Relaxing the write to read program order

The W-R relaxation model is a variant of the SC model, and the IBM 370 is one concrete implementation of it. The model allows a write and a later read to be reordered (W-R reordering) when the two operations access different addresses. Example:

P1            P2

A1: A = 1;    A2: B = 1;

B1: u = B;    B2: v = A;

Code 3

Under the SC model, Code 3 can produce only (u, v) = (0, 1), (1, 0), and (1, 1); it cannot produce (0, 0). Under the W-R relaxation model, however, (0, 0) is possible. For example, the operations may complete in the order (B1, A2, B2, A1), that is, read B1 completes before write A1.

Note that in this model a write and a later read to the same address cannot be reordered: if the two operations access the same address, they must complete in program order. More generally, operations are competing operations when they access the same address and at least one of them is a write, and the program order of competing operations must be maintained.
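The difference between the two models on Code 3 can be sketched with a small enumerator. This is an illustrative model only, not how any real machine is specified: it treats an execution as a completion order of all operations and keeps two operations of one processor in program order unless the pair is a write followed by a read to a different address. The names `must_stay_ordered`, `run`, and `outcomes` are assumptions of the sketch.

```python
from itertools import permutations

# Each operation is (label, kind, shared variable, value or private register).
P1 = [("A1", "write", "A", 1), ("B1", "read", "B", "u")]
P2 = [("A2", "write", "B", 1), ("B2", "read", "A", "v")]

def must_stay_ordered(first, second, relax_wr):
    """Decide whether two operations of one processor must keep program order."""
    _, kind1, var1, _ = first
    _, kind2, var2, _ = second
    if relax_wr and kind1 == "write" and kind2 == "read" and var1 != var2:
        return False              # W-R to different addresses may be reordered
    return True

def run(order):
    """Execute the operations in completion order against one shared memory."""
    memory = {"A": 0, "B": 0}
    regs = {}
    for _label, kind, var, arg in order:
        if kind == "write":
            memory[var] = arg
        else:
            regs[arg] = memory[var]
    return regs["u"], regs["v"]

def outcomes(programs, relax_wr):
    """Collect (u, v) over every completion order the model allows."""
    ops = [op for prog in programs for op in prog]
    results = set()
    for order in permutations(ops):
        pos = {op[0]: i for i, op in enumerate(order)}
        legal = all(
            pos[prog[i][0]] < pos[prog[j][0]]
            for prog in programs
            for i in range(len(prog))
            for j in range(i + 1, len(prog))
            if must_stay_ordered(prog[i], prog[j], relax_wr)
        )
        if legal:
            results.add(run(order))
    return results

sc = outcomes([P1, P2], relax_wr=False)
wr = outcomes([P1, P2], relax_wr=True)
print(sorted(sc), sorted(wr - sc))
```

In this model the only outcome that W-R relaxation adds over SC is (0, 0).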

3. Relaxing the write to write program order

Like the W-R relaxation model, the W-W relaxation model allows W-R reordering. In addition, it allows two writes to be reordered (W-W reordering) when they access different addresses. Example:

P1               P2

A1: A = 1;       A2: while (Flag == 0);

B1: B = 1;       B2: u = A;

C1: Flag = 1;    C2: v = B;

Code 4

Under the SC model, and even under the W-R relaxation model, (u, v) = (1, 1) is the only legal result of Code 4. Under the W-W relaxation model, (0, 0), (0, 1), and (1, 0) are also legal results. For example, the operations may complete in the order (A1, C1, A2, B2, C2, B1), which yields (1, 0); that is, write C1 completes before write B1.

Note that in this model, too, the program order of writes to the same address is maintained.
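A minimal sketch of Code 4 under the two models follows. It makes two simplifying assumptions that are not from the source: the spin loop A2 is modeled as a single "await" operation that only completes once Flag is 1 (completion orders in which it would still be spinning are discarded), and W-R reordering is ignored because P1 only writes and P2 only reads. The helper names are illustrative.

```python
from itertools import permutations

# Each operation is (label, kind, shared variable, value or private register).
P1 = [("A1", "write", "A", 1), ("B1", "write", "B", 1), ("C1", "write", "Flag", 1)]
P2 = [("A2", "await", "Flag", 1), ("B2", "read", "A", "u"), ("C2", "read", "B", "v")]

def must_stay_ordered(first, second, relax_ww):
    """Two writes to different addresses may be reordered when relax_ww is set."""
    both_writes = first[1] == "write" and second[1] == "write"
    if relax_ww and both_writes and first[2] != second[2]:
        return False
    return True

def run(order):
    """Execute in completion order; return None if the spin loop cannot exit yet."""
    memory = {"A": 0, "B": 0, "Flag": 0}
    regs = {}
    for _label, kind, var, arg in order:
        if kind == "write":
            memory[var] = arg
        elif kind == "await":
            if memory[var] != arg:
                return None       # the loop would still be spinning; skip this order
        else:
            regs[arg] = memory[var]
    return regs["u"], regs["v"]

def outcomes(programs, relax_ww):
    """Collect (u, v) over every completion order the model allows."""
    ops = [op for prog in programs for op in prog]
    results = set()
    for order in permutations(ops):
        pos = {op[0]: i for i, op in enumerate(order)}
        legal = all(
            pos[prog[i][0]] < pos[prog[j][0]]
            for prog in programs
            for i in range(len(prog))
            for j in range(i + 1, len(prog))
            if must_stay_ordered(prog[i], prog[j], relax_ww)
        )
        result = run(order) if legal else None
        if result is not None:
            results.add(result)
    return results

sc = outcomes([P1, P2], relax_ww=False)
ww = outcomes([P1, P2], relax_ww=True)
# The completion order (A1, C1, A2, B2, C2, B1) from the text:
example = run([P1[0], P1[2], P2[0], P2[1], P2[2], P1[1]])
print(sorted(sc), sorted(ww), example)
```

With program order enforced, only (1, 1) appears. Once writes to different addresses may be reordered, all four combinations appear, and the order quoted in the text gives (1, 0).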

4. Relaxing the read to read and read to write order

This model is more relaxed than the previous ones: it also allows a read and a later read (R-R), and a read and a later write (R-W), to be reordered. As in the previous models, the program order of competing operations must be maintained.
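The notes give no example for this model, so the following is only a hedged sketch that reuses Code 4. It assumes P1's writes complete in program order and asks what happens when P2's reads of different addresses may be reordered. As an assumption of the sketch, the spin loop A2 is modeled as one "await" read that completes only when Flag is 1. R-W reordering is not exercised because P2 performs no writes.

```python
from itertools import permutations

# Each operation is (label, kind, shared variable, value or private register).
P1 = [("A1", "write", "A", 1), ("B1", "write", "B", 1), ("C1", "write", "Flag", 1)]
P2 = [("A2", "await", "Flag", 1), ("B2", "read", "A", "u"), ("C2", "read", "B", "v")]
READ_KINDS = ("read", "await")

def must_stay_ordered(first, second, relax_rr):
    """Two reads of different addresses may be reordered when relax_rr is set."""
    both_reads = first[1] in READ_KINDS and second[1] in READ_KINDS
    if relax_rr and both_reads and first[2] != second[2]:
        return False
    return True

def run(order):
    """Execute in completion order; return None if the spin loop cannot exit yet."""
    memory = {"A": 0, "B": 0, "Flag": 0}
    regs = {}
    for _label, kind, var, arg in order:
        if kind == "write":
            memory[var] = arg
        elif kind == "await":
            if memory[var] != arg:
                return None       # Flag is still 0 at this point; skip this order
        else:
            regs[arg] = memory[var]
    return regs["u"], regs["v"]

def outcomes(programs, relax_rr):
    """Collect (u, v) over every completion order the model allows."""
    ops = [op for prog in programs for op in prog]
    results = set()
    for order in permutations(ops):
        pos = {op[0]: i for i, op in enumerate(order)}
        legal = all(
            pos[prog[i][0]] < pos[prog[j][0]]
            for prog in programs
            for i in range(len(prog))
            for j in range(i + 1, len(prog))
            if must_stay_ordered(prog[i], prog[j], relax_rr)
        )
        result = run(order) if legal else None
        if result is not None:
            results.add(result)
    return results

sc = outcomes([P1, P2], relax_rr=False)
rr = outcomes([P1, P2], relax_rr=True)
print(sorted(sc), sorted(rr))
```

In this sketch, letting the reads B2 and C2 complete before the flag read is enough to produce results other than (1, 1), even though P1's writes stay in order.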

References:

Memory Consistency Models for Shared-Memory Multiprocessors, by Kourosh Gharachorloo

Memory Barriers: a Hardware View for Software Hackers, by Paul E. McKenney

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
