Multi-core programming--storage model

Source: Internet
Author: User

I have recently been reading "UNIX Systems for Modern Architectures: Symmetric Multiprocessing and Caching for Kernel Programmers" and am copying down some of it as notes.
The sequential storage model (also called strong ordering) requires memory operations (loads and stores) to be performed in the order in which they appear in the program's instruction stream. It also specifies that the loads and stores issued by different processors are interleaved into some single overall order, although which order is not deterministic. This model is the easiest to understand, and one might even assume that real multiprocessors work this way. However, it is costly to implement, and most modern processors relax it for the sake of performance.
The book gives the following example:

Processor 1          Processor 2
store %r1, A         store %r1, B
load  %r2, B         load  %r2, A

In a strongly ordered system, only a few global orderings of these four operations are possible (six interleavings preserve each processor's program order). In every one of them the last operation is a load, which means that at least one of the two loads reads the new value stored by the other processor. Dekker's algorithm relies on exactly this property. Dekker's algorithm is a critical-section technique for hardware that has no atomic read-modify-write memory operation (test-and-set, xchg, or similar instructions).
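The property can be checked mechanically. The sketch below is not from the book: it enumerates every interleaving of the four operations that keeps each processor's store ahead of its own load, runs each one against a single shared memory, and counts the cases in which both loads read the old value 0. All names (`run`, `enumerate`, `count_both_old`) are invented for this illustration, and A and B are assumed to start at 0.

```c
#include <assert.h>

/* The four operations of the example: each processor stores, then loads. */
enum { P1_STORE_A, P1_LOAD_B, P2_STORE_B, P2_LOAD_A };

int total_orders;    /* interleavings that respect program order */
int both_loads_old;  /* interleavings in which both loads read 0 */

/* Run one interleaving against a single shared memory, one operation at a time. */
static void run(const int *order)
{
    int a = 0, b = 0;            /* shared locations A and B, initially 0 */
    int p1_r2 = -1, p2_r2 = -1;  /* -1 means "load not executed yet" */

    for (int i = 0; i < 4; i++) {
        switch (order[i]) {
        case P1_STORE_A: a = 1;     break;
        case P1_LOAD_B:  p1_r2 = b; break;
        case P2_STORE_B: b = 1;     break;
        case P2_LOAD_A:  p2_r2 = a; break;
        }
    }
    assert(p1_r2 >= 0 && p2_r2 >= 0);  /* both loads ran exactly once */
    total_orders++;
    if (p1_r2 == 0 && p2_r2 == 0)
        both_loads_old++;
}

/* Build every interleaving. i1 and i2 count the operations each processor
 * has already issued, so program order is preserved by construction. */
static void enumerate(int *order, int n, int i1, int i2)
{
    if (n == 4) {
        run(order);
        return;
    }
    if (i1 < 2) {
        order[n] = (i1 == 0) ? P1_STORE_A : P1_LOAD_B;
        enumerate(order, n + 1, i1 + 1, i2);
    }
    if (i2 < 2) {
        order[n] = (i2 == 0) ? P2_STORE_B : P2_LOAD_A;
        enumerate(order, n + 1, i1, i2 + 1);
    }
}

/* Returns how many interleavings leave both loads with the old value. */
int count_both_old(void)
{
    int order[4];
    total_orders = 0;
    both_loads_old = 0;
    enumerate(order, 0, 0, 0);
    return both_loads_old;
}
```

Calling `count_both_old()` returns 0 under this model: with a single shared memory and program order preserved, the case where neither processor sees the other's store never occurs.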

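enum state { UNLOCKED, LOCKED };

typedef struct {
    char status[2];   /* status[i] is LOCKED while CPU i wants the lock */
    char turn;        /* which CPU has priority when both want the lock */
} lock_t;

/* cpuid() and othercpu() are assumed helpers that return this CPU's
   number (0 or 1) and the other CPU's number. */

void initlock(lock_t *lock)
{
    lock->status[0] = UNLOCKED;
    lock->status[1] = UNLOCKED;
    lock->turn = 0;
}

void lock(volatile lock_t *lock)
{
    lock->status[cpuid()] = LOCKED;
    while (lock->status[othercpu()] == LOCKED) {
        if (lock->turn != cpuid()) {
            /* Not our turn: back off until the other CPU hands it over. */
            lock->status[cpuid()] = UNLOCKED;
            while (lock->turn == othercpu())
                ;
            lock->status[cpuid()] = LOCKED;
        }
    }
}

void unlock(volatile lock_t *lock)
{
    lock->status[cpuid()] = UNLOCKED;
    lock->turn = othercpu();
}
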
TSO (total store ordering):

Total store ordering (TSO) weakens the ordering rules of the strongly ordered model. First, loads are not held back by earlier stores: a store may wait in the processor's store buffer while later loads from other locations go ahead. Stores, however, are all performed in one fixed order, hence the name total store ordering. A TSO multiprocessor such as SPARC provides an atomic-swap instruction that treats the load and the store of a single location as one indivisible operation.
When an atomic-swap instruction is issued, it is placed in the store buffer and processed in FIFO order relative to the stores ahead of it. However, it blocks further memory operations until the store buffer is empty and the atomic swap itself has completed.
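To make the effect of a store buffer concrete, here is a small single-threaded simulation of the two-processor example from the strongly ordered section. It is a toy model written for these notes, not a description of real SPARC hardware: each simulated CPU has a FIFO store buffer, a store only enters the buffer, and a load checks the local buffer before reading memory. The names `cpu_store`, `cpu_load`, `cpu_drain` and `both_loads_see_old` are hypothetical.

```c
#include <assert.h>

#define NLOC    2   /* number of shared locations */
#define BUF_MAX 4   /* capacity of each simulated store buffer */

enum { A, B };      /* indices of the two shared locations */

struct cpu {
    int addr[BUF_MAX];  /* pending stores, oldest first (FIFO) */
    int val[BUF_MAX];
    int count;
};

int memory[NLOC];

/* A store only enters the local buffer; memory is not touched yet. */
void cpu_store(struct cpu *c, int addr, int value)
{
    assert(c->count < BUF_MAX);
    c->addr[c->count] = addr;
    c->val[c->count] = value;
    c->count++;
}

/* A load checks the local buffer (newest entry first), then memory. */
int cpu_load(const struct cpu *c, int addr)
{
    for (int i = c->count - 1; i >= 0; i--)
        if (c->addr[i] == addr)
            return c->val[i];
    return memory[addr];
}

/* Write the buffered stores to memory in FIFO order. */
void cpu_drain(struct cpu *c)
{
    for (int i = 0; i < c->count; i++)
        memory[c->addr[i]] = c->val[i];
    c->count = 0;
}

/* Runs the example; returns 1 if both loads read the old value 0. */
int both_loads_see_old(void)
{
    struct cpu p1 = { .count = 0 }, p2 = { .count = 0 };
    memory[A] = memory[B] = 0;

    cpu_store(&p1, A, 1);          /* P1: store %r1, A (stays buffered) */
    cpu_store(&p2, B, 1);          /* P2: store %r1, B (stays buffered) */
    int p1_r2 = cpu_load(&p1, B);  /* P1: load %r2, B  (reads memory)   */
    int p2_r2 = cpu_load(&p2, A);  /* P2: load %r2, A  (reads memory)   */
    cpu_drain(&p1);
    cpu_drain(&p2);

    return p1_r2 == 0 && p2_r2 == 0;
}
```

In this model `both_loads_see_old()` returns 1: each load runs while the other CPU's store is still buffered, so neither processor sees the other's new value. That is precisely the outcome that cannot happen on a strongly ordered system.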

Dekker's algorithm does not work under TSO: the instructions in the first table can now be ordered in more ways, so the earlier analysis no longer holds. Under TSO, the lock has to be implemented as follows.

void lock(volatile lock_t *lock_status)
{
    /* Swap 1 into the lock word until the old value was 0 (lock was free). */
    while (swap_atomic(lock_status, 1) == 1)
        ;
}

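For readers without a SPARC at hand, the same idea can be sketched in portable C11. This is an illustration rather than the book's code: `atomic_exchange` from `<stdatomic.h>` stands in for `swap_atomic`, and the names `spinlock_t`, `spin_trylock`, `spin_lock` and `spin_unlock` are made up for the example. The default sequentially consistent ordering is used to keep the sketch simple.

```c
#include <assert.h>
#include <stdatomic.h>

typedef atomic_int spinlock_t;   /* 0 = unlocked, 1 = locked */

/* One attempt: atomically swap in 1 and report whether the lock was free. */
int spin_trylock(spinlock_t *l)
{
    return atomic_exchange(l, 1) == 0;
}

/* Spin until the swap returns 0, meaning this caller changed 0 to 1. */
void spin_lock(spinlock_t *l)
{
    while (atomic_exchange(l, 1) == 1)
        ;   /* busy-wait: someone else holds the lock */
}

void spin_unlock(spinlock_t *l)
{
    assert(atomic_load(l) == 1);   /* only a held lock may be released */
    atomic_store(l, 0);
}
```

The key point is that the read of the old value and the write of 1 happen as one indivisible step, so two processors can never both observe 0 and both believe they own the lock.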
PSO (partial store ordering):

PSO is looser still: stores are not necessarily performed in FIFO order. However, stores to the same location in the store buffer are guaranteed to be performed in program order. A load still checks the store buffer for a hit and, if there is one, returns the data from the most recent store to that location. PSO also needs the atomic-swap instruction to implement a lock, and the instruction behaves much as described for TSO. The atomic swap prevents further memory operations from being issued, but the store buffer is not guaranteed to be empty when the swap is performed, so the swap may finish before earlier stores have completed. But because instructions are issued in order (a guarantee that admittedly looks thin), the loads and stores inside the critical section have not yet been issued at that point. The lock implementation is therefore essentially the same for PSO and TSO.
So what about unlocking? Consider the following instruction stream for a critical section:

atomic-swap               /* acquire the lock */
load   %r1, counter       /* read the original value */
add    %r2, %r1, 1        /* increment the counter */
store  %r2, counter       /* save the new value */
clear  %r3
store  %r3, lock          /* release the spin lock */

Look at the last instruction and the earlier store that saves the new counter value: both are stores, to different locations. Under PSO, the order of stores to different locations is not determined. If the store that releases the spin lock completes before the store that saves the new value, the critical section is broken. Because this must never happen, the SPARC architecture includes a store-barrier instruction that forces a particular order. In the words of the book on POSIX multithreaded programming, a memory barrier is like a moving "wall": the instructions between two walls may complete in any order, but instructions on opposite sides of a wall keep their relative order.
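The hazard and the effect of the barrier can be illustrated with a toy model of a PSO store buffer. This is a simplified sketch invented for these notes, not SPARC's actual mechanism: buffered stores may be written to memory in any order, except that a store may not overtake an older store to the same address or an older store issued before a barrier. All function names are hypothetical.

```c
#include <assert.h>

#define BUF_MAX 8

enum { COUNTER, LOCK, NLOC };     /* the two shared locations */

struct pending { int addr, value, epoch; };

int memory[NLOC];
static struct pending buf[BUF_MAX];   /* buffered stores in program order */
static int pending_count;
static int epoch;                     /* incremented by every store barrier */

/* Start inside the critical section: lock held, counter still 0. */
void pso_reset(void)
{
    memory[COUNTER] = 0;
    memory[LOCK] = 1;
    pending_count = 0;
    epoch = 0;
}

void pso_store(int addr, int value)
{
    assert(pending_count < BUF_MAX);
    buf[pending_count++] = (struct pending){ addr, value, epoch };
}

void pso_store_barrier(void)
{
    epoch++;
}

/* Try to write buffered store i to memory. Refused if an older store to the
 * same address, or an older store from before a barrier, is still pending.
 * Returns 1 on success and 0 if the ordering rules forbid it. */
int pso_drain(int i)
{
    assert(i >= 0 && i < pending_count);
    for (int j = 0; j < i; j++)
        if (buf[j].addr == buf[i].addr || buf[j].epoch < buf[i].epoch)
            return 0;
    memory[buf[i].addr] = buf[i].value;
    for (int j = i; j < pending_count - 1; j++)
        buf[j] = buf[j + 1];          /* close the gap in the buffer */
    pending_count--;
    return 1;
}
```

Without a barrier, `pso_store(COUNTER, 1); pso_store(LOCK, 0); pso_drain(1);` succeeds and leaves memory with the lock released while the counter is still 0. With `pso_store_barrier()` between the two stores, `pso_drain(1)` is refused until the counter store has been drained.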
Therefore, under PSO the spin lock is released as follows:

void unlock(volatile lock_t *lock_status)
{
    store_barrier();     /* earlier stores complete before the lock is cleared */
    *lock_status = 0;
}
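In portable C11 the same pattern can be approximated with a release fence. The sketch below is an assumption-laden illustration, not the book's code: `atomic_thread_fence(memory_order_release)` plays roughly the role of `store_barrier()` (it is somewhat stronger, since it also orders earlier loads), and the names `lock_word`, `counter`, `acquire`, `release` and `increment` are invented for the example.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int lock_word;   /* 0 = free, 1 = held */
static int counter;            /* protected by lock_word */

void acquire(void)
{
    /* Atomic swap, as in the TSO/PSO lock: spin until the old value was 0. */
    while (atomic_exchange_explicit(&lock_word, 1, memory_order_acquire) == 1)
        ;
}

void release(void)
{
    assert(atomic_load_explicit(&lock_word, memory_order_relaxed) == 1);
    /* Stands in for store_barrier(): stores made inside the critical
     * section become visible before the store below clears the lock. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&lock_word, 0, memory_order_relaxed);
}

/* The critical section from the instruction stream above. */
int increment(void)
{
    acquire();
    int value = ++counter;
    release();
    return value;
}
```

If the fence were removed, the compiler or the hardware would be free to make the lock release visible before the counter update, which is the same failure described for PSO.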

In fact, my reason for reading this book is to better understand "Is Parallel Programming Hard, And, If So, What Can I Do About It?". That book seems to be available only in electronic form; the Concurrent Programming Network has a translated version titled "In-Depth Understanding of Parallel Programming V1.0". Personally, I find reading the two books together very rewarding: they emphasize different things and complement each other.

Blog address: Intheworld's Column

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
