Overview
No single application can fully utilize this processor on its own. Simultaneous multithreading (SMT) allows the instructions of another thread to use the execution units when one thread encounters a long wait event. For example, when one thread takes a cache miss, another thread can continue to execute. Simultaneous multithreading is a feature of the POWER5 and POWER6 processors and can be used together with shared processors.
SMT can improve the performance of commercial transaction-processing workloads by up to 30%. SMT is a good choice when overall system throughput matters more than the throughput of any individual thread.
However, not all applications benefit from SMT. Applications whose performance is limited by the execution units, or which consume all of the processor's memory bandwidth, will not run faster by executing two threads on the same processor.
Although SMT lets the system report twice as many logical CPUs (lcpus) as there are physical CPUs, this does not mean the system has twice the CPU capacity.
SMT allows the kernel to run two different processes at the same time, reducing the total time needed for multitasking. This has two advantages. One is higher processor throughput, so the user gets results sooner. The other is better energy efficiency: finishing a task in less time means more of the remaining time can be spent saving power. Of course, there is one general precondition: SMT must not repeat the mistakes made by Hyper-Threading (HT), and the guarantee for that is the outstanding branch-prediction design of the core microarchitecture. [1]
Synchronization mechanisms for multiple threads
1. Event
Using events to synchronize threads is the most flexible approach. An event has two states: signaled and non-signaled (also described as "with signal" and "without signal"). There are two types of events: manual-reset and auto-reset. When a manual-reset event is set to the signaled state, it wakes up all waiting threads and remains signaled until the program resets it to the non-signaled state. When an auto-reset event is set to the signaled state, it wakes up one waiting thread and then automatically returns to the non-signaled state. An auto-reset event is therefore ideal for synchronizing two threads. The corresponding MFC class is CEvent. By default, the CEvent constructor creates an auto-reset event in the non-signaled state. Three functions change an event's state: SetEvent, ResetEvent, and PulseEvent. Events are ideal for synchronizing threads, but in practice calling SetEvent or PulseEvent on an auto-reset event can cause deadlock, so be careful.
Multi-thread synchronization: events
Among all kernel objects, the event kernel object is the most basic. It contains a usage count (like all kernel objects), a BOOL value indicating whether the event is auto-reset or manual-reset, and another BOOL value indicating whether the event is signaled or non-signaled. An event can notify a thread that an operation has completed. There are two types of event objects: manual-reset and auto-reset. The difference is that when a manual-reset event is signaled, all threads waiting on the event become schedulable, whereas when an auto-reset event is signaled, only one of the waiting threads becomes schedulable.
Events are used most often when one thread performs initialization and then notifies another thread to carry out the remaining work. The event is initialized to the non-signaled state; after the thread completes its initialization, it sets the event to the signaled state, and another thread waiting on the event then becomes schedulable.
When the process starts, it creates a manual-reset event in the non-signaled state and stores the handle in a global variable, which makes it easy for other threads in the process to access the same event object. The program creates three threads at startup. These threads suspend themselves after initialization and wait for the event: they need the file contents to be read into memory, and each thread will access those contents. One thread counts words, another runs a spelling check, and the third runs a grammar check. The opening code of these three thread functions is identical: each calls WaitForSingleObject, which suspends the thread until the main thread has read the file contents into memory. Once the main thread has prepared the data, it calls SetEvent to signal the event. At that moment the system makes all three auxiliary threads schedulable; they obtain CPU time and can access the memory block. All three threads must access the memory read-only, otherwise a memory error would occur; this is the only reason all three threads can run simultaneously. If the computer has three or more CPUs, the three threads can in theory truly run at the same time, completing a large amount of work in a short time.
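The code below is a minimal Win32 sketch of this scenario. The file read is simulated with a string copy, and the three thread roles are illustrative stand-ins:

```cpp
// A manual-reset event gates three worker threads until the main thread
// has "read the file" into memory.
#include <windows.h>
#include <stdio.h>
#include <string.h>

HANDLE g_hDataReady;            // manual-reset event, initially non-signaled
char   g_fileBuffer[4096];      // shared, read-only for the workers

DWORD WINAPI WorkerThread(LPVOID param)
{
    const char* role = (const char*)param;
    // Block until the main thread signals that the data is in memory.
    WaitForSingleObject(g_hDataReady, INFINITE);
    // All three workers run here concurrently, reading g_fileBuffer only.
    printf("%s: processing %zu bytes\n", role, strlen(g_fileBuffer));
    return 0;
}

int main(void)
{
    // bManualReset = TRUE, bInitialState = FALSE (non-signaled)
    g_hDataReady = CreateEvent(NULL, TRUE, FALSE, NULL);

    HANDLE threads[3];
    const char* roles[3] = { "word count", "spell check", "grammar check" };
    for (int i = 0; i < 3; i++)
        threads[i] = CreateThread(NULL, 0, WorkerThread, (LPVOID)roles[i], 0, NULL);

    // Simulate reading the file into memory, then release all waiters at once.
    lstrcpyA(g_fileBuffer, "some file content");
    SetEvent(g_hDataReady);

    WaitForMultipleObjects(3, threads, TRUE, INFINITE);
    for (int i = 0; i < 3; i++) CloseHandle(threads[i]);
    CloseHandle(g_hDataReady);
    return 0;
}
```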
If you use an auto-reset event instead of a manual-reset event, the application behaves very differently. After the main thread calls SetEvent, the system makes only one auxiliary thread schedulable, and there is no guarantee which thread that will be. The remaining two auxiliary threads keep waiting. The thread that became schedulable has exclusive access to the memory block.
Let's rewrite the thread functions so that each one calls SetEvent just before returning (as the WinMain function does).
When the main thread has read the file contents into memory, it calls SetEvent, and the operating system makes one of the three waiting threads schedulable. We do not know which thread will be scheduled first. When that thread finishes its work, it too calls SetEvent, making the next one schedulable. In this way the three threads execute one after another, in an order the operating system decides. So even if each auxiliary thread accesses the memory block in read/write mode, no problem occurs; the threads are no longer required to treat the data as read-only.
This example clearly shows the difference between manual-reset and auto-reset events.
The PulseEvent function changes an event to the signaled state and then immediately back to the non-signaled state; it is like calling ResetEvent immediately after calling SetEvent. If PulseEvent is called on a manual-reset event, any and all threads waiting on the event become schedulable. If PulseEvent is called on an auto-reset event, only one waiting thread becomes schedulable. If no thread is waiting on the event at the moment it is pulsed, the pulse has no effect. [2]
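A sketch of the auto-reset variant described above, where each worker re-signals the event before returning so the workers run one at a time; the thread bodies are placeholders:

```cpp
// The same three workers, but the event now admits one thread at a time,
// and each worker re-signals the event so the next waiter can proceed.
#include <windows.h>
#include <stdio.h>

HANDLE g_hDataReady;            // auto-reset event, initially non-signaled

DWORD WINAPI WorkerThread(LPVOID param)
{
    // Only one waiter wakes up; the event resets automatically.
    WaitForSingleObject(g_hDataReady, INFINITE);
    printf("worker %d has exclusive access to the data\n", (int)(INT_PTR)param);
    // Read/write access is safe here; hand the event to the next worker.
    SetEvent(g_hDataReady);
    return 0;
}

int main(void)
{
    // bManualReset = FALSE makes this an auto-reset event.
    g_hDataReady = CreateEvent(NULL, FALSE, FALSE, NULL);

    HANDLE threads[3];
    for (int i = 0; i < 3; i++)
        threads[i] = CreateThread(NULL, 0, WorkerThread, (LPVOID)(INT_PTR)i, 0, NULL);

    SetEvent(g_hDataReady);     // wake exactly one worker; the rest keep waiting
    WaitForMultipleObjects(3, threads, TRUE, INFINITE);
    for (int i = 0; i < 3; i++) CloseHandle(threads[i]);
    CloseHandle(g_hDataReady);
    return 0;
}
```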
2. Critical Section
The first piece of advice for using a critical section is not to lock a resource for a long time. "A long time" is relative and depends on the program: for some control software it may be a few milliseconds, while for other programs it can be as long as several minutes. Either way, once you enter a critical section you must leave it as soon as possible to release the resource. What happens if it is never released? If the main (GUI) thread then tries to enter that unreleased critical section, the program hangs. A drawback of the critical section is that it is not a kernel object, so the system cannot tell whether a thread inside it has died; if a thread dies inside a critical section without releasing it, the system has no way to know and cannot release the critical resource. A mutex compensates for this weakness. The MFC implementation class of the critical section is CCriticalSection: CCriticalSection::Lock() enters the critical section, and CCriticalSection::Unlock() leaves it.
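As an illustration of "enter, do as little as possible, exit quickly", here is a minimal sketch using the raw Win32 critical-section API, which MFC's CCriticalSection wraps; the shared counter is a stand-in for any shared resource:

```cpp
#include <windows.h>

CRITICAL_SECTION g_cs;
LONG g_counter = 0;             // resource shared between threads

DWORD WINAPI Worker(LPVOID)
{
    for (int i = 0; i < 100000; i++) {
        EnterCriticalSection(&g_cs);   // enter the critical section
        g_counter++;                   // do as little as possible while locked
        LeaveCriticalSection(&g_cs);   // exit as soon as possible
    }
    return 0;
}

int main(void)
{
    InitializeCriticalSection(&g_cs);
    HANDLE t[2];
    for (int i = 0; i < 2; i++)
        t[i] = CreateThread(NULL, 0, Worker, NULL, 0, NULL);
    WaitForMultipleObjects(2, t, TRUE, INFINITE);
    for (int i = 0; i < 2; i++) CloseHandle(t[i]);
    DeleteCriticalSection(&g_cs);
    return 0;
}
```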
3. Mutex
A mutex works much like a critical section. The differences: a mutex takes more time than a critical section, but it is a kernel object (as are events and semaphores) and can be used across processes. In addition, waiting for a locked mutex can be given a TIMEOUT, so a thread will not hang forever the way it can on a critical section. The corresponding MFC class is CMutex. Win32 functions: CreateMutex() creates a mutex, OpenMutex() opens one, and ReleaseMutex() releases one. Ownership of a mutex does not belong to the thread that created it, but to the thread that most recently waited on it (WaitForSingleObject, etc.) without yet calling ReleaseMutex(). A thread owning a mutex is like a thread inside a critical section: only one thread can own the mutex at a time. If a thread owning a mutex exits without calling ReleaseMutex(), the mutex is considered abandoned, but other threads waiting on it (WaitForSingleObject, etc.) can still acquire it, receiving the return value WAIT_ABANDONED_0. This shows that ownership of a mutex is exclusive, yet the system can recover an abandoned mutex, which a critical section cannot do.
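A sketch of the abandoned-mutex behavior just described; CarelessThread is a made-up name for a thread that exits while owning the mutex:

```cpp
// If an owning thread exits without ReleaseMutex(), the next waiter still
// acquires the mutex but the wait reports abandonment. For a single-object
// wait the constant is WAIT_ABANDONED; WAIT_ABANDONED_0 is the base value
// used with WaitForMultipleObjects.
#include <windows.h>
#include <stdio.h>

HANDLE g_hMutex;

DWORD WINAPI CarelessThread(LPVOID)
{
    WaitForSingleObject(g_hMutex, INFINITE);  // take ownership
    return 0;                                 // exit WITHOUT ReleaseMutex()
}

int main(void)
{
    g_hMutex = CreateMutex(NULL, FALSE, NULL);   // nobody owns it initially

    HANDLE t = CreateThread(NULL, 0, CarelessThread, NULL, 0, NULL);
    WaitForSingleObject(t, INFINITE);            // let it die holding the mutex
    CloseHandle(t);

    // Waiting with a timeout, as the text recommends over a critical section.
    DWORD r = WaitForSingleObject(g_hMutex, 5000);
    if (r == WAIT_ABANDONED)
        printf("previous owner died; we now own the mutex anyway\n");
    else if (r == WAIT_TIMEOUT)
        printf("could not acquire the mutex within 5 seconds\n");

    ReleaseMutex(g_hMutex);
    CloseHandle(g_hMutex);
    return 0;
}
```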
4. Semaphore
Semaphores are the oldest synchronization mechanism and the key element in solving the producer/consumer problem. The corresponding MFC class is CSemaphore. The Win32 function CreateSemaphore() creates a semaphore, and ReleaseSemaphore() releases it. The current value of a semaphore indicates the number of currently available resources: if the current value is 1, one more lock operation can succeed; if it is 5, five more lock operations can succeed. When a wait function (WaitForSingleObject, etc.) is called and the semaphore's current value is not 0, the wait returns immediately and the resource count is decremented by 1. Calling ReleaseSemaphore() adds 1 to the resource count, but the total never exceeds the initially set maximum.
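A sketch of semaphore counting, assuming an initial and maximum count of 5 as in the description above; the Sleep stands in for real work on a resource:

```cpp
#include <windows.h>
#include <stdio.h>

HANDLE g_hSem;

DWORD WINAPI Consumer(LPVOID param)
{
    WaitForSingleObject(g_hSem, INFINITE);    // resource count - 1
    printf("thread %d acquired a resource\n", (int)(INT_PTR)param);
    Sleep(100);                               // use the resource
    ReleaseSemaphore(g_hSem, 1, NULL);        // resource count + 1
    return 0;
}

int main(void)
{
    // initial count 5, maximum count 5: at most five holders at once
    g_hSem = CreateSemaphore(NULL, 5, 5, NULL);

    HANDLE t[8];
    for (int i = 0; i < 8; i++)
        t[i] = CreateThread(NULL, 0, Consumer, (LPVOID)(INT_PTR)i, 0, NULL);
    WaitForMultipleObjects(8, t, TRUE, INFINITE);
    for (int i = 0; i < 8; i++) CloseHandle(t[i]);
    CloseHandle(g_hSem);
    return 0;
}
```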
Technical questions about multithreading
1. When is multithreading used?
2. How to synchronize threads?
3. How do threads communicate?
4. How do processes communicate?
First, let's answer the first question. Threads are mainly used in four areas. These areas are not absolutely isolated and may overlap, but each program can usually be assigned to one of them:
1. Offloading time-consuming tasks. Time-consuming computations are executed by an auxiliary thread so the GUI stays responsive. I think this is the case in which we consider threads most often.
2. Scalability. This is the most common consideration for server software: the program creates multiple threads, each performing a small job, so that every CPU is kept busy and the CPUs (usually more than one) achieve the best possible utilization. Achieving this load balancing is complicated; I would like to discuss it later.
3. Fair-share resource allocation. When you send a request to a heavily loaded server, how long before you get service? A server cannot serve arbitrarily many requests at once: it must impose a maximum number of requests, and sometimes certain requests must be handled first. This is the job of thread priorities.
4. Simulations. Threads are used in simulation testing.
Communication between threads
One thread often needs to pass data to another. A worker thread may need to tell others that its job is finished, and a GUI thread may need to hand a new job to a worker thread.
Messages can be passed to a target thread with PostThreadMessage(); of course, the target thread must have a message queue. Message-based communication has great advantages over crude techniques such as global variables. If the target is a thread in the same process, you can define custom messages and pass data to the target thread with them. If the threads belong to different processes, inter-process communication is involved; we discuss that below.
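A sketch of message-based communication with PostThreadMessage(); WM_NEW_JOB is a made-up custom message number above WM_APP, and the Sleep is a crude stand-in for properly confirming that the worker's message queue exists:

```cpp
#include <windows.h>
#include <stdio.h>

#define WM_NEW_JOB (WM_APP + 1)   // app-defined message, illustrative

DWORD WINAPI WorkerThread(LPVOID)
{
    MSG msg;
    // The first call to GetMessage creates this thread's message queue.
    while (GetMessage(&msg, NULL, 0, 0)) {
        if (msg.message == WM_NEW_JOB)
            printf("worker received job id %u\n", (unsigned)msg.wParam);
    }
    return 0;   // reached when WM_QUIT arrives
}

int main(void)
{
    DWORD tid;
    HANDLE t = CreateThread(NULL, 0, WorkerThread, NULL, 0, &tid);
    Sleep(100);  // crude: PostThreadMessage fails if the queue does not exist yet

    PostThreadMessage(tid, WM_NEW_JOB, 42, 0);   // hand over a new job
    PostThreadMessage(tid, WM_QUIT, 0, 0);       // ask the worker to exit

    WaitForSingleObject(t, INFINITE);
    CloseHandle(t);
    return 0;
}
```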
Communication between processes:
When threads belong to different processes, that is, when they live in different address spaces, communication between them must cross the address-space boundary, and methods other than those used between threads of the same process have to be adopted.
1. Windows defines a message, WM_COPYDATA, for copying data between threads regardless of whether the two threads belong to the same process. The thread receiving this message must have a window, i.e. it must be a UI thread. WM_COPYDATA must be sent with SendMessage() and cannot be posted with PostMessage(); this is dictated by the lifetime of the data buffer being sent, for safety's sake.
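A sketch of the sending side of WM_COPYDATA; the window title used with FindWindow() is a made-up example, and the receiving UI thread would handle WM_COPYDATA in its window procedure as outlined in the trailing comment:

```cpp
#include <windows.h>
#include <string.h>

void SendTextToOtherProcess(void)
{
    HWND hTargetWnd = FindWindow(NULL, TEXT("Receiver Window"));  // assumed title
    if (!hTargetWnd) return;

    const char* text = "hello across the process boundary";
    COPYDATASTRUCT cds;
    cds.dwData = 1;                           // app-defined identifier
    cds.cbData = (DWORD)(strlen(text) + 1);   // payload size in bytes
    cds.lpData = (PVOID)text;                 // buffer copied by the system

    // Must be SendMessage: the call blocks until the receiver has handled
    // the message, so the buffer stays alive while it is read. The WPARAM
    // is conventionally the sender's window handle (none here).
    SendMessage(hTargetWnd, WM_COPYDATA, 0, (LPARAM)&cds);
}

// Receiving side, inside the target window procedure:
// case WM_COPYDATA: {
//     COPYDATASTRUCT* p = (COPYDATASTRUCT*)lParam;
//     /* read p->cbData bytes from p->lpData */
//     return TRUE;
// }
```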
2. WM_COPYDATA is not very efficient. If you need high efficiency, consider shared memory. Using shared memory involves setting up a shared memory area, using it, and synchronizing access to it.
Step 1: set up the shared memory area. First, CreateFileMapping() creates a file-mapping kernel object and specifies the size of the shared area; MapViewOfFile() then returns a pointer to the usable memory. If the file mapping is created by the server in a client/server model, the client uses OpenFileMapping() and then calls MapViewOfFile(). (A sketch combining all four steps appears after step 4.)
Step 2: use the shared memory. Shared-memory pointers can be troublesome to use; the __based attribute lets pointers be defined as 32-bit offsets from a given base.
Step 3: clean up. UnmapViewOfFile() releases the pointer obtained from MapViewOfFile(), and CloseHandle() closes the handle of the file-mapping kernel object.
Step 4: synchronize. A mutex can be used for synchronization.
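A sketch combining the four steps on the creating ("server") side; the mapping and mutex names are made up, and a client process would call OpenFileMapping() and CreateMutex()/OpenMutex() on the same names:

```cpp
#include <windows.h>
#include <string.h>

int main(void)
{
    // Step 1: set up the shared area (backed by the paging file, 4 KB).
    HANDLE hMap = CreateFileMapping(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                                    0, 4096, TEXT("Local\\DemoSharedMem"));
    if (!hMap) return 1;
    char* view = (char*)MapViewOfFile(hMap, FILE_MAP_ALL_ACCESS, 0, 0, 0);
    if (!view) return 1;

    // Step 4 (interleaved): synchronize access with a named mutex so a
    // client process that opens the same mapping sees consistent data.
    HANDLE hMutex = CreateMutex(NULL, FALSE, TEXT("Local\\DemoSharedMutex"));

    // Step 2: use the memory through the mapped pointer.
    WaitForSingleObject(hMutex, INFINITE);
    strcpy(view, "data visible to any process that opens the mapping");
    ReleaseMutex(hMutex);

    // Step 3: clean up.
    UnmapViewOfFile(view);
    CloseHandle(hMap);
    CloseHandle(hMutex);
    return 0;
}
```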
3. IPC
1) Anonymous pipes. Anonymous pipes are used only for point-to-point communication; they are the most useful communication method when one process spawns another.
2) Named pipes. Named pipes can be one-way or two-way and can span the network, whereas anonymous pipes are limited to a single machine (a named-pipe sketch follows this list).
3) Mailslots. Mailslots are a broadcast-style communication: a server process creates the mailslot, any client process can write to it, but only the server process can read from it.
4) OLE Automation. OLE Automation and UDP are both higher-level mechanisms that allow communication between different processes, even on different machines.
5) DDE. DDE (Dynamic Data Exchange) dates from 16-bit Windows; this method should be avoided as much as possible. [3]
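As promised after item 2), here is a sketch of a minimal named-pipe server; the pipe name is made up, and a client would open \\.\pipe\DemoPipe with CreateFile() and write to it with WriteFile():

```cpp
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HANDLE hPipe = CreateNamedPipe(
        TEXT("\\\\.\\pipe\\DemoPipe"),
        PIPE_ACCESS_INBOUND,                 // server reads, client writes
        PIPE_TYPE_MESSAGE | PIPE_READMODE_MESSAGE | PIPE_WAIT,
        1,                                    // at most one instance
        4096, 4096,                           // out/in buffer sizes
        0, NULL);                             // default timeout and security
    if (hPipe == INVALID_HANDLE_VALUE) return 1;

    // Block until a client connects (e.g. via CreateFile on the same name).
    if (ConnectNamedPipe(hPipe, NULL) || GetLastError() == ERROR_PIPE_CONNECTED) {
        char buf[256];
        DWORD read = 0;
        if (ReadFile(hPipe, buf, sizeof(buf) - 1, &read, NULL)) {
            buf[read] = '\0';
            printf("received: %s\n", buf);
        }
        DisconnectNamedPipe(hPipe);
    }
    CloseHandle(hPipe);
    return 0;
}
```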
Source: http://baike.baidu.com/view/2808915.htm