Few applications can keep all of a processor's execution units busy on their own. Simultaneous multithreading (SMT) lets instructions from other threads use idle execution units when one thread encounters a long-latency event; for example, when one thread stalls on a cache miss, another thread can continue to execute. SMT is a feature of the POWER5™ and POWER6™ processors and can be used together with shared processors.
For commercial transaction-processing workloads, SMT can improve performance by up to 30%. SMT is a good choice when the overall throughput of the system matters more than the throughput of individual threads.
Not all applications benefit from SMT, however. Applications whose performance is bound by the execution units, or that exhaust the memory bandwidth of all processors, will not be improved by running two threads on the same processor.
Although SMT lets the system see a number of logical CPUs (LCPUs) equal to double the number of physical CPUs, this does not mean the system has twice the CPU capacity.
SMT allows the kernel to run two different threads on the same core at the same time, compressing the total time needed to multitask. This brings two benefits: it raises the processor's computational throughput, reducing the time the user waits for results, and it improves energy efficiency, because finishing a task sooner means more power is saved in the remaining time. There is one general precondition: SMT must not repeat the mistakes made by Hyper-Threading (HT), and that guarantee rests on good branch-prediction design in the core microarchitecture. [1]
Synchronization mechanisms for multithreading
1. Event
Synchronizing threads with events is the most flexible approach. An event has two states: the signaled state and the non-signaled state. There are two types of events: manual-reset events and auto-reset events. When a manual-reset event is set to the signaled state, all waiting threads are awakened, and the event remains signaled until the program resets it to the non-signaled state. When an auto-reset event is set to the signaled state, only one waiting thread is awakened, and the event then automatically reverts to the non-signaled state. An auto-reset event is therefore ideal for synchronizing two threads. The corresponding MFC class is CEvent; its constructor creates an auto-reset event in the non-signaled state by default. Three functions change the state of an event: SetEvent, ResetEvent, and PulseEvent. Events are a good way to synchronize threads, but note that using SetEvent and PulseEvent with auto-reset events can cause deadlocks and must be handled with care.
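As a minimal sketch of this mechanism (using the raw Win32 API rather than the MFC CEvent wrapper; names and structure are illustrative, not from the source):

```cpp
#include <windows.h>
#include <stdio.h>

static HANDLE g_hEvent;  // auto-reset event shared by both threads

DWORD WINAPI Worker(LPVOID)
{
    // Blocks until the event becomes signaled; an auto-reset event
    // wakes exactly one waiter and then reverts to non-signaled.
    WaitForSingleObject(g_hEvent, INFINITE);
    printf("worker: event received, doing work\n");
    return 0;
}

int main()
{
    // bManualReset = FALSE -> auto-reset; bInitialState = FALSE -> non-signaled,
    // matching the defaults of the CEvent constructor described above.
    g_hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);
    HANDLE hThread = CreateThread(NULL, 0, Worker, NULL, 0, NULL);

    printf("main: signaling event\n");
    SetEvent(g_hEvent);  // wakes the single waiting thread

    WaitForSingleObject(hThread, INFINITE);
    CloseHandle(hThread);
    CloseHandle(g_hEvent);
    return 0;
}
```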
Multithreaded synchronization: events
Among all kernel objects, the event kernel object is the most basic. It contains a usage count (as do all kernel objects), a BOOL value indicating whether the event is auto-reset or manual-reset, and a BOOL value indicating whether the event is in the signaled or non-signaled state. Events are used to notify a thread that an operation has completed. There are two types of event objects: manual-reset events and auto-reset events. They differ in that when a manual-reset event is signaled, all threads waiting on the event become schedulable, whereas when an auto-reset event is signaled, only one of the waiting threads becomes schedulable.
Events are used most often when one thread performs an initialization operation and then notifies another thread to perform the remaining work. The event is initialized to the non-signaled state; when the thread finishes its initialization, it sets the event to the signaled state, and the thread waiting on the event then becomes schedulable.
When such a process starts, it creates a manual-reset event in the non-signaled state and saves the handle in a global variable, making it easy for other threads in the process to access the same event object. At the beginning of the program, three threads are created; after initialization they suspend themselves and wait for the event. These threads wait for the contents of a file to be read into memory, and each thread then accesses that content: one thread counts words, another runs a spell check, and the third runs a grammar check. The three thread functions begin identically: each calls WaitForSingleObject, which pauses the thread until the main thread has read the file contents into memory. Once the main thread has the data ready, it calls SetEvent to signal the event. At that point the system makes all three worker threads schedulable; they all gain CPU time and can access the memory block. All three threads must access the memory read-only, or a memory error will occur; this is the only reason all three threads can run at the same time. If the computer has three or more CPUs, the three threads can truly run simultaneously, completing a large amount of work in a short period of time.
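A minimal sketch of that pattern (thread-function names are hypothetical; file reading and error handling are omitted):

```cpp
#include <windows.h>

static HANDLE g_hDataReady;      // manual-reset event
static char   g_fileData[4096];  // buffer the workers read from

// Each worker pauses until the main thread signals that the data is
// ready, then accesses g_fileData read-only.
DWORD WINAPI WordCount(LPVOID)    { WaitForSingleObject(g_hDataReady, INFINITE); /* count words */    return 0; }
DWORD WINAPI SpellCheck(LPVOID)   { WaitForSingleObject(g_hDataReady, INFINITE); /* check spelling */ return 0; }
DWORD WINAPI GrammarCheck(LPVOID) { WaitForSingleObject(g_hDataReady, INFINITE); /* check grammar */  return 0; }

int main()
{
    // bManualReset = TRUE -> signaling wakes ALL waiting threads at once
    g_hDataReady = CreateEvent(NULL, TRUE, FALSE, NULL);

    HANDLE h[3];
    h[0] = CreateThread(NULL, 0, WordCount,    NULL, 0, NULL);
    h[1] = CreateThread(NULL, 0, SpellCheck,   NULL, 0, NULL);
    h[2] = CreateThread(NULL, 0, GrammarCheck, NULL, 0, NULL);

    // ... read the file contents into g_fileData here ...

    SetEvent(g_hDataReady);  // all three workers become schedulable

    WaitForMultipleObjects(3, h, TRUE, INFINITE);
    for (int i = 0; i < 3; i++) CloseHandle(h[i]);
    CloseHandle(g_hDataReady);
    return 0;
}
```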
If an auto-reset event is used instead of a manual-reset event, the application behaves very differently. When the main thread calls SetEvent, the system allows only one worker thread to become schedulable, and there is no guarantee which thread the system will choose. The remaining two worker threads continue to wait. The thread that becomes schedulable has exclusive access to the memory block.
Let's rewrite the thread functions so that each calls SetEvent before returning (just as the WinMain function does).
When the main thread has read the file contents into memory, it calls SetEvent so that the operating system makes one of the three waiting threads schedulable. We cannot know which thread the system will select first. When that thread finishes its work, it calls SetEvent in turn so that the next thread is dispatched. In this way the three threads execute sequentially, in an order decided by the operating system. Consequently, even if each worker thread accesses the memory block in read/write mode, there is no problem; the threads are no longer required to treat the data as read-only.
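Under the same assumptions as the sketch above, but with g_hDataReady created as an auto-reset event (bManualReset = FALSE), each worker would end by passing the event along:

```cpp
DWORD WINAPI WordCount(LPVOID)
{
    WaitForSingleObject(g_hDataReady, INFINITE);  // auto-reset: only this thread wakes
    // ... access g_fileData; read/write is now safe because access is exclusive ...
    SetEvent(g_hDataReady);  // hand the event to the next waiting worker
    return 0;
}
```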
This example clearly shows the difference between using a manual-reset event and an auto-reset event.
The PulseEvent function sets the event to the signaled state and then immediately back to the non-signaled state, like calling ResetEvent immediately after calling SetEvent. If you call PulseEvent on a manual-reset event, any and all threads waiting on the event become schedulable when the pulse occurs. If you call PulseEvent on an auto-reset event, only one thread waiting on the event becomes schedulable. [2] If no thread is waiting on the event when the pulse occurs, it has no effect.
2. Critical section
The first piece of advice for using a critical section is not to lock a resource for a long time. "Long" here is relative and depends on the program: for some control software it may be a few milliseconds, while for other programs it may be as long as a few minutes. In any case, after entering a critical section, the resource must be released as soon as possible. What happens if it is not released? If the main (GUI) thread enters a critical section that is never released, the program hangs. A drawback of the critical section is that it is not a kernel object: the system has no way of knowing whether the thread inside the critical section is alive or dead. If that thread dies without releasing the critical resource, the system cannot be informed and has no way to release it. This drawback is remedied by the mutex. The MFC class corresponding to the critical section is CCriticalSection: CCriticalSection::Lock() enters the critical section and CCriticalSection::Unlock() leaves it.
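A minimal Win32 sketch of a critical section protecting a shared counter (illustrative only; MFC's CCriticalSection wraps the same mechanism):

```cpp
#include <windows.h>

static CRITICAL_SECTION g_cs;
static int g_counter = 0;  // shared resource protected by g_cs

DWORD WINAPI Worker(LPVOID)
{
    for (int i = 0; i < 100000; i++) {
        EnterCriticalSection(&g_cs);  // blocks if another thread is inside
        g_counter++;                  // keep the locked region as short as possible
        LeaveCriticalSection(&g_cs);  // release promptly
    }
    return 0;
}

int main()
{
    InitializeCriticalSection(&g_cs);
    HANDLE h[2];
    h[0] = CreateThread(NULL, 0, Worker, NULL, 0, NULL);
    h[1] = CreateThread(NULL, 0, Worker, NULL, 0, NULL);
    WaitForMultipleObjects(2, h, TRUE, INFINITE);
    CloseHandle(h[0]);
    CloseHandle(h[1]);
    DeleteCriticalSection(&g_cs);
    return 0;
}
```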
3. Mutex
A mutex works much like a critical section. The differences are that a mutex costs more time than a critical section, but the mutex is a kernel object (as are events and semaphores), can be used across processes, and a wait on a locked mutex can specify a timeout, so a thread will not block forever the way it can with a critical section whose owner has died. The corresponding MFC class is CMutex. The Win32 functions are CreateMutex() to create a mutex, OpenMutex() to open one, and ReleaseMutex() to release one. Ownership of a mutex does not belong to the thread that created it, but to the last thread that waited on the mutex (via WaitForSingleObject and the like) and has not yet called ReleaseMutex(). A thread that owns the mutex is like a thread inside a critical section: only one thread at a time can own the mutex. If the thread that owns a mutex terminates without calling ReleaseMutex(), the mutex is considered abandoned, but another thread waiting on it (via WaitForSingleObject, etc.) can still return, receiving the WAIT_ABANDONED_0 return value. Being able to detect that a mutex was abandoned is unique to the mutex.
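A sketch of acquiring a mutex with a timeout and detecting abandonment (for WaitForSingleObject the constant is WAIT_ABANDONED, numerically equal to WAIT_ABANDONED_0):

```cpp
#include <windows.h>
#include <stdio.h>

int main()
{
    // Create an unowned, unnamed mutex (a named one could be opened
    // from another process with OpenMutex).
    HANDLE hMutex = CreateMutex(NULL, FALSE, NULL);

    DWORD r = WaitForSingleObject(hMutex, 5000);  // wait at most 5 seconds
    if (r == WAIT_OBJECT_0) {
        // ... use the shared resource ...
        ReleaseMutex(hMutex);  // give up ownership
    } else if (r == WAIT_ABANDONED) {
        // The previous owner died without calling ReleaseMutex(); we now
        // own the mutex, but the protected data may be inconsistent.
        printf("mutex was abandoned\n");
        ReleaseMutex(hMutex);
    } else if (r == WAIT_TIMEOUT) {
        printf("could not acquire the mutex within 5 seconds\n");
    }

    CloseHandle(hMutex);
    return 0;
}
```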
4. Semaphore
The semaphore is the synchronization mechanism with the longest history, and it is key to solving the producer/consumer problem. The corresponding MFC class is CSemaphore. The Win32 function CreateSemaphore() creates a semaphore, and ReleaseSemaphore() releases a lock. The current value of the semaphore represents the number of resources currently available: if the current value is 1, one lock operation can succeed; if it is 5, five lock operations can succeed. Calling a wait function (WaitForSingleObject and the like) locks the semaphore: if the current value is not 0, the wait returns immediately and the resource count is decremented by 1. Calling ReleaseSemaphore() increments the resource count by 1, though the count never exceeds the initially set maximum number of resources.
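A minimal sketch of this counting behavior (the counts chosen are illustrative):

```cpp
#include <windows.h>
#include <stdio.h>

int main()
{
    // Initial count 5, maximum count 5: five lock operations can
    // succeed before further waiters block.
    HANDLE hSem = CreateSemaphore(NULL, 5, 5, NULL);

    // A successful wait decrements the count by 1.
    if (WaitForSingleObject(hSem, 0) == WAIT_OBJECT_0) {
        printf("acquired one resource slot\n");
        // ... use the resource ...
        LONG prev;
        ReleaseSemaphore(hSem, 1, &prev);  // increment the count by 1
        printf("released; count before release was %ld\n", prev);
    }

    CloseHandle(hSem);
    return 0;
}
```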
Some technical questions about multithreading
1. When to use multithreading?
2. How to synchronize threads?
3. How to communicate between threads?
4. How to communicate between processes?
To answer the first question: threads apply mainly to four areas. The areas are not absolutely isolated and may overlap, but each program should fall mainly into one of them:
1. Offloading time-consuming tasks. Worker threads perform time-consuming computations so that the GUI remains responsive. This is probably the most common reason for using threads.
2. Scalability. This is the problem server software considers most often: create multiple threads in the program, each doing a small piece of work, so that every CPU stays busy. This gives the CPUs (usually several) the best utilization and achieves load balance, a more complicated topic discussed later.
3. Fair-share resource allocation. When you make a request to a heavily loaded server, how long does it take to get service? A server cannot serve too many requests at the same time; there must be a maximum number of requests, and sometimes certain requests are given priority, which is the job of thread priorities.
4. Simulations. Threads are used for simulation testing.
Communication between Threads
A thread often has to pass data to another thread. A worker thread may need to tell others that its work is done, and a GUI thread may need to hand a new job to a worker thread.
With PostThreadMessage(), a message can be passed to the target thread, but the target thread must have a message queue. Using messages as a means of communication has great advantages over standard techniques such as global variables. If the target is a thread in the same process, you can send a custom message and pass data to the target thread; if the thread belongs to a different process, communication between processes is involved, which is covered next.
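A sketch of passing a custom message to a worker thread with PostThreadMessage() (WM_NEWJOB and the Sleep-based startup delay are illustrative simplifications, not from the source):

```cpp
#include <windows.h>

#define WM_NEWJOB (WM_APP + 1)  // hypothetical custom message

DWORD WINAPI Worker(LPVOID)
{
    MSG msg;
    // The first call to GetMessage creates this thread's message queue.
    while (GetMessage(&msg, NULL, 0, 0)) {
        if (msg.message == WM_NEWJOB) {
            int jobId = (int)msg.wParam;  // data passed by the sender
            // ... process jobId ...
        }
    }
    return 0;  // GetMessage returned 0: WM_QUIT was received
}

int main()
{
    DWORD tid;
    HANDLE h = CreateThread(NULL, 0, Worker, NULL, 0, &tid);
    Sleep(100);  // crude: give the worker time to create its message queue

    PostThreadMessage(tid, WM_NEWJOB, 42, 0);  // hand job 42 to the worker
    PostThreadMessage(tid, WM_QUIT, 0, 0);     // ask the worker to exit

    WaitForSingleObject(h, INFINITE);
    CloseHandle(h);
    return 0;
}
```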
Communication between Processes:
When threads belong to different processes, that is, when they reside in different address spaces, communication between them must cross the address-space boundary, which requires methods different from those used between threads within the same process.
1. Windows defines a special message, WM_COPYDATA, for moving data between threads regardless of whether the two threads belong to the same process. The thread receiving this message must have a window, so it must be a UI thread. WM_COPYDATA must be sent with SendMessage() and cannot be posted with PostMessage() and the like; this is dictated by the lifetime of the data buffer being sent, for safety reasons.
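A sketch of the sending side (hTargetWnd is assumed to have been obtained elsewhere, e.g. with FindWindow(); the receiver handles WM_COPYDATA in its window procedure):

```cpp
#include <windows.h>
#include <string.h>

void SendText(HWND hTargetWnd, HWND hSenderWnd, const char* text)
{
    COPYDATASTRUCT cds;
    cds.dwData = 1;                          // user-defined identifier
    cds.cbData = (DWORD)(strlen(text) + 1);  // payload size in bytes
    cds.lpData = (PVOID)text;                // pointer to the payload

    // Must be SendMessage: the buffer is only guaranteed valid until
    // the call returns, which is why PostMessage is not allowed.
    SendMessage(hTargetWnd, WM_COPYDATA, (WPARAM)hSenderWnd, (LPARAM)&cds);
}

// Receiver side, inside the target window procedure:
// case WM_COPYDATA: {
//     const COPYDATASTRUCT* pcds = (const COPYDATASTRUCT*)lParam;
//     // copy pcds->lpData out before returning; the buffer belongs
//     // to the sender and is not valid afterward
//     return TRUE;
// }
```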
2. WM_COPYDATA is not very efficient; if high efficiency is required, consider using shared memory instead. Using shared memory involves three things: setting up a shared memory region, using the shared memory, and synchronizing access to it.
Step one: set up a shared memory region. First, CreateFileMapping() creates a file-mapping kernel object and specifies the size of the shared region. MapViewOfFile() then obtains a pointer to the usable memory. In a client/server arrangement, the server creates the file mapping, and the client uses OpenFileMapping() followed by MapViewOfFile(). A combined sketch of all four steps appears after step four below.
Step two: use the shared memory. Working with shared-memory pointers is a hassle; the __based keyword lets a pointer be defined as a 32-bit offset from a given base address.
Step three: clean up. UnmapViewOfFile() releases the pointer obtained from MapViewOfFile(), and CloseHandle() releases the file-mapping kernel object handle.
Step four: synchronization. A mutex can be used to synchronize access.
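A combined sketch of the four steps on the server side (object names and sizes are hypothetical; error handling omitted):

```cpp
#include <windows.h>
#include <string.h>
#include <stdio.h>

#define SHM_NAME "Local\\DemoSharedMem"       // hypothetical object names
#define MTX_NAME "Local\\DemoSharedMemMutex"
#define SHM_SIZE 4096

int main()
{
    // Step one: create a file-mapping object backed by the paging file
    // (INVALID_HANDLE_VALUE) and map a view of it into our address space.
    HANDLE hMap = CreateFileMapping(INVALID_HANDLE_VALUE, NULL,
                                    PAGE_READWRITE, 0, SHM_SIZE, SHM_NAME);
    char* pView = (char*)MapViewOfFile(hMap, FILE_MAP_ALL_ACCESS, 0, 0, SHM_SIZE);

    // Step four: synchronize access with a named mutex other processes can open.
    HANDLE hMutex = CreateMutex(NULL, FALSE, MTX_NAME);

    // Step two: use the shared memory while holding the mutex.
    const char msg[] = "hello from the server process";
    WaitForSingleObject(hMutex, INFINITE);
    memcpy(pView, msg, sizeof(msg));
    ReleaseMutex(hMutex);
    printf("shared memory contains: %s\n", pView);

    // A client would call OpenFileMapping(FILE_MAP_ALL_ACCESS, FALSE, SHM_NAME)
    // and then MapViewOfFile() to see the same bytes.

    // Step three: clean up.
    UnmapViewOfFile(pView);
    CloseHandle(hMutex);
    CloseHandle(hMap);
    return 0;
}
```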
3. IPC
1) Anonymous pipes. Anonymous pipes are used only for point-to-point communication. They are the most useful way to communicate when one process spawns another.
2) Named pipes. Named pipes can be unidirectional or bidirectional, and they can span a network rather than being limited to a single machine.
3) Mailslots. Mailslots provide broadcast-style communication: a server process creates the mailslot, any client process can write data into it, but only the server process can read the data out.
4) OLE Automation. OLE Automation and UDP are both higher-level mechanisms that allow communication between different processes, even between different machines.
5) DDE. DDE (Dynamic Data Exchange) was used in 16-bit Windows and is a mechanism best avoided today. [3]
Source: http://baike.baidu.com/view/2808915.htm