This chapter is excerpted from the official Intel documentation; the translation is my own.
A note on the English word "cache" as used below. Written with an uppercase C (Cache), it is the noun, the term for the high-speed buffer memory. Written with a lowercase c (cache), it is the verb, meaning to save data into that high-speed buffer memory.
The Intel 64 and IA-32 architectures provide mechanisms for managing and improving the performance of multiple processors connected to the same system bus. These include:
1. Bus locking and/or cache coherency management for performing atomic operations on system memory.
2. Serializing instructions. These instructions apply only to the Pentium 4, Intel Xeon, P6 family, and Pentium processors.
3. An advanced programmable interrupt controller (APIC) located on the processor chip (see Chapter 10, "Advanced Programmable Interrupt Controller (APIC)"). This feature was introduced with the Pentium processor.
4. A second-level cache (level 2, L2). For the Pentium 4, Intel Xeon, and P6 family processors, the L2 cache is included in the processor package and is tightly coupled to the processor. For the Pentium and Intel486 processors, pins are provided to support an external L2 cache.
5. A third-level cache (level 3, L3). For Intel Xeon processors, the L3 cache is included in the processor package and is tightly coupled to the processor.
6. Intel Hyper-Threading Technology. This extension to the Intel 64 and IA-32 architectures allows a single processor core to execute two or more threads concurrently (see Section 8.5, "Intel Hyper-Threading Technology and Intel Multi-Core Technology").
These mechanisms are particularly useful in symmetric multiprocessing (SMP) systems. They can also be used when an Intel 64 or IA-32 processor shares a system bus with a special-purpose processor, such as a communications, graphics, or video processor.
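As a rough illustration of the first mechanism in the list above (atomic operations on system memory), here is a minimal C++ sketch. The function name `count_with_atomic` and its parameters are hypothetical and do not come from the Intel documentation. The sketch relies on `std::atomic`; on x86, compilers typically implement `fetch_add` with a LOCK-prefixed instruction, but that is an assumption about code generation, not something the C++ standard guarantees.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: `threads` workers each add 1 to a shared counter
// `iterations` times, using an atomic read-modify-write for every update.
long count_with_atomic(int threads, int iterations) {
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;

    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                // Atomic increment: no update from another thread can be
                // lost between the read and the write of `counter`.
                // Relaxed ordering is enough here because only the final
                // total matters, not the order of individual increments.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }

    // Wait for every worker before reading the final value.
    for (auto& w : workers) {
        w.join();
    }
    return counter.load();
}
```

Calling `count_with_atomic(4, 10000)` returns 40000 on every run. If `counter` were a plain `long` updated with `counter++`, the program would contain a data race and increments could be lost.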
These multi-processing mechanisms have the following features:
1. Maintain system memory coherency: when two or more processors attempt to access the same address in system memory at the same time, some communication mechanism or memory access protocol must be available to promote data coherency and, in some cases, to allow one processor to temporarily lock a memory location.
2. Maintain cache coherency: when one processor accesses data cached on another processor, it must not receive incorrect data. (Note: suppose one processor reads an address that another processor has already modified, but the modification has so far only been written to that processor's cache. A mechanism is needed to ensure that the reading processor sees the updated data rather than stale, "dirty" data.) If one processor modifies the data, all other processors that access the data must receive the modified data.
3. Allow predictable ordering of writes to memory: in some circumstances, it is important that memory writes be observed externally in exactly the same order as programmed. (Note: this is implemented using the memory fence mechanism.)
4. Distribute interrupt handling among a group of processors: when several processors operate in parallel in one system, it is useful to have a centralized mechanism for receiving interrupts and distributing them to available processors for servicing.
5. Increase system performance by exploiting the multi-threaded and multi-process nature of contemporary operating systems and applications.
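The write-ordering requirement in item 3 can be sketched in C++ as follows. This is only an illustration under stated assumptions: the function `publish_and_read` is hypothetical, and it uses the C++ memory model's release/acquire orderings rather than explicit x86 fence instructions. The idea is that a write to ordinary data followed by a release store to a flag must be observed in that order by a thread that reads the flag with acquire ordering.

```cpp
#include <atomic>
#include <thread>

// Hypothetical example: one thread writes a payload and then sets a flag;
// another thread waits for the flag and then reads the payload.
int publish_and_read(int value) {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // flag that publishes the payload
    int observed = -1;

    std::thread producer([&] {
        payload = value;                               // 1. write the data
        ready.store(true, std::memory_order_release);  // 2. publish it
    });

    std::thread consumer([&] {
        // Spin until the flag is set. The acquire load pairs with the
        // release store above, so the write to `payload` is visible
        // once `ready` is seen as true.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        observed = payload;
    });

    producer.join();
    consumer.join();
    return observed;
}
```

`publish_and_read(42)` returns 42. With `std::memory_order_relaxed` on both the store and the load, the C++ memory model would no longer guarantee that the consumer sees the payload write, even if a particular processor happens to preserve the order.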
The caching mechanism and cache coherency of Intel 64 and IA-32 processors are discussed in Chapter 11. The APIC architecture is described in Chapter 10. Bus and memory locking, serializing instructions, memory ordering, and Intel Hyper-Threading Technology are discussed in the following sections of this chapter.
8.1 Locked atomic operations
8.2 Memory ordering
8.3 Serializing instructions
8.4 Multiple-processor (MP) initialization
8.6 Detecting hardware multi-threading support and topology
8.7 Intel Hyper-Threading Technology architecture
8.8 Multi-core architecture
8.9 Programming considerations for hardware multi-threading capable processors
8.10 Management of idle and blocked conditions