Original from: http://blog.csdn.net/hipercomer/article/details/27580323
Translated from: http://www.hardwaresecrets.com/article/everything-you-need-to-know-about-the-quickpath-interconnect-qpi/610/1
Ever since Intel began making CPUs, they have used an external bus called the front-side bus (FSB). The front-side bus is a single channel into the CPU, shared by memory and I/O. The next generation of Intel processors has the memory controller built in, so the processor provides two separate channels: a memory bus that connects the CPU to memory, and an I/O bus that connects the CPU to I/O devices. This new I/O bus is called the QuickPath Interconnect (QPI). In this article, we explain how it works.
Figure 1 Traditional Intel processor architecture
Figure 1 shows the traditional Intel processor architecture, while Figure 2 shows the next-generation Intel processor architecture with the memory controller built into the processor.
Figure 2 Next generation Intel processor architecture
In fact, AMD has used a similar architecture in its Athlon CPUs since 2003; today all AMD CPUs have a built-in memory controller and use a bus called HyperTransport for I/O communication. Although HyperTransport and QPI have the same goal and similar working mechanisms, they are not compatible with each other.
Incidentally, neither QPI nor HyperTransport is technically a bus; they are point-to-point connections. A bus is a set of conductors that several devices can attach to at the same time, whereas a point-to-point connection links exactly two devices. Although it is therefore not strictly correct to call QPI a bus, for simplicity we will still do so.
Now let's look at how QPI works. Like HyperTransport, QPI provides two channels (lanes) between the CPU and the chipset, as shown in Figure 3. This allows the CPU to receive (read) and send (write) I/O data at the same time, something the traditional FSB cannot do.
Figure 3 The Fast Interconnect channel provides separate input and output paths
As for the chipset, Intel moves to a single-chip solution: building the memory controller into the CPU is almost equivalent to moving the north bridge into the CPU. The chipset shown in Figure 3 therefore works much like a south bridge; functionally it is an I/O hub, or "IOH" in Intel's jargon.
Each channel (lane) transmits 20 bits at a time, of which 16 bits are actual data and 4 bits are error-correcting code (CRC). The first version of QPI runs at a clock of 3.2 GHz and transfers two data items per clock cycle (i.e., DDR-style signaling), which makes the bus appear to run at 6.4 GHz (Intel writes this as 6.4 GT/s, i.e., 6.4 billion transfers per second). Since 16 data bits are transmitted per transfer, the peak data rate of one channel is easy to calculate: 6.4 GT/s x 16 bits / 8 = 12.8 GB/s. Some people quote 25.6 GB/s as the theoretical maximum for QPI because they add the bandwidth of the two channels together. We don't agree with that approach: it is as meaningless as claiming a highway has a 130 mph speed limit because the limit in each direction is 65 mph.
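The arithmetic is easy to reproduce. Here is a minimal sketch (plain Python, using only the figures quoted above) that computes the per-direction peak bandwidth of the first QPI version:

```python
# Peak bandwidth of one QPI link direction (first-generation QPI).
# Figures taken from the text: 3.2 GHz clock, 2 transfers per clock (DDR-style),
# 16 payload bits per transfer (the remaining 4 of the 20 bits carry CRC).

clock_hz = 3.2e9          # base clock of the link
transfers_per_clock = 2   # DDR-style signaling
payload_bits = 16         # usable data bits per transfer

transfer_rate = clock_hz * transfers_per_clock        # 6.4e9 transfers/s
bandwidth_bytes = transfer_rate * payload_bits / 8    # bytes per second

print(f"{transfer_rate / 1e9:.1f} GT/s")                  # 6.4 GT/s
print(f"{bandwidth_bytes / 1e9:.1f} GB/s per direction")  # 12.8 GB/s
```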
So, compared with the FSB, QPI transmits fewer bits per transfer but runs at a higher rate. The fastest front-side bus currently available runs at 1,600 MHz (it actually works at 400 MHz but transfers four data items per clock cycle) and also reaches 12.8 GB/s, the same as QPI. But QPI provides 12.8 GB/s on each channel, and the FSB has to carry both I/O traffic and memory traffic at the same time, so the bandwidth actually available with QPI is higher.
QPI is also faster than HyperTransport. The current HyperTransport reaches at most 10.4 GB/s, and the current Phenom processors only reach 7.2 GB/s, so the external bus of Intel's Core i7 CPUs is 78% faster than AMD's. Other AMD CPU lines, such as the Athlon and Athlon X2, run at lower transfer rates and only reach 4 GB/s.
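To put those numbers side by side, the following sketch repeats the same kind of calculation for the fastest FSB and compares the figures quoted for the Phenom. One assumption not stated in the text: the FSB carries 64 data bits per transfer, which is the standard width of Intel's front-side bus.

```python
# Side-by-side comparison of the peak figures quoted in the article.
# Assumption: the FSB is 64 data bits wide (not stated in the text above).

def peak_gb_s(transfers_per_s, bits_per_transfer):
    """Peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return transfers_per_s * bits_per_transfer / 8 / 1e9

qpi_per_link = peak_gb_s(6.4e9, 16)   # 12.8 GB/s per direction
fsb          = peak_gb_s(1.6e9, 64)   # 12.8 GB/s, shared by memory and I/O
phenom_ht    = 7.2                    # GB/s, figure quoted for current Phenom CPUs

print(f"QPI per direction: {qpi_per_link:.1f} GB/s")
print(f"FSB (1600 MT/s):   {fsb:.1f} GB/s")
print(f"Core i7 vs Phenom: {qpi_per_link / phenom_ht - 1:.0%} faster")  # ~78%
```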
Figure 4 Differential pair transmission
Continuing down to the electrical level, each bit is transmitted over a differential pair, as shown in Figure 4, so two conductors are used for every bit. The two QPI channels use a total of 84 conductors, which is almost half the number of wires the FSB requires in traditional Intel CPUs. Using fewer wires is thus the third advantage over the front-side bus (the first is separating memory requests from I/O requests, and the second is separating the read and write paths into independent channels).
QPI uses a layered design similar to that of network architectures, with four layers in total: the physical layer, the link layer, the routing layer, and the protocol layer.
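The 84-conductor figure can be reconstructed from the numbers already given, if we assume each direction also carries one forwarded clock on its own differential pair (the clock is an inference; the text only states the 84-wire total):

```python
# Reconstructing the 84-conductor figure for a full QPI link.
# Each direction: 20 data/CRC bits plus 1 forwarded clock (assumed),
# each signal travelling on a differential pair of two wires.

bits_per_direction = 20
clock_pairs_per_direction = 1   # assumption, not stated in the article
wires_per_pair = 2
directions = 2

wires = directions * (bits_per_direction + clock_pairs_per_direction) * wires_per_pair
print(wires)  # 84
```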
Next we look at the advanced techniques introduced in QPI.
Power modes
As shown in Figure 5, in addition to the full-speed state L0, QPI provides two power-saving modes, called L0s and L1. In L0s, to save energy, the data lines and the circuits that drive them are switched off. In L1, everything is shut down; the trade-off is that waking up from L1 takes longer.
Figure 5 Power Management
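As a rough illustration of these link power states, here is a minimal sketch. Only the state names come from the text above; the transition rule (always wake back to L0) and the relative wake-up costs are illustrative assumptions, not real QPI latencies.

```python
# Minimal sketch of the QPI link power states described above.
# L0  : link running at full speed
# L0s : data lanes and their drivers powered down, quick to wake
# L1  : everything powered down, slowest to wake
# Wake-up costs below are arbitrary illustrative units, not real figures.

from enum import Enum

class LinkState(Enum):
    L0 = "full speed"
    L0S = "data lanes and drivers off"
    L1 = "everything off"

WAKE_COST = {LinkState.L0: 0, LinkState.L0S: 1, LinkState.L1: 10}

def wake(state: LinkState) -> LinkState:
    """Return the link to full-speed operation; deeper states cost more to leave."""
    print(f"waking from {state.name}: relative cost {WAKE_COST[state]}")
    return LinkState.L0

wake(LinkState.L0S)  # cheap wake-up
wake(LinkState.L1)   # expensive wake-up
```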
Reliable mode
We mentioned that the QPI data path is 20 bits wide. What we didn't mention is that QPI allows each channel to be split into four 5-bit-wide sub-channels, as shown in Figure 6. In server environments this partitioning improves reliability; on desktop parts, QPI does not offer this feature.
When this feature is enabled, if the receiver detects that part of its physical connection to the transmitter has been damaged, it shuts down the damaged portion and transmits fewer bits per transfer. This obviously reduces the data rate, but at least the system does not crash.
Figure 6 Reliable Mode configuration
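A back-of-the-envelope view of what this degradation costs: if we assume the usable data rate simply scales with how many of the four 5-bit sub-channels remain active (a simplification; the text does not spell out how the CRC bits are redistributed), the per-direction bandwidth falls off as follows.

```python
# Rough sketch of how bandwidth scales when sub-channels of the link are disabled.
# Assumption: data rate scales linearly with the number of surviving 5-bit
# sub-channels; the article only says fewer bits are transmitted per transfer.

FULL_WIDTH_GB_S = 12.8   # per-direction peak with all 20 bits active
QUADRANTS = 4            # the four 5-bit sub-channels

for working in range(QUADRANTS, 0, -1):
    bandwidth = FULL_WIDTH_GB_S * working / QUADRANTS
    print(f"{working} of {QUADRANTS} sub-channels up: ~{bandwidth:.1f} GB/s")
```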
"Go" CPU fast Interconnect channel (QPI) detailed