Before buying a dual-core computer, you should first understand what dual core means.
Looking at Intel's and AMD's respective dual-core designs, we focus on how they differ; only once the differences are understood can you judge which is better or more cost-effective.
To be precise, Intel and AMD use their caches differently, so the sizes cannot be compared directly, and claims that simply compare cache sizes are wrong.
AMD's L1 and L2 cache sizes are not comparable to Intel's L1 and L2 cache sizes, so do not mix them up. Intel's L1 cache is a cache of data and instruction addresses, while AMD's L1 cache is a real data read/write cache. Intel's L1 cache stores the addresses of data held in the L2 cache; the L1 cache itself does not hold the real data, which is why the L1 caches on Intel CPUs are relatively small.
In contrast, AMD's L1 cache holds the actual data; only when the L1 cache is full is data saved to the L2 cache, which is why the L1 cache on AMD CPUs is relatively large, at 128K.
Because the L1 cache is faster than the L2 cache, AMD CPUs use their cache more efficiently than Intel CPUs.
As for L2 cache size, Intel CPUs have very large L2 caches, but in general use the extra L2 capacity contributes little and mostly wastes the consumer's money.
Probability of the CPU using each cache range:
The probability of the CPU using the 0-128K range of cache is 80%
The probability of the CPU using the 128-256K range of cache is 10%
The probability of the CPU using the 256-512K range of cache is 5%
The probability of the CPU using the 512K-1M range of cache is 3%
The probability of the CPU using cache beyond 1M is 2%
So a very large cache is not especially useful.
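As a rough illustration, the percentages listed above can be accumulated to show how much of the CPU's cache use is already covered at each size. This is only a sketch: the figures are taken as given from the list, and the variable names and labels are made up for the example.

```python
from itertools import accumulate

# Percentages as quoted in the text; the labels are illustrative.
usage_by_range = [
    ("0-128K", 80),
    ("128-256K", 10),
    ("256-512K", 5),
    ("512K-1M", 3),
    ("larger than 1M", 2),
]

# Running total: share of cache use covered once the cache reaches each range.
cumulative = list(accumulate(share for _, share in usage_by_range))

for (label, share), covered in zip(usage_by_range, cumulative):
    print(f"{label:>15}: {share:>2}% of use, {covered:>3}% covered so far")
```

Read this way, each doubling of the cache beyond the first range adds a smaller share than the one before it, which is the point the list is making.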
AMD's and Intel's memory control architectures differ, and a few figures cannot reflect the real situation. In practice, AMD's architecture has no such bottleneck, while Intel's shared FSB architecture must contend with other hardware for bandwidth and has higher latency. The purpose of Intel's large L2 cache is also to reduce the impact of the FSB bottleneck.
Dual-core processors can be called the biggest highlight in the CPU field. The x86 processor has developed to the point where the traditional ways of raising performance, namely adding branch prediction units, enlarging the cache, and raising the clock frequency, seem to have run out of road. So, just as the single-core processor appeared to be reaching its limit, Intel and AMD each launched their own dual-core solution: the Pentium D and the Athlon X2.
Put simply, a dual-core processor integrates two processor cores on one CPU substrate and connects them through a parallel bus. Dual core is not a brand-new concept; it is just the most basic, simplest, and easiest-to-implement type of CMP (Chip Multi Processors, single-chip multiprocessor).
One, processor cooperation mechanism
AMD Athlon X2
The Athlon X2 actually evolved from the Athlon 64. It has two Athlon 64 cores with an independent cache design: each core has its own cache resources, and the "System Request Interface" (SRI) lets the two cores of the Athlon X2 cooperate more closely. The SRI unit has a high-speed bus connected to the L2 caches of the two cores, and whenever the cache data of the two cores needs to be synchronized, it only needs to go through the SRI unit. This design keeps the cost in CPU resources small and also avoids occupying memory bus resources.
Pentium D
Like the Athlon X2, the two cores of the Pentium D have separate L2 caches, but there is no dedicated interface for cooperation between them; the two cores are simply joined at the front-side bus. The drawback is that this consumes a large number of CPU cycles. When the cached data of one core changes, the data must be sent over the front-side bus to the north bridge chip and from there to memory, and the other core must then read the data back through the north bridge. In other words, the Pentium D cannot synchronize data inside the CPU as the Athlon X2 does; it has to synchronize by accessing memory, which takes more time than on the Athlon X2.
Two, L2 cache comparison
The effect of the L2 cache on CPU processing power can be seen in the difference between the high-end and low-end products of a single company's lineup. The L2 cache acts as a data buffer, so its size matters: a larger cache holds more data, which greatly reduces the CPU resources wasted when the bus and memory cannot keep up with the CPU's processing speed.
In fact, a larger cache means more data can be exchanged at once, cache misses become much less frequent, and data access is faster, so overall performance is higher.
For now, the L2 cache on AMD's CPUs is still relatively small because of the manufacturing process: high-end parts reach only 2M, and many low-end products have only 512K, which hurts data processing somewhat, especially with large amounts of data. Intel is the opposite. The Pentium D, for example, integrates 2M of L2 cache, which gives it an advantage in data processing, and high-end products even integrate 4M of L2 cache, a multiple of what AMD offers. Some practical test data also show that Intel, with its larger L2 cache, scores higher than AMD with its smaller one.
Three, Memory architecture comparison
Starting with the Athlon 64, AMD integrated the memory controller into the CPU core. The advantage of this design is a shorter data exchange cycle between CPU and memory. The memory controller used to sit in the north bridge chipset; with it integrated into the core, the CPU no longer has to go through the north bridge and can operate on memory directly. This improves processing efficiency and also makes the north bridge chip easier to design, which saves motherboard manufacturers cost. But while this design improves performance, it also brings some trouble. One problem is compatibility: because the memory controller is integrated in the core rather than in the north bridge chip, memory compatibility is poorer, which causes users some unnecessary trouble when buying memory.
Besides poor memory compatibility, the integrated memory controller also greatly restricts the choice of memory type. The memory market is clearly moving to the DDR2 generation, yet the Athlon 64 has so far integrated only a DDR memory controller. In other words, the existing Athlon 64 does not support DDR2, which both holds back performance and limits user choice. Intel's CPUs do not have this problem: the north bridge only needs to integrate the corresponding memory controller, so it is easy to choose which kind of memory to use, and flexibility is much better.
There is a further problem: if the user relies on integrated graphics, AMD's design hurts integrated graphics performance. Integrated graphics currently work mainly by dynamically allocating system memory as video memory. On an AMD platform, the graphics core integrated in the north bridge chip has to go through the CPU to operate on memory, and compared with operating on memory directly, the latency is much longer.
Four, platform bandwidth comparison
With the arrival of mainstream dual-core processors and the support of the 945 and 955 series motherboards, Intel's front-side bus is being raised to 1066MHz. Paired with the latest DDR2 667 memory, this lifts I/O bandwidth to 8.5GB/s and memory bandwidth to 10.66GB/s. Compared with AMD's current 8.0GB/s (I/O bandwidth) and 6.4GB/s (memory bandwidth), Intel's figures are much higher, which is a significant advantage in overall performance.
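The bandwidth figures above can be reproduced with simple arithmetic: peak bandwidth is the transfer rate multiplied by the bus width and the number of channels or directions. The sketch below is only an illustration. The bus widths and channel counts are assumptions that the article does not state (a 64-bit front-side bus, dual-channel 64-bit memory, and a 16-bit HyperTransport link at 2000 MT/s counted in both directions), and the function name is made up for the example.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, bus_width_bits, channels=1):
    """Theoretical peak bandwidth in GB/s (1 GB = 1000 MB here)."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * bytes_per_transfer * channels / 1000

# Intel side (assumed: 64-bit FSB, dual-channel DDR2 667 at about 666.67 MT/s).
intel_io = peak_bandwidth_gb_s(1066, 64)
intel_memory = peak_bandwidth_gb_s(2000 / 3, 64, channels=2)

# AMD side (assumed: 16-bit HyperTransport at 2000 MT/s, both directions
# counted as two channels; dual-channel DDR 400).
amd_io = peak_bandwidth_gb_s(2000, 16, channels=2)
amd_memory = peak_bandwidth_gb_s(400, 64, channels=2)

print(f"Intel I/O {intel_io:.2f} GB/s, memory {intel_memory:.2f} GB/s")
print(f"AMD   I/O {amd_io:.2f} GB/s, memory {amd_memory:.2f} GB/s")
```

These are theoretical peaks only; they say nothing about latency or about how much of the bandwidth a real workload can use.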
Five, power comparison
In terms of power consumption, Intel is still somewhat higher than AMD, though this has improved recently. Since Intel introduced the Prescott core, the move to a 0.09 micron process and the integration of more L2 cache made the transistors finer, which causes current leakage and raises leakage power; the larger number of transistors also increases power consumption and heat. To improve the power consumption and heat output of Prescott-core processors, Intel ported EIST (Enhanced Intel SpeedStep Technology), previously used in mobile processors, to the current mainstream Prescott-core CPUs to keep power consumption and heat under effective control.
AMD has added Cool'n'Quiet technology to reduce the CPU's own power consumption. It works on a similar principle to Intel's SpeedStep dynamic adjustment technology: both lower power consumption by adjusting the frequency multiplier and related settings.
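To see why lowering the multiplier saves power, a common textbook approximation for the dynamic power of a chip is capacitance times voltage squared times frequency. The sketch below applies that approximation; it is not taken from the article or from either vendor's documentation, and the capacitance, voltages, bus clock, and multipliers are made-up illustrative values.

```python
def core_frequency_hz(bus_clock_hz, multiplier):
    """Core clock is the bus clock times the frequency multiplier."""
    return bus_clock_hz * multiplier

def dynamic_power_w(capacitance_f, voltage_v, frequency_hz):
    """Textbook approximation: P = C * V^2 * f (leakage is ignored)."""
    return capacitance_f * voltage_v ** 2 * frequency_hz

# Hypothetical chip: 200 MHz bus clock, 10 nF effective switched capacitance.
BUS_CLOCK_HZ = 200e6
CAPACITANCE_F = 1e-8

full_speed = dynamic_power_w(CAPACITANCE_F, 1.4, core_frequency_hz(BUS_CLOCK_HZ, 14))
lower_multiplier = dynamic_power_w(CAPACITANCE_F, 1.4, core_frequency_hz(BUS_CLOCK_HZ, 7))
lower_multiplier_and_voltage = dynamic_power_w(CAPACITANCE_F, 1.1, core_frequency_hz(BUS_CLOCK_HZ, 7))

print(f"full speed:                   {full_speed:.1f} W")
print(f"half multiplier:              {lower_multiplier:.1f} W")
print(f"half multiplier, lower volts: {lower_multiplier_and_voltage:.1f} W")
```

Under this approximation, halving the multiplier halves the dynamic power, and lowering the voltage at the same time reduces it further because voltage enters as a square.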
In fact, the main reasons Intel's CPUs now consume more power than AMD's are that they integrate far more transistors and run at much higher clock frequencies, both of which raise power consumption. However, this issue should be effectively addressed in Conroe, Intel's upcoming next-generation CPU architecture. Conroe is in fact derived from the current Pentium M architecture and keeps most of the Pentium M's advantages, such as lower power consumption and good performance even at low frequencies. As can be seen, Intel will bring the mobile platform's architecture to the desktop with Conroe in order to unify the two.
Six, pipeline comparison
Since the P4 era began, Intel's CPUs have had more pipeline stages than AMD's. The earlier Northwood and Willamette cores had 20-stage pipelines, almost double the 10 stages of the PIII or Athlon XP, and the Prescott-core CPUs currently on the market have 31 stages. Many people will ask why the pipeline was lengthened. In fact, pipeline length has a considerable effect on clock frequency: the longer the pipeline, the greater the potential for raising the frequency. But if a branch prediction fails or the cache misses, the delay is also longer. For this reason, the NetBurst architecture separates out an 8-stage instruction fetch/decode pipeline, and the Prescott core also has such an 8-stage pipeline. Strictly speaking, then, the Northwood and Willamette cores have 28-stage pipelines and Prescott has a 39-stage pipeline, about twice that of the current Athlon (K8) architecture.
I believe many people know the shortcomings of a long pipeline, but do they understand its advantages? Within NetBurst, the pipeline can process three instructions per clock cycle, the same as K7/K8. Theoretically, the final performance of the NetBurst architecture is 3 instructions per clock multiplied by the clock frequency, which shows that the "frequency is everything" view has a theoretical basis. Measured this way, K8 is no match for NetBurst. However, many factors affect performance; the three most important are branch misprediction, cache misses, and instruction dependencies.
Every CPU runs into these three problems, but the solutions and their effectiveness differ. NetBurst's inherently long pipeline is both its biggest advantage and its biggest disadvantage. If a branch prediction fails or a cache miss occurs, the Prescott core suffers a 39-cycle delay, much longer than on any other architecture. However, its high operating frequency, together with a large L2 cache, makes up for the shortcomings of the NetBurst architecture to some extent.
However, the pipeline problem has been better resolved in Conroe, Intel's next-generation CPU architecture. A large cache and a shorter pipeline, combined with a dual-core design, should make future Intel CPUs perform even better.
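The trade-off between peak throughput and misprediction penalty can be sketched with a simple cycles-per-instruction model: start from the ideal of 3 instructions per clock and add the stall cycles that mispredicted branches cost on average. This is a simplified illustration, not a measurement. The 39-cycle penalty comes from the text; the 20-cycle penalty, the share of branch instructions, the misprediction rate, and the clock frequencies are hypothetical values chosen only for the example.

```python
def sustained_ipc(peak_ipc, branch_fraction, mispredict_rate, penalty_cycles):
    """Instructions per clock once average misprediction stalls are included."""
    ideal_cpi = 1.0 / peak_ipc
    stall_cpi = branch_fraction * mispredict_rate * penalty_cycles
    return 1.0 / (ideal_cpi + stall_cpi)

def instructions_per_second(ipc, clock_hz):
    """Performance in the article's sense: instructions per clock times clock."""
    return ipc * clock_hz

# Hypothetical workload: 20% branches, 5% of them mispredicted.
BRANCH_FRACTION = 0.20
MISPREDICT_RATE = 0.05

long_pipeline_ipc = sustained_ipc(3, BRANCH_FRACTION, MISPREDICT_RATE, penalty_cycles=39)
short_pipeline_ipc = sustained_ipc(3, BRANCH_FRACTION, MISPREDICT_RATE, penalty_cycles=20)

# Hypothetical clocks: the long pipeline is assumed to reach a higher frequency.
long_pipeline_perf = instructions_per_second(long_pipeline_ipc, 3.6e9)
short_pipeline_perf = instructions_per_second(short_pipeline_ipc, 2.4e9)

print(f"long pipeline:  {long_pipeline_ipc:.2f} IPC, {long_pipeline_perf / 1e9:.2f} G instr/s")
print(f"short pipeline: {short_pipeline_ipc:.2f} IPC, {short_pipeline_perf / 1e9:.2f} G instr/s")
```

With these made-up inputs, the longer pipeline loses more instructions per clock to each misprediction, and whether its higher clock frequency makes up for that depends entirely on the workload's branch behaviour.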
AMD believes that its dual-core Opteron and Athlon 64 X2 meet the criteria for a true dual-core processor, and implies that Intel's dual-core processors are merely "dual-cores", that is, "pseudo dual-core", while claiming its own are "true dual-core". The true-versus-pseudo dual-core dispute has stirred controversy and made the choice harder for consumers.
AMD considers its dual core "true dual-core" because it does not simply integrate two processor cores onto one piece of silicon (the die); compared with a single core, it adds a "System Request Interface" (SRI) and a "crossbar switch". According to AMD, their role is to arbitrate tasks between the two cores and to handle core-to-core communication. Together with the integrated memory controller and the HyperTransport bus, this lets each core have exclusive I/O bandwidth, avoids resource contention, achieves lower memory latency, and leaves more room for expansion, so that dual core can easily be extended to multi-core.
In contrast to its own "true dual-core", AMD calls the architecture of Intel's dual-core processors, the Pentium Extreme Edition and the Pentium D, "dual-cores". AMD says that Intel simply integrated two complete processor cores and connected them to the same bandwidth-limited front-side bus. This architecture inevitably makes the two cores compete for bus resources, which hurts performance, and it is hard to add more processor cores to Intel's dual-core architecture.