Bulldozer "is the core of AMD's radical redesign, and will become the next generation of AMD's high-performance processor technology for the client and server sectors, with 33% more cores and approximately 50% higher performance than the Opteron 6100 series."
As a new generation of processor architecture, AMD "Bulldozer" will adopt a 32nm SOI process. Compared with the "Magny-Cours" Opteron processors, this lets "Bulldozer" offer 33% more cores and 50% higher throughput without increasing power consumption.
Unlike all previous AMD processors, "Bulldozer" uses a modular design. Each module contains two processor cores, somewhat like a single-core processor with SMT enabled. Each core has its own integer scheduler and four dedicated pipelines, while the two cores share one floating-point scheduler and two 128-bit FMAC (fused multiply-accumulate) units.
Turbo Core all-core boost technology
Turbo Core technology is primarily meant to raise the clock speed for workloads that do not fully use the processor's available headroom. Across a variety of workloads, Turbo Core can raise the clock by up to 500 MHz. Most importantly, Turbo Core boosts all cores at once. This differs significantly from earlier boost technologies, which may need to shut down some cores in order to accelerate only the remaining ones. Turbo Core allows a boost of up to 500 MHz on all cores, and if some cores are shut down, the boost can exceed 500 MHz. At the same time, the memory controller has been further optimized to improve memory throughput.
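The boost behavior described above can be sketched as a toy model. This is only an illustration under stated assumptions: the function name, the `has_headroom` flag, and the `extra_idle_boost_mhz` parameter are hypothetical, and the text does not say how much extra boost is available when cores are shut down, so that value is left to the caller.

```python
ALL_CORE_BOOST_MHZ = 500  # maximum all-core boost quoted in the text


def turbo_clock_mhz(base_mhz: int, has_headroom: bool,
                    extra_idle_boost_mhz: int = 0) -> int:
    """Return the maximum clock under this simplified model.

    base_mhz: nominal clock (hypothetical input, not from the article).
    has_headroom: True when the workload leaves headroom unused.
    extra_idle_boost_mhz: additional boost when some cores are shut down;
        the article gives no figure, so this is an assumption.
    """
    if extra_idle_boost_mhz < 0:
        raise ValueError("extra boost cannot be negative")
    if not has_headroom:
        # No spare headroom: the part stays at its nominal clock.
        return base_mhz
    return base_mhz + ALL_CORE_BOOST_MHZ + extra_idle_boost_mhz
```

For example, with a hypothetical 2000 MHz base clock, `turbo_clock_mhz(2000, True)` gives 2500 under this model, and any positive `extra_idle_boost_mhz` pushes the result past the 500 MHz all-core figure.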
In addition to the four integer pipelines dedicated to each core, "Bulldozer" uses "Flex FP" technology for floating-point operations. The two cores in a module share one floating-point scheduler and two 128-bit FMAC units, which can be used separately or combined. With one 128-bit FMAC, a core can complete two 64-bit double-precision or four 32-bit single-precision calculations per clock cycle. If one core is not performing floating-point operations, the other core can occupy both 128-bit FMACs and complete four double-precision or eight single-precision calculations per clock cycle. AMD calls this AVX mode. This technique preserves Bulldozer's floating-point capability, so that sharing does not sacrifice performance in high-performance computing.
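The per-cycle figures above follow from dividing the available FMAC width by the element size. A minimal sketch of that arithmetic, assuming each lane of an FMAC counts as one calculation (as the text does); the function and constant names are illustrative, not AMD terminology:

```python
FMAC_UNITS_PER_MODULE = 2   # two shared FMACs per module (from the text)
FMAC_WIDTH_BITS = 128       # width of each FMAC (from the text)


def calcs_per_cycle(element_bits: int, sibling_uses_fp: bool) -> int:
    """Calculations one core can complete per clock in this simple model.

    element_bits: 64 for double precision, 32 for single precision.
    sibling_uses_fp: True if the other core in the module is also
        running floating-point work, so each core gets one FMAC.
    """
    if element_bits not in (32, 64):
        raise ValueError("only 32-bit and 64-bit elements are modeled")
    units = 1 if sibling_uses_fp else FMAC_UNITS_PER_MODULE
    return units * FMAC_WIDTH_BITS // element_bits
```

When both cores use floating point, this gives 2 double-precision or 4 single-precision calculations per core per cycle; when one core has both FMACs to itself, it gives 4 or 8, matching the figures in the text.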
New interfaces and new processes
The Bulldozer processor will use the Socket AM3+ interface with 942 pins, which differs from the current 938-pin Socket AM3 interface. Its benefits include support for DDR3-1600 memory and advanced power-saving technology. AM3+ will be the last generation of AMD's pin grid array (PGA) packaging, and land grid array (LGA) packaging will be used afterward. When the Fusion processor arrives, it will use the new LGA AF1 interface with up to 1591 contacts, supporting the DisplayPort 1.2 standard, the PCI-E 3.0 specification (32 lanes), and quad-channel memory.
Enhanced memory controller
AMD first launched an integrated memory controller eight years ago. Drawing on its experience and strong technology in this field, AMD has improved the overall performance of the memory controller in this generation of products. First, the memory controller has been redesigned specifically for efficiency, yielding a 30% improvement in memory performance. On top of that 30%, support for 1600 MHz memory provides an additional 20%. Combined, the memory controller can achieve a 50% increase in throughput.
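The 50% figure corresponds to adding the two gains. As a quick check of the arithmetic, the sketch below computes both the additive total used in the text and, for comparison, what the total would be if the two gains compounded. The function names are illustrative, and which reading AMD intended is not stated in the text.

```python
def additive_gain_percent(first: int, second: int) -> int:
    """Total gain if the two percentage gains simply add."""
    return first + second


def compounded_gain_percent(first: int, second: int) -> float:
    """Total gain if the second gain applies on top of the first."""
    return (100 + first) * (100 + second) / 100 - 100


redesign_gain = 30      # efficiency redesign (from the text)
faster_memory_gain = 20  # 1600 MHz memory support (from the text)

print(additive_gain_percent(redesign_gain, faster_memory_gain))
print(compounded_gain_percent(redesign_gain, faster_memory_gain))
```

The additive reading gives the 50% quoted above; a compounded reading would come out somewhat higher, at about 56%.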
Simultaneous support for AVX and SSE instructions
Flex FP is the most innovative floating-point technology AMD has ever produced, and each module has one Flex FP unit for floating-point operations. With traditional 128-bit code, each core effectively has its own floating-point unit, so on 128-bit code AMD can execute more operations than competing designs. With 256-bit AVX code, Bulldozer combines the two floating-point units for execution, so in 256-bit mode the number of operations executed is the same as the competition's. But Bulldozer has one very big advantage: it can execute 256-bit AVX instructions and SSE instructions simultaneously. Competitors cannot do this and must choose either AVX or SSE. This advantage gives Bulldozer higher performance in high-performance computing, media encoding and decoding, and some technical computing workloads.
More advanced power management technology
The second integer core in each module adds only about 12% to the module's area, which at the chip level adds only about 5% more circuitry to the whole die. More cores in less space clearly helps improve performance per watt and performance per unit cost.
Power consumption is determined by how much of the clock network is powered, which depends on how many transistors must be powered to execute a typical instruction (operation). Measured as a percentage of maximum clock power, Bulldozer shows very good energy efficiency in both the normal application state and the idle state. Energy use is also optimized within each unit, and individual units can be powered off. In high-performance computing, power is consumed mainly by floating-point operations, while in general applications the execution units consume the most. When idle, AMD's technology can completely shut off power to cores that are not needed.
Last year AMD's products went through a major transition and AMD introduced a new socket, so Bulldozer, launching in 2011, can use the sockets introduced in 2010. Competitors must introduce a new socket whenever they launch a new platform, which gives AMD a further advantage.