Those things on Tianhe 2 (1)

Source: Internet
Author: User
Tags xeon e5

I believe everyone has heard of Tianhe 2, which ranked first on the TOP500 list in 2013 and 2014 and is roughly twice as fast as the second-place Titan. What kind of architecture does Tianhe 2 use to achieve this? Let's take a look.


Tianhe 2's model designation is TH-IVB-FEP. It uses a heterogeneous layout that pairs central processors with coprocessors:

Tianhe 2 has a total of 16,000 computing nodes, each equipped with two 12-core Xeon E5 central processors and three 57-core Xeon Phi coprocessors (compute accelerator cards, i.e. MIC cards). That gives 32,000 Xeon E5 processors and 48,000 Xeon Phi coprocessors, for a total of 3.12 million computing cores.
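The 3.12 million figure can be checked directly from the per-node numbers. A minimal sketch using only the counts quoted above (the variable names are purely illustrative):

```python
# Per-node figures as quoted in the article.
NODES = 16_000
CPUS_PER_NODE = 2       # Xeon E5
CORES_PER_CPU = 12
PHIS_PER_NODE = 3       # Xeon Phi
CORES_PER_PHI = 57

total_cpus = NODES * CPUS_PER_NODE
total_phis = NODES * PHIS_PER_NODE
cpu_cores = total_cpus * CORES_PER_CPU
phi_cores = total_phis * CORES_PER_PHI
total_cores = cpu_cores + phi_cores

print(f"CPUs: {total_cpus:,}, coprocessors: {total_phis:,}")
print(f"CPU cores: {cpu_cores:,}, coprocessor cores: {phi_cores:,}")
print(f"Total cores: {total_cores:,}")
```

Most of the cores (2,736,000 of 3,120,000) sit on the coprocessors rather than on the CPUs.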


* Processor/CPU:

The CPU in each node is the Intel Xeon E5-2692 v2, a 12-core 2.2 GHz processor based on the Intel Ivy Bridge microarchitecture (Ivy Bridge-EP core) and built on a 22 nm process, with a peak performance of 0.2112 TFLOPS.


* Coprocessor/APU:

Computing acceleration is handled by the Xeon Phi 31S1P coprocessor, built on the Intel Many Integrated Core (MIC) architecture. It runs at a clock frequency of 1.1 GHz and has 57 x86 cores (the chip actually carries 61, but four are masked off because of computing-cycle conflicts when all cores are enabled). Each x86 core can run four hardware threads using a dedicated multithreading technology, and the peak performance is 1.003 TFLOPS.
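Both peak figures follow from cores × clock × floating-point operations per cycle. The sketch below is only a back-of-the-envelope check: it takes the 2.2 GHz clock of the Xeon E5-2692 v2, and it assumes 8 double-precision FLOPs per cycle per Ivy Bridge core (256-bit AVX add plus multiply) and 16 per Xeon Phi core (512-bit fused multiply-add). Those per-cycle values are not stated in the article, and `peak_gflops` is an illustrative helper name.

```python
def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical double-precision peak of one chip, in GFLOPS."""
    return cores * clock_ghz * flops_per_cycle

# Assumption: 8 DP FLOPs/cycle/core on Ivy Bridge (AVX: 4-wide add + 4-wide multiply).
xeon_gflops = peak_gflops(cores=12, clock_ghz=2.2, flops_per_cycle=8)

# Assumption: 16 DP FLOPs/cycle/core on Xeon Phi (8 vector lanes x fused multiply-add).
phi_gflops = peak_gflops(cores=57, clock_ghz=1.1, flops_per_cycle=16)

# Chip counts come from the node description: 32,000 CPUs and 48,000 coprocessors.
system_pflops = (32_000 * xeon_gflops + 48_000 * phi_gflops) / 1e6

print(f"Xeon E5-2692 v2: {xeon_gflops:.1f} GFLOPS")
print(f"Xeon Phi 31S1P:  {phi_gflops:.1f} GFLOPS")
print(f"Whole system:    {system_pflops:.1f} PFLOPS")
```

Under these assumptions the coprocessors contribute close to 90% of the machine's theoretical peak.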


* Memory:

Each node has 64 GB of main memory, and each Xeon Phi coprocessor carries 8 GB of onboard memory. Each node therefore has a total of 88 GB of memory, and the total memory of the system is 1,375 TiB (about 1.34 PiB).
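The memory totals can be worked through from the per-node figures. A minimal sketch, assuming (as the article's totals imply) that the quoted GB values are binary gigabytes:

```python
NODES = 16_000
HOST_GB_PER_NODE = 64    # main memory attached to the two CPUs
PHI_GB = 8               # onboard memory per coprocessor
PHIS_PER_NODE = 3

node_gb = HOST_GB_PER_NODE + PHIS_PER_NODE * PHI_GB   # 64 + 3 * 8 = 88
total_gib = NODES * node_gb        # assumption: treat the quoted GB as GiB
total_tib = total_gib / 1024
total_pib = total_tib / 1024

print(f"Per node: {node_gb} GB")
print(f"System:   {total_tib:,.0f} TiB ({total_pib:.2f} PiB)")
```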

In fact, each coprocessor is effectively an independent machine with its own operating system. Its onboard memory is used independently and is completely separate from the node memory. There is no shared memory between the two, so the node cannot use the coprocessor's onboard memory and the coprocessor cannot use the node memory.


* External Storage:

A hard disk array with a capacity of 12.4 PiB.


* Cabinet/Rack/motherboard, computing array

  • The motherboards, racks, and cabinets are all manufactured by Inspur Group. There are 170 cabinets in total: 125 compute cabinets, 8 service cabinets, 13 communication cabinets, and 24 storage cabinets. Each compute cabinet holds 4 frames, each frame holds 16 boards, and each board carries two computing nodes.

  • In the computing array, each motherboard is split into two parts: an APU module and a CPM module. The APU module carries five Xeon Phi cards, and the CPM module carries one Xeon Phi plus four Xeon E5 processors. Note that the computing array consists of multiple nodes. One motherboard has four CPUs and six APUs, and one node comprises two CPUs and three APUs, so each motherboard holds two nodes and 16,000 nodes require 8,000 motherboards. This count does not include the front-end processors.

  • The APU module and the CPM module are connected through the PCI-E 3.0 x16 interface integrated in the CPU. Because of hardware limitations of the Xeon Phi, however, the link only runs at PCI-E 2.0 x16, with a data rate of 10 Gbps per lane.
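The cabinet, frame, and board figures above are consistent with the 16,000-node total. A short check using only numbers quoted in this section (the variable names are illustrative):

```python
# Cabinet breakdown as listed above.
cabinets = {"compute": 125, "service": 8, "communication": 13, "storage": 24}
total_cabinets = sum(cabinets.values())

FRAMES_PER_CABINET = 4
BOARDS_PER_FRAME = 16
NODES_PER_BOARD = 2

boards = cabinets["compute"] * FRAMES_PER_CABINET * BOARDS_PER_FRAME
nodes = boards * NODES_PER_BOARD

# Per-board contents: APU module (5 Xeon Phi) + CPM module (1 Xeon Phi, 4 Xeon E5).
phis_per_board = 5 + 1
cpus_per_board = 4
phis_per_node = phis_per_board // NODES_PER_BOARD
cpus_per_node = cpus_per_board // NODES_PER_BOARD

print(f"Cabinets: {total_cabinets}, boards: {boards:,}, nodes: {nodes:,}")
print(f"Per node: {cpus_per_node} CPUs, {phis_per_node} coprocessors")
```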


* Front-end Processor

The front end of the computing nodes consists of 4,096 FT-1500 processors, a 16-core SPARC V9 architecture processor developed by China's National University of Defense Technology. It is built on a 40 nm process, runs at 1.8 GHz, has a thermal design power of 65 W, and delivers a peak performance of 144 GFLOPS. For comparison, the Intel Xeon E5-2692 v2 (22 nm, 12 cores, 2.2 GHz) has a peak performance of 211 GFLOPS.

What is the purpose of the front-end processors? Tianhe 2 has a huge number of processors, each with multiple cores, and a computing task has to be spread evenly across many of them. This requires task scheduling to manage time slots and execution order, and to decide when a task runs, how many processors it needs, and which processors it runs on. It is somewhat like an air traffic control center scheduling aircraft or a dispatch center scheduling vehicles, except that what is being scheduled here is processors.
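To make the idea concrete, here is a toy first-come-first-served scheduler. It is only an illustration of the three decisions described above (when a job starts, how many processors it gets, and which ones); it is not how Tianhe 2's actual job management works, and `fifo_schedule` and the job tuples are hypothetical names invented for this sketch.

```python
import heapq

def fifo_schedule(total_procs, jobs):
    """Toy FIFO scheduler (hypothetical, for illustration only).

    jobs: list of (name, procs_needed, runtime) in submission order.
    Returns {name: (start_time, end_time, assigned_processor_ids)}.
    """
    free = list(range(total_procs))   # ids of idle processors
    running = []                      # min-heap of (end_time, processor_ids)
    now = 0
    plan = {}
    for name, need, runtime in jobs:
        if need > total_procs:
            raise ValueError(f"job {name} needs more processors than exist")
        # Wait for earlier jobs to finish until enough processors are idle.
        while len(free) < need:
            end, procs = heapq.heappop(running)
            now = max(now, end)
            free.extend(procs)
        assigned, free = free[:need], free[need:]
        heapq.heappush(running, (now + runtime, assigned))
        plan[name] = (now, now + runtime, assigned)
    return plan

# Three jobs competing for a 4-processor machine.
plan = fifo_schedule(4, [("A", 2, 10), ("B", 2, 5), ("C", 3, 4)])
for name, (start, end, procs) in plan.items():
    print(f"job {name}: t={start}..{end} on processors {procs}")
```

Jobs A and B fit side by side and start immediately, while C has to wait until enough processors are released. A production scheduler adds priorities, backfilling, and topology awareness on top of this basic bookkeeping.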

This is one of the few places in Tianhe 2 where Chinese-made processors are used. It is also where the domestic Kylin operating system (modified from the Linux source code) runs.


* Network connection

Tianhe 2 uses the self-developed TH Express-2 internal interconnect, a high-speed network based on optoelectronic hybrid transmission technology. It has 13 switches with 576 ports each. The router chip is an ASIC (application-specific integrated circuit) named NRC, built on a 90 nm process with 2,577 pins; the throughput of a single NRC is 2.56 Tbps. The network interface on each terminal uses a NIC chip of similar design but slightly smaller, with 675 pins. The network interface is attached via PCIe 2.0 and has a transfer rate of 6.36 Gb/s. Latency is also very low: only 85 µs across 12,000 nodes.

This is another place where domestic chips are used.


* Operating system and related software:

  • Red Hat Enterprise Linux Server release 6.2 (customized 2.6.32-220 kernel): installed on the 16,000 computing nodes. There are plans to later move 6,400 of these nodes to Kylin Cloud Linux (a customized Ubuntu edition for China).

  • OpenStack (Canonical release): includes the customized Chinese edition of Ubuntu Server (Kylin Cloud Linux), Ubuntu OpenStack, and Ubuntu Juju (a cloud service orchestration engine). OpenStack currently runs on 256 nodes and is to be deployed on more than 6,400 nodes in the future.

  • Kylin operating system: modified from the Linux source code, it runs on the domestic FeiTeng processors (FT-1500) at the front end and handles task scheduling and management. The job management system uses Slurm.


To be continued...

This article is from the "Tony Park" blog, please be sure to keep this source http://ittony.blog.51cto.com/6242212/1551329
