Baidu's high-performance computing system (used mainly for backend data training and computation) currently comprises 4,000 nodes spread across more than 10 clusters, the largest of which exceeds 1,000 nodes. Each node has an 8-core CPU, 16 GB of memory, and 12 TB of disk. The system processes more than 3 PB of data per day. The planned architecture will grow to more than 10,000 nodes, with a daily data volume exceeding 10 PB.
The underlying resource management layer uses agents to schedule different types of computation, covering MPI-based algorithms, MapReduce jobs, and DAG-structured applications. Through this scheduling and allocation, computing resources and data in each region can be shared between the HPC cluster and the large-scale distributed clusters.
Baidu uses HCE to optimize the sorting, compression, decompression, and memory control of streaming jobs, and HCE provides MapReduce interfaces for C++.
More on Baidu HCE: HCE (Hadoop C++ Extension) is a C++-based Hadoop environment with full C++ functionality. It avoids the drawbacks of Java's memory release and resource allocation, and bypasses the Java layers entirely on the data path, which greatly improves algorithm efficiency.