Collaboration mechanism of computing and data in big data technology
Wang Peng Huang Liu Fengan Handsome
A big data system, also known as a data-oriented high-performance computing system, resembles a traditional high-performance computing system in that both its computation and its data storage are usually distributed and built on clusters. From the perspective of the collaboration mechanism between computation and data, this paper compares computation-oriented high-performance computing with data-oriented high-performance computing, and points out that this mechanism determines the basic architecture and performance of a big data system. Integrating the distributed file system with computation through the collaboration mechanism is the foundation of automatic parallelization in big data systems. Unlike computation-oriented high-performance computing systems, big data systems take data partitioning and moving computation to the data as the main principles of their collaboration mechanism, and thereby achieve automatic parallel batch processing of massive data. The metadata mapping method, the hash mapping method, and the stream topology method are the basic methods for realizing collaboration between computation and data; the stream topology method in particular enables real-time big data processing.
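As an illustration of the hash mapping idea and of moving computation to the data, the following is a minimal Python sketch rather than the method of any particular system. The node names, the helper functions `node_for`, `place_blocks`, and `run_where_data_lives`, and the use of MD5 as the hash are all assumptions made for this example; a real cluster would dispatch tasks over the network instead of looping in one process.

```python
import hashlib

# Hypothetical cluster node names, assumed for this sketch only.
NODES = ["node-0", "node-1", "node-2"]


def node_for(key, nodes=NODES):
    """Hash mapping: derive the owning node directly from the data key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def place_blocks(blocks, nodes=NODES):
    """Partition step: store each data block on the node its key hashes to."""
    placement = {node: {} for node in nodes}
    for key, records in blocks.items():
        placement[node_for(key, nodes)][key] = records
    return placement


def run_where_data_lives(placement, task):
    """Apply the task on each node to its local blocks only.

    The loop stands in for shipping the task to every node, so that the
    data itself never has to move.
    """
    results = {}
    for node, local_blocks in placement.items():
        for key, records in local_blocks.items():
            results[key] = (node, task(records))
    return results


# Three illustrative data blocks (made-up values).
blocks = {"part-a": [1, 2, 3], "part-b": [4, 5], "part-c": [6]}
placement = place_blocks(blocks)
results = run_where_data_lives(placement, sum)
```

Because the same hash function serves both placement and lookup, any node can locate a block without consulting a central directory. A metadata mapping approach would instead keep an explicit table from block to node.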