In an era of rapid hardware innovation and increasingly intelligent software, a new wave of education informatization, typified by big data, is upon us. To cope with explosive growth in users and data volume, Tongji University worked with Dawning (Shuguang), a leading domestic IT solutions and services provider, to build an industry-leading flexible big data processing platform. The platform fully integrates the university's cloud computing platform and existing assets, and is based on the DS800-F20 storage system, the GridView cluster management system, and the Hadoop distributed computing platform. It marks a new step for Tongji University in information science and interdisciplinary research.
Infrastructure platform requires more flexibility and efficiency to deliver differentiated services
Tongji University, a century-old institution, is a comprehensive university spanning nine major disciplines: science, engineering, medicine, literature, law, philosophy, economics, management and education. As a research-oriented university, Tongji was among the first universities approved by the State Council to set up a graduate school, and it is a high-level institution under the national "211 Project" and "985 Project". However, as users have grown in number and diversified in type and level, the traditional usage model of high-performance computing, the main infrastructure behind cloud computing, can no longer fully meet growing user demand, and it limits the quality of cloud services.
Reportedly, in traditional high-performance computing all services run on the service nodes of the high-performance computer, so those nodes become overloaded and prone to failure, which disrupts normal use. In addition, such an open, shared operating environment cannot accommodate the different environments users require or the granularity at which they consume resources, and the security of user information cannot be guaranteed. Moreover, because hardware and software deployment and operation are very complex and have to be carried out separately for each individual requirement, with almost no reuse, a great deal of labor and time is wasted, which hurts ease of use and efficiency.
"Because of these problems, our new generation of high-performance computing platform should fit the workflows of big data environments more closely, and it must provide sufficient flexibility in design, implementation and management to interact with different application systems and meet the needs of different applications," Tongji University said. "The flexible big data processing platform also needs hardware and software that can be reconfigured repeatedly, and it needs to integrate the original IT resources so that earlier information technology investment is not wasted."
Fully integrate existing resources to realize flexible construction standards
In response to these needs, Dawning (Shuguang) worked closely with Tongji University's Institute of Telecommunications, which was responsible for the construction, to study and analyze the existing assets and network. In the end, Tongji University adopted a flexible processing platform based on Dawning's solution. The platform is partitioned: it comprises three service partitions and a shared storage center, namely network information services, traffic information analysis, medical data analysis, and the storage center.
"Tongji University large data flexible processing platform"
The solution is built on partitions and fully integrates the cloud computing platform with Tongji University's existing assets (IBM, Dell and HP servers); the remaining analysis nodes and service nodes use Dawning high-efficiency servers. The centralized storage system is built on the Dawning DS800-F20, which supports both FC-SAN and IB-SAN architectures and uses dual redundant controllers with 8Gb FC and high-bandwidth 20Gb IB host interfaces for stable, efficient operation. Server management relies on GridView, a large-scale cluster management system that has been proven in a variety of cloud computing deployments and performs well in stability and usability.
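The partition layout can be pictured as a small data model. The following Python sketch is illustrative only: the class names, field names, and the empty node lists are hypothetical, and only the partition names, the storage model, and the interface types come from the description above. It is not a Dawning or GridView configuration format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StorageCenter:
    """Shared storage center; values are taken from the article text."""
    model: str = "DS800-F20"
    host_interfaces: tuple = ("8Gb FC", "20Gb IB")
    redundant_controllers: int = 2


@dataclass
class ServicePartition:
    """One service partition. The node list is a hypothetical placeholder."""
    name: str
    storage: StorageCenter
    nodes: list = field(default_factory=list)


def build_platform():
    """Return the three service partitions, all bound to one storage center."""
    storage = StorageCenter()
    names = [
        "network information service",
        "traffic information analysis",
        "medical data analysis",
    ]
    return [ServicePartition(name, storage) for name in names]


if __name__ == "__main__":
    for partition in build_platform():
        print(partition.name, "->", partition.storage.model)
```

Every partition holds a reference to the same `StorageCenter` object, which mirrors the idea of three independent service partitions in front of a single shared storage center.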
To keep the whole platform's workflow smooth and safe, GridView implements four logical layers on each compute node: the hardware information collection layer, the resource integration platform layer, the core module layer, and the service delivery layer. Together, the four layers systematically organize the information collected from each monitored node. When the collection layer passes information upward, the data is classified and stored in a database, which serves it to the upper layers as metadata. Modules are developed independently and share the data of the integrated common platform, and each module can freely trim the metadata to suit its own development needs. New modules can be added to the platform on demand while remaining compatible with the existing ones. Users get a unified web interface for managing and operating the cluster, a portal view of the integrated resource information, analysis of historical data, and a unified job scheduling interface.
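The flow through the four layers can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not GridView's actual implementation or API: all function and class names, the metric names, and the 0.9 threshold are hypothetical, and an in-memory dictionary stands in for the database.

```python
from collections import defaultdict


def collect(node_name, readings):
    """Collection layer: tag raw hardware readings with their source node."""
    return [{"node": node_name, "metric": m, "value": v} for m, v in readings.items()]


class MetadataStore:
    """Platform layer: classify samples by metric and store them.

    A real system would use a database; a dict stands in for it here.
    """

    def __init__(self):
        self._by_metric = defaultdict(list)

    def ingest(self, samples):
        for sample in samples:
            self._by_metric[sample["metric"]].append(sample)

    def query(self, metric):
        # Return a copy so each module can trim the data without side effects.
        return list(self._by_metric.get(metric, []))


def overloaded_nodes(store, metric="cpu_load", threshold=0.9):
    """Core module layer: one independent module reading the shared metadata.

    The metric name and threshold are hypothetical example values.
    """
    return sorted({s["node"] for s in store.query(metric) if s["value"] > threshold})


def portal_summary(store):
    """Service layer: a single entry point that a web portal could call."""
    return {"overloaded": overloaded_nodes(store)}


if __name__ == "__main__":
    store = MetadataStore()
    store.ingest(collect("node1", {"cpu_load": 0.95, "mem_used": 0.40}))
    store.ingest(collect("node2", {"cpu_load": 0.30}))
    print(portal_summary(store))
```

Because modules read only from the shared store, a new module (for example, one that reports memory usage) could be added as another function without changing the collection or storage code.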
Because Dawning uses Hadoop technology to build the analysis cluster and the centralized storage system, resource islands are eliminated, and the interconnected nodes can be balanced according to business load. Hadoop's scalability lets the platform grow as the data grows, so that it meets the standard of a truly flexible computing platform.
Significantly lower management costs, with partitioned management delivering high cost performance
After the solution was deployed, the problems of the existing IT platform, including insufficient support capacity and management difficulties, were resolved, and Tongji University's leadership and academic disciplines spoke highly of the flexible big data processing platform. Summarizing the results in use, the platform brings several concrete benefits:
- existing resources, including the cloud platform and servers, are fully integrated, protecting the original investment;
- the distributed search engine architecture is partitioned, so the system's components are fully decoupled and upper-layer software can be deployed more flexibly, which makes it much easier for research users to adjust and deploy systems;
- different components are managed by partition, with physical and virtual machine resources configured per partition, emphasizing optimal resource allocation and achieving very high cost performance;
- the data network is separated from the business network, so that data communication and business communication do not interfere with each other;
- the management network carries a light communication load, so management operations run over the data network and the whole network is managed in a unified way; the management network therefore needs no switches or other basic equipment of its own, which reduces purchase and management costs.
The Tongji University project director said: "Building the flexible big data processing platform with Dawning has changed the awkward situation of our information technology construction and strengthened support for more emerging applications. Its partitioned design and application maximize the value of the platform, and also provide more efficient and flexible application workflows for information disciplines and interdisciplinary research."
(Author: Wang Editor: Wang)