I. Industry Background and User Needs Analysis
A geoscience data storage system shares the common characteristics of storage in a conventional data center, but it also has requirements of its own. Looking at processing, interpretation, reservoir analysis, and other geoscience applications, it is easy to see that, beyond high reliability, high performance, and flexible and convenient online expansion, they have the following typical features:
• A complete geoscience IT architecture should serve a variety of geoscience application systems and meet their differing needs. For processing systems, the traditional server is an SMP parallel machine with strict MFLOPS requirements; the user may be running heavy processing tasks on IBM SP series servers or SGI Origin high-end servers. More and more users are turning to PC Linux clusters for powerful CPU processing capacity. Processing servers of different generations should therefore be able to share seismic data, minimizing the heavy work of data transfer and even format conversion.
• In the future, with new interpretation tools (5×), pre-stack data interpretation tools (5×), time-lapse (4D) seismic (3×), and shear-wave data, the amount of data will grow by an astonishing factor of 225 (5 × 5 × 3 × 3). The storage system should therefore fully meet the capacity requirements of future applications and support online capacity expansion.
• For seismic interpretation systems, advanced interpretation mostly runs on high-end SGI workstations to meet demanding graphics requirements, while ordinary interpretation work can be done on HP, SUN, or IBM workstations. Interpretation of a single area may involve several technical staff, so storage devices should support heterogeneous platforms and data sharing among multiple users.
• With the development of field processing technology and Wintel technology, applications for both processing and interpretation are beginning to ship in UNIX and Windows versions. Supporting both the CIFS protocol and the NFS protocol is therefore an indispensable criterion for data sharing on a storage device.
• For three-dimensional seismic data, processing, interpretation, and virtual reality applications all have to load large data volumes in a single pass, on top of the intermediate results the applications generate. A large-capacity file system and support for very large single files are therefore a reliable guarantee that production tasks complete efficiently.
• An integrated IT architecture for processing and interpretation reduces the amount of data transferred, supports collaboration between processors and interpreters, optimizes the overall seismic data workflow, improves production efficiency, and maximizes the utilization of seismic data.
• Effective disk-to-disk processing shortens the production cycle and enhances competitiveness.
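The 225-fold growth estimate in the list above is simply the product of the four quoted multipliers. The short Python sketch below reproduces that arithmetic. The labels are paraphrased, the pairing of the last 3× factor with shear-wave data is an assumption inferred from the product, and the 2 TB baseline is a hypothetical figure used only for illustration.

```python
from math import prod

# Growth multipliers quoted in the text as 5 x 5 x 3 x 3.
# Assumption: the last 3x factor belongs to shear-wave data; the source
# gives the product but does not label that factor explicitly.
growth_factors = {
    "new interpretation tools": 5,
    "pre-stack data interpretation tools": 5,
    "time-lapse (4D) seismic": 3,
    "shear-wave data (assumed pairing)": 3,
}


def combined_growth(factors):
    """Multiply independent growth multipliers into one overall factor."""
    return prod(factors.values())


def projected_capacity_tb(current_tb, factors):
    """Scale a current capacity (in TB) by the combined growth factor."""
    return current_tb * combined_growth(factors)


total_factor = combined_growth(growth_factors)
print(f"Combined growth factor: {total_factor}x")

# Hypothetical baseline of 2 TB, not a figure from the source.
baseline_tb = 2
print(f"Projected capacity: {projected_capacity_tb(baseline_tb, growth_factors)} TB")
```

Because the multipliers compound rather than add, even modest per-technology growth leads to a requirement that only online capacity expansion can absorb.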
Based on the above considerations, we propose the following solutions to the tender requirements.
II. Geo-Data Storage System Solution
1. Solution Design Principles
--Fully meet the tender's requirements for the performance and functionality of the storage equipment.
--Select industry-leading technology and products to keep the solution technically advanced, while fully considering compatibility and interoperability with the existing equipment to maximize protection of the user's investment.
--Provide proven integration with geoscience application systems.
--Base the storage system design entirely on the open standards of the computer and networking industry, so that it fits the hardware environment of the user's existing network.
--Data security and high system reliability: the storage system holds the data of the entire center. It is a typical mission-critical service that cannot be interrupted, so its reliability requirements are high. As the core of the system, the storage platform's reliability matters most. Because the storage scheme is centralized, all relevant data sits on one unified platform, and any failure of that platform will have a major impact. The data security of the storage platform and the high reliability of the system are therefore especially important.
--High system performance: the storage system serves data to a large number of users. Because the total data volume will reach the terabyte range, the performance of the entire storage system is also a critical requirement, in particular the ability to sustain concurrent access from many clients across such a large data set. As the business grows, both the data volume and the number of clients will keep increasing, so system performance must scale with future expansion.
--System scalability: as a basic requirement of centralized storage, the storage system should support a large storage capacity and centrally store enterprise data from different platforms, preserving the benefits of distributed processing while achieving centralized storage and centralized management of core information. Over time, as technology develops and the environment changes, the user's data volume will grow rapidly, and new users and new requirements will keep emerging, so the scalability demands on the storage system are high. Although this solution fully accounts for reserved storage capacity, the need to expand will become pressing as the business develops. Scalability here mainly means smooth expansion of storage capacity and smooth attachment of new host systems, minimizing the impact on normal business.
--Data sharing and multi-platform support: as a basic requirement of centralized storage, the storage system must be able to connect to different platforms simultaneously to meet future needs for data centralization and sharing.
--Flexibility and simplicity of system management: because the storage system holds a very large volume of data, managing that data effectively, including backup and recovery, is a major challenge. System administrators need an efficient way to monitor the storage system comprehensively, including real-time performance monitoring, error monitoring, and error status identification. In addition, because a centralized storage platform has many servers attached at the front end, flexibly partitioning and allocating capacity among multiple server platforms is also a major management challenge.
--Based on the overall requirements of the tender, fully consider the needs of the backup system: use mature and reliable LAN-free backup technology to back up large data volumes at high speed, reduce the burden on system administrators, and ensure that processors and interpreters can keep working normally even while backups run.
2. Solution Topology
To fully meet the bidding requirements, we chose the NetApp FAS920c as the centralized storage device, as shown in the following illustration: