Reposted from: http://blog.csdn.net/guanhui1997/article/details/72840769
Industrial Big Data Ramblings 12: Real-Time Databases and Time-Series Databases
In industrial big data storage, besides traditional relational databases and distributed databases, there is another kind of database that is both very common and very necessary: the real-time database and the time-series database.
A real-time database is not just a database but a complete system. It covers data acquisition through various industrial interfaces, the compression, storage, and retrieval of massive volumes of monitoring data, and feedback control based on that monitoring data.
Real-time databases emerged mainly to handle tasks that relational databases of the time were poor at, including:
1. Real-time reading and writing of massive data
Industrial monitoring data requires acquisition and response times on the order of milliseconds, and a large enterprise often has tens of thousands or even hundreds of thousands of monitoring points. Because of how relational databases are designed, it is hard for them to sustain hundreds of thousands of reads and writes per second on such a volume of high-frequency data. A real-time database handles this load by switching to a timestamp-based data structure and a high-frequency cache designed for fast reads and writes.
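As a rough illustration of what a timestamp-based structure buys, here is a minimal sketch in Python. It is not how any particular real-time database is implemented; the class name PointBuffer, its methods, and the point name are all hypothetical. It keeps one time-ordered series per monitoring point in memory, so in-order writes are a plain append and a time-range read is two binary searches.

```python
import bisect
from collections import defaultdict


class PointBuffer:
    """Hypothetical in-memory cache: one time-ordered series per monitoring point."""

    def __init__(self):
        # point id -> parallel lists of timestamps and values
        self._times = defaultdict(list)
        self._values = defaultdict(list)

    def write(self, point_id, ts, value):
        times = self._times[point_id]
        values = self._values[point_id]
        if not times or ts >= times[-1]:
            # Common case: samples arrive in time order, so this is an append.
            times.append(ts)
            values.append(value)
        else:
            # Late sample: insert where it keeps the series sorted by time.
            i = bisect.bisect_right(times, ts)
            times.insert(i, ts)
            values.insert(i, value)

    def read_range(self, point_id, start, end):
        """Return (ts, value) pairs with start <= ts <= end."""
        times = self._times.get(point_id, [])
        values = self._values.get(point_id, [])
        lo = bisect.bisect_left(times, start)
        hi = bisect.bisect_right(times, end)
        return list(zip(times[lo:hi], values[lo:hi]))


buf = PointBuffer()
for i in range(10):
    # Timestamps in milliseconds; "pump1.pressure" is a made-up point name.
    buf.write("pump1.pressure", i * 1000, 1.0 + i)
recent = buf.read_range("pump1.pressure", 7000, 9000)
```

A real system would add persistence, eviction, and concurrency control on top of a structure like this; the sketch only shows why keying everything by time makes both the write path and the range read cheap.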
2. Large-capacity data storage
The volume of acquired monitoring data is huge, so storing it in a traditional database takes a great deal of space. Suppose a relational database stores 10,000 monitoring points, each collecting one double-precision value per second. Even ignoring indexes, this needs 5-6 TB of storage, and that figure does not include the timestamp stored with each value. With timestamps and indexes included, 15-20 TB is required. A real-time database uses specialized compression algorithms, including Huffman coding, the swinging door algorithm, and some secondary compression algorithms. The compression ratio generally reaches about 30:1, and with special handling of timestamps and indexes the storage footprint can be reduced to about 1/40 of that of a relational database. The example above therefore needs only about 500 GB.
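To make the swinging door idea concrete, here is a minimal sketch in Python of one common formulation of the algorithm. It is not taken from any vendor's product; the function name swinging_door and the tolerance parameter dev are assumptions for illustration. Starting from the last archived sample, it tracks two "doors" pivoting dev above and dev below that sample. As long as the doors have not opened past parallel, the incoming samples fit a straight-line corridor and are dropped; once they do, the previous sample is archived and a new corridor starts from it.

```python
def swinging_door(points, dev):
    """Lossy compression of (t, v) samples; t must be strictly increasing.

    dev is the compression tolerance (assumed parameter name). Returns the
    subset of samples to archive; dropped samples would be approximated by
    linear interpolation between archived neighbours.
    """
    if not points:
        return []
    archived = [points[0]]
    t0, v0 = points[0]
    up = float("-inf")   # steepest slope seen from the upper pivot (t0, v0 + dev)
    low = float("inf")   # shallowest slope seen from the lower pivot (t0, v0 - dev)
    prev = points[0]
    for t, v in points[1:]:
        up_new = max(up, (v - (v0 + dev)) / (t - t0))
        low_new = min(low, (v - (v0 - dev)) / (t - t0))
        if up_new > low_new:
            # Doors opened past parallel: no single corridor fits any more.
            # Archive the previous sample and restart the doors from it.
            archived.append(prev)
            t0, v0 = prev
            up = (v - (v0 + dev)) / (t - t0)
            low = (v - (v0 - dev)) / (t - t0)
        else:
            up, low = up_new, low_new
        prev = (t, v)
    if archived[-1] != prev:
        archived.append(prev)  # always keep the final sample
    return archived


# A flat signal with one step change: ten samples collapse to four.
step = [(i, 0.0) for i in range(5)] + [(i, 10.0) for i in range(5, 10)]
kept = swinging_door(step, 0.5)
```

How much this saves depends entirely on the signal and the chosen tolerance: slowly changing process values compress well, noisy ones much less. The 30:1 ratio quoted above is the article's figure, not something this sketch demonstrates.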
3. Data acquisition with integrated industrial interfaces
For historical and monopoly reasons, there is a wide variety of industrial communication and transmission protocols in use today. Real-time databases therefore generally integrate a large number of industrial protocol interfaces and can parse and forward many kinds of industrial protocols. As real-time databases have developed, the interface software has gradually become independent: it can be deployed on the same computer as the real-time database core, or separately on an interface machine, which gives better scalability and stability.
4. Integrated control functions for real-time control
A real-time database generally provides a downlink control interface for high-speed writes. Write efficiency depends heavily on the efficiency of the interface communication and of the actuators. This is why most real-time databases are developed by industrial control software vendors, who have extensive experience with writing to field devices. Even so, industrial systems have strict timing requirements, and there is always a time lag between a read from the database and the resulting write. Real-time databases are therefore generally not suitable for fast control of switching (on/off) signals.
In the era of cloud computing, some drawbacks of real-time databases are gradually becoming apparent.
First, because a real-time database is organized around timestamps, queries and retrieval are essentially limited to simple lookups by time. The major vendors have developed a number of tools, but the richness of querying still cannot match that of a relational database.
Second, real-time databases are sold to large industrial enterprises and are therefore expensive. In the Internet of Things era, that is a considerable cost for small and medium-sized industrial enterprises.
Third, traditional real-time databases are not convenient or flexible to deploy. Their transport layer was designed for industrial networks and seldom considers the Internet, so they are poorly suited to today's cloud computing environments.
This is where the emerging time-series database comes in. By 2017 there were many open-source and commercial time-series database products. A time-series database stores time-series data and needs to support the basic functions of fast writes, persistence, and multi-dimensional aggregate queries over that data. It corresponds mainly to the data storage part of a real-time database, but because it uses newer technology it greatly expands what the data can hold: besides the data point and timestamp, it also stores descriptive information such as tags and content, and it offers a variety of aggregate queries that make up for the shortcomings of real-time databases.
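The difference that tags make can be sketched in a few lines of Python. This is only an illustration of the data model, not the API of any specific time-series database; the names Sample, make_sample, and aggregate, and the tag keys plant and line, are all made up for the example. Each sample carries a metric name, a timestamp, a value, and a set of tags, and a query can then group and aggregate along any tag dimension rather than by time alone.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Sample:
    metric: str
    ts: int
    value: float
    tags: tuple  # sorted (key, value) pairs describing the data point


def make_sample(metric, ts, value, **tags):
    return Sample(metric, ts, value, tuple(sorted(tags.items())))


def aggregate(samples, metric, group_by, start, end, func=mean):
    """Aggregate one metric over [start, end], grouped by the given tag keys."""
    groups = defaultdict(list)
    for s in samples:
        if s.metric != metric or not (start <= s.ts <= end):
            continue
        tags = dict(s.tags)
        key = tuple(tags.get(k) for k in group_by)
        groups[key].append(s.value)
    return {key: func(vals) for key, vals in groups.items()}


samples = [
    make_sample("temperature", 1, 20.0, plant="A", line="1"),
    make_sample("temperature", 2, 22.0, plant="A", line="1"),
    make_sample("temperature", 2, 30.0, plant="A", line="2"),
    make_sample("temperature", 3, 40.0, plant="B", line="1"),
]
by_plant = aggregate(samples, "temperature", ["plant"], 0, 10)
peak_by_line = aggregate(samples, "temperature", ["plant", "line"], 0, 10, func=max)
```

Here the same four samples are averaged per plant and then reduced to a maximum per plant and line, which is the kind of multi-dimensional query that a purely time-indexed store makes awkward. A real time-series database would do this with indexes over the tags instead of a linear scan.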
However, a time-series database does not provide features such as industrial interfaces and downlink control. Developers have to build these themselves and connect them to the time-series database.
Of course, both real-time databases and time-series databases are developing rapidly. The two will surely learn from each other and provide more and better products for industrial big data.