"2014 China Database Technology Conference" Memory calculation: Percent Memory Database architecture evolutionPosted on May 5, 2014 by admin
"IT168 Database Conference Live report " April 10, 2014-12th, the Third China Database Technology Conference (DTCC 2014) in Beijing Wuzhou Crown International Hotel kicked off. During the three-day meeting, the Conference will be focused on big data applications, data architecture, data management, traditional database software and other technical fields, and will invite a number of domestic top technical experts to share. On the basis of preserving the traditional theme of database software application practice, this Conference will expand to the fields of big data, data structure, data governance and analysis, business intelligence and so on, in order to satisfy the urgent needs of the majority of practitioners and industry users.
Since 2010, IT168, a leading domestic IT professional website, together with the ITPUB and ChinaUnix technical communities, has held four China Database Technology Conferences, each attracting more than a thousand attendees. The conference brings together the country's top data architects, database administrators and operations engineers, database developers, R&D directors, and IT managers, and is currently the most popular database technology exchange event in China. This year marks the fifth anniversary of the conference. Continuing its purpose of sharing best practices in IT applications, and organized around the two technical main lines of traditional databases and big data, the conference takes a deeper look at the current state and future direction of database technology against the backdrop of rapid change in IT technology and management, along with practical experience and lessons learned during this transformation.
Today is the second day of DTCC 2014. In session 3, a senior architect from Baifendian (Percent) shared an excellent talk on the evolution of Baifendian's in-memory database architecture.
▲ The speaker: a senior architect at Baifendian (Percent)
Data itself is no longer expensive today; what has become expensive is extracting value from massive amounts of data, and extracting that value in time is more expensive still. This is why real-time computing is gaining ever more attention in big data. In today's internet era, computing over massive amounts of data in real time has become possible.
The speaker outlined five major trends in in-memory computing today:
1. Data timeliness: time is money.
2. The need to process massive amounts of data.
3. Disk I/O becoming the bottleneck of parallel computing.
4. In every industry, the need to process and analyze data along different dimensions to meet a variety of business requirements.
5. The resulting demands placed on in-memory databases.
▲ The data pyramid of an internet company
In Baifendian's recommendation engine, personalized recommendations for the current user must be produced from a huge volume of data within a few hundred milliseconds. The traditional RDB + memcached approach is clearly not fast enough; only full in-memory computing can reach this level of efficiency. The recommendation engine and other applications therefore depend heavily on the in-memory database, placing high demands on its data reliability, high availability, and data consistency. Requirements on the in-memory database also differ significantly across scenarios. After several architectural revisions, the platform-level in-memory database has now stabilized.
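To make the latency argument concrete, here is a minimal sketch, with hypothetical class names, of the two read paths: a cache-aside store in the RDB + memcached style, where a cache miss falls through to the slow, disk-backed database, versus a fully in-memory table, where every read is a plain memory lookup. It illustrates the general pattern only, not Baifendian's actual code.

```python
# Hypothetical sketch: cache-aside (RDB + memcached style) vs. full in-memory.
class CacheAsideStore:
    """Fast on cache hits, but a miss falls through to disk-backed storage."""
    def __init__(self, db, cache):
        self.db = db        # slow, disk-backed store (stand-in for an RDB)
        self.cache = cache  # fast cache (stand-in for memcached)

    def get(self, key):
        value = self.cache.get(key)
        if value is None:              # cache miss: the slow path
            value = self.db[key]       # disk-backed lookup blows the latency budget
            self.cache[key] = value    # populate the cache for next time
        return value

class InMemoryStore:
    """Full in-memory table: every read is a memory lookup, no slow path."""
    def __init__(self, table):
        self.table = table

    def get(self, key):
        return self.table[key]
```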
▲ The architecture of the Baifendian recommendation engine (BRE)
▲ BRE real-time computing: Lambda architecture schematic
▲ BRE real-time computing based on the in-memory database
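The Lambda architecture in the figure can be reduced to a toy sketch. The counter below is purely illustrative (the class and field names are assumptions, not BRE's API): events are appended to an immutable master log and counted immediately in a real-time view; the batch layer periodically recomputes a batch view from the full log; and queries merge the two views.

```python
from collections import Counter

class LambdaCounter:
    """Toy Lambda architecture: batch layer + speed layer + merged queries."""
    def __init__(self):
        self.master_log = []            # immutable master dataset (batch input)
        self.batch_view = Counter()     # recomputed periodically from the full log
        self.realtime_view = Counter()  # incremental updates since the last batch run

    def ingest(self, event):
        self.master_log.append(event)   # append-only master dataset
        self.realtime_view[event] += 1  # speed layer: visible immediately

    def run_batch(self):
        self.batch_view = Counter(self.master_log)  # full recomputation
        self.realtime_view.clear()      # the batch view now covers everything

    def query(self, event):
        # Serving layer: merge the precomputed batch view with the real-time view.
        return self.batch_view[event] + self.realtime_view[event]
```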
The in-memory database is the core of BRE, which imposes the following requirements:
1. Real-time data updates: user behavior, user preferences, product information, recommendation algorithm results, cluster monitoring data ...
2. Massive data: more than ten data categories, entries on the order of a billion, terabytes of storage capacity.
3. High concurrency and high throughput: reads and writes on the order of 100,000 per second, gigabytes of data volume.
4. High reliability and high availability: data hardening (persistence), disaster recovery, and backup (see the sketch after this list).
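One common way to meet requirement 4 is to append every write to an on-disk log before applying it in memory, so the table can be rebuilt after a crash by replaying the log. The sketch below, with hypothetical names, is an illustration of that technique rather than BRE's implementation.

```python
import json

class DurableMemTable:
    """In-memory table hardened by an append-only log on disk."""
    def __init__(self, log_path):
        self.log_path = log_path
        self.table = {}

    def put(self, key, value):
        # Harden the write to disk first, then apply it in memory.
        with open(self.log_path, "a") as log:
            log.write(json.dumps({"k": key, "v": value}) + "\n")
        self.table[key] = value

    def recover(self):
        # Rebuild the in-memory table by replaying the log after a crash.
        self.table.clear()
        try:
            with open(self.log_path) as log:
                for line in log:
                    record = json.loads(line)
                    self.table[record["k"]] = record["v"]
        except FileNotFoundError:
            pass  # no log yet, nothing to replay
```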
▲ The evolution of the Baifendian in-memory database, showing the changes in the technologies applied
Limitations of the BRE 0.x in-memory database: the routing table had to be maintained by hand, which easily led to unbalanced load, high labor costs, and poor scalability.
Limitations of the BRE 1.x in-memory database:
1. memcached cannot serve as a database: it cannot persist data, cannot enumerate data, and offers poor control over data expiration.
2. Read-write separation makes the system complex.
3. A simple key-value model cannot meet the requirements: large values cause a network-card bottleneck, and data serialization/deserialization consumes system resources.
4. Scaling out is difficult: the virtual-node scheme required recalculating the distribution of all data (see the sketch after this list).
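The scale-out pain in point 4 is easy to demonstrate with generic hash-mod placement (an illustration, not BRE's routing code): adding a single node changes the home node of most keys, which is exactly the kind of whole-cluster data redistribution described above.

```python
import hashlib

def node_for(key, num_nodes):
    """Generic hash-mod placement: key -> node index."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_nodes

keys = [f"user:{i}" for i in range(100_000)]
before = {k: node_for(k, 4) for k in keys}  # 4-node cluster
after = {k: node_for(k, 5) for k in keys}   # scale out to 5 nodes
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys change node")  # roughly 80% must move
```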
The BDM in-memory database: eventual consistency
1. Asynchronous read-write model (Lambda architecture).
2. If the master dies before some data has been synchronized to the slave: that data is recovered by replaying it from the message queue; algorithm data is regenerated, so output continues; the slave is promoted to master, and the original master, once restored, rejoins as a slave.
3. If the slave dies, its data is resynchronized after it recovers (failover is sketched after this list).
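The failover behavior in points 2 and 3 can be sketched as follows; all names here are assumptions for illustration. Writes go through a durable message queue, the slave catches up asynchronously, and on master failure the slave is promoted and replays from the queue whatever it had not yet received.

```python
class ReplicatedStore:
    """Toy master/slave store whose source of truth is a durable message queue."""
    def __init__(self):
        self.queue = []                          # durable log of all writes
        self.master = {"data": {}, "applied": 0}
        self.slave = {"data": {}, "applied": 0}

    def write(self, key, value):
        self.queue.append((key, value))          # every write hits the queue first
        self._catch_up(self.master)              # the master applies it immediately

    def replicate(self):
        self._catch_up(self.slave)               # asynchronous slave catch-up

    def fail_over(self):
        # The master died before the slave was fully synced: promote the slave,
        # then replay the missing writes from the message queue. The restored
        # old master will later rejoin as the new slave.
        self.master, self.slave = self.slave, self.master
        self._catch_up(self.master)

    def _catch_up(self, node):
        for key, value in self.queue[node["applied"]:]:
            node["data"][key] = value
        node["applied"] = len(self.queue)

store = ReplicatedStore()
store.write("user:1", "prefers-sports")
store.fail_over()  # slave promoted; the un-synced write is replayed from the queue
assert store.master["data"]["user:1"] == "prefers-sports"
```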
Summary:
Baifendian's in-memory database work covers the system architecture, the processing flow, and the application of its real-time computing and data query frameworks. This talk focused on the architecture of the Baifendian recommendation engine (BRE), BRE's real-time computing based on the in-memory database, and the evolution of that framework, together with the methods and techniques commonly used in Baifendian's real-time computing algorithms. By continuously improving the data scale and processing efficiency of real-time computing, this work helps the business grow faster.