The hottest topic in the database industry this week is MemSQL. Whether it is just "old wine in a new bottle" has sparked a lot of debate, and it has also raised the question of whether the product's technology or the DBA matters more. There are some introductions to MemSQL on the web, mostly drawn from the official documentation. Curt Monash, a prominent independent analyst in the database industry, has also published his views on MemSQL, which this article summarizes.
What exactly is MemSQL?
An in-memory relational database
Only a single-node version is currently available; transparent sharding and basic replication are under development and are expected to be released this fall
Supports a subset of SQL-92
Compatible with MySQL (apart from gaps in SQL coverage)
MemSQL performance
Read performance is about 10% worse than memcached.
Write performance is about 20% better than memcached.
A 64-core machine with 1/2 TB of RAM can reach 1.2 million inserts per second
Under the same conditions, 500 million records can be loaded within 20 minutes.
MemSQL the company
15 employees in total
Some customers have already put MemSQL into production
Two editions are currently generally available: a free edition with a capped amount of RAM (in GB) and an Enterprise Edition with no cap, which is priced by database size
Discussion of MemSQL centers on performance, including:
Data is organized in hash tables and skip lists. MemSQL believes skip lists scale very well across multiple cores
Query patterns are compiled into C++
MVCC, with no read locks
Lightweight write locks
Durability is tunable: you can run MemSQL fully durable, or you can set a buffer size that limits how much data you can afford to lose.
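To make the skip list idea concrete, here is a minimal single-threaded sketch in Python. It is not MemSQL's implementation: the class name `SkipList`, the `MAX_LEVEL` and `P` constants, and the optional `seed` argument are all assumptions made for illustration, and the multi-core behavior MemSQL is interested in is not demonstrated here.

```python
import random


class _Node:
    """One skip list entry with a forward pointer per level it takes part in."""
    __slots__ = ("key", "value", "forward")

    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        self.forward = [None] * level


class SkipList:
    """Illustrative ordered map backed by a skip list (hypothetical names)."""
    MAX_LEVEL = 16  # assumed cap on tower height
    P = 0.5         # assumed probability of promoting a node one level up

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = _Node(None, None, self.MAX_LEVEL)  # sentinel, holds no data
        self._level = 1

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < self.P:
            level += 1
        return level

    def insert(self, key, value):
        # update[i] is the rightmost node at level i whose key is < key.
        update = [self._head] * self.MAX_LEVEL
        node = self._head
        for i in range(self._level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        candidate = node.forward[0]
        if candidate is not None and candidate.key == key:
            candidate.value = value  # overwrite an existing key
            return
        level = self._random_level()
        self._level = max(self._level, level)
        new_node = _Node(key, value, level)
        for i in range(level):
            new_node.forward[i] = update[i].forward[i]
            update[i].forward[i] = new_node

    def get(self, key, default=None):
        node = self._head
        for i in range(self._level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        if node is not None and node.key == key:
            return node.value
        return default

    def items(self):
        # Level 0 links every node, so walking it yields keys in sorted order.
        node = self._head.forward[0]
        while node is not None:
            yield node.key, node.value
            node = node.forward[0]
```

An insert only rewires the few forward pointers next to the new node and never rebalances the whole structure, which is one commonly cited reason skip lists are attractive for concurrent use.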
In fact, precompiling queries is hardly an innovation, and MemSQL is not the only one in the industry doing it. Earlier efforts, including QlikTech, StreamBase and ParAccel, have all done similar things. What characterizes MemSQL's approach is:
Queries are compiled into C++ that you can read if you want to.
Parameterized: if a query includes parameters, the compiled form is stored so that it can be run again later with other parameter values
Persistent: stored compiled queries are not lost if the server goes down
Each compiled query takes up only a few kilobytes, and early MemSQL customers have not stored more than a few thousand query patterns. So MemSQL is quite relaxed about the cost of keeping these compiled queries, and has probably not yet considered freeing space with something like an LRU algorithm.
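The parameterization described above can be sketched as follows. This is a toy model in Python, not MemSQL's code: `parameterize`, `PlanCache` and `compile_fn` are hypothetical names, the literal-matching regular expression is a simplification, and a plain Python callable stands in for the generated C++. Persistence of the cache across restarts is not shown.

```python
import re

# Matches single-quoted strings (with '' as an escaped quote) and bare numbers.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def parameterize(sql):
    """Replace literals with '?' and return (template, extracted parameters)."""
    params = []

    def _sub(match):
        text = match.group(0)
        if text.startswith("'"):
            params.append(text[1:-1].replace("''", "'"))
        elif "." in text:
            params.append(float(text))
        else:
            params.append(int(text))
        return "?"

    return _LITERAL.sub(_sub, sql), params


class PlanCache:
    """Compile each query template once and reuse it for later parameter values."""

    def __init__(self, compile_fn):
        self._compile = compile_fn  # assumed: template -> callable(params)
        self._plans = {}
        self.compilations = 0       # counts how often compile_fn really ran

    def execute(self, sql):
        template, params = parameterize(sql)
        plan = self._plans.get(template)
        if plan is None:
            plan = self._compile(template)
            self._plans[template] = plan
            self.compilations += 1
        return plan(params)


# Usage: two queries that differ only in a literal share one compiled plan.
cache = PlanCache(lambda template: (lambda params: (template, tuple(params))))
first = cache.execute("SELECT * FROM users WHERE id = 42")
second = cache.execute("SELECT * FROM users WHERE id = 7")
```

Because the cache key is the template rather than the full query text, the number of stored plans grows with the number of distinct query shapes, not with the number of distinct literal values, which fits the "no more than a few thousand query patterns" observation.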
For persistence, MemSQL writes a write-ahead log to disk (spinning disk or SSD) while also sending snapshots to other disks. Durability is designed to be continuous, but how this behaves in the fully durable case is not clear.
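As a rough illustration of a write-ahead log with tunable durability, the Python sketch below buffers log records and forces them to disk once the buffer exceeds a limit. Everything here is an assumption for illustration (the `WriteAheadLog` class, the `buffer_limit` parameter, the JSON-lines record format); it says nothing about MemSQL's actual log format, and snapshots are left out.

```python
import json
import os


class WriteAheadLog:
    """Append-only log; buffer_limit bounds how many bytes may be unflushed."""

    def __init__(self, path, buffer_limit=0):
        self._buffer_limit = buffer_limit  # 0 means every append is synced
        self._pending = []
        self._pending_bytes = 0
        self._file = open(path, "ab")

    def append(self, key, value):
        record = (json.dumps({"k": key, "v": value}) + "\n").encode("utf-8")
        self._pending.append(record)
        self._pending_bytes += len(record)
        if self._pending_bytes > self._buffer_limit:
            self.flush()

    def flush(self):
        if not self._pending:
            return
        self._file.write(b"".join(self._pending))
        self._file.flush()
        os.fsync(self._file.fileno())  # ask the OS to push the data to the device
        self._pending.clear()
        self._pending_bytes = 0

    def close(self):
        self.flush()
        self._file.close()


def replay(path):
    """Rebuild the in-memory state by applying logged records in order."""
    state = {}
    if not os.path.exists(path):
        return state
    with open(path, "rb") as f:
        for line in f:
            record = json.loads(line)
            state[record["k"]] = record["v"]
    return state
```

With `buffer_limit=0` every write is synced before `append` returns, which corresponds to the fully durable mode; a larger limit trades a bounded window of possible data loss for fewer disk syncs.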
Some of the other notable technical details include:
MemSQL can be run in a multi-temperature setup, controlled manually through DDL. In other words, newer data is put into MemSQL and older data is kept in MySQL
One thing I consider a best practice, and which the MemSQL team also upholds: a write should be acknowledged as committed only once it has been confirmed in the RAM of two or more servers
Parallel GROUP BY, which the MemSQL team is proud of
MemSQL does not compress data; the team believes data compression is something OLAP workloads need
MemSQL's insert performance is very high, so its target customers are users of systems with frequent transactions
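The parallel GROUP BY mentioned in the list above usually follows a partition, partial-aggregate, merge pattern. The Python sketch below shows only that structure; `parallel_group_sum` and its `workers` parameter are hypothetical names, the aggregate is assumed to be a SUM, and because CPython threads share one interpreter lock the example illustrates the shape of the algorithm rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def _partial_sum(rows):
    """Aggregate one partition of (group_key, value) rows."""
    acc = {}
    for key, value in rows:
        acc[key] = acc.get(key, 0) + value
    return acc


def parallel_group_sum(rows, workers=4):
    """Equivalent in spirit to SELECT key, SUM(value) ... GROUP BY key."""
    rows = list(rows)
    # Ceiling division so that at most `workers` partitions are created.
    chunk = max(1, -(-len(rows) // workers))
    partitions = [rows[i:i + chunk] for i in range(0, len(rows), chunk)]
    total = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker aggregates its own partition; results are merged here.
        for partial in pool.map(_partial_sum, partitions):
            for key, value in partial.items():
                total[key] = total.get(key, 0) + value
    return total
```

The merge step is cheap when the number of groups is small compared with the number of rows, which is the case where splitting the scan across cores pays off most.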