The following content is provided by [Wu Si Chen Academy of Sciences].
Three years ago, an article analyzing two particularly representative distributed storage design ideas was published on InfoQ: http://www.infoq.com/cn/articles/nosql-dynamo. Three years on, let's look at what has changed.
In fact, over these three years, two things have become about as hot as they can get.
I. Features of Riak, the representative of the Dynamo camp
A few years ago, Cassandra was the representative project of this camp. The camp's features are well established: horizontal scaling, no central node, multiple replicas, eventual consistency, mediocre performance, and a good fit for massive data. Because Cassandra has accumulated so many failure cases in the industry, we can skip it here. Over the past two years, Riak, written in Erlang, has come to the fore.
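To make "no central node" and "multiple replicas" concrete, here is a minimal sketch of a consistent-hash ring in Python. It is an illustration only, not the code of Riak or Cassandra: the `Ring` class, the `preference_list` method, the choice of MD5, and the default of 8 virtual nodes and 3 replicas are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (a 128-bit integer from MD5)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical consistent-hash ring; not the API of Riak or Cassandra."""

    def __init__(self, nodes, vnodes_per_node=8, replicas=3):
        self.replicas = replicas
        self.nodes = set(nodes)
        # Each physical node owns several points (virtual nodes) on the ring.
        self._points = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in self.nodes
            for i in range(vnodes_per_node)
        )
        self._hashes = [h for h, _ in self._points]

    def preference_list(self, key):
        """Walk clockwise from the key's position, collecting distinct nodes."""
        wanted = min(self.replicas, len(self.nodes))
        start = bisect.bisect_right(self._hashes, _hash(key))
        owners = []
        for offset in range(len(self._points)):
            if len(owners) == wanted:
                break
            _, node = self._points[(start + offset) % len(self._points)]
            if node not in owners:
                owners.append(node)
        return owners


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.preference_list("user:42"))  # three distinct nodes that hold the key
```

Any machine that knows the node list can compute the same answer locally, which is why no central lookup node is needed in this style of design.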
1.1 Erlang
Erlang is Riak's most distinctive feature: the language's particular strengths in distributed systems keep the Riak source code concise and clean. When I first read through its ten-thousand-odd lines of code, I could not help lamenting how foolish it was that, a few years earlier, I had been piling up tens of thousands of lines of Java just for the core code.
1.2 A complete Dynamo implementation
In the Cassandra era, many things were awkward to implement. Vector clocks for version control were replaced with timestamps. A vnode in Cassandra was a very large block, so load balancing could end up uneven. In the Riak era, with Erlang's support, all of these features have been implemented in full, and more have been added: 1. HTTP access. 2. Bidirectional index. 3. Search. 4. MapReduce (m/r).
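The difference between vector clocks and timestamps can be shown with a small Python sketch. With plain timestamps, two clients that update the same value at nearly the same time are resolved by "latest wins", and one update is silently lost. A vector clock keeps one counter per node, so the store can tell that neither version descends from the other. The function names below (`increment`, `descends`, `compare`, `merge`) are hypothetical and chosen for this illustration; they are not Riak's API.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every event that clock b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify version a against b: 'equal', 'after', 'before', or 'concurrent'."""
    a_covers_b, b_covers_a = descends(a, b), descends(b, a)
    if a_covers_b and b_covers_a:
        return "equal"
    if a_covers_b:
        return "after"
    if b_covers_a:
        return "before"
    return "concurrent"


def merge(a, b):
    """Clock for a value that reconciles both versions (per-node maximum)."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


v1 = increment({}, "node-a")   # first write, coordinated by node-a
v2 = increment(v1, "node-b")   # one client updates v1 through node-b
v3 = increment(v1, "node-c")   # another client updates v1 through node-c
print(compare(v2, v1))         # after
print(compare(v2, v3))         # concurrent
```

When two versions are concurrent, the sketch leaves the decision to the caller: both siblings can be handed back to the application, and the reconciled value is then written with the merged clock.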
II. Features of HBase, the representative of the Bigtable camp
Bigtable, the counterpart to Dynamo, has a longer history. Its open-source project has been under way for many years, and the HBase community keeps improving it.
2.1 Dependency on HDFS
Strictly speaking, HBase itself implements only the region server layer, together with the central node that assigns data locations. Replication is therefore left to HDFS.
2.2 Columns
Built on top of HDFS, HBase's column-oriented storage format lets it meet a wide variety of requirements. Of course, interaction this complex runs best on fast storage media such as SSDs.
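The article does not spell out the storage format. As a rough illustration, assuming the Bigtable-style model that HBase is generally described as following (rows sorted by key, columns named "family:qualifier", and timestamped versions per cell), a toy in-memory version in Python might look like this. `ToyTable` and its methods are hypothetical and are not the HBase client API.

```python
from collections import defaultdict


class ToyTable:
    """Hypothetical in-memory model of a Bigtable-style table; not the HBase API."""

    def __init__(self):
        # row key -> "family:qualifier" -> list of (timestamp, value), newest first
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, value, timestamp):
        """Store one version of a cell; older versions are kept."""
        versions = self._rows[row][column]
        versions.append((timestamp, value))
        versions.sort(key=lambda version: version[0], reverse=True)

    def get(self, row, column):
        """Return the newest value of a cell, or None if the cell is absent."""
        if row not in self._rows or column not in self._rows[row]:
            return None
        return self._rows[row][column][0][1]

    def scan(self, start_row, stop_row):
        """Yield (row, {column: newest value}) for start_row <= row < stop_row."""
        for row in sorted(self._rows):
            if start_row <= row < stop_row:
                yield row, {col: vers[0][1] for col, vers in self._rows[row].items()}


table = ToyTable()
table.put("user1", "info:name", "Ann", timestamp=1)
table.put("user1", "info:name", "Anna", timestamp=2)   # newer version of the same cell
table.put("user2", "info:email", "b@example.com", timestamp=1)
print(table.get("user1", "info:name"))
print(list(table.scan("user1", "user3")))
```

Rows only store the columns they actually have, which is what makes this kind of table sparse: "user2" has no "info:name" cell at all, rather than an empty one.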
III. Development direction and features
After reviewing the features of the two camps, let's look at the future.
3.1 The MySQL age
Hire a bunch of MySQL DBAs and put them to work!
3.2 The NoSQL age
Development engineers, sympathizing with the DBAs' hard work and with the boss who cannot hire enough DBAs, decide to restructure the data and simplify the data structures in the code.
The typical result is a key-value system.
Then, on top of these simple structures, a set of functions is built to move data between machines automatically. Riak belongs in this category. HBase sits slightly above it.
3.3 The future
It is not only storage: all operations and maintenance (O&M) work should evolve toward automation. Picture a sunny afternoon: an engineer, headphones on and music playing, enters the requirement model and clicks one red button. The code gets written, tests start automatically, then A/B testing and staging. Everything checks out, and the release is deployed everywhere automatically. Five minutes after launch, an alarm goes off somewhere. The central controller automatically works out how to add machines and carries out the addition.
-- Written in the days when 32-bit servers have become obsolete.
If you want to reach the author quickly, you can also leave a message on Twitter: @54chen
Or, if you can't be bothered to climb over the firewall, head to Sina Weibo: @54chen
Original article: From Distributed Storage Design to Automated O&M. Thanks to the original author for sharing.