Table Store's Talk at QCon 2017


Last Update:2018-08-22 Source: Internet

Author: User

Tags: failover


In the infrastructure session at QCon 2017, the author presented the design considerations behind Table Store as a distributed system, focusing on scalability, availability, and performance. Each point was illustrated with a concrete example. This is a brief summary of that talk.

The talk began with the background of Table Store. Traditional databases cannot satisfy the demands of large-scale, weakly relational data that needs flexible schema changes, and NoSQL emerged as a good supplement. NoSQL is not meant to replace SQL, nor can it; it complements the existing database ecosystem. I think more kinds of databases will appear in the future, aimed at different workloads and different hardware, and the database market will welcome more members.

The talk then introduced the features, ecosystem, architecture, and data model of Table Store, which provide the basis for understanding the rest of the content.

For scalability, I cited an example. After a split, HBase needs to run a compaction before it can split again, and a compaction can take several hours, whereas Table Store supports continuous splitting. Why does Table Store support continuous splitting? The main reason is the difference between a multi-tenant service and a product deployed inside an enterprise. With Table Store, a user can open the service with a click of the mouse, and business traffic may grow sharply in a short time. Users will not tell us in advance, and even if they did, we would not have enough staff to handle every case by hand. A surge in traffic is likely to create hot spots within a partition, and the system must handle them quickly by splitting 1 partition into 2, 2 into 4, and so on. Within an enterprise, business growth can generally be anticipated, and a huge increase that operations did not expect is rare, so HBase has less need for continuous splitting. What looks like a difference in technology is really a difference in users and product form, which leads to different choices.
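
The 1-to-2, 2-to-4 splitting described above can be illustrated with a toy model. This is a minimal sketch, not Table Store's implementation: the `Partition` class, the `max_qps` threshold, and the assumption that load is spread evenly across a key range are all hypothetical simplifications made for the example.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    start: int    # inclusive start of the key range
    end: int      # exclusive end of the key range
    qps: float    # observed load on this key range


def split(p):
    """Split a partition at the midpoint of its key range.

    Assumption: load is uniform over the range, so each half
    inherits half of the parent's load.
    """
    mid = (p.start + p.end) // 2
    half = p.qps / 2
    return Partition(p.start, mid, half), Partition(mid, p.end, half)


def split_until_cool(partitions, max_qps):
    """Keep splitting hot partitions, with no waiting step between
    consecutive splits, until every partition is at or below max_qps."""
    done = []
    pending = list(partitions)
    while pending:
        p = pending.pop()
        if p.qps > max_qps and p.end - p.start > 1:
            pending.extend(split(p))   # children may be split again at once
        else:
            done.append(p)
    return sorted(done, key=lambda p: p.start)


if __name__ == "__main__":
    hot = [Partition(0, 1024, 8000.0)]
    for part in split_until_cool(hot, max_qps=1000.0):
        print(part)
```

In a system that must compact between splits, each pass of the loop would be separated by a long wait; the sketch only shows why removing that wait matters when load arrives without warning.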

For availability, one notable example is that Google Bigtable and open-source HBase aggregate logs at the worker layer to improve performance. The idea is easy to understand: the logs of multiple partitions are written together to the file system, which reduces file system IOPS and improves performance. However, this hurts availability considerably. When a machine fails over, the shared log file must be read and split by partition, each partition must replay its own portion of the log, and only then can the partitions serve requests again. This clearly lengthens the time that partitions are unavailable during a machine failover (consider who splits the log, and whether that step becomes a bottleneck). If the whole cluster restarts, or a switch failure takes down many machines at once, the impact on availability is significant. This is a trade-off between availability and performance. From the beginning of its design, Table Store chose availability: each partition has its own log file, which shortens failover time when a machine goes down. Does this mean lower performance? Yes, but we believed availability had higher priority and that the performance problem could be solved later, and we did eventually find a good way to solve it.
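
The recovery cost difference between a shared log and per-partition logs can be sketched as follows. This is an illustrative model only: the function names and the in-memory list representation of a log are assumptions for the example and do not correspond to Bigtable, HBase, or Table Store code.

```python
from collections import defaultdict


def split_shared_log(shared_log):
    """Split an interleaved log of (partition_id, record) entries by partition.

    With a shared log, this full scan has to finish before any partition
    that lived on the failed machine can start replaying.
    """
    by_partition = defaultdict(list)
    for partition_id, record in shared_log:
        by_partition[partition_id].append(record)
    return dict(by_partition)


def records_read_before_serving(shared_log, partition_logs, partition_id):
    """Return (shared_cost, dedicated_cost) for one partition.

    shared_cost: records scanned when all partitions share one log.
    dedicated_cost: records scanned when the partition has its own log.
    """
    shared_cost = len(shared_log)
    dedicated_cost = len(partition_logs[partition_id])
    return shared_cost, dedicated_cost


if __name__ == "__main__":
    shared = [("p1", "a"), ("p2", "b"), ("p1", "c"),
              ("p3", "d"), ("p2", "e"), ("p1", "f")]
    dedicated = split_shared_log(shared)
    print(records_read_before_serving(shared, dedicated, "p3"))
```

With per-partition logs, a lightly written partition such as `p3` only replays its own records; with a shared log, it waits for the whole file to be scanned and split.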

In the trade-off between availability and performance, Table Store chose availability and gave up some performance. But performance is obviously very important too, so we rethought the problem. The core idea in Bigtable and HBase is to aggregate writes in order to reduce IOPS, which improves performance. Does the aggregation have to happen at the table layer? Could it be pushed down to the distributed file system layer? The answer is yes, and the effect is even better, with more parties benefiting. The specific architecture is shown in the appendix. By pushing aggregation down to the file system, and by adding packet aggregation at the RPC layer, pipelined transmission, and other optimizations, we greatly improved performance and achieved a good balance between availability and performance.
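
One way to picture aggregation at the file system layer is a write path that batches appends from many logical logs into a single physical I/O while keeping each partition's log separate. The sketch below is a hypothetical model: the `AggregatingFileSystem` class, its methods, and the I/O counter are invented for illustration, and the source does not describe the real mechanism at this level of detail.

```python
class AggregatingFileSystem:
    """Toy file system layer that batches appends from many log files.

    Each partition still owns its own logical log (good for failover),
    but buffered appends are written out in one simulated physical I/O.
    """

    def __init__(self):
        self.pending = []    # buffered (log_name, record) pairs
        self.files = {}      # log_name -> list of durable records
        self.io_count = 0    # number of simulated physical writes

    def append(self, log_name, record):
        """Buffer a record destined for one partition's log."""
        self.pending.append((log_name, record))

    def flush(self):
        """Write every buffered record using a single simulated I/O."""
        if not self.pending:
            return
        self.io_count += 1
        for log_name, record in self.pending:
            self.files.setdefault(log_name, []).append(record)
        self.pending.clear()


if __name__ == "__main__":
    fs = AggregatingFileSystem()
    for i in range(2):
        for name in ("p1.log", "p2.log", "p3.log"):
            fs.append(name, f"{name}-{i}")
    fs.flush()
    print(fs.io_count, sorted(fs.files))
```

Six appends across three logs cost one simulated write here, and recovery of any single partition still touches only that partition's log.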

Next, we talked about how, as a platform, we learn from our users. The appendix gives the example of the primary key auto-increment column, built for message push systems. We have written several articles on this topic; see [2][3].
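
To show why an auto-increment primary key column suits message push, here is a minimal in-memory sketch. The `MessageTable` class and its methods are hypothetical and are not the Table Store SDK; the sketch also assumes consecutive sequence numbers per receiver, whereas a real service may only guarantee that values increase.

```python
from collections import defaultdict
from itertools import count


class MessageTable:
    """Toy table keyed by (receiver_id, seq), where seq is assigned
    by the table itself and increases within each receiver_id."""

    def __init__(self):
        self._rows = defaultdict(list)                   # receiver_id -> [(seq, body)]
        self._next_seq = defaultdict(lambda: count(1))   # per-receiver counter

    def put(self, receiver_id, body):
        """Store a message; the table, not the caller, picks seq."""
        seq = next(self._next_seq[receiver_id])
        self._rows[receiver_id].append((seq, body))
        return seq

    def read_after(self, receiver_id, last_seq):
        """Return messages the receiver has not seen yet, in order."""
        return [(s, b) for s, b in self._rows[receiver_id] if s > last_seq]


if __name__ == "__main__":
    table = MessageTable()
    table.put("alice", "hello")
    table.put("alice", "are you there?")
    table.put("bob", "ping")
    print(table.read_after("alice", 1))
```

Because the table assigns the ordering, writers need no coordination among themselves, and a client can resume from the last sequence number it saw.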


This article is an English version of an article originally written in Chinese on aliyun.com and is provided for information purposes only.
