Distributed data storage, explained in plain language


What's wrong with a relational database?

As many people know, relational database (RDB) technology has been around since the 1970s and by the end of the 1990s had become the de facto standard for structured storage. RDBs have excelled at highly consistent transactional workloads for decades and remain robust. Over time this venerable technology has acquired new capabilities in response to customer needs, such as BLOB storage, XML/document storage, full-text indexing, in-database code execution, data warehousing built on star schemas, and geospatial extensions.

As long as everything could be squeezed into a relational data model and fit on a single machine, it could be implemented in a relational database.

However, the commercialization of the Internet changed everything, and relational databases could no longer satisfy every storage need. Availability, performance, and scale became just as important as consistency, and sometimes more important.

Performance has always mattered, but the commercialization of the Internet changed the scale. It turned out that achieving performance at that scale required skills and techniques that were simply not needed in the pre-Internet era. Relational databases are built around the concept of ACID (atomicity, consistency, isolation, and durability), and the simplest way to achieve ACID is to keep everything on a single machine. The traditional way to scale an RDB is therefore vertical scaling (scale-up), or, in plain words, getting a bigger machine.

Uh-oh, I think I need a bigger machine.

Buying a bigger machine worked well until the Internet started generating loads that no single machine could handle. That forced engineers to come up with ingenious techniques to overcome the limits of one machine. There are many different approaches, each with its advantages and disadvantages: primary/secondary replication, clustering, table federation and partitioning, and horizontal partitioning (sharding, which can be considered a special case of partitioning).

Another factor driving the growth of data storage options is availability.

Pre-Internet systems typically served users inside one organization, so downtime could be planned for non-working hours, and even unplanned outages had limited impact. The commercialization of the Internet changed this as well: now everyone with Internet access is a potential user, so unplanned downtime can have a far greater impact, and the global nature of the Internet makes it hard to find non-working hours in which to schedule planned maintenance.

I have already explored the role of redundancy in achieving high availability. Applied to the data storage tier, however, redundancy presents a set of new and interesting challenges. The most common way to add redundancy at the database tier is the primary/secondary configuration.

This seemingly simple setup differs enormously from a traditional single-machine relational database: we now have multiple machines separated by a network. When a write occurs, we have to decide when to consider it complete: as soon as it is saved to the primary, or only once it has also been saved to the secondary (or to n secondaries, if we want even higher availability; to see how each additional machine affects overall availability, see the first part of this blog series). If we decide that reaching the primary is enough, we risk losing data should the primary fail before it has replicated. If we decide to wait until replication completes, we accept the cost of added latency. And in the rare case that a secondary is down, we have to decide whether to keep accepting writes or to reject them.

So, from a world where consistency was the default, we enter a world where consistency is a choice. In this world we can choose to accept so-called eventual consistency, in which state is replicated across multiple nodes but not every node has a complete view of that state at all times. In our example configuration, if we consider a write complete once it reaches the primary (or the primary and any one secondary, but not necessarily both secondaries), we have chosen eventual consistency: eventually every write is replicated to every secondary, but at any given moment we cannot guarantee that a query against a secondary will see all the writes made up to that point.
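To make that concrete, here is a minimal sketch in Python of asynchronous primary/secondary replication. It uses a toy in-memory model rather than any real replication protocol, and the class and method names are invented for illustration: a write is acknowledged as soon as the primary has it, and a read served by a lagging secondary can return stale data.

```python
class PrimarySecondaryStore:
    """Toy model of asynchronous (eventually consistent) replication.

    Illustration only: real systems replicate over the network; here the
    `pending` list simulates replication lag.
    """

    def __init__(self, num_secondaries=2):
        self.primary = {}
        self.secondaries = [{} for _ in range(num_secondaries)]
        self.pending = []  # writes accepted by the primary, not yet replicated

    def write(self, key, value):
        # Acknowledge as soon as the primary has the value (async replication).
        self.primary[key] = value
        self.pending.append((key, value))
        return "ack"  # the caller considers the write "done" here

    def replicate(self):
        # Background replication: apply pending writes to every secondary.
        for key, value in self.pending:
            for secondary in self.secondaries:
                secondary[key] = value
        self.pending.clear()

    def read_from_secondary(self, key, index=0):
        # Reads served by a secondary may lag behind the primary.
        return self.secondaries[index].get(key)


store = PrimarySecondaryStore()
store.write("user:42", "premium")
print(store.read_from_secondary("user:42"))  # None: stale read before replication
store.replicate()
print(store.read_from_secondary("user:42"))  # "premium": eventually consistent
```

Until replicate() has run, a client reading from a secondary still sees the old state; that window of staleness is exactly what "eventually consistent" means in practice.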

The CAP theorem

To sum up, when a data store is replicated or partitioned, its state becomes distributed across machines. This means we leave the comfort of ACID territory and enter the brave new world of CAP. The CAP theorem was introduced by Dr. Eric Brewer of UC Berkeley in 2000. In its simplest form it says: a distributed system must trade off between consistency, availability, and partition tolerance, and it can deliver only two of the three.

The CAP theorem extended the discussion of data storage beyond ACID and stimulated the birth of many non-relational database technologies. A decade or so after presenting it, Dr. Brewer published a clarification: the original "pick two of three" formulation was a deliberate oversimplification, meant to open up the discussion and help move beyond ACID. That oversimplification has, however, led to numerous misunderstandings. In a more nuanced reading of CAP, all three dimensions should be understood as ranges rather than booleans. Moreover, a distributed system spends most of its time in non-partitioned mode, where the trade-off to make is between consistency and performance/latency. In the rare case where a network partition actually occurs, the system must choose between consistency and availability.

Returning to our primary/secondary example: if we decide that a write is complete only once the data has been replicated everywhere (synchronous replication), we are choosing consistency at the cost of write latency. If, on the other hand, we decide that a write is complete as soon as the data is saved to the primary, with replication happening in the background (asynchronous replication), we are choosing performance at the cost of consistency.
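To put a rough shape on the latency side of that choice, the sketch below parameterizes the same write path by how many secondary acknowledgments to wait for before returning. The per-replica delay is made up purely for illustration; a real system replicates over the network and lets the remaining replicas catch up in the background.

```python
import time

REPLICA_DELAY_S = 0.05  # assumed replication round trip per secondary (made up)

def replicated_write(primary, secondaries, key, value, acks_required):
    """Write to the primary, then wait for `acks_required` secondaries to confirm.

    acks_required=0 is effectively asynchronous replication: the call returns
    quickly, but an acknowledged write can still be lost if the primary dies.
    acks_required=len(secondaries) is synchronous replication: slower, but the
    data exists everywhere before the caller is told the write is done.
    """
    start = time.monotonic()
    primary[key] = value
    for secondary in secondaries[:acks_required]:
        time.sleep(REPLICA_DELAY_S)      # simulate the replication round trip
        secondary[key] = value           # this replica has acknowledged the write
    return time.monotonic() - start      # latency observed by the writer

primary, secondaries = {}, [{}, {}]
print("async latency:", replicated_write(primary, secondaries, "k", "v", acks_required=0))
print("sync  latency:", replicated_write(primary, secondaries, "k", "v", acks_required=2))
```

Some stores expose this knob directly; in Cassandra, for example, the per-request consistency level (ONE, QUORUM, ALL) controls how many replicas must acknowledge a write before it is reported successful.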

When a network partition occurs, a distributed system enters a special partitioned mode in which the trade-off is between consistency and availability. Back to our example: after losing the connection to the primary, the secondaries may continue to serve queries, which chooses availability at the expense of consistency. Alternatively, we can decide that a primary which has lost its connection to the secondaries should stop accepting writes, which chooses consistency at the expense of availability. In the era of the commercial Internet, choosing consistency usually means losing revenue, so many systems choose availability. When such a system returns to normal, it can enter a recovery mode in which the accumulated inconsistencies are resolved and replicated.

While we are on the subject of recovery, it is worth mentioning a distributed storage configuration called primary-primary (or active-active). In this setup, writes can be sent to multiple nodes, which then replicate to each other. In such a system even the normal mode becomes complicated: if two updates to the same data arrive at roughly the same time on two different primaries, how do you reconcile them? Worse, if such a system has to recover from a partitioned state, things get messier still. There are viable primary-primary configurations, and some products make them easier, but my advice is to avoid them unless absolutely necessary. There are many ways to achieve a good balance of performance and availability without paying the heavy complexity cost of a primary-primary setup.
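To see why reconciliation is awkward, here is a hedged sketch of one common but lossy strategy, last-write-wins by timestamp. The record shape and node names are invented for illustration, and real systems may instead rely on vector clocks or CRDTs to avoid silently dropping an update.

```python
from dataclasses import dataclass

@dataclass
class VersionedValue:
    value: str
    timestamp: float   # wall-clock time of the write (assumed roughly synchronized)
    node_id: str       # which primary accepted the write

def last_write_wins(a: VersionedValue, b: VersionedValue) -> VersionedValue:
    """Reconcile two concurrent updates to the same key.

    Last-write-wins is simple but discards one of the updates, and it depends
    on clocks that may be skewed; the node_id tie-break only makes the choice
    deterministic, not "correct".
    """
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    return a if a.node_id > b.node_id else b

# Two primaries accepted different values for the same key at almost the same time.
from_node_a = VersionedValue("email=old@example.com", 1700000000.120, "node-a")
from_node_b = VersionedValue("email=new@example.com", 1700000000.118, "node-b")
print(last_write_wins(from_node_a, from_node_b))  # node-a's write wins; node-b's is lost
```

Even with a deterministic tie-break, one of the two updates is simply thrown away, which is exactly the kind of complexity the paragraph above warns about.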

A common pattern in many modern data stores

A common way to get a good mix of performance, scale, and availability is to combine partitioning and replication into a single configuration (or pattern), sometimes called a partitioned replica set.

Hadoop, Cassandra, and MongoDB clusters all essentially fit this pattern, as do many AWS data services. Let's look at some common characteristics of a partitioned replica set:

Data is partitioned (that is, split) across multiple nodes or clusters of nodes. No single partition holds all the data. A single write is sent to exactly one partition. Multiple writes may be sent to multiple partitions and should therefore be independent of one another. Complex, transactional, multi-record writes (which may span several partitions) should be avoided, because they can affect the whole system.

The load a single partition can handle is a potential bottleneck. If a partition reaches its maximum throughput, adding more partitions and splitting the traffic across them relieves the pressure. You therefore scale this type of system by adding more partitions.

A partition key is used to assign data to partitions. The partition key must be chosen carefully so that reads and writes are spread as evenly as possible across all partitions. If reads and writes concentrate on a single partition, they can exceed that partition's throughput and drag down the performance of the whole system while the other partitions sit underutilized. This is known as the "hot partition" problem.
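As a minimal illustration of partition-key routing, the sketch below hashes each key to one of four partitions and shows how a skewed key choice produces a hot partition. The modulo-on-a-hash scheme and the key names are assumptions made for clarity; many real systems prefer consistent hashing so that adding partitions does not reshuffle every key.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 4

def partition_for(key: str) -> int:
    # Hash the partition key and map it to one of the partitions.
    # Simple modulo is used for clarity; adding a partition would remap most
    # keys, which is one reason real systems often use consistent hashing.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

# A well-chosen key (for example a user id) spreads traffic evenly...
even = Counter(partition_for(f"user-{i}") for i in range(10_000))
# ...while a poorly chosen key (for example today's date on every record)
# sends every write to the same partition: a hot partition.
skewed = Counter(partition_for("2024-01-01") for _ in range(10_000))

print("even key distribution  :", dict(even))
print("skewed key distribution:", dict(skewed))
```

With the evenly distributed key, each of the four partitions receives roughly a quarter of the traffic; with the skewed key, one partition absorbs all of it.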

Data is replicated across multiple hosts. Each partition may be its own completely separate replica set, or multiple replicas may share the same set of hosts. The number of copies kept of each piece of data is commonly called the replication factor.

Such a configuration has high availability built in: because data is replicated to multiple hosts, the failure of fewer hosts than the replication factor should not, in theory, affect the availability of the system as a whole.
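As a back-of-the-envelope illustration of that claim, here is what the replication factor does to the chance that every copy of an item is down at once. The 99% per-host availability figure is invented, and the calculation assumes hosts fail independently, which correlated failures in real clusters violate; this is the same redundancy effect on availability referred to earlier in this series.

```python
def data_unavailability(host_availability: float, replication_factor: int) -> float:
    """Probability that every replica of an item is down simultaneously,
    assuming hosts fail independently (a simplifying assumption)."""
    host_failure = 1.0 - host_availability
    return host_failure ** replication_factor

for rf in (1, 2, 3):
    p_unavailable = data_unavailability(host_availability=0.99, replication_factor=rf)
    print(f"replication factor {rf}: item available {(1 - p_unavailable):.6%} of the time")
# With 99% per-host availability: rf=1 -> 99%, rf=2 -> 99.99%, rf=3 -> 99.9999%
```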

All of these benefits, the built-in scalability and high availability, come at a cost: this is no longer your Swiss Army knife, the standalone relational database management system (RDBMS). It is a complex system with many moving parts to manage and parameters to fine-tune. Expertise is required to set up, configure, and maintain it, along with infrastructure monitoring and alerting to keep it healthy. You can certainly run it yourself, but it is not easy, and you are unlikely to master it overnight.

The rich variety of data stores can make choosing harder, but it is actually a good thing. We just need to move beyond the traditional idea of one data store for the whole system and adopt a mindset of using multiple data stores, each serving the workloads it suits best. For example, we might use the following combination:

A high-performance ingestion queue for incoming clickstream traffic

A Hadoop-based clickstream processing system

Cloud object storage for low-cost, long-term storage of compressed daily clickstream summaries

A relational database holding metadata used to enrich the clickstream data

A data warehouse cluster for analytics

A search cluster for natural-language queries

All of these can be parts of a single system, such as a web analytics platform.

Summary

The commercialization of the Internet created demands for scale and availability that the Swiss Army knife of storage, the standalone RDBMS, can no longer meet.

Adding scale and redundancy to data storage increases system complexity, makes ACID harder to guarantee, and forces us to weigh trade-offs in terms of the CAP theorem, which in turn creates many interesting opportunities for optimization and specialization.

Use multiple data stores in your system, each serving the workloads it suits best.

Modern data stores are complex systems that require specialized knowledge and carry real management overhead.

 
