Reprinted: http://tech.it168.com/a2011/0815/1232/000001232720_all.shtml
In a previous article, Redis Quick Start: Introduction to Key-Value Storage Systems, we introduced key-value storage. Today we explain further why one would choose a key-value store. Key-value stores are a popular topic nowadays, especially for building large Internet applications such as search engines, IM, P2P, game servers, and SNS, and for providing cloud computing services. How to ensure high performance, high reliability, high scalability, high availability, and low cost in a massive-data environment has become the focus of system architecture, and removing the performance bottleneck of database servers is the biggest challenge.
Why do so many Internet companies choose a key-value store? There are two main reasons:
1. Large-scale Internet applications
For Internet enterprises such as Google and eBay, countless users are using their services at every moment. These services generate a large data throughput, and thousands of concurrent connections operate on the database at the same time. In this case, a single server or a handful of servers is far from enough to meet the data processing requirements. Simply upgrading server hardware (scale-up) is not enough either, so the only way forward is to scale out. There are many scale-out methods, but they fall roughly into two categories. The first keeps the RDBMS and deploys the database across a cluster by cutting it vertically and horizontally. Its advantage is that it relies on familiar RDBMS technology. Its disadvantage is that the cutting is application-specific: different applications require different ways of cutting the data.
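To make the idea of vertical and horizontal cutting concrete, here is a minimal sketch of a query router. The table names, database names, and the choice of user ID as the shard key are all assumptions made for illustration; they are not from the original article.

```python
# Hypothetical database names; a real deployment would hold connection settings.
VERTICAL_MAP = {"users": "db_users", "messages": "db_messages"}
ORDER_SHARDS = ["db_orders_0", "db_orders_1", "db_orders_2", "db_orders_3"]


def route(table, user_id=None):
    """Return the database that should serve a query for this table and user."""
    if table == "orders":
        # Horizontal cut: rows of one table are spread across shards by user ID.
        if user_id is None:
            raise ValueError("orders queries need a user_id to pick a shard")
        return ORDER_SHARDS[user_id % len(ORDER_SHARDS)]
    # Vertical cut: each remaining table lives in its own database.
    return VERTICAL_MAP[table]


print(route("users"))        # db_users
print(route("orders", 10))   # db_orders_2
```

The routing rules are hard-coded around this one application's tables and access patterns, which is the application-specific drawback described above.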
The second category is Google's approach, which discards the RDBMS and uses key-value storage. This greatly enhances the scalability of the system: if the data to be processed keeps growing, simply add more machines. In fact, key-value storage gradually came to people's attention through the publication of the Bigtable paper and other related papers.
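One common way to let a key-value store grow by adding machines is to place keys on nodes by hashing. The sketch below uses a consistent-hash ring; this is an illustrative technique chosen here, not something the article attributes to Google or Bigtable, and the node names and virtual-point count are assumptions.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a stable integer (md5 is used only to spread keys)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring; node names and replica count are illustrative."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self._ring = []  # sorted list of (hash, node) points
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node gets several virtual points so keys spread evenly.
        for i in range(self.replicas):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first point at or after the key's hash.
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")  # scale out by adding one machine
after = {k: ring.node_for(k) for k in keys}
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys moved to the new node")
```

Only the keys that now belong to the new node change owners; the rest stay where they were, so adding a machine does not require reshuffling the whole data set.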
2. Cloud storage
If the previous problem still has an alternative solution (database cutting), then for cloud storage the key-value store may be the only solution. In short, cloud storage means building a large storage platform for others to use, which means the applications running on it are outside the vendor's control. If a customer's application grows along with its user base, the cloud storage vendor cannot scale by cutting the database, because the data belongs to the customer and the vendor cannot cut data it does not understand. In this case the key-value store is the only option, because scaling under these conditions must happen automatically, without human intervention. This is why almost all existing cloud storage takes the key-value form. For example, the underlying implementation of Amazon's SimpleDB is key-value, and Google App Engine uses Bigtable as its storage format. The only exception may be Microsoft's solution. At the QCon conference, I heard that the next version of the Microsoft Azure platform will launch RDBMS-based cloud storage, but I am still skeptical about this.
The defining feature of a key-value store is its scalability, which is also its biggest advantage. In my opinion, scalability has two aspects. On the one hand, a key-value store can hold massive amounts of data: its distributed architecture means that the more machines there are, the more data can be stored. On the other hand, it supports a large number of concurrent queries. A few hundred concurrent queries can put an RDBMS under heavy strain, while a key-value store can easily support thousands. The following briefly lists some of its features:
● Key-Value Store: a key-value data storage system that supports only a few basic operations, such as set(key, value) and get(key);
● Distributed: multiple machines (nodes) store data and state at the same time, exchange messages with each other to keep the data consistent, and can be viewed together as one complete storage system;
● Data Consistency: data on all machines is updated synchronously, so there is no need to worry about inconsistent results;
● Redundancy: all machines (nodes) store the same data, so the storage capacity of the entire system depends on the capacity of a single machine (node);
● Fault Tolerance: if a few nodes fail, for example through a restart, a server crash, a network disconnection, or packet loss, these faults do not affect the operation of the system as a whole;
● High Reliability: fault tolerance and redundancy together ensure the reliability of the database system.
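The features listed above can be sketched in a few lines. This is a toy in-memory model under simple assumptions (every write is copied synchronously to all live nodes, and a failed node is merely flagged as down); the class and node names are hypothetical and do not come from any real system.

```python
class Node:
    """One machine holding a full copy of the data (hypothetical name)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class ReplicatedStore:
    """Toy store: every write goes to all live nodes, reads use any live node."""

    def __init__(self, nodes):
        self.nodes = nodes

    def _live(self):
        live = [n for n in self.nodes if n.alive]
        if not live:
            raise RuntimeError("no live nodes")
        return live

    def set(self, key, value):
        # Synchronous update of every live replica.
        for node in self._live():
            node.data[key] = value

    def get(self, key):
        # Any live replica can answer because all hold the same data.
        return self._live()[0].data.get(key)


store = ReplicatedStore([Node("n1"), Node("n2"), Node("n3")])
store.set("greeting", "hello")
store.nodes[0].alive = False   # simulate one node failing
print(store.get("greeting"))   # still served by a surviving replica
```

The sketch leaves out what a real system must also handle, such as bringing a recovered node back in sync with the writes it missed.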
3. Redis case studies
Currently, the world's largest Redis user is Sina Weibo. Sina runs Redis on more than 200 physical machines and over 400 ports, and more than 4 GB of data is held in Redis to serve Weibo users.
Sina Weibo deploys Redis in many scenarios, which fall roughly into the following two types.
In the first, the application accesses the Redis database directly.
In the second, the application accesses Redis directly and only accesses MySQL when the Redis access fails.
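The second deployment type can be sketched as a read path that tries the cache first and falls back to the database. The sketch reads "fails" as a cache miss, which is an assumption. `InMemoryCache` is a stand-in that models only `get` and `set`, the key format is invented, and `load_profile_from_db` is a hypothetical placeholder for a MySQL query; none of this is Sina's actual code.

```python
import json


class InMemoryCache:
    """Stand-in for a Redis client; only get/set are modeled."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def load_profile_from_db(user_id):
    """Hypothetical placeholder for a MySQL query."""
    return {"id": user_id, "name": f"user{user_id}"}


def get_profile(cache, user_id, db_loader=load_profile_from_db):
    key = f"profile:{user_id}"           # illustrative key format
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)        # cache hit: the database is not touched
    profile = db_loader(user_id)         # cache miss: fall back to the database
    cache.set(key, json.dumps(profile))  # store it so the next read is a hit
    return profile


cache = InMemoryCache()
print(get_profile(cache, 42))  # first call loads from the placeholder database
print(get_profile(cache, 42))  # second call is served from the cache
```

Because the function only calls `get` and `set`, a redis-py client could be passed in place of the stand-in; that substitution is not exercised here.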
In addition, a new Digg feature displays the number of views for each article. A major selling point of this feature is that it is real-time, and Redis provides the support for this real-time view count.
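A real-time view counter of this kind maps naturally onto an atomic increment, which Redis offers as the INCR command. The sketch below is an assumption about how such a counter could look, not a description of Digg's implementation; `InMemoryCounterStore` is a stand-in for a Redis client, and the key format is invented.

```python
class InMemoryCounterStore:
    """Stand-in for a Redis client; models only incr and get."""

    def __init__(self):
        self._counts = {}

    def incr(self, key, amount=1):
        self._counts[key] = self._counts.get(key, 0) + amount
        return self._counts[key]

    def get(self, key):
        return self._counts.get(key)


def record_view(store, article_id):
    """Increment the article's counter and return the new total."""
    return store.incr(f"article:{article_id}:views")


def view_count(store, article_id):
    """Read the current total, treating a missing key as zero."""
    value = store.get(f"article:{article_id}:views")
    return int(value) if value is not None else 0


store = InMemoryCounterStore()
for _ in range(3):
    record_view(store, "a1")
print(view_count(store, "a1"))  # 3
print(view_count(store, "a2"))  # 0
```

Each page view costs a single increment and the count can be read back immediately, which is what makes the displayed number real-time.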