How FriendFeed uses MySQL to store key-value data

Source: Internet
Author: User
Tags: database sharding

Note: This is a fairly old article. As I understand it now, it essentially uses MySQL to implement something like MongoDB; the thinking behind it is still worth referring to.

Original address: http://backchannel.org/blog/friendfeed-schemaless-mysql

Background

We use MySQL to store all FriendFeed data. Our database has grown a lot as our user base has grown, and it now stores more than 250 million entries, plus a lot of other data ranging from comments and "likes" to friend lists.

As the data grew, we repeatedly dealt with the scaling problems that come with rapid growth. We used the conventional approaches, such as read slaves and memcache to increase read throughput, and database sharding to improve write throughput. However, as we grew, it became harder to add new features than to scale the capacity of the existing system.

This was especially true for schema changes. Adding an index to a table with more than 10 or 20 million rows locks the database completely for hours at a time. Deleting old indexes takes just as long, but not deleting them hurts performance, because the database keeps reading and writing those unused index blocks on every INSERT, pushing important blocks out of memory. There are complex operational procedures that avoid these problems, such as building the new index on a slave and then swapping the master and the slave, but they are error-prone and heavy-weight enough that they implicitly discouraged us from adding features that required index or schema changes. Since our databases are heavily sharded, MySQL's relational features, such as JOIN, are worthless to us anyway. So we decided to look for answers outside the relational database.

Many projects, such as CouchDB, exist to store data with flexible schemas and fast indexes. However, none of them seem to be trusted enough to be used widely by large websites, and in the tests we ran, none of them proved stable or mature enough for our needs. MySQL works for us: it doesn't corrupt data, and replication works. We already understand its limitations. We like MySQL for storage, just not the relational usage pattern.

After some consideration, we decided to implement a "schema-less" storage system on top of MySQL rather than switch entirely to a new storage system. This post attempts to describe the high-level details of that system. We are curious how other large sites have tackled this problem, and we think some of the design work we did may be useful to other developers.

Overview

Our datastore stores schema-less bags of properties (like a JSON object or a Python dictionary). The only required property of a stored entity is id, a 16-byte UUID (universally unique identifier). The rest of the entity is opaque as far as the database is concerned. We "change" the schema simply by storing new properties.

We index the data in these entities by storing the indexes in separate MySQL tables. If we want to index three properties of each entity, we have three MySQL tables, one for each index. If we want to stop using an index, we first stop writing to that table from our code, and then drop the table in MySQL whenever we like. If we want a new index, we create a new MySQL table for it and run a process to populate the index asynchronously, without disrupting our live service.

As a result, we end up with more tables than before, but adding and removing indexes is easy. We heavily optimized the process that populates new indexes (which we call "the Cleaner") so that it fills in new indexes quickly without disrupting the site. We can store new properties and index them within a day or so rather than a week, and we don't have to swap MySQL masters and slaves, or do any other scary operations, to make it happen.

Details

In MySQL, our entities are stored in a table that looks like this:

CREATE TABLE entities (
    added_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    id BINARY(16) NOT NULL,
    updated TIMESTAMP NOT NULL,
    body MEDIUMBLOB,
    UNIQUE KEY (id),
    KEY (updated)
) ENGINE=InnoDB;

The added_id column exists because InnoDB stores data rows physically in primary-key order. The auto-increment primary key ensures that new entities are written to disk sequentially after old entities, which helps read/write locality (FriendFeed pages are ordered roughly reverse-chronologically, so new entities are read much more often than old ones). The entity body itself is a zlib-compressed, pickled Python dictionary.
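
For clarity, here is a minimal sketch of how such a body could be produced and read back, using only the standard pickle and zlib modules; the helper names are illustrative, not FriendFeed's actual code.

import pickle
import zlib

def serialize_body(entity):
    # Pickle the Python dictionary and compress it with zlib before
    # storing the result in the MEDIUMBLOB `body` column.
    return zlib.compress(pickle.dumps(entity))

def deserialize_body(blob):
    # Reverse the process when a row is read back out of MySQL.
    return pickle.loads(zlib.decompress(blob))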

Indexes are stored in separate tables. To create a new index, we create a new table storing the attributes we want to index on all of our database shards. For example, a typical FriendFeed entity looks like this:

{"id": "71f0c4d2291844cca2df6f486e96e37c", "user_id": "F48b0440ca0c4f66991c4d5f6a078eaf", "feed_id": "F48b0440ca 0c4f66991c4d5f6a078eaf "," title ":" We just launched a new backend system for friendfeed! "," link ":" Http://friendfee d.com/e/71f0c4d2-2918-44cc-a2df-6f486e96e37c "," published ": 1235697046," updated ": 1235697046,}

We want to index the user_id property of the entity so we can render a page of all the entities a given user has posted. The index table looks like this:

CREATE TABLE index_user_id (
    user_id BINARY(16) NOT NULL,
    entity_id BINARY(16) NOT NULL UNIQUE,
    PRIMARY KEY (user_id, entity_id)
) ENGINE=InnoDB;

Our datastore maintains the indexes automatically on your behalf, so to start an instance of the datastore that stores entities with the structure above and the given index, you would write (in Python):

import binascii

user_id_index = friendfeed.datastore.Index(
    table="index_user_id", properties=["user_id"], shard_on="user_id")
datastore = friendfeed.datastore.DataStore(
    mysql_shards=["127.0.0.1:3306", "127.0.0.1:3307"],
    indexes=[user_id_index])

new_entity = {
    "id": binascii.a2b_hex("71f0c4d2291844cca2df6f486e96e37c"),
    "user_id": binascii.a2b_hex("f48b0440ca0c4f66991c4d5f6a078eaf"),
    "feed_id": binascii.a2b_hex("f48b0440ca0c4f66991c4d5f6a078eaf"),
    "title": u"We just launched a new backend system for FriendFeed!",
    "link": u"http://friendfeed.com/e/71f0c4d2-2918-44cc-a2df-6f486e96e37c",
    "published": 1235697046,
    "updated": 1235697046,
}
datastore.put(new_entity)
entity = datastore.get(binascii.a2b_hex("71f0c4d2291844cca2df6f486e96e37c"))
entity = user_id_index.get_all(datastore, user_id=binascii.a2b_hex("f48b0440ca0c4f66991c4d5f6a078eaf"))

The Index class above looks up the user_id property in every stored entity and automatically maintains the index in the index_user_id table. Since our database is sharded, the shard_on argument determines which shard the index rows are stored on (in this case, the entity's user_id modulo the number of shards).
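
As an illustration of that shard_on routing, here is a hypothetical helper; the original post only says the shard is chosen by taking the value modulo the number of shards, so the exact mapping below is an assumption.

def shard_for(shard_on_value, num_shards):
    # Interpret the 16-byte id as a big integer and take it modulo
    # the number of shards to pick which MySQL instance gets the row.
    return int.from_bytes(shard_on_value, "big") % num_shards

# e.g. shard_for(binascii.a2b_hex("f48b0440ca0c4f66991c4d5f6a078eaf"), 2)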

You can query an index through the index instance (see user_id_index.get_all above). The datastore code does the "join" between the index_user_id table and the entities table in Python: it queries the index_user_id table on all shards to get a list of entity IDs, and then reads those IDs from the entities table.

To add a new index, say one on the link property of these entities, we would create a new table:

CREATE TABLE index_link (
    link VARCHAR(735) NOT NULL,
    entity_id BINARY(16) NOT NULL UNIQUE,
    PRIMARY KEY (link, entity_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

We would change the datastore initialization code to include the new index:

user_id_index = friendfeed.datastore.Index(
    table="index_user_id", properties=["user_id"], shard_on="user_id")
link_index = friendfeed.datastore.Index(
    table="index_link", properties=["link"], shard_on="link")
datastore = friendfeed.datastore.DataStore(
    mysql_shards=["127.0.0.1:3306", "127.0.0.1:3307"],
    indexes=[user_id_index, link_index])

And we can populate the new index asynchronously (even while serving live traffic) with:

./rundatastorecleaner.py --index=index_link

Consistency and atomicity

Since our databases are sharded, an entity's indexes can be stored on different shards than the entity itself, so consistency is an issue. What if the process crashes before it has finished writing all of the index tables?

Building a transaction protocol appealed to the most ambitious FriendFeed engineers, but we wanted to keep the system as simple as possible. We decided to relax the constraints such that:

    • The property bag stored in the main entity table is the canonical copy of the data

    • Indexes may not accurately reflect the actual entity values

Therefore, we write a new entity to the database with the following steps (a sketch of this write path follows the list):

    1. Write the entity to the entities table, relying on InnoDB's ACID properties

    2. Write the index rows to all of the index tables on all of the shards
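
A minimal sketch of what a DataStore.put method might do under these relaxed constraints, reusing the serialize_body helper sketched earlier; the _entity_shard, _index_shard, and _execute helpers are assumptions, not FriendFeed's actual API.

def put(self, entity):
    # Step 1: write the canonical copy to the entities table first,
    # relying on InnoDB's ACID properties for this single row.
    # (REPLACE is a simplification; an UPDATE-or-INSERT would preserve added_id.)
    self._execute(
        self._entity_shard(entity["id"]),
        "REPLACE INTO entities (id, updated, body) VALUES (%s, FROM_UNIXTIME(%s), %s)",
        (entity["id"], entity["updated"], serialize_body(entity)))

    # Step 2: write the index rows. If we crash here, the indexes may
    # lag behind the entity until the Cleaner repairs them.
    for index in self.indexes:
        prop = index.properties[0]
        self._execute(
            self._index_shard(index, entity[index.shard_on]),
            "REPLACE INTO %s (%s, entity_id) VALUES (%%s, %%s)" % (index.table, prop),
            (entity[prop], entity["id"]))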

When reading from the index tables, we know the results may not be accurate (that is, they may reflect old property values if step 2 did not finish at write time). To make sure we do not return invalid entities under the constraints above, we use the index tables only to decide which entities to read, and we re-apply the query filters to the entities themselves rather than trusting the integrity of the indexes (a read-path sketch follows this list):

    1. Read the entity_ids from all of the index tables for the given query

    2. Read the entities from the entities table by those entity IDs

    3. Filter out (in Python) any entities whose actual property values do not match the query conditions
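
A sketch of what Index.get_all could do to implement this read path; the _query helper and the dictionary-style rows it returns are assumptions used only to illustrate the idea.

def get_all(self, datastore, **query):
    prop = self.properties[0]

    # Step 1: read candidate entity ids from this index table on every shard.
    entity_ids = []
    for shard in datastore.mysql_shards:
        rows = datastore._query(
            shard,
            "SELECT entity_id FROM %s WHERE %s = %%s" % (self.table, prop),
            (query[prop],))
        entity_ids.extend(row["entity_id"] for row in rows)

    # Step 2: read the canonical entities from the entities table by id.
    entities = [datastore.get(entity_id) for entity_id in entity_ids]

    # Step 3: re-apply the query filter in Python, since index rows
    # may be stale relative to the canonical entity bodies.
    return [e for e in entities
            if e is not None and all(e.get(k) == v for k, v in query.items())]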

To ensure that index rows are never missing forever and that inconsistencies are eventually repaired, the "Cleaner" process mentioned earlier runs continuously over the entities table, writing missing index rows and removing old, invalid ones. It processes the most recently updated entities first, so in practice inconsistencies in the indexes are repaired very quickly (within a few seconds).
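
A rough sketch of what one pass of such a Cleaner could look like; iter_entities_updated_since, delete_stale_index_rows, and write_index_row are hypothetical helpers, not FriendFeed's actual code.

def clean_index(datastore, index, since):
    # Walk entities updated after `since`, newest first, and make the
    # index table agree with the canonical entity bodies.
    for entity in datastore.iter_entities_updated_since(since):
        value = entity.get(index.properties[0])
        # Remove index rows that point at this entity but no longer
        # match its current property value...
        datastore.delete_stale_index_rows(index, entity["id"], current_value=value)
        # ...and (re)write the row for the current value, if any.
        if value is not None:
            datastore.write_index_row(index, value, entity["id"])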

Performance

We optimized our primary indexes extensively under the new system, and we are very happy with the results. Here is a chart of FriendFeed page-view latency over the past month (we launched the new backend a couple of days ago, and you can see the dramatic drop):

In particular, our system's latency is now very stable, even during peak mid-day hours. Here is a chart of FriendFeed page-view latency over the past 24 hours:

Compare that with the same data from one week ago:

The system has been really easy to work with so far. We have changed the indexes a few times since deploying it, and we have begun converting our largest MySQL tables to this new scheme so we can change their structure more freely as we develop.
