Optimization Design for Large-Scale Website Architecture, Part 3: Many-to-Many Relationships
Source: Internet
Author: User
The previous chapter introduced a basic data partitioning scheme and a basic configuration scheme based on the user data table. In the Web 2.0 era, however, a simple list index is far from enough to solve the problem, and many-to-many relationships are the most common kind of relationship. Here we discuss the many-to-many relationships that are widespread in Web 2.0 data and describe a concrete way of handling them. Take a very simple example: in the Web 2.0 era the friend feature is the most commonly used one. Each user has many friends and is in turn a friend of many other users, so the amount of relationship data can approach the square of the number of users. Similarly, for article tags, each article can have more than one tag and each tag can be attached to many articles. The number of possible pairs is the product of the two counts, so the amount of data can become astronomical.
There are two traditional solutions. One is to implement the relationship through search; the other is to build a separate index table that stores the corresponding IDs. The first scheme involves a lot of LIKE queries, so its performance is not flattering. In the second scheme the number of rows in the database becomes surprisingly large, queries have to cross tables, and the uniqueness of the data has to be maintained, so the complexity of the data processing is self-evident.
Now to the main topic: below we give a specific solution for many-to-many relationships in data. We use the many-to-many relationship between tags and articles as the example. By analogy, the same reasoning applies to groups and users, photo albums and users, and the other complex many-to-many relationships found in social circles.
First, let us walk through the process, taking the second traditional scheme as the example. In a traditional database design, publishing a blog post together with its tags generally takes three steps (or four, if checking whether the tag exists is counted separately). The first step inserts the post into the article table and gets the article ID. The second step queries the tag table to see whether the tag exists; if it does, we fetch the tag's ID, otherwise we insert a new tag and fetch its ID. The third step inserts the article ID and the tag ID into the index table to establish the association. Putting an index on the index table at this point can be disastrous, especially when the data is large: it effectively improves query speed, but publishing may become unbearably slow.
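The three steps described above can be sketched as follows. This is only a minimal illustration using Python's built-in sqlite3 module with an in-memory database; the table names (article, tag, article_tag), the column names, and the publish function are assumptions made for the example and do not come from the original design.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: two entity tables plus the index (link) table.
    conn.executescript("""
        CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE article_tag (
            article_id INTEGER NOT NULL,
            tag_id INTEGER NOT NULL,
            PRIMARY KEY (article_id, tag_id)
        );
    """)


def publish(conn, title, tag_names):
    # Step 1: insert the article and get its ID.
    article_id = conn.execute(
        "INSERT INTO article (title) VALUES (?)", (title,)
    ).lastrowid
    for name in tag_names:
        # Step 2: check whether the tag exists; insert it if it does not.
        row = conn.execute(
            "SELECT id FROM tag WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            tag_id = conn.execute(
                "INSERT INTO tag (name) VALUES (?)", (name,)
            ).lastrowid
        else:
            tag_id = row[0]
        # Step 3: write the (article ID, tag ID) pair into the index table.
        conn.execute(
            "INSERT OR IGNORE INTO article_tag (article_id, tag_id) VALUES (?, ?)",
            (article_id, tag_id),
        )
    conn.commit()
    return article_id


conn = sqlite3.connect(":memory:")
create_schema(conn)
publish(conn, "Sharding basics", ["database", "scaling"])
publish(conn, "Caching basics", ["scaling"])
```

Every publish touches three tables and runs at least one lookup per tag, which is the write cost the paragraph above warns about.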
Our approach also comes in three parts, and it takes the handling of many-to-many relationships a step further.
With tags, the operations we use most are querying the articles under a tag and displaying the tags of an article, so the example is implemented as follows.
The first step is to discard the index table.
We add a redundant field to the article table: a tag column. The tags can be stored in the following format: [tagid,tagname]|[tagid,tagname]|[tagid,tagname]. Likewise, we add a redundant aspires field to the tag table, in the following format: [articleid,title]|[articleid,title]|[articleid,title]. When an entry needs to be added, we simply append it. For the structure of the aspires and tag fields, refer to the introduction in my previous article. In practice, more can be stored in these fields depending on need.
Some people will ask why we store the tag name and the article title as well. The reason is to avoid IN queries and INNER JOIN queries. IN queries and cross-table queries can cause full table scans, so we must find an effective alternative when executing queries.
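A minimal sketch of how such a redundant field might be written and read is shown below. The helper names (encode_pairs, decode_pairs, append_pair) are hypothetical, the rows are plain dictionaries standing in for database rows, and the sketch assumes that names and titles never contain the "|" character, since the format described above has no escaping rule.

```python
def encode_pairs(pairs):
    # Serialize [(id, name), ...] as "[id,name]|[id,name]".
    return "|".join(f"[{item_id},{name}]" for item_id, name in pairs)


def decode_pairs(field):
    # Parse "[id,name]|[id,name]" back into [(id, name), ...].
    if not field:
        return []
    pairs = []
    for chunk in field.split("|"):
        # Drop the surrounding brackets, then split on the first comma only.
        item_id, name = chunk.strip()[1:-1].split(",", 1)
        pairs.append((int(item_id), name))
    return pairs


def append_pair(field, item_id, name):
    # Appending never requires reading or rewriting the existing entries.
    entry = f"[{item_id},{name}]"
    return f"{field}|{entry}" if field else entry


# Stand-ins for one row of the article table and one row of the tag table.
article = {
    "id": 7,
    "title": "Sharding basics",
    "tag": encode_pairs([(1, "database"), (2, "scaling")]),
}
tag_row = {"id": 2, "name": "scaling", "aspires": ""}
tag_row["aspires"] = append_pair(tag_row["aspires"], article["id"], article["title"])

# Displaying an article's tags needs only the article row, no join.
tags_of_article = decode_pairs(article["tag"])
```

Because the names are stored next to the IDs, rendering the tag list of an article or the article list of a tag reads a single row. The trade-off is that renaming a tag or an article means updating every row that embeds the old name.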
The second step is asynchronous loading.
Among design patterns we often think of the singleton pattern. We use a variation on the singleton pattern here: the indexing between articles and tags is handled by one dedicated process, and it is done asynchronously.
To avoid the thread blocking caused by having to check the tag table at the moment an article is published, we use a deferred scheme. The server should maintain one process dedicated to querying and indexing tags and article lists. When an article is published, the work of synchronizing its tags is handed off to that other program to process and index.
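The hand-off described above might look like the following sketch. It is an illustration under several assumptions: a single background thread stands in for the dedicated indexing process, an in-process queue.Queue stands in for whatever channel connects the two programs, and the dictionaries articles and tags stand in for the article and tag tables. All names are hypothetical.

```python
import queue
import threading

articles = {}  # article ID -> {"title": ..., "tag": ...}
tags = {}      # tag name -> {"id": ..., "aspires": ...}
index_jobs = queue.Queue()


def publish(article_id, title, tag_names):
    # The publishing path only stores the article and enqueues a job.
    # It never touches the tag table, so it does not wait on tag lookups.
    articles[article_id] = {"title": title, "tag": ""}
    index_jobs.put((article_id, title, list(tag_names)))


def index_worker():
    # The single dedicated worker: the only code that writes tag data.
    while True:
        job = index_jobs.get()
        try:
            if job is None:  # sentinel: stop the worker
                return
            article_id, title, tag_names = job
            entries = []
            for name in tag_names:
                # Create the tag on first use, otherwise reuse it.
                row = tags.setdefault(name, {"id": len(tags) + 1, "aspires": ""})
                entry = f"[{article_id},{title}]"
                row["aspires"] = f"{row['aspires']}|{entry}" if row["aspires"] else entry
                entries.append(f"[{row['id']},{name}]")
            # Fill in the article's redundant tag column once IDs are known.
            articles[article_id]["tag"] = "|".join(entries)
        finally:
            index_jobs.task_done()


worker = threading.Thread(target=index_worker, daemon=True)
worker.start()

publish(1, "Sharding basics", ["database", "scaling"])
publish(2, "Caching basics", ["scaling"])

index_jobs.join()     # wait until both jobs are indexed (for the demo only)
index_jobs.put(None)  # ask the worker to stop
worker.join()
```

Because only one worker ever writes tag rows, there is no need to lock the tag table against concurrent publishers. The cost is that an article's tags become visible a short time after the article itself.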