Youku Architecture Learning Notes

Source: Internet
Author: User
September 18, 2011, IT Green Rattan House

I previously wrote about the technical architecture of YouTube, the leading video site, and I expect it left many readers with a strong impression; the internet is a remarkable thing. Today it occurred to me that Youku is the leader among domestic video sites, and I wondered how its architecture compares with YouTube's. Out of curiosity I searched for whatever information I could find on the various aspects of Youku's architecture. It is not covered in as much detail as YouTube's, but I did dig up a fair amount, which I summarize here in the hope that it helps readers who are interested in architecture.

First, an overview of the website's basic data

According to 2010 statistics, Youku's average daily unique visitors (UV) reached 89 million and its average daily page views (PV) reached 1.7 billion. On the strength of these figures, Youku was the highest-ranked domestic video site on Google's list. On the hardware side, Youku reportedly uses mainly Dell servers, primarily the PowerEdge 1950 and the PowerEdge 860, with Dell MD1000 storage arrays. Data from 2007 show that Youku had more than 1,000 servers spread across the major provinces and cities of the country; there are presumably more now.

Second, the website front-end framework

From the beginning, Youku built its own CMS to handle front-end page rendering. The modules are cleanly separated from one another, so the front end scales well, and the UI is separated out as well, which makes development and maintenance simple and flexible.

[Figure: call relationships among Youku's front-end modules]

In this way, the module, method, and params together determine which of the relatively independent modules is called, which is very concise.

[Figure: partial view of Youku's front-end architecture]
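The module/method/params calling convention can be sketched as a small dispatcher. This is only an illustration, not Youku's CMS code: the registry, the `register` decorator, the `call` function, and the two example handlers are all hypothetical names.

```python
from typing import Any, Callable, Dict, Tuple

# Registry mapping (module, method) to a handler. All names are hypothetical.
_REGISTRY: Dict[Tuple[str, str], Callable[..., Any]] = {}


def register(module: str, method: str):
    """Decorator that registers a handler under a (module, method) key."""
    def wrap(func: Callable[..., Any]) -> Callable[..., Any]:
        _REGISTRY[(module, method)] = func
        return func
    return wrap


@register("video", "get_title")
def video_get_title(video_id: int) -> str:
    # Stand-in for a real data lookup.
    return f"title-of-{video_id}"


@register("user", "get_profile")
def user_get_profile(user_id: int, fields: tuple = ("name",)) -> dict:
    # Stand-in for a real profile lookup.
    return {"user_id": user_id, "fields": list(fields)}


def call(module: str, method: str, params: dict) -> Any:
    """Dispatch a call identified by module, method, and params."""
    try:
        handler = _REGISTRY[(module, method)]
    except KeyError:
        raise LookupError(f"unknown module/method: {module}.{method}") from None
    return handler(**params)
```

A page template would then only need to know the triple, for example `call("video", "get_title", {"video_id": 42})`, and not how the module behind it is implemented.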

Third, the database architecture

Youku's database architecture has also been through many twists and turns: from a single MySQL server at the start (just running), to simple MySQL master-slave replication, SSD optimization, vertical partitioning by database, and horizontal sharding. Only those who have lived through this series of steps will understand it deeply. As with MySpace's architectural history, the architecture grew and matured gradually.

1. Simple MySQL master-slave replication

MySQL master-slave replication gives the database read/write splitting and improves read performance considerably.

[Figure: schematic of MySQL master-slave replication]

[Figure: the master-slave replication process]
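On the application side, read/write splitting amounts to sending writes to the master and spreading reads over the slaves. The following is a minimal sketch of that routing decision only. The `ReadWriteRouter` class, the server names, and the list of write verbs are assumptions for illustration; the source does not describe how Youku implemented the routing.

```python
import itertools
from typing import List


class ReadWriteRouter:
    """Routes writes to the master and spreads reads over the slaves."""

    # Statements starting with these verbs are treated as writes (assumed list).
    WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "ALTER", "DROP"}

    def __init__(self, master: str, slaves: List[str]):
        self.master = master
        # Round-robin over slaves; fall back to the master if there are none.
        self._reads = itertools.cycle(slaves or [master])

    def route(self, sql: str) -> str:
        """Return the name of the server that should run this statement."""
        parts = sql.split(None, 1)
        if not parts:
            raise ValueError("empty SQL statement")
        if parts[0].upper() in self.WRITE_VERBS:
            return self.master
        return next(self._reads)


router = ReadWriteRouter("master-1", ["slave-1", "slave-2"])
```

Adding slaves increases read capacity, but every write still lands on the single master and a slave may briefly return stale data.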

However, master-slave replication also brings a number of other performance bottlenecks: writes cannot be scaled out; writes cannot be cached; replication lags; the rate of table locking rises; and as tables grow larger, the cache hit rate drops.

These problems had to be solved, which led to the following optimizations.

2. MySQL vertical partitioning

If the business areas can be split cleanly enough to be independent, it is a good idea to put the data of different businesses on different database servers. If one business crashes, the others keep running normally, and the split also spreads the load, greatly increasing the throughput of the database.

[Figure: database architecture after vertical partitioning]
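Vertical partitioning can be pictured as a static map from each business area to its own database server. The sketch below assumes made-up business names, table names, and connection strings; none of them come from Youku.

```python
# Hypothetical mapping of business area -> database server (made-up DSNs).
BUSINESS_DATABASES = {
    "user": "mysql://db-user.internal/user",
    "video": "mysql://db-video.internal/video",
    "comment": "mysql://db-comment.internal/comment",
}

# Hypothetical mapping of table -> owning business area.
TABLE_OWNER = {
    "users": "user",
    "videos": "video",
    "playlists": "video",
    "comments": "comment",
}


def database_for_table(table: str) -> str:
    """Return the database that holds a table under vertical partitioning."""
    business = TABLE_OWNER[table]
    return BUSINESS_DATABASES[business]


def can_join_in_sql(table_a: str, table_b: str) -> bool:
    """A plain SQL JOIN only works when both tables live in the same database."""
    return database_for_table(table_a) == database_for_table(table_b)
```

In this sketch `can_join_in_sql("users", "comments")` is false, which illustrates the cost of the split: data that is related across businesses has to be combined in application code.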

However, even when the businesses are independent enough, some of them are always related to one another to some degree. Users, for example, are associated with almost every business. This partitioning method also cannot solve the problem of rapid data growth in a single table, so why not try horizontal sharding?

3. MySQL horizontal sharding

This is a very good idea: users are grouped according to some rule (a hash of the ID), and each group's data is stored in one database partition, that is, a shard. As the number of users grows, it is enough to simply configure another server.

[Figure: schematic of horizontal sharding]
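Grouping users by a hash of the ID can be sketched in a few lines. The shard count and the choice of MD5 as a stable hash are assumptions made for illustration; the source only says that users are grouped by hashing the ID.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_for_user(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index with a stable hash."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards
```

A stable hash is used here instead of Python's built-in `hash()` so that the same user always maps to the same shard across processes. Note that with a plain modulo, changing `num_shards` remaps most existing users, so a pure hash rule does not by itself make adding a server painless.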

To determine which shard a user belongs to, you can build a table that maps users to shards. Each request first looks up the user's shard ID in this table and then queries the relevant data from the corresponding shard.

[Figure: looking up a user's shard through the mapping table]
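The two-step lookup (mapping table first, then the shard) can be sketched with in-memory SQLite databases standing in for the MySQL servers. The table names `user_shard` and `favorites` and all function names are hypothetical.

```python
import sqlite3

# Directory database holding the user -> shard mapping table.
directory = sqlite3.connect(":memory:")
directory.execute(
    "CREATE TABLE user_shard (user_id INTEGER PRIMARY KEY, shard_id INTEGER NOT NULL)"
)

# Two in-memory databases stand in for two shard servers.
shards = {0: sqlite3.connect(":memory:"), 1: sqlite3.connect(":memory:")}
for conn in shards.values():
    conn.execute("CREATE TABLE favorites (user_id INTEGER, video_id INTEGER)")


def assign_user(user_id: int, shard_id: int) -> None:
    """Record which shard a user lives on."""
    directory.execute("INSERT INTO user_shard VALUES (?, ?)", (user_id, shard_id))


def shard_id_for(user_id: int) -> int:
    """Step 1: look up the user's shard ID in the mapping table."""
    row = directory.execute(
        "SELECT shard_id FROM user_shard WHERE user_id = ?", (user_id,)
    ).fetchone()
    if row is None:
        raise KeyError(f"user {user_id} has no shard assignment")
    return row[0]


def add_favorite(user_id: int, video_id: int) -> None:
    """Write a user's row to the shard that owns the user."""
    shards[shard_id_for(user_id)].execute(
        "INSERT INTO favorites VALUES (?, ?)", (user_id, video_id)
    )


def favorites_of(user_id: int) -> list:
    """Step 2: query the user's data from the shard found in step 1."""
    rows = shards[shard_id_for(user_id)].execute(
        "SELECT video_id FROM favorites WHERE user_id = ? ORDER BY video_id",
        (user_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

Because the assignment lives in a table and not in a formula, a user can be moved to another shard by copying the data and updating one row. The price is an extra lookup on every request.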

How, then, does Youku handle queries that span shards? This is a hard problem. Reportedly, Youku avoids cross-shard queries as far as possible. When they cannot be avoided, it uses multidimensional shard indexes or a distributed search engine. The last resort is a distributed database query, which is very cumbersome and costly in performance.
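To see why a distributed query is the last resort, consider a generic scatter-gather sketch: the same query is sent to every shard and the partial results are merged afterwards. This is not Youku's implementation; the per-shard data and function names are invented for illustration.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical per-shard data: shard ID -> list of (play_count, video_id).
SHARD_DATA: Dict[int, List[Tuple[int, str]]] = {
    0: [(900, "v1"), (120, "v4")],
    1: [(450, "v2"), (980, "v5")],
    2: [(30, "v3")],
}


def top_videos_on_shard(shard_id: int, limit: int) -> List[Tuple[int, str]]:
    """Stand-in for running 'ORDER BY play_count DESC LIMIT n' on one shard."""
    return heapq.nlargest(limit, SHARD_DATA[shard_id])


def top_videos_across_shards(limit: int) -> List[Tuple[int, str]]:
    """Scatter the query to every shard, then merge the partial results."""
    partial: List[Tuple[int, str]] = []
    for shard_id in SHARD_DATA:  # one round trip per shard
        partial.extend(top_videos_on_shard(shard_id, limit))
    return heapq.nlargest(limit, partial)
```

Every such query costs one round trip per shard plus a merge step, and its latency is bounded by the slowest shard, which is why keeping queries inside a single shard is preferable.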

Fourth, caching strategy

Large systems all seem to have a special fondness for caching, from HTTP caches to memcached in-memory data caches. Youku, however, says it does not use an in-memory cache, for the following reasons: doing without one avoids memory copies and memory locks, and if a notice arrives from "Big Brother" to take a video down, having that video in a cache makes it a hassle.

In addition, Squid's write() consumes user process space, and Lighttpd 1.5's AIO (asynchronous I/O) reads files into user memory, which is also inefficient.

So why is visiting Youku so smooth, with videos loading slightly faster than on Tudou? This is thanks to the fairly complete content distribution network (CDN) that Youku has built, which uses a variety of means to ensure that users spread across the country are served from nearby. When a user clicks on a video, Youku determines the user's region and returns the address of the nearest video server with the best service, so that the user gets a fast video experience. This is the advantage of a CDN: access from the nearest node. For more about CDNs, please search online.
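The "nearest server with the best service" decision can be sketched as a simple selection function. The source does not say how Youku measures proximity or service quality, so the region labels, the `load` metric, and the fallback rule below are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VideoServer:
    address: str
    region: str
    load: float  # 0.0 (idle) to 1.0 (saturated); an assumed metric


# Hypothetical edge servers; addresses use the documentation IP range.
SERVERS: List[VideoServer] = [
    VideoServer("192.0.2.10", "beijing", 0.80),
    VideoServer("192.0.2.11", "beijing", 0.35),
    VideoServer("192.0.2.20", "shanghai", 0.10),
]


def pick_server(user_region: str, servers: List[VideoServer] = SERVERS) -> str:
    """Prefer servers in the user's region; among candidates pick the least loaded."""
    local = [s for s in servers if s.region == user_region]
    candidates = local or servers  # fall back to any server if none is local
    return min(candidates, key=lambda s: s.load).address
```

A real CDN would typically make this choice from network measurements and not from a static region label, but the shape of the decision is the same: filter by proximity, then rank by service quality.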
