Scaling YouTube's Architecture

Source: Internet
Author: User

Author of "Scaling YouTube's Architecture": Fenng. Reprints are permitted, provided that the original source, author information, and copyright statement are indicated in hyperlink form.
Web: http://www.dbanotes.net/opensource/youtube_web_arch.html

At the Seattle scalability conference, Cuong Do of YouTube gave a talk on YouTube's scalability. The video is available on Google Video. Unfortunately, users in China cannot view it.

Kyle Cordes wrote a summary of the video. It contains a lot of technical detail that is worth sharing. (Kyle Cordes's summary is the main source for this article.)

First, a brief look at YouTube's traffic: "One day's YouTube traffic is equivalent to sending 75 billion emails." In 2006, it was reported that daily page views exceeded 100 million. And now? Even more striking: "1 billion downloads and uploads a day" is a truly massive volume. Among Internet applications in China, probably only 51.com operates at this scale in terms of data volume, but technically it cannot be compared with YouTube.

Web Server

For the sake of development speed, most of the code is written in Python. Some of the web servers run Apache in FastCGI mode. Lighttpd is used to serve video content. As far as I know, some MySpace servers also use Lighttpd, but not many. YouTube is the most successful Lighttpd deployment. (Few sites in China use Lighttpd; Douban uses it quite comfortably. By Fenng)

Video

Video thumbnails pose a great challenge to the servers. Each video has 4 thumbnails on average, and each web page shows several thumbnails, so the resulting request volume is very large. YouTube's engineers set up separate groups of servers to absorb this load and made targeted optimizations to caching and the OS. In addition, the pressure of thumbnail requests degraded Lighttpd's performance; this was addressed by hacking Lighttpd to add more worker threads. The latest solution is Google's BigTable, which offers better performance, fault tolerance, and caching. Judging by this, the acquisition has put its best resources exactly where they matter most.

For redundancy, each video file is placed on a "mini cluster", which is simply a group of servers holding the same content. The most popular videos are put on a CDN, so YouTube's own servers only need to handle the requests that "miss". YouTube uses simple, inexpensive, commodity hardware, which is consistent with Google's style. Maintenance also relies on common tools that people are already familiar with, such as rsync and SSH.
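The split between the CDN and the mini clusters described above can be sketched roughly as follows. This is a minimal illustration, not YouTube's actual code: the host names, the video-to-cluster mapping, and the random choice of replica are all assumptions, since the talk summary does not say how a replica is selected.

```python
import random
from typing import Optional

# Hypothetical values, for illustration only.
CDN_HOST = "cdn.example.com"
POPULAR_VIDEO_IDS = {"v1", "v2"}
MINI_CLUSTERS = {
    "cluster-a": ["a1.example.com", "a2.example.com"],
    "cluster-b": ["b1.example.com", "b2.example.com"],
}
VIDEO_TO_CLUSTER = {"v1": "cluster-a", "v2": "cluster-b", "v3": "cluster-a"}


def pick_host(video_id: str, rng: Optional[random.Random] = None) -> str:
    """Return the host that should serve the given video."""
    # Popular videos are served from the CDN.
    if video_id in POPULAR_VIDEO_IDS:
        return CDN_HOST
    # Everything else goes to one of the identical replicas in the
    # video's mini cluster (random selection is an assumption).
    chooser = rng if rng is not None else random
    replicas = MINI_CLUSTERS[VIDEO_TO_CLUSTER[video_id]]
    return chooser.choice(replicas)
```

Because every server in a mini cluster holds the same content, any replica can answer a request, and losing one machine does not make a video unavailable.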

Database

YouTube uses MySQL to store metadata: user information, video information, and so on. The database servers once suffered from swap thrashing. The solution was to delete the swap partition!

Initially, the database had only 10 hard disks in RAID 10; a set of RAID 1 disks was added later. Quite frugal. Companies in this wave of Web 2.0 rarely use Oracle (Bebo is the only one I know of, see here). In terms of scalability, the path is similar to that of other sites: replication first, then spreading out the I/O. The final solution is "partitioning". This is not table partitioning at the database level, but partitioning at the business level (by user name or ID, with the application controlling the lookup mechanism).
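To make the idea of business-level partitioning concrete, here is a minimal sketch of how an application might route a user to a database. The shard names, the number of shards, and the modulo and CRC32 schemes are all assumptions for illustration; the source only says that partitioning is done by user name or ID and that the application controls the lookup.

```python
import zlib

# Hypothetical shard list; the source does not name hosts or give a count.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]


def shard_for_user(user_id: int) -> str:
    """Map a numeric user ID to a shard with a simple modulo."""
    return SHARDS[user_id % len(SHARDS)]


def shard_for_username(username: str) -> str:
    """Map a user name to a shard using a stable hash (CRC32).

    Python's built-in hash() is randomized per process for strings,
    so it would not give the same answer on every web server.
    """
    digest = zlib.crc32(username.encode("utf-8"))
    return SHARDS[digest % len(SHARDS)]
```

A modulo scheme like this is easy to reason about, but changing the number of shards remaps most users. An explicit lookup table kept by the application is a common alternative when shards need to be added over time.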

YouTube also uses Memcached.
