http://www.kaiyuanba.cn/html/1/131/147/7540.htm
Lately I have been following and studying the architecture of large websites, hoping that one day I can design a highly concurrent, highly fault-tolerant system and actually put it into practice. While looking for architecture material online today I came across an analysis of the architecture of the video site YouTube. After reading it I felt I had moved another step closer to understanding large-scale architecture, so I am sharing it here right away.
YouTube is growing fast, with more than 100 million video views per day, yet only a handful of people maintain the site and keep it scalable. This is similar to PlentyOfFish, where a very small team maintains a huge system. How do they do it? Rest assured, it is not a matter of luck; let's look at YouTube's overall technical architecture.
Platform
1. Apache
2. Python
3. Linux (SuSE)
4. MySQL
5. Psyco, a dynamic Python-to-C compiler
6. lighttpd instead of Apache for video serving
Status
1. Supports more than 100 million video views per day
2. Founded in February 2005
3. Reached 30 million video views per day in March 2006
4. Reached 100 million video views per day in July 2006
5. 2 system administrators, 2 scalability software architects
6. 2 feature developers, 2 network engineers, 1 DBA
Web Server
1. NetScaler is used for load balancing and caching static content
2. Apache runs with mod_fast_cgi
3. Requests are routed for handling by a Python application server
4. The application server talks to several databases and other information sources to fetch data and format the HTML page
5. Scalability at the web tier can usually be gained simply by adding more machines
6. The Python web-tier code is rarely the performance bottleneck; most of the time is spent blocked on RPCs
7. Python allows rapid and flexible development and deployment
8. Serving a page typically takes less than 100 milliseconds
9. Psyco, a dynamic Python-to-C compiler similar to a JIT compiler, is used to optimize inner loops
10. C extensions are used for CPU-intensive work such as encryption
11. Pre-generated, cached HTML is used for expensive-to-render blocks
12. Row-level caching is used in the database
13. Fully formed Python objects are cached
14. Some data is precomputed and pushed to every application server, so those values are cached in local memory. This is an underused strategy: the fastest cache is inside the application server itself, and it takes very little time to push precomputed values to all servers. Just run an agent that watches for changes, precomputes the values, and sends them out (a small sketch follows).
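Below is a minimal, hypothetical sketch of that local-memory cache, assuming an external agent process pushes precomputed values to each application server; the names push_precomputed and get_precomputed are invented for illustration and are not from the source article.

```python
import threading
import time

# Hypothetical in-process cache of precomputed values. An external
# "agent" watches for changes, recomputes the values, and pushes them
# to every application server; request handlers then read them from
# local memory with no RPC or database call on the hot path.

_local_cache = {}
_lock = threading.Lock()

def push_precomputed(key, value):
    """Called by the agent (e.g. over an internal channel) to update one value."""
    with _lock:
        _local_cache[key] = (value, time.time())

def get_precomputed(key, fallback):
    """Request path: read from local memory, fall back to a slower source."""
    with _lock:
        entry = _local_cache.get(key)
    if entry is not None:
        return entry[0]
    value = fallback(key)          # e.g. a database query or RPC
    push_precomputed(key, value)   # warm the local cache for the next request
    return value
```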
Video Services
1. Costs include bandwidth, hardware, and power consumption
2. Each video is hosted by a mini-cluster, i.e. every video is served by more than one machine
3. Using a cluster means:
- More disks serving each piece of content means more speed
- Failover: if one machine fails, the others can keep serving
- Online backups
4. lighttpd is used as the web server for video:
- Apache had too much overhead
- epoll is used to wait on many file descriptors at once (a minimal sketch of this pattern follows the list)
- Moved from a single-process to a multi-process configuration to handle more connections
5. The most popular content is moved to a CDN:
- The CDN replicates content in many places, so there is a better chance of the content being close to the user
- CDN machines mostly serve from memory, because the content is so popular that little of it churns in and out of memory
6. Less popular content (1-20 views per day) is served by YouTube servers at various colo sites:
- The long-tail effect: an individual video may get only a few plays, but lots of videos are being played, so random disk blocks are being accessed
- Caching does not help much in this case, so spending money on more cache may not make sense
- Tune the RAID controllers and pay attention to other low-level issues
- Tune the memory on each machine: not too much and not too little
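The epoll point above refers to the event-driven model lighttpd uses: one process waits on many file descriptors at once instead of dedicating a thread or process to each connection. Below is a toy, Linux-only Python sketch of that pattern (not lighttpd itself; the port and response are made up).

```python
import select
import socket

# Toy event loop: one process multiplexes many connections with epoll.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", 8080))
server.listen(128)
server.setblocking(False)

ep = select.epoll()
ep.register(server.fileno(), select.EPOLLIN)
connections = {}

while True:
    for fd, event in ep.poll(timeout=1):
        if fd == server.fileno():                  # new incoming connection
            conn, _addr = server.accept()
            conn.setblocking(False)
            ep.register(conn.fileno(), select.EPOLLIN)
            connections[conn.fileno()] = conn
        elif event & select.EPOLLIN:               # data ready on an existing fd
            conn = connections[fd]
            data = conn.recv(4096)
            if data:
                conn.send(b"HTTP/1.0 200 OK\r\n\r\nhello\n")
            ep.unregister(fd)
            conn.close()
            del connections[fd]
```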
Video service key points
1. Keep it simple and cheap
2. Keep the network path simple; don't put too many devices between the content and the users
3. Use commodity hardware; with expensive hardware it is harder to find help and documentation
4. Use simple, common tools; most of the tools used are built into or on top of Linux
5. Handle random seeks well (SATA, tweaks)
Thumbnail service
1. Surprisingly hard to do efficiently
2. There are about 4 thumbnails for each video, so there are many more thumbnails than videos
3. Thumbnails are hosted on just a few machines
4. Problems encountered in serving lots of small objects:
- Lots of disk seeks, plus inode-cache and page-cache problems at the OS level
- Per-directory file limits, especially with ext3; they later moved to a multi-level directory structure (see the path sketch after this list). Recent improvements in the 2.6 kernel may let ext3 handle large directories, but storing huge numbers of files in a filesystem is still not a good idea
- A large number of requests per second, because a single web page can show 60 thumbnails
- Apache behaved badly under this kind of load
- Squid was used in front of Apache; it worked for a while, but failed as load increased, turning 300 requests per second into 20
- Tried lighttpd, but ran into trouble because it is single-threaded; running multiple processes caused problems because each process keeps its own separate cache
- With so many images, bringing a new machine into service took over 24 hours
- Restarting a machine took 6-10 hours for the cache to warm up
5. To solve all of these problems, YouTube started using Google's BigTable, a distributed data store:
- It avoids the small-file problem because it clusters files together
- It is fast and fault tolerant
- Lower latency, because it uses a distributed multi-level cache that works across multiple colocation sites
- See the Google Architecture, GoogleTalk Architecture, and BigTable write-ups for more information
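Before the move to BigTable, the usual workaround for per-directory file limits is to hash each ID into nested directories so that no single directory holds too many files. The sketch below illustrates that multi-level layout; THUMBNAIL_ROOT, the 2+2 character split, and the file naming are assumptions for illustration, not YouTube's actual layout.

```python
import hashlib
import os

THUMBNAIL_ROOT = "/var/thumbs"   # assumed location

def thumbnail_path(video_id, index):
    """Spread thumbnails across nested directories keyed by a hash of the id."""
    digest = hashlib.md5(video_id.encode("utf-8")).hexdigest()
    # e.g. /var/thumbs/ab/cd/<video_id>_1.jpg
    return os.path.join(THUMBNAIL_ROOT, digest[:2], digest[2:4],
                        "%s_%d.jpg" % (video_id, index))

def store_thumbnail(video_id, index, data):
    path = thumbnail_path(video_id, index)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
```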
Database
1. Early days
- MySQL was used to store metadata such as users, tags, and descriptions
- Data was stored on a monolithic RAID 10 array of 10 hard disks
- They were living off credit cards, so YouTube leased its hardware
- YouTube went through a common evolution: a single server, then a single master with multiple read slaves, then database partitioning, and finally sharding
- They suffered from replication lag. The master is multi-threaded and runs on a large machine, so it can handle a lot of work; the slaves are single-threaded, usually run on smaller machines, and replication is asynchronous, so the slaves can lag far behind the master
- Updates cause cache misses, and slow disk I/O makes replication even slower
- A replication architecture requires spending a lot of money for only small gains in write performance
- One of YouTube's solutions was to prioritize traffic by splitting the data into two clusters: a video-watch pool and a general cluster
2. Later
- Database partitioning
- Split into shards, with different users assigned to different shards (see the mapping sketch after this list)
- Spreads reads and writes
- Better cache locality, which means less I/O
- 30% reduction in hardware
- Replication lag reduced to zero
- Database scalability can now be increased almost arbitrarily
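A hedged sketch of what user-to-shard assignment can look like: each user maps deterministically to one shard, and all of that user's reads and writes go to that shard's master and replicas. The shard host names and the hashing scheme below are illustrative assumptions, not YouTube's actual configuration.

```python
import hashlib

# Illustrative shard map: each shard has its own master and read replicas.
SHARDS = [
    {"master": "db-shard0-master", "replicas": ["db-shard0-r1", "db-shard0-r2"]},
    {"master": "db-shard1-master", "replicas": ["db-shard1-r1", "db-shard1-r2"]},
    {"master": "db-shard2-master", "replicas": ["db-shard2-r1", "db-shard2-r2"]},
]

def shard_for_user(user_id):
    """Deterministically map a user id to one shard."""
    h = int(hashlib.md5(str(user_id).encode("utf-8")).hexdigest(), 16)
    return SHARDS[h % len(SHARDS)]

def write_host(user_id):
    return shard_for_user(user_id)["master"]

def read_host(user_id, i=0):
    replicas = shard_for_user(user_id)["replicas"]
    return replicas[i % len(replicas)]
```

Because each user's data lives together on one shard, reads and writes are spread across shards and the working set per machine shrinks, which is where the better cache locality and reduced I/O come from.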
Data Center Policies
1. They relied on credit cards, so at first only managed hosting providers could be used
2. Managed hosting providers cannot provide scalability; you cannot control the hardware or negotiate favorable network agreements
3. YouTube moved to a colocation arrangement instead; now it can customize everything and negotiate its own contracts
4. 5 or 6 data centers are used, plus a CDN
5. Video can come from any data center, not the closest match or anything like that; if a video becomes popular enough, it is moved to the CDN
6. For video, bandwidth matters more than latency, so it can come from any colo
7. For images, latency matters a great deal, especially when a single page shows 60 of them
8. Images are replicated to different data centers with BigTable, and code looks up which copy is nearest (a hedged sketch follows this list)
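The "which copy is nearest" lookup could be as simple as choosing the replica with the lowest measured latency for the user's region. The sketch below is an assumption for illustration only; the data center names and the latency table are invented.

```python
# Invented latency table: (user_region, data_center) -> measured latency in ms.
LATENCY_MS = {
    ("us-east", "dc-virginia"): 10,
    ("us-east", "dc-california"): 70,
    ("europe",  "dc-virginia"): 90,
    ("europe",  "dc-california"): 150,
}

DATA_CENTERS = ["dc-virginia", "dc-california"]

def nearest_copy(user_region, replicas=DATA_CENTERS):
    """Pick the data center holding the image copy closest to the user."""
    return min(replicas,
               key=lambda dc: LATENCY_MS.get((user_region, dc), float("inf")))
```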
Lessons learned
1. Stall for time. Creative and risky tricks can help you cope in the short term while you work out longer-term solutions
2. Prioritize. Figure out what is core to your service and prioritize your resources and effort around it
3. Pick your battles. Don't be afraid to outsource essential services. YouTube uses a CDN to distribute its most popular content; building its own delivery network would have taken too long and cost too much
4. Keep it simple! Simplicity lets you re-architect quickly in response to problems
5. Shard. Sharding helps isolate and constrain storage, CPU, memory, and I/O; it is not only about getting more write performance
6. Constantly iterate on bottlenecks:
- Software: DB, cache
- OS: disk I/O
- Hardware: memory, RAID
7. You succeed as a team. Have a cross-discipline team that understands the whole system and what lies beneath it: people who can set up printers, set up machines, install networks, and so on. With a good team, all things are possible.