Site architecture (static pages, image server separation, load balancing): a complete analysis
Article Category: Integrated technology
1. HTML static pages

As we all know, the most efficient and cheapest pages to serve are pure static HTML, so we should try to render the pages on our sites as static files wherever possible; the simplest method is often the most effective one. For sites with a lot of content that is updated frequently, however, we cannot generate everything by hand, which is why content management systems (CMS) exist. The news channels of the portals we often visit, and frequently their other channels as well, are managed and published through such systems. At a minimum, a CMS turns submitted content into static pages automatically, and usually adds channel management, permission management, automatic crawling, and other features. For a large website, an efficient, manageable CMS is essential.

Beyond portals and other publishing-oriented sites, rendering as much as possible statically is also a necessary performance technique for interactive, community-style sites. Rendering posts and articles to static HTML in real time, and re-rendering them when they are updated, is a widely used strategy; mixed-content sites such as Mop and the NetEase community take this approach.

Static HTML can also serve as a caching policy. For data that the system queries from the database frequently but updates rarely, consider rendering it statically. Forum-wide settings are a good example: mainstream forum software manages them in the back end and stores them in the database, and the front end reads them constantly even though they change very rarely. Rendering this data to static files whenever it is updated in the back end avoids a large number of database requests.
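As a minimal sketch of the publish-to-static idea (the file layout, function names, and template here are hypothetical, not from any particular CMS), an article can be rendered to an HTML file once at publish or update time, so the web server afterwards serves it without touching the database:

```python
from pathlib import Path
from string import Template

# Hypothetical page template; a real CMS would use a full template engine.
PAGE = Template("<html><head><title>$title</title></head>"
                "<body><h1>$title</h1><div>$body</div></body></html>")

def publish_article(article_id: int, title: str, body: str,
                    out_dir: str = "static") -> Path:
    """Render one article to a static HTML file at publish/update time."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / f"article_{article_id}.html"
    path.write_text(PAGE.substitute(title=title, body=body), encoding="utf-8")
    return path

page = publish_article(42, "Hello", "Static pages are cheap to serve.")
```

The same function is simply called again whenever the article changes, which is the "re-render on update" strategy the community sites above rely on.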
2. Image server separation

For a web server, whether Apache, IIS, or another container, images are the most resource-hungry content to serve, so we should separate images from pages. This is a strategy virtually every large site adopts: they run a dedicated image server, or often many of them. Such an architecture reduces the load on the servers that handle page requests and keeps an image-related failure from bringing down the whole system. The application servers and image servers can also be tuned differently; for example, an Apache instance serving images can be configured to support as few content types and load as few modules (LoadModule) as possible, keeping its footprint small and its throughput high.
3. Database clusters and table hashing

Large sites run complex applications, and those applications inevitably use databases. Under large-scale traffic the database quickly becomes the bottleneck; a single database soon cannot keep up, so we need database clusters or hashing across databases and tables.

For clustering, many databases ship their own solutions: Oracle, Sybase, and others have mature offerings, and the master/slave replication provided by the ubiquitous MySQL is a similar scheme. Whatever database you use, consult its corresponding clustering solution.

Clustering, however, is constrained by the database you chose, by cost, and by extensibility, so we also need to improve the architecture from the application side, and hashing across databases and tables is the most common and effective approach. We split the database along business and functional-module lines, mapping different modules to different databases or tables, and then hash within a module according to some policy, for example hashing the user table by user ID. This improves performance at low cost and scales well. The Sohu forums use this framework: user, settings, and post data are split into separate databases, and posts and users are then hashed into databases and tables by board and by ID. In the end, a simple entry in a configuration file lets the system add a database at any time, at low cost, to boost capacity.
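A minimal sketch of the user-ID hashing described above (the database and table counts are illustrative, not from the article): a deterministic function maps each user ID to a database and a table, so lookups need no directory service:

```python
# Hypothetical shard layout: 4 databases, 16 user tables in each.
N_DATABASES = 4
N_TABLES_PER_DB = 16

def locate_user(user_id: int) -> tuple[str, str]:
    """Map a user ID to a (database, table) pair deterministically."""
    db = user_id % N_DATABASES
    table = (user_id // N_DATABASES) % N_TABLES_PER_DB
    return f"user_db_{db}", f"user_{table:02d}"

# The same ID always lands in the same place, on every application server.
db_name, table_name = locate_user(12345)
```

Note that with a bare modulo, changing `N_DATABASES` remaps existing users, which is why adding capacity is usually planned into the configuration from the start.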
4. Caching

Anyone who has worked in technology has run into the word "cache"; caches are used in many places, and they matter just as much in website architecture and web development. Here I will describe only the two most basic kinds; advanced and distributed caches are covered later.

Architecture-level caching: people familiar with Apache will know that it provides its own cache module, and that a Squid layer can be added in front for caching as well; both can substantially improve Apache's response times.

Application-level caching: memcached on Linux is a commonly used cache service that web applications can call, for example from Java, to cache data and share it between processes; some large communities are built this way. Beyond that, every web language has its own cache modules and techniques: PHP has the PEAR Cache module, Java has many, and while I am not very familiar with .NET, I am sure it has them too.
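To make the application-level caching idea concrete, here is a tiny in-process sketch (the class and TTL value are illustrative, and a real deployment would use a shared service such as memcached rather than per-process memory): an expensive lookup is served from memory until an expiry time, so repeated requests skip the database:

```python
import time

class TTLCache:
    """A tiny time-based cache: get() returns a cached value until it expires."""
    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key, compute):
        now = time.monotonic()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]                       # cache hit: no recompute
        value = compute()                       # cache miss: do the expensive work
        self._store[key] = (now + self.ttl, value)
        return value

calls = []
cache = TTLCache(ttl_seconds=60)
cache.get("forum:settings", lambda: calls.append(1) or {"theme": "blue"})
settings = cache.get("forum:settings", lambda: calls.append(1) or {"theme": "blue"})
# The second get() is a hit, so the compute function ran only once.
```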
5. Mirroring

Large sites often use mirroring to improve performance and data safety. Mirrors smooth out the access-speed differences users see across network providers and regions; the gap between ChinaNet and CERNET (the education network), for instance, prompted many sites to set up mirrors inside the education network, with data updated on a schedule or in real time. I will not go deep into mirroring techniques here; there are many mature, ready-made solution architectures and products to choose from, as well as inexpensive software approaches, such as the rsync tool on Linux.
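As a hedged sketch of the rsync approach (the host name and paths are hypothetical), a scheduled mirror update can be a single rsync invocation; the flags shown are standard rsync usage: `-a` preserves file metadata, `-z` compresses in transit, and `--delete` removes files the source no longer has:

```python
def mirror_command(src_dir: str, mirror_host: str, dest_dir: str) -> list[str]:
    """Build the rsync invocation for one mirror-sync run."""
    return ["rsync", "-az", "--delete",
            f"{src_dir.rstrip('/')}/",          # trailing slash: copy contents
            f"{mirror_host}:{dest_dir}"]

cmd = mirror_command("/var/www/site", "mirror.example.edu", "/var/www/site")
# A cron job would then execute it, e.g. subprocess.run(cmd, check=True).
```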
6. Load balancing

Load balancing is the ultimate answer for large sites facing heavy load and large numbers of concurrent requests. The technology has been developing for many years, and there are plenty of professional vendors and products to choose from. I have personally worked with several solutions, and two architectures among them may serve as a reference.
7. Hardware layer-4 switching

Layer-4 switching uses the header information of layer-3 and layer-4 packets to identify traffic flows and dispatches the flows of an entire segment to the appropriate application server. A layer-4 switch presents something like a virtual IP that points at the physical servers. It forwards traffic for a variety of protocols, such as HTTP, FTP, NFS, and Telnet, on top of the physical servers, which requires fairly sophisticated load-balancing algorithms. In the IP world, the service type is determined by the TCP or UDP port of the endpoint, so the application flow in layer-4 switching is determined by the source and destination IP addresses together with the TCP or UDP ports.

Among hardware layer-4 switching products there are well-known choices such as Alteon and F5. They are expensive but worth the money, offering excellent performance and very flexible management. In its early days, Yahoo China ran nearly 2,000 servers behind just three or four Alteons.
8. Software layer-4 switching

Once you know how hardware layer-4 switching works, software implementations of the same OSI-model idea follow naturally. The principle is identical; the performance is somewhat lower, though it comfortably handles a fair amount of load, and some would argue the software route is more flexible, its capacity depending entirely on how familiar you are with configuring it.

For software layer-4 switching we can use LVS, the common Linux Virtual Server. LVS provides real-time failover based on heartbeat links, improving system robustness, while its flexible virtual-IP (VIP) configuration and management satisfies a wide range of application needs, which is essential for a distributed system.

A typical load-balancing strategy is to build a Squid cluster on top of a hardware or software layer-4 switch. Many large sites, search engines included, adopt this approach: it is low cost, high performance, and highly extensible, and nodes can be added or removed from the architecture at any time with ease. I plan to set aside time to write about this structure in detail and discuss it with you.

For a large site, every method mentioned above may well be used at the same time. My introduction here is fairly brief; many implementation details take familiarity and experience to get right. Sometimes a single small Squid or Apache parameter has a huge impact on system performance, and I hope we can explore these together.
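To make the dispatching idea concrete, here is a minimal sketch of weighted round-robin selection (the backend names and weights are illustrative, and this is a simplification of schedulers such as the one in LVS, not its actual algorithm): each backend appears in the rotation in proportion to its weight, so a bigger machine receives proportionally more requests:

```python
import itertools

def weighted_round_robin(backends: dict[str, int]):
    """Yield backend names in proportion to their weights, forever."""
    # Expand each backend by its weight, then cycle through the list.
    rotation = [name for name, weight in backends.items() for _ in range(weight)]
    return itertools.cycle(rotation)

scheduler = weighted_round_robin({"web1": 2, "web2": 1})
first_three = [next(scheduler) for _ in range(3)]
# web1 receives twice as many requests as web2.
```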
Use Squid as the web cache server, with Apache providing the real web service behind it. Of course, an architecture like this has to ensure that most of the pages, the home page above all, are static; it requires the programmers' cooperation to render pages to static HTML before they are sent back to the client.
As far as I can tell, Sina and Sohu use the same technique for their channels and similar sections: Squid listens on port 80 of the public IPs while the real web server listens on a different port. The user perceives no difference from connecting directly to the web server, yet this arrangement saves significant bandwidth and server capacity, and users feel the site respond faster.
Bandwidth: 4000 Mbps (for reference)
Number of servers: about 60
Web server: Lighttpd, Apache, Nginx
Application Server: Tomcat
Other: Python, Java, MogileFS, ImageMagick, etc.
About Squid and Tomcat
Squid and Tomcat seem to appear less and less often in WEB 2.0 site architectures. I had some doubts about Squid at first; Hua's explanation was: "We have not yet found a cache system more efficient than Squid. The hit rate really was poor originally, but then we put a layer of Lighttpd in front of Squid and hashed on the URL, so the same image always goes to the same Squid instance, and the hit rate rose dramatically."
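A hedged sketch of that URL-hash routing (the node names are hypothetical, and a real deployment would express this in the Lighttpd configuration rather than in application code): hashing the URL picks one fixed Squid node, so each image is cached on exactly one node instead of being duplicated across all of them, which is what lifts the hit rate:

```python
import hashlib

SQUID_NODES = ["squid1:3128", "squid2:3128", "squid3:3128"]  # hypothetical pool

def squid_for(url: str) -> str:
    """Route a URL to one fixed Squid node by hashing the URL."""
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return SQUID_NODES[int(digest, 16) % len(SQUID_NODES)]

a = squid_for("/photos/123/medium.jpg")
b = squid_for("/photos/123/medium.jpg")
# The same URL always maps to the same node.
```

The downside, noted later in this article, is that a bare modulo makes adding or removing nodes disruptive, since most URLs then remap to different nodes.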
As for Tomcat in the application-server tier, Yupoo!'s engineers are now gradually replacing it with other lightweight components; YPWS and YPFS are developed in Python.
· YPWS (Yupoo Web Server): a small web server written in Python. It provides the basic web service plus the display logic for users, images, and web links, and it can be installed on any server with spare resources, making it easy to scale out when a performance bottleneck appears.
· YPFS (Yupoo File System): like YPWS, YPFS is an image upload server built on top of the same web-server foundation.
Update: some readers have questioned Python's efficiency. Yupoo's Liuping wrote on del.icio.us: "YPWS is written in Python; each machine can handle 294 requests per second, and the current load is at roughly 10% of that."
Image processing Layer
The Image Process Server that comes next is responsible for processing the images users upload. The package used is again ImageMagick, and the sharpening ratio was retuned during the last storage upgrade (I personally find the results noticeably better). "magickd" is a remote-interface service for image processing; like memcached in its service model, it can be installed on any machine with spare CPU resources.
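As an illustration of the kind of work a magickd-style worker hands off to ImageMagick (the sizes, paths, and helper function are hypothetical; `convert` with `-resize` and `-unsharp` is standard ImageMagick usage), generating a sharpened thumbnail might shell out like this:

```python
def thumbnail_command(src: str, dest: str, size: int = 240) -> list[str]:
    """Build an ImageMagick convert invocation for one thumbnail."""
    return ["convert", src,
            "-resize", f"{size}x{size}>",   # shrink only, keep aspect ratio
            "-unsharp", "0x1",              # mild sharpening after the resize
            dest]

cmd = thumbnail_command("upload/123.jpg", "thumbs/123_240.jpg")
# A worker with spare CPU would run it with subprocess.run(cmd, check=True).
```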
We know that Flickr's thumbnailing originally used the ImageMagick package, which it reportedly stopped using for licensing reasons after the Yahoo acquisition (?); Flickr extracts EXIF and IPTC data with Perl. I would strongly encourage Yupoo! to do something with EXIF as well; it is a potential point of value worth focusing on.
Picture storage Layer
Yupoo!'s storage originally used NFS-based disk array enclosures. As data volumes grew, "since June 2007 the Yupoo! development department has been researching a large-capacity, safe, and reliable storage system that can satisfy Yupoo!'s future growth." It sounds like Yupoo! is confident in this system and full of expectation for it; after all, it has to support terabytes of data and manage a massive number of images. We know that besides the originals there are copies at various sizes, and all of these are stored uniformly in MogileFS.
As for the rest, the software you can expect at any common Web 2.0 site is all visible here: MySQL, memcached, Lighttpd, and so on. Yupoo! uses plenty of relatively mature open-source software on the one hand, and develops its own tailor-made architectural components on the other. That, too, is one path a Web 2.0 company can take.
Many thanks to Yupoo! and to Hua for sharing this technical information. Whose architecture will be revealed next?
The Lighttpd + Squid cache set is placed in a separate data center and used as a CDN node; the diagram does not show this clearly, apologies for any confusion.
In front of Squid we use Lighttpd rather than Nginx, mainly because we have run it for so long without major problems that we never considered anything else.
The extensibility of URL hashing really is poor: adding or removing servers is not easy. We currently run five servers as one hash set.
We now write the web server in Python. On efficiency I can offer one test figure: replaying our current access logs, a single YPWS instance handles 294 requests per second on average (with all logic evaluation enabled).
On reliability I have no hard numbers, but in the current month of operation there has not been a single exception.
Nginx is installed on every LVS node, mainly for reverse proxying and serving static content; Apache has become largely unnecessary and we plan to remove it gradually.
We process images in real time, and more than half of our servers now run the magickd service to share the image-processing load.
What are the real-time hot topics among the tens of millions of blog posts published every day? Tailrank, a WEB 2.0 startup, is committed to answering that question.
Todd Hoff, who specializes in profiling high-traffic websites, interviewed Kevin Burton, so we can take a look at some details of the Tailrank architecture. It indexes 24 million blogs and feeds per hour, with content-processing throughput of 160-200 Mbps and I/O writes of roughly 10-15 Mbps, and it processes 52 TB of raw data every month. The crawler Tailrank built has since become a standalone product: Spinn3r.
There are currently about 15 servers, with 64-bit Opteron CPUs. Each host carries two SATA disks in RAID 0. As far as I know, many domestic WEB 2.0 companies do much the same: SATA disks are large and cheap, an ideal choice. The operating system is Debian Linux. Apache 2.0 is the web server, with Squid as the reverse proxy.
Tailrank uses MySQL in a federated-database arrangement, with the InnoDB storage engine and a data volume of 500 GB. Kevin Burton also points out that MySQL 5 fixed some multi-core mutex issues (this bug?). JDBC connections to the databases are load balanced through the lbpool connection pool, and MySQL slave/master replication is handled easily with MySQLSlaveSync. Even so, around 20% of their time still goes into wrestling with the database.
Other open-source software
No system can do without proper profiling tools, and Tailrank is no exception: for Java benchmarks it uses benchmark4j, and for logging it uses log5j (not log4j). Most of the tools Tailrank relies on are open source.
Tailrank's biggest direct competitor is Techmeme, although the two focus on different content for now. In truth the biggest opponent is always yourself: as more and more information has to be mined, the cost of presenting content to users accurately and promptly only rises. For now, Tailrank is still far from its goal. Rome wasn't built in a day; here's hoping it gets there soon.
YouTube Architecture Learning
Original: YouTube Architecture
YouTube is growing fast, with more than 100 million video views per day, yet only a handful of people maintain the site and keep it scalable.
Psyco, a dynamic Python-to-C compiler
Lighttpd instead of Apache for serving video
Supports more than 100 million video views per day
Founded in February 2005
Reached 30 million video views per day in March 2006
Reached 100 million video views per day in July 2006
2 system administrators, 2 scalability software architects
2 software development engineers, 2 network engineers, 1 DBA
Handle fast-growing traffic
    while (true) {
        identify_and_fix_bottlenecks();
        drink();
        sleep();
        notice_new_bottleneck();
    }
Run the Loop multiple times per day
Serving a web page
1. NetScaler is used for load balancing and caching static content
2. Apache runs with mod_fastcgi
3. Requests are routed for handling by a Python application server
4. The application server talks to multiple databases and other information sources to fetch data and format the HTML page
5. Scalability at the web tier can generally be achieved by adding more machines
6. The Python web-tier code is usually not the performance bottleneck; most of the time it is blocked on RPC
7. Python allows rapid, flexible development and deployment
8. Serving a page typically takes less than 100 milliseconds
9. Psyco, a dynamic Python-to-C compiler similar to a JIT compiler, is used to optimize inner loops
10. For CPU-intensive work such as encryption, C extensions are used
11. Pre-generated, cached HTML is used for some expensive page blocks
12. Row-level caching is used in the database
13. Fully formed Python objects are cached
14. Some data is computed once and sent to every application server, so the values are cached in local memory. This is an underused strategy: the fastest cache is in the application server itself, and sending precomputed values to all servers takes little time. Just have an agent that watches for changes, precomputes, and pushes.
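A minimal sketch of the push-update idea in item 14 (the class, key, and agent logic are hypothetical): an external agent precomputes a value once when the underlying data changes and pushes it into each server's local dictionary, so request handlers read pure local memory instead of recomputing or calling out:

```python
class LocalCache:
    """Per-process cache that an external agent pushes precomputed values into."""
    def __init__(self):
        self._values = {}

    def push(self, key, value):          # called by the agent when data changes
        self._values[key] = value

    def get(self, key):                  # called on the request path: memory only
        return self._values[key]

# One cache per application server; the agent pushes to all of them.
servers = [LocalCache() for _ in range(3)]
precomputed = {"top_videos": ["a", "b", "c"]}   # computed once by the agent
for cache in servers:
    cache.push("top_videos", precomputed["top_videos"])

hot = servers[1].get("top_videos")   # no RPC, no recomputation
```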
Serving video
1. Costs include bandwidth, hardware, and power consumption
2. Each video is hosted by a mini-cluster; every video is held by more than one machine
3. Using a cluster means:
- More disks holding the content means more speed
- Failover: if one machine fails, the others can keep serving
4. Lighttpd is used as the web server for video:
- Apache has too much overhead
- Uses epoll to wait on multiple file descriptors
- Moved from a single-process to a multi-process configuration to handle more connections
5. The most popular content is moved to a CDN:
- CDNs replicate content in many places, so the content stands a better chance of being close to the user
- CDN machines mostly serve out of memory, because the content is so popular that little of it churns in and out of memory
6. Less popular content (1-20 views per day) uses YouTube servers at various colo sites
- The long-tail effect: a single video may have only a few plays, but lots of videos are being played, so random disk blocks get accessed
- Caching does little good in this case, so spending money on more cache may not make much sense
- Tune the RAID controllers and pay attention to other low-level issues
- Tune the memory on each machine: not too much, not too little
Video service key points
1. Keep it simple and cheap
2. Keep the network path simple: not too many devices between the content and the users
3. Use commodity hardware; for exotic hardware it is harder to find help and documentation
4. Use simple, common tools, mostly what ships with or runs on top of Linux
5. Handle random seeks well (SATA, tweaks)
Serving thumbnails
1. Surprisingly hard to do efficiently
2. There are about 4 thumbnails per video, so there are many more thumbnails than videos
3. Thumbnails are hosted on just a few machines
4. Problems encountered hosting lots of small objects:
- Lots of disk seeks, plus inode-cache and page-cache problems at the OS level
- Per-directory file limits, especially with ext3; they later moved to a multi-level directory structure. Recent improvements in the 2.6 kernel may let ext3 handle large directories, but storing lots of files in a filesystem is still not a great idea
- A high request rate, since a web page may display 60 thumbnails at once
- Apache behaves badly under this kind of load
- Squid was used in front of Apache; it worked for a while but failed as load increased, dropping from 300 requests per second to 20
- They tried Lighttpd, but it ran into trouble because it is single-threaded; using multiple processes caused problems because each process keeps its own separate cache
- There are so many images that bringing up a new machine took over 24 hours
- After a restart, it takes 6-10 hours for a machine to rebuild its cache
5. To solve all of these problems YouTube started using Google's BigTable, a distributed data store:
- Avoids the small-file problem because it clusters files together
- Fast and fault tolerant
- Lower latency because it uses a distributed multilevel cache that works across multiple collocation sites
- See Google Architecture, GoogleTalk Architecture, and BigTable for more information
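A hedged sketch of the multi-level directory structure mentioned above (the fan-out, hash choice, and naming are illustrative, not YouTube's actual scheme): hashing the file name and using hex digits of the digest as nested directory levels keeps any single directory small, sidestepping per-directory file limits:

```python
import hashlib

def thumb_path(filename: str, levels: int = 2) -> str:
    """Spread files over a multi-level directory tree via a hash prefix."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    # Two levels of up to 256 directories each bounds any directory's size.
    parts = [digest[i * 2:(i * 2) + 2] for i in range(levels)]
    return "/".join(["thumbs", *parts, filename])

path = thumb_path("video123_1.jpg")
# e.g. thumbs/xx/yy/video123_1.jpg, where xx/yy are hex digits of the hash
```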
Databases
- MySQL is used to store metadata such as users, tags, and descriptions
- Data was originally stored on one monolithic RAID 10 array of 10 disks
- They were living on credit cards, so YouTube leased its hardware
- YouTube went through the usual evolution: a single server, then a single master with multiple read slaves, then database partitioning, then settling on sharding
- They suffered from replica lag. The master is multithreaded and runs on a large machine so it can handle a lot of work; the slaves are single-threaded, usually run on smaller machines, and replicate asynchronously, so they can lag far behind the master
- Updates cause cache misses, and slow disk I/O causes the replicas to fall behind
- Under a replication architecture it takes a lot of money to gain any extra write performance
- One of YouTube's solutions was to prioritize traffic by splitting the data into two clusters: a video-watch pool and a general cluster
- They then split the data into shards, assigning different users to different shards
- Better cache locality meant less I/O
- The result was a 30% reduction in hardware
- Replica lag was reduced to zero
- The database can now be scaled almost arbitrarily
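A hedged sketch of the user-to-shard assignment (the shard count, names, and lookup approach are illustrative; the article does not describe YouTube's actual mechanism): layering a directory of explicit overrides on top of a default hash lets specific users be moved between shards during a rebalance, which a bare modulo alone cannot do:

```python
N_SHARDS = 4  # illustrative shard count

def default_shard(user_id: int) -> int:
    """Initial shard assignment by ID."""
    return user_id % N_SHARDS

# Directory of explicit overrides for users that were migrated.
overrides: dict[int, int] = {}

def shard_for(user_id: int) -> str:
    shard = overrides.get(user_id, default_shard(user_id))
    return f"shard_{shard}"

before = shard_for(10)        # default placement
overrides[10] = 0             # migrate user 10 during a rebalance
after = shard_for(10)         # now routed to the new shard
```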
Data Center Policies
1. They relied on credit cards, so initially only managed hosting providers could be used
2. Managed hosting cannot provide scalability: you cannot control the hardware or negotiate favorable network agreements
3. So YouTube moved to a colocation arrangement instead. Now it can customize everything and negotiate its own contracts
4. 5 or 6 data centers are used, plus a CDN
5. Videos come from any data center; there is no closest-match routing or anything like it. If a video becomes popular enough, it moves to the CDN
6. Video delivery depends on bandwidth rather than raw latency, so it can come from any colo
7. Image latency matters a great deal, especially when a page shows 60 images
8. Images are replicated to different data centers using BigTable, and code looks up which copy is nearest
Lessons learned
1. Stall for time. Creative, risky hacks can solve a problem in the short term while you work out a long-term solution
2. Prioritize. Figure out what is core to your service and prioritize your resources accordingly
3. Pick your battles. Don't be afraid to outsource parts of your core service. YouTube uses a CDN to distribute its most popular content; building its own network would have taken too much time and too much money
4. Keep it simple! Simplicity lets you re-architect quickly in response to problems
5. Shard. Sharding helps isolate storage, CPU, memory, and I/O; it is not just about getting more write performance
6. Constant iteration on bottlenecks:
- Software: DB, cache
- OS: disk I/O
- Hardware: memory, RAID
7. You succeed as a team. Have a cross-disciplinary team that understands the whole system and knows what lies beneath it: installing printers, installing machines, installing networks, and so on. With a good team, all things are possible.