Summary of high-load, high-concurrency website architecture: a few thoughts on architecture for high-traffic websites

Source: Internet
Author: User

Hardware architecture

1: Choosing the data center:

When choosing a data center, consider the geographical distribution of the site's users: you can pick a Netcom or a Telecom data center, but more often a dual-line data center (connected to both carriers) is the better fit. The bigger the city, the more expensive the hosting, so to control costs you can host servers in small and medium-sized cities. A company in Guangzhou, for example, can host its servers in Dongguan, Foshan, or other nearby areas, which are not particularly far away but much cheaper.

2: Sizing the bandwidth:

Usually, when the boss pays us to design a website's architecture, we are given some targets, for example that the site must withstand 1 million PV (page views) per day. We then need to estimate roughly how much bandwidth is required. The calculation depends mainly on two figures, peak traffic and page size, so we make two assumptions before computing:

First: assume that peak traffic is 5 times the average traffic.
Second: assume that the average page size per visit is about 100 KB.

If 1 million PV are spread evenly over a day, that is approximately 12 visits per second. At an average page size of about 100 KB per visit, those 12 visits amount to roughly 1200 KB per second. Page sizes are measured in bytes while bandwidth is measured in bits, and 1 byte = 8 bits, so 1200 KB per second is roughly 9600 Kbit per second, that is, a little over 9 Mbps. In practice the website must stay reachable at peak traffic, so applying the assumed peak factor of 5, the real bandwidth requirement is around 45 Mbps.

Of course, this conclusion rests on the two assumptions above; if your actual situation differs from them, the result will differ too.
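The estimate above can be written as a small calculation so the assumptions are easy to change. This is only a sketch: the function name and parameters are made up for illustration, and it takes 1 Mbit as 1000 Kbit.

```python
def required_bandwidth_mbps(daily_pv, page_kb=100, peak_factor=5):
    """Estimate average and peak bandwidth in Mbps for a daily page-view target."""
    seconds_per_day = 24 * 60 * 60
    requests_per_second = daily_pv / seconds_per_day      # about 11.6 for 1 million PV
    kbits_per_second = requests_per_second * page_kb * 8  # 1 byte = 8 bits
    average_mbps = kbits_per_second / 1000                # assumes 1 Mbit = 1000 Kbit
    return average_mbps, average_mbps * peak_factor


average, peak = required_bandwidth_mbps(1_000_000)
print(f"average: {average:.1f} Mbps, peak: {peak:.1f} Mbps")
```

Because the code does not round 11.6 visits per second up to 12, it lands slightly below the hand calculation, but in the same range of roughly 9 Mbps average and 45 Mbps peak.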

3: Dividing up the servers:

Let's look at which servers we need: image servers, page servers, database servers, application servers, log servers, and so on.

For high-traffic sites, separating image servers from page servers is necessary. We can run the image servers on lighttpd and the page servers on Apache, though other choices work too. We can even scale out to several image servers and several page servers and give them separate domain names, such as img.domain.com and www.domain.com, with image paths in pages written as absolute paths. Then set up DNS round robin to get the most basic form of load balancing. Of course, with more servers, keeping their content in sync inevitably becomes an issue; rsync can handle this.
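To make the round-robin idea concrete, here is a minimal simulation of how successive requests would be spread across several image servers. It does not perform any DNS lookups; the IP addresses are hypothetical placeholders, and real round-robin DNS has no health checks, so a dead server keeps receiving its share of requests.

```python
from itertools import cycle

# Hypothetical addresses for the image servers behind img.domain.com.
IMAGE_SERVERS = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]


def round_robin(servers):
    """Return an endless iterator that hands out the servers in rotation."""
    return cycle(servers)


rotation = round_robin(IMAGE_SERVERS)
first_six = [next(rotation) for _ in range(6)]
print(first_six)  # each server appears twice, in the same repeating order
```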

The database server is the top priority, because the bottleneck of a site is usually the database. Most small and medium-sized websites today use MySQL, but its clustering feature does not yet seem to be stable, so it is not evaluated here.

In general, when using MySQL we should set up a master-slave structure (one master, multiple slaves). The master uses InnoDB tables and the slaves use MyISAM tables, so each engine plays to its strengths. This structure also separates read operations from write operations, which reduces the read load on the master. We can even dedicate one slave to backups. With only a single master server and a large data volume, mysqldump is basically impractical, and copying the data files directly requires stopping the database service first, otherwise the backup files will be corrupt. For many websites, though, stopping the database service for even one second is unacceptable. With a slave server, you can stop replication on it (STOP SLAVE), take the backup, and then start it again (START SLAVE); the slave automatically catches up with the master afterwards, and nothing is affected.

But the master-slave structure has a fatal weakness: it only reduces the read load, not the write load. To scale further, there may be only one last resort: partitioning the database vertically or horizontally.

Vertical partitioning means storing different tables on different database servers, for example the user table on database server A and the article table on database server B. This comes at a cost; the most basic one is that you can no longer run LEFT JOIN and similar operations across those tables.

Horizontal partitioning generally means choosing the storage server according to the user identifier (user_id). For example, with 5 database servers, rows where "user_id % 5 + 1" equals 1 are saved to server 1, rows where it equals 2 are saved to server 2, and so on. There are many possible rules for this kind of split, and you can choose one to suit the situation. Like vertical partitioning, it also has a cost: the most basic one is that summary operations such as COUNT and SUM become much more troublesome.

In summary, the database server solution is often a hybrid that combines the advantages of several approaches, and sometimes third-party software such as memcached is needed to handle even larger traffic.
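The user_id rule described above, and the extra work it creates for summary operations, can be sketched as follows. The dictionaries are hypothetical in-memory stand-ins for the five database servers; a real setup would hold one connection per server instead.

```python
SHARD_COUNT = 5  # five database servers, as in the example above


def shard_for(user_id, shard_count=SHARD_COUNT):
    """Return the 1-based server number for a user: user_id % shard_count + 1."""
    return user_id % shard_count + 1


# Hypothetical in-memory stand-ins for the five servers: {server number: rows}.
shards = {n: [] for n in range(1, SHARD_COUNT + 1)}

for user_id in range(1, 13):
    shards[shard_for(user_id)].append({"user_id": user_id})

# A summary such as COUNT can no longer be answered by one server:
# each server is counted separately and the results are added up.
per_shard_counts = {n: len(rows) for n, rows in shards.items()}
total_users = sum(per_shard_counts.values())
print(per_shard_counts, total_users)
```

Note that changing the number of servers changes where almost every user_id lands, which is one reason the splitting rule should be chosen carefully up front.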

Having dedicated application servers to run PHP scripts is ideal, so that the page servers only hold static pages. You can give the application servers their own domain names, such as app.domain.com, to distinguish them from the page servers. For the application servers, I still prefer Apache in prefork mode, together with a PHP cache such as XCache. The fewer modules loaded the better: keep necessary ones such as mod_rewrite, drop everything else, and minimize the memory consumption of each httpd process. The image servers, page servers, and other servers of static content can run lighttpd or TUX, so that each kind of server software is used where it is strongest.

If conditions allow, an independent log server is also worthwhile. Small websites usually combine the page server and the log server, and use cron to process the previous day's logs in the early hours of the morning, when traffic is low. However, with log analysis software such as AWStats on a site with traffic in the millions, the computation consumes a lot of time and server resources even if the logs are archived daily. Separating out a dedicated log server is therefore beneficial, since the analysis no longer affects the production servers.
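As a rough illustration of the kind of job a nightly cron task on the log server might run, the sketch below counts page views per day from access-log lines. It assumes the logs use the Common Log Format timestamp; the sample lines are invented, and this is a simple stand-in for a real analysis tool, not how AWStats works.

```python
import re
from collections import Counter

# Matches the date part of a Common Log Format timestamp,
# e.g. [10/Oct/2011:13:55:36 +0800]
DATE_PATTERN = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def page_views_per_day(lines):
    """Count log lines per calendar day; lines without a timestamp are skipped."""
    counts = Counter()
    for line in lines:
        match = DATE_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines standing in for a real access log file.
sample = [
    '192.0.2.1 - - [10/Oct/2011:13:55:36 +0800] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.2 - - [10/Oct/2011:23:59:59 +0800] "GET /about.html HTTP/1.1" 200 512',
    '192.0.2.3 - - [11/Oct/2011:00:00:01 +0800] "GET /index.html HTTP/1.1" 200 2326',
]
daily = page_views_per_day(sample)
print(daily)
```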

Transferred from: Http://phpweb.blog.163.com/blog/static/1797061622011101824915484/?latestBlog
