Pre-planning for large sites

Source: Internet
Author: User


Planning a site for low cost, high performance, and high scalability: static HTML, image server separation, database clusters and database/table hashing, caching ...

A small website, such as a personal site, can be built from plain static HTML pages with a few pictures for decoration, and all the pages can sit in a single directory. Such a site makes very simple demands on system architecture and performance. As Internet businesses have grown richer, web technology has split over the years into many fine-grained specialties. Large sites in particular draw on a very wide range of technology, from hardware to software, and place high demands on programming languages, databases, web servers, firewalls, and other areas. They can no longer be compared with the original simple static HTML site.

Consider the architecture of a large website, such as a portal. Faced with a large number of users and highly concurrent requests, the basic solutions concentrate on a few areas: high-performance servers, high-performance databases, efficient programming languages, and high-performance web containers. But these measures alone cannot fundamentally solve the high-load and high-concurrency problems that large sites face.

The solutions above also imply greater investment, and they still hit bottlenecks and do not scale well. Below I share some of my experience from the perspective of low cost, high performance, and high scalability.

1. Static HTML

We all know that the most efficient and least expensive page to serve is a pure static HTML page, so we try to serve as many of our site's pages as possible as static pages. The simplest method is also the most effective. But for sites with a lot of content and frequent updates, we cannot produce every page by hand, which is why the content management system (CMS) has become common. The news channels of the portals we visit, and even their other channels, are managed and published through a CMS. At its simplest, a CMS turns entered content into static pages automatically; it can also provide channel management, permission management, automatic content capture, and other functions. For a large website, an efficient and manageable CMS is essential.

Beyond portals and information-publishing sites, community sites with heavy interaction also benefit from making as much as possible static. A widely used strategy is to render posts and articles to static pages in real time and to regenerate the static page whenever the content is updated. Mop's Hodgepodge board uses this strategy, and so does the NetEase community.
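The "render on publish, re-render on update" strategy can be sketched roughly as follows. This is only an illustration under assumptions: the template, the `post-<id>.html` file naming, and the function names `render_post` and `publish_post` are hypothetical and do not come from any particular CMS or forum.

```python
import html
import os
import tempfile
from pathlib import Path

# Hypothetical page template; a real CMS would use a template engine.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>{title}</title></head>
<body>
<h1>{title}</h1>
<div class="post">{body}</div>
</body>
</html>
"""


def render_post(title, body):
    """Render a post to HTML, escaping user-supplied text."""
    return PAGE_TEMPLATE.format(title=html.escape(title), body=html.escape(body))


def publish_post(output_dir, post_id, title, body):
    """Write (or rewrite) the static page for one post.

    The page is written to a temporary file and then moved into place, so
    the web server never serves a half-written file.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    target = output_dir / f"post-{post_id}.html"
    fd, tmp_name = tempfile.mkstemp(dir=output_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        tmp.write(render_post(title, body))
    os.replace(tmp_name, target)  # replaces any previous version of the page
    return target
```

Calling `publish_post` once when a post is created and again whenever it is edited keeps the static file current, and ordinary page views then never touch the application or the database.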

Static HTML is also a form of caching. For data that the system queries from the database constantly but that changes rarely, consider making it static. Forum-wide settings are a good example: in today's mainstream forum software these settings are managed in the back office and stored in the database, and the front-end code reads them on a large number of requests, yet they are updated very rarely. You can regenerate a static copy of this data whenever it is updated in the back office, which avoids a large number of database requests.
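As a rough sketch of this idea, the example below refreshes a static JSON copy of the settings every time the back office changes one, so the front end reads a file instead of querying the database. The `forum_settings` table, the JSON file, and all function names are assumptions made for illustration; `sqlite3` stands in for whatever database the forum actually uses.

```python
import json
import os
import sqlite3
import tempfile
from pathlib import Path


def init_db(conn):
    """Create the hypothetical settings table used in this sketch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS forum_settings "
        "(name TEXT PRIMARY KEY, value TEXT)"
    )


def export_settings(conn, target):
    """Dump the whole settings table to a static JSON file."""
    rows = conn.execute("SELECT name, value FROM forum_settings").fetchall()
    settings = dict(rows)
    target = Path(target)
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(settings, tmp, sort_keys=True)
    os.replace(tmp_name, target)  # swap in the new copy in one step
    return settings


def update_setting(conn, target, name, value):
    """Back-office path: write to the database, then refresh the static copy."""
    conn.execute(
        "INSERT OR REPLACE INTO forum_settings (name, value) VALUES (?, ?)",
        (name, value),
    )
    conn.commit()
    export_settings(conn, target)


def load_settings(target):
    """Front-end path: read the static copy, with no database access."""
    return json.loads(Path(target).read_text(encoding="utf-8"))
```

The cost moves from every page view to the rare back-office update, which is the trade the paragraph above describes.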

2. Image server separation

For a web server, whether Apache, IIS, or another container, images are the most resource-consuming content to serve. It therefore makes sense to separate images from pages, and practically every large site adopts this strategy: they run a dedicated image server, or even many image servers. This architecture reduces the load on the servers that handle page requests and ensures that problems with images cannot bring down the system. The application servers and the image servers can also be tuned differently. On an Apache image server, for example, you can configure support for as few content types as possible and load as few modules (LoadModule) as possible, which lowers resource consumption and improves execution efficiency.
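On the application side, separating images mostly means that page templates emit image URLs pointing at the image servers rather than at the page servers. A minimal sketch, assuming two hypothetical hosts (`img1.example.com` and `img2.example.com`) and a hypothetical helper named `image_url`:

```python
import zlib

# Hypothetical image hosts; a real deployment would read these from configuration.
IMAGE_HOSTS = ["img1.example.com", "img2.example.com"]


def image_url(path, hosts=IMAGE_HOSTS):
    """Build the URL of an image on one of the dedicated image servers.

    The host is chosen from a checksum of the path, so a given image always
    maps to the same host and its URL stays stable for browser caches.
    """
    path = path.lstrip("/")
    index = zlib.crc32(path.encode("utf-8")) % len(hosts)
    return f"http://{hosts[index]}/{path}"
```

A template would then call `image_url("/photos/2008/cat.jpg")` instead of writing a relative `src`. This assumes every image host can serve every image, for example from shared or replicated storage.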

3. Database cluster and library table hash

Large websites run complex applications that have to use databases. Under heavy traffic the database quickly becomes the bottleneck, and a single database soon cannot keep up with the application, so we need a database cluster or database/table hashing.

For clustering, many databases have their own solutions. Oracle, Sybase, and others offer good ones, and the master/slave replication provided by the widely used MySQL is a similar approach. Whichever database you use, follow the corresponding solution for it.
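With master/slave replication, the application usually has to decide which server each statement goes to: writes go to the master, and reads can be spread over the slaves. The sketch below shows one simple routing policy. The class and method names are hypothetical, and the "connections" can be any objects (plain strings in the usage example), so this is not tied to a specific database driver.

```python
import itertools


class ReadWriteRouter:
    """Send SELECT statements to replicas in turn and everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        # Cycle through the replicas round-robin; None means "no replicas".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def for_statement(self, sql):
        """Return the connection that should execute this SQL statement."""
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.master


router = ReadWriteRouter("master-db", ["replica-1", "replica-2"])
print(router.for_statement("SELECT * FROM posts"))        # replica-1
print(router.for_statement("SELECT * FROM users"))        # replica-2
print(router.for_statement("UPDATE users SET name = ?"))  # master-db
```

Because replication is typically asynchronous, a read issued right after a write may not yet see the change on a replica. Reads that must be current, and locking reads such as `SELECT ... FOR UPDATE`, would need to be sent to the master explicitly, which this sketch does not handle.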

The database clusters mentioned above are constrained in architecture, cost, and scalability by the type of database used, so we need to consider improving the system architecture from the application side, and database/table hashing is the most common and effective approach. We split the database by business, application, or functional module, so that different modules map to different databases or tables. Then, within a given page or function, we hash the data into smaller databases or tables according to some policy; the user table, for example, can be hashed by user ID. This improves system performance at low cost and scales well. The Sohu forum uses this kind of architecture: it separates forum users, settings, posts, and other data into different databases, and then hashes posts and users across databases and tables by board and ID. In the end, a simple change to a configuration file lets the system add a low-cost database at any time to boost performance.
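The hashing policy itself can be very small. The following sketch maps a user ID to one of a fixed number of user tables, and maps a post to a database by board and to a table by post ID. The table names, the database names, and the counts are all assumptions for illustration, not the Sohu forum's actual scheme.

```python
import zlib


def user_table(user_id, table_count=16):
    """Pick the user table for a user by taking the ID modulo the table count."""
    return f"user_{user_id % table_count:02d}"


def post_location(board, post_id, databases, tables_per_db=8):
    """Pick (database, table) for a post: database by board, table by post ID."""
    # crc32 gives the same result on every run, unlike Python's built-in hash().
    db_index = zlib.crc32(board.encode("utf-8")) % len(databases)
    table = f"post_{post_id % tables_per_db:02d}"
    return databases[db_index], table


# Hypothetical database names, as they might appear in a configuration file.
POST_DATABASES = ["forum_posts_a", "forum_posts_b", "forum_posts_c"]

print(user_table(12345))  # user_09, because 12345 % 16 == 9
print(post_location("sports", 1001, POST_DATABASES))
```

One caveat with plain modulo hashing: changing `table_count` or the length of the database list changes where existing rows map, so adding a database usually also requires migrating data or keeping a lookup table that records where older rows live.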

4. Cache

Anyone who works in technology has come across the word cache, and caches are used in many places. Caching is also very important in website architecture and web development. Here we first describe the two most basic kinds of cache; advanced and distributed caches are described later.

Architecture-level cache: people familiar with Apache will know that it provides its own cache module, and you can also add Squid, a separate caching proxy, in front of it. Both can effectively improve Apache's response times.
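An architecture-level cache can only store responses that the application marks as cacheable, and the standard way to do that is with HTTP response headers. The helper below is a minimal sketch that builds `Cache-Control` and `Expires` headers for a response; the function name and the choice of a five-minute lifetime in the example are assumptions.

```python
import time
from email.utils import formatdate


def cache_headers(max_age_seconds, now=None):
    """Build headers telling shared caches they may reuse this response."""
    now = time.time() if now is None else now
    return {
        # "public" allows shared caches (proxies) to store the response.
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # Expires carries the same deadline as an HTTP date for older caches.
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }


# Example: let caches serve a page for five minutes before re-fetching it.
print(cache_headers(300))
```

Pages that differ per user, such as anything behind a login, should not be marked `public`; how the cache in front of the site treats these headers also depends on its own configuration.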

