Scalability of large portal site architecture design

Source: Internet
Author: User
Tags: access, server, memory, advantage

We know that for a large portal site, scalability is very important. To scale well both vertically and horizontally, you need to apply one principle during architectural design: division. Below I describe several ways to divide a system:

First, horizontal division:

1. Break a large website into several small sites: when a site has multiple functions, we can consider splitting it into several small modules, each of which can be a site of its own, so that we can flexibly deploy these sites to different servers.

2. Separate static from dynamic content: static files and dynamic files are best split into two sites. Static and dynamic sites stress a server differently, the former being IO-heavy and the latter CPU-heavy, so we can choose hardware with the right emphasis; the caching strategies for static and dynamic content also differ. As a typical application, we usually have separate file or image servers.
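As a minimal sketch of why the split matters, the two site types call for very different caching policies. The hostnames and TTL below are illustrative assumptions, not from the article:

```python
# Sketch: choosing HTTP cache headers per site type once static and
# dynamic content live on separate hosts. Hostnames/TTL are made up.
STATIC_HOSTS = {"static.example.com", "img.example.com"}

def cache_headers(host: str) -> dict:
    """Static content: long-lived, cacheable by proxies/CDN.
    Dynamic content: little or no caching."""
    if host in STATIC_HOSTS:
        return {"Cache-Control": "public, max-age=86400"}  # cache 1 day
    return {"Cache-Control": "no-cache"}                   # revalidate

print(cache_headers("static.example.com"))
print(cache_headers("www.example.com"))
```

With the two sites separated, this policy can live in the static front end (or CDN) without touching the dynamic application at all.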

3. Divide by function: for example, a module responsible for uploads. Upload operations are time-consuming, and if they are mixed in with other applications, even a little traffic can paralyze the server, so such a special module should be separated out. Likewise, separate secure content from non-secure content, keeping in mind any subsequent purchase of SSL certificates.
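One common way to isolate a slow upload module, sketched here under assumed names, is to accept the request immediately and hand the time-consuming work to a background worker through a queue:

```python
# Sketch: keeping slow upload processing off the request path by
# queueing it for a background worker. Names are illustrative.
import queue
import threading

upload_queue: queue.Queue = queue.Queue()
processed = []  # stand-in for persisted uploads

def upload_worker():
    while True:
        data = upload_queue.get()
        if data is None:                 # sentinel: shut down
            break
        processed.append(len(data))      # stand-in for the slow save
        upload_queue.task_done()

def handle_upload(data: bytes) -> str:
    upload_queue.put(data)               # returns immediately
    return "accepted"

t = threading.Thread(target=upload_worker, daemon=True)
t.start()
handle_upload(b"a" * 1024)
handle_upload(b"b" * 2048)
upload_queue.join()                      # wait for the worker to drain
upload_queue.put(None)
```

The request thread never blocks on the upload itself, so a burst of uploads degrades the worker, not the whole site.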

4. We do not necessarily have to run everything on our own servers. Search and reporting can rely on other people's services, such as Google's search and reporting services; we are not necessarily better at these than others, and it saves server bandwidth.

Second, vertical division:

1. Files form a tier just like the database, and file IO traffic may be even larger than database traffic, so this is also a vertical layer of access: uploaded files and images must be separated from the web server. Of course, putting the database and the site on the same server only suits the very smallest sites; separating them is the most basic step.

2. For dynamic programs involving database access, we can use a middle tier (the so-called application or logic layer), deployed on separate servers, to access the database. The greatest benefits are caching and flexibility. A cache's memory footprint is relatively large, so we want to separate it from the website process; we can then easily change data access strategies, and even if the database is later distributed, the deployment work can be done in this layer, so flexibility is very high. Another advantage is that the middle tier can bridge the Telecom and Netcom networks: Netcom users reaching a Telecom server through a dual-line middle tier may be faster than Netcom users accessing the Telecom server directly.
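The caching benefit described above is usually the cache-aside pattern: check the cache, fall back to the database on a miss, then populate the cache. A minimal sketch, where `fetch_from_db` and the TTL are hypothetical stand-ins:

```python
# Sketch of cache-aside in a middle tier: read the cache first,
# fall back to the database, then fill the cache. Illustrative only.
import time

CACHE: dict = {}
TTL = 60  # seconds; an assumed expiry, tune per data type

def fetch_from_db(key):
    return f"row-for-{key}"          # stand-in for a real query

def get(key):
    entry = CACHE.get(key)
    if entry and time.time() - entry[1] < TTL:
        return entry[0]              # cache hit: no database round trip
    value = fetch_from_db(key)       # cache miss: query the database
    CACHE[key] = (value, time.time())
    return value

print(get("user:42"))  # miss, goes to the database
print(get("user:42"))  # hit, served from memory
```

Because this logic lives in the middle tier, swapping the dictionary for a dedicated cache server later changes only this layer, not the websites.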

Some people say: I won't divide, I'll just do load balancing. Yes, you can, but with division the same 10 machines will certainly withstand more traffic than 10 undivided machines, and the hardware requirements may not be as high, since you know which servers need particularly good hardware. Strive to keep every server neither idle nor overloaded, and adjust and expand with a reasonable combination; such a system has high scalability and can be tuned according to traffic. The prerequisite is dividing first, and the advantages of division are flexibility, scalability, isolation, and security.

For the servers, there are a few things we need to observe over a long period; any one of them can become a bottleneck:

1. CPU: parsing dynamic files needs more CPU. For a CPU bottleneck, look at which function occupies threads for too long, and if one is found, split it out. Or, if each request is processed quickly but traffic is very high, add servers. The CPU is a good thing; don't let it sit idle waiting for work.

2. Memory: the cache runs independently of the IIS process, and a web server by itself generally does not lack memory. Memory is faster than disk and should be used wisely.

3. Disk IO: use Performance Monitor to find which files have particularly heavy IO; once found, move them to a separate file server, or serve them directly from a CDN. Disks are slow: applications that read data at scale should rely on caching, and applications that write data at scale can rely on queues to smooth out bursts of concurrency.
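The write-side advice above can be sketched as batching: instead of hitting the disk once per record, a burst is buffered and flushed as a few large writes. The batch size and in-memory "disk" are illustrative assumptions:

```python
# Sketch: flattening a burst of small writes into a few large ones,
# so the disk sees fewer, bigger operations. Illustrative only.
class BatchedWriter:
    def __init__(self, flush_size=4):
        self.buffer = []
        self.flushes = []            # stand-in for actual disk writes
        self.flush_size = flush_size

    def write(self, record):
        self.buffer.append(record)
        if len(self.buffer) >= self.flush_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.flushes.append(list(self.buffer))  # one big write
            self.buffer.clear()

w = BatchedWriter()
for i in range(10):                  # a burst of 10 small writes...
    w.write(i)
w.flush()                            # ...lands on "disk" as 3 flushes
```

The same idea underlies message queues in front of write-heavy services: the burst lands in memory, and the disk absorbs it at its own pace.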

4. Network: we know network communication is relatively slow, slower even than disk. If you do distributed caching or distributed computing, you must account for the network communication time between physical servers; of course, under heavy traffic, distribution still raises the system's capacity by a level. A CDN can take over part of the static content, and when hosting servers you should also consider the Chinese particularities of the Telecom/Netcom split and firewalls.

For the SQL Server database server:

It is, in fact, still horizontal and vertical segmentation. Picture a two-dimensional table: horizontal segmentation cuts across it with one knife, vertical segmentation cuts down it with another:

1. Vertical segmentation means that different applications can be split into different databases, on different instances; or a table with many fields can be split into several smaller tables.

2. Horizontal segmentation means that even an application without heavy load, such as user registration, can end up with a very large table, and that large table can be split up. You can use table partitioning, storing the data in different files and then deploying them to independent physical servers to increase IO throughput and improve read and write performance; the home-grown alternative is to regularly archive old data yourself. Another advantage of table partitioning is faster queries, because the page index can have multiple levels, just as a folder should not hold too many files, nor should folders be nested too many levels deep.
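Splitting a large user table horizontally usually means a placement rule that maps each row to one partition. A minimal sketch, with an assumed shard count and modulo placement (range- or date-based placement works the same way):

```python
# Sketch: placing rows of a huge user table onto one of N partitions
# by user id. The shard count and modulo rule are illustrative.
N_SHARDS = 4

def shard_for(user_id: int) -> int:
    return user_id % N_SHARDS        # every id maps to exactly one shard

# rows spread evenly across shards, so IO spreads across servers
placements = {uid: shard_for(uid) for uid in range(10)}
print(placements)
```

Each partition can then live in its own file or on its own physical server, which is exactly what spreads the IO load.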

3. You can also use database mirroring, replication/subscription, and transaction log shipping to separate reads and writes onto different mirrored physical databases. This is generally enough; if not, you can use hardware to achieve database load balancing. Of course, for BI we might also have a data warehouse.
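The read/write separation above boils down to a small routing rule in the data access layer: writes go to the primary, reads are spread over the replicas. A sketch with hypothetical server names:

```python
# Sketch: read/write splitting. Writes hit the primary; reads are
# spread round-robin over replicas. Server names are made up.
import itertools

PRIMARY = "db-primary"
REPLICAS = ["db-replica-1", "db-replica-2"]
_replica_cycle = itertools.cycle(REPLICAS)

def route(sql: str) -> str:
    """Pick a server for a statement by a crude verb check."""
    if sql.lstrip().upper().startswith(("INSERT", "UPDATE", "DELETE")):
        return PRIMARY
    return next(_replica_cycle)      # reads rotate across replicas

print(route("SELECT * FROM users"))
print(route("UPDATE users SET name = 'x'"))
```

Replication lag means replicas can serve slightly stale reads, which is usually acceptable for portal content but not for read-after-write flows; those should be pinned to the primary.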

Once the architecture takes all this into account, when traffic grows, load balancing of web servers or application servers can be adjusted on that basis. Most of the time we are in a repeated cycle of discovering a problem -> finding the bottleneck -> solving it.

The typical architecture is as follows:

Dynamic web servers should have better CPUs; static web servers and file servers should have better disks.

Application servers should have plenty of memory, and so should cache servers; database servers, of course, should have both good memory and good CPUs.


