Building a Large Website Architecture in 10 Steps

Source: Internet
Author: User

This article walks through how a website's system architecture is typically built. Everyone hopes a site will have a sound architecture from the start, but in practice the architecture evolves as the business expands and user demands grow. What follows is the basic process by which a website architecture gradually develops.

Architecture Evolution Step One: Physically separate the web server and the database

It starts with an idea, so you put up a website on the internet. At this point even the host may be rented, but since this article focuses only on how the architecture evolves, assume you already have a hosted machine and some bandwidth. Because the site has some distinctive features, it attracts visitors, and you gradually find that system load keeps rising and responses keep slowing. The most visible symptom is that the database and the application affect each other: when the application has a problem the database tends to have one too, and vice versa. This leads to the first stage of evolution: physically separating the application and the database onto two machines. This requires no new technology, yet the system returns to its earlier response speed, supports higher traffic, and the database and application no longer interfere with each other.

Architecture Evolution Step Two: Add a page cache

Over time, as more and more people visit, response times start to slow again. Investigation shows there are too many database operations, so requests compete for database connections and responses slow down. You cannot simply open more connections, because that would put heavy load on the database machine, so you consider caching to reduce both the contention for connections and the read pressure on the database. At this point you might use something like Squid to cache the relatively static pages in the system (for example, pages updated once every two days); generating static pages is an alternative. This way, without modifying the program, you reduce the load on the web server and the contention for database connections. So you start using Squid to cache relatively static pages.
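The idea behind page caching can be sketched in a few lines of Python. This is only an in-process illustration under stated assumptions: Squid works at the HTTP proxy level and needs no application code, and the names `PageCache` and `render_from_database` are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class PageCache:
    """Tiny TTL cache for whole rendered pages (hypothetical illustration)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._pages: Dict[str, Tuple[float, str]] = {}  # url -> (expiry, html)

    def get(self, url: str, render: Callable[[str], str]) -> str:
        now = self.clock()
        entry = self._pages.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]               # hit: no database work at all
        html = render(url)                # miss: fall through to the application
        self._pages[url] = (now + self.ttl_seconds, html)
        return html


render_calls = []


def render_from_database(url: str) -> str:
    # Stand-in for the expensive, database-backed page rendering.
    render_calls.append(url)
    return f"<html>content of {url}</html>"


cache = PageCache(ttl_seconds=2 * 24 * 3600)  # page changes about every two days
first = cache.get("/news/1", render_from_database)
second = cache.get("/news/1", render_from_database)  # served from the cache
```

The second request never reaches `render_from_database`, which is the effect the page cache is meant to have on database connections.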

Architecture Evolution Step Three: Add a page fragment cache

After adding Squid as a cache, overall speed does improve and the load on the web server drops, but as traffic grows the system starts to slow again. Having tasted the benefits of a cache such as Squid, you start to wonder whether the relatively static parts of dynamic pages could be cached as well. So you consider a page fragment caching strategy such as ESI and begin using it to cache the relatively static fragments of dynamic pages.
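Fragment caching can be illustrated roughly as follows. This is not ESI itself (ESI is markup processed by a proxy or edge server); it is a minimal in-process sketch of the same idea, and `cached_fragment`, `build_sidebar`, and `render_page` are hypothetical names.

```python
import time
from typing import Callable, Dict, Tuple

_fragments: Dict[str, Tuple[float, str]] = {}  # name -> (expiry, html)
build_counts = {"sidebar": 0}


def cached_fragment(name: str, ttl: float, build: Callable[[], str]) -> str:
    """Return a cached fragment, rebuilding it only after its TTL expires."""
    now = time.monotonic()
    hit = _fragments.get(name)
    if hit is not None and hit[0] > now:
        return hit[1]
    html = build()
    _fragments[name] = (now + ttl, html)
    return html


def build_sidebar() -> str:
    # Stand-in for a relatively static, database-backed fragment.
    build_counts["sidebar"] += 1
    return "<ul><li>Top stories</li></ul>"


def render_page(user_name: str) -> str:
    sidebar = cached_fragment("sidebar", ttl=300, build=build_sidebar)
    greeting = f"<p>Hello, {user_name}</p>"  # per-request part, never cached
    return f"<html>{greeting}{sidebar}</html>"


page_a = render_page("alice")
page_b = render_page("bob")
```

Both pages differ in their greeting, but the sidebar is built only once and reused.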

Architecture Evolution Step Four: Data caching

Using ESI-like techniques improves caching again and system load does drop further, but once more, as traffic increases, the system starts to slow. Investigation may show that some data is fetched repeatedly, such as user information, so you consider caching this data as well and store it in local memory. Once the change is complete it fully meets expectations: response speed is restored and pressure on the database drops considerably.
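A minimal sketch of such a local data cache, assuming a read-through lookup with invalidation on update. `FAKE_DB` stands in for the real user table, and every name here is hypothetical.

```python
from typing import Dict, Optional

# Stand-in for the users table; a real system would query the database.
FAKE_DB: Dict[int, dict] = {1: {"name": "alice"}, 2: {"name": "bob"}}
stats = {"db_reads": 0}

_user_cache: Dict[int, dict] = {}  # local in-memory cache: user_id -> row


def load_user_from_db(user_id: int) -> Optional[dict]:
    stats["db_reads"] += 1
    row = FAKE_DB.get(user_id)
    return dict(row) if row is not None else None


def get_user(user_id: int) -> Optional[dict]:
    """Read-through lookup: only a cache miss touches the database."""
    if user_id in _user_cache:
        return _user_cache[user_id]
    user = load_user_from_db(user_id)
    if user is not None:
        _user_cache[user_id] = user
    return user


def update_user_name(user_id: int, name: str) -> None:
    FAKE_DB[user_id]["name"] = name
    _user_cache.pop(user_id, None)  # invalidate so the next read is fresh
```

Calling `get_user(1)` twice reads the database once. After `update_user_name(1, "alicia")`, the next `get_user(1)` reloads the row.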

Architecture Evolution Step Five: Add a web server

Before long, as traffic grows, load on the web server machine climbs quite high at peak times, so you consider adding a web server. This also addresses availability, since with a single web server the site becomes unusable if that server goes down. Having weighed this, you decide to add a web server, which raises some typical problems:

1. How to distribute requests across the two machines. The usual options are Apache's own load balancing or a software load balancer such as LVS.

2. How to keep state such as user sessions in sync. Options include writing sessions to the database, writing them to shared storage, keeping them in cookies, or using a session synchronization mechanism.

3. How to keep cached data in sync, such as the previously cached user data. The usual options are a cache synchronization mechanism or a distributed cache.

4. How to keep features such as file upload working. The usual approach is a shared file system or shared storage.

With these problems solved, the site now runs on two web servers, and the system is back to its earlier speed.
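The first two problems can be sketched together: a round-robin dispatcher in front of two servers that share one session store. This is an in-process illustration, not how Apache or LVS are configured, and `SessionStore`, `WebServer`, and `RoundRobinBalancer` are hypothetical names.

```python
import itertools
from typing import Dict, List


class SessionStore:
    """Stand-in for shared session storage (for example, a database)."""

    def __init__(self) -> None:
        self._data: Dict[str, dict] = {}

    def load(self, session_id: str) -> dict:
        return self._data.setdefault(session_id, {})


class WebServer:
    def __init__(self, name: str, sessions: SessionStore) -> None:
        self.name = name
        self.sessions = sessions

    def handle(self, session_id: str) -> str:
        # State lives in the shared store, so any server can serve any user.
        session = self.sessions.load(session_id)
        session["hits"] = session.get("hits", 0) + 1
        return f"{self.name}: hit {session['hits']}"


class RoundRobinBalancer:
    def __init__(self, servers: List[WebServer]) -> None:
        self._cycle = itertools.cycle(servers)

    def dispatch(self, session_id: str) -> str:
        return next(self._cycle).handle(session_id)


shared = SessionStore()
balancer = RoundRobinBalancer([WebServer("web1", shared),
                               WebServer("web2", shared)])
responses = [balancer.dispatch("sid-42") for _ in range(3)]
# responses == ["web1: hit 1", "web2: hit 2", "web1: hit 3"]
```

Because the hit counter lives in the shared store, it keeps increasing even though consecutive requests land on different servers.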

Architecture Evolution Step Six: Split the database

After a happy period of rapidly growing traffic, the system starts to slow down again. What is going on this time? Investigation shows fierce contention for database connections among write and update operations, which slows the system. What to do? The options at this point are a database cluster or splitting the data across several databases. Cluster support in some databases is not very good, so splitting becomes the more common strategy. Splitting also means modifying the original program. After one round of changes the split is in place, the goal is reached, and the system is even faster than before.
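A rough sketch of what the program change can look like, assuming the split is by business domain (user data in one database, order data in another). The text does not say how the data is divided, so this is an assumption. Two in-memory SQLite databases stand in for the separate database machines, and all table and function names are hypothetical.

```python
import sqlite3

# Assumption: one database per business domain after the split.
DATABASES = {
    "users": sqlite3.connect(":memory:"),
    "orders": sqlite3.connect(":memory:"),
}
DATABASES["users"].execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
DATABASES["orders"].execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, item TEXT)")


def db_for(domain: str) -> sqlite3.Connection:
    """Pick the database that owns a given business domain."""
    return DATABASES[domain]


def create_user(user_id: int, name: str) -> None:
    with db_for("users") as conn:  # commits on success
        conn.execute("INSERT INTO users (id, name) VALUES (?, ?)",
                     (user_id, name))


def create_order(order_id: int, user_id: int, item: str) -> None:
    with db_for("orders") as conn:
        conn.execute(
            "INSERT INTO orders (id, user_id, item) VALUES (?, ?, ?)",
            (order_id, user_id, item))


def orders_with_user_names() -> list:
    # A SQL JOIN across the two databases is no longer available,
    # so the application combines the rows itself.
    result = []
    rows = db_for("orders").execute(
        "SELECT id, user_id, item FROM orders ORDER BY id").fetchall()
    for order_id, user_id, item in rows:
        name = db_for("users").execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()[0]
        result.append((order_id, name, item))
    return result
```

The last function shows why the split forces program changes: a query that used to be a single join now has to be assembled in application code.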

Architecture Evolution Step Seven: Table splitting, DAL, and distributed cache

As the system keeps running, the volume of data grows substantially, and queries are still somewhat slow even after the database split. So, following the same idea as the database split, you start splitting tables. This inevitably requires changes to the program. You may find at this point that having the application itself track the rules for splitting databases and tables is rather complex, which prompts the idea of a common framework for data access across the split databases and tables. This corresponds to the DAL in eBay's architecture, and building it takes a relatively long time. It is also possible that work on this generic framework does not start until the table split is finished. Meanwhile, the earlier cache synchronization scheme may run into problems: the data volume is now too large to keep the cache locally and synchronize it, so a distributed cache is needed. After another round of investigation and pain, a large amount of cached data is finally moved to a distributed cache.
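The role of such a data access layer can be sketched as a routing function that hides the splitting rules from the application. This is a minimal illustration under assumed rules (modulo routing on a user ID, two databases with four tables each). It does not describe eBay's DAL, and `route`, `UserDal`, and the naming scheme are hypothetical.

```python
from typing import Dict, Optional, Tuple

DB_COUNT = 2        # hypothetical number of databases
TABLES_PER_DB = 4   # hypothetical number of tables per database


def route(user_id: int) -> Tuple[str, str]:
    """Map a user ID to the (database, table) pair that stores it."""
    slot = user_id % (DB_COUNT * TABLES_PER_DB)
    db_index = slot // TABLES_PER_DB
    table_index = slot % TABLES_PER_DB
    return f"user_db_{db_index}", f"user_{table_index}"


class UserDal:
    """Callers use save/find and never see the splitting rules."""

    def __init__(self) -> None:
        # (database, table) -> {user_id: row}; in-memory stand-in for
        # the real databases.
        self._storage: Dict[Tuple[str, str], Dict[int, dict]] = {}

    def save(self, user_id: int, row: dict) -> None:
        self._storage.setdefault(route(user_id), {})[user_id] = row

    def find(self, user_id: int) -> Optional[dict]:
        return self._storage.get(route(user_id), {}).get(user_id)


dal = UserDal()
dal.save(5, {"name": "alice"})
dal.save(10, {"name": "bob"})
```

With these assumed rules, user 5 lands in `user_db_1.user_1` and user 10 in `user_db_0.user_2`, while the calling code only ever uses `dal.save` and `dal.find`.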

Architecture Evolution Step Eight: Add more web servers

After the database and table splitting, pressure on the database drops to a relatively low level, and you go back to happily watching daily traffic soar. Then one day the system starts to slow down again. You check the database first and its load is normal. You then check the web servers and find that Apache is blocking many requests, while the application server handles each request fairly quickly. It seems the number of requests is so high that they queue up, which slows responses. This is manageable: by this time there is usually some money available, so you add more web servers. Doing so can bring several challenges:

1. Software load balancing with Apache or LVS can no longer cope with scheduling the huge volume of web traffic (number of connections, network throughput, and so on). If funding allows, the plan is to buy hardware load balancers such as F5, NetScaler, or Alteon. If not, the plan is to classify the applications logically and spread them across different software load-balanced clusters.

2. Some of the existing solutions for state synchronization, file sharing, and so on may become bottlenecks and need to be improved. At this point you might write a distributed file system tailored to the site's needs.
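The lower-cost option in the first item, classifying applications logically and spreading them over separate software load-balanced clusters, might look roughly like this. The URL prefixes and server names are hypothetical, and a real deployment would do this in the load balancer configuration, not in application code.

```python
import itertools

# Hypothetical classification: each application category gets its own cluster.
CLUSTERS = {
    "/search": itertools.cycle(["search-1", "search-2"]),
    "/shop": itertools.cycle(["shop-1", "shop-2", "shop-3"]),
}
DEFAULT_CLUSTER = itertools.cycle(["web-1", "web-2"])


def pick_server(path: str) -> str:
    """Choose a cluster by URL prefix, then round-robin inside it."""
    for prefix, servers in CLUSTERS.items():
        if path == prefix or path.startswith(prefix + "/"):
            return next(servers)
    return next(DEFAULT_CLUSTER)
```

Each cluster only has to schedule the traffic for its own category, so no single software balancer carries the whole site's load.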

After this work, you enter a seemingly perfect era of unlimited scaling: when traffic increases, the solution is simply to add more web servers.

Architecture Evolution Step Ten: Enter the era of large-scale distributed applications and the dream era of inexpensive server clusters

After the long and painful process above, a seemingly perfect era arrives again: adding web servers supports ever higher traffic. For a large site, popularity is unquestionably important, and as popularity rises, feature requests start to explode. You suddenly realize that the web application deployed on the web servers has become very large. When several teams change it at once it is quite inconvenient, and reusability is poor: basically every team duplicates some work. Deployment and maintenance are also troublesome, because copying and starting a huge application package on N machines takes a lot of time, and problems are hard to track down. Worse, a bug in one part of the application can make the whole site unavailable. There are other factors too, such as the difficulty of tuning, because every machine runs an application that does everything and so cannot be tuned for a specific workload. Based on this analysis, you decide to split the system by responsibility, and so a large distributed application is born. This step usually takes a long time because it brings many challenges:

1. Once the system is split into distributed applications, it needs a high-performance, stable communication framework that supports a variety of communication and remote call modes.

2. Splitting a huge application takes a long time and requires sorting out the business and controlling dependencies between systems.

3. How to operate this huge distributed application well (dependency management, health management, error tracking, tuning, monitoring, alerting, and so on).
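As a small illustration of the third challenge, here is a sketch of a registry that dispatches calls to service instances, fails over when one is down, and records the errors. Plain function calls stand in for remote calls, and `ServiceRegistry` and the service functions are hypothetical. A real communication framework would add networking, timeouts, and monitoring.

```python
from typing import Callable, Dict, List


class ServiceRegistry:
    """Dispatch to service instances with failover and crude error tracking."""

    def __init__(self) -> None:
        self._instances: Dict[str, List[Callable[[str], str]]] = {}
        self.errors: List[str] = []

    def register(self, service: str, instance: Callable[[str], str]) -> None:
        self._instances.setdefault(service, []).append(instance)

    def call(self, service: str, request: str) -> str:
        for instance in self._instances.get(service, []):
            try:
                return instance(request)
            except ConnectionError as exc:
                # Record the failure and try the next instance.
                self.errors.append(f"{service}: {exc}")
        raise RuntimeError(f"no healthy instance for {service}")


def broken_user_service(request: str) -> str:
    raise ConnectionError("user-service-1 is down")


def healthy_user_service(request: str) -> str:
    return f"profile for {request}"


registry = ServiceRegistry()
registry.register("user", broken_user_service)
registry.register("user", healthy_user_service)
result = registry.call("user", "alice")
```

The caller still gets a result even though the first instance is down, and the failure is kept in `registry.errors` for later inspection.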

After this step, the system's architecture largely enters a relatively stable phase. You can also start using large numbers of inexpensive machines to support the huge volume of traffic and data and, drawing on this architecture and the experience gained through so many rounds of evolution, adopt various other methods to support ever-increasing traffic.

This step draws on a broad body of knowledge. It requires a deep understanding of communication, remote calls, messaging mechanisms, and so on, with a clear grasp at every level: theory, hardware, the operating system, and the implementation language used. To learn more, see the related tutorials on the e-Mentor network.
