============= Optimization of Large-Scale Website Architecture =====================
A small website, such as a personal site, can be built from the simplest static HTML pages, with a few images for decoration, and with every page stored in a single directory. Such a site places very modest demands on system architecture and performance. As Internet businesses have grown richer over the years, however, website technology has become highly specialized. Large sites in particular draw on a very wide range of technology, from hardware to software: programming languages, databases, web servers, firewalls, and other areas all face demanding requirements that the original simple static HTML site cannot be compared with.
Large websites, such as portals, face huge numbers of users and highly concurrent requests. The basic solutions concentrate on a few areas: high-performance servers, high-performance databases, efficient programming languages, and high-performance web containers. But beyond these measures, there is still no way to fully solve the high-load, high-concurrency problems that large sites face.
The solutions above also imply, to some extent, greater investment; they still run into bottlenecks, and they do not scale well. Below I share my experience from the perspective of low cost, high performance, and high scalability.
1. Static HTML
In fact, we all know that pure static HTML pages are the most efficient to serve and the cheapest in resources, so we should try to implement the pages of our site as static pages; the simplest method really is the most effective one. But for sites with a lot of frequently updated content, we cannot do all of this by hand, so we use a content management system (CMS). The news channels of the portals we often visit, and frequently their other channels too, are managed and published through such information-publishing systems. At minimum, a CMS turns entered information into automatically generated static pages; it also provides channel management, permission management, automatic content capture, and other features. For a large website, an efficient, manageable CMS is essential.
Beyond portals and information-publishing sites, static generation is also a necessary performance technique for interactive community sites. Rendering community posts and articles to static pages in real time, and re-rendering them when they are updated, is a widely used strategy; hodgepodge communities like Mop use this approach, as does the NetEase community.
At the same time, HTML static generation is itself a kind of caching strategy. For content that the system queries from the database frequently but updates rarely, static generation is worth considering; a forum's public settings are a good example. Mainstream forum software today lets these settings be managed in the back end and stored in the database, and the front end reads them constantly even though they change very rarely. Regenerating this part as static HTML whenever it is updated in the back end avoids a large number of database requests.
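The regenerate-on-update idea can be sketched in a few lines. This is a minimal illustration, not any particular forum's code; the file path and the rendering function are assumptions:

```python
# Regenerate a static HTML fragment whenever settings are saved, so that
# page views read a file instead of querying the database every time.
SETTINGS_HTML = "settings.html"  # hypothetical output path

def render_settings(settings: dict) -> str:
    rows = "".join(f"<li>{k}: {v}</li>" for k, v in settings.items())
    return f"<ul>{rows}</ul>"

def save_settings(settings: dict, out_path: str = SETTINGS_HTML) -> None:
    # Called from the admin back end; the static file is rewritten here,
    # on the rare update, rather than queried on every page view.
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(render_settings(settings))

def serve_settings(out_path: str = SETTINGS_HTML) -> str:
    # The hot path: a plain file read, no database involved.
    with open(out_path, encoding="utf-8") as f:
        return f.read()
```

The write happens once per (rare) settings change, while every page view becomes a cheap file read that the web server or an upstream cache can also serve directly.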
2. Image server separation
As you know, for any web server, whether Apache, IIS, or another container, images are the most resource-consuming content to serve, so we should separate images from pages. This is a strategy essentially every large site adopts: they run a dedicated image server, or even many image servers. This architecture reduces the pressure on the servers that handle page requests and ensures the system will not crash because of image traffic. The application server and image server can also be tuned differently; for example, Apache on the image server can be configured to support as few content types as possible and load as few modules as possible, keeping resource consumption low and execution efficiency high.
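As a sketch of what "configure less on the image server" can look like, here is a hypothetical stripped-down Apache virtual host for a dedicated image host (hostname and paths are invented, and mod_headers is assumed to be loaded):

```apache
# Hypothetical minimal vhost for img.example.com: static files only,
# no scripting modules, long client-side cache lifetimes.
<VirtualHost *:80>
    ServerName img.example.com
    DocumentRoot /var/www/images

    <Directory /var/www/images>
        Options -Indexes
        Require all granted
    </Directory>

    # Let browsers and proxies cache images for a week (mod_headers).
    <FilesMatch "\.(gif|jpe?g|png)$">
        Header set Cache-Control "public, max-age=604800"
    </FilesMatch>
</VirtualHost>
```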
3. Database clusters and database/table hashing
Large websites have complex applications that inevitably use databases, and under heavy traffic the database's bottleneck is soon revealed: a single database quickly becomes unable to keep up with the application. At that point we need database clustering or database/table hashing.
For clustering, many databases have their own solutions; Oracle, Sybase, and others offer good ones, and the Master/Slave replication provided by the popular MySQL is a similar scheme. Whatever database you use, consult its corresponding solution for implementation.
The database clusters just mentioned are constrained in architecture, cost, and scalability by the particular database in use, so we should also consider improving the system architecture from the application side; here, database and table hashing is the most common and effective approach. We partition the database along business and application lines, or by functional module, so that different modules map to different databases or tables. Then, within a page or feature, we hash into smaller databases according to some policy; for the user table, for example, we hash by user ID. This improves system performance at low cost and scales well. The Sohu forum uses such a scheme: forum user, settings, and post data are separated into different databases, and posts and users are then hashed to databases and tables by board and by ID. In the end, adding an inexpensive database to boost system capacity takes nothing more than a simple change to a configuration file.
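The hash-by-user-ID routing described above can be sketched as follows. The shard counts and naming scheme are illustrative assumptions, not Sohu's actual layout:

```python
# Route a user record to a database and a table by user ID.
N_DATABASES = 4   # hypothetical databases user_db_0 .. user_db_3
N_TABLES = 16     # hypothetical tables user_0 .. user_15 in each database

def locate_user(user_id: int) -> tuple:
    """Return the (database, table) that holds this user's row."""
    db = f"user_db_{user_id % N_DATABASES}"
    table = f"user_{(user_id // N_DATABASES) % N_TABLES}"
    return db, table
```

Because the mapping is pure arithmetic on the ID, every application server computes the same location with no central lookup; adding capacity means changing the shard counts in configuration (plus migrating the affected rows).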
4. Cache
Anyone who has touched technology has come across the word cache, and caches are used in many places. Caching is also very important in website architecture and web development. Here I first describe the two most basic kinds of cache; advanced and distributed caches are described later.
Architecture-level caching: those familiar with Apache know that it provides its own cache module, and an added Squid layer can also be used for caching; both effectively improve Apache's response to requests.
Application-level caching: the in-memory caches available on Linux offer a common interface that can be used in web development; for example, Java development can call MemoryCache to cache some data and share it, and some large communities use such frameworks. Furthermore, every web development language has its own caching modules and methods: PHP has the Pear Cache module, Java has many options, and although I am not very familiar with .NET, I am sure it has them too.
5. Mirroring
Mirroring is a technique large websites often use to improve performance and data safety. Mirroring can smooth out the access-speed differences between network providers and regions; the gap between ChinaNet and the education network (CERNET), for example, has prompted many websites to build mirror sites inside the education network, with data updated on a schedule or in real time. I will not go deep into the details of mirroring here; there are many professional off-the-shelf solution architectures and products to choose from, as well as inexpensive software approaches, such as the rsync tool on Linux.
6. Load Balancing
Load balancing will be the ultimate solution for a large website facing high-load access and large numbers of concurrent requests.
Load-balancing technology has been developing for many years, and there are many professional vendors and products to choose from. I have personally worked with several solutions, two of which I can offer as architectural references.
Hardware layer-4 switching
Layer-4 switching uses header information from layer-3 and layer-4 packets to identify traffic flows by application port range, and distributes the traffic for an entire range to the appropriate application server for processing. A layer-4 switch acts like a virtual IP address that fronts the physical servers. It can forward traffic for services over a variety of protocols such as HTTP, FTP, NFS, and Telnet. These forwarding decisions are made against the pool of physical servers and require sophisticated load-balancing algorithms. In the IP world, the service type is determined by the TCP or UDP port of the endpoint, so in layer-4 switching the application flow is identified by source and destination IP addresses together with TCP and UDP ports.
In the hardware layer-4 switching market there are well-known products to choose from, such as Alteon and F5. They are expensive but good value, providing excellent performance and very flexible management capabilities. In its early days, Yahoo China ran close to 2000 servers behind just three or four Alteons.
Software layer-4 switching
Once you understand the principle of a hardware layer-4 switch, software layer-4 switching based on the OSI model follows naturally: it implements the same principle with slightly lower performance, yet it handles a fair amount of load comfortably. Some would say a software implementation is actually more flexible; how much it can handle depends entirely on how familiar you are with its configuration.
For software layer-4 switching we can use LVS (Linux Virtual Server), common on Linux. It provides real-time failover based on a heartbeat line, improving the robustness of the system, and its flexible virtual IP (VIP) configuration and management functions can satisfy a variety of application needs, which is essential for a distributed system.
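For illustration, an LVS virtual server is commonly managed through keepalived. The fragment below is a hypothetical sketch (all addresses invented): one VIP fronting two real servers with weighted round-robin, direct routing, and a TCP health check.

```text
# Hypothetical keepalived.conf fragment for an LVS virtual server.
virtual_server 192.168.0.100 80 {
    delay_loop 6
    lb_algo wrr          # weighted round-robin
    lb_kind DR           # direct routing
    protocol TCP

    real_server 192.168.0.11 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
        }
    }
    real_server 192.168.0.12 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
        }
    }
}
```

A real server that fails its health check is pulled from the pool automatically, which is the failover behavior described above.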
A typical load-balancing strategy is to build a Squid cluster on top of hardware or software layer-4 switching. Many large websites, including search engines, adopt this architecture: it is low-cost, high-performance, and highly scalable, and nodes can easily be added to or removed from the architecture at any time. I plan to set aside time to write about this structure in detail and discuss it with you.
A large website may use every one of the methods mentioned above at the same time. My introduction here is fairly brief; many implementation details need to be worked out through familiarity and experience. Sometimes a single small Squid or Apache parameter setting has a large impact on system performance, and I hope we can discuss these together.
============= The Evolution of Large-Scale Website Architecture ================
There have been articles about the architecture evolution of large websites such as LiveJournal and eBay that are well worth reading, but I feel they talk more about the result of each evolution than about why the evolution was needed. Recently I have also noticed that many colleagues find it hard to understand why a website needs such complex technology, hence the idea of writing this article. In it I will explain a fairly typical architecture evolution as an ordinary website grows into a large one, and the body of knowledge that needs to be mastered along the way. I hope it gives people working in the Internet industry a first rough picture. :) Where the text is wrong, please offer corrections, so that this article really serves to start the discussion.
Architecture evolution, step one: physically separate the web server and the database
At first, driven by some idea, you set up a website on the Internet. At this point you may even be renting a host, but since this article focuses only on architecture evolution, let us assume you already have a hosted machine with a certain amount of bandwidth. Because the site has some appeal, it attracts visitors, and gradually you find the system under growing pressure and responding more slowly. What stands out is the interaction between the database and the application: when the application has problems the database easily runs into problems too, and when the database has problems the application suffers as well. So you enter the first stage of evolution: physically separating the application and the database onto two machines. This requires no new technical knowledge, but you find that it works: the system is back to its earlier response speed, supports higher traffic, and the database and application no longer drag each other down.
Look at the diagram of the system after the completion of this step:
This step involves these knowledge systems:
This step of architecture evolution places few demands on the technical knowledge system.
Architecture evolution, step two: add page caching
Before long, as more and more people visit, you find response times slowing again. Investigating, you discover too many database operations per access, causing fierce competition for database connections and slow responses. You cannot simply open more connections, or the pressure on the database machine becomes too high. So you consider a caching mechanism to reduce contention for database connections and the read pressure on the database. At this point you might choose something like Squid to cache the relatively static pages in the system (for example, pages updated only every day or two). (You could also adopt static page generation, but that would require modifying the program.) This reduces the pressure on the web server and the contention for database connection resources, so you start using Squid to cache the relatively static pages.
Look at the diagram of the system after the completion of this step:
This step involves these knowledge systems:
Front-end page caching technology, such as Squid; to use it well you also need to master Squid's implementation and its cache-invalidation algorithms.
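As a rough sketch of the Squid setup described above (hostname and backend address are hypothetical), a reverse proxy in front of the web server might be configured along these lines:

```text
# Hypothetical squid.conf fragment: accept port 80 traffic and forward
# cache misses to the backend web server.
http_port 80 accel defaultsite=www.example.com
cache_peer 192.168.0.10 parent 80 0 no-query originserver name=web1

# For responses without explicit freshness info, treat pages as fresh
# for at least 60 minutes and at most a day (values are in minutes).
refresh_pattern . 60 20% 1440
```

Pages that change every day or two are then served from Squid's cache most of the time, and the web server and database only see the misses.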
Architecture evolution, step three: add page fragment caching
Having added Squid as a cache, you see overall system speed genuinely improve and the pressure on the web server begin to fall. But as traffic grows, the system again becomes a little slower. Having tasted the benefits of dynamic caching with Squid, you start to wonder whether the relatively static parts of the dynamic pages can also be cached, so you consider a page fragment caching strategy such as ESI, and you start using ESI to cache the relatively static fragments of dynamic pages.
Look at the diagram of the system after the completion of this step:
This step involves these knowledge systems:
Page fragment caching technology, such as ESI; to use it well you also need to master ESI's implementation, and so on;
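For illustration, ESI marks the dynamic hole in an otherwise cacheable page; an ESI-capable cache assembles the page by fetching the fragment separately (the URLs here are hypothetical):

```html
<!-- The page shell is cached as a whole; only the fragment pulled in
     by esi:include is fetched on its own and can expire on its own. -->
<html>
  <body>
    <h1>Front page (cached shell)</h1>
    <esi:include src="/fragments/latest-posts" />
  </body>
</html>
```

This lets a page that is 90% static stay in the cache even though one corner of it changes every minute.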
Architecture evolution, step four: data caching
After adopting a technology like ESI to improve the system's caching once more, the pressure does drop further, but again, as traffic grows, the system starts to slow down. Investigating, you may find the system repeatedly fetching the same data, such as user profile information. You start to consider whether this data can be cached too, so you cache it in local memory; when the change is complete it fully meets expectations, response times recover, and the pressure on the database drops considerably.
Look at the diagram of the system after the completion of this step:
This step involves these knowledge systems:
Caching techniques, including map data structures, caching algorithms, and the implementation mechanisms of the chosen framework.
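A minimal local data cache with a time-to-live, in the spirit of the user-information example above, might look like this sketch (the `load_user` stand-in for the real lookup is an assumption):

```python
import time

TTL_SECONDS = 60.0                 # how long a cached entry stays fresh
_user_cache: dict = {}             # user_id -> (cached_at, user record)

def load_user(user_id: int) -> dict:
    # Stand-in for the real (expensive) user lookup.
    return {"id": user_id, "name": f"user{user_id}"}

def get_user(user_id: int) -> dict:
    now = time.monotonic()
    entry = _user_cache.get(user_id)
    if entry is not None and now - entry[0] < TTL_SECONDS:
        return entry[1]            # fresh enough: serve from local memory
    user = load_user(user_id)      # stale or missing: reload and re-cache
    _user_cache[user_id] = (now, user)
    return user
```

The TTL is the simplest invalidation policy; it trades a bounded staleness window for never having to notify caches on every update.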
Architecture evolution, step five: add web servers
Then you find that as traffic grows again, the pressure on the web server machine climbs to a new peak, and you figure it is time to add a web server machine, partly also so that the site is not unusable if the single machine fails. Having made these considerations, you decide to add a web server, and you run into a few problems:
1. how to direct traffic to the two machines, for which the usual answer is a software load-balancing scheme such as Apache's, or a hardware load balancer;
2. how to keep state information such as user sessions synchronized, for which the usual answers include cookies or a session-synchronization mechanism;
3. how to keep cached data, such as the previously cached user data, synchronized, for which the usual mechanisms are cache synchronization or a distributed cache;
4. how to keep features like file upload working normally, for which the usual mechanism is a shared file system or shared storage.
Look at the diagram of the system after the completion of this step:
This step involves these knowledge systems:
Load-balancing technology (including but not limited to hardware load balancing, software load balancing, load algorithms, Linux forwarding protocols, and the implementation details of the chosen technology), primary/standby technology (including but not limited to ARP spoofing, Linux heartbeat, etc.), state-information and cache-synchronization technology (including but not limited to cookie technology, the UDP protocol, state-information broadcast, and the implementation details of the chosen cache-synchronization technology), shared-file technology (including but not limited to NFS), and storage technology (including but not limited to storage devices).
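To make the session problem concrete: one common option is to store sessions in a shared store keyed by session ID, so any web server can handle any request. In the sketch below the in-process dict stands in for a real shared store (a database or distributed cache), which is an assumption for illustration:

```python
import uuid

# Stand-in for a store reachable from every web server behind the
# load balancer (in practice a database or distributed cache).
_shared_sessions: dict = {}

def create_session(user_id: int) -> str:
    sid = uuid.uuid4().hex
    _shared_sessions[sid] = {"user_id": user_id}
    return sid

def get_session(sid: str):
    # Any web server can run this lookup, so requests need not stick
    # to the machine that created the session.
    return _shared_sessions.get(sid)
```

The alternative designs mentioned above (cookies, session broadcast) trade this central lookup for client-side state or network chatter.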
Architecture evolution, step six: split the database
After enjoying a period of rapid traffic growth, you find the system slowing down again. Investigating, you discover that contention for database connection resources during writes and updates is fierce, dragging the system down. What now? The options at this point are database clustering and database-splitting strategies; since clustering support in some databases is not very good, splitting the database by business becomes the more common strategy. Splitting does mean modifying the original program, but once the change is made, the goal is reached: the system recovers and is even faster than before.
Look at the diagram of the system after the completion of this step:
This step involves these knowledge systems:
This step mostly requires a reasonable division along business lines to carry out the split; it has no particular requirements for other technical details;
at the same time, as data volume grows and the database is split, database design, tuning, and maintenance must be done better, so these areas now demand very high skill.
Architecture evolution, step seven: table splitting, a DAL, and a distributed cache
As the system keeps running, the volume of data starts to grow significantly, and you find that queries are still somewhat slow even after the database split, so following the same idea you split tables as well. This of course requires some changes to the program. Perhaps you then notice that the application itself has to care about the database- and table-splitting rules, which is still somewhat complex, so the idea arises of adding a generic framework that encapsulates data access to the split databases and tables; in eBay's architecture this corresponds to the DAL (data access layer). This evolution takes a relatively long time, and it is also possible that the generic framework will only be built after the table splitting is done. At this stage you may also find the earlier cache-synchronization scheme running into trouble: the data volume is now too large for the cache to live locally on each machine and be synchronized, so you need to adopt a distributed cache scheme. After a round of investigation and agonizing, a large share of the cached data is finally moved into the distributed cache.
Look at the diagram of the system after the completion of this step:
This step involves these knowledge systems:
Table splitting, again, is mostly a matter of business-oriented division; the techniques involved include dynamic hashing, consistent hashing, and so on;
the DAL involves more complex techniques, such as managing database connections (timeouts, exceptions), controlling database operations (timeouts, exceptions), and encapsulating the splitting rules.
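Consistent hashing, mentioned above, keeps most keys on the same node when nodes are added or removed, which matters once caches and shards are spread over many machines. A minimal ring sketch (real implementations also add virtual nodes, omitted here for brevity):

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring (no virtual nodes; for illustration)."""

    def __init__(self, nodes):
        # Place each node on the ring at its hash position.
        self._ring = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        # Walk clockwise from the key's position to the first node.
        h = self._hash(key)
        points = [p for p, _ in self._ring]
        i = bisect.bisect(points, h) % len(self._ring)
        return self._ring[i][1]
```

Adding a node only remaps the keys that fall between the new node and its predecessor on the ring; with plain modulo hashing, almost every key would move.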
Architecture evolution, step eight: add more web servers
After the database split, the pressure on the database drops to a fairly low level, and you go back to happily watching daily traffic surge. Then one day you suddenly find access starting to slow again. This time you check the database first: its pressure is completely normal. Then you check the web servers and find Apache blocking on a great many requests, while the application server is still fairly fast for each request; apparently the sheer number of requests forces them to queue, slowing responses. This is easy enough to handle, and generally speaking by now there is some money available, so you add some web servers. Adding more web servers raises a few issues:
1. Apache soft load balancing or LVS may no longer be able to carry the load; beyond a certain scale you can buy hardware load balancers such as F5, NetScaler, or Alteon, or, if funds do not allow it, logically classify the application and spread it across different soft-load clusters;
2. some existing schemes, such as state-information synchronization and file sharing, may become bottlenecks and need improvement; perhaps at this point you will write a distributed file system tailored to the website's needs.
Having done all this, you enter what seems a perfect era of unlimited scaling: whenever traffic grows, the solution is simply to keep adding web servers.
Look at the diagram of the system after the completion of this step:
This step involves these knowledge systems:
At this point, as the number of machines keeps growing, the volume of data keeps growing, and the demands on system availability keep rising, a deeper understanding of the technologies in use is required, and products increasingly need to be customized to the website's needs.
Architecture evolution, step nine: read-write splitting and inexpensive storage schemes
Then one day you find that this perfect era is also coming to an end: the database nightmare reappears. Because so many web servers have been added, database connection resources are insufficient even though the data has already been split across databases and tables. Analyzing the database's load, you may find the read/write ratio is very high; the usual answer at this point is a read-write splitting scheme, though implementing it is not easy. In addition, you may find that keeping some data in the database is wasteful, or ties up too many database resources. So the architecture evolution at this stage may be to implement read-write splitting of data while building some cheaper storage schemes, such as BigTable-like systems.
Look at the diagram of the system after the completion of this step:
This step involves these knowledge systems:
Read-write splitting requires a deep grasp and understanding of database replication and standby strategies, and usually requires some self-built technology as well;
an inexpensive storage scheme requires deep mastery and understanding of OS-level file storage, and of how the implementation language handles files.
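A read-write split usually means routing writes to the primary and spreading reads over replicas. A minimal router sketch, where the connection names are illustrative stand-ins for real connection pools:

```python
import itertools

class ReadWriteRouter:
    """Route writes to the primary and spread reads over replicas."""

    def __init__(self, primary: str, replicas):
        self._primary = primary
        # Simple round-robin over the read replicas.
        self._replica_cycle = itertools.cycle(replicas)

    def connection_for(self, sql: str) -> str:
        # Naive classification: SELECTs go to replicas, everything else
        # (INSERT/UPDATE/DELETE/DDL) goes to the primary.
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._replica_cycle)
        return self._primary
```

Real routers must also handle replication lag (a read issued just after a write may need to go to the primary), which is part of why the article calls this scheme "not easy".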
Architecture evolution, step ten: the era of large-scale distributed applications and the dream era of inexpensive server farms
After the long and painful process above, the perfect era finally arrives once again: continually adding web servers keeps supporting the ever-growing traffic. For a large site the importance of popularity is beyond doubt, and as popularity climbs ever higher, all kinds of functional demands begin to grow explosively. At this point you suddenly find that the application deployed on the web servers has grown very large, so the system is split up by responsibility, and a large distributed application is born. This step usually takes quite a long time, because it brings many challenges: 1. once split into a distributed system, a high-performance, stable communication framework must be provided, supporting many different communication and remote-call styles; 2. how to operate (dependency management, health management, error tracing, tuning, monitoring and alerting, and so on) such a huge distributed application.
Look at the diagram of the system after the completion of this step:
This step involves these knowledge systems:
This step involves a very large body of knowledge, requiring deep understanding and mastery of communication, remote calls, messaging mechanisms, and so on, with a clear grasp spanning theory, the hardware level, the operating system level, and the implementation of the language used.
The operations side likewise involves a great deal of knowledge; in most cases you need to master distributed parallel computing, reporting, monitoring technology, rules and policies, and so on.
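To make "remote call" concrete, here is a minimal sketch using Python's standard-library XML-RPC, one of many possible communication styles; the service, its function, and the use of a loopback address are all invented for illustration:

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# A tiny "user service" exposed over XML-RPC. In a split-up system this
# would run on its own machines behind its own load balancer.
def get_user_name(user_id: int) -> str:
    return f"user{user_id}"

server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
port = server.server_address[1]
server.register_function(get_user_name)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The web tier now calls the user service remotely instead of linking
# its code in directly.
client = ServerProxy(f"http://127.0.0.1:{port}")
print(client.get_user_name(42))   # prints user42
```

A production communication framework adds what this sketch lacks: connection pooling, timeouts, retries, service discovery, and the monitoring hooks listed above.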
As you can see, it is really not so mysterious: the classic evolution of a whole website architecture is much like the above, though the scheme chosen at each step and the order of the steps may differ, and because every site's business is different, there will be different specialized technical needs. This post looks at the process mostly from an architectural perspective; many technologies are of course not covered here, such as database clustering, data mining, and search. In a real evolution, sites will also rely on things like upgrading hardware configurations, the network environment, and the operating system, and on CDN mirroring, to support greater traffic, so every real development process will have many differences. And what a large website must do goes far beyond this: security, operations, services, storage, and so on. Building a large site is genuinely not easy; this article is written more in the hope of prompting more introductions to the evolution of large-scale website architecture. :)
[Repost] Optimization and architecture evolution of large-scale websites (compiled)