Scalability is critical for a large website. To scale well both vertically and horizontally, we need to apply the principle of splitting when designing the architecture. Here I want to explain splitting from several angles.
The first is horizontal splitting:

1. Split one large website into multiple small websites: when a site has many functions, we can break it into several small modules, each deployed as its own website. This lets us place these sites flexibly across different servers.
2. Separate static from dynamic content: it is best to serve static files and dynamic pages from two different websites. The two stress servers differently: static serving tends to be I/O-intensive, dynamic serving CPU-intensive, so we can tailor hardware to each, and their caching policies also differ. Typical applications therefore run dedicated file or image servers.
3. Split by function: for example, a module responsible for uploads consumes a lot of time per request; if it is mixed in with other applications, even modest traffic can paralyze the server, so such special modules should be split out. Secure and non-secure content should also be separated, with future SSL certificate purchases taken into account.
4. We don't have to run everything on our own servers. Search and reporting can rely on external services, such as Google's search and reporting offerings. What we build ourselves is not necessarily better than theirs, and we save server bandwidth.

The second is vertical splitting:

1. Files are, in effect, another database, and their I/O traffic may exceed the database's. This is a vertical access layer: uploaded files and images must be separated from the web servers. Keeping the database and the website off the same server is, of course, the most basic step.
2. For dynamic pages that access the database, we can introduce a middle tier (the so-called application or logic layer), deployed on its own servers, to mediate database access. The biggest benefits are caching and flexibility: caches use a lot of memory, so we want them out of the website process.
With this middle tier in place, we can easily change data-access policies; even if the database later becomes distributed, the change can be absorbed here, which is very flexible. The middle tier can also act as a bridge between carrier networks: a China Netcom user reaching a dual-homed middle tier may get a faster path than reaching a China Telecom server directly. Some people say: I don't need all this, I can just do load balancing. Yes, you can. But with splitting, the same 10 machines will certainly withstand more traffic than 10 undifferentiated machines, and the hardware requirements can be lower, because we know exactly which hardware each role needs. We strive to keep every server busy but not overloaded, and to combine, adjust and expand them sensibly so the system stays highly scalable. The precondition for adjusting capacity to traffic is that splitting was considered up front; its benefits are flexibility, scalability, isolation and security.
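The middle-tier idea above can be sketched in a few lines. This is a minimal illustration, not a real data-access layer: the `DataService` class, its TTL parameter, and the dict standing in for the database are all my own assumptions. The point it demonstrates is that cache memory lives in the application layer's process, outside the web server, and that the web tier never talks to storage directly.

```python
# Sketch: an application-layer data service between the web tier and the
# database. Query results are cached in this process, so the web server
# process stays lean and data-access policy can change in one place.
import time

class DataService:
    def __init__(self, db, ttl_seconds=60):
        self.db = db              # stand-in for a real database connection
        self.ttl = ttl_seconds
        self.cache = {}           # key -> (value, expiry timestamp)
        self.db_hits = 0          # how often we actually touched the database

    def get_user(self, user_id):
        now = time.time()
        hit = self.cache.get(user_id)
        if hit and hit[1] > now:
            return hit[0]                       # served from cache
        self.db_hits += 1
        value = self.db[user_id]                # expensive call in real life
        self.cache[user_id] = (value, now + self.ttl)
        return value
```

Because the web tier only ever calls `get_user`, swapping the dict for a partitioned or distributed database later would not disturb the websites at all.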
For servers, several metrics are worth observing over the long term; any of them may become the bottleneck:

1. CPU: parsing dynamic pages takes a lot of CPU. Whether the CPU is the bottleneck depends on whether a function occupies a thread for a long time; if so, split that function out. If each request is short but the volume is high, add servers. The CPU is a valuable resource; don't leave it idle waiting on I/O.
2. Memory: keep the cache out of the IIS worker process. Web servers rarely have memory to spare, yet memory is far faster than disk, so use it wisely.
3. Disk I/O: use the performance monitor to find which files generate especially heavy I/O, then move them to a dedicated set of file servers, or go straight to a CDN. Disks are slow: applications that read at scale should lean on caching, and applications that write at scale can lean on queues to smooth out bursts of concurrency.
4. Network: network communication is slow, slower than disk. With distributed caching or distributed computing, the communication time between physical servers must be accounted for; still, at high traffic this can raise the system's capacity by a level. Static content can be offloaded to a CDN. When planning servers, we also have to consider China's carrier network situation and firewalls.

For SQL Server database servers [update]: it again comes down to horizontal and vertical splitting. Picture a two-dimensional table: a horizontal split cuts across rows, a vertical split cuts across columns:

1. Vertical splitting: different applications go into different databases or instances, or a table with many columns is split into several narrower tables.
2. Horizontal splitting: some features, such as user registration, may see little load, yet the user table itself grows very large; such big tables can be split out.
You can use table partitioning to store data in different files, deployed on separate physical disks or servers, raising I/O throughput and thus read/write performance; alternatively, archive old data on a regular schedule. Another advantage of table partitioning is faster queries, because index trees stay shallow: just as a folder should not hold too many files, it is better to have a few layers of folders.
3. You can also use database mirroring, replication with subscriptions, and transaction log shipping to separate reads from writes across different mirrored physical databases. This is usually sufficient; if not, hardware-based database load balancing is an option. For BI, we may also build a data warehouse.

Once all this has been considered in the architecture, high traffic can be absorbed by adjusting or load-balancing the web or application servers on that foundation. Most of the time the loop is: find the problem, find the bottleneck, solve it.

A typical allocation is: dynamic web servers get better CPUs; static web servers and file servers get better disks; application servers and cache servers get more memory; database servers, of course, get both better memory and CPUs.
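The read/write separation in point 3 can be sketched as a small router in the data-access layer. This is an illustrative skeleton under my own assumptions: connections are plain strings, the SQL inspection is a naive first-word check, and replication lag is ignored. The idea it shows is simply that writes go to the primary while reads round-robin across mirror/replica servers.

```python
# Sketch: read/write splitting. Writes go to the primary database; reads
# are spread round-robin over replica connections.
import itertools

class ReadWriteRouter:
    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = itertools.cycle(replicas)   # simple round-robin

    def connection_for(self, sql: str):
        """Pick a connection based on the statement's leading verb."""
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb == "SELECT":
            return next(self.replicas)
        return self.primary        # INSERT/UPDATE/DELETE/DDL hit the primary
```

A real system would route by transaction scope rather than by statement, and would account for replication delay on the replicas.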
Experience of Large-Scale Internet Website Architecture II: Merge and Trade
"Points" is a relatively large principle and a relatively high-level principle. This time I want to talk about the other two principles: and change.
Merge
Why merge, when we just split things apart to improve the system's load capacity? I see the following aspects:

1. Merge user requests. The most basic step is combining CSS, images and scripts, and merging pages. But merging can waste traffic, since clients may download bytes they don't need, so a balance point is required.
2. Coarsen interface granularity. In a distributed application, we may not touch the database directly but instead call interfaces exposed by the application layer. Since these are network calls, they are relatively expensive, so interfaces should be designed coarse-grained: a single call returns a large batch of data, rather than refining the interface down to individual add, delete and update operations.
3. Merge interface deployment. For interfaces called frequently across machines, consider data redundancy, so that cross-network service calls become inter-process communication, or even move into the client. For example, a forum's dirty-word filter could call an application-layer interface on another machine when a post is submitted, but that may be too costly; deploying the interface on the local machine and calling it over IPC is cheaper.
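The granularity point above can be made concrete by counting round trips. The sketch below is illustrative: the service, its method names, and the dict it serves from are all assumptions, and the "network" is simulated by a counter. It shows why one coarse-grained call that returns everything a page needs beats several fine-grained calls, each paying the fixed network cost.

```python
# Sketch: fine-grained vs coarse-grained service interfaces. Every remote
# call pays a fixed network cost, modeled here as a round-trip counter.
class RemoteService:
    def __init__(self):
        self.round_trips = 0
        self._users = {"u1": {"profile": "p", "orders": [1, 2], "messages": ["hi"]}}

    # Fine-grained: one round trip per piece of data the page needs.
    def get_profile(self, uid):
        self.round_trips += 1
        return self._users[uid]["profile"]

    def get_orders(self, uid):
        self.round_trips += 1
        return self._users[uid]["orders"]

    # Coarse-grained: everything the page needs in one round trip.
    def get_user_page_data(self, uid):
        self.round_trips += 1
        return dict(self._users[uid])
```

Rendering a user page via the fine-grained methods costs one round trip per field; the coarse-grained method delivers the same data for one.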
Trade

Trading time for space, and space for time, is a common practice. Specifically:

1. Caching. The importance of caching shows up as early as computer hardware design. For websites there are many kinds of cache: client-side resource caches, page output caches, application-layer data caches; all serve the same purpose of reducing the number of server requests, the work per request, or the number of database accesses. Generating static files also counts as caching. We cannot avoid touching the disk entirely, but we should drastically reduce how often we do.
2. Sometimes, to get an extremely fast response, we deliberately pay for redundant computation. For example, if an operation may respond slowly because of network or other issues, we can design a unified entry point that dispatches the operation to several servers asynchronously; whichever server returns first wins, and we cancel the now-redundant work on the others.
3. Websites generally pursue fast responses, so at the macro level we rarely trade time for space; still, for certain user-specific data, processing algorithms may trade a little time to save space.
4. Sometimes we maintain aggregate tables holding pre-summarized data, i.e. we pre-compute to speed up expensive computations such as reports. Building a multidimensional database for data analysis is also a good choice.

Many netizens commented that the previous article was short on specifics. I think it is hard to prescribe how to implement an architecture: implementation depends on the situation, nothing is perfect, and architecture is usually a matter of balance, so different emphases yield different implementations. I hope these articles at least give you pointers.
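The pre-computation idea in point 4 can be sketched as follows. This is a toy model under my own assumptions: `SalesAggregator`, its field names, and the in-memory "tables" are invented for illustration. The point is the space-for-time trade: we spend extra storage on a small summary table maintained at write time, so a report becomes a lookup instead of a scan over the big detail table.

```python
# Sketch: a pre-aggregated summary table. The running per-day total is
# maintained as orders arrive, so reporting never scans the detail rows.
class SalesAggregator:
    def __init__(self):
        self.orders = []            # the "big" detail table
        self.daily_totals = {}      # the small aggregate table (extra space)

    def record_order(self, day, amount):
        self.orders.append((day, amount))
        self.daily_totals[day] = self.daily_totals.get(day, 0) + amount

    def report(self, day):
        # O(1) lookup instead of summing over self.orders
        return self.daily_totals.get(day, 0)
```

In a real system the aggregate would be a table or multidimensional cube refreshed by the write path or a scheduled job, but the trade-off is the same.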
If you think "I didn't take this into consideration, but pay attention to it later", it may be the greatest help, next I want to talk about some other problems, each of which is scattered. It is a supplement: 1. whether to use the existing items or do it on your own requires detailed consideration. using other things may be relatively stable, but your own control is a little less. Using your own things can be flexible, however, there may be many problems. In any case, when we adopt a third-party framework, we must conduct a thorough investigation to see its shortcomings. Otherwise, the project may be restricted by this framework in the future. Otherwise, when you decide to build a framework, you need to see what other frameworks cannot provide. 2. Data can be compressed during transmission, but the CPU resources are required for compression and decompression. There is a balance between I/O (disk, bandwidth, and transmission capacity) and CPU. 3. The ideal scalability architecture allows you to add or replace servers without downtime or great adjustments. When using a unified scheduling center to schedule these servers and allocate requests, we need to consider how much traffic the scheduling server can withstand. 4. Is there a large number of cheap servers or a small number of high-end servers? How to combine servers to maximize their role as needed. 5. For the distributed architecture, we try to keep each node simple logic and minimize the dependencies between nodes at the same level. A unified place is required to manage all nodes. 6. Function Decomposition, asynchronous integration, failover, and failure protection. 7. the architecture upgrade of software is very similar to the architecture upgrade of computer hardware. It may take some time for us to gradually improve our overall capabilities, which has increased several times in two years, then we found that we could improve our capabilities by dozens of times only through some thorough architectural changes. 
After such an upgrade we may run into new problems; like a CPU, there is the simple route of raising the clock speed, or the route of completely changing the architecture.
8. Data: read/write splitting, splitting out databases, functional partitioning, caching, and mirroring.
9. The hardware and network architecture matters a great deal, but details in the software cannot be ignored either: a good architecture does not mean code reviews are unnecessary.
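The compression trade-off in supplementary point 2 can be measured directly. The sketch below uses the standard-library zlib; the threshold and the content-coding labels are my own illustrative choices. It compresses a payload only when doing so saves enough bytes to justify the CPU spent on both ends.

```python
# Sketch: spend CPU on compression only when it meaningfully shrinks the
# payload crossing the network; otherwise send it as-is.
import zlib

def maybe_compress(payload: bytes, min_saving_ratio: float = 0.2):
    """Compress only when it saves at least min_saving_ratio of the bytes."""
    compressed = zlib.compress(payload, 6)
    saving = 1 - len(compressed) / len(payload)
    if saving >= min_saving_ratio:
        return ("deflate", compressed)
    return ("identity", payload)     # not worth the CPU on the receiving end
```

Highly repetitive text shrinks dramatically and is worth compressing; already-compressed data (images, archives) barely shrinks, so the function sends it untouched and saves the CPU.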