Original: http://agiledon.github.io/blog/2013/02/27/scalability-system-architecture-lessons/
Recently I read Will Larson's article "Introduction to Architecting Systems for Scale" and found it very valuable. In it, the author shares the architectural experience he gained designing scalable systems at Yahoo! and Digg. In my past architectural work I was mainly involved in developing enterprise software systems, which usually face neither heavy load nor high concurrency, so the scalability of the system received little consideration; in general, it was enough to plan for clustering and load balancing at deployment time. Larson's article gave me a lot of inspiration, so here I excerpt its main points and combine them with my own understanding.
Larson first argues that in an ideal system, capacity should grow linearly with the number of machines added: if the system runs on a single server, adding another machine of the same size should double its capacity, and so on. This linear way of growing capacity is usually called horizontal scalability.
When designing a robust system, failure naturally has to be considered first. Larson believes an ideal system should not crash when it loses a single server, although losing one server may still cause a proportional drop in capacity. This property is usually called redundancy.
Load Balancing
Both horizontal scalability and redundancy can be achieved through load balancing. A load balancer acts as a mediator that coordinates requests: it distributes incoming requests among the web servers according to the current load of each machine in the cluster, so that the resources of every machine are used effectively. Naturally, this balancer sits somewhere between the clients and the web servers.
The article mentions several approaches to load balancing. The first is the smart client, which adds load-balancing functionality to the client of the database (or of the cache or service). This is a pure-software approach; its drawbacks are that it is complex, not robust enough, and hard to reuse, because the logic for coordinating requests is mixed into the business system. Larson poses a series of rhetorical questions to drive home his disapproval of this option:
Is it attractive because it is the simplest solution? Usually, no. Is it seductive because it is the most robust? Sadly, no. Is it alluring because it'll be easy to reuse? Tragically, no.
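Despite those objections, the basic idea is easy to picture. Below is a minimal sketch of a round-robin smart client; the class name and host list are purely illustrative, and a real smart client would also need the health checks and failover logic that make this approach so complex in practice:

```python
import itertools

class SmartClient:
    """Round-robin over a fixed list of backends (illustrative only)."""

    def __init__(self, hosts):
        self._hosts = itertools.cycle(hosts)  # rotate through the backends

    def next_host(self):
        return next(self._hosts)

client = SmartClient(["10.0.0.1:6379", "10.0.0.2:6379"])
print(client.next_host())  # 10.0.0.1:6379
print(client.next_host())  # 10.0.0.2:6379
```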
The second approach is to use a hardware load balancer such as the Citrix NetScaler. The hardware is expensive to purchase, however, so usually only large companies consider this option.
If you are unwilling to endure the pain of the smart client but also do not want to spend a lot of money on hardware, you can take a hybrid approach: the software load balancer. Larson mentions HAProxy, which runs locally; services that need load balancing connect to it on the local machine, and it coordinates the requests.
Cache
To reduce the load on the servers, you also need to introduce a cache. The article gives a common classification of caches: pre-computed results (precalculating results, for example statistics calculated in advance for the previous day), pre-generated expensive indexes (pre-generating expensive indexes, for example suggestions based on a user's click history), and copies of frequently accessed data stored in a faster backend (for example, memcached).
App Cache
Caching can be provided in two places: the application cache and the database cache. Each has its own strengths. An application cache usually requires the code that handles caching to be integrated explicitly into the application code, a bit like using the proxy pattern to provide caching for a real object: first check whether the required data is in the cache; if so, return it straight from the cache, otherwise query the database. Which values should be placed in the cache? There are many algorithms for deciding, for example based on how recently or how frequently the data has been accessed. Code that uses memcached looks roughly like this:
Key="User.%s"%user_idUser_blob=Memcache.Get(Key)IfUser_blobIsNone: User=Mysql.Query(The select * from users WHERE user_id=\ "%s\" " User_id) if user: memcache. Set (keyjson. Dumps (user return userelse: return json. Loads (user_blob)
Database Cache
The database cache, by contrast, does not pollute the application code, and a gifted DBA can even improve system performance by tuning the database without modifying any code, for example by configuring Cassandra's row cache.
Memory Cache
To improve performance, the cache is usually kept in memory. Common in-memory caches include memcached and Redis. Using them still requires a sensible trade-off, however: we cannot simply put all of the data in memory. Although that would greatly improve performance, RAM is far more expensive than disk storage, and it also hurts the robustness of the system, because in-memory data is not persisted and is easily lost. As mentioned earlier, we should put only the data we actually need into the cache; the usual algorithm for deciding what stays is least recently used, i.e. LRU.
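To make the LRU policy concrete, here is a toy in-process sketch; memcached and Redis implement eviction for you, so this only illustrates the idea:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)         # mark as most recently used
        return self._data[key]

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used
```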
CDN
Another common way to improve performance and reduce the load on web servers is to put static media on a CDN (content delivery network).
A CDN effectively shares the web servers' load, letting the application servers concentrate on dynamic pages, and because CDN nodes are geographically distributed, it also improves response times. Once the CDN is set up, an incoming request first asks the CDN for the static media it needs (what the CDN may cache is usually configured through HTTP headers). If the CDN does not have the requested content, it fetches the file from your servers, caches it locally, and then serves it to the requester. If the current site is not large, the benefit of a CDN will not be obvious and you can do without one for now; later, you can use a lightweight HTTP server such as Nginx to serve static media from a dedicated subdomain such as static.domain.com.
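As a rough illustration of controlling cacheability through HTTP headers, the sketch below uses Python's standard library to serve files under /static/ with a far-future Cache-Control header; the path prefix and max-age are assumptions, not recommendations:

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler

class StaticMediaHandler(SimpleHTTPRequestHandler):
    def end_headers(self):
        if self.path.startswith("/static/"):
            # Tell the CDN and browsers this asset may be cached for a year.
            self.send_header("Cache-Control", "public, max-age=31536000")
        super().end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8000), StaticMediaHandler).serve_forever()
```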
Cache Invalidation
The problem that caching introduces is how to keep the cached data consistent with the real data, a problem usually referred to as cache invalidation. Viewed from a high level, the solution is simply to update the cache whenever the underlying data changes. One approach is to write each new value directly into the cache (often called a write-through cache); another is simply to delete the cached value and regenerate it the next time it is read.
Overall, to avoid invalidation problems you can rely on the database cache, attach an expiry period to cached data, or sidestep the issue in the application logic. For example, instead of deleting data directly with DELETE FROM a WHERE ..., first query for the matching rows, invalidate the corresponding entries in the cache, and then delete the rows explicitly by their primary keys.
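Both strategies can be sketched in a few lines, reusing the hypothetical memcache and mysql clients from the earlier example (table and column names are illustrative):

```python
import json

def update_user(user_id, user):
    mysql.execute("UPDATE users SET name = %s WHERE user_id = %s",
                  user["name"], user_id)
    # Write-through: push the fresh value into the cache on every write...
    memcache.set("user.%s" % user_id, json.dumps(user))
    # ...or simply invalidate it and let the next read repopulate the cache:
    # memcache.delete("user.%s" % user_id)

def delete_inactive_users():
    # Query first so each row's cache entry can be invalidated, then
    # delete the rows explicitly by primary key.
    for row in mysql.query("SELECT user_id FROM users WHERE active = 0"):
        memcache.delete("user.%s" % row["user_id"])
        mysql.execute("DELETE FROM users WHERE user_id = %s", row["user_id"])
```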
Off-line Processing
The article also mentions handling requests off-line by introducing a message queue. This approach is in fact common practice in most enterprise software systems; I describe such an architecture in more detail in my article "Case study: Message-based distributed architecture". With a message queue in place, the web server acts as the message publisher, while consumers can be added at the other end of the queue as needed. Whether an off-line task has completed is usually discovered by polling or through a callback.
For better code readability, you can explicitly indicate in the exposed interface definition whether a task runs on-line or off-line.
Introducing a message queue can greatly relieve the pressure on the web servers, because longer-running tasks can be handed off to dedicated machines.
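The shape of the pattern can be shown with an in-process queue and a worker thread; in a real system the queue would be an external message broker (RabbitMQ, for example) with consumers running on dedicated machines:

```python
import queue
import threading

task_queue = queue.Queue()

def handle_request(data):
    task_queue.put(data)           # publish the task; the web server is free
    return "202 Accepted"          # the caller polls (or gets a callback) later

def consumer():
    while True:
        task = task_queue.get()    # blocks until work arrives
        print("processing", task)  # stands in for the slow, expensive job
        task_queue.task_done()

threading.Thread(target=consumer, daemon=True).start()
```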
In addition, by introducing timed tasks you can make effective use of the web servers' idle time to handle background work, for example running daily, weekly, or monthly jobs with a Spring Batch job. If several machines are needed to run these scheduled tasks, a tool such as Puppet can be introduced to manage those servers; Puppet provides a highly readable declarative language for configuring machines.
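For illustration only, a bare-bones daily scheduler might look like the sketch below; Spring Batch or plain cron would handle failures and restarts far more robustly:

```python
import time
from datetime import datetime, timedelta

def run_daily(job, hour=3):
    """Sleep until the next occurrence of `hour`, run the job, repeat."""
    while True:
        now = datetime.now()
        next_run = now.replace(hour=hour, minute=0, second=0, microsecond=0)
        if next_run <= now:
            next_run += timedelta(days=1)
        time.sleep((next_run - now).total_seconds())
        job()

# run_daily(generate_nightly_report)  # hypothetical job function
```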
Map-Reduce
For processing big data, map-reduce can be introduced naturally: a map-reduce layer is added to the system specifically for data processing. Compared with using a SQL database as the data hub, map-reduce offers much better scalability, and it can be combined with the scheduled-task mechanism described above.
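A toy single-machine word count shows the shape of the model; a real deployment would distribute the map and reduce phases across many machines:

```python
from collections import defaultdict

def map_phase(document):
    # Emit a (word, 1) pair for every word in the document.
    for word in document.split():
        yield word, 1

def reduce_phase(pairs):
    # Sum the counts for each word across all mapped pairs.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

documents = ["scale all the things", "cache all the things"]
pairs = (pair for doc in documents for pair in map_phase(doc))
print(reduce_phase(pairs))
# {'scale': 1, 'all': 2, 'the': 2, 'things': 2, 'cache': 1}
```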
Platform Layer
Larson notes that most systems are web applications that communicate directly with the database, but it may be better to add a platform layer between the two.
First, separating the platform from the web application allows the two to scale independently. For example, if you need to add a new API, you can add new platform servers without adding web servers. Bear in mind that in such a physically layered architecture, different tiers place different demands on their servers: a database server performs frequent disk I/O, so its I/O performance should be ensured, for example by using SSDs, while a web server is more demanding of the CPU, so multi-core CPUs should be used wherever possible.
Second, an added platform layer can effectively improve the reusability of the system. For example, features common across the system, as well as crosscutting concerns (such as caching support and database access), can be extracted into the platform layer to serve as infrastructure for the whole system. Such an architecture is especially valuable for product-line systems, where it can serve multiple products.
Finally, this architecture also benefits cross-team development. The platform can expose product-independent interfaces that hide the specifics of its implementation. If responsibilities are divided sensibly and the interfaces are designed to be relatively stable, every team can develop in parallel; for example, a dedicated platform team can focus on implementing and optimizing the platform.