At the simplest level, scalability is about doing more: responding to more user requests, performing more work, or processing more data. Designing software is inherently complex, and getting software to do more work brings problems of its own. This article puts forward some principles and guidelines for building scalable software systems.
1. Reduce processing time
One way to increase the amount of work done by the app is to reduce the time it takes to complete a single job. For example, reducing the time required to process a user request means that you can process more user requests in the same amount of time. Here are some examples of how this principle applies and some possible implementation strategies.
Collocation: Reduce the overhead of fetching the data you need by placing data and code close together.
Caching: If data and code cannot be collocated, cache the data to reduce the overhead of fetching it repeatedly.
Pooling: Reduce the overhead of using expensive resources by pooling them.
Parallelization: Reduce the time required to complete a unit of work by decomposing the problem and parallelizing the individual steps.
Partitioning: Keep related processing as close together as possible by partitioning the code and collocating the related partitions.
Remoting: Reduce the time spent accessing remote services, for example by making their interfaces more coarse-grained. Bear in mind that remote versus local is an explicit design decision, not a switch that can be flipped back and forth. Also consider the first rule of distributed computing: don't distribute your objects.
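As a rough illustration of the caching strategy in the list above, the sketch below wraps a slow lookup in an in-process cache. The function names, the simulated delay, and the cache size are hypothetical and chosen only for illustration; `functools.lru_cache` is the memoization decorator from Python's standard library.

```python
import functools
import time


def fetch_user_profile_uncached(user_id: int) -> dict:
    """Hypothetical slow lookup standing in for a remote database or service call."""
    time.sleep(0.01)  # simulated network and query overhead
    return {"id": user_id, "name": f"user-{user_id}"}


@functools.lru_cache(maxsize=1024)
def fetch_user_profile(user_id: int) -> dict:
    """Cached wrapper: repeated requests for the same id skip the slow path."""
    return fetch_user_profile_uncached(user_id)


if __name__ == "__main__":
    for _ in range(100):
        fetch_user_profile(42)
    info = fetch_user_profile.cache_info()
    # 100 requests, but only the first one reached the slow lookup.
    print("hits:", info.hits, "misses:", info.misses)
```

Cached data can go stale, so a real cache would also need an expiry or invalidation policy suited to how often the underlying data changes.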
Software developers are fond of introducing abstractions and layers where they are not needed. These concepts are good tools for decoupling software components, but they can add complexity and hurt performance, especially if the data representation has to be transformed between layers. Reducing processing time therefore also means making sure that abstractions are not overly abstract and that there are not too many layers. In addition, it is necessary to understand the cost of the runtime services we take for granted, because unless they come with a specific service level agreement, they are likely to become bottlenecks in the application eventually.
2. Partitioning
Reducing the processing time of a single unit of work takes you a long way, but once you reach the limits of a single process you eventually need to scale the system horizontally. In a typical Web application, horizontal scaling can be as simple as adding more Web servers to handle user requests and load-balancing across them. However, you may find that some parts of the overall architecture become points of resource contention, because everything becomes busy at the same time. A good example is a single database server sitting behind all of the Web servers. When this single database server becomes a bottleneck, you have to change the approach, and one option is to adopt a partitioning strategy. In short, this involves breaking up that single piece of the architecture into smaller, easier-to-manage pieces. Splitting a single element into smaller pieces allows for horizontal scaling, which is exactly the technique that large Web sites such as eBay use to keep their architectures scalable. Partitioning is a good solution, although you may find that it comes at the cost of consistency.
How you partition your system depends on the situation. A truly stateless component can simply be scaled horizontally, with the workload spread evenly across all instances. On the other hand, if you need to maintain state, you need a workload partitioning strategy that allows multiple instances of the stateful component, each of which is responsible for a distinct subset of the work and/or the data.
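One common way to give each stateful instance its own subset of the data is to route every key to a partition by hashing it. The sketch below assumes four partitions with made-up names; `shard_for` and `SHARDS` are hypothetical helpers, not part of any particular product.

```python
import hashlib
from typing import Sequence

# Hypothetical partition names; in practice these might be separate database servers.
SHARDS = ("db-0", "db-1", "db-2", "db-3")


def shard_for(key: str, shards: Sequence[str] = SHARDS) -> str:
    """Map a key to exactly one shard using a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    for user in ["alice", "bob", "carol"]:
        # The same user is always routed to the same shard.
        print(user, "->", shard_for(user))
```

A cryptographic hash is used here instead of Python's built-in `hash()` because the built-in is randomized per process for strings. With simple modulo routing, changing the number of shards remaps most keys; consistent hashing is one alternative when partitions are added or removed often.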
3. Scalability is concurrency
Scalability is inherently associated with concurrency; after all, it is about doing more work at the same time. Technologies such as earlier versions of EJB tried to provide a simplified programming model that encouraged us to write single-threaded components. Unfortunately, components tend to call other components, and that tends to cause concurrency problems. If concurrency is not considered, the data in the system can easily become corrupted. On the other hand, too much protection around concurrency can make the system essentially serial, limiting its ability to scale. Concurrent programming is not that difficult, and there are some simple principles that can help when building a scalable system.
If you do need to hold locks (on local objects, database objects, and so on), hold them for as short a time as possible.
Try to reduce contention for shared resources and, as far as possible, keep any contention off the critical processing path (for example, by dispatching work asynchronously).
Any design for concurrency needs to be done up front, so that it is well understood which resources can be shared safely and where the potential scalability bottlenecks are.
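The first principle above, holding locks briefly, can be sketched as follows. The class `WordCounter` and the function `expensive_parse` are hypothetical; the point is only that the slow step runs outside the lock and the lock guards nothing but the update to shared state.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def expensive_parse(line: str) -> str:
    """Hypothetical slow step; it touches no shared state, so it needs no lock."""
    return line.strip().lower()


class WordCounter:
    """Shared counter whose lock is held only for the in-memory update."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def record(self, line: str) -> None:
        # Do the slow work first, outside the lock, so other threads are not blocked.
        word = expensive_parse(line)
        # Hold the lock only for the short read-modify-write on shared state.
        with self._lock:
            self._counts[word] = self._counts.get(word, 0) + 1

    def snapshot(self) -> dict:
        # Return a copy so callers never touch the shared dictionary directly.
        with self._lock:
            return dict(self._counts)


if __name__ == "__main__":
    counter = WordCounter()
    lines = ["Apple\n", "apple", " BANANA "] * 1000
    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(counter.record, lines))
    print(counter.snapshot())
```

Had `expensive_parse` been called inside the `with self._lock:` block, every thread would queue behind it and the counter would behave as if it were single-threaded.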
4. Know your requirements
In order to build a successful software system, you need to know what your goals are and what you are aiming for. Although functional requirements are usually clear, non-functional requirements (or system quality requirements) are often missing. If you really need to build highly scalable software, you first need to establish the following characteristics of the key components and workflows:
Target average performance and peak performance (e.g. response time, latency, etc.).
Target average load and peak load (e.g. concurrent users, data volume, etc.).
Acceptable limits for performance and scalability.
Performance may not be the most critical aspect, but you must know this information as early as possible, because the performance requirements determine how you approach scalability.
5. Continuous Testing
Understanding the requirements allows you to start designing and building solutions. The designs we draw up and the code we write are static, so you cannot be sure how they will behave until you execute them. In addition, all decisions about performance and scalability should be supported by evidence, and that evidence should be collected and reviewed from the outset of the project and throughout its life. In other words, set measurable goals across the system, measure actual performance against them, and consider performance at every stage of the project.
One of the most common mistakes is letting our own experience or hearsay cloud our view of system performance and scalability. You may also need to revisit other decisions made on the project in order to meet the system's non-functional requirements. For example, non-functional requirements may lead you to move away from the standard choice and adopt something less mainstream or less popular. Non-functional requirements can overturn rigid rules: evidence trumps dogma.
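One way to gather such evidence continuously is to time an operation repeatedly and compare a latency percentile with the agreed target. The sketch below is a minimal version of that idea; the helper names, the measured operation, and the 50 ms target are all hypothetical.

```python
import math
import time


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def measure_latencies(operation, runs=50):
    """Time repeated calls to `operation` and return the durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        durations.append(time.perf_counter() - start)
    return durations


def meets_target(samples, pct, target_seconds):
    """True when the chosen percentile is within the agreed target."""
    return percentile(samples, pct) <= target_seconds


if __name__ == "__main__":
    # Hypothetical operation and target, for illustration only.
    samples = measure_latencies(lambda: sum(range(10_000)))
    print("p95 (s):", percentile(samples, 95))
    print("within 50 ms target:", meets_target(samples, 95, 0.050))
```

A check like `meets_target` can run as part of an automated build, so that a regression against the target is noticed when it is introduced and not at the end of the project.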
6. Architecture-First
Perhaps the most important principle for building scalable systems is that if you need the system to be scalable, you must design for it up front. Many people (myself included) have fallen into the trap of thinking that you can build an application and it will automatically scale vertically (scale up) or horizontally (scale out), especially when Java EE first appeared. Applications designed to scale horizontally can almost always be scaled vertically, but applications designed for vertical scaling are almost impossible to scale horizontally. Most applications can be scaled vertically by running them on more powerful hardware, but horizontal scaling is a harder problem. For example, how do you make sure that data is consistent between application instances? How do you make sure your singletons and synchronized blocks still work when the application runs as several instances?
Of course, thinking ahead is not the same as doing a waterfall-style big design up front. Iterative and agile processes both help, and they provide a framework for doing just enough design to solve the problem. Be pragmatic. And no matter how good we think we are at designing scalable applications, it is best not to trust ourselves and to write and test code as early as possible.
7. Look at the big picture
Finally, remember to look at the whole picture: see the forest before the trees. It is easy to tweak components at a fine-grained code level, but ultimately it is the system as a whole that needs to be optimized. Pay attention to the performance and scalability of every link in the chain, and sacrifice local optimizations where necessary. If you need profiling tools to identify bottlenecks, use them, but don't rush in before you understand the overall performance profile. Because performance is inversely proportional to the sum of all latencies across the system, any operation whose latency grows faster than the load will eventually become a problem. That said, if you find it difficult to meet your performance and scalability goals, it is worth asking whether you have chosen the right architecture. Again, look at the big picture and make sure someone is taking on the architect's responsibility.
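The point about waiting times adding up across the whole system can be made concrete with a toy model. Every stage name and formula below is invented for illustration; the sketch only shows how a stage whose latency grows faster than the load ends up dominating the total.

```python
# Hypothetical latency models, in milliseconds, as a function of concurrent requests.
STAGES = {
    "web": lambda load: 5.0,                       # flat: unaffected by load
    "app": lambda load: 2.0 + 0.01 * load,         # grows in proportion to load
    "db": lambda load: 1.0 + 0.0001 * load ** 2,   # grows faster than load
}


def total_latency(load):
    """End-to-end latency is the sum of the latencies of every stage."""
    return sum(model(load) for model in STAGES.values())


def dominant_stage(load):
    """Name of the stage contributing the most latency at this load."""
    return max(STAGES, key=lambda name: STAGES[name](load))


if __name__ == "__main__":
    for load in (10, 100, 1000):
        print(load, round(total_latency(load), 2), dominant_stage(load))
```

In this model the web tier is the largest contributor at low load, which might tempt you to optimize it first, yet at 1000 concurrent requests the database stage accounts for most of the total. Measuring only one component, or only at one load level, would miss that.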
Summary
This article has put forward some principles and guidelines for building scalable applications, covering many different aspects of the software development process. To anyone building a scalable system, the best advice I can give is to think it through clearly and design for it. Scalability is not magic, and it does not come for free. Finally, faster hardware may save you for a while, but don't rely on it for good!