Background
This talk is mainly about auto scale in a microservice setting. Douban, founded in March 2005, is an internet company with a relatively long history. It mainly covers culture-related areas across web, apps, and other products, and now includes Douban Read, Douban Movie, Douban Music, and so on.
Introduction to Douban
On the technical side, Douban's main development languages are Python and Golang. Douban has a self-developed private cloud platform, Douban App Engine (hereafter "DAE"), which hosts all of Douban's applications. Each application is described with a configuration file covering the MQs, daemons, and crons it depends on. This lets a developer describe all of an application's resource requirements in a single profile, and the platform can then manage the required resources on top of that description file.
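As a rough, hedged illustration only (the real DAE descriptor format and field names are not given in the talk), such a description file might cover the MQ, daemon, and cron dependencies like this:

```python
# Hypothetical sketch of a DAE-style application descriptor expressed as a
# Python dict; every key name here is an assumption, not the real DAE schema.
app_descriptor = {
    "name": "example-app",                 # globally unique application name
    "web": {"entrypoint": "app:wsgi"},     # web instance serving HTTP
    "mq": ["example.events"],              # message queues the app consumes
    "daemons": ["bin/sync_worker"],        # long-running daemon processes
    "crons": [{"schedule": "0 3 * * *",    # scheduled job: every day at 03:00
               "command": "bin/cleanup"}],
}
print(app_descriptor["mq"])
```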
(Figure: Douban App Engine)
All resources are scheduled uniformly on DAE, and product developers do not have to care about specific machines, such as how many physical machines a line of business needs; the platform deploys all resources uniformly. The whole development process, from PR to testing to going online, is also one unified flow.
Status Quo
Douban started servicization in 2012, mainly because, as product lines multiplied, the monolithic application could no longer meet Douban's requirements for development efficiency. So from 2012 the whole site was gradually split, and by the end of 2016 more than 90% of the products had been servicized. Thanks to thorough preparation, the performance of the whole site did not change during this process; it was not degraded by servicization. Full-stack availability actually improved: a less-than-successful commit from a new hire no longer brings the entire stack down. Releases went from a daily release to on-demand releases, and after servicization all release processes are controlled by each line of business.
Back to today's theme: why does Douban do auto scale? Doing microservices inside a company creates many dependencies between systems. With microservices, the overall system structure can no longer be described with a layered model; it becomes a complex dependency network, with a large number of applications depending on each other in a criss-cross fashion.
Goal
Here is the problem. With a single monolithic application, deployment is actually very simple for ops: deploy the application on a machine, run the appropriate number of processes for that machine's capacity, and when requests exceed the processing capacity, simply add servers.
After microservices, operators face a large number of services, each with different request volume and performance characteristics, and it is hard to manually maintain how much computing resource each specific application should get. This raises a question: how can applications both run well and run economically? Giving every application abundant resources is obviously not economical, so a way to improve server utilization and automate this is urgently needed.
The road to servicization
Douban's microservice work started in December 2013. Because this was early, some now-popular open source technologies were not yet available. In February 2014 simple trial use began, and by March the platform could already take over more than 140 deployed applications. At that point the company was still in the middle of servicization, so only the small number of applications deployed as microservices enjoyed the benefits of auto scale.
In March 2014 Docker had not yet taken off, and Douban at this time was a pure Python company, using Python to do application packaging. In the second half of 2014 we began to discover the great advantages of Docker, so we carried out a Docker transformation of the whole platform, completing it in Q2 2015.
The development of the whole auto scale started with the 140-plus applications mentioned earlier. As servicization advanced, performance was continuously optimized and stability improved; currently about 500 applications hosted across the platform are under auto scale.
(Figure: DAE scale model)
On the DAE platform, a git repo maps to an app: a collection of resource allocation, scheduling, billing, and logic. Each app is globally unique and independent of the others, and is called via Thrift/pidl/HTTP RPC. Douban supports multiple instances for web and service: an application that needs both a web page and a Thrift service can run two different services from the same codebase. The web service serves external users, while the Thrift service is called internally, at a magnitude much larger than the external traffic. Auto scale provides a mechanism to separate the two instances by configuration and scale each one independently.
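A hedged sketch of the idea of separating the two instances by configuration so that each is scaled on its own (the key names are invented; the real DAE configuration is not shown in the talk):

```python
# Sketch: one codebase, two instances (external web + internal Thrift RPC),
# each with its own scale settings. All field names are assumptions.
instances = {
    "web":     {"protocol": "http",   "scale_policy": "rps_ratio", "min_workers": 8},
    "service": {"protocol": "thrift", "scale_policy": "rps_ratio", "min_workers": 32},
}

for name, spec in instances.items():
    # auto scale treats each instance independently when adjusting workers
    print(f"{name}: {spec['protocol']} scaled by {spec['scale_policy']}")
```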
DAE scale is a multi-process model: each request is handled by one process, so an application's service capacity is directly related to its total number of workers. A node is a physical machine; some nodes deploy one application, some deploy several. From the platform's resource-scheduling point of view, all stateless nodes are equal; they differ mainly in CPU and memory. Nodes and apps have a many-to-many relationship, and the number of workers (worker processes) each app has on each node is the key quantity in auto scale; it is stored in a database.
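The key quantity, how many workers each app runs on each node, can be modeled as a simple mapping; the sketch below is only illustrative, not Douban's actual schema:

```python
# Illustrative sketch of the app <-> node worker bookkeeping described above.
# In production this lives in a database; names here are invented for clarity.
from collections import defaultdict

# workers[app][node] = number of worker processes of `app` on `node`
workers: dict[str, dict[str, int]] = defaultdict(dict)

workers["movie-web"]["node-01"] = 12
workers["movie-web"]["node-02"] = 8
workers["group-api"]["node-02"] = 20   # nodes and apps are many-to-many

def total_workers(app: str) -> int:
    """An app's service capacity is directly tied to its total worker count."""
    return sum(workers[app].values())

print(total_workers("movie-web"))  # -> 20
```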
Theoretically, all applications are equal, but some applications are less important than others, and the different levels require hard physical isolation. A pool is a set of nodes, and nodes are separated into a number of pools. Some pools hold the company's small internal applications, which get only limited support and very limited computing resources. More important applications are divided into two levels, a production pool and a stable pool. Some very important applications, the company's core products, use their own dedicated resources. In addition, some departments have their own computing resources and want to run them through the platform's mechanisms; in that case the department can contribute its own computing resources as a separate pool.
(Figure: overall auto scale architecture)
The figure shows the entire auto scale structure. At the bottom, each node has a monitor that collects various performance metrics, including memory, CPU, load, and how busy the current processes are. The collected data is sent to several places, mainly for monitoring staff to view; for auto scale itself that path's performance is not good enough, so a copy of the data is also saved into Redis. Behind Redis sits a component called Bridge, which further processes the data collected from the nodes and encapsulates a number of APIs, including app, node, and pool APIs. All application scale policies are configured in this application.
The component that really applies the auto scale strategy is DAE scale. This piece has a number of cron jobs doing the corresponding work, such as app auto scale and web auto scale, which actually go and scale the workers, following a series of strategies. For pools, computing resources of Douban's most core applications are shared: Movie, Group, and SNS, the three most core businesses, monopolize their own pools, but the number of nodes in each pool is adjusted dynamically every day, and there is a dedicated pool for elastic scaling.
After scale finishes its calculations, the next step is to scale an app up or down, move an app from one node to another, or put an app onto a node or take it off. Scale produces these actions, which finally land back on the nodes: the nodes expose some tool APIs that let scale write back to them, stringing the whole process together.
After an application is updated externally, the nginx configuration is changed: different nodes running the same application are given different weights, so nodes with more processing capacity receive more requests and weaker nodes receive fewer. This is the overall structure of scale.
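For instance, the per-node weights could be derived from each node's worker count for the app. The following sketch (invented hostnames and the worker-count-as-weight rule are assumptions, not Douban's actual tooling) renders an nginx upstream block accordingly:

```python
# Sketch: turn per-node worker counts into nginx upstream weights so that
# nodes with more capacity receive more requests.
node_workers = {"10.0.0.11:8000": 12, "10.0.0.12:8000": 8, "10.0.0.13:8000": 4}

def render_upstream(name: str, nodes: dict[str, int]) -> str:
    lines = [f"upstream {name} {{"]
    for addr, worker_count in nodes.items():
        lines.append(f"    server {addr} weight={worker_count};")
    lines.append("}")
    return "\n".join(lines)

print(render_upstream("movie_web", node_workers))
```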
Collect Data
Regarding data collection: Douban uses a multi-process model, so how do we judge whether a process is working? By counting. Each process places a mark before handling a request and deletes the mark when it finishes. The monitor then counts, on each machine, how many workers there are and how many marks, which tells it how busy the node and its processes are. There is a little trick here: the marks are written to a memory disk, so they impose essentially no I/O cost on the system.
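A minimal sketch of this marking trick, assuming the markers live under a memory-backed path such as /dev/shm (the exact location and file naming are assumptions):

```python
# Sketch of the busy-marker trick: a worker drops a mark file before handling
# a request and removes it afterwards; the monitor counts marks vs. workers.
import contextlib
import glob
import os

MARK_DIR = "/dev/shm/dae_busy_marks"   # memory disk stand-in (assumed path)
os.makedirs(MARK_DIR, exist_ok=True)

@contextlib.contextmanager
def busy_mark(pid: int | None = None):
    pid = pid if pid is not None else os.getpid()
    path = os.path.join(MARK_DIR, str(pid))
    open(path, "w").close()          # mark: this worker is busy
    try:
        yield
    finally:
        with contextlib.suppress(FileNotFoundError):
            os.remove(path)          # unmark when the request is done

def busy_ratio(total_workers: int) -> float:
    """Monitor side: fraction of workers currently holding a mark."""
    return len(glob.glob(os.path.join(MARK_DIR, "*"))) / total_workers
```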
(Figure: scale application strategy)
App scale involves two kinds of strategy. The first decides how many processes each application instance needs when some of them are busy; a policy has to compute exactly how many. Douban has tried several: NOOP, for applications that do not need scale at all, where a few processes are enough and no interference is required; busy_ratio_mark_cnt, which intervenes based on the ratio of busy workers; and rps_ratio, which can calculate from the application's actual request volume and response time how many workers should be required at the moment.
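The rps_ratio idea can be illustrated with Little's law: in a one-request-per-process model, the workers needed are roughly the request rate times the average response time, plus some headroom. This is a hedged sketch; the actual formula and parameters Douban uses are not given in the talk:

```python
import math

def workers_needed(rps: float, avg_response_s: float,
                   headroom: float = 1.3, min_workers: int = 2) -> int:
    """Little's law estimate: concurrent requests ~= rps * response time.
    `headroom` and `min_workers` are illustrative safety parameters."""
    concurrent = rps * avg_response_s
    return max(min_workers, math.ceil(concurrent * headroom))

print(workers_needed(rps=200, avg_response_s=0.05))  # -> 13
```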
There is also an ondemand_rps strategy. Auto scale is built on the assumption that all request volume grows and shrinks gradually, but there are exceptions, for example when an app does a large-scale push or runs an ad: once the push goes out or the ad is released, a lot of requests come back at once. At that point the auto scale curve lags visibly, because it has no way to follow such an abrupt rise in RPS. Douban's approach is to give scale an API and hand the initiative back to the application developers: based on the size of the push, they estimate the expected volume and average return rate and feed that back to scale. Scale then does a quick pull-up, immediately pushing the required processes up for a period of time, waits for the requests to arrive, and slowly scales down as the requests pass.
Ondemand_rps also comes with a bit of experience: an approximate expected duration is needed. After scaling up, if the requests have not arrived, auto scale would otherwise quickly intervene and start scaling down for lack of traffic; so the pulled-up capacity is held for, say, half an hour, which is a value from product experience. If the requests still have not come after half an hour, there probably will not be such a strong burst this time.
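A hypothetical shape for the ondemand_rps call described above (the real API is not shown in the talk): the developer reports the expected extra traffic and how long to hold the pre-scaled capacity, and scale pulls workers up immediately, then drifts back down as the burst passes.

```python
# Hypothetical client-side usage of an ondemand pre-scale API. The function
# name, arguments, and half-hour hold window are assumptions for illustration.
def ondemand_prescale(app: str, expected_rps: float, avg_response_s: float,
                      hold_seconds: int = 1800) -> dict:
    """Tell scale to pull workers up *before* a push/ad burst arrives.
    If the burst never shows up within `hold_seconds`, scale drops back."""
    return {
        "app": app,
        "extra_workers": int(expected_rps * avg_response_s * 1.3),
        "hold_seconds": hold_seconds,   # product-experience value (~30 min)
    }

# e.g. before a push expected to produce ~250 extra requests per second
# against a 50 ms average response time:
request = ondemand_prescale("movie-web", expected_rps=250, avg_response_s=0.05)
print(request)
```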
As for which node gets additional processes, there are two policies; see the sketch below. One is completely equal: for example, an application on 5 nodes needing 20 processes gets 4 on each node, absolute equalization, for applications that require every node's processing capacity to be about the same. The other allows different nodes to have different processing capacity, with the mild guarantee that nodes do not differ too much: no node receives too many processes, because if, say, 80% of the requests land on one node and that node has a problem, there will be trouble. The redundancy should be enough that when one node goes down, the remaining capacity can cover the outage.
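A sketch of the two placement policies just described; the 40% per-node cap is an invented parameter that only illustrates the "no node gets too many processes" idea:

```python
# Sketch of the two worker-distribution policies. All constants are illustrative.
def distribute_equal(total: int, nodes: list[str]) -> dict[str, int]:
    base, extra = divmod(total, len(nodes))
    return {n: base + (1 if i < extra else 0) for i, n in enumerate(nodes)}

def distribute_weighted(total: int, capacity: dict[str, float],
                        max_share: float = 0.4) -> dict[str, int]:
    cap_total = sum(capacity.values())
    alloc = {}
    for node, cap in capacity.items():
        share = min(cap / cap_total, max_share)   # keep any node from dominating
        alloc[node] = round(total * share)
    # (a real scheduler would redistribute whatever the cap leaves over)
    return alloc

print(distribute_equal(20, ["n1", "n2", "n3", "n4", "n5"]))
print(distribute_weighted(20, {"n1": 8.0, "n2": 4.0, "n3": 2.0}))
```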
Each application has a configured minimum number of nodes. Applications differ: some with high demand require more, for example at least 10 nodes, while for others two nodes are enough. Scale ensures that its operations never break this minimum requirement.
So the core of the whole scale is to adjust the total number of workers and the processes on each node. In engineering, VIP and ordinary applications are sometimes separated. If a scale run gets stuck, for example during the morning or evening peak, a corresponding alarm fires, because if processing capacity is not added at that time, a large-scale problem is likely.
Scale has some basic strategies: adding workers must be fast, so capacity responds quickly when requests arrive, but reducing must be slow; dropping workers quickly can make things very bumpy. In addition, the distribution between nodes must not be too uneven; afterwards a simple standard deviation across nodes is calculated to check whether the gap has become too large.
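These two rules, scale up fast / scale down slowly and keep per-node worker counts from drifting too far apart, might be sketched like this (the step limits and threshold are assumptions, not Douban's tuning):

```python
# Sketch of two basic scale policies: asymmetric up/down steps, and a
# standard-deviation check on per-node worker counts.
import statistics

def next_worker_count(current: int, desired: int,
                      max_up: int = 10, max_down: int = 2) -> int:
    if desired > current:
        return min(desired, current + max_up)    # add workers quickly
    return max(desired, current - max_down)      # shed workers slowly

def too_uneven(per_node_workers: list[int], threshold: float = 3.0) -> bool:
    """Flag it if the spread of workers across nodes is too large."""
    return statistics.pstdev(per_node_workers) > threshold

print(next_worker_count(current=20, desired=45))   # -> 30 (fast ramp-up)
print(next_worker_count(current=20, desired=5))    # -> 18 (slow ramp-down)
print(too_uneven([4, 4, 5, 16]))                   # -> True
```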
There is also another dimension, from the node's point of view: the load of nodes in the same pool needs to be balanced, mainly by CPU and memory. Smaller workers can be moved off preferentially, leaving the relatively large workers in place. After a while, scale tends to stabilize, and you will find that some applications with very large traffic gradually monopolize some nodes, which is exactly what the operators want to see.
For the VIP pools, there is a concept of an elastic pool: a free pool holds the currently idle resources. When a VIP pool needs resources, it takes them from the free pool and returns them when the load drops. Between these large VIP pools, computing resources are effectively shared. The main consideration is that different products have different user access patterns: some products have higher traffic during the day and lower at night, some the other way around, so through this mechanism the same group of machines can actually be used much more fully.
Monitoring
Auto scale is a fully automated tool, and Douban has built a lot of monitoring around it. All of the site's resources are handed over to auto scale to manage, so any lag or other incident in auto scale would be a very big problem; hence a large number of alarms watch related things. In addition, application scale runs once every 5 minutes; a higher frequency is not very meaningful and would make the whole system too bumpy. It is also necessary to provide enough manual tools: when auto scale itself has problems, operators need adequate scripts, because it is deadly if the automated system misbehaves and cannot be intervened in manually.
There are some typical situations. For example, the site is hit by DDoS and the whole site's traffic drops to the floor; all workers become idle, auto scale begins to intervene and gradually scales workers down, and once the DDoS passes those workers are no longer enough. So when a similar situation occurs, a manual switch is used to stop it. Strong logging is also needed so that subsequent troubleshooting can locate problems quickly.
More
After app auto scale was done, it was found that Douban has a large number of offline MQ consumers. Previously product developers had to decide how many consumers to deploy, which is hard to get right. MQ auto scale turns out to be quite different from scaling for external requests: an external request goes away soon after it arrives, whether or not it was answered, but an MQ message does not go away after it arrives; it stays there, and the queue may keep getting longer. There is one signal that app scale did not need to pay attention to: the queue is very important here; you need to watch whether it is growing, and if it is growing you need to add consumers quickly.
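A hedged sketch of the MQ case: unlike web requests, the signal is whether the backlog is growing, so consumers are added quickly when the queue length trends upward. The thresholds, window size, and step sizes below are assumptions for illustration only:

```python
# Sketch: scale MQ consumers on queue growth rather than on request rate.
from collections import deque

class ConsumerScaler:
    def __init__(self, window: int = 6):
        self.samples = deque(maxlen=window)   # recent queue-length samples

    def observe(self, queue_length: int, consumers: int) -> int:
        """Return the suggested consumer count after one sample."""
        self.samples.append(queue_length)
        if len(self.samples) < self.samples.maxlen:
            return consumers
        growing = self.samples[-1] > self.samples[0]
        if growing and queue_length > 0:
            return consumers + max(1, consumers // 2)   # add consumers fast
        if not growing and queue_length == 0:
            return max(1, consumers - 1)                # backlog drained, shrink slowly
        return consumers

scaler = ConsumerScaler()
for qlen in [0, 120, 450, 900, 1500, 2300]:
    suggestion = scaler.observe(qlen, consumers=4)
print(suggestion)   # backlog is growing, so more consumers are suggested
```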