Editor's note: with the rapid growth of the company's business, the challenges facing the platform have grown far faster than the business itself: demand keeps increasing, the number of engineers keeps rising, and the complexity we face has increased sharply. Against this background, the platform's technical architecture has completed its evolution from a traditional monolithic application to microservices.
The evolutionary process of system architecture
Monolithic application architecture (first-generation architecture)
This was the platform in its infancy. Traffic was small and, to save costs, all functionality was packaged into a single application, built on .NET + SQL Server:
Presentation layer: at the outermost (topmost) level, closest to the user. It displays data and receives user input, providing users with an interactive interface; the platform used .NET-based Web Forms.
Business logic layer: undoubtedly the part of the system architecture that embodies its core value. It focuses on business rules, business processes, and other system design related to business requirements, that is, the domain logic of the system; for this reason the business logic layer is often also called the domain layer. Its position in the architecture is critical: it sits between the data access layer and the presentation layer and acts as the connecting link in data exchange. Since this is a weakly coupled layered structure, dependencies between layers point downward; the bottom layer is "ignorant" of the layers above it, and changing an upper layer's design has no effect on the lower layers it calls. If the layered design also follows interface-oriented design, this downward dependency should be a weak dependency as well. Toward the data access layer, this layer is the caller; toward the presentation layer, it is the callee.
Data access layer: sometimes called the persistence layer, it is mainly responsible for access to the database system, binary files, text documents, or XML documents. At this stage the platform used NHibernate (Hibernate for .NET) + SQL Server.
The first-generation architecture looks simple, but it supported the platform's early business development and satisfied site traffic on the order of tens of thousands of user visits. However, once user traffic began growing at scale, the problems were exposed:
- Maintenance costs keep increasing: when a failure occurs, the possible causes to consider multiply, so the costs of analyzing, locating, and repairing the failure rise accordingly, and the mean time to repair grows. Any one module's failure affects the other application modules, and without a developer's deep understanding of the global functionality, fixing one failure often introduces others, dragging the process into a vicious circle of "the more you fix, the more failures appear."
- Poor scalability: all of the application's code runs on the same server, which makes horizontal scaling of the application difficult; only vertical scaling is possible.
- Longer delivery cycles: any minor change or code commit triggers compilation of the entire application, running unit tests, code checks, building and generating the deployment package, validating functionality, and so on. Feedback cycles per release grow longer, and the number of builds per unit of time drops.
- Longer onboarding: as the application gains features and the code grows more complex, new team members need ever more time to understand the business context, become familiar with the application, and set up a local development environment.
Vertical Application Architecture (second generation architecture)
To solve the problems faced by the first-generation architecture, the team devised the following strategy, which became the second-generation application architecture (vertical application architecture):
- The application is split into independent application modules.
- Individual application modules are deployed independently, with session stickiness solving the horizontal-scaling problem for application modules behind the load balancer.
Sticky sessions are a cookie-based load-balancing scheme: a cookie binds the client's session to a backend server, guaranteeing that, under certain conditions, the same client is always served by the same backend server. When a request arrives, the server sets a cookie that effectively says: bring this along next time and come straight back to me. In the project we used the session_sticky module from Taobao's open-source Tengine.
- The database is split into different databases, accessed by the corresponding application.
- Domain name splitting.
- Static and dynamic separation.
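The sticky-session idea described above can be sketched in plain Java. This is an illustrative stand-in for what Tengine's session_sticky module does at the load balancer, not its actual implementation; the class and method names are made up:

```java
import java.util.List;

/**
 * Illustrative sketch of sticky routing: the same cookie value always
 * maps to the same backend server while the backend list is unchanged.
 * Not the Tengine session_sticky module itself.
 */
public class StickyRouter {
    private final List<String> backends;

    public StickyRouter(List<String> backends) {
        this.backends = backends;
    }

    /** Route a request by its sticky cookie value. */
    public String route(String stickyCookie) {
        // Deterministic hash -> index, so repeat visitors hit the same node.
        int idx = Math.floorMod(stickyCookie.hashCode(), backends.size());
        return backends.get(idx);
    }
}
```

The real module additionally handles cookie issuance and backend failure; this sketch only shows the "same client, same server" guarantee.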
As you can see, the second-generation architecture solved horizontal scaling at the application level. After this round of optimization it supported access on the order of hundreds of thousands of users, and parts of the application were rewritten in Java using an MVC architecture. Of course, some problems remained.
- Applications are highly coupled and heavily interdependent.
- Interaction between application modules is complex; sometimes modules directly access each other's databases.
- The database involves too many join queries and slow queries, making database optimization difficult.
- The database is a single point of failure; a fault there is critical and cannot be recovered from.
- Data replication problems are serious, causing large amounts of inconsistent data.
We tried to solve the replication problem with SQL Server AlwaysOn, but experiments showed a delay of at least 10 s during replication, so this option was abandoned.
- System expansion is difficult.
- Development teams each fight their own battles, and development efficiency is low.
- The testing workload is huge, and releases are difficult.
MicroServices Architecture (Platform status: Third generation architecture)
To solve the problems of the first- and second-generation architectures, we combed through and optimized the platform. Based on the platform's business needs and the lessons of the first two generations, we identified the core requirements of the third-generation architecture:
- Core business is extracted as independent services that serve the outside world.
- Service modules are continuously and independently deployed, reducing release lead times.
- The database is split into separate databases and tables by service.
- Caching is used heavily to improve access performance.
- Interactions between systems use a lightweight REST protocol instead of an RPC protocol.
- The development language moves from .NET to Java.
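Of the requirements above, heavy use of caching is the easiest to illustrate in isolation. The platform's real cache layer is Redis (described later); purely as a local illustration of the idea, a minimal in-process LRU cache can be built on `LinkedHashMap`:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Minimal LRU cache sketch; illustration only -- the platform's real
 * caching layer is a Redis cluster, not an in-process map.
 */
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // access-order: gets refresh recency
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the least-recently-used entry
    }
}
```

An in-process cache like this only helps a single node; a shared cache such as Redis is what lets many service instances avoid repeated database hits.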
Based on this, the reconstruction of the third-generation architecture of the platform is done.
Looking at the composition of the third-generation architecture, it is divided into eight main parts:
- CDN: the CDN system is responsible for redirecting a user's request, in real time, to the service node nearest the user, based on comprehensive information such as network traffic, each node's connections and load, and the distance and response time to the user. The goal is to let users fetch the content they need nearby, relieving Internet congestion and improving the site's responsiveness for users.
When choosing a CDN vendor, the platform needs to consider how long the vendor has been in business, whether it has expandable bandwidth resources, flexible traffic and bandwidth options, stable nodes, and cost-effectiveness.
- LB layer: the platform includes many business domains, and different domains have different clusters. The LB (load balancer) layer is a load-balancing service that distributes traffic across multiple business servers; by spreading traffic it extends the application system's service capacity, and by eliminating single points of failure it improves the application system's availability.
Choosing a load balancer requires weighing many factors (does it meet high-concurrency and high-performance needs, how is session stickiness solved, which load-balancing algorithms are supported, is compression supported, how much memory does caching consume). The candidates fall mainly into two kinds:
LVS: works at layer 4. A high-performance, high-concurrency, scalable, and reliable load balancer implemented in Linux, supporting multiple forwarding modes (NAT, DR, IP tunneling), of which DR mode supports load balancing across a WAN. It supports hot active/standby failover (Keepalived or Heartbeat), but is relatively dependent on the network environment.
Nginx: works at layer 7. An event-driven, asynchronous, non-blocking, multi-process, high-concurrency load balancer/reverse proxy. It can split HTTP traffic by domain name, directory structure, and regular-expression rules. It detects server failures through ports and through signals such as the status code or timeout returned when a server processes a page, and resubmits failed requests to another node; its drawback is the lack of URL-based health checks. For session stickiness we use the cookie-based extension nginx-sticky-module. This is the scheme the platform currently uses.
- Business layer: represents the services a given business area of the platform provides. For the platform these include the goods, member, live-streaming, order, finance, forum, and other systems; different systems provide services for different domains.
- Gateway and registry: provides the unified underlying entry point for microservice APIs and registration management. It encapsulates the internal system architecture and offers a REST API to each client, while also implementing responsibilities such as monitoring, load balancing, caching, service degradation, and throttling. The platform currently implements this with Nginx + Consul.
- Service layer: a set of small, autonomous services that work together. The platform determines service boundaries based on business boundaries, with each service focusing only on what lies within its own boundary. This layer is built on Spring Cloud.
- Infrastructure Layer: This layer provides infrastructure services for upper-level services, mainly in the following categories:
Redis cluster: provides caching services to the upper layers, with in-memory operations and fast response times.
MongoDB cluster: thanks to MongoDB's flexible document model, highly available replica sets, and scalable sharded clusters, it provides the platform with storage for articles, posts, link logs, and the like. The MongoDB cluster uses a replicated, sharded architecture to solve availability and scalability.
MySQL cluster: stores data with transactional requirements, such as members, goods, and orders.
Kafka: supports all of the platform's messaging services.
ES (Elasticsearch): provides the platform's search services for goods, members, orders, logs, and so on.
- Integration layer: this is the biggest highlight of the entire platform. It covers continuous integration (CI), continuous delivery (CD), and DevOps culture, letting everyone take part in delivery and completing the automated deployment and release of a service under standardized processes and norms, thereby improving the overall efficiency of the release and delivery pipeline.
- Monitoring layer: splitting the system into smaller, finer-grained microservices brings the platform many benefits, but it also increases the system's operational complexity. A task served to an end user is completed by many microservices, and a single initial call ultimately triggers multiple downstream service calls; how can the request flow be reconstructed to reproduce and diagnose a problem? For this we deployed the open-source Open-Falcon platform for application-level monitoring, use ELK for application-log analysis, use a self-built service for link (trace) log tracking, and practice a unified configuration service based on Spring Cloud Config Server.
How micro-service teams work
> Conway's Law: any organization that designs a system will produce a design whose structure is a copy of the organization's communication structure.
Working style
In implementing the third generation architecture, we made several adjustments to the team organization:
- Teams are organized as full-stack teams along business boundaries and given autonomy. This keeps communication costs inside the system low, makes each subsystem more cohesive, weakens the coupling between subsystems, and reduces cross-system communication costs.
- An architecture group was set up to oversee the implementation of the third-generation architecture. A reasonable structure for an architecture team usually includes five roles: system architect, application architect, operations, DBA, and agile expert. How, then, do we control the output of the architecture group and ensure the smooth implementation of the architecture work?
- First: create a self-organizing culture of continuous improvement, which is the key cornerstone of implementing microservices. Only through continuous improvement, continuous learning and feedback, and continuously building such a culture and team can the microservice architecture keep developing and stay vital, achieving our original intent.
- Second: the architecture group's output must go through a rigorous process. Because the architecture group delivers solutions of general applicability, to guarantee their quality we run a strict closed loop from project research through review and implementation.
Next, let's talk about the whole team's delivery process and development model. Unless these are defined in advance, it is difficult for a microservice architecture to deliver its real value. Let's look at the microservice delivery process first.
When developing applications with a microservice architecture, we actually design, develop, test, and deploy individual microservices, because services do not depend on each other; the approximate delivery process looks like this.
Design phase:
The architecture group splits the product's functionality into several microservices and designs an API for each microservice (for example, a REST API), providing API documentation that includes each API's name, version, request parameters, response results, error codes, and so on.
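As a hedged illustration of the API elements just listed (version, response result, error code), a response envelope might be modeled as follows. The class and field names are hypothetical, not the platform's actual API contract:

```java
/**
 * Hypothetical response envelope illustrating the documented API elements
 * (version, response result, error code). Illustration only -- not the
 * platform's actual API definition.
 */
public class ApiResponse<T> {
    public final String apiVersion;
    public final int errorCode;   // 0 means success
    public final String message;
    public final T data;

    private ApiResponse(String apiVersion, int errorCode, String message, T data) {
        this.apiVersion = apiVersion;
        this.errorCode = errorCode;
        this.message = message;
        this.data = data;
    }

    public static <T> ApiResponse<T> ok(String version, T data) {
        return new ApiResponse<>(version, 0, "OK", data);
    }

    public static <T> ApiResponse<T> error(String version, int code, String message) {
        return new ApiResponse<>(version, code, message, null);
    }
}
```

A fixed envelope like this is also what makes front-end "mock data" easy to fabricate before the backend is finished.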
Development phase:
Development engineers implement the APIs, including the unit tests for them. Meanwhile, front-end engineers develop the Web UI in parallel, creating fake data from the API documentation (we call this "mock data"), so that they do not have to wait for the backend APIs to be fully developed before starting their own work, achieving parallel front-end development.
Test phase:
This phase is fully automated: developers commit code to the code server; the commit triggers a continuous-integration build and tests; if the tests pass, the build is automatically pushed to the staging environment via Ansible scripts. In practice, for the production environment a review and approval process comes first, and only after approval is the build pushed to production. This improves productivity and controls some of the online instability that insufficient testing might otherwise cause.
Development model
In the delivery process above, the development, testing, and deployment phases all involve controlling code behavior, so we also need a development model to ensure that many people can collaborate well.
Practicing the strangler pattern:
Because the third-generation architecture spans a large change and the legacy .NET system could not be modified, we adopted the strangler pattern: new proxy microservices are added outside the legacy system, and the upstream is controlled at the LB. Instead of directly modifying the original system, we gradually replace the old one.
Development specification:
Experience shows that we need to make good use of the code version-control system. I once met a development team where, because branching had no conventions, merging the code for a small release took several hours, and in the end the developers no longer knew which branch to merge. GitLab supports multi-branch code versioning well, and we need to use this capability to improve development efficiency. The following is our current branch-management convention.
The most stable code is placed on the master branch, and we do not commit the code directly on the master branch, only the code merge operation on that branch, such as merging the code of the other branches onto the master branch.
For daily development we pull a develop branch from the master branch. Everyone can access it, but in general we do not commit code directly on it either; code is merged into the develop branch from other branches.
When we need to develop a feature, we need to pull out a feature branch from the develop branch, such as Feature-1 and feature-2, to develop specific features in parallel on those branches.
When features are complete and we decide to publish a version, we pull a release branch from the develop branch, such as release-1.0.0, and merge the features to be published from their feature branches into the release branch. The release branch is then pushed to the test environment, where test engineers do functional testing and development engineers fix bugs on that branch. When the test engineers can no longer find any bugs, we deploy the release branch to the pre-production environment, verify once more that there are no bugs, and then deploy the release branch to production. After going live, the code on the release branch is merged into both the develop and master branches, and a tag such as v1.0.0 is placed on master.
When a bug is found in the production environment, we need to pull a hotfix branch (such as hotfix-1.0.1) from the corresponding tag (for example, v1.0.0) and make a bug fix on that branch. After the bug is fully repaired, you need to merge the code on the hotfix branch into the develop branch and the master branch at the same time.
We also have requirements for version numbers, in the format x.y.z: x is incremented only for major refactors, y when new features are released, and z when a bug fix is released. Every microservice must strictly follow this development model.
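The x.y.z ordering described above (and the v1.0.0-style tags from the branching convention) can be made concrete with a small comparator. This is a sketch, not part of the platform's actual tooling:

```java
/**
 * Sketch of comparing x.y.z version strings as described in the text;
 * accepts an optional leading "v" as used in git tags like v1.0.0.
 */
public final class SemVer implements Comparable<SemVer> {
    public final int x, y, z;

    public SemVer(String version) {
        String[] parts = version.replaceFirst("^v", "").split("\\.");
        this.x = Integer.parseInt(parts[0]);
        this.y = Integer.parseInt(parts[1]);
        this.z = Integer.parseInt(parts[2]);
    }

    @Override
    public int compareTo(SemVer o) {
        if (x != o.x) return Integer.compare(x, o.x); // major refactor
        if (y != o.y) return Integer.compare(y, o.y); // new feature release
        return Integer.compare(z, o.z);               // bug-fix release
    }
}
```

Note that components compare numerically, so 1.10.0 is newer than 1.2.0; comparing tags as plain strings would get this wrong.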
Micro-Service development system
We have described the architecture, delivery process, and development model of the MicroServices team, so let's talk about the micro-service development system.
What is micro-service architecture
Martin Fowler's definition:
In short, the microservice architectural style is an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API. These services are built around business capabilities and independently deployable by fully automated deployment machinery. There is a bare minimum of centralized management of these services, which may be written in different programming languages and use different data storage technologies.
To put it simply, microservices are a design style for software system architecture. It advocates splitting a previously monolithic system into several small services that run in their own processes, with services communicating through lightweight, HTTP-based RESTful APIs. Each split microservice is built around one or a few highly cohesive business functions of the system, and each service maintains its own data storage, business development, automated test cases, and independent deployment mechanism. Thanks to the lightweight communication mechanism, these microservices can be written in different languages.
Micro-service split granularity
How fine-grained should microservices be? More often than not, a balance must be found between granularity and the team: the smaller the microservice, the greater the benefits of its independence, but managing a large number of microservices also becomes more complex. Splitting basically follows a few principles:
- Single responsibility principle: "gather together the things that change for the same reasons, and separate those things that change for different reasons." We use this principle to determine microservice boundaries.
- Team autonomy principle: the larger the team, the higher the cost of communication and coordination. In our practice a team never exceeds 8 people, and each team is full-stack and fully functional.
- Split the database before the services: whether the data model can be fully separated determines whether the microservice boundaries are completely clear. In practice we first discuss the data-model boundary; the data-model boundary maps to the business boundary, and we then complete the service split from the bottom up.
How to build a micro-service architecture
To build a good microservice architecture, technology selection is an extremely important stage; only by choosing the right "actors" can the play be performed well.
We use Spring Cloud as the microservice development framework. Spring Boot has an embedded Tomcat and can run a jar package directly to publish a microservice, and Spring Cloud also provides a series of "out of the box" plug-ins, such as: configuration center, service registration and discovery, circuit breakers, routing, proxying, the control bus, one-time tokens, global locks, leader election, distributed sessions, and cluster state, which can greatly improve our development efficiency.
Engineering Structure Specification
The following is the project structure each service should have in our practice, in which:
- Microservice name + service:
Provides service invocation for other internal microservices. The service name + API module defines the inter-service interface contract, using Swagger REST interface definitions. The service name + server module contains the application and configuration that can start the service directly.
- Microservice name + web:
The entry point for upper-level web application requests; this layer usually calls the underlying microservices to complete a request.
API Gateway Practice
The API gateway serves as the access entry for all back-end microservices and APIs, performing auditing, throttling, monitoring, billing, and similar duties for them. Common API gateway solutions are:
- Application-layer solution
The most famous, of course, is Netflix's Zuul, but that does not mean this solution is best for you. Netflix built an application-layer solution like Zuul because it uses AWS and has limited control over its infrastructure; considered overall, that is not necessarily the most appropriate solution.
However, if your team has limited control over the overall technical infrastructure and the team structure is not complete, an application-layer approach may be the best solution for you.
- Nginx + Lua solution
This is also the solution we adopted and consider the most suitable. OpenResty and Kong are the more mature options, but Kong uses PostgreSQL or Cassandra, and few domestic companies are likely to choose either; that said, Kong's HTTP API design is very good.
- Our solution
We use an Nginx + Lua + Consul combination. Although our team is mostly Java and choosing ZooKeeper would have been the more natural choice, as the avant-garde we chose Consul after testing and analysis:
It has good HTTP API support and can manage upstreams dynamically, which means we can implement service registration and discovery seamlessly through the publishing platform or glue scripts, making access to services transparent.
In the above scenario:
Consul serves as the state store or configuration center (mainly using Consul's KV storage); Nginx serves as the API gateway and dynamically distributes traffic to the configured upstream nodes according to the upstream configuration in Consul.
Nginx connects to the Consul cluster according to its configuration items.
When an API or microservice instance is launched, its instance information is registered/written to Consul manually, from the command line, or by the release and deployment platform.
Nginx picks up the corresponding upstream update and dynamically changes its internal upstream distribution configuration, thereby routing and distributing traffic to the corresponding API and microservice instance nodes.
By codifying the registration and discovery logic above in scripts or in a unified release and deployment platform, transparent service access and scaling can be achieved.
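The registration-and-routing flow above can be sketched in plain Java as an in-memory stand-in: the map plays the role of Consul's KV store and the round-robin pick plays the role of Nginx's upstream selection. This is purely illustrative; the real system uses Consul and Nginx/Lua, not this class:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Illustrative in-memory stand-in for the Consul KV + Nginx upstream flow
 * described above. Not the real implementation.
 */
public class UpstreamRegistry {
    private final Map<String, List<String>> upstreams = new ConcurrentHashMap<>();
    private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    /** The "register/write instance info to Consul" step. */
    public void register(String service, String instance) {
        upstreams.computeIfAbsent(service, s -> new CopyOnWriteArrayList<>()).add(instance);
    }

    /** The "Nginx routes traffic to an upstream node" step: round-robin. */
    public String route(String service) {
        List<String> nodes = upstreams.get(service);
        if (nodes == null || nodes.isEmpty()) {
            throw new IllegalStateException("no instance registered for " + service);
        }
        int n = counters.computeIfAbsent(service, s -> new AtomicInteger()).getAndIncrement();
        return nodes.get(n % nodes.size());
    }
}
```

The point of the real scheme is that registering a new instance changes routing without reloading Nginx, which is exactly what the shared registry models here.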
Link Monitoring Practice
We found that log monitoring, which was very simple in the days of the monolithic application, had become a big problem under the microservice architecture. If we cannot track the business flow or locate problems, we spend a lot of time finding and pinpointing issues, and in complex microservice interactions we are left very passive. This is why distributed link monitoring came into being. Its core is the call chain: a global ID strings together the pieces of the same request scattered across service nodes, restoring the original call relationships so that we can trace system problems, analyze call data, and compute system metrics.
Distributed link tracing first appeared in "Dapper", a paper published by Google in 2010.
So what is a call chain? A call chain restores a distributed request into its call links, letting us explicitly see, on the backend, the path of a distributed request: the time spent on each node, which machine the request reached, the request status at each service node, and so on. It reflects how many services, and how many levels of services, a request passes through (for example, if your system A calls B and B calls C, that request's depth is 3). If you find requests with depths greater than 10, that service likely needs optimization.
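The depth just described can be computed from the parent/child links between spans. A minimal sketch follows; the field names are illustrative and do not match any specific tracing system's schema:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Illustrative call-chain depth calculation from span parent links;
 * names are made up, not Dapper/Zipkin's actual data model.
 */
public class CallChain {
    // spanId -> parentSpanId (null for the root span)
    private final Map<String, String> parents = new HashMap<>();

    public void addSpan(String spanId, String parentSpanId) {
        parents.put(spanId, parentSpanId);
    }

    /** Depth of a span: 1 for the root, +1 per hop (A -> B -> C gives 3). */
    public int depth(String spanId) {
        int d = 0;
        for (String s = spanId; s != null; s = parents.get(s)) {
            d++;
        }
        return d;
    }
}
```

In a real tracing system the spans of one request share a global trace ID, and a query like "depth greater than 10" is run over the collected spans rather than in memory.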
Common solutions are:
- Pinpoint
GitHub address: naver/pinpoint — Pinpoint is an open source APM (Application Performance Management) tool for large-scale distributed systems written in Java.
Readers interested in APM should take a look at this open-source project from a Korean team. It uses the javaagent mechanism to do bytecode instrumentation (probes), injecting a trace ID and capturing performance data. Performance-analysis tools on the Java platform such as New Relic and OneAPM use a similar mechanism.
- Zipkin
Official website: OpenZipkin — A Distributed Tracing System
GitHub address: openzipkin/zipkin — Zipkin is a distributed tracing system
This was open-sourced by Twitter, also modeled on the Dapper system.
Zipkin's performance-data collection inside Java applications is implemented by a component called Brave.
Brave GitHub address: https://github.com/openzipkin/brave
This component implements a series of Java interceptors to track the call process of HTTP/Servlet requests and database accesses. You then add these interceptors to your configuration, for example Spring's, to complete performance-data collection for a Java application.
- CAT
GitHub address: dianping/cat — Central Application Tracking
This was open-sourced by Dianping; its functionality is quite rich, and some domestic companies use it. However, CAT implements tracing by hard-coding "instrumentation points" into the code, that is, intrusively.
That has pros and cons. The upside is that you can instrument exactly where you need, which is more targeted; the downside is having to change existing systems, which many development teams are reluctant to do.
Among the three tools above, if you do not want to reinvent the wheel, my recommended order is Pinpoint → Zipkin → CAT, for the simple reason that their intrusiveness into program source code and configuration files increases in that order.
Our solutions:
For microservices, we extended the architecture on the basis of Spring Cloud and, following Google's Dapper concept, designed a distributed tracing system for our microservice architecture (WEAPM).
As shown, we can query response logs by parameters such as service name, time, log type, method name, exception level, and interface latency. With the resulting TrackId, the entire link log of a request can be queried, which greatly helps in reproducing problems and analyzing logs.
Circuit Breaker Practice
In a microservice architecture the system is split into many microservices. Network problems, or problems in a depended-on service itself, can cause calls to fail or be delayed, and those problems in turn delay the caller's own externally facing service. If requests to the caller keep increasing at that point, tasks pile up waiting for the failed dependency to respond until, finally, the caller's own service is paralyzed. The circuit-breaker pattern was created to solve this problem.
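The pattern itself can be sketched as a tiny state machine: trip to OPEN after consecutive failures and fail fast with a fallback. This is an illustration only, far simpler than a production breaker, which adds half-open probing, timeouts, isolation, and metrics:

```java
/**
 * Minimal circuit-breaker sketch (CLOSED -> OPEN after N consecutive
 * failures). Illustrative only.
 */
public class CircuitBreaker {
    public enum State { CLOSED, OPEN }

    private final int failureThreshold;
    private int consecutiveFailures = 0;
    private State state = State.CLOSED;

    public CircuitBreaker(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    /** Run the call, or the fallback when the breaker is open or the call fails. */
    public <T> T call(java.util.function.Supplier<T> action,
                      java.util.function.Supplier<T> fallback) {
        if (state == State.OPEN) {
            return fallback.get(); // fail fast: stop hammering the dependency
        }
        try {
            T result = action.get();
            consecutiveFailures = 0; // success resets the failure count
            return result;
        } catch (RuntimeException e) {
            if (++consecutiveFailures >= failureThreshold) {
                state = State.OPEN; // trip the breaker
            }
            return fallback.get();
        }
    }

    public State state() { return state; }
}
```

Failing fast is what prevents the backlog described above: callers get a degraded answer immediately instead of queuing behind a dead dependency.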
In practice we used Hystrix to implement circuit breaking. Hystrix is one of Netflix's open-source microservice framework packages, designed to provide greater fault tolerance for latency and failure by controlling the nodes that access remote systems, services, and third-party libraries. Hystrix features thread and semaphore isolation with fallback mechanisms and breaker functionality, request caching and request collapsing, and monitoring and configuration.
The circuit breaker uses the following procedure:
Enable circuit breakers
```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.circuitbreaker.EnableCircuitBreaker;

@SpringBootApplication
@EnableCircuitBreaker
public class Application {

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}
```
Fallback usage

```java
import java.util.Map;

import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import org.springframework.stereotype.Component;

@Component
public class StoreIntegration {

    @HystrixCommand(fallbackMethod = "defaultStores")
    public Object getStores(Map<String, Object> parameters) {
        // do stuff that might fail
        return null;
    }

    public Object defaultStores(Map<String, Object> parameters) {
        return /* something useful */ null;
    }
}
```
Configuration file
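The configuration file's contents are not reproduced above. As a hedged sketch only (the values are illustrative defaults, not the platform's actual settings), Hystrix command properties are typically set in application.yml along these lines:

```yaml
hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 1000   # fail the call after 1 s
      circuitBreaker:
        requestVolumeThreshold: 20        # minimum requests before the breaker can trip
        errorThresholdPercentage: 50      # trip when >= 50% of requests fail
        sleepWindowInMilliseconds: 5000   # how long to stay open before a half-open probe
```

Per-command overrides replace `default` with the command key.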
Resource control practices
When it comes to resource control, many readers will probably think of Docker. Docker really is a good solution for resource control. We did review whether to use Docker, but ultimately chose to abandon it and control resources with Linux libcgroup scripts instead, for the following reasons:
- Docker is better suited to containerized control of large memory resources, but our online servers generally have around 32 GB, so using Docker would waste resources.
- Using Docker would complicate operations, and there was strong pushback from the business side.
Why cgroups?
A common requirement on Linux systems is to limit the resource allocation of one process or a group of processes; that is the concept of a control group, in which a specified share of CPU time, IO bandwidth, available memory, and so on can be allocated.
Hence cgroups (control groups), originally proposed by Google engineers and later merged into the Linux kernel; Docker is built on top of them.
Libcgroup Use process:
Installation
yum install libcgroup
Start the service
service cgconfig start
Configuration file template (memory as an example):
cat /etc/cgconfig.conf
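The template's contents are not shown above; a typical memory-limit entry in /etc/cgconfig.conf looks roughly like this (the group name and limit are illustrative, not the platform's actual values):

```
mount {
    memory = /sys/fs/cgroup/memory;
}

group test {
    memory {
        memory.limit_in_bytes = 512M;   # cap the group at 512 MB
    }
}
```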
You can see that the memory subsystem is mounted at /sys/fs/cgroup/memory; entering that directory and creating a folder creates a control group.

mkdir test
echo "service process number" >> tasks    (tasks is a file under the test directory)

This adds that process to the memory limits of this cgroup.
Summary
To sum up, this article set out from the background of our microservice practice and introduced that practice, the technology selection, and related microservice technologies, including the API gateway, registry, circuit breaker, and more. I believe these techniques will give you some new ideas in your own practice.
Original link: Start with Spring Cloud and talk about the practical way of microservices architecture