Today's Headlines' Practice of Building Microservices with Go


Today's Headlines built its large-scale microservices architecture with the Go language. This article combines Go language features to explain practices around concurrency, timeout control, and performance in building those microservices.

Editor's note: This article is from the public account "InfoQ" (ID: infoqchina), by author Shang; 36Kr is authorized to publish it.

More than 80% of Today's Headlines' backend service traffic currently runs on services built with Go. With over 100 microservices, a peak QPS above 7 million, and more than 300 billion requests processed daily, it is the largest Go deployment in the industry.

The Journey of Building Microservices with Go

Before 2015, the main backend languages at Today's Headlines were Python, plus some C++. With the rapid growth of business and traffic, server-side pressure kept increasing and problems surfaced frequently. Python's nature as an interpreted language, together with its dated multi-process service model, faced serious challenges. In addition, the server-side architecture at the time was a typical monolith with severe coupling, and some independent functions needed to be split out of it.

Why Choose the Go Language?

The Go language has several natural advantages over other languages:

    • Simple syntax, quick to pick up

    • High performance, fast compilation, and high development efficiency

    • Native support for concurrency: the goroutine model is an excellent server-side model and also well suited to network calls

    • Easy deployment: small build artifacts with almost no external dependencies

At the time, Go had released version 1.4. I had started using Go to develop backend components back when Go was at version 1.1, and had used it to build backend services carrying very large traffic, so I was fairly confident in the stability of the language itself. Combining that with the overall state of the headline backend's service architecture, we decided to use the Go language to build the microservices architecture for the Today's Headlines backend.

In June 2015, Today's Headlines began using the Go language to refactor the backend Feed stream service, iterating on existing business and splitting out services while refactoring, until June 2016, when the Feed stream backend service had almost entirely migrated to Go. Because the business grew rapidly and services were being split at the same time, there is no horizontal comparison of metrics before and after the refactor; but after switching to Go, the overall stability and performance of the service improved greatly.

Microservice Architecture

For complex inter-service calls, we abstracted the concept of a five-tuple: (From, FromCluster, To, ToCluster, Method). Each five-tuple uniquely defines a class of RPC call. With the five-tuple as the unit, we built a suite of microservices architecture components.

We developed the internal microservices framework kite in the Go language, fully compatible with Thrift. With the five-tuple as the base unit, we integrated service registration and discovery, distributed load balancing, timeout and circuit-breaker management, service degradation, Method-level metrics monitoring, and distributed call-chain tracing into the kite framework. Internal Go services are currently developed with the kite framework, and the overall architecture supports unlimited horizontal scaling.

The implementation details of the kite framework and the microservices architecture will be shared when there is an opportunity. Here, the focus is on the convenience the Go language itself brings to building a large-scale microservices architecture, and the experience we have gained in practice. The content mainly covers concurrency, performance, monitoring, and some experience with using the Go language.

Concurrency

As a newer programming language, Go's biggest feature is native support for concurrency. Unlike traditional threading and process implementations based on the OS, Go's concurrency is based on user-space scheduling, which makes it very lightweight: tens of thousands or even hundreds of thousands of concurrent units can run easily. Server-side applications developed with Go therefore use the goroutine-per-request model, where each request is handled by its own goroutine.

Compared with the process/thread model, this supports several orders of magnitude more concurrency; and compared with server-side models based on event callbacks, the Go development style matches the way people reason about sequential logic, so even large projects developed in Go remain easy to maintain.

    • Concurrency model

Go's concurrency is an implementation of the CSP concurrency model, whose core idea is: "Do not communicate by sharing memory; instead, share memory by communicating." In the Go language this is realized through goroutines and channels. Hoare's 1978 CSP paper contains a description of how to solve a problem with CSP thinking:

"Problem: to print in ascending order all primes less than 10000. Use an array of processes, SIEVE, in which each process inputs a prime from its predecessor and prints it. The process then inputs an ascending stream of numbers from its predecessor and passes them on to its successor, suppressing any that are multiples of the original prime."

To find all primes below 10000, the method used here is the sieve: starting from 2, each time a prime is found, mark all numbers divisible by it; when there are no numbers left to mark, what remains are the primes. The following takes finding all primes below 10 as an example of solving the problem with CSP.

As can be seen, each layer of filtering uses a separate concurrent handler, and adjacent handlers communicate to pass data along. With 4 concurrent handlers we obtain the list of primes below 10; the corresponding Go implementation is as follows:

This example shows two features of developing with the Go language:

The concurrency of the Go language is simple and can improve processing efficiency by increasing concurrency.

Variables can be shared between goroutines by means of communication.

    • Concurrency control

When concurrency becomes a native property of the language, it gets used constantly in practice to handle logic, especially anything involving network I/O such as RPC calls and database access. Consider an abstract description of a microservice processing a request:

When a request arrives at GW, GW needs to consolidate the results of five downstream services to respond to it. Assuming the five downstream calls have no data dependencies on each other, GW initiates five RPC requests at once and then waits for all five results. To avoid overly long waits, the concept of a wait timeout is introduced; after a timeout event occurs, to avoid resource leaks, an event is sent to the requests still being processed concurrently. From practice, two abstract models emerge:

Wait

Cancel

The Wait and Cancel styles of concurrency control are everywhere when developing services in Go; wherever concurrency is used, both patterns appear. In the example above, after GW initiates the five concurrent RPC calls, the main goroutine enters a wait state until the five results return: that is the Wait pattern. In the Cancel pattern, the total timeout for the request is reached before the five RPC calls return, so all outstanding RPC requests must be cancelled and processing ends early. The Wait pattern is the more widely used of the two; Cancel is mainly reflected in timeout control and resource reclamation.

In the Go language, sync.WaitGroup and context.Context implement these two patterns respectively.

    • Timeout control

Reasonable timeout control is critical when building a reliable large-scale microservices architecture; unreasonable or ineffective timeout settings can cause a service avalanche across the entire call chain.

Suppose a dependent service G responds slowly for some reason; requests from upstream services then block on calls to service G. If the upstream services have no reasonable timeout control at that point, and the requests blocked on service G cannot be released, the upstream services themselves are affected, and in turn every service on the call chain.

In the Go language, the server model is goroutine-per-request: one goroutine processes one request. If request-handling goroutines block on a slow dependency, a large number of goroutines can pile up in a short time. Each goroutine consumes a different amount of memory depending on its processing logic, and when the goroutine count surges, the service quickly consumes a great deal of memory.

The vicious cycle of surging goroutines and rising memory use increases the burden on the Go scheduler and the runtime GC, which in turn degrades the service's processing capacity until the entire service becomes unavailable. While developing microservices in Go, we ran into this class of problem many times; we call it goroutine explosion.

Is there a good way to solve this problem? The problem usually occurs because a network call blocks for too long; even with network timeouts set reasonably, the effective timeout occasionally escapes its limit. To analyze how to use timeout control in the Go language, first look at the steps of a network call.

The first step is to establish a TCP connection; a connect timeout is usually set so that establishing the connection cannot block indefinitely.

In the second step, the serialized request data is written to the socket. To ensure that writing does not block, the Go language provides the SetWriteDeadline method to bound the time spent writing to the socket. Depending on the size of the request, the socket may need to be written more than once, and for efficiency, serialization and writing are interleaved. The Thrift library implementation therefore resets the write deadline before each socket write.

In the third step, the returned result is read from the socket. As with writing, the Go language provides the SetReadDeadline interface, and since the data may also be read in multiple chunks, the read deadline is reset before each read.

Analyzing this procedure shows that the total time spent on one RPC is composed of three parts: the connect timeout, the write timeout, and the read timeout. And because reads and writes can each occur multiple times, the effective timeout can exceed the configured limit. To solve this problem, the concept of concurrency-based timeout control was introduced in the kite framework, integrated into the client call library of the kite framework.

The concurrency timeout control model introduces a "Concurrent Ctrl" module, part of the microservice circuit-breaking function, which controls the maximum number of concurrent requests the client may initiate. The whole flow of concurrency timeout control works as follows:

First, the client initiates the RPC request, passing through the "Concurrent Ctrl" module, which decides whether the current request is allowed to proceed. If the RPC request is allowed, a goroutine is started to issue the RPC call, and a timeout timer is initialized. The main goroutine then monitors both the RPC completion signal and the timer signal. If the RPC completion event arrives first, the RPC succeeded; otherwise, when the timer event fires, the RPC call has timed out. This model guarantees that in either case the RPC does not exceed the predefined time, achieving precise timeout control.

The Go language introduced the context package into the standard library in version 1.7, and it has since become standard practice for concurrency control and timeout control; version 1.8 then added context support to several older standard libraries, including the database/sql package.

Performance

Go has a clear performance advantage over traditional web server-side programming languages. Still, incorrect usage, or stringent latency requirements on a service, often make it necessary to use profiling tools to track down problems and optimize service performance. The Go toolchain provides a variety of analysis tools for developers:

    • CPU profiling

    • Memory usage profiling

    • Goroutine stack inspection

    • GC log inspection

    • The trace analysis tool

In the course of developing with the Go language, we have summarized some approaches to writing high-performance Go services:

    • Pay attention to lock usage: where possible, lock the data rather than the whole procedure

    • Where CAS works, use CAS operations instead of locks

    • Optimize hotspot code in a targeted way

    • Do not overlook the impact of GC, especially for high-performance, low-latency services

    • Reasonable object reuse can deliver very good optimization results

    • Avoid reflection; eliminate its use in high-performance services

    • In some scenarios, try tuning the "GOGC" parameter

    • Upgrade to new Go versions once they are stable, since old versions will never improve

The following describes a real-world example of optimizing an online service's performance.

This is a basic storage service providing two methods, SetData and GetDataByRange, for bulk data storage and for batch data retrieval over a time interval. To improve performance, data is stored in a KV database with the user ID plus a time period as the key, and all data within that interval as the value. Therefore, when new data needs to be stored, the existing data must first be read from the database, stitched into the corresponding time interval, and then saved back.

For requests that read data, the corresponding key list is computed based on the requested time interval, and then the data is read from the database.

At peak, the service's interface response time was relatively high, seriously affecting overall performance. Analyzing the service at peak with the profiling methods above yielded the following conclusions:

Problem point:

    • Large GC pressure, high CPU resource consumption

    • The deserialization process consumes high CPU

Optimization ideas:

    • The GC pressure comes mainly from frequent allocation and release of memory, so we decided to reduce memory allocation and object creation

    • Serialization had been using Thrift; through benchmarks we found MessagePack serialization to be comparatively more efficient

Analysis of the service interface function can be found that data decompression, deserialization of the process is the most frequent, which also conforms to the conclusion of performance analysis. The process of extracting and deserializing is carefully analyzed, and it is found that an "IO" is required for deserialization operations. Reader "interface, and for decompression, it implements" IO "itself. Reader "interface. In the Go language, "IO. Reader "interface is defined as follows:

The interface defines a Read method, and any object implementing it can have a certain number of bytes read from it. Therefore, only a small memory buffer is needed to pipeline directly from decompression into deserialization, rather than decompressing all the data first and then deserializing it, which saves memory.

To avoid frequent buffer allocation and release, sync.Pool is used to implement an object pool and achieve object reuse.
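A minimal sketch of such a buffer pool with sync.Pool (the `process` helper is illustrative, not the service's actual handler):

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers instead of allocating per request.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// process borrows a buffer, uses it, and returns it to the pool.
func process(payload string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold old data
	defer bufPool.Put(buf)
	buf.WriteString(payload)
	return buf.String()
}

func main() {
	fmt.Println(process("hello")) // hello
}
```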

In addition, the interface for fetching historical data was changed from reading multiple keys in a loop to reading the data for each key from the database concurrently. After these optimizations, the service's pct99 latency at peak dropped from 100ms to 15ms.

This is a fairly typical Go service optimization case, which can be summarized in two points:

    • Increase concurrency at the business level

    • Reduce the use of memory and objects

The optimization process used the pprof tool to find the performance bottlenecks, then exploited the io.Reader interface to pipeline the data processing, thereby optimizing the performance of the service as a whole.

Service Monitoring

The Go language runtime package provides multiple interfaces for developers to obtain the current state of the running process. The kite framework integrates monitoring of the goroutine count, goroutine state, GC pause time, GC frequency, heap memory usage, and more. These metrics are captured in real time for every running service, with alarm thresholds set per metric, for example on the goroutine count and GC pause time. We are also experimenting with snapshotting the stack and runtime state of a running service, to help trace anomalies that cannot be reproduced, such as a process restart.

Programming Thinking and Engineering

Compared with traditional web programming languages, Go does bring changes in programming mindset. Every Go service is a single process, and a panic in any request-handling goroutine, if unrecovered, exits the whole process; so whenever a goroutine is started, one must consider whether a recover is needed, to avoid affecting other goroutines. In web server-side development it is common to want to string together the whole life of a request, which traditionally relies heavily on thread-local variables; Go has no such concept, so a context must be passed along through function calls.

Finally, concurrency is the norm in projects developed with Go, so extra attention must be paid to access to shared resources, and handling critical-section logic adds mental overhead. These differences in programming mindset require an adjustment period for developers used to traditional web backend development.

As for engineering discipline, it is a less-mentioned strength of the Go language. The official rationale for creating Go notes that in most languages, once a codebase becomes huge, managing the code itself and analyzing its dependencies becomes extremely difficult, the code becomes the biggest source of trouble, and many large projects end up too frightening to touch. Go is different: its syntax is deliberately simple and C-like, there are not many ways to do the same thing, and some stylistic rules are even enforced by the Go toolchain. Furthermore, the Go standard library ships with source-code analysis packages that make it easy to turn a project's code into an AST.
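That last point can be sketched with the standard go/parser and go/token packages, which parse Go source into an AST:

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// parse turns Go source text into an AST and reports the package name
// and number of top-level declarations.
func parse(src string) (name string, decls int, err error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return "", 0, err
	}
	return f.Name.Name, len(f.Decls), nil
}

func main() {
	name, decls, _ := parse("package demo\nfunc Hello() string { return \"hi\" }\n")
	fmt.Println(name, decls) // demo 1
}
```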

The original article illustrates the engineering quality of Go with a figure of assembling a square: in Go there is only one way to build it, with every unit consistent, while in Python the stitching can be done in many different ways.

Final Words

Today's Headlines built its large-scale microservices architecture with the Go language. This article has focused, through Go language features, on the practice of concurrency, timeout control, and performance in building those microservices. In fact, the Go language is not only excellent for service performance, it is also well suited to containerized deployment, and a large portion of our services already run on an internal private cloud platform. Together with microservices-related components, we are evolving toward a Cloud Native architecture.

Author Introduction:

Shang, senior R&D engineer at Today's Headlines. He joined the company in 2015 and is responsible for service refactoring. He promoted the use of the Go language internally and developed the internal microservices framework kite, integrating service governance, load balancing, and other microservice capabilities, carrying out the construction of a large-scale microservices architecture in Go at Today's Headlines. He previously worked at Xiaomi.
