Foreword
Under high concurrency, having every request allocate a fresh piece of memory for its computation, such as:
make([]int64, 0, len(ids))
can be a very expensive thing. To locate the slow statements in the project, I printed slow logs in a "bisection" fashion to narrow down where the program was stalling. It was not slow every time; every few seconds it would suddenly become extremely slow, with TPS dropping from 2000 to 200. The cause was an allocation just like the one above.
To initialize a slice, a beginner will write:
make([]int64, 0)
A more advanced programmer knows that this first allocation is effectively no allocation at all: as elements are appended, the slice repeatedly grows, roughly doubling its capacity each time it runs out. In the following code, appending 3 elements causes the slice to grow 3 times.
a := make([]int64, 0)
fmt.Println(cap(a), len(a))
for i := 0; i < 3; i++ {
    a = append(a, 1)
    fmt.Println(cap(a), len(a))
}

Output:
0 0
1 1
2 2
4 3
Each time the slice grows, a new region of memory is allocated, the existing elements are copied into it, and the new element is appended. What about the elements in the old region? They sit there waiting for garbage collection.
The simplest optimization is to reserve the space the slice needs in advance, as in the code from the beginning:
make([]int64, 0, len(ids))
This saves the slice from allocating memory for repeated expansions. But there is still a problem.
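To make the cost visible, here is a minimal benchmark sketch, assuming a hypothetical element count of 10000 (put it in a _test.go file and run go test -bench=. -benchmem):

package pool_test // hypothetical package name

import "testing"

const n = 10000 // hypothetical element count

// BenchmarkGrow appends into a slice with no preallocated capacity,
// forcing repeated reallocation and copying as the slice grows.
func BenchmarkGrow(b *testing.B) {
    for i := 0; i < b.N; i++ {
        s := make([]int64, 0)
        for j := 0; j < n; j++ {
            s = append(s, int64(j))
        }
    }
}

// BenchmarkPrealloc reserves the full capacity up front,
// so append never has to reallocate.
func BenchmarkPrealloc(b *testing.B) {
    for i := 0; i < b.N; i++ {
        s := make([]int64, 0, n)
        for j := 0; j < n; j++ {
            s = append(s, int64(j))
        }
    }
}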
Heap or stack?
The program requests memory from the operating system, and that memory is divided into heap and stack. The stack can be loosely understood as the memory a function call uses internally; it is released as soon as the function returns.
func F() {
    temp := make([]int, 0, 20)
    ...
}
A variable like temp in the code above is just a temporary allocated inside the function and is not passed out as a return value, so the compiler places it on the stack. The benefit of stack memory is that it is released directly when the function returns, never triggering garbage collection, so it has no effect on performance.
func F() []int {
    a := make([]int, 0, 20)
    return a
}
The code above performs the same allocation, but because the slice is handed out as a return value, the compiler must assume it will still be used after the function returns, so its memory cannot be released at that point: it is allocated on the heap instead. Allocating on the heap is what leads to garbage collection.
So, how would you explain the fate of the following three variables?
func F() {
    a := make([]int, 0, 20)
    b := make([]int, 0, 20000)

    l := 20
    c := make([]int, 0, l)

    _, _, _ = a, b, c // for illustration only; silences "declared but not used"
}
a and b request space in the same way, differing only in the amount, yet their fates are diametrically opposed. a, as mentioned earlier, goes on the stack, while b, because of the sheer amount of memory requested, is moved onto the heap by the compiler: even a temporary variable is allocated on the heap when it asks for too much.
As for c, to us it means the same thing as a, but because its length is not known at compile time, the compiler puts this allocation on the heap as well, even if the length turns out to be very small.
You can check where each variable is allocated with the command below. For more information, refer to my earlier translated article on optimizing Go.
go build -gcflags='-m' . 2>&1
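For the function F above, the escape analysis output contains lines like the following ("escapes to heap" versus "does not escape"); the exact positions and wording vary by Go version, so treat this as illustrative rather than verbatim:

./main.go:4:11: make([]int, 0, 20) does not escape
./main.go:5:11: make([]int, 0, 20000) escapes to heap
./main.go:8:11: make([]int, 0, l) escapes to heap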
Fragmentation of memory
In a real project, memory is mostly requested via c := make([]int, 0, l), where the length is not known in advance, so these variables naturally end up on the heap. The garbage collection algorithm Golang uses is mark-and-sweep. Simply put, the runtime requests a large chunk of memory from the operating system and divides it into small blocks linked together in a list. Each time the program asks for memory, the allocator walks the blocks on the list, returns the address of one that fits, and only asks the operating system for more when nothing suitable is found. When many allocations of varying sizes are made, this causes memory fragmentation: the requested heap memory is not used up, yet no free block of a suitable size exists for the request, so the entire list is traversed and still more memory is requested from the operating system. This explains the problem I described at the outset: a single memory allocation turning into a slow statement.
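To make the traversal concrete, here is a toy first-fit free list in Go. This is only a sketch of the idea described above, not how Golang's allocator is actually implemented:

// block is one chunk in a toy free list.
type block struct {
    size int    // how much this block can hold
    free bool   // whether the block is available
    next *block // the next block in the list
}

// alloc walks the list and hands out the first free block that is large
// enough. As allocations of mixed sizes come and go, the free blocks left
// over are often too small to reuse (fragmentation): the walk gets longer,
// and eventually more memory must be requested from the operating system.
func alloc(head *block, size int) *block {
    for b := head; b != nil; b = b.next {
        if b.free && b.size >= size {
            b.free = false
            return b
        }
    }
    return nil // nothing fits: grow the heap via the OS
}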
Temporary Object Pool
How to solve this problem? The first thing that comes to mind is an object pool. Golang's sync package provides one: sync.Pool. People generally call it an object pool; I like to call it a temporary object pool, because each garbage collection reclaims the objects in the pool that are not referenced from outside.
func (p *Pool) Get() interface{}
Get selects an arbitrary item from the Pool, removes it from the Pool, and returns it to the caller. Get may choose to ignore the pool and treat it as empty. Callers should not assume any relation between values passed to Put and the values returned by Get.
It is important to note that the Get method removes the returned object from the pool, so when you are done with the object, you have to put it back into the pool.
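The usual pattern is Get, use, then Put when done, often with defer so the object always finds its way back. A minimal sketch, with hypothetical names (bufPool, process) rather than anything from the project:

var bufPool = sync.Pool{
    New: func() interface{} {
        b := make([]int64, 0, 1024) // hypothetical capacity
        return &b
    },
}

func process(ids []int64) {
    p := bufPool.Get().(*[]int64) // take the buffer out of the pool
    defer bufPool.Put(p)          // return it when the function exits
    buf := (*p)[:0]               // reset length, keep the capacity

    for _, id := range ids {
        buf = append(buf, id)
    }
    // ... use buf before returning; do not keep it after Put ...
}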
Soon, I wrote the first version of the object pool optimization:
var idsPool = sync.Pool{
    New: func() interface{} {
        ids := make([]int64, 0, 20000)
        return &ids
    },
}

func NewIds() []int64 {
    ids := idsPool.Get().(*[]int64)
    *ids = (*ids)[:0]
    idsPool.Put(ids)
    return *ids
}
This implementation puts every slice into one and the same pool. To cope with the variable lengths, every buffer is allocated at a generously large fixed capacity. Although this is an optimization, performance did not improve much because of the very large slices involved in the computation.
Then, referring to Dada's sync_pool.go code, I wrote another version:
var DEFAULT_SYNC_POOL *SyncPool

func NewPool() *SyncPool {
    DEFAULT_SYNC_POOL = NewSyncPool(5, 30000, 2)
    return DEFAULT_SYNC_POOL
}

func Alloc(size int) []int64 {
    return DEFAULT_SYNC_POOL.Alloc(size)
}

func Free(mem []int64) {
    DEFAULT_SYNC_POOL.Free(mem)
}

// SyncPool is a sync.Pool based slab allocation memory pool
type SyncPool struct {
    classes     []sync.Pool
    classesSize []int
    minSize     int
    maxSize     int
}

func NewSyncPool(minSize, maxSize, factor int) *SyncPool {
    n := 0
    for chunkSize := minSize; chunkSize <= maxSize; chunkSize *= factor {
        n++
    }
    pool := &SyncPool{
        make([]sync.Pool, n),
        make([]int, n),
        minSize,
        maxSize,
    }
    n = 0
    for chunkSize := minSize; chunkSize <= maxSize; chunkSize *= factor {
        pool.classesSize[n] = chunkSize
        pool.classes[n].New = func(size int) func() interface{} {
            return func() interface{} {
                buf := make([]int64, size)
                return &buf
            }
        }(chunkSize)
        n++
    }
    return pool
}

func (pool *SyncPool) Alloc(size int) []int64 {
    if size <= pool.maxSize {
        for i := 0; i < len(pool.classesSize); i++ {
            if pool.classesSize[i] >= size {
                mem := pool.classes[i].Get().(*[]int64)
                // return (*mem)[:size]
                return (*mem)[:0]
            }
        }
    }
    return make([]int64, 0, size)
}

func (pool *SyncPool) Free(mem []int64) {
    if size := cap(mem); size <= pool.maxSize {
        for i := 0; i < len(pool.classesSize); i++ {
            if pool.classesSize[i] >= size {
                pool.classes[i].Put(&mem)
                return
            }
        }
    }
}
Invocation Example:
attrFilters := cache.Alloc(len(ids))
defer cache.Free(attrFilters)
Focus on the Alloc method. To support slices of variable length, there are several pools, with sizes starting at 5 and going up to 30000 in multiples of 2, that is: 5, 10, 20, ...
DEFAULT_SYNC_POOL = NewSyncPool(5, 30000, 2)
- When allocating memory, find the pool with the smallest size that satisfies the request (see the sketch after this list). For example, a request of length 2 is served from the pool of size 5; a request of 11 is served from the pool of size 20;
- If the requested slice is large, exceeding the upper limit of 30000, the pool is not used at all and the memory is allocated directly;
- Of course, these parameters can be tuned to your actual situation;
- In contrast to the first version, putting objects back into the pool is done through the Free method.
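Putting it together, here is a sketch of how the size classes behave, assuming the SyncPool code above:

pool := NewSyncPool(5, 30000, 2)
// size classes: 5, 10, 20, 40, ..., 20480

a := pool.Alloc(2)     // cap(a) == 5: the smallest class that fits
b := pool.Alloc(11)    // cap(b) == 20: 10 is too small, 20 fits
c := pool.Alloc(50000) // over maxSize: a plain make, never pooled

pool.Free(a) // goes back into the size-5 class
pool.Free(b) // goes back into the size-20 class
pool.Free(c) // cap exceeds maxSize, so this is silently dropped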
Conclusion
Optimizing this interface took a year, start to finish. The results are good: TPS improved considerably, and tp99 dropped by 30%.