The Path of Golang Optimization: The Temporary Object Pool


Foreword

In high-concurrency scenarios, allocating a fresh piece of memory for every request's computation, such as:

make([]int64, 0, len(ids))

can be very expensive. To locate the slow statements in the project, I used a "binary search" of slow logs to pinpoint where the program slowed down. It was not slow on every request; rather, every few seconds it suddenly became extremely slow, and TPS dropped from 2000 to 200. The cause turned out to be an allocation like the one above.

To initialize a slice, a beginner will write:

make([]int64, 0)

A more experienced programmer knows that this first allocation is effectively no allocation at all: if you then append elements, the slice grows exponentially. In the code below, appending 3 elements grows the slice 3 times.

a := make([]int64, 0)
fmt.Println(cap(a), len(a))
for i := 0; i < 3; i++ {
    a = append(a, 1)
    fmt.Println(cap(a), len(a))
}
// Output:
// 0 0
// 1 1
// 2 2
// 4 3

Each time the slice grows, a new region of memory is allocated and the existing elements are copied into it before the new one is appended. What about the elements in the old region? They sit there waiting for garbage collection.

The simplest optimization is to reserve the space the slice needs in advance, as in the first line of code:

make([]int64, 0, len(ids))

This spares the slice from repeatedly reallocating as it grows, but there is still a problem.
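A minimal benchmark sketch makes the cost of growth visible; the file name, constant, and benchmark names here are illustrative, not from the original project:

// grow_test.go
package main

import "testing"

const n = 20000

func BenchmarkAppendGrow(b *testing.B) {
    for i := 0; i < b.N; i++ {
        s := make([]int64, 0) // no capacity: repeated reallocation and copying
        for j := 0; j < n; j++ {
            s = append(s, int64(j))
        }
        _ = s
    }
}

func BenchmarkAppendPrealloc(b *testing.B) {
    for i := 0; i < b.N; i++ {
        s := make([]int64, 0, n) // capacity known up front: a single allocation
        for j := 0; j < n; j++ {
            s = append(s, int64(j))
        }
        _ = s
    }
}

Run it with go test -bench=. -benchmem; the pre-allocated version should show far fewer allocations per operation.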

Heap or stack?

The program requests memory from the operating system, and that memory is divided into the heap and the stack. The stack can be understood simply as memory allocated inside a function call; it is handed back as soon as the function returns.

func F() {
    temp := make([]int, 0, 20)
    ...
}

A variable like temp in the code above is just a temporary variable inside the function and is not returned as a return value, so the compiler allocates it on the stack. The benefit of stack memory: it is released directly when the function returns, never triggering garbage collection, so it has no effect on performance.

func F() []int {
    a := make([]int, 0, 20)
    return a
}

The code above makes the same allocation, but the slice is handed out as the return value, so the compiler must assume the variable is still in use after the function returns and cannot release its memory at that point; it is therefore allocated on the heap. Memory allocated on the heap has to be reclaimed by garbage collection.

So how would you explain the three allocations in the following code in an interview?

func F() {
    a := make([]int, 0, 20)
    b := make([]int, 0, 20000)
    l := 20
    c := make([]int, 0, l)
}

In the code, a and b differ only in the amount of space requested, yet their fates are diametrically opposed. a, as discussed above, goes on the stack; b, because of the large amount of memory requested, is moved by the compiler onto the heap: even a temporary variable is allocated on the heap if it asks for too much memory.

And c, to us, means exactly the same as a, but because its length is not a constant known at compile time, the compiler also puts it on the heap, even though the requested length is very small.

You can see where each variable is allocated with the following command. For more details, refer to my earlier translated article on Go optimization patterns.

go build -gcflags='-m' . 2>&1
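For the function F above, this is roughly what to expect; the file name, line and column numbers in the sample output are illustrative:

// escape.go
package main

func F() {
    a := make([]int, 0, 20)    // small constant size: stays on the stack
    b := make([]int, 0, 20000) // too large: moved to the heap
    l := 20
    c := make([]int, 0, l)     // non-constant size: moved to the heap
    _, _, _ = a, b, c
}

func main() { F() }

// $ go build -gcflags='-m' . 2>&1
// ./escape.go:5:14: make([]int, 0, 20) does not escape
// ./escape.go:6:14: make([]int, 0, 20000) escapes to heap
// ./escape.go:8:14: make([]int, 0, l) escapes to heap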

Memory fragmentation

In a real project, memory is mostly allocated the way c := make([]int, 0, l) is, with a length that is not known until runtime, so these variables naturally end up on the heap. The garbage collection algorithm Golang uses is mark-and-sweep. Simply put, the program requests a large chunk of memory from the operating system and splits it into small blocks linked together in a list. On each allocation it walks the blocks in the list, returns the address of one that fits, and asks the operating system for more when nothing suitable is found.

When allocations are frequent and their sizes vary, this causes memory fragmentation: the requested heap memory is not used up, yet no block of a suitable size is available for the current request. The whole list gets traversed, and ever more memory is requested from the operating system. This explains the problem I described at the start: a single memory allocation turning into a slow statement.
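To make the failure mode concrete, here is a toy first-fit free list. It is only an illustration of the idea described above, not how the Go runtime allocator actually works:

// block is one node in a toy first-fit free list.
type block struct {
    size int
    free bool
    next *block
}

// alloc walks the list and hands out the first free block that is large
// enough. With many variable-sized requests, blocks too small for anyone
// pile up at the front, every allocation walks further, and more and more
// requests fall through to "ask the operating system for more memory".
func alloc(head *block, size int) *block {
    for b := head; b != nil; b = b.next {
        if b.free && b.size >= size {
            b.free = false
            return b
        }
    }
    return nil // no fit found: the heap would have to grow
}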

Temporary Object Pool

How to solve this? The first thing that comes to mind is an object pool. Golang provides one in the sync package: Pool. Most people call it an object pool; I like to call it a temporary object pool, because every garbage collection reclaims the objects in the pool that are not referenced.

func (p *Pool) Get() interface{}

Get selects an arbitrary item from the pool, removes it from the pool, and returns it to the caller. Get may choose to ignore the pool and treat it as empty. Callers should not assume any relation between values passed to Put and the values returned by Get.

It is important to note that the Get method removes the returned object from the pool, so once you are done with an object you have to put it back into the pool yourself.
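A minimal sketch of that Get/use/Put lifecycle; bufPool and work are illustrative names, not from the original code (assumes import "sync"):

var bufPool = sync.Pool{
    New: func() interface{} {
        b := make([]int64, 0, 1024)
        return &b // store a pointer so Put does not allocate
    },
}

func work(ids []int64) {
    p := bufPool.Get().(*[]int64) // Get removes the object from the pool
    buf := (*p)[:0]               // reset length, keep capacity
    buf = append(buf, ids...)
    // ... use buf ...
    *p = buf
    bufPool.Put(p) // hand the buffer back only after we are done with it
}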

Soon, I wrote the first version of the object-pool optimization:

var idsPool = sync.Pool{
    New: func() interface{} {
        ids := make([]int64, 0, 20000)
        return &ids
    },
}

func NewIds() []int64 {
    ids := idsPool.Get().(*[]int64)
    *ids = (*ids)[:0]
    // note: the buffer is put back before the caller is done with it,
    // so concurrent callers can end up sharing the same backing array
    idsPool.Put(ids)
    return *ids
}

This implementation keeps every slice in one and the same pool, and to cope with variable lengths each buffer is allocated at a rather large fixed size. It is an optimization, but for computations on very large slices the performance gain is small.

Then, referring to the great Dada's sync_pool.go, I wrote a second version:

var DEFAULT_SYNC_POOL *SyncPool

func NewPool() *SyncPool {
    DEFAULT_SYNC_POOL = NewSyncPool(
        5,     // minSize
        30000, // maxSize
        2,     // factor
    )
    return DEFAULT_SYNC_POOL
}

func Alloc(size int) []int64 {
    return DEFAULT_SYNC_POOL.Alloc(size)
}

func Free(mem []int64) {
    DEFAULT_SYNC_POOL.Free(mem)
}

// SyncPool is a slab allocation memory pool based on sync.Pool.
type SyncPool struct {
    classes     []sync.Pool
    classesSize []int
    minSize     int
    maxSize     int
}

func NewSyncPool(minSize, maxSize, factor int) *SyncPool {
    // count the size classes: minSize, minSize*factor, ... up to maxSize
    n := 0
    for chunkSize := minSize; chunkSize <= maxSize; chunkSize *= factor {
        n++
    }
    pool := &SyncPool{
        make([]sync.Pool, n),
        make([]int, n),
        minSize,
        maxSize,
    }
    n = 0
    for chunkSize := minSize; chunkSize <= maxSize; chunkSize *= factor {
        pool.classesSize[n] = chunkSize
        pool.classes[n].New = func(size int) func() interface{} {
            return func() interface{} {
                buf := make([]int64, size)
                return &buf
            }
        }(chunkSize)
        n++
    }
    return pool
}

func (pool *SyncPool) Alloc(size int) []int64 {
    if size <= pool.maxSize {
        // find the smallest class that can hold the request
        for i := 0; i < len(pool.classesSize); i++ {
            if pool.classesSize[i] >= size {
                mem := pool.classes[i].Get().(*[]int64)
                // return (*mem)[:size]
                return (*mem)[:0]
            }
        }
    }
    // too large for any class: fall back to a plain allocation
    return make([]int64, 0, size)
}

func (pool *SyncPool) Free(mem []int64) {
    if size := cap(mem); size <= pool.maxSize {
        for i := 0; i < len(pool.classesSize); i++ {
            if pool.classesSize[i] >= size {
                pool.classes[i].Put(&mem)
                return
            }
        }
    }
}

Invocation Example:

attrFilters := cache.Alloc(len(ids))
defer cache.Free(attrFilters)

Focus on the Alloc method. To support slices of variable length, there are several pools; their sizes start at 5, go up to at most 30000, and grow by a factor of 2: 5, 10, 20, and so on.

DEFAULT_SYNC_POOL = NewSyncPool(
    5,     // minSize
    30000, // maxSize
    2,     // factor
)
    • When allocating, the smallest pool that satisfies the request is chosen, as sketched below. For example, a request of length 2 is served from the pool of size 5, and a request of length 11 from the pool of size 20;
    • If the requested slice is large, exceeding the upper limit of 30000, the pool is bypassed and the memory is allocated directly;
    • Of course, these parameters can be tuned to your actual workload;
    • Unlike the first version, putting objects back into the pool is done explicitly, through the Free method.
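A quick check of how sizes map to classes, assuming the reconstructed pool code above (size classes 5, 10, 20, ..., 20480):

pool := NewPool()

s1 := pool.Alloc(2)     // smallest class >= 2 is 5:   cap(s1) == 5
s2 := pool.Alloc(11)    // smallest class >= 11 is 20: cap(s2) == 20
s3 := pool.Alloc(50000) // above maxSize: plain make,  cap(s3) == 50000

fmt.Println(cap(s1), cap(s2), cap(s3)) // 5 20 50000

pool.Free(s1)
pool.Free(s2)
pool.Free(s3) // cap exceeds maxSize, so Free is a no-op and GC reclaims it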

Conclusion

This interface optimization took about a year from beginning to end. The results are good: TPS is up by roughly 30%, and tp99 is much lower.
