Dynamic memory in the Go language

Go's dynamic memory allocation and release are modeled on TCMalloc.
Main data structures (a simplified code sketch of how they relate follows this list):
Mheap: the malloc heap; manages pages
Mcentral: shared free lists of small objects, one per size class
Mcache: per-thread (per-P) cache of small-object free lists, accessed without locks
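
To make these relationships concrete, here is a minimal sketch of the three tiers. The names mirror the runtime's mheap, mcentral and mcache, but every type, field and constant below is an illustrative assumption made for this article, not the runtime's real definition.

package allocsketch

// Illustrative constants; the real runtime defines its own fixed tables.
const (
    numSizeClasses = 67   // the runtime has several dozen small-object size classes
    pageSize       = 4096 // assumed page size for the sketch; the real runtime uses 8 kB pages
)

// span models a run of heap pages carved into equal-size object slots (the runtime's mspan).
type span struct {
    free [][]byte // object slots not yet handed out
}

// mcache models the per-thread (per-P) cache: one span per size class,
// accessed without locks because only its owning P ever touches it.
type mcache struct {
    alloc [numSizeClasses]*span
}

// mcentral models the shared per-size-class span list; the real mcentral is
// protected by a lock and also keeps a separate "empty" list.
type mcentral struct {
    nonempty []*span // spans that still contain free objects
}

// mheap models the page heap: it owns all spans, feeds the mcentrals,
// and grows by requesting memory from the operating system.
type mheap struct {
    central   [numSizeClasses]mcentral
    freePages int // pages obtained from the OS but not yet carved into spans
}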

Allocating small objects

    1. Look up the free list in the mcache for the object's size class; if it is not empty, allocate directly from it.
       No locking is needed in this case.
    2. If the mcache free list is empty, refill it with a batch of objects from the corresponding mcentral.
    3. If the mcentral free list is empty, request pages from the mheap and carve them into objects for that mcentral.
    4. If the mheap does not have enough cached pages, request more pages from the operating system (at least 1 MB at a time); a code sketch of this lookup order follows the list.
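
The lookup order above can be sketched with the illustrative types from the previous snippet. allocSmall, pop and grow are invented names for this sketch; the real code paths are mallocgc (reproduced below under "Requesting dynamic memory") and the mcache refill / MCentral / MHeap routines it calls.

// allocSmall walks the tiers in the order described above; it models the policy only.
func (h *mheap) allocSmall(c *mcache, sizeclass int) []byte {
    // 1. Per-P cache: lock-free fast path.
    if s := c.alloc[sizeclass]; s != nil && len(s.free) > 0 {
        return s.pop()
    }
    // 2. Cache miss: take a span that still has free objects from the shared
    //    mcentral (the real code acquires the mcentral lock here).
    central := &h.central[sizeclass]
    if len(central.nonempty) == 0 {
        // 3. mcentral is empty: carve fresh pages from the heap into a new span.
        central.nonempty = append(central.nonempty, h.grow(sizeclass))
    }
    s := central.nonempty[len(central.nonempty)-1]
    central.nonempty = central.nonempty[:len(central.nonempty)-1] // span is now cached, not shared
    c.alloc[sizeclass] = s
    return s.pop()
}

// pop hands out one free object slot from a span.
func (s *span) pop() []byte {
    obj := s.free[len(s.free)-1]
    s.free = s.free[:len(s.free)-1]
    return obj
}

// grow stands in for steps 3 and 4: carve a page into objects of the given
// size class, asking the OS for more pages (at least 1 MB) when none are cached.
func (h *mheap) grow(sizeclass int) *span {
    if h.freePages == 0 {
        h.freePages += 256 // pretend the OS handed over 1 MB of 4 kB pages
    }
    h.freePages--
    s := &span{}
    objSize := 8 * (sizeclass + 1) // purely illustrative size-class-to-size mapping
    page := make([]byte, pageSize)
    for off := 0; off+objSize <= len(page); off += objSize {
        s.free = append(s.free, page[off:off+objSize])
    }
    return s
}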

Allocating large objects

Large objects are allocated directly from the mheap, bypassing the mcache and mcentral (a sketch follows).
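
Under the same illustrative assumptions, a large allocation just rounds the request up to whole pages and takes them straight from the heap. allocLarge is an invented name; the real path is the largeAlloc_m call inside mallocgc, shown below.

// allocLarge rounds the request up to whole pages and takes them directly
// from the heap; no per-P cache or central list is involved.
func (h *mheap) allocLarge(size int) []byte {
    npages := (size + pageSize - 1) / pageSize
    for h.freePages < npages {
        h.freePages += 256 // pretend the OS handed over another 1 MB of pages
    }
    h.freePages -= npages // the real code locks the heap and allocates a dedicated span
    return make([]byte, npages*pageSize)
}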

Requesting dynamic memory

The entry point for allocation is mallocgc in the runtime; the excerpt below is from roughly the Go 1.4 era source.

// Allocate an object of size bytes.
// Small objects are allocated from the per-P cache's free lists.
// Large objects (> 32 kB) are allocated straight from the heap.
func mallocgc(size uintptr, typ *_type, flags uint32) unsafe.Pointer {
    if size == 0 {
        return unsafe.Pointer(&zerobase)
    }
    size0 := size

    if flags&flagNoScan == 0 && typ == nil {
        gothrow("malloc missing type")
    }

    // This function must be atomic wrt GC, but for performance reasons
    // we don't acquirem/releasem on fast path. The code below does not have
    // split stack checks, so it can't be preempted by GC.
    // Functions like roundup/add are inlined. And onM/racemalloc are nosplit.
    // If debugMalloc = true, these assumptions are checked below.
    if debugMalloc {
        mp := acquirem()
        if mp.mallocing != 0 {
            gothrow("malloc deadlock")
        }
        mp.mallocing = 1
        if mp.curg != nil {
            mp.curg.stackguard0 = ^uintptr(0xfff) | 0xbad
        }
    }

    c := gomcache()
    var s *mspan
    var x unsafe.Pointer
    if size <= maxSmallSize {
        if flags&flagNoScan != 0 && size < maxTinySize {
            // Tiny allocator.
            //
            // Tiny allocator combines several tiny allocation requests
            // into a single memory block. The resulting memory block
            // is freed when all subobjects are unreachable. The subobjects
            // must be FlagNoScan (don't have pointers), this ensures that
            // the amount of potentially wasted memory is bounded.
            //
            // Size of the memory block used for combining (maxTinySize) is tunable.
            // Current setting is 16 bytes, which relates to 2x worst case memory
            // wastage (when all but one subobjects are unreachable).
            // 8 bytes would result in no wastage at all, but provides less
            // opportunities for combining.
            // 32 bytes provides more opportunities for combining,
            // but can lead to 4x worst case wastage.
            // The best case winning is 8x regardless of block size.
            //
            // Objects obtained from tiny allocator must not be freed explicitly.
            // So when an object will be freed explicitly, we ensure that
            // its size >= maxTinySize.
            //
            // SetFinalizer has a special case for objects potentially coming
            // from tiny allocator; in such case it allows to set finalizers
            // for an inner byte of a memory block.
            //
            // The main targets of tiny allocator are small strings and
            // standalone escaping variables. On a json benchmark
            // the allocator reduces number of allocations by ~12% and
            // reduces heap size by ~20%.
            tinysize := uintptr(c.tinysize)
            if size <= tinysize {
                tiny := unsafe.Pointer(c.tiny)
                // Align tiny pointer for required (conservative) alignment.
                if size&7 == 0 {
                    tiny = roundup(tiny, 8)
                } else if size&3 == 0 {
                    tiny = roundup(tiny, 4)
                } else if size&1 == 0 {
                    tiny = roundup(tiny, 2)
                }
                size1 := size + (uintptr(tiny) - uintptr(unsafe.Pointer(c.tiny)))
                if size1 <= tinysize {
                    // The object fits into existing tiny block.
                    x = tiny
                    c.tiny = (*byte)(add(x, size))
                    c.tinysize -= uintptr(size1)
                    c.local_tinyallocs++
                    if debugMalloc {
                        mp := acquirem()
                        if mp.mallocing == 0 {
                            gothrow("bad malloc")
                        }
                        mp.mallocing = 0
                        if mp.curg != nil {
                            mp.curg.stackguard0 = mp.curg.stack.lo + _StackGuard
                        }
                        // Note: one releasem for the acquirem just above.
                        // The other for the acquirem at start of malloc.
                        releasem(mp)
                        releasem(mp)
                    }
                    return x
                }
            }
            // Allocate a new maxTinySize block.
            s = c.alloc[tinySizeClass]
            v := s.freelist
            if v == nil {
                mp := acquirem()
                mp.scalararg[0] = tinySizeClass
                onM(mcacheRefill_m)
                releasem(mp)
                s = c.alloc[tinySizeClass]
                v = s.freelist
            }
            s.freelist = v.next
            s.ref++
            // TODO: prefetch v.next
            x = unsafe.Pointer(v)
            (*[2]uint64)(x)[0] = 0
            (*[2]uint64)(x)[1] = 0
            // See if we need to replace the existing tiny block with the new one
            // based on amount of remaining free space.
            if maxTinySize-size > tinysize {
                c.tiny = (*byte)(add(x, size))
                c.tinysize = uintptr(maxTinySize - size)
            }
            size = maxTinySize
        } else {
            var sizeclass int8
            if size <= 1024-8 {
                sizeclass = size_to_class8[(size+7)>>3]
            } else {
                sizeclass = size_to_class128[(size-1024+127)>>7]
            }
            size = uintptr(class_to_size[sizeclass])
            s = c.alloc[sizeclass]
            v := s.freelist
            if v == nil {
                mp := acquirem()
                mp.scalararg[0] = uintptr(sizeclass)
                onM(mcacheRefill_m)
                releasem(mp)
                s = c.alloc[sizeclass]
                v = s.freelist
            }
            s.freelist = v.next
            s.ref++
            // TODO: prefetch
            x = unsafe.Pointer(v)
            if flags&flagNoZero == 0 {
                v.next = nil
                if size > 2*ptrSize && ((*[2]uintptr)(x))[1] != 0 {
                    memclr(unsafe.Pointer(v), size)
                }
            }
        }
        c.local_cachealloc += intptr(size)
    } else {
        mp := acquirem()
        mp.scalararg[0] = uintptr(size)
        mp.scalararg[1] = uintptr(flags)
        onM(largeAlloc_m)
        s = (*mspan)(mp.ptrarg[0])
        mp.ptrarg[0] = nil
        releasem(mp)
        x = unsafe.Pointer(uintptr(s.start << pageShift))
        size = uintptr(s.elemsize)
    }

    if flags&flagNoScan != 0 {
        // All objects are pre-marked as noscan.
        goto marked
    }

    // If allocating a defer+arg block, now that we've picked a malloc size
    // large enough to hold everything, cut the "asked for" size down to
    // just the defer header, so that the GC bitmap will record the arg block
    // as containing nothing at all (as if it were unused space at the end of
    // a malloc block caused by size rounding).
    // The defer arg areas are scanned as part of scanstack.
    if typ == deferType {
        size0 = unsafe.Sizeof(_defer{})
    }

    // From here till marked label marking the object as allocated
    // and storing type info in the GC bitmap.
    {
        arena_start := uintptr(unsafe.Pointer(mheap_.arena_start))
        off := (uintptr(x) - arena_start) / ptrSize
        xbits := (*uint8)(unsafe.Pointer(arena_start - off/wordsPerBitmapByte - 1))
        shift := (off % wordsPerBitmapByte) * gcBits
        if debugMalloc && ((*xbits>>shift)&(bitMask|bitPtrMask)) != bitBoundary {
            println("runtime: bits =", (*xbits>>shift)&(bitMask|bitPtrMask))
            gothrow("bad bits in markallocated")
        }

        var ti, te uintptr
        var ptrmask *uint8
        if size == ptrSize {
            // It's one word and it has pointers, it must be a pointer.
            *xbits |= (bitsPointer << 2) << shift
            goto marked
        }
        if typ.kind&kindGCprog != 0 {
            nptr := (uintptr(typ.size) + ptrSize - 1) / ptrSize
            masksize := nptr
            if masksize%2 != 0 {
                masksize *= 2 // repeated
            }
            masksize = masksize * pointersPerByte / 8 // 4 bits per word
            masksize++                                // unroll flag in the beginning
            if masksize > maxGCMask && typ.gc[1] != 0 {
                // If the mask is too large, unroll the program directly
                // into the GC bitmap. It's 7 times slower than copying
                // from the pre-unrolled mask, but saves 1/16 of type size
                // memory for the mask.
                mp := acquirem()
                mp.ptrarg[0] = x
                mp.ptrarg[1] = unsafe.Pointer(typ)
                mp.scalararg[0] = uintptr(size)
                mp.scalararg[1] = uintptr(size0)
                onM(unrollgcproginplace_m)
                releasem(mp)
                goto marked
            }
            ptrmask = (*uint8)(unsafe.Pointer(uintptr(typ.gc[0])))
            // Check whether the program is already unrolled.
            if uintptr(atomicloadp(unsafe.Pointer(ptrmask)))&0xff == 0 {
                mp := acquirem()
                mp.ptrarg[0] = unsafe.Pointer(typ)
                onM(unrollgcprog_m)
                releasem(mp)
            }
            ptrmask = (*uint8)(add(unsafe.Pointer(ptrmask), 1)) // skip the unroll flag byte
        } else {
            ptrmask = (*uint8)(unsafe.Pointer(typ.gc[0])) // pointer to unrolled mask
        }
        if size == 2*ptrSize {
            *xbits = *ptrmask | bitBoundary
            goto marked
        }
        te = uintptr(typ.size) / ptrSize
        // If the type occupies odd number of words, its mask is repeated.
        if te%2 == 0 {
            te /= 2
        }
        // Copy pointer bitmask into the bitmap.
        for i := uintptr(0); i < size0; i += 2 * ptrSize {
            v := *(*uint8)(add(unsafe.Pointer(ptrmask), ti))
            ti++
            if ti == te {
                ti = 0
            }
            if i == 0 {
                v |= bitBoundary
            }
            if i+ptrSize == size0 {
                v &^= uint8(bitPtrMask << 4)
            }

            *xbits = v
            xbits = (*byte)(add(unsafe.Pointer(xbits), ^uintptr(0)))
        }
        if size0%(2*ptrSize) == 0 && size0 < size {
            // Mark the word after last object's word as bitsDead.
            *xbits = bitsDead << 2
        }
    }
marked:
    if raceenabled {
        racemalloc(x, size)
    }

    if debugMalloc {
        mp := acquirem()
        if mp.mallocing == 0 {
            gothrow("bad malloc")
        }
        mp.mallocing = 0
        if mp.curg != nil {
            mp.curg.stackguard0 = mp.curg.stack.lo + _StackGuard
        }
        // Note: one releasem for the acquirem just above.
        // The other for the acquirem at start of malloc.
        releasem(mp)
        releasem(mp)
    }

    if debug.allocfreetrace != 0 {
        tracealloc(x, size, typ)
    }

    if rate := MemProfileRate; rate > 0 {
        if size < uintptr(rate) && int32(size) < c.next_sample {
            c.next_sample -= int32(size)
        } else {
            mp := acquirem()
            profilealloc(mp, x, size)
            releasem(mp)
        }
    }

    if memstats.heap_alloc >= memstats.next_gc {
        gogc(0)
    }

    return x
}

Releasing dynamic memory

Go has no counterpart to C's free function; dynamic memory is reclaimed by the garbage collector. During sweep, memory is not returned one object at a time: n objects belonging to a span are handed back to the central free list in a single call, as in the runtime function below (from the C portion of an older runtime, mcentral.c).

// Free n objects from a span s back to the central free list c.
// Called during sweep.
// Returns true if the span was returned to heap.  Sets sweepgen to
// the latest generation.
// If preserve=true, don't return the span to heap nor relink in MCentral lists;
// caller takes care of it.
bool
runtime·MCentral_FreeSpan(MCentral *c, MSpan *s, int32 n, MLink *start, MLink *end, bool preserve)
{
    bool wasempty;

    if(s->incache)
        runtime·throw("freespan into cached span");

    // Add the objects back to s's free list.
    wasempty = s->freelist == nil;
    end->next = s->freelist;
    s->freelist = start;
    s->ref -= n;

    if(preserve) {
        // preserve is set only when called from MCentral_CacheSpan above,
        // the span must be in the empty list.
        if(s->next == nil)
            runtime·throw("can't preserve unlinked span");
        runtime·atomicstore(&s->sweepgen, runtime·mheap.sweepgen);
        return false;
    }

    runtime·lock(&c->lock);

    // Move to nonempty if necessary.
    if(wasempty) {
        runtime·MSpanList_Remove(s);
        runtime·MSpanList_Insert(&c->nonempty, s);
    }

    // Delay updating sweepgen until here. This is the signal that
    // the span may be used in an MCache, so it must come after the
    // linked list operations above (actually, just after the
    // lock of c above.)
    runtime·atomicstore(&s->sweepgen, runtime·mheap.sweepgen);

    if(s->ref != 0) {
        runtime·unlock(&c->lock);
        return false;
    }

    // s is completely freed, return it to the heap.
    runtime·MSpanList_Remove(s);
    s->needzero = 1;
    s->freelist = nil;
    runtime·unlock(&c->lock);
    runtime·unmarkspan((byte*)(s->start<<PageShift), s->npages<<PageShift);
    runtime·MHeap_Free(&runtime·mheap, s, 0);
    return true;
}