Lua GC source code study, part 2

Source: Internet
Author: User

Background: GC stands for garbage collector, the component that reclaims resources that are no longer in use.

Early versions of Lua used a stop-the-world GC. Once a collection started, the program had to wait until the whole GC process had finished. Lua itself is a very small system, but that does not mean the amount of data it handles is small.

Starting with Lua 5.1, the GC implementation became incremental. Each step still stops the world, but the work is split into stages, so each individual pause is relatively short. As a consequence, this part of the code is also more complicated. The key problem of incremental execution is what happens between GC steps: if the relationships between data change, how can the correctness of the GC be guaranteed? Running the GC in steps costs more in total than running it in one go, so the extra overhead is not zero. The implementation should nevertheless keep that extra cost as small as possible.

First, let's look at the GC process.

Lua's GC is divided into five major stages. The current stage (called the state in the code) is stored in the gcstate field of global_State. The states are defined as macros at line 14 of lgc.h.

/*
** Possible states of the Garbage Collector
*/
#define GCSpause        0
#define GCSpropagate    1
#define GCSsweepstring  2
#define GCSsweep        3
#define GCSfinalize     4

The numeric order of the state values also indicates the order in which the states are executed. Note that the GC does not necessarily stay in one state for every step.
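The ordering of the states can be illustrated with a toy state machine. This is only a sketch under stated assumptions: ToyGC, its counters and toy_step are hypothetical names invented for illustration and do not appear in the Lua sources; only the five state names come from lgc.h.

```c
#include <assert.h>

/* State values mirror the macros quoted from lgc.h. */
enum { GCSpause, GCSpropagate, GCSsweepstring, GCSsweep, GCSfinalize };

/* Hypothetical toy collector: it only counts pending work per phase. */
typedef struct {
  int gcstate;
  int gray_left;        /* objects still to traverse */
  int buckets_left;     /* string-table buckets still to sweep */
  int objects_left;     /* ordinary objects still to sweep */
  int finalizers_left;  /* pending gc metamethod calls */
} ToyGC;

/* Perform one bounded unit of work, then return to the caller. */
static void toy_step(ToyGC *g) {
  assert(g->gcstate >= GCSpause && g->gcstate <= GCSfinalize);
  switch (g->gcstate) {
    case GCSpause:        /* roots marked, start a new cycle */
      g->gcstate = GCSpropagate;
      break;
    case GCSpropagate:    /* traverse one gray object per step */
      if (g->gray_left > 0) g->gray_left--;
      else g->gcstate = GCSsweepstring;
      break;
    case GCSsweepstring:  /* sweep one hash bucket per step */
      if (g->buckets_left > 0) g->buckets_left--;
      if (g->buckets_left == 0) g->gcstate = GCSsweep;
      break;
    case GCSsweep:        /* sweep one object per step */
      if (g->objects_left > 0) g->objects_left--;
      if (g->objects_left == 0) g->gcstate = GCSfinalize;
      break;
    case GCSfinalize:     /* call one finalizer per step */
      if (g->finalizers_left > 0) g->finalizers_left--;
      else g->gcstate = GCSpause;
      break;
  }
}
```

Each call does a small amount of work and returns, so the program can run between calls; the state only ever advances to the next value or wraps back to GCSpause.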

The GCSpause stage is the first step of each GC cycle. It only marks the root nodes of the system. See line 561 of lgc.c.

switch (g->gcstate) {
  case GCSpause: {
    markroot(L);  /* start a new collection */
    return 0;
  }

What markroot does is mark the main thread object, the global table of the main thread, the registry, and the metatables registered for the basic types. The details of marking are described later.

After the GCSpause stage completes, the state immediately switches to GCSpropagate. This is the marking phase, and it runs incrementally. As long as there are objects waiting to be marked, marking iterates (repeated calls to propagatemark). At the very end there is a part of the marking process that cannot be interrupted; these operations are executed in a function called atomic. See line 565 of lgc.c:

case GCSpropagate: {
  if (g->gray)
    return propagatemark(g);
  else {  /* no more `gray' objects */
    atomic(L);  /* finish mark phase */
    return 0;
  }
}

Here we need to mention the gray field. It refers to the list of gray GCObject nodes. Gray means a state between white and black. The node colors are explained shortly.
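The role of the gray list can be sketched with a small tri-color marker. This is a simplified illustration, not the code from lgc.c: Node, Marker, mark_node and propagate_one are hypothetical names, and a fixed-size refs array stands in for the real object layouts.

```c
#include <assert.h>
#include <stddef.h>

enum Color { WHITE, GRAY, BLACK };

#define MAXREFS 4

typedef struct Node {
  enum Color color;
  struct Node *refs[MAXREFS];  /* outgoing references; unused slots are NULL */
  struct Node *graynext;       /* link in the gray list */
} Node;

typedef struct { Node *gray; } Marker;

/* Turn a white node gray and push it onto the gray list. */
static void mark_node(Marker *m, Node *n) {
  if (n != NULL && n->color == WHITE) {
    n->color = GRAY;
    n->graynext = m->gray;
    m->gray = n;
  }
}

/* One incremental step: blacken one gray node and gray its children.
   Returns 1 if work was done, 0 if the gray list was empty. */
static int propagate_one(Marker *m) {
  Node *n = m->gray;
  int i;
  if (n == NULL) return 0;
  assert(n->color == GRAY);
  m->gray = n->graynext;
  n->color = BLACK;
  for (i = 0; i < MAXREFS; i++) mark_node(m, n->refs[i]);
  return 1;
}
```

Marking the roots with mark_node and then calling propagate_one until it returns 0 leaves every reachable node black and every unreachable node white. Because each call handles a single node, the loop can be interrupted between calls.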

Next comes the sweep process.

As mentioned earlier, strings are managed separately in Lua, so they need to be swept separately. This is what GCSsweepstring does. The string table manages all strings in a hash table. In GCSsweepstring, each step sweeps one bucket of that hash table. For the code, see the GCSsweepstring case in lgc.c.

case GCSsweepstring: {
  lu_mem old = g->totalbytes;
  sweepwholelist(L, &g->strt.hash[g->sweepstrgc++]);
  if (g->sweepstrgc >= g->strt.size)  /* nothing more to sweep? */
    g->gcstate = GCSsweep;  /* end sweep-string phase */
  lua_assert(old >= g->totalbytes);
  g->estimate -= old - g->totalbytes;
  return GCSWEEPCOST;
}

Here we can see the two fields estimate and totalbytes. As their names suggest, they hold the estimated number of memory bytes in use by the Lua VM and the actual number of allocated bytes.

P.S. If you have implemented a memory manager yourself, you know that memory management has its own memory overhead. If you need to control memory size precisely, it is preferable to let the underlying memory manager report the precise memory usage. For example, if you request 8 bytes from the memory manager, the actual memory cost may be 12 bytes or even more. If you want to make this correction so that Lua's GC reflects actual memory usage, you can modify the luaM_realloc_ function at line 76 of lmem.c. All changes in Lua's memory usage go through this function.
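As an alternative to patching lmem.c, the same accounting idea can be sketched in an allocator callback with the shape of lua_Alloc (ud, ptr, osize, nsize). This is a minimal sketch under stated assumptions: MemStats, counting_alloc and the BLOCK_OVERHEAD value of 4 bytes are hypothetical, and the real per-block overhead depends on the underlying allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical bookkeeping cost per block; an assumption, not a measured value. */
#define BLOCK_OVERHEAD 4

typedef struct {
  size_t requested;  /* bytes the caller asked for */
  size_t actual;     /* bytes including the assumed per-block overhead */
} MemStats;

/* Same parameter list as a lua_Alloc callback. */
static void *counting_alloc(void *ud, void *ptr, size_t osize, size_t nsize) {
  MemStats *st = (MemStats *)ud;
  void *newptr;
  if (nsize == 0) {
    free(ptr);
    newptr = NULL;
  } else {
    newptr = realloc(ptr, nsize);
    if (newptr == NULL) return NULL;  /* statistics unchanged on failure */
  }
  if (ptr != NULL) {  /* the old block is gone: subtract its cost */
    assert(st->requested >= osize);
    st->requested -= osize;
    st->actual -= osize + BLOCK_OVERHEAD;
  }
  if (newptr != NULL) {  /* account for the new block */
    st->requested += nsize;
    st->actual += nsize + BLOCK_OVERHEAD;
  }
  return newptr;
}
```

With these assumptions, a request for 8 bytes is recorded as 8 requested and 12 actual bytes, matching the example in the paragraph above.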

In the code above we also see the magic number GCSWEEPCOST. It is used to control the pace of the GC. That is beyond today's topic and is left for a later analysis.

The next step is to sweep all unmarked GCObjects. That is the GCSsweep phase. It is similar to GCSsweepstring above.

The final stage is GCSfinalize. If userdata objects whose gc metamethod must be called were found in the previous stages, the metamethods are called one by one in this stage. The function that does this is GCTM.

As will be explained later, userdata objects that have a gc metamethod, together with the data they reference, are not freed in the preceding sweep phase. Because they are marked separately, all the metamethod calls are safe. Their actual freeing, however, has to wait until the next GC cycle, or happens during the call to lua_close.

P.S. lua_close does not actually perform a full gc. It simply calls the gc metamethods of all userdata and frees all memory in use. It is relatively cheap.

Next, let's look at some details of the GC marking process.

Simply put, Lua considers every GCObject (an object that has to be managed by the collector) to have a color. At the beginning all nodes are white. Newly created nodes are also set to white by default.

In the marking stage, reachable nodes are set to black one by one. Some nodes are more complex and reference other nodes. Until the nodes it references have been processed, Lua considers such a node gray.

The node color is stored in the marked field of the GCObject's CommonHeader. To save memory, it is stored as bits. marked is a single byte, so it can hold 8 flags. Lua 5.1.4 uses seven of these bits. At line 41 of lgc.h there is an explanation:

/*
** Layout for bit use in `marked' field:
** bit 0 - object is white (type 0)
** bit 1 - object is white (type 1)
** bit 2 - object is black
** bit 3 - for userdata: has been finalized
** bit 3 - for tables: has weak keys
** bit 4 - for tables: has weak values
** bit 5 - object is fixed (should not be collected)
** bit 6 - object is "super" fixed (only the main thread)
*/

#define WHITE0BIT       0
#define WHITE1BIT       1
#define BLACKBIT        2
#define FINALIZEDBIT    3
#define KEYWEAKBIT      3
#define VALUEWEAKBIT    4
#define FIXEDBIT        5
#define SFIXEDBIT       6
#define WHITEBITS       bit2mask(WHITE0BIT, WHITE1BIT)

Lua defines a set of macros to manipulate these flag bits; the code is not listed here. Opening lgc.h is enough to understand what these macros do.

White and black each have their own flag bits. When an object is neither white nor black, it is regarded as gray.
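The color tests can be sketched directly on a marked byte. The bit numbers and the bitmask, bit2mask and WHITEBITS definitions follow the lgc.h excerpt; the helper functions is_white, is_black, is_gray, white_to_gray and gray_to_black are illustrative names written for this sketch, not the macros from lgc.h.

```c
#include <assert.h>

typedef unsigned char lu_byte;

#define WHITE0BIT 0
#define WHITE1BIT 1
#define BLACKBIT  2

#define bitmask(b)        (1 << (b))
#define bit2mask(b1, b2)  (bitmask(b1) | bitmask(b2))
#define WHITEBITS         bit2mask(WHITE0BIT, WHITE1BIT)

/* White: at least one of the two white bits is set. */
static int is_white(lu_byte marked) { return (marked & WHITEBITS) != 0; }

/* Black: the black bit is set. */
static int is_black(lu_byte marked) { return (marked & bitmask(BLACKBIT)) != 0; }

/* Gray has no bit of its own: it is simply "neither white nor black". */
static int is_gray(lu_byte marked) { return !is_white(marked) && !is_black(marked); }

/* white -> gray: clear both white bits, keep every other flag. */
static lu_byte white_to_gray(lu_byte marked) {
  assert(is_white(marked));
  return (lu_byte)(marked & ~WHITEBITS);
}

/* gray -> black: set the black bit. */
static lu_byte gray_to_black(lu_byte marked) {
  assert(is_gray(marked));
  return (lu_byte)(marked | bitmask(BLACKBIT));
}
```

Since gray is encoded as the absence of color bits, three colors (four, counting both whites) fit into three bits, and the remaining flags in the byte are untouched by color changes.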

Why are there two white flags? This is a small trick that Lua uses. Suppose the GC marking phase has finished but the sweep has not. If the relationships between objects change at that point, for example a new object is created, the lifetime of such objects is unpredictable. The safest approach is to mark them as not collectable. However, when the sweep ends, all objects need to be set back to white for the next cycle. Lua actually sweeps in a single pass: after processing a node, it resets that node's color. Simply setting a newly created object to black could leave it with no chance of being reset to white once the GC cycle has finished.

The simple solution is to add a third state, that is, a second white.

In Lua, the two white states work as a ping-pong switch. While white nodes of type 0 are being deleted, white nodes of type 1 are protected, and vice versa.

Whether the current white is type 0 or type 1 is given by the currentwhite field of global_State. otherwhite() is used for the ping-pong switch. To obtain the current white state, use the macro defined at line 77 of lgc.h:

#define luaC_white(g)   cast(lu_byte, (g)->currentwhite & WHITEBITS)
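The ping-pong behaviour can be sketched as follows. ToyState and the helper functions are hypothetical names for illustration; they model only the two white bits and ignore the other flags that the real currentwhite field may carry.

```c
#include <assert.h>

typedef unsigned char lu_byte;

#define WHITE0BIT 0
#define WHITE1BIT 1
#define bitmask(b)  (1 << (b))
#define WHITEBITS   (bitmask(WHITE0BIT) | bitmask(WHITE1BIT))

/* Hypothetical stand-in for the relevant part of global_State. */
typedef struct { lu_byte currentwhite; } ToyState;

/* The white given to newly created objects. */
static lu_byte current_white(const ToyState *g) {
  return (lu_byte)(g->currentwhite & WHITEBITS);
}

/* The white that is about to be swept. */
static lu_byte other_white(const ToyState *g) {
  return (lu_byte)((g->currentwhite ^ WHITEBITS) & WHITEBITS);
}

/* An object is dead if it still carries the "other" white. */
static int is_dead(const ToyState *g, lu_byte marked) {
  return (marked & other_white(g)) != 0;
}

/* Swap the roles of the two whites (done once marking has finished). */
static void flip_white(ToyState *g) {
  g->currentwhite ^= WHITEBITS;
  assert(current_white(g) == bitmask(WHITE0BIT) ||
         current_white(g) == bitmask(WHITE1BIT));
}
```

An object that was left white by the marking phase becomes dead after the flip, while an object created after the flip receives the new current white and survives the ongoing sweep.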

FINALIZEDBIT is used to mark userdata. The flag is set once a userdata has been determined to be no longer in use. It is distinct from the color marks. Because of the gc metamethod, the memory a userdata occupies may only be freed after its gc metamethod has been called. This flag ensures that the metamethod is not called more than once.
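The "call at most once" guarantee can be illustrated with a tiny guard. This is a sketch of the idea only, not how lgc.c organizes its finalization lists: ToyUdata, gc_calls and finalize_once are hypothetical names, and only the FINALIZEDBIT value comes from lgc.h.

```c
#include <assert.h>

typedef unsigned char lu_byte;

#define FINALIZEDBIT 3
#define bitmask(b)   (1 << (b))

/* Hypothetical stand-in for a userdata that has a gc metamethod. */
typedef struct {
  lu_byte marked;
  int gc_calls;  /* how many times the finalizer has run */
} ToyUdata;

static int is_finalized(const ToyUdata *u) {
  return (u->marked & bitmask(FINALIZEDBIT)) != 0;
}

/* Run the finalizer only if the flag is clear, setting the flag first. */
static void finalize_once(ToyUdata *u) {
  if (is_finalized(u)) return;
  u->marked |= bitmask(FINALIZEDBIT);
  u->gc_calls++;  /* stands in for calling the gc metamethod */
  assert(u->gc_calls == 1);
}
```

Setting the flag before running the finalizer means that even if the object is seen again by a later cycle, the metamethod is not invoked a second time.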

KEYWEAKBIT and VALUEWEAKBIT mark the weak attributes of a table.

FIXEDBIT ensures that a GCObject is not collected during GC. Why is such a state needed? The key is that Lua itself uses some strings that may not be referenced from anywhere, yet are used over and over. These strings are therefore protected by setting FIXEDBIT.

Line 24 of lstring.h has this definition:

#define luaS_fix(s) l_setbit((s)->tsv.marked, FIXEDBIT)

It marks a string as protected.

For a typical use, see line 64 of llex.c:

void luaX_init (lua_State *L) {
  int i;
  for (i=0; i<NUM_RESERVED; i++) {
    TString *ts = luaS_new(L, luaX_tokens[i]);
    luaS_fix(ts);  /* reserved words are never collected */
    lua_assert(strlen(luaX_tokens[i])+1 <= TOKEN_LEN);
    ts->tsv.reserved = cast_byte(i+1);  /* reserved word */
  }
}

And line 30 of ltm.c:

void luaT_init (lua_State *L) {
  static const char *const luaT_eventname[] = {  /* ORDER TM */
    "__index", "__newindex",
    "__gc", "__mode", "__eq",
    "__add", "__sub", "__mul", "__div", "__mod",
    "__pow", "__unm", "__len", "__lt", "__le",
    "__concat", "__call"
  };
  int i;
  for (i=0; i<TM_N; i++) {
    G(L)->tmname[i] = luaS_new(L, luaT_eventname[i]);
    luaS_fix(G(L)->tmname[i]);  /* never collect these names */
  }
}

Take metamethods as an example. If we used the standard Lua API to emulate the behavior of metatables, we could not make it as efficient as the native metatable mechanism. When we obtain a table key and want to know whether it is __index, we either have to compare with strcmp, or first push the string to compare against onto the lua_State with lua_pushlstring and then compare.

We know that Lua strings with the same value share a single string object, that is, the TString address is the same. Comparing two Lua strings is therefore very cheap (a single pointer comparison) and more efficient than strcmp. However, lua_pushlstring has an extra cost: it has to compute the hash value and look it up in the hash table (the string table).
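The trade-off can be sketched with a minimal interning table. This is an illustration under stated assumptions, not Lua's string table: Str, StrTable, hash_str and intern are hypothetical names, the hash function is an arbitrary simple one, and the table has a fixed size and never frees its entries.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 16

typedef struct Str {
  struct Str *next;   /* chain within one bucket */
  unsigned int hash;
  char text[];        /* the characters, NUL-terminated */
} Str;

typedef struct { Str *buckets[NBUCKETS]; } StrTable;

/* Arbitrary simple hash, chosen only for this sketch. */
static unsigned int hash_str(const char *s, size_t len) {
  unsigned int h = (unsigned int)len;
  size_t i;
  for (i = 0; i < len; i++)
    h ^= (h << 5) + (h >> 2) + (unsigned char)s[i];
  return h;
}

/* Return the unique Str for this text, creating it on first use.
   This is the expensive part: hash the text, then search the bucket. */
static const Str *intern(StrTable *t, const char *s) {
  size_t len;
  unsigned int h;
  Str **bucket;
  Str *p;
  assert(t != NULL && s != NULL);
  len = strlen(s);
  h = hash_str(s, len);
  bucket = &t->buckets[h % NBUCKETS];
  for (p = *bucket; p != NULL; p = p->next)
    if (p->hash == h && strcmp(p->text, s) == 0)
      return p;  /* already interned */
  p = malloc(sizeof(Str) + len + 1);
  if (p == NULL) return NULL;
  p->hash = h;
  memcpy(p->text, s, len + 1);
  p->next = *bucket;
  *bucket = p;
  return p;
}
```

Once two strings have been interned, testing them for equality is a single pointer comparison; the cost has moved into intern, which is why keeping a stable pointer to a frequently used name avoids repeated hashing and lookup.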

Lua's GC algorithm does not compact memory and does not move data in memory. So if you can be sure that a string will not be collected, its memory address is also stable, which leaves room for optimization. ltm.c does exactly this.

See line 93 of lstate.h:

TString *tmname[TM_N];  /* array with tag-method names */

In global_State, the tmname field directly records the names of all metamethods as TString pointers. If we used the standard Lua API instead, we would have to put these strings in the registry or in an environment table to make sure they are not collected by the gc and can be retrieved when comparing. Lua's own implementation uses FIXEDBIT as a further optimization.

Finally, let's look at SFIXEDBIT. It has only one use: marking the main thread (mainthread), the starting point of everything, that is, the structure returned by lua_newstate.

Why does this structure need special treatment? Because even when lua_close is called, this structure cannot simply be freed along with everything else. Let's see what the end of the world looks like. See line 105 of lstate.c.

static void close_state (lua_State *L) {
  global_State *g = G(L);
  luaF_close(L, L->stack);  /* close all upvalues for this thread */
  luaC_freeall(L);  /* collect all objects */
  lua_assert(g->rootgc == obj2gco(L));
  lua_assert(g->strt.nuse == 0);
  luaM_freearray(L, G(L)->strt.hash, G(L)->strt.size, TString *);
  luaZ_freebuffer(L, &g->buff);
  freestack(L, L);
  lua_assert(g->totalbytes == sizeof(LG));
  (*g->frealloc)(g->ud, fromstate(L), state_size(LG), 0);
}

This is the last step of lua_close. luaC_freeall frees all GCObjects, except the mainthread object, which carries SFIXEDBIT. See line 484 of lgc.c.

void luaC_freeall (lua_State *L) {
  global_State *g = G(L);
  int i;
  g->currentwhite = WHITEBITS | bitmask(SFIXEDBIT);  /* mask to collect all elements */
  sweepwholelist(L, &g->rootgc);
  for (i = 0; i < g->strt.size; i++)  /* free all string lists */
    sweepwholelist(L, &g->strt.hash[i]);
}

Here FIXEDBIT is ignored, whereas up to this point FIXEDBIT objects were protected. See line 153 of lstate.c (the lua_newstate function):

g->currentwhite = bit2mask(WHITE0BIT, FIXEDBIT);
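How the value of currentwhite decides which objects survive a sweep can be sketched on plain bytes. The test below is modeled on my reading of the liveness check in the sweep code of lgc.c and should be treated as an assumption; survives_sweep is a hypothetical helper name, while the bit numbers come from lgc.h.

```c
#include <assert.h>

typedef unsigned char lu_byte;

#define WHITE0BIT 0
#define WHITE1BIT 1
#define BLACKBIT  2
#define FIXEDBIT  5
#define SFIXEDBIT 6
#define bitmask(b)  (1 << (b))
#define WHITEBITS   (bitmask(WHITE0BIT) | bitmask(WHITE1BIT))

/* Returns nonzero if an object with this `marked' byte is kept by a sweep
   that runs with the given currentwhite value. */
static int survives_sweep(lu_byte marked, lu_byte currentwhite) {
  lu_byte deadmask;
  assert((currentwhite & WHITEBITS) != 0);
  /* Flipping the white bits of currentwhite yields the "other" white,
     plus any protective bits (FIXEDBIT or SFIXEDBIT) stored alongside. */
  deadmask = (lu_byte)(currentwhite ^ WHITEBITS);
  /* After flipping the object's own white bits, it survives if it does
     not carry the other white, or if it has a protective bit that is
     also present in the mask. */
  return ((marked ^ WHITEBITS) & deadmask) != 0;
}
```

With a currentwhite that includes FIXEDBIT, as set in lua_newstate, fixed objects survive every normal sweep. With WHITEBITS | bitmask(SFIXEDBIT), as set in luaC_freeall, the mask reduces to SFIXEDBIT alone, so everything is freed except the main thread.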

This is easy to understand: all the root data of the Lua world is stored in this object. If it were freed early, the code that follows would run into problems. The actual release of this object does not happen in the GC, but in the last statement:

lua_assert(g->totalbytes == sizeof(LG));
(*g->frealloc)(g->ud, fromstate(L), state_size(LG), 0);

Along the way, an assert checks that in the end only this structure is left in the world.
