Golang Internals, Part 2: Diving Into the Go Compiler


All parts: Part 1 | Part 2 | Part 3 | Part 4 | Part 5

Do you know what exactly happens in the Go runtime when you use a variable via an interface reference? This isn't a trivial question, because in Go a type that implements an interface does not contain any references to this interface whatsoever. Still, we can try answering it, using our knowledge of the Go compiler, which was discussed in the previous blog post.

So, let's take a deep dive into the Go compiler: create a basic Go program and see the internal workings of Go typecasting. Using it as an example, I'll explain how a node tree is generated and utilized. So, you can further apply this knowledge to other features of the Go compiler.

Before you start

To perform the experiment, we'll need to work directly with the Go compiler (not the go tool). You can access it by using the command:

go tool 6g test.go


It will compile the test.go source file and create an object file. Here, 6g is the name of the compiler on my machine, which has an AMD64 architecture. Note that you should use different compilers for different architectures.

When we work directly with the compiler, we can use some handy command line arguments. For the purposes of this experiment, we'll need the -W flag, which prints the layout of the node tree.

Creating a simple Go program

First of all, we are going to create a sample Go program. My version is below:

  1  package main
  2
  3  type I interface {
  4          DoSomeWork()
  5  }
  6
  7  type T struct {
  8          a int
  9  }
 10
 11  func (t *T) DoSomeWork() {
 12  }
 13
 14  func main() {
 15          t := &T{}
 16          i := I(t)
 17          print(i)
 18  }


Really simple, isn't it? The only thing that might seem unnecessary is the 17th line, where we print the i variable. Nevertheless, without it, i will remain unused and the program will not compile. The next step is to compile our program using the -W switch:

go tool 6g -W test.go


After doing this, you will see output that contains node trees for each method defined in the program. In our case, these are the main and init methods. The init method is here because it is implicitly defined for all programs, but we do not actually care about it right now.

For each method, the compiler prints two versions of the node tree. The first one is the original node tree that we get after parsing the source file. The second one is the version that we get after type checking and applying all the necessary modifications.

Understanding the node tree of the main method

Let's take a closer look at the original version of the node tree from the main method and try to understand what exactly is going on.

DCL l(15)
.   NAME-main.t u(1) a(1) g(1) l(15) x(0+0) class(PAUTO) f(1) ld(1) tc(1) used(1) PTR64-*main.T

AS l(15) colas(1) tc(1)
.   NAME-main.t u(1) a(1) g(1) l(15) x(0+0) class(PAUTO) f(1) ld(1) tc(1) used(1) PTR64-*main.T
.   PTRLIT l(15) esc(no) ld(1) tc(1) PTR64-*main.T
.   .   STRUCTLIT l(15) tc(1) main.T
.   .   .   TYPE l(15) tc(1) implicit(1) type=PTR64-*main.T PTR64-*main.T

DCL l(16)
.   NAME-main.i u(1) a(1) g(2) l(16) x(0+0) class(PAUTO) f(1) ld(1) tc(1) used(1) main.I

AS l(16) tc(1)
.   NAME-main.autotmp_0000 u(1) a(1) l(16) x(0+0) class(PAUTO) esc(N) tc(1) used(1) PTR64-*main.T
.   NAME-main.t u(1) a(1) g(1) l(15) x(0+0) class(PAUTO) f(1) ld(1) tc(1) used(1) PTR64-*main.T

AS l(16) colas(1) tc(1)
.   NAME-main.i u(1) a(1) g(2) l(16) x(0+0) class(PAUTO) f(1) ld(1) tc(1) used(1) main.I
.   CONVIFACE l(16) tc(1) main.I
.   .   NAME-main.autotmp_0000 u(1) a(1) l(16) x(0+0) class(PAUTO) esc(N) tc(1) used(1) PTR64-*main.T

VARKILL l(16) tc(1)
.   NAME-main.autotmp_0000 u(1) a(1) l(16) x(0+0) class(PAUTO) esc(N) tc(1) used(1) PTR64-*main.T

PRINT l(17) tc(1)
PRINT-list
.   NAME-main.i u(1) a(1) g(2) l(16) x(0+0) class(PAUTO) f(1) ld(1) tc(1) used(1) main.I


In the explanation below, I will use an abridged version, from which I removed all the unnecessary details.

The first node is rather simple:

DCL l(15)
.   NAME-main.t l(15) PTR64-*main.T


The first node is a declaration node. l(15) tells us that this node is defined in line 15. The declaration node references the name node that represents the main.t variable. This variable is defined in the main package and is actually a 64-bit pointer to the main.T type. You can look at line 15 and easily understand what declaration is represented there.

The next one is a bit trickier.

AS l(15)
.   NAME-main.t l(15) PTR64-*main.T
.   PTRLIT l(15) PTR64-*main.T
.   .   STRUCTLIT l(15) main.T
.   .   .   TYPE l(15) type=PTR64-*main.T PTR64-*main.T


The root node is the assignment node. Its first child is the name node that represents the main.t variable. The second child is the node that we assign to main.t: a pointer literal node (&). It has a child struct literal, which, in turn, points to the type node that represents the actual type (main.T).

The next node is another declaration. This time, it is a declaration of the main.i variable, which has the main.I type.

DCL l(16)
.   NAME-main.i l(16) main.I
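The same assignment, pointer literal, struct literal nesting can be observed with the standard go/ast package. This is only a rough analogy: go/ast is not the compiler's internal node tree, and the helper name nodeKinds and the sample constant are made up for this sketch. It parses a program like ours and records the order in which the three relevant node kinds are visited.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sample is a cut-down version of the program from this article.
const sample = `package main

type T struct{ a int }

func main() {
	t := &T{}
	print(t)
}
`

// nodeKinds (hypothetical helper) parses src and returns the kinds of the
// assignment, address-of and composite literal nodes in depth-first order.
func nodeKinds(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "test.go", src, 0)
	if err != nil {
		return nil, err
	}
	var kinds []string
	ast.Inspect(file, func(n ast.Node) bool {
		switch n.(type) {
		case *ast.AssignStmt: // roughly corresponds to the AS node
			kinds = append(kinds, "AssignStmt")
		case *ast.UnaryExpr: // the & operator, roughly PTRLIT
			kinds = append(kinds, "UnaryExpr")
		case *ast.CompositeLit: // T{}, roughly STRUCTLIT
			kinds = append(kinds, "CompositeLit")
		}
		return true
	})
	return kinds, nil
}

func main() {
	kinds, err := nodeKinds(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(kinds)
}
```

The composite literal shows up as a child of the address-of expression, which itself is the right-hand side of the assignment, mirroring the shape of the abridged tree above.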


Then, the compiler creates another variable, autotmp_0000, and assigns the main.t variable to it.

AS l(16) tc(1)
.   NAME-main.autotmp_0000 l(16) PTR64-*main.T
.   NAME-main.t l(15) PTR64-*main.T


Finally, we come to the nodes that we are actually interested in.

AS l(16)
.   NAME-main.i l(16) main.I
.   CONVIFACE l(16) main.I
.   .   NAME-main.autotmp_0000 PTR64-*main.T


Here, we can see that the compiler has assigned a special node called CONVIFACE to the main.i variable. But this does not give us much information on what is happening under the hood. To find out what is going on, we need to look into the node tree of the main method after all node tree modifications have been applied (you can find this information in the 'after walk main' section of your output).

How the compiler translates the assignment node

Below, you can see how the compiler translates our assignment node:

AS-init
.   AS l(16)
.   .   NAME-main.autotmp_0003 l(16) PTR64-*uint8
.   .   NAME-go.itab.*"".T."".I l(16) PTR64-*uint8

.   IF l(16)
.   IF-test
.   .   EQ l(16) bool
.   .   .   NAME-main.autotmp_0003 l(16) PTR64-*uint8
.   .   .   LITERAL-nil l(16) PTR64-*uint8
.   IF-body
.   .   AS l(16)
.   .   .   NAME-main.autotmp_0003 l(16) PTR64-*uint8
.   .   .   CALLFUNC l(16) PTR64-*byte
.   .   .   .   NAME-runtime.typ2Itab l(2) FUNC-funcSTRUCT-(FIELD-
.   .   .   .   .   NAME-runtime.typ·2 l(2) PTR64-*byte, FIELD-
.   .   .   .   .   NAME-runtime.typ2·3 l(2) PTR64-*byte PTR64-*byte, FIELD-
.   .   .   .   .   NAME-runtime.cache·4 l(2) PTR64-*PTR64-*byte PTR64-*PTR64-*byte) PTR64-*byte
.   .   .   CALLFUNC-list
.   .   .   .   AS l(16)
.   .   .   .   .   INDREG-SP l(16) runtime.typ·2 G0 PTR64-*byte
.   .   .   .   .   ADDR l(16) PTR64-*uint8
.   .   .   .   .   .   NAME-type.*"".T l(11) uint8

.   .   .   .   AS l(16)
.   .   .   .   .   INDREG-SP l(16) runtime.typ2·3 G0 PTR64-*byte
.   .   .   .   .   ADDR l(16) PTR64-*uint8
.   .   .   .   .   .   NAME-type."".I l(16) uint8

.   .   .   .   AS l(16)
.   .   .   .   .   INDREG-SP l(16) runtime.cache·4 G0 PTR64-*PTR64-*byte
.   .   .   .   .   ADDR l(16) PTR64-*PTR64-*uint8
.   .   .   .   .   .   NAME-go.itab.*"".T."".I l(16) PTR64-*uint8
AS l(16)
.   NAME-main.i l(16) main.I
.   EFACE l(16) main.I
.   .   NAME-main.autotmp_0003 l(16) PTR64-*uint8
.   .   NAME-main.autotmp_0000 l(16) PTR64-*main.T


As you can see from the output, the compiler first adds an initialization node list (AS-init) to the assignment node. Inside the AS-init node, it creates a new variable, main.autotmp_0003, and assigns the value of the go.itab.*"".T."".I variable to it. After that, it checks whether this variable is nil. If the variable is nil, the compiler calls the runtime.typ2Itab function and passes the following to it:

a pointer to the main.T type,
a pointer to the main.I interface type,
and a pointer to the go.itab.*"".T."".I variable.

From this code, it is quite evident that this variable is for caching the result of type conversion from main.T to main.I.

Inside the getitab method

The next logical step is to find runtime.typ2Itab. Below is the listing of this function:

func typ2Itab(t *_type, inter *interfacetype, cache **itab) *itab {
	tab := getitab(inter, t, false)
	atomicstorep(unsafe.Pointer(cache), unsafe.Pointer(tab))
	return tab
}
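The check-then-fill sequence generated by the compiler can be sketched in ordinary Go. Everything below is illustrative: itabStub, cachedItab, slowLookup, and convert are hypothetical names standing in for the runtime's itab, the go.itab.*"".T."".I variable, runtime.typ2Itab, and the generated AS-init code, and the sketch uses a plain store where the runtime uses an atomic one.

```go
package main

import "fmt"

// itabStub is a hypothetical stand-in for the runtime's itab struct.
type itabStub struct{ typeName, interName string }

// cachedItab plays the role of the go.itab.*"".T."".I variable.
var cachedItab *itabStub

// lookups counts how many times the slow path runs.
var lookups int

// slowLookup stands in for runtime.typ2Itab: it builds the table and
// stores it through the cache pointer (non-atomically in this sketch).
func slowLookup(typeName, interName string, cache **itabStub) *itabStub {
	lookups++
	tab := &itabStub{typeName, interName}
	*cache = tab
	return tab
}

// convert mirrors the generated code: read the cache into a temporary,
// and call the slow path only when the temporary is nil.
func convert() *itabStub {
	tmp := cachedItab // AS autotmp_0003 = go.itab...
	if tmp == nil {   // IF autotmp_0003 == nil
		tmp = slowLookup("*main.T", "main.I", &cachedItab)
	}
	return tmp
}

func main() {
	a := convert()
	b := convert()
	fmt.Println(a == b, lookups) // prints "true 1"
}
```

The second conversion finds the cache already filled, so the slow path runs only once per pair of concrete type and interface type.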


It is quite evident that the actual work is done inside the getitab method, because the second line simply stores the created tab variable in the cache. So, let's look inside getitab. Since it is rather big, I only copied the most valuable part.

m = (*itab)(persistentalloc(unsafe.Sizeof(itab{})+uintptr(len(inter.mhdr)-1)*ptrSize, 0, &memstats.other_sys))
m.inter = inter
m._type = typ

ni := len(inter.mhdr)
nt := len(x.mhdr)
j := 0
for k := 0; k < ni; k++ {
	i := &inter.mhdr[k]
	iname := i.name
	ipkgpath := i.pkgpath
	itype := i._type
	for ; j < nt; j++ {
		t := &x.mhdr[j]
		if t.mtyp == itype && t.name == iname && t.pkgpath == ipkgpath {
			if m != nil {
				*(*unsafe.Pointer)(add(unsafe.Pointer(&m.fun[0]), uintptr(k)*ptrSize)) = t.ifn
			}
		}
	}
}


First, we allocate memory for the result:

(*itab)(persistentalloc(unsafe.Sizeof(itab{})+uintptr(len(inter.mhdr)-1)*ptrSize, 0, &memstats.other_sys))


Why should we allocate memory in Go, and why is it done in such a strange way? To answer this question, we need to look at the itab struct definition.

type itab struct {
	inter  *interfacetype
	_type  *_type
	link   *itab
	bad    int32
	unused int32
	fun    [1]uintptr // variable sized
}


The last property, fun, is defined as an array of one element, but it is actually variable-sized. Later, we will see that this property contains an array of pointers to methods defined in a particular type. These methods correspond to the methods in the interface type. The authors of Go use dynamic memory allocation for this property (yes, such things are possible when you use the unsafe package). The amount of memory to be allocated is calculated by adding the size of the struct itself to the number of methods in the interface, minus the one slot that the struct already contains, multiplied by a pointer size.

unsafe.Sizeof(itab{})+uintptr(len(inter.mhdr)-1)*ptrSize
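To make the arithmetic concrete, here is a small sketch that evaluates the same formula outside the runtime. The itabLike struct is an assumption: it copies the field layout shown above but replaces the runtime-internal pointer types with unsafe.Pointer, and itabSize is a hypothetical helper that expects at least one method.

```go
package main

import (
	"fmt"
	"unsafe"
)

// itabLike copies the field layout of the itab struct from the article;
// the runtime-internal pointer types are replaced by unsafe.Pointer.
type itabLike struct {
	inter  unsafe.Pointer
	_type  unsafe.Pointer
	link   unsafe.Pointer
	bad    int32
	unused int32
	fun    [1]uintptr // first method slot; the rest follow in memory
}

// ptrSize is the size of one method slot on the current platform.
const ptrSize = unsafe.Sizeof(uintptr(0))

// itabSize (hypothetical helper) applies the formula from getitab.
// It assumes nMethods >= 1, since one slot is already part of the struct.
func itabSize(nMethods int) uintptr {
	return unsafe.Sizeof(itabLike{}) + uintptr(nMethods-1)*ptrSize
}

func main() {
	for _, n := range []int{1, 2, 5} {
		fmt.Println(n, "methods:", itabSize(n), "bytes")
	}
}
```

For an interface with a single method, the result equals the size of the struct itself; every additional method adds exactly one pointer-sized slot.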


Next, you can see two nested loops. First, we iterate through all interface methods. For each method in the interface, we try to find a corresponding method in a particular type (the methods are stored in the mhdr collection). The process of checking whether two methods are equal is quite self-explanatory.

if t.mtyp == itype && t.name == iname && t.pkgpath == ipkgpath


If we find a match, we store a pointer to the method in the fun property of the result:

*(*unsafe.Pointer)(add(unsafe.Pointer(&m.fun[0]), uintptr(k)*ptrSize)) = t.ifn


A small note on performance: since methods are sorted alphabetically for interface and pre-set type definitions, this nested loop can repeat O(n + m) times instead of O(n * m) times, where n and m correspond to the number of methods.
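The linear bound comes from the fact that the inner index is never reset. A minimal sketch of this idea, using plain sorted string slices instead of the runtime's method headers (matchSorted is a hypothetical name, and only method names are compared here, not types or package paths):

```go
package main

import "fmt"

// matchSorted returns, for each interface method name, its index in
// typeMethods. Both slices must be sorted alphabetically. The index j
// only moves forward, so the total work is O(n + m).
func matchSorted(interMethods, typeMethods []string) ([]int, bool) {
	result := make([]int, 0, len(interMethods))
	j := 0
	for _, name := range interMethods {
		// Skip type methods that the interface does not ask for.
		for j < len(typeMethods) && typeMethods[j] != name {
			j++
		}
		if j == len(typeMethods) {
			return nil, false // the type does not implement the interface
		}
		result = append(result, j)
		j++
	}
	return result, true
}

func main() {
	typeMethods := []string{"Close", "Read", "Seek", "Write"}
	fmt.Println(matchSorted([]string{"Read", "Write"}, typeMethods)) // [1 3] true
	fmt.Println(matchSorted([]string{"Flush"}, typeMethods))         // [] false
}
```

Because both lists are sorted, a method skipped while searching for one interface method can never be the match for a later one, which is why the inner position can be carried over between iterations.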

Finally, do you remember the last part of the assignment?

AS l(16)
.   NAME-main.i l(16) main.I
.   EFACE l(16) main.I
.   .   NAME-main.autotmp_0003 l(16) PTR64-*uint8
.   .   NAME-main.autotmp_0000 l(16) PTR64-*main.T


Here, we assign the EFACE node to the main.i variable. This node (EFACE) contains references to the main.autotmp_0003 variable (a pointer to the itab struct that is returned by the runtime.typ2Itab method) and to the autotmp_0000 variable that contains the same value as the main.t variable. This is all we need to call methods by interface references.

So, the main.i variable contains an instance of the iface struct defined in the runtime package:

type iface struct {
	tab  *itab
	data unsafe.Pointer
}

What's next?

I understand that I have only covered a very small part of the Go compiler and the Go runtime so far. There are still plenty of interesting things to talk about, such as object files, the linker, relocations, etc. They will be overviewed in the upcoming blog posts.


About the author: Sergey Matyukevich is a Cloud Engineer and Go Developer at Altoros. With 6+ years in software engineering, he is an expert in cloud automation and designing architectures for complex cloud-based systems. An active member of the Go community, Sergey is a frequent contributor to open-source projects, such as Ubuntu and Juju Charms.
