This article was translated by yhx for Bole Online and proofread by Jasper. Do not reprint without permission.
Original English author: Sergey Matyukevich.
When you use a variable through an interface reference, do you know what the Go runtime does? The question is not easy to answer, because in Go a type that implements an interface contains no reference to that interface. As in the previous post, Go Language Insider (1): Main Concepts and Project Structure, we can use our knowledge of the Go compiler to answer it. Part of the compiler was already discussed in that post.
Here, let's explore the Go compiler in more depth: we will create a simple Go program and see what Go does internally during a type conversion. With this example, I'll explain how the node tree is generated and used. You can apply the same knowledge to investigate other features of the Go compiler.
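To make the opening point concrete, here is a minimal sketch of implicit interface satisfaction. The names Greeter, English and greetVia are hypothetical and exist only for this illustration; nothing in the type's declaration refers to the interface.

```go
package main

import "fmt"

// Greeter is a hypothetical interface used only for this illustration.
type Greeter interface {
	Greet() string
}

// English never mentions Greeter. It satisfies the interface simply
// by having a method with the matching name and signature.
type English struct{}

func (English) Greet() string { return "hello" }

// greetVia accepts any value whose type implements Greeter.
func greetVia(g Greeter) string {
	return g.Greet()
}

func main() {
	// The conversion from English to Greeter happens at this call site.
	fmt.Println(greetVia(English{}))
}
```

Because the link between the type and the interface is not written down anywhere in the source, it has to be established by the compiler and the runtime, which is what the rest of the article examines.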
Objective
To complete this experiment, we need to invoke the Go compiler directly (rather than through the go tool's build command). You can use the following command:
go tool 6g test.go
This command compiles the test.go source file and generates an object file. Here, 6g is the name of the compiler for the AMD64 architecture. Note that on a different architecture you need to use a different compiler.
When invoking the compiler directly, we can pass it command-line flags (see the compiler documentation for details). In this experiment, we will use the -W flag, which prints the layout of the node tree.
Create a simple Go program
First, we need to write a simple Go program. The one I wrote is as follows:
package main

type I interface {
	DoSomeWork()
}

type T struct {
	a int
}

func (t *T) DoSomeWork() {
}

func main() {
	t := &T{}
	i := I(t)
	print(i)
}
This piece of code is very simple, isn't it? Line 17, which prints the value of the variable i, may look superfluous. However, without it the variable i would be unused and the program would not compile. Next, we'll compile the program with the -W flag:
go tool 6g -W test.go
Once compilation finishes, you will see that the output contains a node tree for each function defined in the program. In our example, these are the main and init functions. The init function is generated implicitly, and every program has one. We will set it aside for now.
For each function, the compiler prints two versions of the node tree. The first is the original tree produced right after parsing the source file. The second is the tree after type checking and some necessary modifications.
Understanding the node tree of the main function
Let's take a closer look at the original node tree of the main function and try to figure out exactly what the Go compiler does.
DCL l(15)
.   NAME-main.t u(1) a(1) g(1) l(15) x(0+0) class(PAUTO) f(1) ld(1) tc(1) used(1) PTR64-*main.T

AS l(15) colas(1) tc(1)
.   NAME-main.t u(1) a(1) g(1) l(15) x(0+0) class(PAUTO) f(1) ld(1) tc(1) used(1) PTR64-*main.T
.   PTRLIT l(15) esc(no) ld(1) tc(1) PTR64-*main.T
.   .   STRUCTLIT l(15) tc(1) main.T
.   .   .   TYPE l(15) tc(1) implicit(1) type=PTR64-*main.T PTR64-*main.T

DCL l(16)
.   NAME-main.i u(1) a(1) g(2) l(16) x(0+0) class(PAUTO) f(1) ld(1) tc(1) used(1) main.I

AS l(16) tc(1)
.   NAME-main.autotmp_0000 u(1) a(1) l(16) x(0+0) class(PAUTO) esc(N) tc(1) used(1) PTR64-*main.T
.   NAME-main.t u(1) a(1) g(1) l(15) x(0+0) class(PAUTO) f(1) ld(1) tc(1) used(1) PTR64-*main.T

AS l(16) colas(1) tc(1)
.   NAME-main.i u(1) a(1) g(2) l(16) x(0+0) class(PAUTO) f(1) ld(1) tc(1) used(1) main.I
.   CONVIFACE l(16) tc(1) main.I
.   .   NAME-main.autotmp_0000 u(1) a(1) l(16) x(0+0) class(PAUTO) esc(N) tc(1) used(1) PTR64-*main.T

VARKILL l(16) tc(1)
.   NAME-main.autotmp_0000 u(1) a(1) l(16) x(0+0) class(PAUTO) esc(N) tc(1) used(1) PTR64-*main.T

PRINT l(17) tc(1)
PRINT-list
.   NAME-main.i u(1) a(1) g(2) l(16) x(0+0) class(PAUTO) f(1) ld(1) tc(1) used(1) main.I
In the explanation below, I have removed some unnecessary information from the node tree.
The first node is very simple:
DCL l(15)
.   NAME-main.t l(15) PTR64-*main.T
The first node is a declaration node. l(15) tells us that this node is defined on line 15 of the source code. The declaration node refers to the name node that represents the main.t variable. This variable is defined in the main package and is a 64-bit pointer to the main.T type. Looking at line 15 of the source code, it is easy to see which declaration this represents.
The next node is also a declaration node. This time, it declares the variable main.i, which has the main.I type.
DCL l(16)
.   NAME-main.i l(16) main.I
The compiler then creates another variable, autotmp_0000, and assigns the main.t variable to it.
AS l(16) tc(1)
.   NAME-main.autotmp_0000 l(16) PTR64-*main.T
.   NAME-main.t l(15) PTR64-*main.T
Finally, we reach the nodes we are really interested in.
AS l(16)
.   NAME-main.i l(16) main.I
.   CONVIFACE l(16) main.I
.   .   NAME-main.autotmp_0000 PTR64-*main.T
We can see that the compiler assigns a special node, CONVIFACE, to the main.i variable. But that does not tell us what happens behind this assignment. To find out, we need to analyze the node tree of the main function after all modifications have been applied (you can find it in the "after walk main" section of the output).
How the compiler translates the assignment nodes
Below, you will see how the compiler translates the assignment nodes:
AS-init
.   AS l(16)
.   .   NAME-main.autotmp_0003 l(16) PTR64-*uint8
.   .   NAME-go.itab.*"".T."".I l(16) PTR64-*uint8

.   IF l(16)
.   IF-test
.   .   EQ l(16) bool
.   .   .   NAME-main.autotmp_0003 l(16) PTR64-*uint8
.   .   .   LITERAL-nil I(16) PTR64-*uint8
.   IF-body
.   .   AS l(16)
.   .   .   NAME-main.autotmp_0003 l(16) PTR64-*uint8
.   .   .   CALLFUNC l(16) PTR64-*byte
.   .   .   .   NAME-runtime.typ2Itab l(2) FUNC-funcSTRUCT-(FIELD-
.   .   .   .   .   NAME-runtime.typ·2 l(2) PTR64-*byte, FIELD-
.   .   .   .   .   NAME-runtime.typ2·3 l(2) PTR64-*byte PTR64-*byte, FIELD-
.   .   .   .   .   NAME-runtime.cache·4 l(2) PTR64-*PTR64-*byte PTR64-*PTR64-*byte) PTR64-*byte
.   .   .   CALLFUNC-list
.   .   .   .   AS l(16)
.   .   .   .   .   INDREG-SP l(16) runtime.typ·2 G0 PTR64-*byte
.   .   .   .   .   ADDR l(16) PTR64-*uint8
.   .   .   .   .   .   NAME-type.*"".T l(11) uint8

.   .   .   .   AS l(16)
.   .   .   .   .   INDREG-SP l(16) runtime.typ2·3 G0 PTR64-*byte
.   .   .   .   .   ADDR l(16) PTR64-*uint8
.   .   .   .   .   .   NAME-type."".I l(16) uint8

.   .   .   .   AS l(16)
.   .   .   .   .   INDREG-SP l(16) runtime.cache·4 G0 PTR64-*PTR64-*byte
.   .   .   .   .   ADDR l(16) PTR64-*PTR64-*uint8
.   .   .   .   .   .   NAME-go.itab.*"".T."".I l(16) PTR64-*uint8
AS l(16)
.   NAME-main.i l(16) main.I
.   EFACE l(16) main.I
.   .   NAME-main.autotmp_0003 l(16) PTR64-*uint8
.   .   NAME-main.autotmp_0000 l(16) PTR64-*main.T
As you can see in the output, the compiler first adds an initialization node list (AS-init) to the assignment node. Inside AS-init, it creates a new variable, main.autotmp_0003, and assigns it the value of the go.itab.*"".T."".I variable. It then checks whether this variable is nil. If it is, the compiler calls the runtime.typ2Itab function with the following arguments:
a pointer to the main.T type, a pointer to the main.I interface type, and a pointer to the go.itab.*"".T."".I variable.
From this part of the code it is easy to see that this variable caches the result of converting main.T to main.I.
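The check-then-call pattern in the lowered tree can be sketched in ordinary Go. This is only an illustration under stated assumptions: table, buildTable, cacheTI and convert are hypothetical stand-ins for the itab, runtime.typ2Itab, the go.itab.*"".T."".I variable and the generated assignment code, and the sketch uses atomic.Pointer from sync/atomic (Go 1.19 or later) rather than the runtime's own primitives.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// table stands in for the runtime's itab; its contents are hypothetical.
type table struct {
	typeName, ifaceName string
}

// builds counts how often the expensive path runs.
var builds int

// buildTable plays the role of runtime.typ2Itab: it does the costly
// lookup and publishes the result into the cache slot atomically.
func buildTable(typeName, ifaceName string, cache *atomic.Pointer[table]) *table {
	builds++
	tab := &table{typeName: typeName, ifaceName: ifaceName}
	cache.Store(tab)
	return tab
}

// cacheTI plays the role of the go.itab.*"".T."".I variable:
// one cache slot per (type, interface) pair.
var cacheTI atomic.Pointer[table]

// convert mirrors the lowered assignment: load the cached table and
// call buildTable only while the cache is still nil.
func convert() *table {
	tab := cacheTI.Load()
	if tab == nil {
		tab = buildTable("*main.T", "main.I", &cacheTI)
	}
	return tab
}

func main() {
	first := convert()
	second := convert()
	fmt.Println(first == second, builds)
}
```

The second call finds the cache already filled, so the expensive lookup runs only once per pair of types.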
Inside the getitab function
The logical next step is to look at the runtime.typ2Itab function. Here it is:
func typ2Itab(t *_type, inter *interfacetype, cache **itab) *itab {
	tab := getitab(inter, t, false)
	atomicstorep(unsafe.Pointer(cache), unsafe.Pointer(tab))
	return tab
}
Clearly, the second line of the function body only stores the newly created tab in the cache, so the real work is done in getitab. Let's look at that function next. Because it is quite large, I have copied only the most important part.
// Allocate the itab together with room for its method pointers.
m = (*itab)(persistentalloc(unsafe.Sizeof(itab{})+uintptr(len(inter.mhdr)-1)*ptrSize, 0, &memstats.other_sys))
m.inter = inter
m._type = typ

ni := len(inter.mhdr)
nt := len(x.mhdr)
j := 0
for k := 0; k < ni; k++ {
	// The k-th method required by the interface.
	i := &inter.mhdr[k]
	iname := i.name
	ipkgpath := i.pkgpath
	itype := i._type
	for ; j < nt; j++ {
		// The j-th method provided by the concrete type.
		t := &x.mhdr[j]
		if t.mtyp == itype && t.name == iname && t.pkgpath == ipkgpath {
			if m != nil {
				// Store the method pointer in slot k of m.fun.
				*(*unsafe.Pointer)(add(unsafe.Pointer(&m.fun[0]), uintptr(k)*ptrSize)) = t.ifn
			}
		}
	}
}
First, memory is allocated for the result:
(*itab)(persistentalloc(unsafe.Sizeof(itab{})+uintptr(len(inter.mhdr)-1)*ptrSize, 0, &memstats.other_sys))
Why is the memory allocated in such a strange way? To answer this question, we need to look at the definition of the itab struct.
type itab struct {
	inter  *interfacetype
	_type  *_type
	link   *itab
	bad    int32
	unused int32
	fun    [1]uintptr // variable sized
}
The last field, fun, is declared as an array of one element, but its length is actually variable. As we will see, this array stores pointers to the methods defined on the type that correspond to the methods of the interface. The Go authors allocate the space for this field dynamically (yes, this is possible if you use the unsafe package). The amount of memory allocated is the size of the struct itself plus the pointer size multiplied by the number of interface methods minus one, since the struct already contains the first slot.
unsafe.Sizeof(itab{})+uintptr(len(inter.mhdr)-1)*ptrSize
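The size formula can be tried out in a standalone program. The sketch below is an approximation: itabHeader is a hypothetical mirror of the field layout quoted above, with the pointer fields replaced by unsafe.Pointer because only their size matters, and itabSize is an illustrative helper rather than a runtime function.

```go
package main

import (
	"fmt"
	"unsafe"
)

// itabHeader mirrors the field layout quoted above. The pointer fields
// are declared as unsafe.Pointer because only their size matters here.
type itabHeader struct {
	inter  unsafe.Pointer
	_type  unsafe.Pointer
	link   unsafe.Pointer
	bad    int32
	unused int32
	fun    [1]uintptr // first slot of the variable-sized tail
}

// ptrSize is the size of one method pointer on the current platform.
const ptrSize = unsafe.Sizeof(uintptr(0))

// itabSize returns the bytes needed for a table with n method slots (n >= 1).
func itabSize(n int) uintptr {
	return unsafe.Sizeof(itabHeader{}) + uintptr(n-1)*ptrSize
}

func main() {
	for _, n := range []int{1, 2, 5} {
		fmt.Printf("%d methods -> %d bytes\n", n, itabSize(n))
	}
}
```

Each additional interface method adds exactly one pointer-sized slot, and a single-method interface needs no extra space beyond the struct itself.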
Next comes a nested loop. We iterate over all the methods of the interface and, for each of them, try to find the corresponding method in the type (the type's methods are stored in the mhdr collection). The check for whether two methods match is fairly self-explanatory. When a match is found, the pointer to the implementing method is stored in the fun array of the result:
*(*unsafe.Pointer)(add(unsafe.Pointer(&m.fun[0]), uintptr(k)*ptrSize)) = t.ifn
There is a small performance optimization here: the methods of both the interface and the concrete type are sorted alphabetically, so this nested loop takes O(n + m) steps instead of O(n * m), where n and m are the respective numbers of methods.
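The linear-time matching of two sorted lists can be sketched as follows. This is a simplified illustration, not the runtime's code: matchSorted is a hypothetical helper that compares plain method names, whereas getitab also compares the method type and package path.

```go
package main

import "fmt"

// matchSorted returns, for each name in want, its index in have, or -1
// when it is missing. Both slices must be sorted in ascending order.
// The cursor j never moves backwards, so the total work is O(n + m).
func matchSorted(want, have []string) []int {
	result := make([]int, len(want))
	j := 0
	for k, name := range want {
		result[k] = -1
		// Skip names in have that sort before the one we need.
		for j < len(have) && have[j] < name {
			j++
		}
		if j < len(have) && have[j] == name {
			result[k] = j
			j++
		}
	}
	return result
}

func main() {
	ifaceMethods := []string{"Close", "Read"}
	typeMethods := []string{"Close", "Read", "Seek", "Write"}
	fmt.Println(matchSorted(ifaceMethods, typeMethods))
}
```

Because the inner cursor is shared across iterations of the outer loop, each element of either list is visited at most once.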
Do you remember the last part of the assignment?
AS l(16)
.   NAME-main.i l(16) main.I
.   EFACE l(16) main.I
.   .   NAME-main.autotmp_0003 l(16) PTR64-*uint8
.   .   NAME-main.autotmp_0000 l(16) PTR64-*main.T
Here, the EFACE node is assigned to the main.i variable. This node contains a reference to the main.autotmp_0003 variable, which holds the pointer to the itab struct returned by runtime.typ2Itab, and a reference to the autotmp_0000 variable, which holds the same value as the main.t variable. This is all the information needed to call methods through the interface.
Therefore, the main.i variable stores an instance of the iface struct defined in the runtime package:
type iface struct {
	tab  *itab
	data unsafe.Pointer
}
What's next?
So far, we have analyzed only a small piece of the Go compiler and the Go runtime. There is much more interesting material to explore, such as object files, the linker, and relocations. I will cover these topics in turn in the upcoming posts.