Golang Internals
There is little material in Chinese about the internals of the Go language, so I studied them myself.
Disclaimer: this article is based mainly on my own reading of the source code, plus some material found on the Web. I cannot guarantee that it is completely correct.
-------------------------------------------------------
Function Call Protocol
Go uses non-contiguous stacks. The reason is that it has to support goroutines.
Suppose go func(...) is called: func runs in a new goroutine, and the new goroutine obviously cannot share a stack with the current goroutine, otherwise the two would overwrite each other.
So the calling protocol for the go keyword differs from a normal function call. A regular C-style call pushes the arguments and then calls func directly, whereas a go statement is compiled roughly as follows:
    push the arguments onto the stack
    push func
    push 12
    call runtime.newproc
    pop
    pop
Here 12 is the size in bytes occupied by the arguments. runtime.newproc creates a new stack, copies those 12 bytes of arguments onto it, and points the new stack pointer at them.
At this point the goroutine looks much like a thread that the scheduler has just taken off the CPU: its PC and SP are saved in a struct G, which is similar to a process control block. func is stored in the entry field of struct G, and when the scheduler later dispatches the goroutine, execution starts from func.
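The copying of arguments at the point of the go statement can be observed from ordinary Go code. The following is a minimal sketch, not runtime code; startWithArg and the channel names are made up for illustration. It changes a variable after the go statement and checks which value the goroutine received.

```go
package main

import "fmt"

// startWithArg starts a goroutine with x as its argument, then changes x
// before the goroutine is allowed to proceed. It returns the value the
// goroutine saw. (Hypothetical helper, for illustration only.)
func startWithArg() int {
	x := 1
	release := make(chan struct{})
	seen := make(chan int)

	// The argument x is evaluated here, at the go statement, and the
	// new goroutine works on its own copy.
	go func(v int) {
		<-release // wait until the caller has modified x
		seen <- v
	}(x)

	x = 2 // does not affect the copy the goroutine already holds
	close(release)
	return <-seen
}

func main() {
	fmt.Println(startWithArg()) // prints 1
}
```

The goroutine reports 1 even though x is 2 by the time it runs, which is consistent with the arguments having been copied when the goroutine was created.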
The defer keyword is handled much like go, except that the call is to runtime.deferproc.
When a function that contains a defer statement returns, it does not simply execute
    add xx SP, return
but instead
    call runtime.deferreturn, add xx SP, return
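The visible effect of running the deferred calls on the way out of the function can be sketched in plain Go. This example only shows the language-level behavior (deferred calls run after the return value is set, in last-in first-out order); deferOrder is an illustrative name and nothing here inspects the runtime itself.

```go
package main

import "fmt"

// deferOrder records the order in which the function body and its
// deferred calls run. The result is a named return value so that the
// deferred closures can still append to it after "return" has set it.
func deferOrder() (order []string) {
	defer func() { order = append(order, "first deferred") }()
	defer func() { order = append(order, "second deferred") }()
	order = append(order, "body")
	return order
}

func main() {
	fmt.Println(deferOrder()) // prints [body second deferred first deferred]
}
```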
I have not yet worked out how multi-value return is implemented. If I am not mistaken, C places the return value in EAX; my guess is that Go places return values on the stack. This remains to be verified.
-----------------------------------------------------------------------
Compilation Process Analysis
The compiler lives in the $GOROOT/src/cmd/gc directory. Here gc stands for Go compiler, not garbage collection.
The main function of 6g/8g is in lex.c.
The whole compilation process can be followed from this file. First, yyparse(), the parser generated by Bison, is called to parse the source; it pulls tokens from the lexer as it goes.
After that come the later analysis phases, which the comments in the source number as the first step, the second step, and so on. Finally the object file is generated, with a .8 or .6 extension, equivalent to a .o file in C.
go.y is the grammar definition file for Bison.
In effect, during the compile phase Go simply puts everything it parsed into a NodeList data structure and then writes the exported declarations into a *.8 file (on the i386 architecture, for example). The .8 file looks roughly like this:
go object linux 386 go1 X:none
  exports automatically generated from
  hello.go in package "even"
$$  // exports
  package even
    import runtime "runtime"
    type @"".T struct { @"".id int }
    func (@"".this *@"".T "noescape") Id() (? int) { return @"".this.@"".id }
    func @"".Even(@"".i int) (? bool) { return @"".i % 2 == 0 }
    func @"".Odd(@"".i int) (? bool) { return @"".i % 2 == 1 }
$$  // local types
$$
....
You can try this yourself: write a hello.go and run go tool 8g hello.go.
For the exact file format, see the implementation of the dumpobj function in src/cmd/gc/obj.c.
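The idea of parsing a file into a list of declarations and then picking out the exported ones can be approximated with the standard library packages go/parser and go/ast. This is only a sketch: those packages are a separate implementation from the C code in cmd/gc, the source text below is a guess at what a hello.go for package even might look like, and exportedFuncs is an illustrative helper.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a hypothetical hello.go for package "even" (an assumption,
// reconstructed for illustration only).
const src = `package even

type T struct{ id int }

func (this *T) Id() int { return this.id }
func Even(i int) bool   { return i%2 == 0 }
func Odd(i int) bool    { return i%2 == 1 }
func helper()           {}
`

// exportedFuncs parses source and returns the names of the exported
// top-level functions and methods, in declaration order.
func exportedFuncs(source string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "hello.go", source, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, decl := range file.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok && fn.Name.IsExported() {
			names = append(names, fn.Name.Name)
		}
	}
	return names, nil
}

func main() {
	names, err := exportedFuncs(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(names) // prints [Id Even Odd]
}
```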
When a source file contains an import, the compiler actually pulls the imported package's obj file into the current parsing process. For example,
    import xxx
simply loads pkg/amd64-linux/xxx.a and then parses the obj data inside it.
If we look at the grammar in go.y, we see many rules with hidden or there in their names, such as import_there and hidden_import. These rules exist to parse the definitions read from obj files.
Similarly, we may run into grammar rules for syntax that never appears in Go source code, yet compilation still works, because during compilation other fragments are inserted into the source as needed, such as the builtin library or other custom libraries.
With this understood, we have a basic picture of the Go compilation process: the compiler's job ends once the source has been turned into obj, or at least no further work is visible so far. The next step is to dig into the implementation of xl (the 6l/8l tools), which turns obj into executable code and should be more interesting.
---------------------------------------------------------------------------------------------
Scheduler-Related Code in the Runtime
The $GOROOT/src/pkg/runtime directory is important and well worth studying. A good place to start reading is runtime.h.
Goroutines are the runtime's own threading system, supported at the language level and independent of pthreads or other system-level threads.
runtime.h defines several important structs, two of which are G and M.
The name of struct G is presumably short for goroutine. It corresponds to the process control block of an operating system; here it is the control structure of a goroutine, the abstraction of a thread.
Its fields include
    goid    // goroutine ID
    status  // goroutine state, such as Gidle, Grunnable, Grunning, Gsyscall, Gwaiting, Gdead, etc.
The declaration extern register G* g keeps g resident in a register; g is the pointer to the control block of the currently running goroutine. On amd64 it lives in R15, and on x86 it is stored at 0(GS), addressed through the GS segment register.
The name of struct M is presumably short for machine. It is the abstraction of a machine, which here means an available CPU core.
proc.c contains the implementation of scheduling.
If you have experience writing an operating system, this code is a pleasure to read.
The scheduler gets a chance to run when a goroutine enters a system call, allocates memory, or blocks waiting on a channel.
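Blocking on a channel as a scheduling point can be illustrated with two goroutines that hand control back and forth. This is a minimal sketch at the language level; pingPong is an illustrative name. With GOMAXPROCS set to 1 only one goroutine executes Go code at a time, so each switch in the log happens because the running goroutine blocked on a channel operation.

```go
package main

import (
	"fmt"
	"runtime"
)

// pingPong alternates between the calling goroutine and a second one
// using unbuffered channels, and returns the order in which they ran.
// Every append is ordered by a channel send/receive pair, so the log
// is deterministic.
func pingPong(rounds int) []string {
	var log []string
	ping := make(chan struct{})
	pong := make(chan struct{})

	go func() {
		for i := 0; i < rounds; i++ {
			<-ping // blocks until the caller sends
			log = append(log, "pong")
			pong <- struct{}{}
		}
	}()

	for i := 0; i < rounds; i++ {
		log = append(log, "ping")
		ping <- struct{}{} // blocks until the other goroutine receives
		<-pong             // blocks until the other goroutine replies
	}
	return log
}

func main() {
	runtime.GOMAXPROCS(1) // a single execution slot is enough
	fmt.Println(pingPong(2)) // prints [ping pong ping pong]
}
```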
------------------------------------------------------------------------------------------
Initialization of the System
proc.c contains this comment:
// The bootstrap sequence is:
//
//	call osinit
//	call schedinit
//	make & queue new G
//	call runtime·mstart
//
// The new G calls runtime·main.
This sequence can be seen in $GOROOT/src/pkg/runtime/asm_386.s. A program generated by the Go compiler should begin executing in this file.
	// saved argc, argv
	...
	CALL	runtime·args(SB)
	CALL	runtime·osinit(SB)	// this sets the number of CPU cores
	CALL	runtime·schedinit(SB)
	// create a new goroutine to start program
	PUSHL	$runtime·main(SB)	// entry
	PUSHL	$0	// arg size
	CALL	runtime·newproc(SB)
	POPL	AX
	POPL	AX
	// start this M
	CALL	runtime·mstart(SB)
Remember the calling protocol for go statements described earlier? Push the arguments first, then push the function to be called and the argument byte count, then call runtime.newproc.
So this code actually starts a new goroutine to execute runtime.main.
runtime.newproc puts runtime.main on the queue of runnable goroutines.
The current thread goes on to execute runtime.mstart, where m stands for machine. runtime.mstart calls schedule.
The schedule function never returns; it picks a runnable goroutine from the queue and runs it.
That is how control reaches runtime.main, which then calls the user's main function, main.main, and enters user code.
To sum up, the call sequence is
runtime.osinit --> runtime.schedinit --> runtime.newproc --> runtime.mstart --> schedule --> runtime.main --> main.main
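The ordering in this sequence (queue a G first, start scheduling afterwards, and only then reach main.main) can be modeled with a toy run queue. Everything in this sketch is hypothetical and far simpler than proc.c: the G and sched types, the FIFO policy, and the fact that schedule returns when the queue is empty are assumptions made to keep the example small.

```go
package main

import "fmt"

// G is a toy stand-in for the runtime's struct G: an id and an entry function.
type G struct {
	goid  int
	entry func()
}

// sched is a toy run queue (hypothetical, not the real scheduler).
type sched struct {
	runq   []*G
	nextID int
}

// newproc queues a new G; it does not run it.
func (s *sched) newproc(entry func()) {
	s.nextID++
	s.runq = append(s.runq, &G{goid: s.nextID, entry: entry})
}

// schedule runs queued Gs in FIFO order until the queue is empty.
// The real schedule never returns; this toy version does.
func (s *sched) schedule() {
	for len(s.runq) > 0 {
		g := s.runq[0]
		s.runq = s.runq[1:]
		g.entry()
	}
}

// bootstrap mimics the order schedinit, newproc, mstart and returns a trace.
func bootstrap() []string {
	var trace []string
	s := &sched{}
	trace = append(trace, "schedinit")

	userMain := func() { trace = append(trace, "main.main") }
	runtimeMain := func() {
		trace = append(trace, "runtime.main")
		userMain()
	}

	s.newproc(runtimeMain) // only queued here, not executed
	trace = append(trace, "newproc")

	trace = append(trace, "mstart")
	s.schedule() // picks the queued G and runs it
	return trace
}

func main() {
	fmt.Println(bootstrap()) // prints [schedinit newproc mstart runtime.main main.main]
}
```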
You can write a hello world program and step through it with GDB to follow this sequence.
-----------------------------------------------------------------------------------------------
The Implementation of Interfaces
Suppose we divide types into concrete types and interface types.
Concrete types are, for example, type myint int32 or type MyType struct {...}.
An interface type is, for example, type I interface {}.
A value of interface type is stored in memory as two fields: a pointer to the real data (the data of the concrete type) and an itab pointer.
See the definition of type nonEmptyInterface struct {...} in $GOROOT/src/pkg/reflect/value.go.
The itab contains the type descriptor of the data (the concrete type) and a method table.
The method table is similar to the virtual function table of a C++ object: it is a table of function pointers.
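The two-field layout can be checked indirectly by measuring the size of an interface value. This is a small sketch that relies on an implementation detail of the standard gc toolchain, not on anything the language guarantees; I, MyType, and ifaceWords are illustrative names.

```go
package main

import (
	"fmt"
	"unsafe"
)

// I and MyType are illustrative types.
type I interface{ Get() int }

type MyType struct{ v int }

func (m *MyType) Get() int { return m.v }

// ifaceWords reports how many pointer-sized words an interface value
// of type I occupies.
func ifaceWords() uintptr {
	var i I = &MyType{v: 7}
	return unsafe.Sizeof(i) / unsafe.Sizeof(uintptr(0))
}

func main() {
	fmt.Println(ifaceWords()) // 2 with the gc toolchain: one word for the itab, one for the data
}
```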
The method table is generated dynamically when the interface value is initialized. Specifically:
For each concrete type, a type descriptor is generated that contains the list of methods of that type.
For each interface type, a type descriptor is also generated that contains the list of methods of the interface.
When an interface value is initialized, its method table is built dynamically from the method table of the concrete type.
For example, the process for var i I = mytype (where mytype is a value of type MyType) is:
Construct a value of interface type I. The first field of the value is a pointer to a copy of the mytype data. Note that it is a copy and not the original data; otherwise, changing i would also change the value of mytype.
The second field of the value points to a dynamically constructed itab. The type field of the itab holds the type descriptor of MyType, and the method table field of the itab holds copies of the corresponding function pointers from the method table in MyType's type descriptor. The function that constructs the itab is in $GOROOT/src/pkg/runtime/iface.c:
static Itab* itab(InterfaceType *inter, Type *type, int32 canfail)
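The copy made in the first step of that process is observable from Go code: changing the original value after the assignment does not change what the interface holds. A minimal sketch, with I, MyType, and copyOnAssign as illustrative names:

```go
package main

import "fmt"

type I interface{ Get() int }

type MyType struct{ v int }

// Get has a value receiver, so MyType itself (not just *MyType) implements I.
func (m MyType) Get() int { return m.v }

// copyOnAssign stores a MyType value in an interface, then modifies the
// original variable, and returns both the original's and the interface's view.
func copyOnAssign() (original, held int) {
	mytype := MyType{v: 1}
	var i I = mytype // the interface value holds its own copy of the data
	mytype.v = 2     // changes only the original variable
	return mytype.v, i.Get()
}

func main() {
	original, held := copyOnAssign()
	fmt.Println(original, held) // prints 2 1
}
```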
One small detail: the method tables in type descriptors are sorted by method name, which makes building the itab fast. The complexity is O(length of the interface method table + length of the concrete type method table).
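The linear-time construction follows from walking both sorted lists once, in the style of a merge. The sketch below is not the code from iface.c; the method type, named, and buildMethodTable are hypothetical, and functions are represented as simple closures instead of raw function pointers.

```go
package main

import "fmt"

// method is a hypothetical stand-in for one entry of a type
// descriptor's method table.
type method struct {
	name string
	fn   func() string
}

// named builds a method whose function just returns its own name.
func named(name string) method {
	return method{name: name, fn: func() string { return name }}
}

// buildMethodTable takes the interface's method names and the concrete
// type's methods, both sorted by name, and returns the function table
// in interface order. If a method is missing, it returns that name.
// Each list is scanned at most once, so the cost is O(len(iface)+len(concrete)).
func buildMethodTable(iface []string, concrete []method) (table []func() string, missing string) {
	j := 0
	for _, want := range iface {
		// Skip concrete methods that sort before the wanted name.
		for j < len(concrete) && concrete[j].name < want {
			j++
		}
		if j == len(concrete) || concrete[j].name != want {
			return nil, want
		}
		table = append(table, concrete[j].fn)
		j++
	}
	return table, ""
}

func main() {
	concrete := []method{named("Close"), named("Get"), named("Set"), named("String")}

	table, missing := buildMethodTable([]string{"Get", "Set"}, concrete)
	fmt.Println(len(table), missing == "", table[0](), table[1]())

	_, missing = buildMethodTable([]string{"Get", "Put"}, concrete)
	fmt.Println("missing:", missing)
}
```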
You may have wondered how the compiler knows whether a type implements an interface. The answer lies here:
During var i I = mytype, if the method table in MyType's type descriptor turns out to lack a method listed in the type descriptor of interface I, the initialization fails with an error saying that MyType does not implement that method of the interface.
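A check of this kind can be seen at run time through a type assertion, where the concrete type is only known dynamically. Note that for a direct assignment such as var i I = Bad{} the compiler already rejects the program at compile time, so the sketch below uses an assertion from interface{} instead. I, Good, Bad, and implementsI are illustrative names.

```go
package main

import "fmt"

type I interface{ Get() int }

// Good has the method required by I.
type Good struct{}

func (Good) Get() int { return 1 }

// Bad has no methods, so it does not implement I.
type Bad struct{}

// implementsI reports whether the dynamic type of v has all the methods
// of I. The two-value form of the assertion returns false instead of
// panicking when a method is missing.
func implementsI(v interface{}) bool {
	_, ok := v.(I)
	return ok
}

func main() {
	fmt.Println(implementsI(Good{}), implementsI(Bad{})) // prints true false
}
```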
Another detail: all methods are converted into ordinary functions during compilation.
For example, func (s *MyType) Get() becomes func Get(s *MyType).
When a method is called on an interface value, the function pointer is looked up in the itab's method table, and the first field of the interface value, the pointer to the concrete data, is passed as the first argument.
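The view of a method as a function whose first parameter is the receiver is also exposed by the language itself through method expressions. A minimal sketch, with MyType and callAsFunction as illustrative names:

```go
package main

import "fmt"

type MyType struct{ v int }

func (s *MyType) Get() int { return s.v }

// callAsFunction takes the method expression (*MyType).Get, which has
// type func(*MyType) int, and calls it with the receiver passed as an
// ordinary first argument.
func callAsFunction() int {
	get := (*MyType).Get
	return get(&MyType{v: 42})
}

func main() {
	fmt.Println(callAsFunction()) // prints 42
}
```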
The implementation also includes some optimizations. For the data pointer field of an interface value, if the real data is 32 bits in size, no pointer needs to be stored and the data itself is stored directly. There is also the empty interface type interface{}: it needs no method table, so its second field is not an itab but a direct pointer to the type descriptor of the real data.
-------------------------------------------------------------------------------------------------
Some links I collected about Go internals:
http://code.google.com/p/try-catch-finally/wiki/GoInternals
http://research.swtch.com/gopackage
http://research.swtch.com/interfaces
http://research.swtch.com/goabstract
http://blog.csdn.net/hopingwhite/article/details/5782888