It's no surprise that cgo lets Go work with C. Many third-party packages are thin wrappers around C libraries, made available for use from Go. The Go project's own source contains not only C code but also assembly. Can you integrate assembly into your own project as well? This article gives a complete and clear explanation of how to get Go and assembly working together. Really sensitive to performance? Bring on the assembly!
———— Translation Divider Line ————
Go and assembly
One of my favorite parts about Go is its unwavering pragmatism. Sometimes we put too much emphasis on language design and forget about the other things that programming contains. For example:
- The Go compiler is fast
- Go has a powerful standard library
- Go can work on multiple platforms
- Go has comprehensive documentation that can be accessed via the command line, a local web server, or the Internet
- All Go code is statically compiled, so deployment is trivial
- All of the Go source code is well formatted and can be read online (like this)
- Go has a well-defined (and documented) grammar (unlike C++ or Ruby)
- Go comes with a package management tool: go get X (e.g. go get code.google.com/p/go.net/websocket)
- Like other languages, Go has a set of coding style guidelines. Some are enforced by the compiler (such as capitalization), while others are just conventions, but Go also provides a tool to tidy up your code: gofmt name_of_file.go
- There is also go fix, a tool that automatically migrates Go code from an earlier version to a newer one
- Go comes with a testing tool for testing packages: go test /path/to/package. It can also run benchmarks.
- You can debug and profile Go programs.
- Did you know there's a playground where you can try Go online?
- Go can integrate with C libraries via cgo.
These are just a few examples, but here I would like to focus on a less well-known one: Go can seamlessly call functions written in assembly.
How to use the assembly in Go
Suppose we need to write an assembly version of a sum function. First, create a file called sum.go with the following content:
    package sum

    // Sum adds up every element of xs and returns the total.
    func Sum(xs []int64) int64 {
        var n int64
        for _, v := range xs {
            n += v
        }
        return n
    }
This function adds up a slice of integers and returns the result. To test it, create a file called sum_test.go with the following content:
    package sum

    import (
        "testing"
    )

    type (
        // testCase pairs an input slice with its expected sum.
        testCase struct {
            n  int64
            xs []int64
        }
    )

    var (
        cases = []testCase{
            {0, []int64{}},
            {15, []int64{1, 2, 3, 4, 5}},
        }
    )

    func TestSum(t *testing.T) {
        for _, tc := range cases {
            n := Sum(tc.xs)
            if tc.n != n {
                t.Error("expected", tc.n, "got", n, "for", tc.xs)
            }
        }
    }
Writing tests for your code is a good idea, not only for library code (anything that isn't package main; translator's note: functions in package main can be tested with go test too), but also as a good way to experiment. To run the tests, type go test on the command line.
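Since the motivation for dropping to assembly is usually performance, it helps to have a baseline measurement before changing anything. The following is a minimal sketch, not part of the original article: BenchmarkSum and sink are illustrative names, the input size of 1024 is arbitrary, and Sum is repeated so that the program stands alone.

```go
package main

import (
	"fmt"
	"testing"
)

// Sum is the pure Go version from sum.go, repeated here so the sketch is self-contained.
func Sum(xs []int64) int64 {
	var n int64
	for _, v := range xs {
		n += v
	}
	return n
}

// sink keeps the result alive so the compiler cannot discard the call.
// (Illustrative name, not from the article.)
var sink int64

// BenchmarkSum times Sum over a fixed slice of 1024 elements (an arbitrary size).
func BenchmarkSum(b *testing.B) {
	xs := make([]int64, 1024)
	for i := range xs {
		xs[i] = int64(i)
	}
	b.ResetTimer() // do not count the setup above
	for i := 0; i < b.N; i++ {
		sink = Sum(xs)
	}
}

func main() {
	// testing.Benchmark runs the function outside of "go test".
	res := testing.Benchmark(BenchmarkSum)
	fmt.Println(res)
}
```

Placed in sum_test.go instead (without the main function and the copy of Sum), a function named like this should be picked up by go test -bench=. so the same measurement can be repeated after the function is swapped for an assembly version.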
Now let's replace this function with assembly. We can start by looking at what the Go compiler actually generates. Instead of go test or go build, run the command go tool 6g -S sum.go (for 64-bit). You will get the following output:
    --- prog list "Sum" ---
    0000 (sum.go:3) TEXT    Sum+0(SB),$16-24
    0001 (sum.go:4) MOVQ    $0,SI
    0002 (sum.go:5) MOVQ    xs+0(FP),BX
    0003 (sum.go:5) MOVQ    BX,autotmp_0000+-16(SP)
    0004 (sum.go:5) MOVL    xs+8(FP),BX
    0005 (sum.go:5) MOVL    BX,autotmp_0000+-8(SP)
    0006 (sum.go:5) MOVL    xs+12(FP),BX
    0007 (sum.go:5) MOVL    BX,autotmp_0000+-4(SP)
    0008 (sum.go:5) MOVL    $0,AX
    0009 (sum.go:5) MOVL    autotmp_0000+-8(SP),DI
    0010 (sum.go:5) LEAQ    autotmp_0000+-16(SP),BX
    0011 (sum.go:5) MOVQ    (BX),CX
    0012 (sum.go:5) JMP     ,14
    0013 (sum.go:5) INCL    ,AX
    0014 (sum.go:5) CMPL    AX,DI
    0015 (sum.go:5) JGE     ,20
    0016 (sum.go:5) MOVQ    (CX),BP
    0017 (sum.go:5) ADDQ    $8,CX
    0018 (sum.go:6) ADDQ    BP,SI
    0019 (sum.go:5) JMP     ,13
    0020 (sum.go:8) MOVQ    SI,.noname+16(FP)
    0021 (sum.go:8) RET     ,
    sum.go:3: Sum xs does not escape
The assembly is quite hard to understand, and we will look at it in more detail in a moment. First, though, let's use it as a template. In the same directory as sum.go, create a file called sum_amd64.s with the following content:
    // func Sum(xs []int64) int64
    TEXT ·Sum(SB),$0
        MOVQ    $0,SI
        MOVQ    xs+0(FP),BX
        MOVQ    BX,autotmp_0000+-16(SP)
        MOVL    xs+8(FP),BX
        MOVL    BX,autotmp_0000+-8(SP)
        MOVL    xs+12(FP),BX
        MOVL    BX,autotmp_0000+-4(SP)
        MOVL    $0,AX
        MOVL    autotmp_0000+-8(SP),DI
        LEAQ    autotmp_0000+-16(SP),BX
        MOVQ    (BX),CX
        JMP     L2
    L1: INCL    AX
    L2: CMPL    AX,DI
        JGE     L3
        MOVQ    (CX),BP
        ADDQ    $8,CX
        ADDQ    BP,SI
        JMP     L1
    L3: MOVQ    SI,.noname+16(FP)
        RET
Basically, all I did was replace the hard-coded line numbers used for the jumps (JMP, JGE) with labels, and add a middle dot (·) before the function name. (Make sure the file is saved with UTF-8 encoding.) Next, remove the function body from sum.go, leaving only the declaration:
    package sum

    func Sum(xs []int64) int64
Now you should be able to run the tests with go test, and they will use the custom assembly version of the function.
How it works
A more detailed description of the assembler is available elsewhere. Here I will briefly explain what this code does.
    MOVQ    $0,SI
First, we put 0 into the SI register, which holds the running total. The Q means quadword, which is 8 bytes; later we will see L, which is 4 bytes. The operands are in (source, destination) order.
    MOVQ    xs+0(FP),BX
    MOVQ    BX,autotmp_0000+-16(SP)
    MOVL    xs+8(FP),BX
    MOVL    BX,autotmp_0000+-8(SP)
    MOVL    xs+12(FP),BX
    MOVL    BX,autotmp_0000+-4(SP)
Next, the incoming parameter is read and its values are saved on the stack. A Go slice has three parts: a pointer to the memory that holds its elements, a length, and a capacity. The pointer is 8 bytes, and the length and capacity are 4 bytes each. This code copies each of those values to the stack by way of the BX register. (see here for more details on slices)
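The three-part layout described above can be inspected from Go itself. The following is a minimal sketch, not part of the original article: sliceHeader and headerOf are hypothetical names, and the struct merely mirrors the runtime's slice representation, which is an implementation detail rather than a guaranteed API. One caution: on current 64-bit Go releases int is 8 bytes, so the length and capacity are 8 bytes each rather than the 4 bytes described here, and the offsets used in this article's assembly would no longer match.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sliceHeader mirrors the pointer/length/capacity layout of a slice.
// Hypothetical illustration type; it relies on an implementation detail.
type sliceHeader struct {
	data unsafe.Pointer // address of the first element
	len  int            // number of elements
	cap  int            // size of the backing array
}

// headerOf reinterprets the slice value as the three fields above.
func headerOf(xs []int64) sliceHeader {
	return *(*sliceHeader)(unsafe.Pointer(&xs))
}

func main() {
	xs := []int64{1, 2, 3, 4, 5}
	h := headerOf(xs)
	fmt.Println("len:", h.len, "cap:", h.cap)
	// The sizes printed here depend on the platform the program is built for.
	fmt.Println("slice value size:", unsafe.Sizeof(xs))
	fmt.Println("length field size:", unsafe.Sizeof(h.len))
}
```

Printing the sizes on the target platform is a quick way to check whether hand-written offsets such as xs+8(FP) and xs+12(FP) still describe the real layout.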
    MOVL    $0,AX
    MOVL    autotmp_0000+-8(SP),DI
    LEAQ    autotmp_0000+-16(SP),BX
    MOVQ    (BX),CX
Next, we put 0 into AX, which serves as the loop counter. We put the length of the slice into DI and load a pointer to the elements of xs into CX.
        JMP     L2
    L1: INCL    AX
    L2: CMPL    AX,DI
        JGE     L3
Now we reach the body of the loop. First we jump to L2, which compares AX and DI. If AX has reached DI (i.e. i == len(xs)), every element of the slice has been processed, so we jump to L3.
    MOVQ    (CX),BP
    ADDQ    $8,CX
    ADDQ    BP,SI
    JMP     L1
The summation happens here. First, the value that CX points to is loaded into BP. Then CX is advanced by 8 bytes. Finally, BP is added to SI and we jump to L1, which increments AX and starts the loop again.
    L3: MOVQ    SI,.noname+16(FP)
        RET
Once the sum is finished, the result is stored right after all the arguments passed to the function (a slice is 16 bytes, so the offset is 16 bytes). Then we return.
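To make the control flow of the walkthrough easier to follow, it can be transcribed back into Go one instruction at a time. The following is a rough sketch under stated assumptions, not what the compiler or the original author wrote: sumLikeAsm is a hypothetical name, the variables are named after the registers they stand in for, and cx is an index rather than a raw pointer, so it advances by 1 where the assembly adds 8 bytes.

```go
package main

import "fmt"

// sumLikeAsm mirrors the generated assembly, using goto and labels in place
// of JMP/JGE. Hypothetical illustration, not code from the article.
func sumLikeAsm(xs []int64) int64 {
	var (
		si int64 // SI: running total (MOVQ $0,SI)
		bp int64 // BP: the element just loaded
		ax int   // AX: loop counter (MOVL $0,AX)
		di int   // DI: length of the slice
		cx int   // CX: position of the next element
	)
	di = len(xs)
	goto L2 // JMP L2
L1:
	ax++ // INCL AX
L2:
	if ax >= di { // CMPL AX,DI followed by JGE L3
		goto L3
	}
	bp = xs[cx] // MOVQ (CX),BP
	cx++        // ADDQ $8,CX (one element is 8 bytes)
	si += bp    // ADDQ BP,SI
	goto L1     // JMP L1
L3:
	return si // MOVQ SI,.noname+16(FP) then RET
}

func main() {
	fmt.Println(sumLikeAsm([]int64{1, 2, 3, 4, 5}))
	fmt.Println(sumLikeAsm(nil))
}
```

Seen this way, the loop keeps two pieces of state that move in lockstep (the counter in AX and the position in CX), which is one of the things a hand-written version can simplify.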
Rewrite
Here is my rewrite of the code:
    // func Sum(xs []int64) int64
    TEXT ·Sum2(SB),7,$0
        MOVQ    $0, SI          // n
        MOVQ    xs+0(FP), BX    // BX = &xs[0]
        MOVL    xs+8(FP), CX    // len(xs)
        MOVLQSX CX, CX          // len as int64
        INCQ    CX              // CX++

    start:
        DECQ    CX              // CX--
        JZ      done            // jump if CX = 0
        ADDQ    (BX), SI        // n += *BX
        ADDQ    $8, BX          // BX += 8
        JMP     start

    done:
        MOVQ    SI, .noname+16(FP) // return n
        RET
Hopefully this will be easier to understand.
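When hand-writing a variant like this, it is worth checking it against the plain Go version on a few inputs, including the empty slice. The following is a minimal sketch in pure Go, not the author's code: sum2Like is a hypothetical stand-in that follows the count-down structure of the rewritten assembly (an index replaces the pointer in BX), and sum is the reference loop.

```go
package main

import "fmt"

// sum is the reference implementation: a plain range loop.
func sum(xs []int64) int64 {
	var n int64
	for _, v := range xs {
		n += v
	}
	return n
}

// sum2Like follows the structure of the rewritten assembly.
// Hypothetical illustration; i stands in for the pointer held in BX.
func sum2Like(xs []int64) int64 {
	var n int64          // SI
	i := 0               // position of the next element
	cx := int64(len(xs)) // MOVL xs+8(FP),CX then MOVLQSX CX,CX
	cx++                 // INCQ CX
	for {
		cx-- // DECQ CX
		if cx == 0 {
			break // JZ done
		}
		n += xs[i] // ADDQ (BX),SI
		i++        // ADDQ $8,BX
	}
	return n
}

func main() {
	// Compare both versions on a handful of arbitrary inputs.
	for _, xs := range [][]int64{{}, {7}, {1, 2, 3, 4, 5}, {-4, 4, 10}} {
		fmt.Println(xs, sum(xs), sum2Like(xs), sum(xs) == sum2Like(xs))
	}
}
```

Incrementing the count before the loop and decrementing it at the top is what lets a single DECQ/JZ pair handle the empty slice without a separate length check.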
Caveats
It's cool to be able to do this, but don't overlook the caveats:
- Assembly is hard to write, and especially hard to write well. Usually the compiler will write faster code than you will (and the Go compiler can be expected to keep improving).
- Assembly only runs on one platform. In this example, the code only runs on amd64. One solution is to supply separate versions of the code for x86 and ARM (like this).
- Assembly ties you to low-level details in a way that standard Go does not. For example, the length of a slice is currently a 32-bit integer, but it is not inconceivable that it will become a 64-bit integer. When a change like that happens, this code will break (possibly in nasty ways that the compiler cannot detect).
- The Go compiler cannot inline functions written in assembly, though it can inline small Go functions. So using assembly may actually make your program slower.
Still, this is useful for the following two reasons:
- Sometimes you need the power that assembly gives you (whether for performance reasons or for some fairly specialized CPU operations). For guidance on when to use it, the Go source includes some pretty good examples (see crypto and math).
- Because it is so easy to try out, it is a great way to learn assembly.