A major improvement of Golang over C/C++ is its garbage collector (GC), which frees users from managing memory themselves and greatly reduces the bugs caused by memory leaks. The GC also adds performance overhead, though, and careless use can turn it into a performance bottleneck. When programming in Golang, pay particular attention to reusing objects to reduce GC pressure. Slice and string are basic Golang types, and understanding their internals helps us reuse these objects better.
Slice and string internal structure
The internal structures of slice and string can be found in $GOROOT/src/reflect/value.go:
    type StringHeader struct {
        Data uintptr
        Len  int
    }

    type SliceHeader struct {
        Data uintptr
        Len  int
        Cap  int
    }
A string consists of a data pointer and a length. A string is immutable, so its length never changes.
A slice consists of a data pointer, a length, and a capacity. When the capacity is insufficient, append allocates new memory and the data pointer then points to the new address; the original memory is reclaimed by the GC once nothing else references it.
These structures show that assigning a string or slice, passing one as a parameter, or storing one in a custom struct only makes a shallow copy: the header is copied, and the Data pointer still refers to the same underlying memory.
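The shallow-copy behavior can be observed without unsafe by comparing element addresses. The following is a minimal sketch; the helper names sharesBackingArray and zeroFirst are made up for this illustration and are not part of the original test code.

```go
package main

import "fmt"

// sharesBackingArray (hypothetical helper) reports whether two non-empty
// slices start at the same element of the same backing array.
func sharesBackingArray(a, b []int) bool {
	return len(a) > 0 && len(b) > 0 && &a[0] == &b[0]
}

// zeroFirst (hypothetical helper) receives a copy of the slice header,
// not of the elements, so the write is visible to the caller.
func zeroFirst(s []int) {
	s[0] = 0
}

func main() {
	orig := []int{1, 2, 3}
	alias := orig // copies Data, Len and Cap only

	zeroFirst(alias)

	fmt.Println(sharesBackingArray(orig, alias)) // true
	fmt.Println(orig[0])                         // 0
}
```

Because only the header is copied, passing a slice to a function is cheap, but any write through the copy also changes the data seen by the original.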
Slice Reuse
Append operation
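    // Fragment of a goconvey test; it relies on the fmt, reflect and unsafe
    // packages and a dot-import of github.com/smartystreets/goconvey/convey.
    si1 := []int{1, 2, 3, 4, 5, 6, 7, 8, 9}
    si2 := si1
    si2 = append(si2, 0)
    Convey("reallocates memory", func() {
        header1 := (*reflect.SliceHeader)(unsafe.Pointer(&si1))
        header2 := (*reflect.SliceHeader)(unsafe.Pointer(&si2))
        fmt.Println(header1.Data)
        fmt.Println(header2.Data)
        So(header1.Data, ShouldNotEqual, header2.Data)
    })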
si1 and si2 initially point to the same array. When append is performed on si2, the original cap is not enough, so the Data value changes. The policy for the new cap value lives in the grow function in $GOROOT/src/reflect/value.go (the built-in append goes through growslice in $GOROOT/src/runtime/slice.go, which applied the same rule when this article was written): while cap is below 1024 it is doubled, and beyond that it grows by 25% each time. Newer Go releases have adjusted the threshold and the formula. This growth costs twice: copying the data from the old address to the new one takes time, and releasing the old memory puts extra load on the GC. So if the data length is known in advance, pre-allocate with make([]int, len, cap); if it is not known, consider the memory reuse approach below.
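To make the cost of growth visible, one can count how often append has to move to a larger array. This is a rough sketch under the assumption that a change in cap signals a reallocation; countReallocs is a name invented for this example, and the exact count without pre-allocation depends on the Go version.

```go
package main

import "fmt"

// countReallocs (hypothetical helper) appends n ints to a slice created
// with the given initial capacity and counts how often append moved the
// data to a larger backing array.
func countReallocs(initialCap, n int) int {
	s := make([]int, 0, initialCap)
	reallocs := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			reallocs++
		}
	}
	return reallocs
}

func main() {
	// The first number varies across Go versions; the second is always 0.
	fmt.Println("no pre-allocation:", countReallocs(0, 10000))
	fmt.Println("pre-allocated:    ", countReallocs(10000, 10000))
}
```

With the capacity reserved up front, every append writes into the existing array, so there is nothing to copy and no abandoned array for the GC to collect.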
Memory Reuse
    si1 := []int{1, 2, 3, 4, 5, 6, 7, 8, 9}
    si2 := si1[:7]
    Convey("does not reallocate memory", func() {
        header1 := (*reflect.SliceHeader)(unsafe.Pointer(&si1))
        header2 := (*reflect.SliceHeader)(unsafe.Pointer(&si2))
        fmt.Println(header1.Data)
        fmt.Println(header2.Data)
        So(header1.Data, ShouldEqual, header2.Data)
    })
    Convey("append a value to the slice", func() {
        si2 = append(si2, 10)
        Convey("changes the original slice's value", func() {
            header1 := (*reflect.SliceHeader)(unsafe.Pointer(&si1))
            header2 := (*reflect.SliceHeader)(unsafe.Pointer(&si2))
            fmt.Println(header1.Data)
            fmt.Println(header2.Data)
            So(header1.Data, ShouldEqual, header2.Data)
            So(si1[7], ShouldEqual, 10)
        })
    })
si2 is a slice of si1. The first block shows that slicing does not reallocate memory: the Data pointers of si2 and si1 hold the same address. The second block shows that appending a new value to si2 still allocates nothing, because si2 has spare capacity, and that the append also changes si1[7], because both slices point to the same Data area. Thanks to this property, si1 = si1[:0] is all it takes to empty si1 over and over and reuse its memory.
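The reset-and-refill pattern might look like the sketch below. The helper fill and its parameters are assumptions made for illustration; the point is only that truncating to length zero keeps the capacity, so later appends land in the same array as long as they fit.

```go
package main

import "fmt"

// fill (hypothetical helper) resets buf to length zero and refills it
// with n values. As long as n <= cap(buf), append writes into the
// existing backing array instead of allocating a new one.
func fill(buf []int, n, base int) []int {
	buf = buf[:0] // keep the capacity, drop the contents
	for i := 0; i < n; i++ {
		buf = append(buf, base+i)
	}
	return buf
}

func main() {
	buf := make([]int, 0, 8)

	buf = fill(buf, 5, 100)
	first := &buf[0]

	buf = fill(buf, 8, 200)
	fmt.Println(first == &buf[0])   // true: the same backing array was reused
	fmt.Println(len(buf), cap(buf)) // 8 8
}
```

The old contents are simply overwritten, so this only suits buffers whose previous values are no longer needed by anyone holding another slice of the same array.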
PS: use copy(si2, si1) to make a deep copy. copy transfers min(len(si2), len(si1)) elements, so si2 must already have the required length.
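A deep copy along these lines could be wrapped in a small helper. The name clone is an assumption for this sketch; it allocates a destination of the right length first, since copy never grows its destination.

```go
package main

import "fmt"

// clone (hypothetical helper) returns a copy of src backed by its own array.
func clone(src []int) []int {
	dst := make([]int, len(src)) // copy fills at most len(dst) elements
	copy(dst, src)
	return dst
}

func main() {
	si1 := []int{1, 2, 3, 4, 5, 6, 7, 8, 9}
	si2 := clone(si1[:7])

	si2 = append(si2, 10) // grows si2's own array; si1 is untouched

	fmt.Println(si1[7]) // 8
	fmt.Println(si2[7]) // 10
}
```

Compared with the aliasing example earlier, the append no longer overwrites si1[7], at the price of one extra allocation and one pass over the data.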
String
    Convey("string constants", func() {
        str1 := "hello world"
        str2 := "hello world"
        Convey("same address", func() {
            header1 := (*reflect.StringHeader)(unsafe.Pointer(&str1))
            header2 := (*reflect.StringHeader)(unsafe.Pointer(&str2))
            fmt.Println(header1.Data)
            fmt.Println(header2.Data)
            So(header1.Data, ShouldEqual, header2.Data)
        })
    })
This example is simple: identical string constants share the same address.
    Convey("different substrings of the same string", func() {
        str1 := "hello world"[:6]
        str2 := "hello world"[:5]
        Convey("same address", func() {
            header1 := (*reflect.StringHeader)(unsafe.Pointer(&str1))
            header2 := (*reflect.StringHeader)(unsafe.Pointer(&str2))
            fmt.Println(header1.Data, str1)
            fmt.Println(header2.Data, str2)
            So(str1, ShouldNotEqual, str2)
            So(header1.Data, ShouldEqual, header2.Data)
        })
    })
Different substrings of the same string request no additional memory. Note that "the same string" here means str1.Data == str2.Data && str1.Len == str2.Len, not str1 == str2. In the following example str1 == str2, yet their Data values differ.
    Convey("identical substrings of different strings", func() {
        str1 := "hello world"[:5]
        str2 := "hello golang"[:5]
        Convey("different addresses", func() {
            header1 := (*reflect.StringHeader)(unsafe.Pointer(&str1))
            header2 := (*reflect.StringHeader)(unsafe.Pointer(&str2))
            fmt.Println(header1.Data, str1)
            fmt.Println(header2.Data, str2)
            So(str1, ShouldEqual, str2)
            So(header1.Data, ShouldNotEqual, header2.Data)
        })
    })
For strings, there is really just one thing to remember: strings are immutable, and slicing or assigning a string never requests extra memory, because only the internal data pointer is copied. (Operations that build new content, such as concatenation or conversion from []byte, do allocate.) I once thought I was being clever by designing a cache to store strings and cut the space taken by duplicate strings. In fact, unless a string was created from a []byte, it is a substring of another string (for example, the strings returned by strings.Split) and never requested extra space in the first place, so the cache was superfluous.
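The claim about strings.Split can be checked with the same reflect.StringHeader technique used in the tests above (deprecated in recent Go releases, where unsafe.StringData serves the same purpose). This is an illustrative sketch; dataPtr, pointsInto, splitShares and copiedFromBytes are names invented here.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
	"unsafe"
)

// dataPtr (hypothetical helper) returns the address stored in the
// string's header.
func dataPtr(s string) uintptr {
	return (*reflect.StringHeader)(unsafe.Pointer(&s)).Data
}

// pointsInto (hypothetical helper) reports whether sub's bytes lie
// inside s's bytes.
func pointsInto(sub, s string) bool {
	start := dataPtr(s)
	end := start + uintptr(len(s))
	p := dataPtr(sub)
	return p >= start && p+uintptr(len(sub)) <= end
}

// splitShares (hypothetical helper) reports whether every field returned
// by strings.Split is a view into the original string.
func splitShares(s, sep string) bool {
	for _, part := range strings.Split(s, sep) {
		if !pointsInto(part, s) {
			return false
		}
	}
	return true
}

// copiedFromBytes (hypothetical helper) reports whether a string built
// from a []byte lives at a different address than the original string.
func copiedFromBytes(s string) bool {
	raw := []byte(s)
	fromBytes := string(raw)
	return dataPtr(fromBytes) != dataPtr(s)
}

func main() {
	line := "hello world"
	fmt.Println(splitShares(line, " "))  // true: no character data was copied
	fmt.Println(copiedFromBytes(line))   // true: the conversion made a copy
}
```

One consequence worth keeping in mind is that a small substring keeps the whole original string reachable, so the sharing saves allocations but can also keep a large buffer alive.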
Reference links
- Go Slices: usage and internals: https://blog.golang.org/go-slices-usage-and-internals
- Test Code Link: https://github.com/hatlonely/hellogolang/blob/master/internal/buildin/reuse_test.go
When reprinting, please credit the source.
Link to this article: http://www.hatlonely.com/2018/03/17/golang-slice-and-string-reuse/