Preface
Common ways to implement a HashSet in Go:
map[int]int8
map[int]bool
In both cases only the map key carries data; the value is never used. When a large collection is cached this way, the unused values waste memory.
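As an illustration of the pattern described above, here is a minimal sketch that uses map[int]bool as a set. The function name dedupe is hypothetical and chosen only for this example; note that the bool value is written but carries no information beyond the presence of the key.

```go
package main

import "fmt"

// dedupe returns the distinct values of in, in first-seen order.
// It uses map[int]bool as a set: only the key matters, yet every
// entry still stores a one-byte bool value.
func dedupe(in []int) []int {
	seen := make(map[int]bool)
	var out []int
	for _, v := range in {
		if seen[v] {
			continue // already recorded
		}
		seen[v] = true
		out = append(out, v)
	}
	return out
}

func main() {
	fmt.Println(dedupe([]int{3, 1, 3, 2, 1})) // prints [3 1 2]
}
```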
Empty struct
An empty struct, referred to here as an empty object, is a magical thing: it is a struct type with no fields.
type Q struct{}
It has the following properties:
It can be used like any ordinary struct
var a = []struct{}{struct{}{}}
fmt.Println(len(a)) // prints 1
It occupies no space
var s struct{}
fmt.Println(unsafe.Sizeof(s)) // prints 0
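Two empty structs may share the same address
type A struct{}
a := A{}
b := A{}
fmt.Println(&a == &b) // may print true or false
The Go specification states that pointers to distinct zero-size variables may or may not be equal, so this result should not be relied on. When the addresses are equal, the reason is that the Go runtime hands out the address of runtime.zerobase for every zero-size allocation on the heap:
var zerobase uintptr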
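The zero-size property listed above also holds for types built only from empty structs. The following sketch shows this with unsafe.Sizeof; the type name pair and the helper name sizes are hypothetical and exist only for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// pair contains only zero-size fields, so it is zero-size itself.
type pair struct {
	a struct{}
	b struct{}
}

// sizes reports the size in bytes of an array of 1000 empty structs
// and of a struct made up solely of empty-struct fields.
func sizes() (arraySize, pairSize uintptr) {
	var arr [1000]struct{}
	var p pair
	return unsafe.Sizeof(arr), unsafe.Sizeof(p)
}

func main() {
	a, p := sizes()
	fmt.Println(a, p) // prints 0 0
}
```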
HashSet
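With this background, we can use the empty struct to optimize the HashSet.
// itemExists is the single zero-size value stored for every key.
var itemExists = struct{}{}

// Set is a hash set backed by a map whose values occupy no space.
type Set struct {
	items map[interface{}]struct{}
}

// New returns an empty Set.
func New() *Set {
	return &Set{items: make(map[interface{}]struct{})}
}

// Add inserts item into the set.
func (set *Set) Add(item interface{}) {
	set.items[item] = itemExists
}

// Remove deletes item from the set; it is a no-op if item is absent.
func (set *Set) Remove(item interface{}) {
	delete(set.items, item)
}

// Contains reports whether item is in the set.
func (set *Set) Contains(item interface{}) bool {
	_, ok := set.items[item]
	return ok
}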
A simple HashSet implementation is complete.
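A brief usage sketch follows. It repeats the Set type from above in compact form so that it compiles on its own; the values added in main are arbitrary examples.

```go
package main

import "fmt"

var itemExists = struct{}{}

// Set is the map-backed hash set from the article.
type Set struct {
	items map[interface{}]struct{}
}

func New() *Set                          { return &Set{items: make(map[interface{}]struct{})} }
func (set *Set) Add(item interface{})    { set.items[item] = itemExists }
func (set *Set) Remove(item interface{}) { delete(set.items, item) }
func (set *Set) Contains(item interface{}) bool {
	_, ok := set.items[item]
	return ok
}

func main() {
	s := New()
	s.Add("go")
	s.Add(42)
	s.Add("go") // adding a duplicate has no effect

	fmt.Println(s.Contains("go"), s.Contains(42), s.Contains("rust")) // prints true true false

	s.Remove("go")
	fmt.Println(s.Contains("go")) // prints false
}
```

Because the keys are of type interface{}, items must be comparable values: using a slice, map, or function as an item panics at run time.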
Performance comparison
func BenchmarkIntSet(b *testing.B) {
	var B = NewIntSet(3)
	B.Set(10).Set(11)
	for i := 0; i < b.N; i++ {
		if B.Exists(1) {
		}
		if B.Exists(11) {
		}
		if B.Exists(1000000) {
		}
	}
}

func BenchmarkMap(b *testing.B) {
	var B = make(map[int]int8, 3)
	B[10] = 1
	B[11] = 1
	for i := 0; i < b.N; i++ {
		if _, exists := B[1]; exists {
		}
		if _, exists := B[11]; exists {
		}
		if _, exists := B[1000000]; exists {
		}
	}
}

BenchmarkIntSet-2    50000000    35.3 ns/op    0 B/op    0 allocs/op
BenchmarkMap-2       30000000    41.2 ns/op    0 B/op    0 allocs/op
Conclusion
Performance: slightly better (35.3 ns/op versus 41.2 ns/op), but not dramatically so. For an online service under light load there should be no noticeable difference.
Memory consumption: our service caches a lot of data and uses a lot of memory, and in our measurements this optimization saved 1.6 GB. How much space is saved depends on the amount of data.