Implementing Twitter's Snowflake algorithm in Golang to generate globally unique IDs efficiently

Source: Internet
Author: User

I recently started preparing an H5 game.
Because this is my first time working on a game like this,
I want to do it well even though I don't have much experience.
While designing the table structure, I ran into the problem of globally unique IDs for tables.
Since it's a game,
there will hopefully be a lot of people online at once (in the ideal case, ha ha).
At first I wanted to use MongoDB's ObjectId as the globally unique ID,
but a string index is certainly less efficient than an integer one.

The main difference between the two is that character types have the concept of a character set, so there is an encoding step every time data moves between the storage side and the presentation side. That step mainly costs CPU; for in-memory operations the cost is negligible. Using an integer instead reduces CPU, memory, and I/O overhead.

So, weighing efficiency against appearance (an integer looks cleaner), I went looking for a pure-integer ID scheme
and came across Twitter's Snowflake algorithm.

Most of this content is borrowed from the web. I have pulled it together to help both myself and readers passing by understand how Snowflake works.

The Snowflake Algorithm

Snowflake is the unique-ID generation algorithm Twitter uses to handle tens of thousands of messages per second: every message gets a unique, ordered ID, and IDs can be generated in a distributed fashion.

Principle

It is really simple. Just remember: a machine with its own identifier (each machine is assigned a separate ID) generates IDs with different sequence numbers within each millisecond,
so the generated IDs are ordered and unique.

Structure

The following is taken directly from earlier write-ups, tidied up to give you a clearer explanation.

A Snowflake ID is a 64-bit int.

    • Bit 1:
      The highest bit of a binary integer is the sign bit, and 1 means negative. The IDs we need should be positive, so the highest bit is always 0.
    • The next 41 bits:
      Record the millisecond timestamp at which the ID was generated. The value is a non-negative integer (zero included), so the range is 0 to 2^41-1 (many people are confused by the -1; remember that computers count from 0, not 1).
    • The next 10 bits:
      Record the ID of the worker machine, from 0 to 2^10-1 = 1023.
    • The last 12 bits:
      Hold the sequence number of the ID generated within a single millisecond on a single machine.
      12 bits can represent a maximum of 2^12-1 = 4095, so the numbers 0, 1, 2, 3 ... 4095 (counting from 0) serve as the sequence numbers within one millisecond. The algorithm therefore limits a single machine to 4,096 IDs per millisecond; beyond that it waits for the next millisecond.

Finally, the four parts above are stitched together with bit operations to form a 64-bit value.
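
To make the numbers concrete, here is a small Go sketch that computes how much each field can hold from its width. The names timestampBits, sequenceBits, maxValue and yearsCovered are made up for this illustration and do not come from the original code.

```go
package main

import "fmt"

// Field widths of the layout described above (names are illustrative).
const (
	timestampBits = 41
	workerBits    = 10
	sequenceBits  = 12
)

// maxValue returns the largest value that fits in the given number of bits.
func maxValue(bits uint) int64 {
	return int64(1)<<bits - 1
}

// yearsCovered returns roughly how many years a millisecond counter
// with the given number of bits lasts before it overflows.
func yearsCovered(bits uint) float64 {
	const msPerYear = 365.25 * 24 * 60 * 60 * 1000
	return float64(int64(1)<<bits) / msPerYear
}

func main() {
	fmt.Println("max timestamp offset:", maxValue(timestampBits)) // 2199023255551
	fmt.Println("max worker ID:", maxValue(workerBits))           // 1023
	fmt.Println("max sequence number:", maxValue(sequenceBits))   // 4095
	fmt.Printf("timestamp lasts about %.1f years\n", yearsCovered(timestampBits))
	fmt.Println("total bits:", 1+timestampBits+workerBits+sequenceBits) // 64
}
```

The sign bit plus the three fields add up to exactly 64 bits, which is why the ID fits in a single int64.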

Implementation

Here we implement Snowflake in Golang.
First define Snowflake's most basic constants. I explain what each one is for in the comments.

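    // Snowflake exists to generate unique IDs in a distributed system,
    // so each ID contains the cluster and node number.
    package snowFlakeByGo

    import (
        "errors"
        "sync"
        "time"
    )

    const (
        numberBits uint8 = 12 // bits for the sequence number a node can generate within 1 ms (the last segment of the ID)
        workerBits uint8 = 10 // bits for the machine (node) ID; 10 bits allow 2^10 = 1024 nodes (the second-to-last segment)
        // The maximum values are computed with bit operations. -1 is all ones in two's complement,
        // so -1 ^ (-1 << workerBits) equals 1023. Try it yourself if you are curious.
        workerMax   int64 = -1 ^ (-1 << workerBits) // maximum node ID, used to guard against overflow
        numberMax   int64 = -1 ^ (-1 << numberBits) // likewise, the maximum sequence number: 4095, i.e. 4096 IDs per ms
        timeShift   uint8 = workerBits + numberBits // left shift applied to the timestamp
        workerShift uint8 = numberBits              // left shift applied to the node ID
        // A 41-bit millisecond timestamp is used up after about 69 years.
        // If you start building your system on 2010-01-01 and do not subtract that date's
        // timestamp, you throw away 40 years of range!
        // Once this is defined and IDs are being generated, never change it, or duplicate IDs may appear.
        epoch int64 = 1525705533000 // the timestamp (ms) at the moment I wrote this constant
    )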

The two offsets in the code above, timeShift and workerShift, are the positions of the timestamp and the worker node within the ID.
The timestamp starts workerBits + numberBits (that is, 22) bits from the right, which is easy to verify by counting the bits.
workerShift works the same way: the worker ID starts numberBits (that is, 12) bits from the right.
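
As a rough illustration of the two offsets, the sketch below shifts a timestamp offset and a worker ID into place and merges them with a sequence number. The function name compose and the sample values are made up for this example; only the shift widths follow the constants above.

```go
package main

import "fmt"

const (
	numberBits  = 12                      // width of the sequence number
	workerBits  = 10                      // width of the worker ID
	timeShift   = workerBits + numberBits // 22: the timestamp sits above both fields
	workerShift = numberBits              // 12: the worker ID sits above the sequence
)

// compose places each part at its offset and merges them into one value.
// It does no range checking; callers must pass values that fit their fields.
func compose(elapsedMs, workerID, number int64) int64 {
	return elapsedMs<<timeShift | workerID<<workerShift | number
}

func main() {
	fmt.Printf("%b\n", int64(1)<<timeShift)   // a 1 followed by 22 zeros
	fmt.Printf("%b\n", int64(1)<<workerShift) // a 1 followed by 12 zeros
	fmt.Println(compose(1, 1, 1))             // 4194304 + 4096 + 1
}
```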

The Worker Node

Because this is a distributed ID generation algorithm, there will be multiple workers, so here we abstract the basic parameters a worker node needs.

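    // Worker holds the basic parameters a worker node needs.
    type Worker struct {
        mu        sync.Mutex // mutex that makes ID generation safe for concurrent use
        timestamp int64      // timestamp of the last generated ID
        workerId  int64      // ID of this node
        number    int64      // sequence number used so far in the current millisecond (counts up from 0); at most 4096 IDs per ms
    }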

Instantiating a worker node

Because the system is distributed, each machine should be assigned its own ID through an external configuration file or some other means.

    // NewWorker instantiates a worker node.
    // workerId is the ID of the current node.
    func NewWorker(workerId int64) (*Worker, error) {
        // First check that workerId is within the range defined above.
        if workerId < 0 || workerId > workerMax {
            return nil, errors.New("Worker ID excess of quantity")
        }
        // Create the new node.
        return &Worker{
            timestamp: 0,
            workerId:  workerId,
            number:    0,
        }, nil
    }

Redis can be used to hand out a unique ID to each machine in a distributed environment.
That part is outside the algorithm itself.
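
As one possible way to supply the worker ID from outside, the sketch below reads it from an environment variable and validates the range before use. The variable name WORKER_ID and the helper parseWorkerID are assumptions made for this example, not part of the original project.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

const maxWorkerID = 1023 // largest value that fits in 10 bits

// parseWorkerID converts a textual worker ID and checks that it fits
// in the 10-bit worker field.
func parseWorkerID(s string) (int64, error) {
	id, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("worker ID %q is not an integer: %w", s, err)
	}
	if id < 0 || id > maxWorkerID {
		return 0, fmt.Errorf("worker ID %d out of range 0..%d", id, maxWorkerID)
	}
	return id, nil
}

func main() {
	// WORKER_ID is a hypothetical variable name chosen for this sketch.
	raw, ok := os.LookupEnv("WORKER_ID")
	if !ok {
		fmt.Println("WORKER_ID is not set")
		return
	}
	id, err := parseWorkerID(raw)
	if err != nil {
		fmt.Println("invalid configuration:", err)
		return
	}
	fmt.Println("this node will use worker ID", id)
}
```

Whatever the source of the value, the important property is that no two running nodes share the same worker ID.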

Generating an ID

    // Generation is a method on a Worker, which keeps the logic clear:
    // a specific node generates the ID.
    func (w *Worker) GetId() int64 {
        // The key point of getting an ID: lock, lock, lock.
        w.mu.Lock()
        defer w.mu.Unlock() // remember to unlock once generation is complete

        // Timestamp at generation time.
        now := time.Now().UnixNano() / 1e6 // nanoseconds to milliseconds
        if w.timestamp == now {
            w.number++
            // Has this node already generated numberMax IDs within this millisecond?
            if w.number > numberMax {
                // The limit for this millisecond is used up, so wait for the next millisecond.
                for now <= w.timestamp {
                    now = time.Now().UnixNano() / 1e6
                }
                // Start the new millisecond with a fresh sequence number; otherwise the
                // overflowed value would spill into the worker ID bits.
                w.number = 0
                w.timestamp = now
            }
        } else {
            // The current time differs from the node's last generation time,
            // so reset the node's sequence number.
            w.number = 0
            // Many earlier implementations put the following assignment outside the if, so it
            // runs whether or not the timestamp changed, which costs an extra assignment.
            // I chose to put it in the else.
            w.timestamp = now // update the last generation time to the current time
        }
        ID := int64((now-epoch)<<timeShift | (w.workerId << workerShift) | (w.number))
        return ID
    }

Newcomers may find the final ID := xxxxx << xxx | xxxxxx << xx | xxxxx a little confusing.
It shifts each part into its own position and merges the parts with the bitwise OR operator (the '|').
A picture explains it:

[Figure: each part shifted into its bit position and combined with bitwise OR]
After walking through it, it should be clear.
What if a part has fewer digits than its field is wide? Don't worry, the empty binary slots are automatically filled with 0!
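
The zero filling is easy to see by printing an ID field by field. The sketch below is only an illustration: split and layout are hypothetical helpers, and the sample ID is built from the small values 5, 3 and 7 rather than from a real timestamp.

```go
package main

import "fmt"

const (
	numberBits = 12
	workerBits = 10
)

// split recovers the three parts of an ID by shifting and masking.
func split(id int64) (elapsedMs, workerID, number int64) {
	number = id & (1<<numberBits - 1)
	workerID = (id >> numberBits) & (1<<workerBits - 1)
	elapsedMs = id >> (numberBits + workerBits)
	return
}

// layout renders a non-negative ID as "sign timestamp worker number"
// in binary, zero-padding every field to its full width.
func layout(id int64) string {
	ms, worker, number := split(id)
	return fmt.Sprintf("0 %041b %010b %012b", ms, worker, number)
}

func main() {
	id := int64(5)<<22 | int64(3)<<12 | 7
	fmt.Println(layout(id))
}
```

Each small value appears right-aligned in its field with zeros to its left, and the same shifts and masks recover the original parts from the ID.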

A brief explanation of the '|' operator:

The two operands are converted to binary (0s and 1s) and ORed bit by bit. If either operand has a 1 in a given position, the result has a 1 there; otherwise it has a 0.

The picture makes this very clear too (Baidu will say I stole the picture, T.T)
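
For a quick check of the rule without a picture, here is a tiny Go sketch. The helper orBits and the sample numbers are chosen only for illustration.

```go
package main

import "fmt"

// orBits returns a | b: a result bit is 1 when either input bit is 1.
func orBits(a, b uint8) uint8 {
	return a | b
}

func main() {
	a, b := uint8(10), uint8(6) // 1010 and 0110 in binary
	fmt.Printf("%04b | %04b = %04b\n", a, b, orBits(a, b)) // 1010 | 0110 = 1110

	// When two fields occupy different bits, OR behaves like concatenation,
	// which gives the same result as adding them.
	high, low := uint8(3)<<4, uint8(5)
	fmt.Println(orBits(high, low) == high+low) // true
}
```

This is why the three shifted parts of an ID can be merged with '|' without disturbing one another: each part only has bits set inside its own field.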

Test

Next we test the code we just wrote with Golang's testing package.

    package snowFlakeByGo

    import (
        "fmt"
        "testing"
    )

    func TestSnowFlakeByGo(t *testing.T) {
        // Test script.
        // Create a node instance.
        worker, err := NewWorker(1)
        if err != nil {
            t.Fatal(err)
        }
        ch := make(chan int64)
        count := 10000
        // Start count goroutines that generate snowflake IDs concurrently.
        for i := 0; i < count; i++ {
            go func() {
                id := worker.GetId()
                ch <- id
            }()
        }
        defer close(ch)
        m := make(map[int64]int)
        for i := 0; i < count; i++ {
            id := <-ch
            // If the map already has id as a key, the generated snowflake ID is a duplicate.
            _, ok := m[id]
            if ok {
                t.Error("ID is not unique!")
                return
            }
            // Store id in the map as a key.
            m[id] = i
        }
        // All snowflake IDs were generated successfully.
        fmt.Println("All", count, "snowflake ID Get successed!")
    }

Results

Tested on a 2017 13-inch MacBook Pro (non-Touch Bar)

    wbyMacBook-Pro:snowFlakeByGo xxx$ go test
    All 10000 snowflake ID Get successed!
    PASS
    ok      github.com/holdno/snowFlakeByGo 0.031s

Generating 10,000 IDs concurrently took 0.031 seconds.
Running on distributed servers would probably be even faster ~
That is more than enough.

This article combines content from the web with a few small optimizations of my own.
Finally, the GitHub address: https://github.com/holdno/sno ...
If you find it useful, please give it a star.
It's getting late, time to wash up and sleep.
Good night ~
