Snowflake Algorithm (Java edition)

Last Update:2016-08-06 Source: Internet

Author: User

Tags: unique id


Transferred from: http://www.cnblogs.com/haoxinyue/p/5208136.html

1. Database auto-increment sequence or field

The most common approach. The database generates the ID, so it is unique across the whole database.

Advantages:

1) Simple and convenient to code, with acceptable performance.

2) Numeric IDs sort naturally, which helps with paging and with results that need to be ordered.

Disadvantages:

1) Syntax and implementation differ between databases, so extra work is needed when migrating databases or supporting several database products.

2) With a single database, read-write splitting, or one master with multiple slaves, only the single master can generate IDs, which creates a single point of failure.

3) It is difficult to scale when performance no longer meets requirements.

4) Merging several systems or migrating data becomes quite painful.

5) Sharding across tables and databases causes trouble.

Optimization:

1) To address the single-master problem, run several masters and give each one a different starting number and the same step size, where the step equals the number of masters. For example, Master1 generates 1, 4, 7, 10; Master2 generates 2, 5, 8, 11; and Master3 generates 3, 6, 9, 12. This produces IDs that are unique across the cluster and spreads the ID-generation load over several masters.
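The start-and-step scheme can be sketched in a few lines of Java. This is only an in-memory illustration: in a real deployment the database applies the start value and step itself (MySQL, for instance, exposes the `auto_increment_offset` and `auto_increment_increment` settings). The class name `SteppedSequence` is hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in for one master's auto-increment column.
final class SteppedSequence {
    private final AtomicLong next;
    private final long step;

    // start: this master's first ID (1..masterCount); masterCount: the shared step size.
    SteppedSequence(long start, long masterCount) {
        if (masterCount < 1 || start < 1 || start > masterCount) {
            throw new IllegalArgumentException("start must be in 1..masterCount");
        }
        this.next = new AtomicLong(start);
        this.step = masterCount;
    }

    // Returns the current value and advances by the step, atomically.
    long nextId() {
        return next.getAndAdd(step);
    }

    public static void main(String[] args) {
        SteppedSequence master1 = new SteppedSequence(1, 3);
        SteppedSequence master2 = new SteppedSequence(2, 3);
        for (int i = 0; i < 4; i++) {
            System.out.println("master1=" + master1.nextId() + " master2=" + master2.nextId());
        }
    }
}
```

Because every master uses the same step and a distinct start, the sequences never overlap. Adding a master later means changing the step on all of them, which is the main operational cost of this approach.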

2. UUID

A common approach. UUIDs can be generated either by the database or by the program, and they are generally unique worldwide.

Advantages:

1) Simple and convenient to code.

2) Generating IDs is very fast, so there are essentially no performance issues.

3) Globally unique, so data migration, system data consolidation, and database changes can be handled calmly.

Disadvantages:

1) Unordered; there is no guarantee that IDs trend upward.

2) UUIDs are often stored as strings, which makes queries less efficient.

3) They take up relatively more storage space; for a very large database, storage capacity has to be considered.

4) More data to transmit.

5) Not human-readable.
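As a rough illustration of the storage point, the following Java sketch generates a UUID with the standard `java.util.UUID` class and compares its textual form with its binary form. The class name `UuidDemo` and the helper `toBytes` are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

final class UuidDemo {
    // Packs a UUID into its 16-byte binary form.
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID(); // version 4 (random) UUID
        System.out.println(id);      // textual form: 32 hex digits plus 4 hyphens
        System.out.println(id.toString().length() + " chars as text, "
                + toBytes(id).length + " bytes as binary");
    }
}
```

Storing the 16-byte form instead of the 36-character string reduces the space cost, but the values remain unordered, so the first disadvantage still applies.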

3. Variants of the UUID

1) To address the unreadability of UUIDs, you can convert a UUID to an Int64:

    /// <summary>
    /// Gets a unique number sequence based on the GUID
    /// </summary>
    public static long GuidToInt64()
    {
        byte[] bytes = Guid.NewGuid().ToByteArray();
        // Only the first 8 of the GUID's 16 bytes are used, so collisions
        // are more likely than with the full GUID.
        return BitConverter.ToInt64(bytes, 0);
    }

2) To address the lack of ordering in UUIDs, NHibernate provides the comb algorithm (combined GUID/timestamp) among its primary key generation strategies. It keeps 10 bytes of the GUID and uses the remaining 6 bytes to encode the time (DateTime) at which the GUID was generated.

    /// <summary>
    /// Generate a new <see cref="Guid"/> using the comb algorithm.
    /// </summary>
    private Guid GenerateComb()
    {
        byte[] guidArray = Guid.NewGuid().ToByteArray();

        DateTime baseDate = new DateTime(1900, 1, 1);
        DateTime now = DateTime.Now;

        // Get the days and milliseconds which will be used to build
        // the byte string
        TimeSpan days = new TimeSpan(now.Ticks - baseDate.Ticks);
        TimeSpan msecs = now.TimeOfDay;

        // Convert to a byte array
        // Note that SQL Server is accurate to 1/300th of a
        // second so we divide by 3.333333
        byte[] daysArray = BitConverter.GetBytes(days.Days);
        byte[] msecsArray = BitConverter.GetBytes((long)(msecs.TotalMilliseconds / 3.333333));

        // Reverse the bytes to match SQL Server's ordering
        Array.Reverse(daysArray);
        Array.Reverse(msecsArray);

        // Copy the bytes into the guid
        Array.Copy(daysArray, daysArray.Length - 2, guidArray, guidArray.Length - 6, 2);
        Array.Copy(msecsArray, msecsArray.Length - 4, guidArray, guidArray.Length - 4, 4);

        return new Guid(guidArray);
    }

Testing the algorithm above gives the following results (the screenshot is not reproduced here). For comparison, the first three values were produced by the comb algorithm: their last 12 characters are the time component (all three UUIDs were generated within the same millisecond), and values generated later will have a larger final 12 characters than those shown. The last three values are ordinary, directly generated GUIDs.
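The same idea can be sketched in Java. This is a simplified variation, not NHibernate's implementation: it fills all 6 trailing bytes with milliseconds since the Unix epoch instead of splitting them into days since 1900 and 1/300-second ticks. The class name `CombUuid` is hypothetical.

```java
import java.util.UUID;

final class CombUuid {
    private static final long LOW_48_BITS = 0x0000FFFFFFFFFFFFL;

    // Keeps 10 random bytes and overwrites the last 6 bytes with the timestamp.
    static UUID generate(long epochMillis) {
        UUID random = UUID.randomUUID();
        long low = (random.getLeastSignificantBits() & ~LOW_48_BITS)
                 | (epochMillis & LOW_48_BITS);
        return new UUID(random.getMostSignificantBits(), low);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(generate(now));
        System.out.println(generate(now + 1)); // the last 12 hex digits are larger
    }
}
```

The last 12 hex digits of the printed value are the timestamp, so they grow over time. Whether that makes the whole value sort by time depends on how the database compares UUIDs, which is why the original algorithm arranges the bytes specifically for SQL Server.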

4. Twitter's Snowflake algorithm

Snowflake is Twitter's open-source distributed ID generation algorithm; it produces a long (64-bit) ID. The core idea is to use 41 bits for the timestamp in milliseconds, 10 bits for the machine ID (5 bits for the data center and 5 bits for the machine), and 12 bits for a sequence number within the millisecond (meaning each node can produce 4,096 IDs per millisecond), plus a leading sign bit that is always 0. The implementation can be found at https://github.com/twitter/snowflake.
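To make the bit layout concrete, here is a minimal Java sketch that packs the four fields into a long and unpacks them again. The class and method names are hypothetical; the epoch constant is the one that appears in the C# listing in this article.

```java
final class SnowflakeLayout {
    static final long TWEPOCH = 1288834974657L; // custom epoch, in milliseconds
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 5;
    static final int DATACENTER_BITS = 5;
    static final int WORKER_SHIFT = SEQUENCE_BITS;                          // 12
    static final int DATACENTER_SHIFT = WORKER_SHIFT + WORKER_BITS;         // 17
    static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS;  // 22

    // Packs the fields; callers must keep each value within its bit width.
    static long compose(long epochMillis, long datacenterId, long workerId, long sequence) {
        return ((epochMillis - TWEPOCH) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    static long timestampOf(long id)  { return (id >>> TIMESTAMP_SHIFT) + TWEPOCH; }
    static long datacenterOf(long id) { return (id >>> DATACENTER_SHIFT) & ((1L << DATACENTER_BITS) - 1); }
    static long workerOf(long id)     { return (id >>> WORKER_SHIFT) & ((1L << WORKER_BITS) - 1); }
    static long sequenceOf(long id)   { return id & ((1L << SEQUENCE_BITS) - 1); }

    public static void main(String[] args) {
        long id = compose(System.currentTimeMillis(), 3, 7, 42);
        System.out.println(id + " -> datacenter=" + datacenterOf(id)
                + " worker=" + workerOf(id) + " sequence=" + sequenceOf(id));
    }
}
```

Because the timestamp occupies the highest bits, comparing two IDs numerically compares their timestamps first, which is what makes the IDs roughly time-ordered.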

The C# code is as follows:

    using System;

    /// <summary>
    /// From: https://github.com/twitter/snowflake
    /// An object that generates IDs.
    /// This is broken into a separate class in case
    /// we ever want to support multiple worker threads
    /// per process
    /// </summary>
    public class IdWorker
    {
        private long workerId;
        private long datacenterId;
        private long sequence = 0L;

        private static long twepoch = 1288834974657L;

        private static long workerIdBits = 5L;
        private static long datacenterIdBits = 5L;
        private static long maxWorkerId = -1L ^ (-1L << (int)workerIdBits);
        private static long maxDatacenterId = -1L ^ (-1L << (int)datacenterIdBits);
        private static long sequenceBits = 12L;

        private long workerIdShift = sequenceBits;
        private long datacenterIdShift = sequenceBits + workerIdBits;
        private long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;
        private long sequenceMask = -1L ^ (-1L << (int)sequenceBits);

        private long lastTimestamp = -1L;
        private static object syncRoot = new object();

        public IdWorker(long workerId, long datacenterId)
        {
            // sanity check for workerId
            if (workerId > maxWorkerId || workerId < 0)
            {
                throw new ArgumentException(string.Format("worker Id can't be greater than {0} or less than 0", maxWorkerId));
            }
            if (datacenterId > maxDatacenterId || datacenterId < 0)
            {
                throw new ArgumentException(string.Format("datacenter Id can't be greater than {0} or less than 0", maxDatacenterId));
            }
            this.workerId = workerId;
            this.datacenterId = datacenterId;
        }

        public long NextId()
        {
            lock (syncRoot)
            {
                long timestamp = TimeGen();

                if (timestamp < lastTimestamp)
                {
                    throw new ApplicationException(string.Format("Clock moved backwards. Refusing to generate id for {0} milliseconds", lastTimestamp - timestamp));
                }

                if (lastTimestamp == timestamp)
                {
                    // Same millisecond: advance the sequence, and wait for the
                    // next millisecond if it wraps around to 0.
                    sequence = (sequence + 1) & sequenceMask;
                    if (sequence == 0)
                    {
                        timestamp = TilNextMillis(lastTimestamp);
                    }
                }
                else
                {
                    sequence = 0L;
                }

                lastTimestamp = timestamp;

                return ((timestamp - twepoch) << (int)timestampLeftShift)
                    | (datacenterId << (int)datacenterIdShift)
                    | (workerId << (int)workerIdShift)
                    | sequence;
            }
        }

        protected long TilNextMillis(long lastTimestamp)
        {
            long timestamp = TimeGen();
            while (timestamp <= lastTimestamp)
            {
                timestamp = TimeGen();
            }
            return timestamp;
        }

        protected long TimeGen()
        {
            return (long)(DateTime.UtcNow - new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc)).TotalMilliseconds;
        }
    }
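Since the title refers to a Java edition, the same worker can be sketched in Java as follows. This is a direct port of the C# logic under one stated difference: it synchronizes on each worker instance rather than on a lock shared by all workers. The class name `SnowflakeIdWorker` is hypothetical.

```java
final class SnowflakeIdWorker {
    private static final long TWEPOCH = 1288834974657L;
    private static final int WORKER_ID_BITS = 5;
    private static final int DATACENTER_ID_BITS = 5;
    private static final int SEQUENCE_BITS = 12;

    private static final long MAX_WORKER_ID = (1L << WORKER_ID_BITS) - 1;         // 31
    private static final long MAX_DATACENTER_ID = (1L << DATACENTER_ID_BITS) - 1; // 31
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;          // 4095

    private static final int WORKER_ID_SHIFT = SEQUENCE_BITS;
    private static final int DATACENTER_ID_SHIFT = SEQUENCE_BITS + WORKER_ID_BITS;
    private static final int TIMESTAMP_SHIFT = DATACENTER_ID_SHIFT + DATACENTER_ID_BITS;

    private final long workerId;
    private final long datacenterId;
    private long sequence = 0L;
    private long lastTimestamp = -1L;

    SnowflakeIdWorker(long workerId, long datacenterId) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("workerId must be in 0.." + MAX_WORKER_ID);
        }
        if (datacenterId < 0 || datacenterId > MAX_DATACENTER_ID) {
            throw new IllegalArgumentException("datacenterId must be in 0.." + MAX_DATACENTER_ID);
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }

    // synchronized: only one thread at a time may generate an ID on this worker.
    synchronized long nextId() {
        long timestamp = System.currentTimeMillis();
        if (timestamp < lastTimestamp) {
            throw new IllegalStateException("Clock moved backwards by "
                    + (lastTimestamp - timestamp) + " ms; refusing to generate id");
        }
        if (timestamp == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: spin until the next one.
                while (timestamp <= lastTimestamp) {
                    timestamp = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = timestamp;
        return ((timestamp - TWEPOCH) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_ID_SHIFT)
                | (workerId << WORKER_ID_SHIFT)
                | sequence;
    }

    public static void main(String[] args) {
        SnowflakeIdWorker worker = new SnowflakeIdWorker(1, 0);
        for (int i = 0; i < 3; i++) {
            System.out.println(worker.nextId());
        }
    }
}
```

Uniqueness across workers relies on every worker in the system having a distinct (datacenterId, workerId) pair; the sketch does not check that, so assigning those IDs is left to deployment configuration.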

The test code is as follows:

    // Requires: using System; using System.Collections.Generic; using System.Threading;
    private static void TestIdWorker()
    {
        HashSet<long> set = new HashSet<long>();
        IdWorker idWorker1 = new IdWorker(0, 0);
        IdWorker idWorker2 = new IdWorker(1, 0);
        Thread t1 = new Thread(() => DoTestIdWorker(idWorker1, set));
        Thread t2 = new Thread(() => DoTestIdWorker(idWorker2, set));
        t1.IsBackground = true;
        t2.IsBackground = true;

        t1.Start();
        t2.Start();
        try
        {
            Thread.Sleep(30000);
            t1.Abort();
            t2.Abort();
        }
        catch (Exception e)
        {
        }

        Console.WriteLine("done");
    }

    private static void DoTestIdWorker(IdWorker idWorker, HashSet<long> set)
    {
        while (true)
        {
            long id = idWorker.NextId();
            // HashSet<T> is not thread-safe, so guard the shared set.
            lock (set)
            {
                if (!set.Add(id))
                {
                    Console.WriteLine("duplicate:" + id);
                }
            }

            Thread.Sleep(1);
        }
    }

The Snowflake algorithm can be adapted to a project's own needs. For example, estimate the future number of data centers, the number of machines per data center, and the number of IDs that may be requested within a single millisecond, then adjust the number of bits the algorithm assigns to each field.
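The arithmetic behind such an adjustment is simple enough to check in code. The Java sketch below (class and method names are hypothetical) computes the capacity of each field and verifies that a proposed layout still fits in the 63 bits left after the sign bit.

```java
final class BitBudget {
    // Largest value that fits in the given number of bits.
    static long maxValue(int bits) {
        return (1L << bits) - 1;
    }

    // Years until a millisecond timestamp of the given width overflows.
    static double yearsOfTimestamps(int bits) {
        double millisPerYear = 365.25 * 24 * 60 * 60 * 1000;
        return Math.pow(2, bits) / millisPerYear;
    }

    // A layout is usable only if all fields fit below the sign bit of a long.
    static boolean fits(int timestampBits, int datacenterBits, int workerBits, int sequenceBits) {
        return timestampBits + datacenterBits + workerBits + sequenceBits <= 63;
    }

    public static void main(String[] args) {
        // Example: trade data centers for machines (2 + 8 bits instead of 5 + 5).
        int timestampBits = 41, datacenterBits = 2, workerBits = 8, sequenceBits = 12;
        System.out.println("fits: " + fits(timestampBits, datacenterBits, workerBits, sequenceBits));
        System.out.println("data centers: " + (maxValue(datacenterBits) + 1));
        System.out.println("machines per data center: " + (maxValue(workerBits) + 1));
        System.out.println("ids per millisecond per machine: " + (maxValue(sequenceBits) + 1));
        System.out.printf("timestamp lasts about %.1f years%n", yearsOfTimestamps(timestampBits));
    }
}
```

With 41 timestamp bits the counter lasts roughly 69 years from the chosen epoch, so shrinking the timestamp field to make room for other fields shortens the generator's lifetime by half for every bit removed.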

Advantages:

1) It does not depend on a database, is flexible and convenient, and performs better than a database.

2) On a single machine, IDs increase over time.

Disadvantages:

1) IDs increase on a single machine, but in a distributed environment the clocks on different machines cannot be perfectly synchronized, so IDs are sometimes not globally increasing.
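A small Java sketch shows how clock skew breaks global ordering. It assumes the 41/5/5/12 layout described in this article, fixes the data center at 0, and uses made-up clock readings; the class name `ClockSkewDemo` is hypothetical.

```java
final class ClockSkewDemo {
    static final long TWEPOCH = 1288834974657L;

    // 41-bit timestamp, 5-bit data center (0 here), 5-bit worker, 12-bit sequence.
    static long id(long clockMillis, long workerId, long sequence) {
        return ((clockMillis - TWEPOCH) << 22) | (workerId << 12) | sequence;
    }

    public static void main(String[] args) {
        long trueTime = 1470441600000L;        // an arbitrary instant
        long first = id(trueTime + 5, 1, 0);   // worker 1 issues an ID; its clock runs 5 ms fast
        long second = id(trueTime + 2, 2, 0);  // worker 2 issues an ID 2 ms later; its clock is accurate
        System.out.println(second < first);    // prints true: the later ID is numerically smaller
    }
}
```

Each worker's own IDs still increase, so the effect only matters when a consumer assumes that sorting by ID across all machines reproduces the exact order of creation.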
