Liu Bing is an open source technology enthusiast and the author of Nredis-proxy, a high-performance Redis middleware. Current research interests: Java middleware, microservices, and related technologies.
I. What Is a Distributed ID Generator
We have the current era to thank for making distributed ID generators a topic at all. As the internet has become more and more widespread in China, a single system or a small system can no longer keep up. As the number of users and the volume of data grow, a single application or a single database cannot meet demand. On the application side this led to microservices; on the storage side, splitting data across multiple databases and tables (sharding) addresses the problem. But a new problem arises: how can multiple applications obtain a unique primary key or serial number so that data is never duplicated? A distributed ID generator exists to solve exactly this problem, so that nobody has to worry about it. That is my reason for writing this article.
II. Advantages of a Distributed ID Generator
1) Solves the problem of unique serial numbers across sharded databases and tables
2) Solves the problem of unique serial numbers in distributed applications or microservice frameworks
3) Generation rules are customizable and can be extended to fit your business needs
4) High performance, with a simple and stable system
5) The system can be scaled out arbitrarily
III. Distributed ID Generator Architecture Diagram
IV. Distributed ID Generator Flowcharts
1. Important fields of the distributed ID generator
2. Flowchart when Concurrentvalue does not exist
3. Flowchart when Concurrentvalue exists
V. Existing Distributed ID Generator Solutions
1. UUID
A Universally Unique IDentifier (UUID), which is formally specified in an RFC, is a 128-bit number. It can also be written as 32 hexadecimal characters (each character, 0-f, represents 4 bits), split into groups by "-". For version 1 the layout is:
Timestamp + UUID version number: three groups totaling 16 characters (60 bits + 4 bits).
Clock sequence and reserved field: 4 characters (14 bits + 2 bits in the RFC 4122 variant).
Node ID: 12 characters (48 bits).
2. Hibernate
Hibernate's CustomVersionOneStrategy.java resolves the two issues of the version 1 UUID described above.
Timestamp (6 bytes, 48 bits): millisecond precision, counted from 1970, which covers about 8,925 years.
Sequence number (2 bytes, 16 bits, maximum 65535): it is not reset to zero when the timestamp moves to the next millisecond; the timestamp and the counter advance independently. When the short overflows to a negative value, it is reset to 0.
Machine identifier (4 bytes, 32 bits): the IP address of localhost. An IPv4 address is exactly 4 bytes; an IPv6 address is 16 bytes, so only the first 4 bytes are used.
Process identifier (4 bytes, 32 bits): the current timestamp shifted right by 8 bits and truncated to an int, as a stand-in for a real process ID, on the assumption that two processes will not start at the same moment.
3. MongoDB
MongoDB's ObjectId.java uses the following fields:
Timestamp (4 bytes, 32 bits): second precision, counted from 1970, which covers 136 years.
Auto-increment sequence (3 bytes, 24 bits, about 16 million values): an int that starts at a random number and is incremented continuously. It is not reset to zero when the timestamp moves to the next second; the timestamp and the counter advance independently. Because only 3 bytes are available, only the low 3 bytes of the 4-byte int are kept.
Machine identifier (3 bytes, 24 bits): the MAC addresses of all network interfaces are combined and hashed with hashCode, and the resulting int is likewise truncated to its low 3 bytes. If no network interface can be read, a random number is used instead.
Process identifier (2 bytes, 16 bits): the process number obtained through JMX. If it cannot be obtained, a hash of the process name or a random number is used instead.
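To make the 4 + 3 + 2 + 3 byte split above concrete, here is a minimal sketch that packs the four fields into 12 bytes. It is not MongoDB's actual code: the class name ObjectIdSketch, the field order (timestamp, machine, process, counter) and the machineHash and processId parameters are assumptions for illustration, and the caller is expected to supply the machine hash and process number by whatever means it has.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of a 12-byte, ObjectId-style identifier.
class ObjectIdSketch {
    // Counter starts at a random value and is never reset when the second changes.
    private static final AtomicInteger COUNTER =
            new AtomicInteger(ThreadLocalRandom.current().nextInt());

    // Packs the fields as 4 + 3 + 2 + 3 bytes (field order is an assumption).
    static byte[] compose(int epochSeconds, int machineHash, int processId, int counter) {
        ByteBuffer buf = ByteBuffer.allocate(12); // big-endian by default
        buf.putInt(epochSeconds);                 // timestamp: 4 bytes
        put3(buf, machineHash);                   // machine: low 3 bytes of the int
        buf.putShort((short) processId);          // process: low 2 bytes
        put3(buf, counter);                       // counter: low 3 bytes of the int
        return buf.array();
    }

    // Writes only the low 3 bytes of v, most significant first.
    private static void put3(ByteBuffer buf, int v) {
        buf.put((byte) (v >> 16)).put((byte) (v >> 8)).put((byte) v);
    }

    // Builds the next identifier from the current time and the shared counter.
    static byte[] next(int machineHash, int processId) {
        int seconds = (int) (System.currentTimeMillis() / 1000L);
        return compose(seconds, machineHash, processId, COUNTER.getAndIncrement());
    }

    // Renders the 12 bytes as 24 lowercase hexadecimal characters.
    static String toHex(byte[] id) {
        StringBuilder sb = new StringBuilder(id.length * 2);
        for (byte b : id) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 0x1a2b3c and 4242 are placeholder machine and process values.
        System.out.println(toHex(next(0x1a2b3c, 4242)));
    }
}
```

Because the counter is shared and never reset, two identifiers created in the same second on the same machine and process differ only in their last 3 bytes.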
As can be seen, every field in MongoDB's design is more reasonable than in Hibernate's: the timestamp has second precision, the auto-increment sequence is longer, and the process identifier is shorter. The total length is also reduced to 12 bytes (96 bits).
4. Twitter's Snowflake ID generator
Snowflake is also an ID generator, exposed as a Thrift-based service. Rather than relying on a simple Redis-style auto-increment, it works much like UUID version 1, but it only has the 64 bits of a long to work with, so IdWorker allocates them tightly:
Timestamp (42 bits): the number of milliseconds since 2012 (rather than since 1970), which lasts for 139 years.
Auto-increment sequence (12 bits, 4096 values): incremented within a millisecond and reset to 0 once the millisecond has passed.
Datacenter ID (5 bits, 32 values): a configuration value, which supports multiple datacenters.
Worker ID (5 bits, 32 values): a configuration value. Because it identifies an ID generator instance rather than a client, up to 32 generators per datacenter is enough. It is also registered in ZK.
As can be seen, because IDs are issued centrally, the node identifier of at least 40 bits is no longer needed and is replaced by a 10-bit generator identifier. The entire ID therefore fits in a single long.
In addition, with this generator a client can fetch only one ID at a time and cannot fetch IDs in batches, so the extra latency is a problem, and the range is limited to 1024 machines.
It also shares the problems of the other solutions: the format is not customizable, and the resulting number has too many digits.
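The Snowflake bit layout described above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not Twitter's implementation: the class name SnowflakeSketch, the EPOCH constant (taken here as 2012-01-01 UTC to match the description), the compose helper and the field order inside the long are all chosen for the example, and ZK registration is left out.

```java
// Hypothetical Snowflake-style generator: timestamp | datacenter | worker | sequence.
class SnowflakeSketch {
    static final long EPOCH = 1325376000000L;  // assumed: 2012-01-01T00:00:00Z in ms
    static final int SEQUENCE_BITS = 12;       // 4096 values per millisecond
    static final int WORKER_BITS = 5;          // 32 workers per datacenter
    static final int DATACENTER_BITS = 5;      // 32 datacenters
    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;
    static final int WORKER_SHIFT = SEQUENCE_BITS;
    static final int DATACENTER_SHIFT = SEQUENCE_BITS + WORKER_BITS;
    static final int TIMESTAMP_SHIFT = SEQUENCE_BITS + WORKER_BITS + DATACENTER_BITS;

    private final long datacenterId;
    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long datacenterId, long workerId) {
        if (datacenterId < 0 || datacenterId >= (1L << DATACENTER_BITS)
                || workerId < 0 || workerId >= (1L << WORKER_BITS)) {
            throw new IllegalArgumentException("datacenterId and workerId must be in 0..31");
        }
        this.datacenterId = datacenterId;
        this.workerId = workerId;
    }

    // Pure bit packing, kept separate so the layout can be checked in isolation.
    static long compose(long timestampMs, long datacenterId, long workerId, long sequence) {
        return ((timestampMs - EPOCH) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted within this millisecond: spin until the next one.
                while ((now = System.currentTimeMillis()) <= lastTimestamp) {
                    // busy wait
                }
            }
        } else {
            sequence = 0L; // a new millisecond resets the sequence
        }
        lastTimestamp = now;
        return compose(now, datacenterId, workerId, sequence);
    }

    public static void main(String[] args) {
        SnowflakeSketch generator = new SnowflakeSketch(1, 2);
        System.out.println(generator.nextId());
    }
}
```

Note that nextId returns a single ID per call, which mirrors the limitation mentioned above: a client that needs many IDs pays one round trip for each of them.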