A distributed database also needs primary keys, so why not use a UUID directly as the primary key? Having been confused by this question myself, I will try to answer it.
1. UUID generation rate is low
Java's UUID relies on the SecureRandom.nextBytes method, and SecureRandom in turn relies on the random number source provided by the operating system.
On Linux its default source is /dev/random, which is a blocking source.
Worst of all, nextBytes is also a synchronized method, so when multiple threads call UUID.randomUUID the generation rate goes down instead of up.
Test result: on a 64-thread server, generating 10 million UUIDs with UUID.randomUUID took 130 s on average, which is fewer than 80,000 TPS.
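The throughput claim in point 1 can be checked with a small harness. The following is only a rough sketch, not the test the figures above came from: the class name UuidThroughput, the thread counts, and the total of 1,000,000 UUIDs are assumptions chosen for illustration, and the numbers it prints will vary with the JDK version, the operating system, and the hardware.

```java
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class UuidThroughput {
    // Generates roughly `total` random UUIDs split evenly across `threads`
    // workers and returns the elapsed wall-clock time in nanoseconds.
    static long timeGeneration(int threads, int total) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong sink = new AtomicLong(); // consumes results so the work is not optimized away
        int perThread = total / threads;
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                long acc = 0;
                for (int i = 0; i < perThread; i++) {
                    acc ^= UUID.randomUUID().getLeastSignificantBits();
                }
                sink.addAndGet(acc);
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int total = 1_000_000; // assumed sample size, smaller than the 10 million above
        for (int threads : new int[] {1, 4, 16}) {
            long nanos = timeGeneration(threads, total);
            double perSecond = total / (nanos / 1e9);
            System.out.printf("%d thread(s): %.0f UUIDs/s%n", threads, perSecond);
        }
    }
}
```

If the rate stays flat or drops as the thread count grows, that is consistent with the contention described above.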
2. A UUID primary key causes performance problems in InnoDB
A. In InnoDB the primary key index is also the clustered index. If rows are inserted in key order, the B+ tree leaf pages stay nearly full and the cache works well.
If the inserted keys are completely unordered, leaf pages split frequently and the cache becomes largely ineffective, which lowers TPS.
B. A UUID takes up more space: 16 bytes in binary form or 36 characters as text, versus 8 bytes for a BIGINT.
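To make the space argument in point B concrete, here is a minimal sketch that compares the sizes involved. The class name UuidSize and the helper toBytes are hypothetical names introduced for this example; the 16-byte packing corresponds to what a BINARY(16) column would hold, which is an assumption about how the UUID might be stored.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

class UuidSize {
    // Packs a UUID into its 16-byte binary form (two 64-bit halves).
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        System.out.println("text form:   " + id.toString().length() + " chars"); // 36 chars
        System.out.println("binary form: " + toBytes(id).length + " bytes");     // 16 bytes
        System.out.println("BIGINT key:  " + Long.BYTES + " bytes");             // 8 bytes
    }
}
```

Because InnoDB secondary indexes store the primary key value alongside each entry, the extra bytes are paid again in every secondary index, not only in the clustered index.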
3. A UUID carries no meaning at all. If the primary key is instead a globally auto-incrementing value, the key order is also the insertion order of the data.
Solutions:
1. Distributed global sequence generation (using ZK's DistributedAtomicLong: each client increments the counter by one step at a time, hands out the values within that step locally, and goes back to ZK for the next step once they run out)
2. Twitter's snowflake algorithm
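The snowflake idea from the second option can be sketched as follows. This is an illustrative version under stated assumptions, not Twitter's actual code: it assumes the commonly described layout of 41 bits of millisecond timestamp, 10 bits of worker id, and 12 bits of per-millisecond sequence, and the class name SnowflakeSketch and the EPOCH constant are choices made for this example.

```java
class SnowflakeSketch {
    // Custom epoch in milliseconds (assumed; any fixed instant in the past works).
    private static final long EPOCH = 1288834974657L;
    private static final int WORKER_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_WORKER = (1L << WORKER_BITS) - 1;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER) {
            throw new IllegalArgumentException("workerId out of range: " + workerId);
        }
        this.workerId = workerId;
    }

    // Returns an id that is unique per worker and increases over time.
    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastTimestamp) {
                    Thread.onSpinWait();
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    public static void main(String[] args) {
        SnowflakeSketch generator = new SnowflakeSketch(7);
        for (int i = 0; i < 3; i++) {
            System.out.println(generator.nextId());
        }
    }
}
```

The ids fit in a 64-bit BIGINT and are roughly time-ordered, so inserts land near the end of the clustered index without any coordination between workers beyond assigning each one a distinct worker id.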
Of course, an auto-increment sequence is not perfect either: under extreme concurrency, inserts contend with each other because they all land at the upper end of the primary key range, which creates a hotspot there. Overall, though, this is acceptable.
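The step-based allocation from the first solution can be sketched like this. To keep the example self-contained, an in-memory AtomicLong stands in for the shared ZK counter; a real deployment would replace it with a call to the distributed counter. The class name SegmentIdAllocator and the step size of 1000 are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongUnaryOperator;

class SegmentIdAllocator {
    // Adds `step` to the shared counter and returns the new value.
    private final LongUnaryOperator reserve;
    private final long step;
    private long next;  // next id to hand out locally
    private long limit; // exclusive upper bound of the current block

    SegmentIdAllocator(LongUnaryOperator reserve, long step) {
        this.reserve = reserve;
        this.step = step;
    }

    synchronized long nextId() {
        if (next >= limit) {
            // Local block used up: one round trip to the shared counter per `step` ids.
            limit = reserve.applyAsLong(step);
            next = limit - step;
        }
        return next++;
    }

    public static void main(String[] args) {
        AtomicLong shared = new AtomicLong(); // in-memory stand-in for the ZK counter
        SegmentIdAllocator a = new SegmentIdAllocator(shared::addAndGet, 1000);
        SegmentIdAllocator b = new SegmentIdAllocator(shared::addAndGet, 1000);
        System.out.println(a.nextId()); // 0
        System.out.println(b.nextId()); // 1000
        System.out.println(a.nextId()); // 1
    }
}
```

Each client touches the shared counter only once per block, which keeps the coordination cost low. The trade-off is that ids are increasing within a client but not strictly ordered across clients, and a client that crashes leaves a gap in the sequence.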
Source:
Why is the UUID not used as the primary key in a distributed database?