Tags: mycat database middleware MySQL record developmentDatabase routing Middleware Mycat-using chapter (2) Basic Concepts 3. Shard 3.1 Shard node (dataNode)
When a table is sliced horizontally, the database in which each shard table resides is a shard node. A shard node corresponds to a database (MySQL database). A shard node can only hold one shard per shard table, because a table with the same name is not allowed in the DB.3.2-node host (datahost)
The Shard node is placed on that host. MySQL instance in MySQL: One host can deploy multiple MySQL instances, and one MySQL instance can have multiple databases. In order to avoid the single node host concurrency limit, try to put the read-write high-pressure shard node (dataNode) balanced on different node hosts (datahost).Shard Rule
Determines how the records of a partitioned table are distributed across different shard nodes. There are a number of sharding rules, and we decide which partitioning scheme to use according to the business needs and the difficulty of development, maintenance and expansion. Mycat support our own development of our own shard rules, how to develop, we will talk about (the following rules are best not to copy, after reference and according to their own needs to develop their own sharding scheme):1. Hash modulo:
This is the most common one of the sharding scheme, according to the Shard field (usually the primary key, because the main key to find the number of scenes) of the hash value, the number of shards modulo operation, according to the results of the decision to log to which Shard.
The average number of shards is preferably 2 of the N-square, so that the calculation can be used and operation (x (2^n) =x& ( 2^n-1)).
Benefit: Record average distribution (unless the ID generator intentionally generates a modulo that is just the same number of IDs), the pressure is evenly distributed, the data is not skewed
Cons: Scaling up (adding shards) is a big problem, with the number of shards changing, it's almost impossible to migrate data
Maintain a corresponding table configuration file, as follows:
Beijing =0 Shanghai =1 Shenzhen =2 Guangzhou =2 default=312345
It means that the Shard field is from Beijing to Shard 0, Shanghai to Shard 1, Shenzhen and Guangzhou to Shard 2, and other to Shard 3.
If you find that the shards in Beijing need to be expanded one day, you can migrate the data in Beijing to a larger shard, then overload the configuration. Mycat support for online overload configuration
Benefits: More flexible expansion
Cons: Data is prone to tilt, expansion is not very flexible, and the Shard field is difficult to be a common query field (if the query field is not a shard field, is a full Shard search)
Also maintains a file as follows:
This means that the Shard field is in the range of 0~1000w to Shard 0 ...
Benefits: Ensure that each shard data stability, expansion is more convenient
Cons: You need to mate with the ID generator, otherwise in order the self-increment will have the problem of stress concentrating on one shard. At the same time, the MYCAT configuration and the ID generator configuration should be changed at the same time. In time to do data cleanup, ID Best can be reused, this rule can be very good application.
Combines a hash modulo with a range route.
After the hash modulo range in 0~15 the flow to Shard 1 ....
This can reduce the pressure of expansion to some extent
Sometimes we don't want all the contents of a field to be shards. We can take a portion of a field as a shard basis. Used with the ID generator.6. General Agreement:
Refer to some items of hundred X, they are the project start to build 64 libraries, 64 sheets per library. Assuming that each table has 1000w of data, it can withstand a total of 40.96 billion of the data ... From now on it is estimated that the project will be dead. &NBSP;
However, this gives us a way of thinking that we need to estimate the amount of future n years according to the project and allocate so many libraries at the beginning of the project. These libraries may be located on the same instance at the beginning of the project. When not enough, we migrate some of these libraries to other instances. &NBSP;
The average distribution pressure here requires a hash modulo, or we can get the ID generator to do the average mycat. &NBSP;
For example the following Id:&NBSP;
We use the first-previous number as the Shard ID directly, and we have allocated 64 libraries in the beginning to account for future business growth. The ID generator starts with a random average generation of 00~03, and then the business grows to a certain extent, and then it is generated in a range as much as needed.
is to extract a segment from the Shard field to do the Shard route, and then take another section to do the automatic hash shard. It also specifies that a scope is a shard rule and the other is another shard rule.
Beijing (a0000000~a9999999) =0,1,2,3,4 Beijing (B0000000) =5,6,7,8,9 Shanghai (00000000~10000000) =10,11 Shanghai =10,11,12,13,14,151234
It means that the starting point for the range of Beijing in a0000000~a9999999 is based on the hash value on the back of the 5 modulus averaged across the 0,1,2,3,4 shard node. The beginning of the range of Beijing above B0000000 is based on the subsequent hash value on the average of 5 modulo distribution on the 5,6,7,8,9 shard node. The starting point for the range of Shanghai in 00000000~10000000 is based on the hash value on the back of the 2 modulus evenly distributed on the 10,11 shard node, the remainder of the beginning is Shanghai, and the average distribution of 6 modulo is on 10,11,12,13,14,15.
This allows you to change the Shard rule at any time, without affecting access to the previous data, when you find that the Shard at the beginning is not enough.
This article is from the "Zhang Hashi Technology Blog" blog, make sure to keep this source http://zhxhash.blog.51cto.com/10645572/1743715
Mycat-Use article (2)