1. Beautiful blueprint
When I first came into contact with MongoDB, I saw the diagram of its auto-sharding feature. Combined with replica sets, it felt like a design that could unify everything.
In that diagram, the mongos routers can run on multiple nodes, the config servers can be deployed as master-slave or as a replica set, and each shard is itself a replica set of three machines: high availability plus unlimited scalability. It looks magnificent and spectacular.
However, when I actually wanted to use the auto-sharding feature, I found that this design does not hold up well in practice. Let me explain from my personal point of view.
Sharding means splitting data horizontally. The simplest example is the multi-node hash storage of a memcached cache. The selling point of MongoDB's auto-sharding is the "auto" (automatic sharding): you only need to specify a shard key, the field on which the data partitioning depends, and you can then store your data without caring about the number of nodes; the data is automatically distributed evenly to the back-end nodes. When a node is added, data is automatically migrated to balance the load of the whole system. Compared with our traditional database/table sharding, it is auto (automatic) while the traditional method is manual.
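For concreteness, here is a minimal sketch of what turning auto-sharding on looks like from the client side, using pymongo; the mongos host, the `mydb.users` namespace, and the `user_id` shard key are hypothetical, and the point is only that the operator specifies the key and the balancer does the rest.

```python
from pymongo import MongoClient

# Connect to a mongos router (hypothetical host), not to an individual shard.
client = MongoClient("mongodb://mongos-host:27017")

# Enable sharding for the database, then shard one collection on a chosen key.
client.admin.command("enableSharding", "mydb")
client.admin.command("shardCollection", "mydb.users", key={"user_id": 1})

# From here on, writes go through mongos and land on whichever shard owns the
# chunk covering that user_id; chunk splits and migrations happen automatically.
client.mydb.users.insert_one({"user_id": 42, "name": "alice"})
```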
2. Foursquare downtime
Described this way, there seems to be nothing to pick on, but reality is not that beautiful. From the analysis of Foursquare's 11-hour downtime event, we can see the following problems:
- Uneven data distribution under auto-sharding: in practice, data is not distributed evenly across machines, and the short-board effect this causes is intolerable (a toy simulation after this list illustrates how this happens with a poorly chosen shard key).
- High cost of data migration: when a new shard node is added, data is migrated automatically, which affects the service of the original online nodes. So we had better start this migration long before the storage bottleneck arrives, and at that point the advantage over manual sharding with a pre-estimated data volume is no longer obvious.
- Fragmentation caused by data migration: this is also part of Foursquare's failure experience. During migration, the data to be moved is generally not stored contiguously on disk (unless your shard key happens to be a natural insert timestamp). We know that MongoDB uses a pre-allocated disk space mechanism, so migrating data away may not reduce disk usage; instead it leaves the disk fragmented. Worse, because MongoDB uses mmap to accelerate data access, the holes on disk do not shrink the mapped size, but instead turn into memory fragmentation.
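As a toy illustration of the first point (this is only a sketch of range versus hash placement, not MongoDB's actual balancer; the chunk boundaries and key distribution are assumptions), consider what happens when the shard key increases monotonically, such as a timestamp:

```python
import hashlib
from collections import Counter

NUM_SHARDS = 3

def shard_by_range(key, boundaries):
    """Range partitioning: pick the first chunk whose upper bound exceeds the key."""
    for shard, upper in enumerate(boundaries):
        if key < upper:
            return shard
    return NUM_SHARDS - 1          # keys past the last boundary pile onto the last shard

def shard_by_hash(key):
    """Hash partitioning: spread keys uniformly regardless of insertion order."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Chunk boundaries chosen back when the first 30,000 documents were split evenly.
boundaries = [10_000, 20_000, 30_000]

# 60,000 more documents arrive with a monotonically increasing key (e.g. a timestamp).
new_keys = range(30_000, 90_000)

print("range sharding:", dict(Counter(shard_by_range(k, boundaries) for k in new_keys)))
print("hash sharding: ", dict(Counter(shard_by_hash(k) for k in new_keys)))
```

With a time-like key, every new write lands on the last shard, which then has to shed chunks through migration while also serving the hottest traffic; with a hashed key the writes spread out, but range queries then fan out to every shard.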
None of the problems above exist with manual sharding, where the storage capacity has been planned in advance.
3. Cold data
I think storage planning can be evaluated simply from two aspects: disk usage, that is, the total data volume, and memory usage, that is, the hot data volume. The proportion of hot data in all data usually becomes smaller and smaller, because access to old data declines. Consider the case where your application's user base and daily volume of new data are already quite stable, and we treat the data of the last three days as hot: the total data size keeps growing with time, while the hot data always stays at three days' worth. Even in the extreme case where your user base grows every day, the proportion of hot data in all data usually still keeps shrinking. This fact is hard to refute.
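A back-of-the-envelope calculation makes this concrete; the numbers below (10 GB of new data per day, a three-day hot window) are assumptions for illustration, not measurements:

```python
DAILY_NEW_GB = 10      # assumed constant daily ingest
HOT_WINDOW_DAYS = 3    # only the last three days count as hot data

for day in (3, 30, 180, 365):
    total_gb = day * DAILY_NEW_GB
    hot_gb = HOT_WINDOW_DAYS * DAILY_NEW_GB
    print(f"day {day:>3}: total = {total_gb:>5} GB, hot = {hot_gb} GB, "
          f"hot ratio = {hot_gb / total_gb:.1%}")
```

The hot ratio falls from 100% on day 3 to under 1% after a year, even though the amount of hot data itself never grows.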
Having said all that, let's look at the problems this causes:
- Data volume under auto-sharding: as mentioned above, an auto-sharding cluster expands its storage capacity by adding nodes, so the number of nodes is linearly proportional to the amount of data.
- Hot data ratio under auto-sharding: once the nodes are provisioned for auto-sharding, the memory-to-disk ratio of those machines is fixed, so the fraction of the cluster's data that can be held hot in memory is also fixed.
Set this against the earlier observation that the real proportion of hot data keeps shrinking, and we can conclude that the older nodes in an auto-sharding cluster waste a lot of memory and CPU resources.
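Continuing the same assumed numbers, and further assuming each auto-sharding node has 64 GB of RAM and 1 TB of disk (a fixed RAM-to-disk ratio of roughly 6%), the RAM that is not backing hot data grows with the cluster:

```python
RAM_PER_NODE_GB = 64
DISK_PER_NODE_GB = 1024     # fixed RAM:disk ratio of ~6% per node
DAILY_NEW_GB = 10
HOT_WINDOW_DAYS = 3

for day in (90, 365, 730):
    total_gb = day * DAILY_NEW_GB
    nodes = -(-total_gb // DISK_PER_NODE_GB)    # ceiling division: nodes needed to hold the data
    cluster_ram_gb = nodes * RAM_PER_NODE_GB
    hot_gb = HOT_WINDOW_DAYS * DAILY_NEW_GB
    print(f"day {day:>3}: nodes = {nodes}, cluster RAM = {cluster_ram_gb:>3} GB, "
          f"hot data = {hot_gb} GB, RAM not backing hot data = {cluster_ram_gb - hot_gb} GB")
```

Under these assumptions most of the cluster's RAM ends up caching data nobody reads, while every node still has to be a full, replica-set-backed machine.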
If we shard the data manually, we have full control over where data is stored: we can implement our own LRU or TTL mechanism, and move old, cold data off to cheaper storage so that it no longer occupies the high-performance machines. That way the company gets the most out of the expensive hardware it buys.
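A minimal sketch of what such a hand-rolled cold-data transfer could look like with pymongo; the host names, the `appdb.events` collection, the `created_at` field, and the three-day cutoff are all hypothetical, and a real job would need error handling and idempotent retries:

```python
from datetime import datetime, timedelta
from pymongo import MongoClient

# Hypothetical deployment: a "hot" node on fast hardware, an "archive" node on cheap hardware.
hot = MongoClient("mongodb://hot-node:27017").appdb.events
cold = MongoClient("mongodb://archive-node:27017").appdb.events

cutoff = datetime.utcnow() - timedelta(days=3)   # anything older than three days is cold

# Copy cold documents to the archive machine in batches, then delete them from the hot node.
batch = list(hot.find({"created_at": {"$lt": cutoff}}).limit(1000))
while batch:
    cold.insert_many(batch)
    hot.delete_many({"_id": {"$in": [doc["_id"] for doc in batch]}})
    batch = list(hot.find({"created_at": {"$lt": cutoff}}).limit(1000))
```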
All right, that's all for now. My understanding is not necessarily correct, and you are welcome to discuss it.