MongoDB Note 07: Sharding

Source: Internet
Author: User
Tags: mongodb, server, database, sharding

Sharding

1. Sharding is the process of splitting data and distributing it across different machines; the term partitioning is sometimes used for the same idea. By spreading data across machines, you can store more data and handle a heavier load without needing a powerful mainframe-class computer.

2. MongoDB supports automatic sharding (autosharding), which hides the database architecture from the application and simplifies system administration. To the application, it looks as if it is always talking to a single standalone MongoDB server. On the operations side, MongoDB automatically handles the distribution of data across shards and makes it easier to add and remove shards.

3. Replication is different from sharding: replication keeps an identical copy of the data on multiple servers, so each server is a mirror of the others, whereas each shard holds a different subset of the data from the other shards.

4. Routing server: to hide the details of the cluster architecture from the application, one or more routing processes called mongos run in front of the shards. The router maintains a "table of contents" that records which data each shard holds. The application only needs to connect to the router and can issue requests as it would against a standalone server.

5. Run sh.status() to see the state of the cluster: summary information about the shards, the databases, and the collections.

6. To shard a collection, first enable sharding for the collection's database by running: sh.enableSharding("test")

7. Shard key: the shard key is a key in the collection, and MongoDB splits the data according to it, for example username. Before sharding the collection, first create an index on the key that will become the shard key: db.users.ensureIndex({"username": 1}) (in current shells the helper is named createIndex).

8. Shard the collection: sh.shardCollection("test.users", {"username": 1})

9. The collection is split into multiple chunks, each of which is a subset of the collection's data. Chunks are arranged by shard key range: {"username": minValue} -->> {"username": maxValue} denotes the range covered by one chunk.

10. Queries that include the shard key can be sent directly to the target shard or to a subset of the shards; such queries are called targeted queries. Other queries must be sent to all shards; these are called scatter-gather queries: mongos scatters the query to all shards and then gathers the results from each shard.
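The difference between targeted and scatter-gather queries can be illustrated with a toy model of the router's "table of contents". This is only a sketch, not how mongos is implemented: the shard names, the split points "h" and "q", and the helper functions are made up for illustration, and only exact-match queries on the shard key are treated as targeted.

```javascript
// Toy model of the mongos "table of contents": each chunk covers a
// half-open shard-key range [min, max) and lives on exactly one shard.
// Shard names and split points are hypothetical.
const MIN_KEY = Symbol("minKey"); // stands in for the lowest possible key
const MAX_KEY = Symbol("maxKey"); // stands in for the highest possible key

const chunks = [
  { min: MIN_KEY, max: "h", shard: "shard0000" },
  { min: "h", max: "q", shard: "shard0001" },
  { min: "q", max: MAX_KEY, shard: "shard0002" },
];

// True when `value` falls inside the chunk's [min, max) range.
function chunkContains(chunk, value) {
  const aboveMin = chunk.min === MIN_KEY || value >= chunk.min;
  const belowMax = chunk.max === MAX_KEY || value < chunk.max;
  return aboveMin && belowMax;
}

// Return the list of shards a query has to visit. A query with an exact
// shard-key value is targeted at the one shard that owns that value;
// any other query is scattered to every shard.
function shardsForQuery(query, shardKey = "username") {
  if (typeof query[shardKey] === "string") {
    const owner = chunks.find((c) => chunkContains(c, query[shardKey]));
    return [owner.shard];
  }
  return [...new Set(chunks.map((c) => c.shard))];
}

console.log(shardsForQuery({ username: "alice" })); // targeted: one shard
console.log(shardsForQuery({ age: 30 }));           // scatter-gather: all shards
```

A query on username touches a single shard, while a query that omits the shard key has to be sent everywhere and its results merged.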

cluster.stop() shuts down the entire cluster.

Bson type

Configuring sharding:

1. When to shard: deciding when to shard is a balancing act. You generally should not shard too early, because sharding adds operational complexity to the deployment and forces design decisions that are hard to change later. You also should not wait too long to shard, because it is difficult to shard an overloaded system without downtime.

2. Purposes of sharding: increase available RAM, increase available disk space, reduce the load on a single server, and handle more throughput than a single mongod can sustain.

3. In general, you should create at least three shards.

4. Starting the servers:

1). Config servers: the config servers are the brain of the cluster. They hold the metadata about the cluster and its shards, that is, the information about which data each shard contains. The config servers must therefore be set up first, and because the data they hold is extremely important, you must enable journaling on them and make sure their data is stored on non-volatile drives. Each config server should be on a separate physical machine, preferably machines distributed across different geographic locations.

A. Start a config server: mongod --configsvr --dbpath /var/lib/mongodb -f /var/lib/config/mongod.conf. Three config servers need to be started, and all of them must be writable.

Why three config servers? Because we need to plan for failures. However, there is no need for many more than that, because confirming operations across the config servers is time-consuming. In addition, if any config server goes down, the cluster metadata becomes read-only.

The --configsvr option indicates that the mongod is to be used as a config server. It is not strictly required, because all it does is change the default listening port of mongod to 27019 and the default data directory to /data/configdb (you can use the --port and --dbpath options to override both settings). However, using --configsvr is recommended because it makes the purpose of these servers explicit.

In terms of size, 1 KB on a config server corresponds to roughly 200 MB of actual data, because the config servers hold only the table of contents for the real data. Because config servers do not need many resources, they can be deployed on machines that also run other programs.
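To get a feel for that ratio, here is a small back-of-the-envelope calculation. It assumes the 1 KB per 200 MB figure is a linear rule of thumb; the function name and the 1 TB input are illustrative only.

```javascript
// Rule of thumb from the text: 1 KB of config data per 200 MB of real data.
const DATA_MB_PER_CONFIG_KB = 200;

// Estimated config server footprint, in KB, for `dataMB` megabytes of data.
function estimateConfigKB(dataMB) {
  return dataMB / DATA_MB_PER_CONFIG_KB;
}

const oneTerabyteInMB = 1024 * 1024;
const configKB = estimateConfigKB(oneTerabyteInMB);
console.log(`${configKB} KB, about ${(configKB / 1024).toFixed(1)} MB`);
```

Under this estimate, a terabyte of data needs only about 5 MB of config metadata, which is why config servers can share hardware with other processes.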

2). mongos processes: once the three config servers are running, start a mongos process for the application to connect to. mongos needs to know the addresses of the config servers, so you must start it with the --configdb option:

mongos --configdb config-1:27019,config-2:27019,config-3:27019 -f /var/lib/mongos.conf

By default, mongos runs on port 27017. mongos itself does not store data; it loads the cluster metadata from the config servers at startup.

You can start any number of mongos processes. A common setup is one mongos process per application server, running on the same machine as the application server.

Every mongos process must use the same list of config servers, given in the same order.

3). Converting a replica set to a shard: there are two possibilities: you already have a replica set, or you are building the cluster from scratch. The following example assumes you already have a replica set. If you are starting from scratch, first initialize an empty replica set and then follow the same steps.

A. Tell mongos the replica set name and a list of its members: sh.addShard("spock/server-1:27017,server-2:27017,server-4:27017"). You do not have to list every member, because mongos can automatically detect members that are not included in the list.

B. Once the replica set has been added to the cluster as a shard, you can change the application's settings so that it connects to mongos instead of the replica set.

C. The replica set name, spock, is used as the shard name. If you later want to remove the shard or migrate data to it, you can use spock to identify it.

D. After the shard is configured, clients must be set up to send all requests to mongos rather than to the replica set. Also configure firewall rules to ensure that clients cannot send requests directly to the shard.

E. There is a --shardsvr option which, like the --configsvr option described earlier, has little practical effect (it only changes the default port to 27018), but using it is recommended in practice.

F. Creating a shard from a single mongod server (rather than a replica set) is not recommended, because converting a single-server shard to a replica set requires downtime.
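The string passed to sh.addShard() in step A has the form "setName/host1,host2,...". The sketch below builds it from its parts; the helper name replicaSetSeed is hypothetical, and the set name and hosts are the ones used in the example above.

```javascript
// Build the "setName/host1,host2,..." string used when adding a
// replica set as a shard. The helper name is made up for illustration.
function replicaSetSeed(setName, hosts) {
  if (!setName || hosts.length === 0) {
    throw new Error("need a replica set name and at least one member");
  }
  return `${setName}/${hosts.join(",")}`;
}

const seed = replicaSetSeed("spock", [
  "server-1:27017",
  "server-2:27017",
  "server-4:27017",
]);
console.log(seed);
// In a shell connected to mongos, the next step would be: sh.addShard(seed)
```

The part before the slash is the replica set name, which, as noted in step C, also becomes the shard name.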

4). Adding capacity: increase the cluster's capacity by adding more shards.

5). Sharding data: MongoDB does not distribute data automatically unless you explicitly tell it how. You must explicitly tell it which database and which collection to shard. Suppose we want to shard the artists collection in the music database by name.

sh.enableSharding("music") enables sharding for the database, which is a prerequisite for sharding any of its collections.

sh.shardCollection("music.artists", {"name": 1}) shards the collection on the name key. If the collection already exists, the name key must have an index on it; otherwise the command returns an error.

The shardCollection() command splits the collection into multiple chunks, which are the basic unit in which MongoDB migrates data. After the command succeeds, MongoDB distributes the collection's data evenly across the shards of the cluster.
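The order of these steps can be sketched as a small function. This is only an illustration: the function name shardCollectionByKey is made up, and sh and db stand for the globals a shell connected to mongos provides, passed in as arguments so the sequence is explicit. It uses createIndex on the assumption of a current shell; older shells used ensureIndex.

```javascript
// Sketch of the sharding steps in order. `sh` and `db` are the shell
// globals available when connected to mongos; the function name is
// hypothetical.
function shardCollectionByKey(sh, db, dbName, collName, key) {
  // 1. Sharding must be enabled on the database first.
  sh.enableSharding(dbName);
  // 2. An existing collection needs an index on the shard key.
  db.getSiblingDB(dbName).getCollection(collName).createIndex(key);
  // 3. Shard the collection on that key, using the "db.collection" namespace.
  return sh.shardCollection(`${dbName}.${collName}`, key);
}

// Usage inside a shell connected to mongos:
//   shardCollectionByKey(sh, db, "music", "artists", { name: 1 });
```

Skipping step 2 on a collection that already contains data is what produces the error mentioned above.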
