MongoDB Basics (IX): Sharding

Source: Internet
Author: User
Tags: mongodb, server


Sharding is a method of storing data across multiple servers. MongoDB uses sharding to support deployments with very large data sets and high-throughput operations. A single server is limited in every respect: CPU, I/O, RAM, storage space, and so on. To address the scaling problem, a database system has two options: vertical scaling and sharding.


Vertical scaling: adds CPU, RAM, storage, and other resources to a single server, which is still limited by the hardware. Some cloud-based providers also only allow users to provision smaller systems.


Sharding (horizontal scaling): divides the data set and distributes the data across multiple servers. Each shard is an independent database, and together the shards form a single logical database. (This is similar to striping on Windows dynamic disks.)


The structure of a sharded cluster in MongoDB is as follows:


A sharded cluster has three components: shards, query routers, and config servers.

Shards: store the data and provide high availability and data consistency. In a production sharded cluster, each shard is a replica set.

Query routers: mongos instances, which interface with client applications and direct operations to the appropriate shards. The query router processes and targets operations to the shards and then returns the results to the client. A sharded cluster can contain more than one query router to divide the client request load.

Config servers: store the cluster's metadata. This data contains a mapping of the cluster's data set to the shards. The query router uses this metadata to target operations to specific shards. A production sharded cluster requires 3 config servers.
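The role of this metadata can be pictured with a small JavaScript sketch. It is only an illustration under stated assumptions: the chunkMap table, its ranges, and the targetShard helper are hypothetical and are not MongoDB's internal data structures. Only the shard names mirror the ones used later in this article.

```javascript
// Hypothetical chunk-to-shard mapping, standing in for the metadata that
// the config servers store. Each chunk covers shard key values in [min, max).
const chunkMap = [
  { min: -Infinity, max: 1000, shard: "shard0000" },
  { min: 1000, max: 5000, shard: "shard0001" },
  { min: 5000, max: Infinity, shard: "shard0002" },
];

// What a query router conceptually does: find the chunk whose range
// contains the shard key value and send the operation to that shard.
function targetShard(keyValue) {
  const chunk = chunkMap.find((c) => keyValue >= c.min && keyValue < c.max);
  return chunk.shard;
}

console.log(targetShard(42));
console.log(targetShard(1000));
console.log(targetShard(99999));
```

Because the lookup depends only on the mapping, any number of routers can answer the same question independently, which is roughly why several mongos instances can share the client load.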


Note: For testing, a single config server is sufficient.


MongoDB distributes data, or shards, at the collection level. Sharding partitions a collection's data by the shard key. The shard key is either an indexed field or an indexed compound field that exists in every document in the collection. MongoDB divides the shard key values into chunks and distributes the chunks evenly across the shards. To divide the values into chunks, MongoDB uses either range-based partitioning or hash-based partitioning. (More reference: Shard Keys)
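The difference between the two partitioning methods can be sketched in a few lines of JavaScript. This is a simplified illustration, not MongoDB's implementation: the function names, the chunk width, the number of chunks, the use of MD5 as a stand-in hash, and the round-robin placement of chunks are all assumptions made for the example.

```javascript
const crypto = require("crypto");

const SHARDS = ["shard0000", "shard0001", "shard0002"];

// Range-based partitioning: consecutive key values land in the same chunk.
// chunkWidth is an arbitrary illustrative value.
function rangeChunk(key, chunkWidth = 1000) {
  return Math.floor(key / chunkWidth);
}

// Hash-based partitioning: the key is hashed first, so consecutive keys
// tend to scatter across chunks. MD5 is only a convenient stand-in hash.
function hashChunk(key, numChunks = 6) {
  const digest = crypto.createHash("md5").update(String(key)).digest();
  return digest.readUInt32BE(0) % numChunks;
}

// Chunks are then spread across the shards; round-robin is a simplification.
function shardForChunk(chunkIndex) {
  return SHARDS[chunkIndex % SHARDS.length];
}

for (const key of [1, 2, 3, 1001, 1002]) {
  console.log(key, "range chunk:", rangeChunk(key), "hash chunk:", hashChunk(key));
}
```

With range-based partitioning, a monotonically increasing key such as a counter sends all new inserts to the same chunk, while hashing spreads them out at the cost of efficient range queries.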


Sharded Cluster Deployment:

MongoDB servers: (Red Hat Enterprise Linux 6 64-bit + MongoDB 3.0.2)

192.168.1.11 mongodb11.kk.net 27017

192.168.1.12 mongodb12.kk.net 27018

192.168.1.13 mongodb13.kk.net 27019

192.168.1.14 mongodb14.kk.net 27020


The structure used for this test is as follows:


Note: Before configuring, make sure that all members joining the cluster can connect to one another.



"1. Configure Config Servers" (on the 192.168.1.14 server)

The config servers store the cluster's metadata, so configure them first. A config server is a mongod service started with the --configsvr option. If there is more than one config server, each one stores a complete copy of the cluster's metadata.

1.1. Create the database directory configdb:

# mkdir /var/lib/mongo/configdb
# chown mongod:mongod /var/lib/mongo/configdb/

1.2. Configure the startup parameters file:

# vi /etc/mongod.conf

192.168.1.14:

logpath=/var/log/mongodb/mongod.log
pidfilepath=/var/run/mongodb/mongod.pid
logappend=true
fork=true
port=27020
bind_ip=192.168.1.14
dbpath=/var/lib/mongo/configdb
configsvr=true


1.3. Restart the mongod service:

# service mongod restart


"2. Configure the Router" (on the 192.168.1.11 server)

2.1. Start a mongos (MongoDB Shard) instance and connect it to the config servers: (More reference: mongos)

# Use mongos to connect to the config servers and specify the local port; otherwise the default is 27017
# The mongod on this server already uses port 27017, so set the mongos port to 27016
# mongo --host

In a real environment, if more than one config server is configured, mongos can be given all of them at once:

mongos --configdb mongodb14.kk.net:27020,mongodb15.kk.net:27020,mongodb16.kk.net:27020 ...
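As a rough sketch of how the pieces of that command line fit together for the single config server used in this test, the JavaScript below assembles the arguments. --configdb and --port are mongos options; the mongosArgs helper is hypothetical, and the check for 1 or 3 config servers simply reflects the rule stated earlier in this article.

```javascript
// Build the mongos argument list from a list of config servers.
// The helper name is hypothetical; --configdb and --port are mongos options.
function mongosArgs(configServers, port) {
  if (configServers.length !== 1 && configServers.length !== 3) {
    throw new Error("use 1 config server for testing or 3 for production");
  }
  // The --configdb value is one comma-separated string with no spaces.
  return ["--configdb", configServers.join(","), "--port", String(port)];
}

const args = mongosArgs(["mongodb14.kk.net:27020"], 27016);
console.log(["mongos", ...args].join(" "));
// prints: mongos --configdb mongodb14.kk.net:27020 --port 27016
```

Joining the hosts without spaces matters because a space would split the list into separate shell arguments.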


"3. Add Shard Members to the Cluster" (add 192.168.1.11, 192.168.1.12, and 192.168.1.13 as shards, taking 192.168.1.11 as an example)

3.1. Configure the startup parameters file:

# vi /etc/mongod.conf

192.168.1.11:

logpath=/var/log/mongodb/mongod.log
pidfilepath=/var/run/mongodb/mongod.pid
logappend=true
fork=true
port=27017
bind_ip=192.168.1.11
dbpath=/var/lib/mongo
shardsvr=true

192.168.1.12:

logpath=/var/log/mongodb/mongod.log
pidfilepath=/var/run/mongodb/mongod.pid
logappend=true
fork=true
port=27018
bind_ip=192.168.1.12
dbpath=/var/lib/mongo
shardsvr=true

192.168.1.13:

logpath=/var/log/mongodb/mongod.log
pidfilepath=/var/run/mongodb/mongod.pid
logappend=true
fork=true
port=27019
bind_ip=192.168.1.13
dbpath=/var/lib/mongo
shardsvr=true


3.2. Restart the mongod service:

# service mongod restart

3.3. Add each shard member through the mongos instance (any existing user data on a member must be moved away or deleted before it is added):

# mongo 192.168.1.11:27016
mongos> sh.addShard("mongodb11.kk.net:27017")
mongos> sh.addShard("mongodb12.kk.net:27018")
mongos> sh.addShard("mongodb13.kk.net:27019")

3.4. Done! Connect and view the cluster information stored in the config database:

configsvr> show dbs
configsvr> use config
configsvr> show collections
configsvr>
configsvr> db.mongos.find()
{ "_id" : "mongodb11.kk.net:27016", "ping" : ISODate("2015-05-23T11:16:47.624Z"), "up" : 1221, "waiting" : true, "mongoVersion" : "3.0.2" }
configsvr>
configsvr> db.shards.find()
{ "_id" : "shard0000", "host" : "mongodb11.kk.net:27017" }
{ "_id" : "shard0001", "host" : "mongodb12.kk.net:27018" }
{ "_id" : "shard0002", "host" : "mongodb13.kk.net:27019" }
configsvr>
configsvr> db.databases.find()
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "mydb", "partitioned" : false, "primary" : "shard0000" }
{ "_id" : "test", "partitioned" : false, "primary" : "shard0000" }


"4. Enable Sharding for a Database"

4.1. At this point you can connect to mongos and check the sharding status of a database or collection (nothing is sharded yet):

mongos> db.stats()
mongos> db.tab.stats()

4.2. Enable sharding on the database:

# mongo 192.168.1.11:27016
mongos> sh.enableSharding("test")

# or

# mongo 192.168.1.11:27016
mongos> use admin
mongos> db.runCommand({ enableSharding: "test" })

4.3. View the databases again; partitioned has now become true:

configsvr> use config
switched to db config
configsvr> db.databases.find()
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "mydb", "partitioned" : true, "primary" : "shard0000" }
{ "_id" : "test", "partitioned" : true, "primary" : "shard0000" }


Enabling sharding on a database does not distribute any data by itself; sharding must also be enabled on the collection.

"5. Enable Sharding for a Collection"

Before enabling it, there are a few things to consider:

1. Which field to use as the shard key. (More reference: Considerations for Selecting Shard Keys)

2. If the collection already contains data, the field chosen as the shard key must already be indexed; if the collection is empty, MongoDB creates the index when the collection is sharded (sh.shardCollection).

3. The function for sharding a collection is sh.shardCollection:


sh.shardCollection("<database>.<collection>", shard-key-pattern)


mongos> sh.shardCollection("test.tab", { "_id": "hashed" })


Test:

for (var i = 1; i < 100000; i++) {
    db.kk.insert({ "id": i, "myName": "KK" + i, "mydate": new Date() });
}

mongos> show collections
mongos> db.kk.find()
mongos> db.kk.createIndex({ "id": "hashed" })
mongos> db.kk.getIndexes()
mongos> sh.shardCollection("test.kk", { "id": "hashed" })
mongos> db.stats()
mongos> db.kk.stats()


Because distributing the data takes time, wait a while and then view the data distribution again:

Total number of documents: 99999

mongos> db.kk.count()
99999

mongos> db.printShardingStatus();
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("556023c02c2ebfdfbc8d39eb")
  }
  shards:
    { "_id" : "shard0000", "host" : "mongodb11.kk.net:27017" }
    { "_id" : "shard0001", "host" : "mongodb12.kk.net:27018" }
    { "_id" : "shard0002", "host" : "mongodb13.kk.net:27019" }
  balancer:
    Currently enabled: yes
    Currently running: no
    Failed balancer rounds in last 5 attempts: 0
    Migration Results for the last 24 hours:
      1334 : Success
      2 : Failed with error 'could not acquire collection lock for test.kk to migrate chunk [{ : MinKey },{ : MaxKey }] :: caused by :: Lock for migrating chunk [{ : MinKey }, { : MaxKey }] in test.kk is taken.', from shard0000 to shard0001
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "mydb", "partitioned" : true, "primary" : "shard0000" }
    { "_id" : "test", "partitioned" : true, "primary" : "shard0000" }
      test.kk
        shard key: { "id" : "hashed" }
        chunks:
          shard0000 667
          shard0001 667
          shard0002 667
        too many chunks to print, use verbose if you want to force print
    { "_id" : "events", "partitioned" : false, "primary" : "shard0002" }
mongos>


Look at the chunks here:
shard0000 667
shard0001 667
shard0002 667

Initially shard0000 held the most chunks, while shard0001 and shard0002 held 0. Eventually the distribution stabilizes and no longer changes.
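How the chunk counts converge can be imitated with a toy simulation. This is a deliberately simplified sketch and not MongoDB's balancer algorithm: the balanceOnce helper and its threshold are assumptions, and the real balancer uses its own migration thresholds and moves chunks in rounds. The starting point of 2001 chunks is 667 × 3, the total of the three counts shown above.

```javascript
// Start from the situation described above: every chunk on shard0000.
const chunks = { shard0000: 2001, shard0001: 0, shard0002: 0 };

// Simplified balancing step: move one chunk from the fullest shard to the
// emptiest one. Returns false once the gap is below the threshold.
function balanceOnce(counts, threshold = 2) {
  const names = Object.keys(counts);
  const most = names.reduce((a, b) => (counts[b] > counts[a] ? b : a));
  const least = names.reduce((a, b) => (counts[b] < counts[a] ? b : a));
  if (counts[most] - counts[least] < threshold) return false;
  counts[most] -= 1;
  counts[least] += 1;
  return true;
}

let migrations = 0;
while (balanceOnce(chunks)) migrations += 1;

console.log(chunks, "after", migrations, "migrations");
```

In this toy model the loop ends with 667 chunks on each shard after 1334 moves, which happens to equal the number of successful migrations reported in the status output above.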

mongos> db.kk.stats()
{
  "sharded" : true,
  "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
  "userFlags" : 1,
  "capped" : false,
  "ns" : "test.kk",
  "count" : 99999,
  "numExtents" : 19,
  "size" : 11199888,
  "storageSize" : 44871680,
  "totalIndexSize" : 10416224,
  "indexSizes" : { "_id_" : 4750256, "id_hashed" : 5665968 },
  "avgObjSize" : 112,
  "nindexes" : 2,
  "nchunks" : 2001,
  "shards" : {
    "shard0000" : {
      "ns" : "test.kk",
      "count" : 33500,
      "size" : 3752000,
      "avgObjSize" : 112,
      "numExtents" : 7,
      "storageSize" : 22507520,
      "lastExtentSize" : 11325440,
      "paddingFactor" : 1,
      "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
      "userFlags" : 1,
      "capped" : false,
      "nindexes" : 2,
      "totalIndexSize" : 3605616,
      "indexSizes" : { "_id_" : 1913184, "id_hashed" : 1692432 },
      "ok" : 1
    },
    "shard0001" : {
      "ns" : "test.kk",
      "count" : 32852,
      "size" : 3679424,
      "avgObjSize" : 112,
      "numExtents" : 6,
      "storageSize" : 11182080,
      "lastExtentSize" : 8388608,
      "paddingFactor" : 1,
      "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
      "userFlags" : 1,
      "capped" : false,
      "nindexes" : 2,
      "totalIndexSize" : 3343984,
      "indexSizes" : { "_id_" : 1389920, "id_hashed" : 1954064 },
      "ok" : 1
    },
    "shard0002" : {
      "ns" : "test.kk",
      "count" : 33647,
      "size" : 3768464,
      "avgObjSize" : 112,
      "numExtents" : 6,
      "storageSize" : 11182080,
      "lastExtentSize" : 8388608,
      "paddingFactor" : 1,
      "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
      "userFlags" : 1,
      "capped" : false,
      "nindexes" : 2,
      "totalIndexSize" : 3466624,
      "indexSizes" : { "_id_" : 1447152, "id_hashed" : 2019472 },
      "ok" : 1
    }
  },
  "ok" : 1
}
mongos>


The output above shows how the data is distributed across the shards:

"shard0000" "count": 33500

"shard0001" "count": 32852

"shard0002" "count": 33647


That is 99999 documents in total, which matches exactly, and the data is distributed very evenly.
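The arithmetic behind that claim can be checked with a short JavaScript sketch. The counts are copied from the db.kk.stats() output above; the variable names and the "deviation from an even split" measure are just one illustrative way to quantify how even the distribution is.

```javascript
// Per-shard document counts taken from the db.kk.stats() output above.
const counts = { shard0000: 33500, shard0001: 32852, shard0002: 33647 };

const total = Object.values(counts).reduce((sum, n) => sum + n, 0);
console.log("total:", total); // 99999

// Share of the documents held by each shard, as a percentage.
for (const [shard, n] of Object.entries(counts)) {
  console.log(shard, ((n / total) * 100).toFixed(2) + "%");
}

// Largest deviation from a perfectly even split of total / 3 documents.
const ideal = total / Object.keys(counts).length;
const maxDeviation = Math.max(
  ...Object.values(counts).map((n) => Math.abs(n - ideal))
);
console.log("max deviation from even split:", maxDeviation); // 481
```

A worst-case deviation of 481 documents out of an ideal 33333 per shard is under 1.5%, which supports the observation that the hashed shard key spreads the data evenly.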

(Use as much test data as possible, otherwise you will not see the effect. I started testing with fewer than 1000 documents and saw no effect, thought something was wrong, and wasted another 2 hours on it!)



Reference: Sharding Introduction

(The steps in the official documentation are not very clear, and I struggled with them for a long time. There are also some blog posts online, but they are only the authors' own summaries; for a newcomer, they do not explain in detail where each operation should be run or what it does.)



