Configure MongoDB cluster shards

Source: Internet
Author: User
Tags: mongodb, sharding

Reprinted from http://my.oschina.net/zhzhenqin/blog/97268

Many tutorials on the internet cover MongoDB sharding configuration, but most have never been verified in practice and are copied around carelessly. On top of that, the configuration differs between MongoDB versions, which leaves readers confused.

I recently set up MongoDB sharding myself, so I am posting my configuration here, along with the issues that need attention. Corrections are welcome if anything is wrong, and I hope later readers can avoid these pitfalls.

In a production environment, be sure to back up your data. For sharding combined with replication, see the Alibaba tutorial: http://www.taobaodba.com/html/525_525.html

The cluster I configured is for testing only and does not use replication; it is just a simple test of storing data across shards. Testing sharding plus replication across multiple machines is difficult, and many examples on the internet are wrong, so this cost me a lot of time.

The structure after configuration is complete is as follows (the original diagram is not reproduced here):

My Mongo version is mongodb-linux-x86_64-2.0.8, running on ordinary PCs.

Here I would like to note:

1. For large data sets, use 64-bit machines. A 32-bit machine cannot create a single file larger than 2 GB. For small data sets it does not matter.

2. Every machine in the cluster should run the same kind of system. Do not mix 32-bit and 64-bit machines.

-- This is what I did when I first tested: two 64-bit machines and two 32-bit machines. Once about a million documents had been inserted and the index grew past 3-4 GB, the 32-bit machines errored out, because a single file larger than 2 GB cannot be created on a 32-bit system, and the whole cluster was paralyzed. It took me a long time to find the cause.

Let's get started!

The unzipped Mongo directory structure is as follows; the installation directory is referred to as ${mongo_install}.

Run the following commands in ${mongo_install} on the first computer:

mkdir -p /data/shard11
bin/mongod --shardsvr --port 27017 --dbpath /data/shard11 --logpath /data/shard11.log --fork

Run the following commands on the other machine:

mkdir -p /data/shard12
bin/mongod --shardsvr --port 27017 --dbpath /data/shard12 --logpath /data/shard12.log --fork

Normally, these two nodes will start. If startup fails, the usual cause is that mkdir -p /data/shard11 did not actually create the directory.

On Ubuntu, you need root permission:

sudo mkdir /data/shard11
sudo chmod -R 777 /data/shard11

Then start again. Run the following command in a terminal to check whether the mongod process started successfully:

ps -ef | grep mongod

If the process did not start, go back over the steps above until you find the cause.

OK. We have now started one mongod instance on each of the two computers; mongod is the process that actually stores the data. The cluster also needs a config server, which stores the configuration information shared by the nodes and the metadata about the data, as shown in the figure above.

The config server does not consume many resources, so we can start it on either machine. The shell commands are as follows:

# The config server also stores a small amount of data, so do not forget to create a data folder for it.
mkdir /data/config
bin/mongod --configsvr --dbpath /data/config --port 20000 --logpath /data/config.log --fork

You may have noticed that --shardsvr was added to the startup parameters of the two shard servers and --configsvr to those of the config instance; this is how Mongo distinguishes the roles. Likewise, a replicated setup would add --replSet setname, where setname is the name of the replica set.
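As a sketch of that replicated variant (the set name shard1 and the host addresses are illustrative, not part of my test setup):

```shell
# Start each member of the shard with a replica-set name (assumed name: shard1)
bin/mongod --shardsvr --replSet shard1 --port 27017 --dbpath /data/shard11 --logpath /data/shard11.log --fork

# Later, from the mongos admin shell, the whole set is added as one shard
# using the setname/host,host seed-list syntax:
#   db.runCommand({ "addshard" : "shard1/192.168.1.23:27017,192.168.1.22:27017" })
```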

Once all of the above have started successfully, we can bring up the mongos service. Run the following on any machine:

# The mongos process does not need a dbpath, but a logpath is required.
# The chunkSize startup parameter specifies the chunk size in MB (the default is 64 MB).
bin/mongos --configdb ip:20000 --port 30000 --chunkSize 512 --logpath /data/mongos.log --fork

Note that the ip above must be the IP address of the machine on which you started the config server, together with its port.
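For production use, Mongo recommends running three config servers rather than one; mongos then takes all of them as a comma-separated list (the IP addresses below are illustrative):

```shell
# Three config servers, comma-separated without spaces; every mongos
# process must list them in the same order.
bin/mongos --configdb 192.168.1.21:20000,192.168.1.22:20000,192.168.1.23:20000 --port 30000 --chunkSize 512 --logpath /data/mongos.log --fork
```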

If everything went well, the mongos process should start without trouble. Check it with: ps -ef | grep mongos

Only the configuration is left: we have to tell the mongos process which machines to add as shards. Run the following on any machine (here, ip is the IP address of the machine running the mongos service):

bin/mongo ip:30000/admin

Note the admin at the end: you must connect to the admin database. After connecting successfully, you can add the shards to the cluster:

db.runCommand({"addshard":"192.168.1.23:27017"})
db.runCommand({"addshard":"192.168.1.22:27017"})

The two IP addresses above are those of the shard servers started earlier, not of the config server.
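To double-check which shards the cluster now knows about, you can run the listshards command against the admin database (shown here non-interactively via --eval; ip is the mongos host as above):

```shell
# Ask mongos for the current shard list; this must run against admin.
bin/mongo ip:30000/admin --eval 'printjson(db.runCommand({ listshards : 1 }))'
```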

If it worked, you should see { "ok" : 1 } in the response. You have now successfully added two shards. Next, enable sharding on the database:

db.runCommand({"enablesharding":"dbname"})

Then set the sharding rule for the collection by specifying a shard key:

db.runCommand({"shardcollection":"dbname.tablename","key":{"primaryKey":1}})

OK. At this point you should have a working sharded configuration.

Now insert some data into dbname.tablename, then test the cluster:

db.printShardingStatus()

You should see output along these lines: the shards section lists the two machines, and the databases section shows your dbname with partitioned: true.

Now you can try out your own sharded cluster.
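As a quick way to generate test data and watch it spread across the shards, something like the following should work (dbname, tablename, and primaryKey match the examples above; the document contents are made up):

```shell
# Insert 100,000 small documents through mongos...
bin/mongo ip:30000/dbname --eval 'for (var i = 0; i < 100000; i++) { db.tablename.insert({ primaryKey : i, value : "test-" + i }); }'

# ...then inspect the collection; on a sharded collection, stats()
# reports "sharded" : true and a per-shard breakdown of the documents.
bin/mongo ip:30000/dbname --eval 'printjson(db.tablename.stats())'
```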

If your machines are 64-bit, you can use my configuration as-is. If you use 32-bit machines, note that when starting the shards you should add:

--journal

This is because 64-bit builds enable journaling by default and 32-bit builds do not. Journaling writes a write-ahead log that lets mongod recover after a crash; readers can look up the details.
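So on a 32-bit machine, the shard start command from earlier would become (same paths as before):

```shell
# Explicitly enable journaling, which 32-bit builds leave off by default.
bin/mongod --shardsvr --journal --port 27017 --dbpath /data/shard11 --logpath /data/shard11.log --fork
```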

Because I started testing on a mix of 32-bit and 64-bit machines, I ran into all of these annoying problems. I hope others will not have to repeat them.
