Configure MongoDB cluster shards

Source: Internet
Author: User
Tags: mongodb, sharding

Reprinted from http://my.oschina.net/zhzhenqin/blog/97268

Many tutorials on the internet cover MongoDB sharding configuration, but most have never been verified in practice and are copied around carelessly. On top of that, the configuration differs between MongoDB versions, which leaves readers confused.

I recently set up MongoDB sharding myself, so I am posting my configuration here, along with the issues that need attention. Corrections are welcome if anything is wrong, and I hope later readers can avoid these pitfalls.

In a production environment, be sure to back up your data. For sharding combined with replication, see the Alibaba tutorial: http://www.taobaodba.com/html/525_525.html

The cluster I configured is for testing only and does not use replication; it is just a simple test of storing data across shards. Testing sharding plus replication across multiple machines is difficult, and many examples on the internet are wrong, so this cost me a lot of time.

The structure after configuration is complete is as follows (the original diagram is not reproduced here):

My Mongo version is mongodb-linux-x86_64-2.0.8, running on ordinary PCs.

Here I would like to note:

1. For large data sets, use 64-bit machines. A 32-bit machine cannot create a single file larger than 2 GB. For small data sets it does not matter.

2. Every machine in the cluster should run the same kind of system. Do not mix 32-bit and 64-bit machines.

-- This is what I did when I first tested: two 64-bit machines and two 32-bit machines. Once about a million documents had been inserted and the index grew past 3-4 GB, the 32-bit machines errored out, because a single file larger than 2 GB cannot be created on a 32-bit system, and the whole cluster was paralyzed. It took me a long time to find the cause.

Let's get started!

The unzipped Mongo directory structure is as follows; the installation directory is referred to as ${mongo_install}.

Run the following commands in ${mongo_install} on the first computer:

mkdir -p /data/shard11
bin/mongod --shardsvr --port 27017 --dbpath /data/shard11 --logpath /data/shard11.log --fork

Run the following commands on the other machine:

mkdir -p /data/shard12
bin/mongod --shardsvr --port 27017 --dbpath /data/shard12 --logpath /data/shard12.log --fork

Normally, these two nodes will start. If startup fails, the usual cause is that mkdir -p /data/shard11 did not actually create the directory.

On Ubuntu, you need root permission:

sudo mkdir /data/shard11
sudo chmod -R 777 /data/shard11

Then start again. Run the following command in a terminal to check whether the mongod process started successfully:

ps -ef | grep mongod

If the process did not start, go back over the steps above until you find the cause.

OK. We have now started one mongod instance on each of the two computers; mongod is the process that actually stores the data. The cluster also needs a config server, which stores the configuration information shared by the nodes and the metadata about the data, as shown in the figure above.

The config server does not consume many resources, so we can start it on either machine. The shell commands are as follows:

# The config server also stores a small amount of data, so do not forget to create a data folder for it.
mkdir /data/config
bin/mongod --configsvr --dbpath /data/config --port 20000 --logpath /data/config.log --fork

You may have noticed that --shardsvr was added to the startup parameters of the two shard servers and --configsvr to those of the config instance; this is how Mongo distinguishes the roles. Likewise, a replicated setup would add --replSet setname, where setname is the name of the replica set.
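As a sketch of that replicated variant (the set name shard1 and the host addresses are illustrative, not part of my test setup):

```shell
# Start each member of the shard with a replica-set name (assumed name: shard1)
bin/mongod --shardsvr --replSet shard1 --port 27017 --dbpath /data/shard11 --logpath /data/shard11.log --fork

# Later, from the mongos admin shell, the whole set is added as one shard
# using the setname/host,host seed-list syntax:
#   db.runCommand({ "addshard" : "shard1/192.168.1.23:27017,192.168.1.22:27017" })
```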

Once all of the above have started successfully, we can bring up the mongos service. Run the following on any machine:

# The mongos process does not need a dbpath, but a logpath is required.
# The chunkSize startup parameter specifies the chunk size in MB (the default is 64 MB).
bin/mongos --configdb ip:20000 --port 30000 --chunkSize 512 --logpath /data/mongos.log --fork

Note that the ip above must be the IP address of the machine on which you started the config server, together with its port.
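For production use, Mongo recommends running three config servers rather than one; mongos then takes all of them as a comma-separated list (the IP addresses below are illustrative):

```shell
# Three config servers, comma-separated without spaces; every mongos
# process must list them in the same order.
bin/mongos --configdb 192.168.1.21:20000,192.168.1.22:20000,192.168.1.23:20000 --port 30000 --chunkSize 512 --logpath /data/mongos.log --fork
```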

If everything went well, the mongos process should start without trouble. Check it with: ps -ef | grep mongos

Only the configuration is left: we have to tell the mongos process which machines to add as shards. Run the following on any machine (here, ip is the IP address of the machine running the mongos service):

bin/mongo ip:30000/admin

Note the admin at the end: you must connect to the admin database. After connecting successfully, you can add the shards to the cluster:

db.runCommand({"addshard":"192.168.1.23:27017"})
db.runCommand({"addshard":"192.168.1.22:27017"})

The two IP addresses above are those of the shard servers started earlier, not of the config server.
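To double-check which shards the cluster now knows about, you can run the listshards command against the admin database (shown here non-interactively via --eval; ip is the mongos host as above):

```shell
# Ask mongos for the current shard list; this must run against admin.
bin/mongo ip:30000/admin --eval 'printjson(db.runCommand({ listshards : 1 }))'
```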

If it worked, you should see { "ok" : 1 } in the response. You have now successfully added two shards. Next, enable sharding on the database:

db.runCommand({"enablesharding":"dbname"})

Then set the sharding rule for the collection by specifying a shard key:

db.runCommand({"shardcollection":"dbname.tablename","key":{"primaryKey":1}})

OK. At this point you should have a working sharded configuration.

Now insert some data into dbname.tablename, then test the cluster:

db.printShardingStatus()

You should see output along these lines: the shards section lists the two machines, and the databases section shows your dbname with partitioned: true.

Now you can try out your own sharded cluster.
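As a quick way to generate test data and watch it spread across the shards, something like the following should work (dbname, tablename, and primaryKey match the examples above; the document contents are made up):

```shell
# Insert 100,000 small documents through mongos...
bin/mongo ip:30000/dbname --eval 'for (var i = 0; i < 100000; i++) { db.tablename.insert({ primaryKey : i, value : "test-" + i }); }'

# ...then inspect the collection; on a sharded collection, stats()
# reports "sharded" : true and a per-shard breakdown of the documents.
bin/mongo ip:30000/dbname --eval 'printjson(db.tablename.stats())'
```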

If your machines are 64-bit, you can use my configuration as-is. If you use 32-bit machines, note that when starting the shards you should add:

--journal

This is because 64-bit builds enable journaling by default and 32-bit builds do not. Journaling writes a write-ahead log that lets mongod recover after a crash; readers can look up the details.
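So on a 32-bit machine, the shard start command from earlier would become (same paths as before):

```shell
# Explicitly enable journaling, which 32-bit builds leave off by default.
bin/mongod --shardsvr --journal --port 27017 --dbpath /data/shard11 --logpath /data/shard11.log --fork
```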

Because I started testing on a mix of 32-bit and 64-bit machines, I ran into all of these annoying problems. I hope others will not have to repeat them.
