Principles, Setup, and Use of MongoDB Sharding

Source: Internet
Author: User
Tags: mongodb

I. Concepts:

Sharding is the process of splitting a database's data and distributing it across different machines. Spreading the data out this way lets you store more data and handle heavier loads without needing a single powerful server. The basic idea is to cut a collection into small chunks and scatter them across several shards, each of which is responsible for only a portion of the total data; a balancer then evens out the shards by migrating data between them. Operations go through a routing process called mongos, which knows the correspondence between the data and the shards (through the config servers). Most deployments use sharding to solve disk space problems. Write performance may actually get worse, and queries should avoid crossing shards wherever possible. When to use sharding:

1. The machine's disk is no longer large enough. Sharding solves the disk space problem.
2. A single mongod can no longer meet the write performance requirements. Sharding spreads the write load across the shards, using each shard server's own resources.
3. You want to keep a large amount of data in memory to improve performance. As above, sharding makes use of each shard server's own resources.
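
The mapping between data and shards can be pictured with a small simulation. The following Python sketch is not MongoDB code: the `ChunkTable` class, the split points, and the shard names are all hypothetical, chosen only to illustrate how a router could map a shard key value to the shard that owns its range.

```python
from bisect import bisect_right


class ChunkTable:
    """Toy routing table: sorted split points divide the key space into chunks."""

    def __init__(self, split_points, owners):
        # N split points produce N + 1 chunks, so one owner per chunk is needed.
        if len(owners) != len(split_points) + 1:
            raise ValueError("need exactly one owner per chunk")
        self.split_points = sorted(split_points)
        self.owners = list(owners)

    def shard_for(self, key):
        """Return the shard owning the chunk whose range contains key."""
        # bisect_right counts the split points <= key, which is the chunk index.
        return self.owners[bisect_right(self.split_points, key)]

    def shards_for_range(self, low, high):
        """Return the shards a query on low <= key <= high has to visit."""
        first = bisect_right(self.split_points, low)
        last = bisect_right(self.split_points, high)
        return sorted(set(self.owners[first:last + 1]))


# Hypothetical layout: four chunks spread over three shards.
table = ChunkTable([100, 200, 300], ["shard1", "shard2", "shard3", "shard1"])
print(table.shard_for(150))             # a single-key lookup touches one shard
print(table.shards_for_range(90, 210))  # a range query may touch several
```

In this toy model a lookup on one key value is answered by a single shard, while a range that spans several chunks has to be sent to every shard owning one of them, which is why cross-shard queries are worth avoiding.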

II. Deployment and installation: this assumes MongoDB is already installed (version 3.0 was used for testing in this article)

Before building a sharded cluster, understand what each role in the cluster does.

① Config server. A separate mongod process that holds the metadata for the cluster and its shards, that is, information about which data each shard contains. It is set up first, with journaling enabled. Start a config server like an ordinary mongod, with the configsvr option specified. It does not need much space or many resources: 1 KB of config server space corresponds to roughly 200 MB of real data, because only the distribution table of the data is stored. When the config servers are unavailable, the metadata becomes read-only, and chunks can be neither split nor migrated.
② Routing server. This is mongos, which provides routing and is what applications connect to. It stores no data itself and loads the cluster information from the config servers at startup. A mongos process needs to know the addresses of the config servers, which are specified with the configdb option.
③ Shard server. An ordinary, independent mongod process that holds the actual data. It can be a replica set or a standalone server.
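
To get a feel for how little space the config servers need, the ratio quoted above (about 1 KB of metadata per 200 MB of data) can be turned into a quick estimate. This is only back-of-the-envelope arithmetic based on that ratio; the function name is hypothetical and real metadata size depends on chunk size and chunk count.

```python
# Ratio quoted in the text: about 1 KB of config metadata per 200 MB of data.
DATA_MB_PER_METADATA_KB = 200


def estimate_metadata_kb(data_mb):
    """Estimate config-server metadata size in KB for data_mb of sharded data."""
    if data_mb < 0:
        raise ValueError("data size cannot be negative")
    return data_mb / DATA_MB_PER_METADATA_KB


# Example: 1 TB of sharded data, expressed in MB.
one_tb_in_mb = 1024 * 1024
print(f"{estimate_metadata_kb(one_tb_in_mb) / 1024:.2f} MB of metadata")
```

Under that ratio, even a terabyte of sharded data needs only a few megabytes on the config servers.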

Deployment environment: 3 machines

A: config servers (3), router 1, shard 1;

B: shard 2, router 2;

C: shard 3

Before deploying, it is important to understand the shard key; a good shard key is critical to a sharded cluster. The shard key must be indexed, and the data is split and distributed according to it. sh.shardCollection creates the index automatically when the collection is empty. An auto-incrementing shard key is a poor choice for writes and for distributing data evenly, because writes with such a key always land on a single shard and only move on to another shard after a threshold is reached. Queries by shard key, however, are very efficient with it. A random shard key distributes the data evenly. Take care to avoid queries that hit multiple shards wherever possible. When a query has to run on all shards, mongos merges and sorts the results.
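
The difference between an auto-incrementing key and a random-looking key can be seen in a small simulation. The sketch below is a simplification and not how MongoDB places chunks: the shard names, the split points, and the use of an MD5 digest modulo the shard count are assumptions made only to show the effect on write distribution.

```python
import hashlib
from bisect import bisect_right
from collections import Counter

SHARDS = ["shard1", "shard2", "shard3"]  # hypothetical shard names


def range_owner(key, split_points):
    """Range partitioning: the chunk index is the number of split points <= key."""
    return SHARDS[bisect_right(split_points, key) % len(SHARDS)]


def hashed_owner(key):
    """Hash partitioning: a stable hash of the key picks the shard."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]


# Existing data was split at 1000 and 2000; new auto-increment ids start at 3000.
new_ids = range(3000, 3300)
by_range = Counter(range_owner(k, [1000, 2000]) for k in new_ids)
by_hash = Counter(hashed_owner(k) for k in new_ids)

print("range on increasing key:", dict(by_range))  # every insert hits one shard
print("hashed key:", dict(by_hash))                # inserts spread over shards
```

With the increasing key, every new document falls into the last range and therefore onto one shard, whereas the hashed key scatters the same inserts across all three.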

Because these services run in the background, they are started from configuration files. The configuration files are described below.

1) Starting the config servers. (Machine A runs 3, ports: 20000, 21000, 22000)

A config server is an ordinary mongod process, so you only need to start a new instance. You must run either 1 or 3 config servers; running 2 produces an error:

BadValue need either 1 or 3 configdbs

Because the process runs in the background and is started from a configuration file, the configuration file needs to be edited:

/etc/mongod_20000.conf

# Data directory
dbpath=/usr/local/config/
# Log file
logpath=/var/log/mongodb/mongodb_config.log
# Append to the log
logappend=true
# Port
port = 20000
# Maximum number of connections
maxConns =
pidfilepath = /var/run/mongo_20000.pid
# Journal (redo log)
journal = true
# Journal commit interval
journalCommitInterval =
# Run as a daemon
fork = true
# How often data is flushed to disk
syncdelay =
#storageEngine = wiredTiger
# Oplog size, in MB
oplogSize =
# Namespace file size: default 16 MB, max 2 GB
nssize =
noauth = true
unixSocketPrefix = /tmp
configsvr = true

/etc/mongod_21000.conf

# Data directory
dbpath=/usr/local/config1/
# Log file
logpath=/var/log/mongodb/mongodb_config1.log
# Append to the log
logappend=true
# Port
port = 21000
# Maximum number of connections
maxConns =
pidfilepath = /var/run/mongo_21000.pid
# Journal (redo log)
journal = true
# Journal commit interval
journalCommitInterval =
# Run as a daemon
fork = true
# How often data is flushed to disk
syncdelay =
#storageEngine = wiredTiger
# Oplog size, in MB
oplogSize = 1000
# Namespace file size: default 16 MB, max 2 GB
nssize =
noauth = true
unixSocketPrefix = /tmp
configsvr = true

Start the config servers:

root@mongo1:~# mongod -f /etc/mongod_20000.conf
about to fork child process, waiting until server is ready for connections.
forked process: 8545
child process started successfully, parent exiting
root@mongo1:~# mongod -f /etc/mongod_21000.conf
about to fork child process, waiting until server is ready for connections.
forked process: 8595
child process started successfully, parent exiting

Similarly, start a config server on port 22000.
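
Since the config server files differ only in port, data directory, and log file, they could be generated from one template. The Python sketch below is one possible way to do that, not part of the original setup. It renders only the options whose values appear in the listings above and leaves out the tuning options whose values are not given. The `render_config` helper is hypothetical, and the `config2` directory and log name for port 22000 are an assumption that continues the naming pattern.

```python
CONFIG_TEMPLATE = """\
dbpath = {dbpath}
logpath = {logpath}
logappend = true
port = {port}
pidfilepath = /var/run/mongo_{port}.pid
journal = true
fork = true
noauth = true
unixSocketPrefix = /tmp
configsvr = true
"""


def render_config(port, suffix):
    """Render the config-server settings for one port.

    suffix distinguishes the data directory and log file of each instance
    ("" for the first one, "1" for the second, as in the files above).
    """
    return CONFIG_TEMPLATE.format(
        port=port,
        dbpath=f"/usr/local/config{suffix}/",
        logpath=f"/var/log/mongodb/mongodb_config{suffix}.log",
    )


# The suffix "2" for port 22000 is an assumption that continues the pattern.
instances = {20000: "", 21000: "1", 22000: "2"}
for port, suffix in instances.items():
    print(f"# /etc/mongod_{port}.conf")
    print(render_config(port, suffix))
```

The script only prints the rendered text; writing it to /etc and creating the data directories are left to the reader.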

2) Starting the routing servers. (Machines A and B run 1 each, port: 30000)

The routing server does not store data; it only needs a log file configured.

# mongos

# Log file
logpath=/var/log/mongodb/mongodb_route.log
# Append to the log
logappend=true
# Port
port = 30000
# Maximum number of connections
maxConns =
# Bind address
#bind_ip =192.168.200.*,...,

pidfilepath = /var/run/mongo_30000.pid

# Must list either 1 or 3 config servers
configdb=192.168.200.a:20000,192.168.200.a:21000,192.168.200.a:22000
#configdb =127.0.0.1:20000  # causes an error
# Run as a daemon
fork = true

The most important parameter is configdb. The config server addresses that follow it must not be written as localhost or 127.0.0.1; they must be addresses that the other shards can reach, namely 192.168.200.a:20000/21000/22000. Otherwise, addShard fails with an error:

{
"ok": 0,
"errmsg": "can't use localhost as a shard since all shards need to communicate. either use all shards and configdbs on localhost or all in actual IPs  host: 172.16.5.104:20000 isLocalHost:0"
}
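
Both configdb rules mentioned so far (1 or 3 config servers, and no loopback addresses in a multi-machine cluster) can be checked before mongos is started. The following Python sketch is a hypothetical pre-flight check, not something MongoDB provides. The `check_configdb` name and the `10.0.0.1` address are made up for illustration, and the check deliberately rejects loopback addresses even though an all-localhost test setup is permitted according to the error message.

```python
LOOPBACK_HOSTS = {"localhost", "127.0.0.1"}


def check_configdb(configdb):
    """Return a list of problems found in a comma-separated configdb string."""
    problems = []
    hosts = [entry.strip() for entry in configdb.split(",") if entry.strip()]
    if len(hosts) not in (1, 3):
        problems.append(f"need either 1 or 3 config servers, got {len(hosts)}")
    for entry in hosts:
        # rpartition splits on the last colon, so "host:port" yields host and port.
        host, _, port = entry.rpartition(":")
        if not host or not port.isdigit():
            problems.append(f"{entry!r} is not in host:port form")
        elif host.lower() in LOOPBACK_HOSTS:
            problems.append(f"{entry!r} uses a loopback address")
    return problems


print(check_configdb("10.0.0.1:20000,10.0.0.1:21000,10.0.0.1:22000"))  # no problems
print(check_configdb("127.0.0.1:20000,10.0.0.1:21000"))
```

An empty list means the string passed both checks; anything else lists what to fix before running mongos.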

Start mongos:

root@mongo1:~# mongos -f /etc/mongod_30000.conf
2015-07-10T14:42:58.741+0800 W SHARDING running with 1 config server should be done only for testing purposes and is not recommended for production
about to fork child process, waiting until server is ready for connections.
forked process: 8965
child process started successfully, parent exiting