How to build an index when the MongoDB data volume is large: minimizing the business impact

Source: Internet
Author: User
Tags: auth, mongodb, server, naming convention

When the data volume or request rate is large, building an index directly has a significant impact on performance. Instead, take advantage of the replica set (at large data volumes a replica set, or sharding on top of one, is the usual choice in a production environment anyway): because the replica set keeps working while some of its members are down, each member can be taken out of the set in turn and indexed separately.

Note: the indexed collection uses the WiredTiger engine and holds roughly 150 million documents.


1. Replica Set configuration parameters

Node 1:

$ more shard1.conf

dbpath=/data/users/mgousr01/mongodb/dbdata/shard1_1
logpath=/data/users/mgousr01/mongodb/logs/shard1_1.log
pidfilepath=/data/users/mgousr01/mongodb/dbdata/shard1_1/shard1-1.pid
directoryperdb=true
logappend=true
replSet=shard1
shardsvr=true
bind_ip=127.0.0.1,x.x.x.x
port=37017
oplogSize=9024
fork=true
#noprealloc=true
#auth=true
journal=true
profile=1
slowms=10
maxConns=12000
storageEngine=wiredTiger
wiredTigerCacheSizeGB=96
#clusterAuthMode=keyFile
keyFile=/data/users/mgousr01/mongodb/etc/keyfilers0.key
wiredTigerDirectoryForIndexes=on
wiredTigerCollectionBlockCompressor=zlib
wiredTigerJournalCompressor=zlib


Node 2:

$ more shard2.conf

dbpath=/data/users/mgousr01/mongodb/dbdata/shard2_1
logpath=/data/users/mgousr01/mongodb/logs/shard2_1.log
pidfilepath=/data/users/mgousr01/mongodb/dbdata/shard2_1/shard2-1.pid
directoryperdb=true
logappend=true
replSet=shard1
shardsvr=true
bind_ip=127.0.0.1,x.x.x.x
port=37017
oplogSize=9024
fork=true
#noprealloc=true
#auth=true
journal=true
profile=1
slowms=10
maxConns=12000
storageEngine=wiredTiger
wiredTigerCacheSizeGB=96
#clusterAuthMode=keyFile
keyFile=/data/users/mgousr01/mongodb/etc/keyfilers0.key
wiredTigerDirectoryForIndexes=on
wiredTigerCollectionBlockCompressor=zlib
wiredTigerJournalCompressor=zlib


Node 3:

$ more shard3.conf

dbpath=/data/users/mgousr01/mongodb/dbdata/shard3_1
logpath=/data/users/mgousr01/mongodb/logs/shard3_1.log
pidfilepath=/data/users/mgousr01/mongodb/dbdata/shard3_1/shard3-1.pid
directoryperdb=true
logappend=true
replSet=shard1
shardsvr=true
bind_ip=127.0.0.1,x.x.x.x
port=37017
oplogSize=9024
fork=true
#noprealloc=true
#auth=true
journal=true
profile=1
slowms=10
maxConns=12000
storageEngine=wiredTiger
wiredTigerCacheSizeGB=96
#clusterAuthMode=keyFile
keyFile=/data/users/mgousr01/mongodb/etc/keyfilers0.key
wiredTigerDirectoryForIndexes=on
wiredTigerCollectionBlockCompressor=zlib
wiredTigerJournalCompressor=zlib


2. Start MongoDB

Start each node with its configuration file:

$ mongod -f <config file>


3. Configure the replica set (log in to any node)

config = {_id: "shard1", members: [
    {_id: 0, host: "x.x.x.x:37017", priority: 1, tags: {"use": "xxx"}},
    {_id: 1, host: "x.x.x.x:37017", priority: 1, tags: {"use": "xxx"}},
    {_id: 2, host: "x.x.x.x:37017", priority: 1, tags: {"use": "xxx"}}
]}

rs.initiate(config)
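Before calling rs.initiate it can help to sanity-check the config document. A minimal standalone sketch (hypothetical helper, runnable in any JavaScript engine; hosts are placeholders as above):

```javascript
// Hypothetical pre-flight check for a replica set config document.
function validateRsConfig(cfg) {
  if (!cfg._id) throw new Error("missing _id (replica set name)");
  const ids = cfg.members.map(m => m._id);
  if (new Set(ids).size !== ids.length) throw new Error("duplicate member _id");
  if (!cfg.members.every(m => m.host && m.host.includes(":"))) {
    throw new Error("every member needs a host:port");
  }
  return true;
}

const config = {_id: "shard1", members: [
  {_id: 0, host: "x.x.x.x:37017", priority: 1, tags: {"use": "xxx"}},
  {_id: 1, host: "x.x.x.x:37017", priority: 1, tags: {"use": "xxx"}},
  {_id: 2, host: "x.x.x.x:37017", priority: 1, tags: {"use": "xxx"}}
]};
console.log(validateRsConfig(config)); // true
```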


4. Simulate online write load on the primary

for (i = 0; i < 100000; i++) { db.users.insert({"username": "user" + i, "age": Math.floor(Math.random() * 120), "created": new Date()}) }

For example, suppose user data grows quickly to a volume of more than 100 million documents.
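The document shape produced by the insert loop can be sketched standalone, with no MongoDB required; makeUser here is a hypothetical helper mirroring the fields above:

```javascript
// Hypothetical generator mirroring the insert loop's document shape.
function makeUser(i) {
  return {
    username: "user" + i,
    age: Math.floor(Math.random() * 120), // 0..119, as in the loop above
    created: new Date()
  };
}

const sample = Array.from({length: 3}, (_, i) => makeUser(i));
console.log(sample.map(d => d.username).join(",")); // user0,user1,user2
```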


5. General approach to creating indexes when the data volume is large

An index cannot be created directly on a secondary node; it can only be created on the primary (from which it replicates to the secondaries).

To minimize the impact of indexing on the MongoDB servers, one approach is to convert each mongod into standalone mode in turn. The specific steps are:

(1) Stop one secondary, remove the --replSet parameter from its configuration, change its port, and restart mongod; the node now runs in standalone mode.

(2) In standalone mode, build the index with ensureIndex; foreground mode works, but background mode is recommended.

(3) When the index build completes, shut the node down and start it normally as a secondary again.

(4) Repeat the steps above for each secondary in turn; finally, temporarily step the primary down to a secondary, index it the same way, and let it return to being primary.

This approach is cumbersome, but it minimizes the impact of the indexing operation on MongoDB and is worth doing in some scenarios.


6. Specific practices

(1) Stop one of the secondary nodes

The replica set above has three nodes: node 1 is the primary, and nodes 2 and 3 are secondaries.

Take node 3, the second secondary, as an example:

$ pwd

/data/users/mgousr01/mongodb/etc

$ mongod -f shard3.conf --shutdown   # stop the mongod process

(2) Comment out replSet=shard1 in the shard3.conf configuration file

$ vim shard3.conf

......

#replSet=shard1

......
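Commenting the line out can also be scripted. A sketch against a throwaway copy of the config (the path and file content here are illustrative, not the production files):

```shell
# Create a throwaway file standing in for shard3.conf (illustrative content).
cat > /tmp/shard3.demo.conf <<'EOF'
dbpath=/data/users/mgousr01/mongodb/dbdata/shard3_1
replSet=shard1
port=37017
EOF

# Comment out the replSet line so the node restarts in standalone mode.
sed -i 's/^replSet=/#replSet=/' /tmp/shard3.demo.conf

grep '^#replSet=shard1' /tmp/shard3.demo.conf   # prints the commented line
```

Reversing the edit (`sed -i 's/^#replSet=/replSet=/'`) restores the replica-set setting for the later restart.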

(3) Then start it

$ mongod -f shard3.conf

(4) Log in to create an index

$ mongo x.x.x.x:37017/admin

> use chicago

Build the index:

> db.users.ensureIndex({username: 1, created: 1}, {unique: true})

"We recommend that you create an index later to have a naming convention"

(5) View index information

> db.users.getIndexes()
[
        {
                "v" : 1,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_",
                "ns" : "chicago.users"
        },
        {
                "v" : 1,
                "unique" : true,
                "key" : {
                        "username" : 1,
                        "created" : 1
                },
                "name" : "username_1_created_1",
                "ns" : "chicago.users"
        }
]


(6) Stop the node's mongod process again

$ pwd

/data/users/mgousr01/mongodb/etc

$ mongod -f shard3.conf --shutdown

(7) Restart the mongod process

Uncomment replSet=shard1 in the shard3.conf configuration file, then:

$ mongod -f shard3.conf

$ mongo ip:37017/admin


Once started, the node rejoins the replica set and synchronizes the new data from the primary; the index built on the secondary does not cause any master-slave inconsistency.



7. Repeat for the second secondary, node 2

Repeat step 6.


8. Building the index on all secondaries: summary of steps

For each secondary in the set, build the index as follows:

(1) Stop the secondary

(2) Build the index in standalone mode

(3) Restart the mongod process and let it rejoin the set
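The per-secondary procedure can be summarized as a dry-run script. Each step is only echoed here (hypothetical node names), so the sequence is visible without touching a live cluster:

```shell
# Dry run of the rolling index build: hypothetical node names, commands
# echoed rather than executed so no live cluster is touched.
for node in shard3 shard2; do
  echo "# --- rolling index build on ${node} ---"
  echo "mongod -f ${node}.conf --shutdown"
  echo "sed -i 's/^replSet=/#replSet=/' ${node}.conf"
  echo "mongod -f ${node}.conf        # standalone: build the index here"
  echo "mongod -f ${node}.conf --shutdown"
  echo "sed -i 's/^#replSet=/replSet=/' ${node}.conf"
  echo "mongod -f ${node}.conf        # rejoin the replica set"
done
```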


9. Building the primary node index

(1) Log in to the primary node

$ mongo ip:37017/admin

(2) Demote the primary node

shard1:PRIMARY> rs.stepDown(30)

2016-04-19T12:49:44.423+0800 I NETWORK  DBClientCursor::init call() failed
2016-04-19T12:49:44.426+0800 E QUERY    Error: error doing query: failed
    at DBQuery._exec (src/mongo/shell/query.js:83:36)
    at DBQuery.hasNext (src/mongo/shell/query.js:240:10)
    at DBCollection.findOne (src/mongo/shell/collection.js:187:19)
    at DB.runCommand (src/mongo/shell/db.js:58:41)
    at DB.adminCommand (src/mongo/shell/db.js:66:41)
    at Function.rs.stepDown (src/mongo/shell/utils.js:1006:15)
    at (shell):1:4 at src/mongo/shell/query.js:83
2016-04-19T12:49:44.427+0800 I NETWORK  trying reconnect to xxxx failed
2016-04-19T12:49:44.428+0800 I NETWORK  reconnect xxxx OK

shard1:SECONDARY>

(This connection error is expected: stepping down closes the shell's connection, which then reconnects.)

When the step-down command executes, the node becomes an ordinary secondary, and one of the two remaining secondaries is elected as the new primary.

(3) Then build the index the same way as in step 6.


Description

Demoting the primary: rs.stepDown(stepDownSecs). While stepped down, the node does not stand for election; if, when the period expires, the replica set still has no primary, it becomes eligible again.

Preventing elections

Keep a node in the SECONDARY state with rs.freeze(seconds); you can then work on the primary without worrying about an election.

Unfreeze it with rs.freeze(0).






This article is from the "Nine Finger God Hack" blog; please keep the source link: http://beigai.blog.51cto.com/8308160/1765457
