Building indexes on large MongoDB data sets with minimal impact on the business

Last updated: 2016-04-19 Source: Internet

Uploaded by: User


Tags: large data volume, index

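When the data volume or request rate is high, building an index directly on the live system can noticeably hurt performance. In that case you can rely on one property of a replica set: the set keeps working while some of its members are down. (A large data set usually means a production environment, where a replica set or sharding is the natural choice anyway.) The idea is to take the members offline one at a time and build the index on each of them in turn.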

Note: the collection being indexed uses the WiredTiger (WT) storage engine and holds about 150 million documents.

1. Replica set configuration parameters

Node 1:

$ more shard1.conf

dbpath=/data/users/mgousr01/mongodb/dbdata/shard1_1

logpath=/data/users/mgousr01/mongodb/logs/shard1_1.log

pidfilepath=/data/users/mgousr01/mongodb/dbdata/shard1_1/shard1-1.pid

directoryperdb=true

logappend=true

replSet=shard1

shardsvr=true

bind_ip=127.0.0.1,x.x.x.x

port=37017

oplogSize=9024

fork=true

#noprealloc=true

#auth=true

journal=true

profile=1

slowms=10

maxConns=12000

storageEngine = wiredTiger

wiredTigerCacheSizeGB=96

#clusterAuthMode=keyFile

keyFile=/data/users/mgousr01/mongodb/etc/keyFilers0.key

wiredTigerDirectoryForIndexes=on

wiredTigerCollectionBlockCompressor=zlib

wiredTigerJournalCompressor=zlib

Node 2:

$ more shard2.conf

dbpath=/data/users/mgousr01/mongodb/dbdata/shard2_1

logpath=/data/users/mgousr01/mongodb/logs/shard2_1.log

pidfilepath=/data/users/mgousr01/mongodb/dbdata/shard2_1/shard2-1.pid

directoryperdb=true

logappend=true

replSet=shard1

shardsvr=true

bind_ip=127.0.0.1,x.x.x.x

port=37017

oplogSize=9024

fork=true

#noprealloc=true

#auth=true

journal=true

profile=1

slowms=10

maxConns=12000

storageEngine = wiredTiger

wiredTigerCacheSizeGB=96

#clusterAuthMode=keyFile

keyFile=/data/users/mgousr01/mongodb/etc/keyFilers0.key

wiredTigerDirectoryForIndexes=on

wiredTigerCollectionBlockCompressor=zlib

wiredTigerJournalCompressor=zlib

Node 3:

$ more shard3.conf

dbpath=/data/users/mgousr01/mongodb/dbdata/shard3_1

logpath=/data/users/mgousr01/mongodb/logs/shard3_1.log

pidfilepath=/data/users/mgousr01/mongodb/dbdata/shard3_1/shard3-1.pid

directoryperdb=true

logappend=true

replSet=shard1

shardsvr=true

bind_ip=127.0.0.1,x.x.x.x

port=37017

oplogSize=9024

fork=true

#noprealloc=true

#auth=true

journal=true

profile=1

slowms=10

maxConns=12000

storageEngine = wiredTiger

wiredTigerCacheSizeGB=96

#clusterAuthMode=keyFile

keyFile=/data/users/mgousr01/mongodb/etc/keyFilers0.key

wiredTigerDirectoryForIndexes=on

wiredTigerCollectionBlockCompressor=zlib

wiredTigerJournalCompressor=zlib

2. Start MongoDB

Start each node with: mongod -f <config file name>

3. Configure the replica set (log in to any one of the hosts)

config={_id:'shard1',members:[{_id:0,host:'x.x.x.x:37017',priority:1,tags:{'use':'xxx'}},{_id:1,host:'x.x.x.x:37017',priority:1,tags:{'use':'xxx'}},{_id:2,host:'x.x.x.x:37017',priority:1,tags:{'use':'xxx'}}]}

rs.initiate(config)
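
The config document above is easy to get wrong when typed by hand (for example, a repeated _id or host). As a rough sketch, it can be generated from a host list instead. The function name buildReplSetConfig and the host names are hypothetical; the block runs under Node.js, and in the mongo shell the resulting object would be passed to rs.initiate().

```javascript
// Build a replica set config document shaped like the one passed to rs.initiate().
// buildReplSetConfig is a hypothetical helper, not part of MongoDB.
function buildReplSetConfig(setName, hosts, tagValue) {
  if (new Set(hosts).size !== hosts.length) {
    throw new Error("duplicate host in member list");
  }
  return {
    _id: setName,
    members: hosts.map(function (host, i) {
      // _id is simply the position in the list; priority 1 matches the article.
      return { _id: i, host: host, priority: 1, tags: { use: tagValue } };
    })
  };
}

// Hypothetical host names, for illustration only.
var config = buildReplSetConfig(
  "shard1",
  ["node1.example:37017", "node2.example:37017", "node3.example:37017"],
  "xxx"
);
console.log(JSON.stringify(config, null, 2));
```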

4. Write to the primary to simulate production traffic

for(i=0;i<100000;i++){ db.users.insert( { "username":"user"+i, "age":Math.floor(Math.random()*120), "created":new Date() })}

Suppose, for example, that user records grow quickly and the collection has passed 100 million documents.

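5. General approach to building indexes on large data sets

While a node is running as a secondary in the replica set, you cannot create an index on it directly; index creation has to be issued on the primary.

To keep the impact of the index build on the MongoDB servers as small as possible, one approach is to restart each server in standalone mode and build the index there. The steps are: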

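(1) Stop a secondary, remove the --replSet parameter, change the MongoDB port, and restart it. MongoDB now runs in standalone mode;

(2) In standalone mode, run ensureIndex (createIndex in MongoDB 3.0 and later) to build the index. A foreground build works, but a background build is recommended;

(3) When the index build finishes, shut the server down and restart it the normal way, as a replica set member;

(4) Repeat steps 1 to 3 for each secondary in turn. Finally, temporarily turn the primary into a secondary, build the index on it with the same steps 1 to 3, and then let it become primary again.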

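This procedure is fairly tedious, but it keeps the impact of the index build on MongoDB to a minimum, and in some situations it is well worth the effort.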
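
The ordering described in steps 1 to 4 can be sketched as a small planning function: secondaries first, the primary last and only after it has stepped down. This is an illustration only. planRollingIndexBuild, the member shape { host, state }, and the action labels are hypothetical, and nothing here talks to a real server.

```javascript
// Actions applied to every node, in order (labels are descriptive, not commands).
var BUILD_ACTIONS = [
  "shutdown",
  "restart without replSet (standalone)",
  "build index",
  "shutdown",
  "restart with replSet"
];

// Return the rolling-build order: every secondary first, then the primary,
// which gets an extra step-down action in front.
function planRollingIndexBuild(members) {
  var primaries = members.filter(function (m) { return m.state === "PRIMARY"; });
  if (primaries.length !== 1) {
    throw new Error("expected exactly one primary");
  }
  var plan = members
    .filter(function (m) { return m.state === "SECONDARY"; })
    .map(function (m) { return { host: m.host, actions: BUILD_ACTIONS.slice() }; });
  plan.push({
    host: primaries[0].host,
    actions: ["rs.stepDown()"].concat(BUILD_ACTIONS)
  });
  return plan;
}

// Hypothetical three-node set matching the layout used in this article.
var plan = planRollingIndexBuild([
  { host: "node1", state: "PRIMARY" },
  { host: "node2", state: "SECONDARY" },
  { host: "node3", state: "SECONDARY" }
]);
plan.forEach(function (step, i) {
  console.log((i + 1) + ". " + step.host + ": " + step.actions.join(" -> "));
});
```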

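6. Concrete procedure

(1) Stop one of the secondary nodes

The replica set above has three nodes: node 1 is the primary, and nodes 2 and 3 are secondaries.

Take node 3, the second secondary, as the example: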

$ pwd

/data/users/mgousr01/mongodb/etc

$ mongod -f shard3.conf --shutdown    # stop the mongod process

(2) Comment out replSet=shard1 in the shard3.conf configuration file

$ vim shard3.conf

……

#replSet=shard1

……
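
Editing the file by hand on every node invites mistakes, so the change can also be scripted. The following is a minimal sketch that works on the config text as a string, assuming the key=value format shown in section 1. toStandaloneConfig and the port 47017 are hypothetical; the port argument is optional because the concrete steps here keep the original port while the general approach in section 5 suggests changing it.

```javascript
// Comment out replSet and optionally switch the port in a key=value mongod config.
// Lines that are already comments (starting with "#") are left untouched.
function toStandaloneConfig(text, standalonePort) {
  return text.split("\n").map(function (line) {
    var match = /^\s*([A-Za-z_]+)\s*=/.exec(line);
    var key = match ? match[1] : null;
    if (key === "replSet") {
      return "#" + line;
    }
    if (key === "port" && standalonePort !== undefined) {
      return "port=" + standalonePort;
    }
    return line;
  }).join("\n");
}

// A shortened excerpt of shard3.conf; 47017 is an arbitrary example port.
var original = ["replSet=shard1", "shardsvr=true", "port=37017", "#auth=true"].join("\n");
var standalone = toStandaloneConfig(original, 47017);
console.log(standalone);
```

Reading the real file and writing the result back (for example with Node's fs module) is left out so the sketch stays side-effect free.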

(3) Start mongod again

mongod -f shard3.conf

(4) Log in and build the index

mongo x.x.x.x:37017/admin

> use chicago

Build the Index

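> db.users.ensureIndex({username:1,created:1},{unique:true,name:"username_created_unique",background:true})

[Recommendation: follow a naming convention for indexes. All options must go into a single document passed as the second argument. The command as originally run passed {unique:true}, {name:...} and {background:true} as three separate arguments, and the name and background options appear to have been ignored, which is why the listing in step (5) shows the default name username_1_created_1.]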
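
To make the naming point concrete, here is a small sketch of how MongoDB derives the default index name from the key pattern, and of merging option fragments into the single options document that ensureIndex expects. defaultIndexName and mergeIndexOptions are hypothetical helpers written for this illustration; the block runs under Node.js without a server.

```javascript
// Default index name: each field and its direction joined by "_".
function defaultIndexName(keys) {
  return Object.keys(keys)
    .map(function (field) { return field + "_" + keys[field]; })
    .join("_");
}

// Merge several option fragments into one options document.
function mergeIndexOptions(...fragments) {
  return Object.assign({}, ...fragments);
}

var keys = { username: 1, created: 1 };
var options = mergeIndexOptions(
  { unique: true },
  { name: "username_created_unique" },
  { background: true }
);

console.log(defaultIndexName(keys)); // username_1_created_1
console.log(JSON.stringify(options));
// In the mongo shell the call would then be: db.users.ensureIndex(keys, options)
```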

(5) Check the index information

> db.users.getIndexes()

[

{

"v" : 1,

"key" : {

"_id" : 1

},

"name" : "_id_",

"ns" : "chicago.users"

},

{

"v" : 1,

"unique" : true,

"key" : {

"username" : 1,

"created" : 1

},

"name" : "username_1_created_1",

"ns" : "chicago.users"

}

]
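
Before restarting the node it is worth confirming that the expected index is really there. A minimal sketch of such a check over the getIndexes() result follows. findIndexByKey is a hypothetical helper, and the sample array is the listing above copied into plain JavaScript. The comparison is deliberately sensitive to field order, because {username:1,created:1} and {created:1,username:1} are different compound indexes.

```javascript
// Return the index whose key pattern matches `keys` exactly (same fields,
// same directions, same order), or null if there is none.
function findIndexByKey(indexes, keys) {
  var wanted = JSON.stringify(keys);
  var found = indexes.find(function (ix) {
    return JSON.stringify(ix.key) === wanted;
  });
  return found || null;
}

// The getIndexes() listing from step (5).
var listing = [
  { v: 1, key: { _id: 1 }, name: "_id_", ns: "chicago.users" },
  {
    v: 1,
    unique: true,
    key: { username: 1, created: 1 },
    name: "username_1_created_1",
    ns: "chicago.users"
  }
];

var index = findIndexByKey(listing, { username: 1, created: 1 });
console.log(index ? index.name + " unique=" + index.unique : "index missing");
```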

(6) Stop the mongod process again

$ pwd

/data/users/mgousr01/mongodb/etc

$ mongod -f shard3.conf --shutdown

(7) Start the mongod process as a replica set member

Uncomment replSet=shard1 in the shard3.conf configuration file, then start mongod:

mongod -f shard3.conf

mongo ip:37017/admin

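After startup the node rejoins the replica set and catches up on the data written to the primary in the meantime. The index that now exists on the secondary has no effect on the primary and does not cause any inconsistency between primary and secondary.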
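
Before moving on to the next node, it helps to confirm that the restarted member is back in SECONDARY state and has caught up. The sketch below checks this against a document shaped like the output of rs.status(), using its members[].name, stateStr and optimeDate fields. isCaughtUp, the host names, the sample timestamps and the 5 second threshold are all assumptions made for the illustration.

```javascript
// True when `host` is a SECONDARY and its last applied operation is at most
// maxLagMs behind the primary's, judged by optimeDate.
function isCaughtUp(status, host, maxLagMs) {
  var primary = status.members.find(function (m) { return m.stateStr === "PRIMARY"; });
  var member = status.members.find(function (m) { return m.name === host; });
  if (!primary || !member || member.stateStr !== "SECONDARY") {
    return false;
  }
  return primary.optimeDate.getTime() - member.optimeDate.getTime() <= maxLagMs;
}

// Hand-written sample in the shape of rs.status(); values are made up.
var status = {
  members: [
    { name: "node1:37017", stateStr: "PRIMARY", optimeDate: new Date("2016-04-19T04:00:10Z") },
    { name: "node2:37017", stateStr: "SECONDARY", optimeDate: new Date("2016-04-19T04:00:09Z") },
    { name: "node3:37017", stateStr: "RECOVERING", optimeDate: new Date("2016-04-19T03:00:00Z") }
  ]
};

console.log(isCaughtUp(status, "node2:37017", 5000)); // true: secondary, 1 second behind
console.log(isCaughtUp(status, "node3:37017", 5000)); // false: still recovering
```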

7. Repeat on the other secondary: node 2

Simply repeat step 6.

8. Summary of building the index on all secondaries

For each secondary in the set, build an index according to the following steps:

(1)Stop One Secondary

(2)Build the Index

(3)Restart the Program mongod

9. Build the index on the primary node

(1) Log in to the primary

mongo ip:37017/admin

(2) Step the primary down

shard1:PRIMARY> rs.stepDown(30)

2016-04-19T12:49:44.423+0800 I NETWORK DBClientCursor::init call() failed

2016-04-19T12:49:44.426+0800 E QUERY Error: error doing query: failed

at DBQuery._exec (src/mongo/shell/query.js:83:36)

at DBQuery.hasNext (src/mongo/shell/query.js:240:10)

at DBCollection.findOne (src/mongo/shell/collection.js:187:19)

at DB.runCommand (src/mongo/shell/db.js:58:41)

at DB.adminCommand (src/mongo/shell/db.js:66:41)

at Function.rs.stepDown (src/mongo/shell/utils.js:1006:15)

at (shell):1:4 at src/mongo/shell/query.js:83

2016-04-19T12:49:44.427+0800 I NETWORK trying reconnect to xxxx failed

2016-04-19T12:49:44.428+0800 I NETWORK reconnect xxxx ok

shard1:SECONDARY>

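Once the step-down command runs, the node turns itself into a secondary, and one of the other two secondaries is elected primary. The network errors printed above are expected: the shell's connection is dropped during the step-down, and the shell then reconnects, as the last two log lines show.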
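
Because rs.stepDown() surfaces as an error in the shell even when it succeeds, a script should not treat the exception as failure. A rough sketch of one way to handle it: swallow the error, then ask the node whether it still considers itself primary. stepDownAndVerify and makeFakeNode are hypothetical; the rs and db arguments stand in for the legacy mongo shell's rs helper and db object (whose db.isMaster() returns a document with an ismaster flag), and are stubbed here so the block runs under Node.js.

```javascript
// Call stepDown, tolerate the dropped-connection error, then verify the result.
// Returns true only if the node no longer reports itself as primary.
function stepDownAndVerify(rs, db, seconds) {
  try {
    rs.stepDown(seconds);
  } catch (e) {
    // Expected when the connection is closed during step-down; verified below.
  }
  return db.isMaster().ismaster === false;
}

// Stub node for illustration: stepDown always throws, as in the log above,
// and only changes state when stepDownWorks is true.
function makeFakeNode(stepDownWorks) {
  var isPrimary = true;
  return {
    rs: {
      stepDown: function () {
        if (stepDownWorks) { isPrimary = false; }
        throw new Error("error doing query: failed");
      }
    },
    db: {
      isMaster: function () { return { ismaster: isPrimary }; }
    }
  };
}

var node = makeFakeNode(true);
console.log(stepDownAndVerify(node.rs, node.db, 30));
```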

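(3) From here, build the index exactly as in step 6.

Notes:

Stepping the primary down: rs.stepDown(seconds), 60 seconds by default. While it is stepped down the node does not stand for election. If the replica set still has no primary once that period has passed, the node becomes a candidate again.

Preventing Elections

Keeping secondaries as secondaries: rs.freeze(seconds). During that window you can work on the primary without worrying about triggering an election.

Releasing the freeze: rs.freeze(0).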

This article comes from the "九指神丐" blog. Please keep this attribution: http://beigai.blog.51cto.com/8308160/1765457
