One of our MySQL databases has a table with roughly 1 billion rows; operating on a table this size in MySQL is basically hopeless, so we wanted to test whether migrating it to a MongoDB sharded architecture would improve things. With some huffing and puffing I built a 3-shard MongoDB environment, exported this large table's data from MySQL into CSV format, and finally used the mongoimport tool to successfully load the 1-billion-plus records into the MongoDB cluster. But when I checked the corresponding DB information in MongoDB after the import, I was nearly scared witless. See for yourself:
## View the DB info in mongos: the migrated lens_mobapp_trace database shows up as 1197.416GB
mongos> show dbs;
admin              0.016GB
config             0.109GB
lens_mobapp_trace  1197.416GB
test               0.031GB
Yet the data file I exported from MySQL, in CSV format, is only 259GB:
## File size
$ ll -h
total 259G
-rw-r--r-- 1 root root 259G Mar 16:27 nl_mob_app_error_trace.txt

## Format: comma-delimited CSV file
$ head -2 nl_mob_app_error_trace.txt
"1551445226","2015-02-11 00:00:01","4613","5969","2","19796","1","5397","73","0","0","46000","4","0","499","2","903","1","-370443099","http://pingma.qq.com/mstat/report","nbfs://2015/2/11/4613/4613.0.txt?2444,2#93b"
"1551445229","2015-02-11 00:00:01","4611","7259","1","34","2","1038","71","48","4815","16","1","1482","412161","2","903","1","1429877488","http://api.mobile.cosmeapp.cn/cosmeapi","nbfs://2015/2/11/4611/4611.0.txt?38200,750#2C2"
In the MySQL library, this 259GB of data occupies the following space:
## Data size: 228GB
mysql> select sum(data_length)/1024/1024/1024 as size from information_schema.tables
    -> where table_name = 'nl_mob_app_error_trace' and table_schema = 'lens_mobapp_trace';
+-----------------------+
| size                  |
+-----------------------+
|     227.9790587359032 |
+-----------------------+
1 row in set (0.59 sec)

## Data + index total size: 404GB
mysql> select sum(data_length+index_length)/1024/1024/1024 as size from information_schema.tables
    -> where table_name = 'nl_mob_app_error_trace' and table_schema = 'lens_mobapp_trace';
+-----------------------+
| size                  |
+-----------------------+
|     404.3463549343156 |
+-----------------------+
1 row in set (0.00 sec)

## Record count
mysql> select count(*) from nl_mob_app_error_trace;
+------------------+
| count(*)         |
+------------------+
|       1079684226 |
+------------------+
1 row in set (23 min 18.06 sec)
Seeing the data balloon to 1197.416GB after migrating to MongoDB, at first I refused to believe it and suspected some trick was being played, because everyone says MongoDB preallocates large files for storage. So I went to mongos to check the real storage status on the backend shards:
mongos> use lens_mobapp_trace
switched to db lens_mobapp_trace
## Only one shard is shown here; it is enough to illustrate the problem
mongos> db.stats();
{
    "raw" : {
        ........
        "udb-q12c20/10.6.14.38:27017" : {
            "db" : "lens_mobapp_trace",
            "collections" : 4,
            "objects" : 352321865,
            "avgObjSize" : 1005.7750069868641,
            "dataSize" : 354356526232,
            "storageSize" : 371299001584,
            "numExtents" : 711,
            "indexes" : 3,
            "indexSize" : 64238570368,
            "fileSize" : 437856239616,
            "nsSizeMB" : 16,
            "dataFileVersion" : {
                "major" : 4,
                "minor" : 5
            },
            "extentFreeList" : {
                "num" : 4,
                "totalSize" : 180224
            },
            "ok" : 1
        },
        ........
    }
}
"Objects": 352321865 # # Number of records on this shard
"Avgobjsize": 1005.7750069868641 # # Average size per record
"DataSize": 354356526232 # # True data size
"Storagesize": 371299001584 # # Data size + pre-allocated table space (2G)
"Indexsize": 64238570368 # # Size of index
"FileSize": 437856239616 # # The size of the entire data file
As you can see, each record occupies about 1KB of space in MongoDB, while the average record size in MySQL is only about 227 bytes, which is why the data grows so much after migration. What are the specific reasons? First, MongoDB automatically adds a unique identifier field _id (a 12-byte ObjectId) to every inserted record. Second, BSON stores the field names inside every single document, so each of the billion-plus records pays for its keys again. Third, MongoDB reserves some padding space on each record for in-place update operations. Finally, the preallocation feature also occupies a certain amount of space (data files are preallocated, up to a maximum of 2GB each); a small sketch for inspecting this follows below. MongoDB and MySQL manage table space differently, but one thing is certain: the data inflates on the way from MySQL to MongoDB.
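If you want to inspect that overhead on a live document, the mongo shell can report a document's BSON size directly. A minimal sketch against the 2.6/MMAPv1 cluster above, where findOne() just grabs an arbitrary sample document:

> use lens_mobapp_trace
## full BSON size of one document: values + per-document field names + _id
> Object.bsonsize(db.nl_mob_app_error_trace.findOne());
## MMAPv1 padding multiplier (>= 1) reserved for in-place updates
> db.nl_mob_app_error_trace.stats().paddingFactor;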
The version used in the test above is MongoDB 2.6.9. Since MongoDB 3.0 the underlying storage has been improved: WiredTiger ships as a built-in storage engine. Officially, this gives MongoDB document-level locking for concurrency control and efficient document compression, along with a significant performance boost: 7-10x better write performance, an 80% reduction in storage footprint, up to 95% lower operational overhead, and support for up to 50 members per replica set. Driven by curiosity, I made a simple test: start MongoDB with the WiredTiger engine, import 5GB of data, and then check the size of the imported data.
## Start MongoDB 3.0.1 with the WiredTiger engine
mongod --dbpath=/data/db/ --logpath=/data/db/logs/mongodb.log --logappend --port=27017 --storageEngine wiredTiger --fork >/dev/null

## Data sample size
# ll -h xaa
-rw-r--r-- 1 root root 5.0G Apr 22 16:57 xaa

## Data sample record count
# wc -l xaa
21300000 xaa

## Data sample format
# head -2 xaa
"1551445226","2015-02-11 00:00:01","4613","5969","2","19796","1","5397","73","0","0","46000","4","0","499","2","903","1","-370443099","http://pingma.qq.com/mstat/report","nbfs://2015/2/11/4613/4613.0.txt?2444,2#93b"
"1551445229","2015-02-11 00:00:01","4611","7259","1","34","2","1038","71","48","4815","16","1","1482","412161","2","903","1","1429877488","http://api.mobile.cosmeapp.cn/cosmeapi","nbfs://2015/2/11/4611/4611.0.txt?38200,750#2C2"
Import the data into MongoDB with mongoimport, then check the database size and the record count.
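The exact mongoimport invocation was not captured in my notes; a sketch of what it looks like, with a hypothetical and abbreviated field list (the real table has 21 columns):

mongoimport --port 27017 -d lens_mobapp_trace -c nl_mob_app_error_trace \
    --type csv --fields "id,log_time,app_id,..." --file /opt/xaa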
## Check in the mongo shell
> show dbs;
lens_mobapp_trace  2.847GB
local              0.000GB
> use lens_mobapp_trace
switched to db lens_mobapp_trace
> show collections;
nl_mob_app_error_trace
> db.nl_mob_app_error_trace.count();
21300000

## Check the size of the data directory on the filesystem
# ll -h
total 2.9G
-rw-r--r-- 1 mongodb mongodb  16K Apr 22 22:28 collection-0-3862414736807305835.wt
-rw-r--r-- 1 root    root    2.7G Apr 22 23:03 collection-2-3862414736807305835.wt
-rw-r--r-- 1 mongodb mongodb  16K Apr 22 22:28 index-1-3862414736807305835.wt
-rw-r--r-- 1 root    root    185M Apr 22 23:03 index-3-3862414736807305835.wt
drwxr-xr-x 2 mongodb mongodb 4.0K Apr 22 23:03 journal
drwxr-xr-x 2 mongodb mongodb 4.0K Apr 22 22:26 logs
-rw-r--r-- 1 mongodb mongodb  32K Apr 22 22:31 _mdb_catalog.wt
-rwxr-xr-x 1 mongodb mongodb    5 Apr 22 22:27 mongod.lock
-rw-r--r-- 1 mongodb mongodb  36K Apr 22 23:05 sizeStorer.wt
-rw-r--r-- 1 mongodb mongodb   95 Apr 22 22:27 storage.bson
-rw-r--r-- 1 mongodb mongodb   46 Apr 22 22:27 WiredTiger
-rw-r--r-- 1 mongodb mongodb  495 Apr 22 22:27 WiredTiger.basecfg
-rw-r--r-- 1 mongodb mongodb   21 Apr 22 22:27 WiredTiger.lock
-rw-r--r-- 1 root    root     886 Apr 22 23:06 WiredTiger.turtle
-rw-r--r-- 1 mongodb mongodb  44K Apr 22 23:06 WiredTiger.wt
This result is genuinely cheering: since 3.0, MongoDB has indeed improved a great deal. Importing 5GB of data gives a compression ratio of about 57% (5.0GB of CSV stored as 2.847GB); if a larger volume of data were imported, the ratio should improve further, so the discrepancy with the officially claimed 80% is not that big. Has this test result restored your confidence in MongoDB? It looks like it is time to study the new 3.0 release properly; new technologies keep surging in endlessly, and adrift in that sea, all one can do is study hard and make progress every day.
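For what it is worth, WiredTiger in 3.0 compresses collection data with the snappy block compressor by default; you can confirm which compressor a collection uses from its stats. A quick check against the collection above:

> use lens_mobapp_trace
## the returned configuration string contains "block_compressor=snappy"
> db.nl_mob_app_error_trace.stats().wiredTiger.creationString;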
This article is from the "Brave forward, resolutely left" blog; please contact the author before reprinting!
A pitfall stepped into while migrating a large MySQL table to a MongoDB sharded architecture