Development Guide: Real-Time Synchronization of the MongoDB Oplog

Last Update:2017-08-21 Source: Internet

Author: User

When reprinting, please credit Joymufeng. You are welcome to visit the Playscala Community (http://www.playscala.cn/).

Capped Collections

MongoDB has a special kind of collection called a capped collection. Inserts are very fast, close to raw disk write speed, and queries that return documents in insertion order are efficient. A capped collection has a fixed size and works much like a ring buffer (circular buffer): when there is not enough space left, the oldest inserted documents are overwritten.
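
The ring-buffer behavior described above can be illustrated with a small in-memory model. This is only a sketch, not MongoDB code: the class name CappedBuffer and its methods are hypothetical, and it limits the number of elements, whereas a real capped collection is limited by size in bytes.

```scala
// Hypothetical in-memory model of a capped collection; not MongoDB code.
final class CappedBuffer[A](capacity: Int) {
  require(capacity > 0, "capacity must be positive")

  private val slots = new Array[Any](capacity)
  private var nextIndex = 0 // slot that the next insert will write to
  private var count = 0     // number of live elements, at most `capacity`

  /** Appends an element, overwriting the oldest one once the buffer is full. */
  def insert(elem: A): Unit = {
    slots(nextIndex) = elem
    nextIndex = (nextIndex + 1) % capacity
    if (count < capacity) count += 1
  }

  /** Elements in natural (insertion) order, oldest first. */
  def naturalOrder: List[A] = {
    val start = if (count < capacity) 0 else nextIndex
    List.tabulate(count)(i => slots((start + i) % capacity).asInstanceOf[A])
  }

  /** Elements in reverse insertion order, newest first. */
  def reverseNaturalOrder: List[A] = naturalOrder.reverse
}

object CappedBufferDemo {
  def main(args: Array[String]): Unit = {
    val log = new CappedBuffer[String](3)
    List("e1", "e2", "e3", "e4").foreach(log.insert)
    println(log.naturalOrder)        // List(e2, e3, e4)
    println(log.reverseNaturalOrder) // List(e4, e3, e2)
  }
}
```

Inserting a fourth element into a buffer of capacity three discards the first one, which is the first-in-first-out behavior that makes capped collections a natural fit for logs.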

Capped collections are characterized by efficient insertion and retrieval, so it is best not to add extra indexes to them; every additional index slows down inserts. Capped collections suit the following scenarios:

Storing logs: the first-in-first-out behavior of capped collections matches the order in which log events are stored;
Caching small amounts of data: because a cache is read far more often than it is written, an index can reasonably be used to improve read speed.

Restrictions on the use of Capped collections:

If you update documents, create an index that supports the update so that it does not trigger a collection scan;
An update cannot change the size of a document. For example, if the name property is 'ABC', it can only be changed to another 3-character string; otherwise the operation fails;
Documents cannot be deleted; the only way to remove data is to drop the whole collection;
Sharding is not supported;
By default, results can only be returned in natural order (that is, insertion order).

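Capped collections can use the $natural operator to return results in forward or reverse insertion order. For example, to read in reverse order (the collection name is not recoverable from the source and is shown as a placeholder):

db["<collection>"].find({}).sort({$natural: -1})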

Oplog

The oplog is a special capped collection. It is special because it is a system-level collection that records every operation that modifies the database, and the members of a replica set rely on it for data synchronization. Its full name is local.oplog.rs, that is, the oplog.rs collection in the local database. Because users cannot be created in the local database, to access the oplog you must create a user in a different database and grant that user read access to the local database, for example (the user name is not recoverable from the source and is shown as a placeholder):

db.createUser({"user": "<username>", "pwd": "******", "roles": [{"role": "readWrite", "db": "play-community"}, {"role": "read", "db": "local"}]})

Oplog records are idempotent, which means they can be applied multiple times without causing data loss or inconsistency. For example, the oplog automatically converts a $inc operation into a $set operation. Suppose the original document is:

{"_id": "0", "count": 1.0}

Perform the following $inc operation:

db.test.update({"_id": "0"}, {"$inc": {"count": 1}})

The oplog then records:

{
  "ts": Timestamp(1503110518, 1),
  "t": NumberLong(8),
  "h": NumberLong(-3967772133090765679),
  "v": NumberInt(2),
  "op": "u",
  "ns": "play-community.test",
  "o2": {"_id": "0"},
  "o": {"$set": {"count": 2.0}}
}

This conversion is what guarantees the idempotence of the oplog. In addition, no extra indexes may be created on the oplog, which protects insert performance.
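
To see why rewriting $inc as $set matters, the following sketch replays each kind of operation twice. It is a simplified model under stated assumptions: a document is represented as a plain Map, and the functions inc and set are hypothetical stand-ins for the MongoDB operators, not driver calls.

```scala
object IdempotenceDemo {
  // Hypothetical, simplified model: a document maps field names to numbers.
  type Doc = Map[String, Double]

  /** Models $inc: the result depends on the current value, so a replay changes it again. */
  def inc(doc: Doc, field: String, by: Double): Doc =
    doc.updated(field, doc.getOrElse(field, 0.0) + by)

  /** Models $set: the result is fixed, so a replay leaves the document unchanged. */
  def set(doc: Doc, field: String, value: Double): Doc =
    doc.updated(field, value)

  def main(args: Array[String]): Unit = {
    val original: Doc = Map("count" -> 1.0)
    // Replaying the raw $inc twice drifts away from the intended value 2.0.
    println(inc(inc(original, "count", 1.0), "count", 1.0)) // Map(count -> 3.0)
    // Replaying the converted $set twice still yields the intended value.
    println(set(set(original, "count", 2.0), "count", 2.0)) // Map(count -> 2.0)
  }
}
```

A consumer that re-reads part of the oplog after a restart therefore ends up with the same data as one that saw every record exactly once.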

Timestamp format

MongoDB has a special time type, Timestamp, which is intended for internal use only. The ts field of the oplog record above is an example:

Timestamp(1503110518, 1)

A Timestamp is 64 bits long:

The first 32 bits are a time_t value, the number of seconds since the Unix epoch;
The last 32 bits are an ordinal, an incrementing sequence number that identifies the position of the operation among those performed within the same second.
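
The 64-bit layout described above can be sketched with plain bit operations. The object OplogTimestamp and its method names are hypothetical helpers written for illustration; they are not part of any MongoDB driver.

```scala
object OplogTimestamp {
  /** Packs both parts into one 64-bit value: high 32 bits seconds, low 32 bits ordinal. */
  def pack(seconds: Long, ordinal: Long): Long =
    (seconds << 32) | (ordinal & 0xFFFFFFFFL)

  /** Seconds since the Unix epoch, read from the high 32 bits. */
  def seconds(packed: Long): Long = packed >>> 32

  /** Ordinal of the operation within its second, read from the low 32 bits. */
  def ordinal(packed: Long): Long = packed & 0xFFFFFFFFL

  def main(args: Array[String]): Unit = {
    val ts = pack(1503110518L, 1L) // the ts value from the oplog record above
    println(seconds(ts)) // 1503110518
    println(ordinal(ts)) // 1
    // The seconds part converts directly to a calendar time.
    println(java.time.Instant.ofEpochSecond(seconds(ts)))
  }
}
```

Because the seconds occupy the high bits, comparing two packed values orders operations first by second and then by ordinal, which is what a filter such as "ts greater than or equal to the last saved value" relies on.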

Start Synchronizing Oplog

Before we start synchronizing the oplog, note the following points:

Because the oplog has no indexes, the initial query can be expensive;
When the oplog is large, save the ts value as you go, so that after a restart the saved ts can be used to reduce the cost of the first query;
The oplogReplay flag significantly speeds up queries that filter on ts, but it is only valid for oplog queries.

val tailingCursor =
  oplogCol
    .find(Json.obj(
      // Only follow the collections we care about, starting from the last saved ts
      "ns" -> Json.obj("$in" -> Set(s"${db}.common-doc", s"${db}.common-article")),
      "ts" -> Json.obj("$gte" -> lastTS)
    ))
    .options(QueryOpts().tailable.oplogReplay.awaitData.noCursorTimeout)
    .cursor[BSONDocument]()

tailingCursor.fold(()) { (_, doc) =>
  try {
    val jsObj = doc.as[JsObject]
    jsObj("op").as[String] match {
      case "i" => // insert
      case "u" => // update
      case "d" => // delete
      case _   => // ignore any other operation type
    }
    // Save the ts value for later use
    if (tailCount.get() % 10 == 0) {
    }
  } catch {
    case t: Throwable =>
      logger.error("Tail oplog error: " + t.getMessage, t)
  }
}
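
The snippet above leaves the body of the "save the ts value" branch empty. One possible shape for that step is sketched below in plain Scala. The class TsCheckpoint, its parameters, and the save callback are all hypothetical, and the ts is simplified to a Long; where the value is actually persisted (a file, another collection) is left to the caller.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical helper: hands the latest ts to `save` once every `interval` entries.
final class TsCheckpoint(interval: Int, save: Long => Unit) {
  require(interval > 0, "interval must be positive")
  private val tailCount = new AtomicLong(0)

  /** Call once per processed oplog entry; returns true when `ts` was saved. */
  def onEntry(ts: Long): Boolean = {
    val n = tailCount.incrementAndGet()
    if (n % interval == 0) { save(ts); true } else false
  }
}

object TsCheckpointDemo {
  def main(args: Array[String]): Unit = {
    var lastSaved = 0L // stand-in for durable storage
    val checkpoint = new TsCheckpoint(10, ts => lastSaved = ts)
    // Simulate 25 oplog entries whose ts values are 1 to 25.
    (1L to 25L).foreach(ts => checkpoint.onEntry(ts))
    println(lastSaved) // 20
  }
}
```

Saving only every tenth value keeps the write overhead low. After a restart, the query resumes from the saved ts with $gte and so re-delivers a few entries that were already processed, which is safe because oplog records are idempotent.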

Also note that the Akka Streams implementation in reactivemongo-streaming has a bug: if the first query returns no data, it keeps re-sending the query, at roughly dozens to hundreds of requests per second. Because oplog queries are very expensive, this eventually causes MongoDB to run out of memory. For details, refer to the issue "keep sending queries while the initial query result of a tailable cursor is empty".
