MongoDB Management (Part 6): Replica Sets

Tags: dba, install mongodb, mongoclient, mongodb driver, mongo shell



Part One: Introduction to Replication



Replication is the process of synchronizing data across multiple hosts.



1. Data redundancy and availability



Replication provides data redundancy and high availability. By keeping multiple copies of the data on different database servers, replication protects against data loss when a single server fails. With secondaries in place, you can perform disaster failover, data backup, reporting, and so on. In some scenarios replication can also scale read capacity, since clients can distribute read and write requests across different servers.



2. MongoDB Replication Technology



A replica set is a group of mongod instances that maintain the same data set. All write requests are sent to the primary, and the secondaries apply the primary's operations to keep every member's data consistent.



Only one member of a replica set, the primary, can accept write requests from clients. When the primary accepts a write, the operation is recorded in its operation log, called the oplog. The secondaries copy the oplog and apply these operations to keep their data sets consistent with the primary. If the primary becomes unavailable, an eligible secondary takes over as the new primary through an election. You can also add a mongod instance to the replica set as an arbiter: it maintains no data, and its only role is to respond to election requests and heartbeats from the other members of the set, so it does not require powerful server hardware. An arbiter always remains an arbiter, regardless of whether the primary goes down or a secondary becomes the new primary.
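For example, you can check from the mongo shell which member is currently the primary; a minimal sketch:

// isMaster() reports the current primary of the set
db.isMaster().primary
// rs.status() shows each member's state
rs.status().members.forEach(function (m) { print(m.name + " -> " + m.stateStr); })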



1) Asynchronous replication



Secondaries copy the primary's oplog and apply its operations asynchronously.



2) Automatic failover



If the primary cannot communicate with the other members of the set for more than 10 seconds, an eligible secondary becomes the new primary through an election. MongoDB 3.2 introduced a new version of the replication protocol that reduces failover time.
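The failover window is governed by the settings.electionTimeoutMillis replica set option (10000 ms by default, as visible in the rs.conf() output in Part Two). A hedged sketch of lowering it from the mongo shell:

var cfg = rs.conf()
cfg.settings.electionTimeoutMillis = 5000    // detect an unreachable primary sooner
rs.reconfig(cfg)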



3) Read operations



By default, a client's read requests are sent to the primary, but clients can be configured to read from secondaries instead. Because replication is asynchronous, data read from a secondary may lag behind, and be temporarily inconsistent with, the data on the primary.
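For example, to allow reads on a secondary from the mongo shell (a minimal sketch; the collection name is illustrative, and read preference is covered in detail in section 4 of Part Two):

rs.slaveOk()                              // permit reads on this secondary connection
db.testrp.find()
// or set a read preference for the whole connection:
db.getMongo().setReadPref("secondary")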



Part Two: Replica Sets in Practice



1. Replica set members



A replica set consists of three member types: the primary, secondaries, and arbiters. The minimum configuration is one primary, one secondary, and one arbiter, but most real deployments use one primary and two secondaries. Since version 3.0 a replica set can have up to 50 members; before 3.0 the limit was 12.



1) The primary



Only one primary can exist in a replica set, and it receives all write requests. MongoDB applies each write to the data files and records the operation in the log file, the oplog. Secondary members copy the oplog and apply its operations to their own data sets. Every member of the set can receive read requests, but by default an application's read requests are sent directly to the primary. When the primary becomes unavailable, an election is triggered and a new primary is chosen from the remaining secondaries.



In some scenarios, two nodes may transiently both believe they are the primary, typically because of a network partition. At most one of them can complete writes that require majority acknowledgement: the current primary. The other is the former primary that has not yet realized it was demoted. When this occurs, clients still connected to the former primary may observe stale data, which is eventually rolled back.



2) Secondaries



Secondaries replicate the data asynchronously: each secondary copies the primary's oplog and applies the logged operations to its own data set. A replica set can contain multiple secondaries. Several special secondary configurations exist:


    • Priority 0 Replica Set members


Setting a secondary's priority to 0 means it can never become the primary, because a member with priority 0 cannot stand in an election (although it can still vote).


    • Hidden Replica Set Members


A hidden member is invisible to applications. As a prerequisite it must be unable to become primary, that is, its priority must be 0.


    • Delayed Replica Set Members


A delayed member's data set lags behind the primary by a configured amount of time. It is mainly used as a rolling backup, to recover from human error. As a prerequisite, a delayed member must have priority 0 and must be hidden. A configuration sketch for these member types follows below.
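As a hedged sketch (the member indexes are illustrative), these three member types map onto the members[n].priority, members[n].hidden, and members[n].slaveDelay configuration fields:

var cfg = rs.conf()
cfg.members[1].priority = 0        // priority 0: can never become primary
cfg.members[2].priority = 0        // hidden and delayed members require priority 0
cfg.members[2].hidden = true       // hidden: invisible to applications
cfg.members[2].slaveDelay = 3600   // delayed: stays one hour behind the primary
rs.reconfig(cfg)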



3) The arbiter



An arbiter holds no data set and can never become the primary. Its purpose is to give the replica set an odd number of voting members, since it always carries a vote.
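An arbiter is added with the rs.addArb() helper; a minimal sketch, assuming an arbiter mongod is already running on the (illustrative) host below:

rs.addArb("192.168.245.133:27017")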



2. Replica set deployment architecture



The architecture of a replica set affects its capacity and performance. The standard production deployment is a three-member replica set, which provides good fault tolerance and redundancy. In general, avoid unnecessary complexity and design the architecture around the application's actual requirements.



Here are a few common architectures:



1) Three-member replica sets



A replica set requires a minimum of three members. A three-member set comes in two variants: one primary with two secondaries, or one primary, one secondary, and one arbiter.


    • One primary, two secondaries: when the primary becomes unavailable, either secondary can become the new primary through an election
    • One primary, one secondary, one arbiter: when the primary becomes unavailable, the single secondary becomes the new primary


Let's configure and deploy the common one-primary, two-secondary setup:



"Experimental Environment":



Host IP          Host name  Port   Role

192.168.245.129  node1      27017  primary

192.168.245.131  node2      27017  secondary

192.168.245.132  node3      27017  secondary



1.1) Install MongoDB on the three standalone hosts; for instructions, see: http://www.cnblogs.com/mysql-dba/p/5033242.html



1.2) Ensure that the MongoDB instances on the three hosts can connect to each other. Verify as follows:


    • On the node1 host:
mongo --host 192.168.245.131 --port 27017
mongo --host 192.168.245.132 --port 27017
    • On the node2 host:
mongo --host 192.168.245.129 --port 27017
mongo --host 192.168.245.132 --port 27017
    • On the node3 host:
mongo --host 192.168.245.129 --port 27017
mongo --host 192.168.245.131 --port 27017


1.3) Start the mongod service on every host, adding the --replSet "rs0" parameter to specify the replica set name.


mongod --dbpath=/data/db --fork --logpath=/data/log/mongodb.log --replSet "rs0"    # every member of one replica set must use the same replSet name


1.4) Connect to the mongo shell and initialize the replica set configuration:


rs.initiate()    # run this on only one machine, usually the intended primary
{
     "info2" : "no configuration specified. Using a default configuration for the set",
     "me" : "node1:27017",
     "ok" : 1
}
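rs.initiate() also accepts an explicit configuration document, which avoids adding the members one by one later; a hedged sketch for the three hosts of this experiment:

rs.initiate({
    _id: "rs0",    // must match the --replSet name
    members: [
        { _id: 0, host: "192.168.245.129:27017" },
        { _id: 1, host: "192.168.245.131:27017" },
        { _id: 2, host: "192.168.245.132:27017" }
    ]
})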


1.5) Confirm the initial configuration of the replica set:


rs0:SECONDARY> rs.conf()
{
    "_id" : "rs0",
    "version" : 1,
    "protocolVersion" : NumberLong(1),
    "members" : [
        {
            "_id" : 0,
            "host" : "node1:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {
                
            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatIntervalMillis" : 2000,
        "heartbeatTimeoutSecs" : 10,
        "electionTimeoutMillis" : 10000,
        "getLastErrorModes" : {
            
        },
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        }
    }
}


1.6) On the primary, add the other members:


rs0:PRIMARY> rs.add("192.168.245.131")
{ "ok" : 1 }
rs0:PRIMARY> rs.add("192.168.245.132")
{ "ok" : 1 }


1.7) Use the rs.status() method to view the state of the replica set (the output is similar to the rs.status() example in step 1.8 below).




1.8) Run a test to verify replication and failover:


#Insert a document on the primary
rs0:PRIMARY> db.testrp.insert({"name": "test replication"})
WriteResult({ "nInserted" : 1 })
#Query on a secondary (after rs.slaveOk()); the document from the primary is already there
rs0:SECONDARY> db.testrp.find({"name": "test replication"})
{ "_id" : ObjectId("5663906284c32afdaa84c21f"), "name" : "test replication" }
#Kill the primary and see whether a secondary takes over as the new primary
rs0:PRIMARY> rs.status()
{
    "set" : "rs0",
    "date" : ISODate("2015-10-23T01:03:10.811Z"),
    "myState" : 1,
    "term" : NumberLong(2),
    "heartbeatIntervalMillis" : NumberLong(2000),
    "members" : [
        {
            "_id" : 0,
            "name" : "node1:27017",
            "health" : 0,    #the former primary was killed, so health is 0
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : {
                "ts" : Timestamp(0, 0),
                "t" : NumberLong(-1)
            },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2015-10-23T01:03:10.475Z"),
            "lastHeartbeatRecv" : ISODate("2015-10-23T01:02:30.456Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "Connection refused",
            "configVersion" : -1
        },
        {
            "_id" : 1,
            "name" : "192.168.245.131:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",    #this member became the new primary through an election
            "uptime" : 6590,
            "optime" : {
                "ts" : Timestamp(1449365602, 4),
                "t" : NumberLong(2)
            },
            "optimeDate" : ISODate("2015-12-06T01:33:22Z"),
            "infoMessage" : "could not find member to sync from",
            "electionTime" : Timestamp(1449365602, 3),
            "electionDate" : ISODate("2015-12-06T01:33:22Z"),
            "configVersion" : 3,
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "192.168.245.132:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 2879,
            "optime" : {
                "ts" : Timestamp(1449365602, 4),
                "t" : NumberLong(2)
            },
            "optimeDate" : ISODate("2015-12-06T01:33:22Z"),
            "lastHeartbeat" : ISODate("2015-10-23T01:03:10.463Z"),
            "lastHeartbeatRecv" : ISODate("2015-10-23T01:03:09.290Z"),
            "pingMs" : NumberLong(0),
            "syncingTo" : "192.168.245.131:27017",
            "configVersion" : 3
        }
    ],
    "ok" : 1
}


2) Replica sets with four or more members



MongoDB allows more hosts to be added to a replica set, but the following issues must be considered:


    • Ensure that the set has an odd number of voting members; if there is an even number of data-bearing members, deploy an arbiter so that the number of voters in an election is odd.
    • A MongoDB replica set can have at most 50 members, of which at most 7 can vote; once there are 7 voting members, any further members must be non-voting (see the sketch after this list).
    • The location of the members. A majority of the members must be in one data center; for example, with 5 members in total, keep 3 in the same data center.
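A hedged sketch of adding a non-voting member once seven voting members exist (the host and _id are illustrative); a non-voting member should also carry priority 0:

rs.add({ _id: 7, host: "192.168.245.140:27017", votes: 0, priority: 0 })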


3) Geographically distributed replica sets



Deploying a replica set across multiple data centers provides redundancy and increases fault tolerance: even if one data center becomes unavailable, another can continue to provide service. However, the priority of the mongod instances in the other data center (call it the secondary data center) must be set to 0 so that they cannot take over as primary. For example, here is a simple geography-based architecture:






If the primary data center becomes unavailable, you can manually fail over to the secondary data center with minimal downtime. This is also why a majority of the members must be kept in one data center: otherwise no partition of the set can gather enough votes to elect a new primary.
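A hedged sketch of demoting the secondary data center's member so it can never be elected (the member index is illustrative):

var cfg = rs.conf()
cfg.members[2].priority = 0    // member located in the secondary data center
rs.reconfig(cfg)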



3. Replica set high availability



A replica set relies on automatic failover for high availability: when the primary becomes unavailable, a new primary is chosen by election. The members hold the same data set but are otherwise independent. In some cases, the failover process may require a rollback on the former primary.


    • How does an election work during failover?


A replica set uses elections to determine which member becomes the primary. Elections typically occur at two points: after the replica set is first initialized, and whenever the primary becomes unavailable. Elections are an indispensable, independent operation, but they take time to complete: while an election is in progress the set has no primary and cannot accept client writes, and the remaining members are read-only. So if secondary reads are enabled, the read service is unaffected, but the write service is briefly unavailable. MongoDB avoids elections unless they are necessary.



Factors that influence elections:



Heartbeats: every two seconds, each member of the set sends a heartbeat (ping) to the other hosts; if a member does not respond within 10 seconds, it is considered unavailable.



Priority: the higher a member's priority, the more likely it is to become the primary, and a priority of 0 can never become primary. If a lower-priority member happens to win an election, elections will keep being triggered until the highest-priority available member is the primary.



Network partitions: ensure that a majority of the members are in one data center.


    • What gets rolled back after failover?


After a failover completes, the former primary must roll back some write operations before it can rejoin the set as a secondary. A rollback is required only for writes that the former primary accepted but that had not been successfully applied on the secondaries before the outage. This keeps the data consistent when the member rejoins.



MongoDB strives to avoid rollbacks, and they rarely occur, typically only after a network partition. If the primary's writes were replicated to any other member before it became unavailable, and that member remains accessible to a majority of the set, no rollback occurs.



"Collect rollback Data":



If a rollback occurs, the DBA must decide whether to apply the rolled-back writes. MongoDB writes the rollback data to BSON files under the rollback/ directory of the data directory, named as follows:


<database>.<collection>.<timestamp>.bsonrecords.accounts.2011-05-09t18-10-04.0.bson    # Like this file name.


After the former primary has rolled back and been demoted to a secondary, the DBA must apply the rollback data to the new primary as needed. Use bsondump to read the contents of the files, then apply them with the mongorestore tool.
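A hedged sketch using the example file name above (the host and paths are illustrative):

# Inspect the rolled-back documents
bsondump /data/db/rollback/records.accounts.2011-05-09t18-10-04.0.bson
# Re-apply the ones you want to keep to the new primary
mongorestore --host 192.168.245.131 --port 27017 --db records --collection accounts /data/db/rollback/records.accounts.2011-05-09t18-10-04.0.bson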



"Avoid rollback":



A replica set's default write concern is {w: 1}. With this setting, a rollback can still occur if the primary goes down before a write has replicated to any secondary. To avoid this, use the {w: "majority"} write concern instead.
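A minimal sketch (the collection is illustrative):

db.testrp.insert(
    { name: "important write" },
    { writeConcern: { w: "majority", wtimeout: 5000 } }
)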



"Rollback LIMIT":



A mongod instance will not roll back more than 300 megabytes of data; if more than 300 MB must be rolled back, manual intervention is required.



4. How a replica set handles read and write requests



1) Write concern



Whether MongoDB runs on a single host or as a replica set is transparent to the front-end application. For a write operation, the application needs an acknowledgement back: did the write actually succeed? For a replica set, by default the write acknowledgement involves only the primary, but we can require acknowledgement from more hosts. There are two ways to do so:


    • Specify a writeConcern option on the operation:
db.products.insert(
   { item: "envelopes", qty : 100, type: "Clasp" },
   { writeConcern: { w: 2, wtimeout: 5000 } }
)


Here the writeConcern option is supplied on the insert, with w set to 2, indicating the write is acknowledged only after it reaches two hosts in the set, including the primary.



writeConcern parameter description:


{w: <value>, J: <boolean>, Wtimeout: <number>}


w: the number of mongod instances that must acknowledge the write, or the mongod instances carrying a specified tag. The following values can be given:



0: no write acknowledgement at all;



1: acknowledgement from a single mongod instance; in a replica set, the primary;



greater than 1: the number of members that must acknowledge; it must not exceed the number of members in the set, or the write will block;



"majority": in v3.2, the write succeeds only once it has been acknowledged by a majority of the set's voting members, including the primary, and written to those members' on-disk journals;



<tag Set>: Represents the instance sent to the specified tag;



j: whether the write must be written to the journal file; Boolean.



wtimeout: a timeout for the acknowledgement, in milliseconds. For example, if w is set to 10 but the set has only 9 nodes, the write would block forever; setting a timeout prevents the acknowledgement from blocking indefinitely.



When there are many writes, specifying writeConcern on every operation is too cumbersome; instead, you can change the replica set's default, as follows:


    • Modify the replica set configuration:
cfg = rs.conf()
cfg.settings = {}
cfg.settings.getLastErrorDefaults = { w: "majority", wtimeout: 5000 }
rs.reconfig(cfg)


2) Read concern and read preference



Here are two concepts that must be understood.



Read concern (readConcern) is newly introduced in v3.2 and lets the client set the read isolation level, as follows:


readConcern: { level: <majority|local> }


By default, MongoDB uses the local read level, which returns the most recent data on the instance but does not guarantee it will not be rolled back (it may be, when the primary becomes unavailable). You can instead choose the majority level, which guarantees the returned data has been written to a majority of the nodes and will not be rolled back.



To enable the majority read level, specify the --enableMajorityReadConcern parameter when starting the instance, or set replication.enableMajorityReadConcern in the configuration file. The following commands support readConcern:


    • the find command
    • the aggregate command and the db.collection.aggregate() method
    • the distinct command
    • the count command
    • the parallelCollectionScan command
    • the geoNear command
    • the geoSearch command
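A hedged sketch of a majority read from the shell, assuming the instance was started with --enableMajorityReadConcern (the collection name is illustrative):

db.runCommand({
    find: "testrp",
    filter: { name: "test replication" },
    readConcern: { level: "majority" }
})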


Read preference determines which members a MongoDB read request is routed to. By default, a client's read requests go directly to the primary, so when you want read/write separation, or to relieve read pressure on the primary, you can divert reads to the secondaries; that is where read preference comes in. There are five read preference modes:


Read preference mode   Description
primary                Default: read requests go to the primary.
primaryPreferred       Read from the primary in most cases; read from a secondary only when the primary is unavailable.
secondary              Read from secondaries.
secondaryPreferred     Read from secondaries in most cases; read from the primary only when no secondary is available.
nearest                Read from the member with the lowest network latency, whether it is primary or secondary.


These modes can be specified through driver functions, or in the shell, when the application connects to MongoDB. Specifically:



In the shell:


db.collection.find().readPref("nearest")


In a Java program, the client connects to the whole replica set and does not care which machine is primary or secondary, so the read preference parameters do the work.


List <ServerAddress> addresses = new ArrayList <ServerAddress> ();
   ServerAddress address1 = new ServerAddress ("192.168.245.129", 27017);
   ServerAddress address2 = new ServerAddress ("192.168.245.131", 27017);
   ServerAddress address3 = new ServerAddress ("192.168.245.132", 27017);
   addresses.add (address1);
   addresses.add (address2);
   addresses.add (address3);

   MongoClient client = new MongoClient (addresses);
   DB db = client.getDB ("test");
   DBCollection coll = db.getCollection ("testdb");
 
   BasicDBObject object = new BasicDBObject ();
   object.append ("test2", "testval2");
 
   // Read operation reads from replica node
   ReadPreference preference = ReadPreference.secondary ();
   DBObject dbObject = coll.findOne (object, null, preference);
 
   System. Out .println (dbObject);


The read preference mode here is secondary, giving the read/write separation people often talk about. Of course, this mode does not guarantee that the data read is up to date, because secondaries may lag behind the primary.






The five modes should be chosen according to the application's requirements, not picked blindly. General recommendations:


    • For maximum data consistency:


Use the primary read preference together with the majority read concern. Note that when the primary is unavailable, reads will fail if a majority of the members cannot be reached. You can also disable automatic failover, which sacrifices system availability.


    • For maximum system availability:


Use the primaryPreferred read preference, at the cost of extra read load on the primary.


    • For minimum latency: to always read from the lowest-latency node, use nearest.


3) How the read preference is executed



When a non-primary read preference is selected, the MongoDB driver decides which member to read from through the following procedure:


    • Collect the available members and determine each one's type (primary, secondary, etc.)
    • If a tag set was specified, exclude members that do not match it
    • Determine which member is nearest to the client (lowest network latency)
    • Build a candidate list of members within a configured ping distance, in milliseconds, of that nearest member
    • Randomly select one member from the candidate list to serve the read


5. The replication process



Replica set members replicate data continuously: a member first uses an initial sync to capture a full copy of the data set, then continuously records and applies changes. Each member records data changes in its oplog, which is in fact a capped collection.



1) The replica set oplog



The oplog (operation log) is a special capped collection that records every data modification in a rolling fashion. MongoDB applies write operations on the primary and records them in the primary's oplog; secondaries then copy these oplog entries and apply them in a single-threaded manner. Every replica set member keeps its copy of the oplog in the local.oplog.rs collection, reflecting the current state of the database. All members send heartbeats to the other members, and any member can import oplog entries from any other member.
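The oplog can be inspected directly; a minimal sketch that prints the most recent entry:

// The oplog lives in the "local" database on every member
db.getSiblingDB("local").oplog.rs.find().sort({ $natural: -1 }).limit(1).pretty()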


    • Oplog File Size


When a member starts for the first time, MongoDB creates the oplog at a default size, typically 5% of the operating system's free disk space, bounded between roughly 1 GB and 50 GB. This is usually sufficient, but the size can also be changed; see the official documentation for details.


    • Workloads that require a large oplog:
#Updating many documents at once: to preserve idempotency, the oplog must translate the multi-update into individual per-document operations, which takes a lot of space.
#Deletions equal in volume to the inserts.
#Many in-place updates: they do not change the size of the data on disk, but every one must be logged.
    • Oplog status
rs0:PRIMARY> rs.printReplicationInfo()
configured oplog size:   1527.7419919967651MB
log length start to end: 6028secs (1.67hrs)
oplog first event time:  Sun Dec 06 2015 07:52:54 GMT+0800 (CST)
oplog last event time:   Sun Dec 06 2015 09:33:22 GMT+0800 (CST)
now:                     Sat Oct 24 2015 11:30:48 GMT+0800 (CST)
rs0:PRIMARY> 
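The shell also has a helper that reports how far each secondary lags behind the primary; a minimal sketch:

rs.printSlaveReplicationInfo()    // prints each secondary's replication lag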


2) Data synchronization



To keep each member's data current, the secondaries in a replica set sync or copy data from other members. MongoDB provides two ways to synchronize data between members:


    • Initial sync


An initial sync copies all of the data from one member to another. It is generally used when a member is newly added and has no data, or has data but is missing part of its history. During an initial sync, MongoDB will:


1. Clone all databases from the sync source. To clone, mongod queries every collection on the source member and inserts the data into its own copy, including building the _id indexes. The cloning process copies only valid data, ignoring invalid documents.
2. Apply the source's oplog entries to the target member's data set.
3. Build the indexes on all collections.
    • Replication


Once the initial sync is complete, the member continuously copies the oplog from its sync source and applies the operations asynchronously. In general, members sync from the primary, but a member can also automatically change its sync source; for two members to sync, their members[n].buildIndexes values must match. In addition, members never sync from delayed or hidden members.



6. Master-slave replication



For new applications, MongoDB recommends using replica sets instead of the older master-slave replication. Master-slave replication came first, and replica sets were introduced later; it is not covered here, please check the official documentation.



Note: owing to my limited translation skill, the distinction between the "replica set" and "master-slave replication" terminology is sometimes blurred; apologies! Throughout this article, "replica set" is what is meant.





