ZooKeeper Source Analysis: Log and Snapshot Persistence (SyncRequestProcessor Class)

Source: Internet
Author: User

Transaction log persistence is implemented in the SyncRequestProcessor class. The class also rolls the log according to certain rules (closing the current file and creating a new one) and triggers the generation of new snapshots. During persistence, group commit is used to optimize disk I/O. Group commit means appending the transactions of multiple Request objects to the log and flushing them to disk as a single write, so that persisting several transactions can cost only one disk seek. A Request object is passed to the next processor only after its transaction has been synced to disk.
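
To make the group-commit idea concrete, here is a minimal sketch. It is not ZooKeeper's code: the class GroupCommitSketch, the method writeBatch, and the record strings are hypothetical names chosen for illustration. Several records are appended to a file, and a single force() call then syncs the whole batch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of group commit; not taken from the ZooKeeper source.
class GroupCommitSketch {

    /** Appends every record, syncs once, and returns the resulting file size in bytes. */
    static long writeBatch(Path logFile, List<String> records) {
        try (FileChannel ch = FileChannel.open(logFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            for (String record : records) {
                ByteBuffer buf = ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8));
                while (buf.hasRemaining()) {
                    ch.write(buf);      // append only; nothing is forced to disk yet
                }
            }
            ch.force(false);            // one sync covers the whole batch
            return ch.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("txnlog", ".log");
        long size = writeBatch(log, Arrays.asList("create /a", "setData /a", "delete /a"));
        System.out.println("3 records, 1 sync, " + size + " bytes");
        Files.delete(log);
    }
}
```

Syncing after every record would require three force() calls here; batching reduces that to one, at the price of delaying the acknowledgement of the earlier records until the batch is synced.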

SyncRequestProcessor is used in three different scenarios:

    • Leader: syncs the request to disk and forwards it to AckRequestProcessor, which sends an ACK back to the leader itself.
    • Follower: syncs the request to disk and forwards it to SendAckRequestProcessor, which sends an acknowledgement packet to the leader. SendAckRequestProcessor is Flushable, which allows packets to be forcibly pushed to the leader.
    • Observer: syncs committed requests (received in INFORM packets) to disk. It never sends acknowledgement packets to the leader, so nextProcessor is null. This changes the usual semantics of the txnlog on an observer, because it contains only transactions that have already been committed.

SyncRequestProcessor has two key queues:

    • queuedRequests: holds the Request objects handed over by the previous processor. When this processor's processRequest method is called, the Request object is put into queuedRequests.
    • toFlush: holds Request objects that have been appended to the log file but not yet flushed to disk.

The run method of SyncRequestProcessor loops over the Request objects in the queuedRequests queue and persists them.

The processing flow is as follows:

    • If toFlush is empty, the blocking method queuedRequests.take() is called. If toFlush is not empty, the non-blocking method queuedRequests.poll() is called instead.
    • If poll() returns null, the queue has drained, so the transactions of all Request objects in toFlush are flushed to disk immediately and the requests are passed to the next processor. This keeps pending requests from waiting for more traffic to arrive.
    • If poll() returns a non-null value, or take() returns, the transaction of the returned Request object si is appended to the transaction log file and si is added to toFlush.
    • If the size of toFlush then exceeds 1000, the transactions of all Request objects in it are flushed to disk and the requests are passed to the next processor. This bounds request latency when a large number of requests arrive.
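
The take()/poll() decision described above can be tried out in isolation. The following is a simplified sketch under stated assumptions, not the real class: requests are plain strings, DEATH is a hypothetical poison pill, and "flushing" only records the batch size instead of syncing a log. The queue and list types (LinkedBlockingQueue, LinkedList) are also assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the batching loop; not the ZooKeeper implementation.
class SyncLoopSketch {
    static final String DEATH = "__death__";                  // hypothetical poison pill
    final LinkedBlockingQueue<String> queuedRequests = new LinkedBlockingQueue<>();
    final LinkedList<String> toFlush = new LinkedList<>();
    final List<Integer> batchSizes = new ArrayList<>();       // one entry per simulated flush
    final CountDownLatch firstFlush = new CountDownLatch(1);  // lets the demo wait for a flush
    final int maxBatch;

    SyncLoopSketch(int maxBatch) { this.maxBatch = maxBatch; }

    void flush() {
        if (toFlush.isEmpty()) return;
        batchSizes.add(toFlush.size());   // stands in for a single disk sync
        toFlush.clear();                  // stands in for forwarding to the next processor
        firstFlush.countDown();
    }

    void run() {
        try {
            while (true) {
                String si;
                if (toFlush.isEmpty()) {
                    si = queuedRequests.take();      // nothing pending: block
                } else {
                    si = queuedRequests.poll();      // something pending: do not block
                    if (si == null) { flush(); continue; }
                }
                if (DEATH.equals(si)) break;
                toFlush.add(si);                     // "appended to the log", not yet synced
                if (toFlush.size() > maxBatch) flush();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Six preloaded requests with maxBatch = 2: flushes happen when size exceeds 2. */
    static List<Integer> thresholdDemo() {
        SyncLoopSketch p = new SyncLoopSketch(2);
        for (int i = 1; i <= 6; i++) p.queuedRequests.add("r" + i);
        p.queuedRequests.add(DEATH);
        p.run();
        return p.batchSizes;
    }

    /** Two preloaded requests, then silence: poll() returns null and forces a flush. */
    static List<Integer> idleDemo() {
        SyncLoopSketch p = new SyncLoopSketch(1000);
        p.queuedRequests.add("r1");
        p.queuedRequests.add("r2");
        Thread t = new Thread(p::run);
        t.start();
        try {
            p.firstFlush.await();          // wait until the idle flush has happened
            p.queuedRequests.add(DEATH);
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return p.batchSizes;
    }

    public static void main(String[] args) {
        System.out.println(thresholdDemo());   // prints [3, 3]
        System.out.println(idleDemo());        // prints [2]
    }
}
```

The first demo shows the size-triggered flush, the second the flush triggered by an empty queue. As in the listing, a poison pill ends the loop without flushing whatever is still pending.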

After a Request object has been appended to the transaction log, the log record count logCount is compared with (snapCount / 2 + randRoll). If logCount is greater, the log is rolled and a thread that generates a new snapshot is started, unless a previous snapshot thread is still running, in which case the snapshot is skipped. logCount is then reset to 0 and a new randRoll is drawn. randRoll is a random number in the range [0, snapCount / 2); it prevents all machines in a ZooKeeper cluster from taking snapshots at the same time.
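
A small worked example of this threshold may help. In the sketch below, the class and method names are hypothetical, and snapCount = 100000 and the per-server seeds are values assumed for the example. Because randRoll lies in [0, snapCount / 2), each server's threshold falls somewhere between snapCount / 2 and snapCount - 1.

```java
import java.util.Random;

// Hypothetical illustration of the roll threshold; the values are assumed.
class SnapshotRollSketch {

    /** True when the number of logged records exceeds snapCount / 2 + randRoll. */
    static boolean shouldRoll(int logCount, int snapCount, int randRoll) {
        return logCount > (snapCount / 2 + randRoll);
    }

    /** Draws randRoll uniformly from [0, snapCount / 2). */
    static int nextRandRoll(Random r, int snapCount) {
        return r.nextInt(snapCount / 2);
    }

    public static void main(String[] args) {
        int snapCount = 100000;                 // assumed value for the example
        for (int server = 1; server <= 3; server++) {
            Random r = new Random(server);      // a different seed per simulated server
            int randRoll = nextRandRoll(r, snapCount);
            System.out.println("server " + server + " rolls after more than "
                    + (snapCount / 2 + randRoll) + " records");
        }
    }
}
```

With different random offsets, servers that process the same stream of transactions reach their thresholds at different record counts, which spreads the snapshot work over time.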

The SyncRequestProcessor.run method is as follows:

    public void run() {
        try {
            int logCount = 0;

            // The random value randRoll keeps the servers in the ensemble
            // from all taking a snapshot at the same time.
            setRandRoll(r.nextInt(snapCount / 2));
            while (true) {
                Request si = null;
                // If toFlush is empty, call the blocking method take().
                if (toFlush.isEmpty()) {
                    si = queuedRequests.take();
                }
                // If toFlush is not empty, call the non-blocking method poll().
                else {
                    si = queuedRequests.poll();
                    // A null result means queuedRequests is empty, so flush now.
                    if (si == null) {
                        flush(toFlush);
                        continue;
                    }
                }
                // If si is the poison pill, exit the loop.
                if (si == requestOfDeath) {
                    break;
                }
                if (si != null) {
                    // Track the number of records written to the log.
                    if (zks.getZKDatabase().append(si)) {
                        logCount++;
                        if (logCount > (snapCount / 2 + randRoll)) {
                            randRoll = r.nextInt(snapCount / 2);
                            // Roll the transaction log.
                            zks.getZKDatabase().rollLog();
                            // Take a snapshot.
                            if (snapInProcess != null && snapInProcess.isAlive()) {
                                LOG.warn("Too busy to snap, skipping");
                            } else {
                                // Create the snapshot thread.
                                snapInProcess = new Thread("Snapshot Thread") {
                                    public void run() {
                                        try {
                                            zks.takeSnapshot();
                                        } catch (Exception e) {
                                            LOG.warn("Unexpected exception", e);
                                        }
                                    }
                                };
                                // Start the snapshot thread.
                                snapInProcess.start();
                            }
                            logCount = 0;
                        }
                    } else if (toFlush.isEmpty()) {
                        // Optimization for read-heavy workloads: if this is a read
                        // and there are no pending flushes (writes), pass it
                        // directly to the next processor.
                        if (nextProcessor != null) {
                            nextProcessor.processRequest(si);
                            if (nextProcessor instanceof Flushable) {
                                ((Flushable) nextProcessor).flush();
                            }
                        }
                        continue;
                    }
                    toFlush.add(si);
                    // If toFlush holds more than 1000 requests, flush.
                    if (toFlush.size() > 1000) {
                        flush(toFlush);
                    }
                }
            }
        } catch (Throwable t) {
            LOG.error("Severe unrecoverable error, exiting", t);
            running = false;
            System.exit(11);
        }
        LOG.info("SyncRequestProcessor exited!");
    }
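
The listing calls flush(toFlush) without showing its body. Going by the behaviour described earlier (sync the pending transactions to disk once, then hand each request to the next processor), that step could be sketched roughly as follows. This is an illustration under assumptions, not the method from the ZooKeeper source: the TxnLog and NextProcessor interfaces are simplified, hypothetical stand-ins, and requests are plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Simplified stand-ins; the interfaces below are hypothetical, not ZooKeeper's real types.
class FlushSketch {
    interface TxnLog { void commit(); }
    interface NextProcessor { void processRequest(String request); }

    /** Syncs the log once, then forwards every pending request in arrival order. */
    static void flush(LinkedList<String> toFlush, TxnLog log, NextProcessor next) {
        if (toFlush.isEmpty()) {
            return;                          // nothing pending, so no disk sync at all
        }
        log.commit();                        // one sync for the whole batch
        while (!toFlush.isEmpty()) {
            String request = toFlush.remove();
            if (next != null) {              // next may be null, as on an observer
                next.processRequest(request);
            }
        }
    }

    static String demo() {
        int[] commits = {0};
        List<String> forwarded = new ArrayList<>();
        LinkedList<String> toFlush = new LinkedList<>(Arrays.asList("r1", "r2", "r3"));
        flush(toFlush, () -> commits[0]++, forwarded::add);
        return "commits=" + commits[0] + " forwarded=" + forwarded + " pending=" + toFlush.size();
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints commits=1 forwarded=[r1, r2, r3] pending=0
    }
}
```

Forwarding only after the commit preserves the guarantee stated at the top of the article: a request reaches the next processor only once its transaction is on disk.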

When reprinting, please include the original blog address: http://blog.csdn.net/jeff_fangji/article/details/44046997
