Tair Source Code Analysis: LevelDB's New CompactRangeSelfLevel Process

Source: Internet
Author: User
Tags: compact

Tair is a distributed key-value (KV) storage engine. When a machine is added or goes down, Tair's dataserver migrates and cleans up data according to the new mapping table generated by the configserver. For this data cleanup, the LevelDB used in Tair has a new compaction mode, CompactRangeSelfLevel. As the name implies, CompactRangeSelfLevel compacts a given key range within a specified level and writes the resulting output files back to that same level rather than to the parent level (L + 1). Let's analyze CompactRangeSelfLevel.

    // Compact the sstables whose file number is less than limit_filenumber
    // and whose keys fall in the range [begin, end].
    // Only level-0 sstables are output to level 1; at every other level the
    // input and the output are at the same level.
    Status DBImpl::CompactRangeSelfLevel(uint64_t limit_filenumber,
                                         const Slice* begin,
                                         const Slice* end) {
      // Initialize a ManualCompaction object
      manual.limit_filenumber = limit_filenumber;
      manual.bg_compaction_func = &DBImpl::BackgroundCompactionSelfLevel;
      // Use TimedWait() to avoid losing a wake-up signal.
      // Schedule a manual compaction for each level in turn.
      for (int level = 0;
           level < config::kNumLevels && manual.compaction_status.ok();
           ++level) {
        ManualCompaction each_manual = manual;
        each_manual.level = level;
        while (each_manual.compaction_status.ok() && !each_manual.done) {
          // A compaction is still running: wait
          while (bg_compaction_scheduled_) {
            bg_cv_.TimedWait(timed_us);
          }
          manual_compaction_ = &each_manual;
          MaybeScheduleCompaction();
          while (manual_compaction_ == &each_manual) {
            bg_cv_.TimedWait(timed_us);
          }
        }
        manual.compaction_status = each_manual.compaction_status;
      }
      return manual.compaction_status;
    }
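
The control flow of the per-level loop can be sketched in isolation. The following is a single-threaded sketch, not Tair's code: SketchStatus, SketchManual, and CompactEachLevel are hypothetical names, the number of levels is passed in as a parameter, and the condition-variable waits are replaced by a direct call to a callback that stands in for one background compaction pass.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins for illustration; not Tair's or LevelDB's real types.
struct SketchStatus {
  bool ok_flag = true;
  bool ok() const { return ok_flag; }
};

struct SketchManual {
  int level = 0;
  bool done = false;  // set by a pass when the level has nothing left to do
  SketchStatus compaction_status;
};

// Visits levels 0 .. num_levels-1 in order. For each level, run_pass is
// called repeatedly until it marks the request done or reports an error.
// The first error stops the whole walk and is returned to the caller.
SketchStatus CompactEachLevel(
    int num_levels, const std::function<void(SketchManual&)>& run_pass) {
  assert(num_levels > 0);
  SketchManual manual;
  for (int level = 0; level < num_levels && manual.compaction_status.ok();
       ++level) {
    SketchManual each = manual;  // fresh copy per level, so done starts false
    each.level = level;
    while (each.compaction_status.ok() && !each.done) {
      run_pass(each);  // the real code schedules this on a background thread
    }
    manual.compaction_status = each.compaction_status;
  }
  return manual.compaction_status;
}
```

Because each level works on its own copy of the request, a level that needs several passes does not affect the state of the next level; only the status is carried forward.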

MaybeScheduleCompaction decides whether to start a background compaction thread, based mainly on whether a compaction task is pending and whether a compaction thread is already running. This function was analyzed in the earlier article on compaction, so it is not described again here. Let's take a detailed look at BackgroundCompactionSelfLevel, the function that does the real work:

    void DBImpl::BackgroundCompactionSelfLevel() {
      do {
        // Level 0 does not restrict the file number.
        /* As the name implies, CompactRangeOneLevel finds all sstables at this
           level whose file number is less than the limit and whose keys fall
           in the range [begin, end]. */
        c = versions_->CompactRangeOneLevel(
            m->level,
            m->level > 0 ? m->limit_filenumber : ~(static_cast<uint64_t>(0)),
            m->begin, m->end);
        if (NULL == c) {   // nothing to compact at this level
          m->done = true;  // all done
          break;
        }
        // Record the end key of this manual compaction
        manual_end = c->input(0, c->num_input_files(0) - 1)->largest;
        CompactionState* compact = new CompactionState(c);
        status = DoCompactionWorkSelfLevel(compact);  // does the actual compaction
        CleanupCompaction(compact);
        c->ReleaseInputs();
        DeleteObsoleteFiles();
        delete c;
        if (shutting_down_.Acquire_Load()) {
          // Ignore compaction errors found during shutting down
        } else if (!status.ok()) {
          m->compaction_status = status;  // save the error
          if (bg_error_.ok()) {           // regardless of paranoid_checks
            bg_error_ = status;
          }
          break;  // exit on the first failure
        }
      } while (false);
      if (!m->done) {
        // We only compacted part of the requested range. Update *m
        // to the range that is left to be compacted.
        m->tmp_storage = manual_end;
        m->begin = &m->tmp_storage;
      }
      // Mark it as done
      manual_compaction_ = NULL;
    }

DoCompactionWorkSelfLevel is where the key-value pairs are actually read and compacted. We do not analyze it in detail, however, because a comparison shows that its main flow is the same as that of DoCompactionWork; it differs only in a few subtle checks and in how some cases are handled. For the DoCompactionWork flow itself, please refer to "LevelDB Source Code Analysis: sstable compaction". Below, we go through the differences so that the actual flow of DoCompactionWorkSelfLevel becomes clear.
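
The file selection that the comment on CompactRangeOneLevel describes can be sketched as follows. This is a minimal sketch under stated assumptions, not the actual implementation: SketchFile, EffectiveLimit, and PickFiles are hypothetical names, keys are plain strings compared bytewise, and both ends of the range are always given.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical file metadata for illustration only.
struct SketchFile {
  uint64_t number;       // file number
  std::string smallest;  // smallest key in the file
  std::string largest;   // largest key in the file
};

// Level 0 ignores the limit, mirroring the
// `m->level > 0 ? m->limit_filenumber : ~0` expression in the excerpt.
uint64_t EffectiveLimit(int level, uint64_t limit_filenumber) {
  assert(level >= 0);
  return level > 0 ? limit_filenumber : ~static_cast<uint64_t>(0);
}

// Returns the files of one level whose number is below the effective limit
// and whose key range overlaps [begin, end].
std::vector<SketchFile> PickFiles(const std::vector<SketchFile>& files,
                                  int level, uint64_t limit_filenumber,
                                  const std::string& begin,
                                  const std::string& end) {
  const uint64_t limit = EffectiveLimit(level, limit_filenumber);
  std::vector<SketchFile> picked;
  for (const SketchFile& f : files) {
    const bool overlaps = !(f.largest < begin || f.smallest > end);
    if (f.number < limit && overlaps) {
      picked.push_back(f);
    }
  }
  return picked;
}
```

With a limit of 5 at level 1, a file numbered 5 is left alone while an overlapping file numbered 1 is picked; at level 0 the same call picks both.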

DoCompactionWorkSelfLevel is basically the same as DoCompactionWork, but it drops a few checks along the way:

1. When DoCompactionWorkSelfLevel iterates over a key, it does not need the ShouldStopBefore check. That check determines whether the current output overlaps too much with level L + 2. A self-level compaction outputs to the current level, so it does not change the overlap with level L + 2.

    Slice key = input->key();
    // if (compact->compaction->ShouldStopBefore(key) &&
    //     compact->builder != NULL) {
    //   status = FinishCompactionOutputFile(compact, input);
    //   if (!status.ok()) {
    //     break;
    //   }
    // }

2. It omits the drop check "if seq <= smallest_snapshot && (type == Deletion || ShouldDrop) && IsBaseLevelForKey(ikey), then drop". Again, this is because the compaction stays within the current level, whereas IsBaseLevelForKey examines the key only in levels L + 2 and above. If the check were to be added here, level L + 1 would also have to be included in its scope.

    } else if (ikey.sequence <= compact->smallest_snapshot &&
               (ikey.type == kTypeDeletion ||  // deleted or ..
                user_comparator()->ShouldDropMaybe(ikey.user_key.data(),
                                                   ikey.sequence,
                                                   expired_end_time)) &&
               // .. user-defined should drop (maybe),
               // based on some condition (e.g. this key has only this update).
               compact->compaction->IsBaseLevelForKey(ikey.user_key)) {
      // For this user key:
      // (1) there is no data in higher levels
      // (2) data in lower levels will have larger sequence numbers
      // (3) data in layers that are being compacted here and have
      //     smaller sequence numbers will be dropped in the next
      //     few iterations of this loop (by rule (A) above).
      // Therefore this deletion marker is obsolete and can be dropped.
      drop = true;
    }
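
The three-part condition above may be easier to read as a pure predicate. The sketch below is only an illustration: ShouldDropEntry and its parameter names are hypothetical, and the two boolean inputs stand in for the results of the comparator call and of IsBaseLevelForKey.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration; it mirrors the shape of the
// condition in the excerpt, with the two lookups reduced to plain booleans.
bool ShouldDropEntry(uint64_t sequence, uint64_t smallest_snapshot,
                     bool is_deletion, bool user_should_drop,
                     bool is_base_level_for_key) {
  // No live snapshot can still need this version of the key.
  const bool invisible_to_snapshots = sequence <= smallest_snapshot;
  // Either a deletion marker or an entry the user-defined hook rejects.
  const bool droppable_kind = is_deletion || user_should_drop;
  // All three parts must hold before the entry may be dropped.
  return invisible_to_snapshots && droppable_kind && is_base_level_for_key;
}
```

The last input is the part that point 2 is about: if the key may still exist in a level that the check does not cover, dropping a deletion marker would let the older value become visible again.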

3. When InstallCompactionResults is called, false is passed as the second parameter, so the function places the newly generated sstables in the current level instead of level L + 1.

    status = InstallCompactionResults(compact);
    // modified to:
    status = InstallCompactionResults(compact, false);  // output files are at the current level, not level + 1

In addition, note that the LevelDB in Tair also differs from Google's open-source LevelDB in some respects. For example, to support features such as expiration, Tair added three interface functions to the comparator. Tair has two such comparators, NumericalComparatorImpl and BitcmpLdbComparatorImpl. Here we take BitcmpLdbComparatorImpl as an example and briefly introduce these functions.

    // Determine whether the key belongs to a bucket that needs to be reclaimed.
    // If this returns true, the key is deleted (that is, reclaimed) during compaction.
    virtual bool ShouldDrop(const char* key, int64_t sequence, uint32_t now = 0) const { return false; }
    // Determine from expire_time whether the key has expired; returns true if it has.
    virtual bool ShouldDropMaybe(const char* key, int64_t sequence, uint32_t now = 0) const { return false; }
    // Whether start_key and key still belong to the same bucket:
    // returns false if they do, true otherwise.
    virtual bool ShouldStopBefore(const Slice& start_key, const Slice& key) const { return false; }

With these three functions, Tair's LDB engine can reclaim data during compaction and decide whether keys are written to the same sstable. Most directly, if ShouldDrop returns true during compaction, the key is marked as dropped and is not written to the new sstable. ShouldStopBefore controls when a new sstable file is started: when it returns true, writing to the current file ends and the next sstable is started, so that different buckets can be written to different sstable files.
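
How a ShouldStopBefore hook can separate buckets into different output files can be sketched as follows. This is an illustration under stated assumptions, not Tair's implementation: the key layout "<bucket>:<user key>" is made up for the example, and SketchBucketOf, SketchShouldStopBefore, and SplitIntoFiles are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical key layout for illustration only: "<bucket>:<user key>".
std::string SketchBucketOf(const std::string& key) {
  return key.substr(0, key.find(':'));
}

// Returns true when key belongs to a different bucket than start_key,
// which is the signal to close the current output file.
bool SketchShouldStopBefore(const std::string& start_key,
                            const std::string& key) {
  return SketchBucketOf(start_key) != SketchBucketOf(key);
}

// Walks keys in sorted order and starts a new output "file" whenever the
// hook reports that the next key no longer belongs with the file's first key.
std::vector<std::vector<std::string>> SplitIntoFiles(
    const std::vector<std::string>& sorted_keys) {
  std::vector<std::vector<std::string>> files;
  for (const std::string& key : sorted_keys) {
    if (files.empty() || SketchShouldStopBefore(files.back().front(), key)) {
      files.emplace_back();
    }
    files.back().push_back(key);
  }
  return files;
}
```

Given the keys "1:a", "1:b", "2:a", "3:z", the sketch produces three files, one per bucket, so a bucket that is later reclaimed touches only its own files.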
