LevelDB source analysis, part nine: Env

Source: Internet
Author: User
Tags abstract flush fread mutex posix

For portability and flexibility, LevelDB abstracts system-dependent operations (files, processes, time) into Env. Users can implement the interface themselves and pass it in as part of Options; otherwise LevelDB uses its own default implementation.
env.h declares:
  - The virtual base class Env. In env_posix.cc, the derived class PosixEnv inherits from Env and is LevelDB's default implementation.
  - The virtual base classes WritableFile, SequentialFile, and RandomAccessFile: the abstract classes for writing a file, reading it sequentially, and reading it at random offsets.
  - The Logger class, the interface for writing log messages. (The log that prevents data loss when the system terminates abnormally, that is, the on-disk backup of the memtable, is written through WritableFile.)
  - The FileLock class, which represents a file lock.
  - Three global functions, WriteStringToFile, ReadFileToString, and Log, which wrap the interfaces above.

Let's take a look at the default implementations provided in env_posix.cc.

Sequential reads:

class PosixSequentialFile : public SequentialFile {
 private:
  std::string filename_;
  FILE* file_;

 public:
  PosixSequentialFile(const std::string& fname, FILE* f)
      : filename_(fname), file_(f) {}
  virtual ~PosixSequentialFile() { fclose(file_); }

  // Read up to n bytes from the file into scratch[0..n-1], then wrap the
  // bytes that were read in a Slice and store it in *result.
  // Returns an OK status if the read succeeded, a non-OK status otherwise.
  virtual Status Read(size_t n, Slice* result, char* scratch) {
    Status s;
#ifdef BSD
    // fread_unlocked doesn't exist on FreeBSD
    size_t r = fread(scratch, 1, n, file_);
#else
    // size_t fread_unlocked(void* ptr, size_t size, size_t n, FILE* stream);
    //   ptr:    buffer that receives the data
    //   size:   size of each data item, in bytes
    //   n:      number of data items to read
    //   stream: input stream
    //   return value: number of items actually read
    // Because of the "_unlocked" suffix, this function does not lock the
    // stream and is therefore not thread-safe.
    size_t r = fread_unlocked(scratch, 1, n, file_);
#endif
    // The second Slice argument is the number of bytes actually read, because
    // fewer than n bytes may remain when the end of the file is reached.
    *result = Slice(scratch, r);
    if (r < n) {
      if (feof(file_)) {
        // We leave status as ok if we hit the end of the file.
        // If r < n and feof(file_) is nonzero, the end of the file was
        // reached; nothing more to do, and the function returns OK.
      } else {
        // A partial read with an error: return a non-ok status
        s = Status::IOError(filename_, strerror(errno));
      }
    }
    return s;
  }

  // Skip n bytes. This is guaranteed to be no slower than reading n bytes,
  // and may be faster.
  // If the end of the file is reached, the position stays at the end of the
  // file and OK is returned. Otherwise an error status is returned.
  virtual Status Skip(uint64_t n) {
    // int fseek(FILE* stream, long offset, int origin);
    //   stream: file pointer
    //   offset: offset; positive moves forward, negative moves backward
    //   origin: where the offset is measured from:
    //     SEEK_SET: start of the file
    //     SEEK_CUR: current position
    //     SEEK_END: end of the file
    //   SEEK_SET, SEEK_CUR and SEEK_END are 0, 1 and 2 respectively. Examples:
    //     fseek(fp, 100L, 0);  // 100 bytes from the start of the file
    //     fseek(fp, 100L, 1);  // 100 bytes from the current position
    //     fseek(fp, 100L, 2);  // 100 bytes relative to the end of the file
    //   return value: 0 on success, nonzero on failure
    if (fseek(file_, n, SEEK_CUR)) {
      return Status::IOError(filename_, strerror(errno));
    }
    return Status::OK();
  }
};
This is the interface LevelDB uses to read a file sequentially from disk. It is built on C stream I/O and the FILE structure. Note that Read uses fread_unlocked, which does not lock the file stream, so callers that access the file concurrently must provide their own synchronization.
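The short-read handling in Read can be tried outside LevelDB. The following is a minimal sketch, not LevelDB code: ReadUpTo and RoundTrip are hypothetical helpers, plain fread is used instead of fread_unlocked, and a tmpfile() stands in for a real database file.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of LevelDB): read up to n bytes into scratch
// and return how many bytes were actually read. *hit_eof is set when the
// short read was caused by end of file rather than by an I/O error.
size_t ReadUpTo(FILE* f, char* scratch, size_t n, bool* hit_eof) {
  assert(f != nullptr && scratch != nullptr && hit_eof != nullptr);
  size_t r = fread(scratch, 1, n, f);
  *hit_eof = (r < n) && feof(f) != 0;
  return r;
}

// Writes contents to a temporary file, then reads it back in pieces of
// chunk bytes, mimicking repeated sequential Read calls.
std::string RoundTrip(const std::string& contents, size_t chunk) {
  assert(chunk > 0);
  FILE* f = tmpfile();
  if (f == nullptr) return std::string();
  fwrite(contents.data(), 1, contents.size(), f);
  rewind(f);

  std::string out;
  std::string scratch(chunk, '\0');
  bool eof = false;
  while (!eof) {
    size_t r = ReadUpTo(f, &scratch[0], chunk, &eof);
    out.append(scratch.data(), r);   // only the r bytes read are valid
    if (r < chunk && !eof) break;    // short read without EOF: I/O error
  }
  fclose(f);
  return out;
}
```

Calling RoundTrip("hello world", 4) performs reads of 4, 4, and 3 bytes; the last one is short because the end of the file was reached, which is the case Read reports as OK.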
Random reads:

class PosixRandomAccessFile : public RandomAccessFile {
 private:
  std::string filename_;
  int fd_;
  mutable boost::mutex mu_;

 public:
  PosixRandomAccessFile(const std::string& fname, int fd)
      : filename_(fname), fd_(fd) {}
  virtual ~PosixRandomAccessFile() { close(fd_); }

  // Compared with the sequential Read of the same name, this one takes an
  // extra parameter, offset, which gives the read position relative to the
  // start of the file. This is what makes random reads possible.
  virtual Status Read(uint64_t offset, size_t n, Slice* result,
                      char* scratch) const {
    Status s;
#ifdef WIN32
    // no pread on Windows so we emulate it with a mutex
    boost::unique_lock<boost::mutex> lock(mu_);
    if (::_lseeki64(fd_, offset, SEEK_SET) == -1L) {
      return Status::IOError(filename_, strerror(errno));
    }
    // int _read(int _FileHandle, void* _DstBuf, unsigned int _MaxCharCount)
    //   _FileHandle:   file descriptor
    //   _DstBuf:       buffer that receives the data
    //   _MaxCharCount: number of bytes to read
    //   return value: number of bytes read on success; on error, returns -1
    //   and sets errno.
    int r = ::_read(fd_, scratch, n);
    *result = Slice(scratch, (r < 0) ? 0 : r);
    lock.unlock();
#else
    // On non-Windows systems, pread is used for random reads. Why is no lock
    // taken here? See the analysis below.
    ssize_t r = pread(fd_, scratch, n, static_cast<off_t>(offset));
    *result = Slice(scratch, (r < 0) ? 0 : r);
#endif
    if (r < 0) {
      // An error: return a non-ok status
      s = Status::IOError(filename_, strerror(errno));
    }
    return s;
  }
};

As you can see, PosixRandomAccessFile uses pread on non-Windows systems to position and read in a single atomic operation. Ordinary random access to a file takes two steps: fseek (lseek) positions the FILE* (fd) at the access point, and fread (read) then reads from that position. The combination of these two calls is not atomic: another thread may perform a file operation between the seek and the read. By contrast, the system guarantees that pread positions and reads atomically, which is why no lock is needed in that branch. Note also that pread does not update the file offset.
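The two properties of pread described above (positioned read in one call, file offset left untouched) can be checked with a small POSIX-only sketch. ReadAt, OffsetAfterPread, and MakeTempFile are hypothetical helper names, and the temporary file under /tmp is an assumption for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <unistd.h>

// Hypothetical helper: read up to n bytes starting at offset with a single
// pread call. Returns the bytes read, or an empty string on error.
std::string ReadAt(int fd, off_t offset, size_t n) {
  std::string buf(n, '\0');
  ssize_t r = pread(fd, &buf[0], n, offset);
  buf.resize(r < 0 ? 0 : static_cast<size_t>(r));
  return buf;
}

// Performs a positioned read and returns the descriptor's file offset
// afterwards; pread is specified not to change it.
off_t OffsetAfterPread(int fd) {
  off_t before = lseek(fd, 0, SEEK_CUR);
  (void)ReadAt(fd, 3, 4);
  off_t after = lseek(fd, 0, SEEK_CUR);
  assert(before == after);
  return after;
}

// Creates an unlinked temporary file holding contents and returns its
// descriptor (or -1 on failure). The caller closes the descriptor.
int MakeTempFile(const std::string& contents) {
  char path[] = "/tmp/pread_demo_XXXXXX";
  int fd = mkstemp(path);
  if (fd < 0) return -1;
  unlink(path);  // the file disappears once fd is closed
  if (write(fd, contents.data(), contents.size()) !=
      static_cast<ssize_t>(contents.size())) {
    close(fd);
    return -1;
  }
  return fd;
}
```

Because each ReadAt call carries its own offset, several threads can share one descriptor without a mutex, which is the design choice PosixRandomAccessFile relies on in its non-Windows branch.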

Note that random reads represent the file with a file descriptor (fd), while sequential reads use a FILE*. A file descriptor is a system-level concept: the fd identifies an entry in the system's open file table. A FILE* is an application-level concept: it points to the data structure the application layer uses to operate on the file.

Sequential write:

class BoostFile : public WritableFile {
 public:
  explicit BoostFile(std::string path) : path_(path), written_(0) {
    Open();
  }
  virtual ~BoostFile() {
    Close();
  }

 private:
  void Open() {
    // We truncate the file as implemented in env_posix
    //   trunc:  discard the existing contents of the file first
    //   out:    open for output (writing)
    //   binary: open in binary mode
    file_.open(path_.generic_string().c_str(),
               std::ios_base::trunc | std::ios_base::out |
                   std::ios_base::binary);
    written_ = 0;
  }

 public:
  virtual Status Append(const Slice& data) {
    Status result;
    file_.write(data.data(), data.size());
    if (!file_.good()) {
      result = Status::IOError(path_.generic_string() + " Append",
                               "Cannot write");
    }
    return result;
  }

  virtual Status Close() {
    Status result;
    try {
      if (file_.is_open()) {
        Sync();
        // Closing the stream writes the buffered data to the file
        // automatically. Sync() is called above to force a flush, making
        // sure the data is written and nothing is lost.
        file_.close();
      }
    } catch (const std::exception& e) {
      result = Status::IOError(path_.generic_string() + " Close", e.what());
    }
    return result;
  }

  virtual Status Flush() {
    file_.flush();
    return Status::OK();
  }

  // Manual flush (empties the output buffer and writes its contents to the
  // file)
  virtual Status Sync() {
    Status result;
    try {
      Flush();
    } catch (const std::exception& e) {
      result = Status::IOError(path_.string() + " Sync", e.what());
    }
    return result;
  }

 private:
  boost::filesystem::path path_;
  boost::uint64_t written_;
  std::ofstream file_;
};

For the differences between ofstream::flush and ofstream::close, see the article "C++ ofstream::flush and ofstream::close".
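To see what the flush in Append/Sync buys, the sketch below writes through a std::ofstream and reads the file back through a second stream before the first one is closed. AppendAndFlush and ReadWholeFile are hypothetical helpers, not LevelDB functions.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: write data to an already open stream and flush the
// stream's buffer. Returns false if the stream is in a failed state.
bool AppendAndFlush(std::ofstream& out, const std::string& data) {
  assert(out.is_open());
  out.write(data.data(), static_cast<std::streamsize>(data.size()));
  out.flush();  // hand the buffered bytes to the operating system
  return out.good();
}

// Reads the whole file through an independent input stream.
std::string ReadWholeFile(const std::string& path) {
  std::ifstream in(path.c_str(), std::ios_base::binary);
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}
```

After AppendAndFlush returns, another reader sees the bytes even though the writer is still open; close() would also flush, but only at the end. Note that ofstream::flush passes the data to the operating system and is not an fsync, so by itself it does not guarantee the bytes have reached the disk.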

File lock:

class BoostFileLock : public FileLock {
 public:
  boost::interprocess::file_lock fl_;
};

virtual Status LockFile(const std::string& fname, FileLock** lock) {
  *lock = NULL;
  Status result;
  try {
    if (!boost::filesystem::exists(fname)) {
      std::ofstream of(fname, std::ios_base::trunc | std::ios_base::out);
    }
    assert(boost::filesystem::exists(fname));
    boost::interprocess::file_lock fl(fname.c_str());
    BoostFileLock* my_lock = new BoostFileLock();
    my_lock->fl_ = std::move(fl);
    if (my_lock->fl_.try_lock())
      *lock = my_lock;
    else
      result = Status::IOError("Acquiring lock " + fname + " failed");
  } catch (const std::exception& e) {
    result = Status::IOError("lock " + fname, e.what());
  }
  return result;
}

virtual Status UnlockFile(FileLock* lock) {
  Status result;
  try {
    BoostFileLock* my_lock = static_cast<BoostFileLock*>(lock);
    my_lock->fl_.unlock();
    delete my_lock;
  } catch (const std::exception& e) {
    result = Status::IOError("Unlock", e.what());
  }
  return result;
}
File locking is implemented by calling Boost's file lock. The lock is there to prevent conflicts between multiple processes. If locking fails, *lock is NULL and a non-OK status is returned; if it succeeds, *lock holds the lock pointer and OK is returned. If the process exits, the lock is released automatically; otherwise the user must call UnlockFile to release it explicitly.
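The same try-lock contract (a null pointer on failure, a lock object on success, explicit unlock) can be sketched without Boost. The example below swaps in POSIX fcntl record locks, which is a different mechanism from the boost::interprocess::file_lock call used above; DemoFileLock, TryLockFile, and DemoUnlockFile are hypothetical names.

```cpp
#include <cassert>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Hypothetical stand-in for a FileLock object: it only remembers the
// descriptor that holds the lock.
struct DemoFileLock {
  int fd;
};

// Sets (lock == true) or clears an exclusive advisory lock on the whole
// file. F_SETLK does not block: it returns -1 if another process holds
// a conflicting lock.
static int SetLock(int fd, bool lock) {
  struct flock f = {};
  f.l_type = lock ? F_WRLCK : F_UNLCK;
  f.l_whence = SEEK_SET;
  f.l_start = 0;
  f.l_len = 0;  // 0 means "until the end of the file"
  return fcntl(fd, F_SETLK, &f);
}

// Returns a lock object on success, nullptr on failure.
DemoFileLock* TryLockFile(const std::string& fname) {
  int fd = open(fname.c_str(), O_RDWR | O_CREAT, 0644);
  if (fd < 0) return nullptr;
  if (SetLock(fd, true) == -1) {
    close(fd);
    return nullptr;
  }
  return new DemoFileLock{fd};
}

// Releases the lock, closes the descriptor, and frees the lock object.
bool DemoUnlockFile(DemoFileLock* lock) {
  assert(lock != nullptr);
  bool ok = SetLock(lock->fd, false) == 0;
  close(lock->fd);
  delete lock;
  return ok;
}
```

One caveat of this variant: fcntl locks belong to the process, so they only detect conflicts between different processes, and a second TryLockFile on the same file from the same process would also succeed.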

These methods are simple. The more obscure part is the statement my_lock->fl_ = std::move(fl), which, judging by the function name, moves fl. std::move is a utility function provided by the C++11 standard library in <utility>. The name is confusing, because std::move itself does not move anything: its only job is to cast an lvalue to an rvalue reference, so that the value can then be used with move semantics through that rvalue reference. In terms of implementation, std::move is essentially a type conversion: static_cast<T&&>(lvalue). It is worth noting that the lifetime of the converted lvalue does not change with the conversion. A reader who expects the variable passed to std::move to be destroyed immediately will be disappointed. The concepts of lvalue and rvalue are inherited from C, where an lvalue is a variable (or expression) that can appear on either side of the assignment operator, and an rvalue is a variable (or expression) that can appear only on its right-hand side.
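Both points, that std::move is only a cast and that the moved-from variable stays alive, can be demonstrated in a few lines. This is an illustrative sketch with hypothetical function names; std::unique_ptr is used for the second part because its moved-from state is guaranteed to be null.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// std::move by itself only produces an rvalue reference. No move
// constructor or move assignment runs here, so s keeps its contents.
std::string CastOnly() {
  std::string s = "hello";
  std::string&& r = std::move(s);  // just a cast; r refers to s
  (void)r;
  return s;
}

// The transfer happens in the move constructor that consumes the rvalue.
// The source variable a is still alive afterwards (it is destroyed at the
// end of its scope as usual), but it no longer owns the integer.
bool SourceIsNullAfterMove() {
  std::unique_ptr<int> a(new int(42));
  std::unique_ptr<int> b = std::move(a);
  assert(b && *b == 42);
  return a == nullptr;
}
```

CastOnly() returns "hello" and SourceIsNullAfterMove() returns true: the first shows that the cast alone changes nothing, the second that ownership moves only when a move constructor or move assignment receives the rvalue.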

Scheduling tasks:

PosixEnv has another very important function: scheduling tasks, that is, running the compaction thread in the background. Compaction means compressing and merging, and was also mentioned in part six of this series, "SkipList (2)". For LevelDB, writing a record is simple, and deleting a record only requires writing a deletion marker. Reading a record is more complex: it has to be looked up in memory and then in the files of each level, from newest to oldest, which is expensive. To speed up reads, LevelDB uses compaction to reorganize the existing records, removing key-value data that is no longer valid, shrinking the data, and reducing the number of files.

A task queue is defined in posixenv:

  struct BGItem { void* arg; void (*function)(void*); };
  // Use a deque (double-ended queue) as the underlying data structure
  typedef std::deque<BGItem> BGQueue;
  BGQueue queue_;

Once the main thread decides that a compaction is needed, it pushes a compaction task onto queue_. A BGItem holds the task function and the DB object pointer. The background thread keeps taking tasks from the front of the queue and executing them through the stored function pointers. The BGThread() function simply takes the function pointer out of queue_ and calls it.

The background thread keeps executing the tasks in queue_. Because the contents of queue_ change at run time, the empty-queue case has to be handled. LevelDB uses the condition variable boost::condition_variable bgsignal_: when the queue is empty, the thread waits on it until a new task is added. The condition variable is used together with boost::mutex mu_, so that checking the queue and going to sleep happen under the same lock and a notification cannot be missed.

  // Wrapper around BGThread; it simply calls BGThread
  static void* BGThreadWrapper(void* arg) {
    reinterpret_cast<PosixEnv*>(arg)->BGThread();
    return NULL;
  }

void PosixEnv::Schedule(void (*function)(void*), void* arg) {
  boost::unique_lock<boost::mutex> lock(mu_);

  // Start background thread if necessary
  if (!bgthread_) {
    bgthread_.reset(
        new boost::thread(boost::bind(&PosixEnv::BGThreadWrapper, this)));
  }

  // Add to priority queue (push the task onto the queue)
  queue_.push_back(BGItem());
  queue_.back().function = function;
  queue_.back().arg = arg;

  lock.unlock();

  bgsignal_.notify_one();
}

void PosixEnv::BGThread() {
  while (true) {
    // Lock to prevent concurrent access to the queue
    boost::unique_lock<boost::mutex> lock(mu_);
    // If the queue is empty, wait until a notification arrives
    while (queue_.empty()) {
      bgsignal_.wait(lock);
    }
    // Take the task function and its argument from the front of the queue
    void (*function)(void*) = queue_.front().function;
    void* arg = queue_.front().arg;
    queue_.pop_front();

    lock.unlock();
    // Call the function
    (*function)(arg);
  }
}
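The Schedule/BGThread pair can be reproduced in miniature with the C++11 standard library instead of Boost. The sketch below is an assumption-laden illustration, not the LevelDB code: MiniScheduler and Increment are hypothetical names, and the stop_ flag and the joining destructor are additions so that the example can terminate (the loop shown above runs forever).

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical miniature of the scheduling pattern described above.
class MiniScheduler {
 public:
  MiniScheduler() : started_(false), stop_(false) {}

  // Addition for this sketch: ask the worker to finish the remaining tasks
  // and exit, then wait for it.
  ~MiniScheduler() {
    {
      std::lock_guard<std::mutex> guard(mu_);
      stop_ = true;
    }
    bgsignal_.notify_one();
    if (started_) worker_.join();
  }

  void Schedule(void (*function)(void*), void* arg) {
    assert(function != nullptr);
    std::unique_lock<std::mutex> lock(mu_);
    if (!started_) {  // start the background thread on first use
      started_ = true;
      worker_ = std::thread(&MiniScheduler::BGThread, this);
    }
    queue_.push_back(BGItem{arg, function});
    lock.unlock();
    bgsignal_.notify_one();
  }

 private:
  struct BGItem {
    void* arg;
    void (*function)(void*);
  };

  void BGThread() {
    while (true) {
      std::unique_lock<std::mutex> lock(mu_);
      // Wait in a loop: the predicate is rechecked after every wakeup.
      while (queue_.empty() && !stop_) {
        bgsignal_.wait(lock);
      }
      if (queue_.empty()) return;  // stop requested and nothing left to run
      BGItem item = queue_.front();
      queue_.pop_front();
      lock.unlock();  // run the task without holding the lock
      (*item.function)(item.arg);
    }
  }

  std::mutex mu_;
  std::condition_variable bgsignal_;
  std::deque<BGItem> queue_;
  std::thread worker_;
  bool started_;
  bool stop_;
};

// Example task: increments the int that arg points to.
void Increment(void* arg) { ++*static_cast<int*>(arg); }
```

Scheduling Increment five times on the same counter and then destroying the scheduler leaves the counter at 5. The lock is released before the task runs so that new tasks can be scheduled while a long compaction-like job is in progress.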
In addition, PosixEnv provides FileExists, GetChildren, DeleteFile, CreateDir, DeleteDir, GetFileSize, RenameFile, and so on. Their names are self-explanatory, and each is implemented by calling the corresponding Boost function.

EnvWrapper:

LevelDB also implements an EnvWrapper class. It inherits from Env and has a single member variable, Env* target_; every member function of the class calls the corresponding member function of target_. Env is an abstract class, so no object of type Env itself can be created. The object we pass to EnvWrapper's constructor is a PosixEnv, so the calls ultimately reach the member functions of the PosixEnv class. As you might have guessed, this is the proxy pattern from design patterns: EnvWrapper is only a thin wrapper, and the real work is delegated to PosixEnv, the subclass of Env.
The relationship between EnvWrapper, Env, and PosixEnv is as follows: EnvWrapper and PosixEnv both derive from Env, and EnvWrapper holds an Env* target_ that points to a PosixEnv.
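The forwarding structure can be sketched with heavily reduced stand-ins. MiniEnv, DefaultMiniEnv, MiniEnvWrapper, and CountingEnv are hypothetical classes with only two methods each; the real Env interface is much larger.

```cpp
#include <cassert>
#include <string>

// Hypothetical reduced version of the abstract Env interface.
class MiniEnv {
 public:
  virtual ~MiniEnv() {}
  virtual bool FileExists(const std::string& fname) = 0;
  virtual std::string Name() = 0;
};

// Plays the role of PosixEnv: the default, concrete implementation.
// The FileExists logic is a placeholder, not a real file system check.
class DefaultMiniEnv : public MiniEnv {
 public:
  virtual bool FileExists(const std::string& fname) {
    return fname == "CURRENT";
  }
  virtual std::string Name() { return "default"; }
};

// Plays the role of EnvWrapper: every call is forwarded to target_.
class MiniEnvWrapper : public MiniEnv {
 public:
  explicit MiniEnvWrapper(MiniEnv* target) : target_(target) {
    assert(target != nullptr);
  }
  virtual bool FileExists(const std::string& fname) {
    return target_->FileExists(fname);
  }
  virtual std::string Name() { return target_->Name(); }

 private:
  MiniEnv* target_;
};

// A user-defined environment overrides only the method it cares about and
// inherits the forwarding behavior for everything else.
class CountingEnv : public MiniEnvWrapper {
 public:
  explicit CountingEnv(MiniEnv* target) : MiniEnvWrapper(target), calls(0) {}
  virtual bool FileExists(const std::string& fname) {
    ++calls;
    return MiniEnvWrapper::FileExists(fname);
  }
  int calls;
};
```

Wrapping a DefaultMiniEnv in a CountingEnv changes FileExists only; Name() still reaches the default implementation through the wrapper. This is the practical benefit of the pattern: a custom environment does not have to reimplement the whole interface.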


Due to space limitations, the Logger class in Env is analyzed later; see part ten of this series, "Log file". What I took away from Env is an object-oriented programming approach: using virtual base classes to provide a default implementation while still allowing user-defined ones. The operations it defines, such as file locking, thread synchronization, C file stream operations, extracting parts of file names, and creating and deleting files and paths, can be used directly as a reference in my own future projects.
Reference link: http://blog.csdn.net/tmshasha/article/details/47860573
Reference link: http://www.360doc.com/content/14/0325/16/15064667_363619343.shtml



