HRegionServer Flush Source Code Analysis


The flush operation is the process of persisting the data that HBase holds in memory to disk. This article briefly walks through the source code of the flush process.

Submit a Flush task

When an HRegion writes data, it checks whether a flush is required. A flush persists the data cached by the HRegion to disk:

    long addedSize = doMiniBatchMutation(batchOp);
    long newSize = this.addAndGetGlobalMemstoreSize(addedSize);
    if (isFlushSize(newSize)) {
        requestFlush();
    }
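The size check above can be sketched in isolation. This is only an illustration, not HBase code: the class FlushThresholdSketch, its field names, and the "strictly greater than the threshold" comparison are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a region that tracks how much data it has buffered. */
final class FlushThresholdSketch {
    private final AtomicLong memstoreSize = new AtomicLong();
    private final long flushSizeBytes; // assumed flush threshold, in bytes
    private int flushRequests = 0;

    FlushThresholdSketch(long flushSizeBytes) {
        this.flushSizeBytes = flushSizeBytes;
    }

    /** Records a write of addedSize bytes; returns true if it triggered a flush request. */
    boolean write(long addedSize) {
        // Same shape as the snippet above: add, read the new total, compare.
        long newSize = memstoreSize.addAndGet(addedSize);
        if (newSize > flushSizeBytes) {
            flushRequests++; // stands in for requestFlush()
            return true;
        }
        return false;
    }

    int flushRequests() {
        return flushRequests;
    }
}
```

With a threshold of 100 bytes, a first write of 60 bytes stays below the limit, and a second write of 60 bytes pushes the total to 120 and triggers one flush request.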

This article focuses on the flush process and its related data structures. Internally, requestFlush calls:
    this.rsServices.getFlushRequester().requestFlush(this);
The flush requester here is MemStoreFlusher, whose requestFlush method queues the flush request:

    public void requestFlush(HRegion r) {
        synchronized (regionsInQueue) {
            if (!regionsInQueue.containsKey(r)) {
                // This entry has no delay so it will be added at the top of the flush
                // queue.  It'll come out near immediately.
                FlushRegionEntry fqe = new FlushRegionEntry(r);
                this.regionsInQueue.put(r, fqe);
                this.flushQueue.add(fqe);
            }
        }
    }

MemStoreFlusher uses two data structures to manage the regions that need flushing:

    private BlockingQueue<FlushQueueEntry> flushQueue
    Map<HRegion, FlushRegionEntry> regionsInQueue

flushQueue is the work queue of pending flushes, while regionsInQueue records which regions already have an entry in that queue. The code above adds a flush request only when the region is not already recorded in regionsInQueue. FlushRegionEntry is the element type stored in flushQueue.
At this point the flush request has been submitted. Next, the FlushHandler thread in MemStoreFlusher takes the region from the queue and executes the flush task.
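The pairing of a work queue with a lookup map can be illustrated with a minimal sketch. The names FlushQueueSketch, Entry, and pollNext are hypothetical, regions are represented by plain strings, and removing the map entry at poll time is a simplification chosen for this example rather than a statement about when HBase does it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: a work queue plus a map that prevents duplicate entries. */
final class FlushQueueSketch {
    /** Queue element; stands in for FlushRegionEntry. */
    static final class Entry {
        final String regionName;
        Entry(String regionName) { this.regionName = regionName; }
    }

    private final BlockingQueue<Entry> flushQueue = new LinkedBlockingQueue<>();
    private final Map<String, Entry> regionsInQueue = new HashMap<>();

    /** Queues a flush for the region unless one is already pending. */
    boolean requestFlush(String regionName) {
        synchronized (regionsInQueue) {
            if (regionsInQueue.containsKey(regionName)) {
                return false; // already queued, nothing to do
            }
            Entry entry = new Entry(regionName);
            regionsInQueue.put(regionName, entry);
            flushQueue.add(entry);
            return true;
        }
    }

    /** Takes the next entry, if any, and forgets that the region was queued. */
    Entry pollNext() {
        synchronized (regionsInQueue) {
            Entry entry = flushQueue.poll();
            if (entry != null) {
                regionsInQueue.remove(entry.regionName);
            }
            return entry;
        }
    }

    int queued() {
        return flushQueue.size();
    }
}
```

The map makes the "is this region already queued?" check a constant-time lookup instead of a scan of the queue, which is why both structures are kept in step under the same lock.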

Preparations for executing Flush tasks

1. FlushHandler takes a FlushRegionEntry from flushQueue and runs
flushRegion(final FlushRegionEntry fqe)
This method first checks whether the region has too many StoreFiles. If it does, a compaction of the StoreFiles is requested first (understanding why requires knowing how data is organized inside an HRegion), and the entry is put back in the queue. Otherwise, the flush is performed on the region directly:

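    if (isTooManyStoreFiles(region)) {
        // Too many StoreFiles: ask for a compaction first.
        this.server.compactSplitThread.requestSystemCompaction(
            region, Thread.currentThread().getName());
        // Put the entry back in the queue with a delay and retry later.
        this.flushQueue.add(fqe.requeue(this.blockingWaitTime / 100));
    } else {
        return flushRegion(region, false);
    }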
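The requeue step relies on queue entries that carry a delay. A minimal sketch of that idea, assuming the queue behaves like java.util.concurrent.DelayQueue, is shown below. DelayedFlushEntry and its fields are hypothetical names for this example and do not reproduce FlushRegionEntry.

```java
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

/** Hypothetical queue entry that becomes available only after its delay has elapsed. */
final class DelayedFlushEntry implements Delayed {
    final String regionName;
    private long whenToExpireNanos;
    private int requeueCount = 0;

    DelayedFlushEntry(String regionName) {
        this.regionName = regionName;
        this.whenToExpireNanos = System.nanoTime(); // no delay: available at once
    }

    /** Pushes the expiry time into the future and returns this entry for re-adding. */
    DelayedFlushEntry requeue(long delayMillis) {
        this.whenToExpireNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
        this.requeueCount++;
        return this;
    }

    int getRequeueCount() {
        return requeueCount;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(whenToExpireNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
    }
}
```

When such an entry is added to a DelayQueue, a fresh entry can be polled immediately, while an entry returned by requeue stays invisible to poll() until its delay expires. That gives the compaction time to reduce the number of StoreFiles before the flush is attempted again.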

2. The main logic of the flushRegion function is as follows. notifyFlushRequest only does bookkeeping for the flush request, while region.flushcache() performs the actual flush. After it returns, follow-up operations (a split or a compaction) are requested based on the result.

    notifyFlushRequest(region, emergencyFlush);
    HRegion.FlushResult flushResult = region.flushcache();
    boolean shouldCompact = flushResult.isCompactionNeeded();
    // We just want to check the size
    boolean shouldSplit = region.checkSplit() != null;
    if (shouldSplit) {
        this.server.compactSplitThread.requestSplit(region);
    } else if (shouldCompact) {
        server.compactSplitThread.requestSystemCompaction(
            region, Thread.currentThread().getName());
    }
    if (flushResult.isFlushSucceeded()) {
        long endTime = EnvironmentEdgeManager.currentTime();
        server.metricsRegionServer.updateFlushTime(endTime - startTime);
    }

Execution of Flush tasks

flushcache internally calls FlushResult fs = internalFlushcache(status); which actually executes the flush. StoreFlushContext is implemented by StoreFlusherImpl: one StoreFlusherImpl is created for each HStore and performs the flush for that HStore. The flush itself consists of three steps:
1. Take a snapshot of the memstore.

    public void prepare() {
        this.snapshot = memstore.snapshot();
        this.cacheFlushCount = snapshot.getCellsCount();
        this.cacheFlushSize = snapshot.getSize();
        committedFiles = new ArrayList<Path>(1);
    }

2. Write the data in the memstore snapshot to a .tmp file.

    public void flushCache(MonitoredTask status) throws IOException {
        tempFiles = HStore.this.flushCache(cacheFlushSeqNum, snapshot, status);
    }

3. Move the .tmp file to its final location under the corresponding column family (cf), and use a StoreFile to hold the file information of the resulting HFile.

    public boolean commit(MonitoredTask status) throws IOException {
        if (this.tempFiles == null || this.tempFiles.isEmpty()) {
            return false;
        }
        List<StoreFile> storeFiles = new ArrayList<StoreFile>(this.tempFiles.size());
        for (Path storeFilePath : tempFiles) {
            try {
                storeFiles.add(HStore.this.commitFile(storeFilePath, cacheFlushSeqNum, status));
            } catch (IOException ex) {
                LOG.error("Failed to commit store file " + storeFilePath, ex);
                // Try to delete the files we have committed before.
                for (StoreFile sf : storeFiles) {
                    Path pathToDelete = sf.getPath();
                    try {
                        sf.deleteReader();
                    } catch (IOException deleteEx) {
                        LOG.fatal("Failed to delete store file we committed, halting " + pathToDelete, ex);
                        Runtime.getRuntime().halt(1);
                    }
                }
                throw new IOException("Failed to commit the flush", ex);
            }
        }
        for (StoreFile sf : storeFiles) {
            if (HStore.this.getCoprocessorHost() != null) {
                HStore.this.getCoprocessorHost().postFlush(HStore.this, sf);
            }
            committedFiles.add(sf.getPath());
        }
        HStore.this.flushedCellsCount += cacheFlushCount;
        HStore.this.flushedCellsSize += cacheFlushSize;
        // Add new file to store files.  Clear snapshot too while we have the Store write lock.
        return HStore.this.updateStorefiles(storeFiles, snapshot.getId());
    }
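The three steps (snapshot, write to a temporary file, commit by moving the file into place) can be sketched with plain java.nio.file calls. This is a simplified illustration under stated assumptions, not the HStore implementation: ThreePhaseFlushSketch, the string "cells", and the storefile-N naming are all hypothetical, and no HFile format or locking is modelled.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a prepare / flushCache / commit sequence for one store. */
final class ThreePhaseFlushSketch {
    private final Path storeDir;                     // assumed column family directory
    private List<String> active = new ArrayList<>(); // receives new writes
    private List<String> snapshot = new ArrayList<>();
    private Path tempFile;
    private int committedCount = 0;

    ThreePhaseFlushSketch(Path storeDir) {
        this.storeDir = storeDir;
    }

    void put(String cell) {
        active.add(cell);
    }

    int activeSize() {
        return active.size();
    }

    /** Step 1: swap out the active buffer so later writes do not touch the snapshot. */
    void prepare() {
        snapshot = active;
        active = new ArrayList<>();
    }

    /** Step 2: write the snapshot to a file under the .tmp directory. */
    void flushCache() throws IOException {
        Path tmpDir = Files.createDirectories(storeDir.resolve(".tmp"));
        tempFile = Files.createTempFile(tmpDir, "flush-", ".data");
        Files.write(tempFile, snapshot, StandardCharsets.UTF_8);
    }

    /** Step 3: move the temporary file into the store directory and drop the snapshot. */
    Path commit() throws IOException {
        if (tempFile == null) {
            return null; // nothing was flushed
        }
        Path dest = storeDir.resolve("storefile-" + committedCount++);
        Files.move(tempFile, dest, StandardCopyOption.ATOMIC_MOVE);
        tempFile = null;
        snapshot = new ArrayList<>();
        return dest;
    }
}
```

Writing to a temporary location first means a reader never sees a half-written file in the store directory: the file only appears there once the move in the commit step succeeds.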

Now the HBase flush operation is complete.
