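Hadoop does not offer the full set of POSIX file operations, but the basic operations it does provide (open, create, delete, write, seek, read) are enough to work with files conveniently. The following is a typical snippet that opens a file in Hadoop and reads from it: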
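    // hdfsPath and p are Path objects, conf is a Configuration, place is a byte offset
    FileSystem hdfs = hdfsPath.getFileSystem(conf);
    FSDataInputStream inFsData = hdfs.open(p);
    inFsData.seek(place);
    long value = inFsData.readLong();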
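hdfs is an instance of FileSystem, which is an abstract class. Depending on the URI of the path (and, when the path carries no scheme, on the default file system configured in conf), the returned object may be a local file system instance or a DistributedFileSystem instance. For files stored in HDFS, the class that actually performs the operations is DistributedFileSystem.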
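To make the scheme-based selection concrete, here is a minimal sketch in plain Java. It is a toy model, not Hadoop code: the class SchemeDispatchSketch, the method resolve, the labels "distributed" and "local", and the host name in the example URI are all made up for illustration. It only assumes that the choice is driven by the URI scheme, with a fallback to a default URI when the path has none.

```java
import java.net.URI;

// Toy model only: these names are hypothetical and are not Hadoop classes.
class SchemeDispatchSketch {
    // Returns a label for the kind of file system a path URI would map to.
    static String resolve(URI pathUri, URI defaultUri) {
        // A path without a scheme falls back to the configured default URI.
        String scheme = pathUri.getScheme() != null
                ? pathUri.getScheme()
                : defaultUri.getScheme();
        switch (scheme) {
            case "hdfs":
                return "distributed";
            case "file":
                return "local";
            default:
                throw new IllegalArgumentException("No file system for scheme: " + scheme);
        }
    }

    public static void main(String[] args) {
        URI defaultUri = URI.create("hdfs://namenode:9000"); // hypothetical default
        System.out.println(resolve(URI.create("file:///tmp/a.dat"), defaultUri));
        System.out.println(resolve(URI.create("/user/data/a.dat"), defaultUri));
    }
}
```

In this model a path such as /user/data/a.dat has no scheme of its own, so it inherits the scheme of the default URI.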
Let's look at the open operation of DistributedFileSystem:
    public FSDataInputStream open(Path f, int bufferSize) throws IOException {
      statistics.incrementReadOps(1);
      return new DFSClient.DFSDataInputStream(
          dfs.open(getPathName(f), bufferSize, verifyChecksum, statistics));
    }
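As you can see, open returns an FSDataInputStream. Internally it creates a DFSDataInputStream, an inner class of DFSClient, and the argument passed to that constructor is the return value of DFSClient's open function. Here is DFSClient's open function: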
    public DFSInputStream open(String src, int buffersize, boolean verifyChecksum,
                               FileSystem.Statistics stats) throws IOException {
      checkOpen();
      // Get block info from namenode
      return new DFSInputStream(src, buffersize, verifyChecksum);
    }
This open function returns a DFSInputStream object. Here is the DFSInputStream constructor:
    DFSInputStream(String src, int buffersize, boolean verifyChecksum
                   ) throws IOException {
      this.verifyChecksum = verifyChecksum;
      this.buffersize = buffersize;
      this.src = src;
      prefetchSize = conf.getLong("dfs.read.prefetch.size", prefetchSize);
      openInfo();
    }
Below is DFSInputStream's openInfo function, which is the core of the whole open sequence.
    synchronized void openInfo() throws IOException {
      LocatedBlocks newInfo = callGetBlockLocations(namenode, src, 0, prefetchSize);
      if (newInfo == null) {
        throw new FileNotFoundException("File does not exist: " + src);
      }

      // I think this check is not correct. A file could have been appended to
      // between two calls to openInfo().
      if (locatedBlocks != null && !locatedBlocks.isUnderConstruction() &&
          !newInfo.isUnderConstruction()) {
        Iterator<LocatedBlock> oldIter = locatedBlocks.getLocatedBlocks().iterator();
        Iterator<LocatedBlock> newIter = newInfo.getLocatedBlocks().iterator();
        while (oldIter.hasNext() && newIter.hasNext()) {
          if (! oldIter.next().getBlock().equals(newIter.next().getBlock())) {
            throw new IOException("Blocklist for " + src + " has changed!");
          }
        }
      }
      updateBlockInfo(newInfo);
      this.locatedBlocks = newInfo;
      this.currentNode = null;
    }
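Here callGetBlockLocations talks to the namenode over RPC to fetch the locations of the blocks that cover the first prefetchSize bytes of the file. prefetchSize is read from the configuration key dfs.read.prefetch.size; its default is 10 times the default block size, so by default the locations of roughly the first 10 blocks are stored in the stream. The updateBlockInfo call that follows takes the information a datanode reports for the last block and compares it with what the namenode returned; if the two differ, the datanode's information wins, because the namenode and the datanodes can be out of sync.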
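The two ideas in openInfo, the prefetch range and the block list comparison, can be modeled with a few lines of plain Java. This is a simplified sketch rather than the Hadoop implementation: the class OpenInfoSketch and both method names are hypothetical, blocks are represented by bare long IDs, and every block is assumed to have the same fixed size.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of two ideas in openInfo(); not Hadoop code.
class OpenInfoSketch {
    // Number of blocks that overlap the first prefetchSize bytes of the file,
    // assuming a fixed block size (an assumption of this sketch).
    static long blocksInPrefetchRange(long fileLength, long blockSize, long prefetchSize) {
        long covered = Math.min(fileLength, prefetchSize);
        return (covered + blockSize - 1) / blockSize; // ceiling division
    }

    // Pairwise comparison over the overlapping prefix of both lists,
    // mirroring the while loop over oldIter and newIter.
    static boolean sameBlocks(List<Long> oldIds, List<Long> newIds) {
        int n = Math.min(oldIds.size(), newIds.size());
        for (int i = 0; i < n; i++) {
            if (!oldIds.get(i).equals(newIds.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // 1 GB file, 64 MB blocks, prefetch range of 10 blocks
        System.out.println(blocksInPrefetchRange(1024 * mb, 64 * mb, 640 * mb));
        // The new list is longer, but the shared prefix matches
        System.out.println(sameBlocks(Arrays.asList(1L, 2L), Arrays.asList(1L, 2L, 3L)));
    }
}
```

Note that the comparison stops at the end of the shorter list, so a file that has only grown since the previous call still passes the check; only a mismatch inside the shared prefix counts as a changed block list.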
The subsequent seek and read calls operate on the stream. Here is the seek function of DFSInputStream:
    public synchronized void seek(long targetPos) throws IOException {
      if (targetPos > getFileLength()) {
        throw new IOException("Cannot seek after EOF");
      }
      boolean done = false;
      if (pos <= targetPos && targetPos <= blockEnd) {
        //
        // If this seek is to a positive position in the current
        // block, and this piece of data might already be lying in
        // the TCP buffer, then just eat up the intervening data.
        //
        int diff = (int)(targetPos - pos);
        if (diff <= TCP_WINDOW_SIZE) {
          try {
            pos += blockReader.skip(diff);
            if (pos == targetPos) {
              done = true;
            }
          } catch (IOException e) {//make following read to retry
            LOG.debug("Exception while seek to " + targetPos + " from "
                      + currentBlock +" of " + src + " from " + currentNode +
                      ": " + StringUtils.stringifyException(e));
          }
        }
      }
      if (!done) {
        pos = targetPos;
        blockEnd = -1;
      }
    }
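The decision that seek makes can be isolated in a small stand-alone model. The sketch below is not the Hadoop class: SeekSketch and its reconnects counter are hypothetical, the 128 KB value for TCP_WINDOW_SIZE is an assumption of this sketch, and the call to blockReader.skip is replaced by simply advancing pos, as if the skip always succeeded in full.

```java
// Stand-alone model of the seek decision; names are hypothetical, not Hadoop's.
class SeekSketch {
    static final int TCP_WINDOW_SIZE = 128 * 1024; // assumed value for this sketch

    long pos;              // current position in the file
    long blockEnd;         // last byte offset of the current block, -1 if none is open
    final long fileLength;
    int reconnects;        // how often a later read would have to locate a block again

    SeekSketch(long fileLength, long pos, long blockEnd) {
        this.fileLength = fileLength;
        this.pos = pos;
        this.blockEnd = blockEnd;
    }

    void seek(long targetPos) {
        if (targetPos > fileLength) {
            // The original throws IOException; an unchecked exception keeps the sketch short.
            throw new IllegalArgumentException("Cannot seek after EOF");
        }
        boolean done = false;
        // Cheap path: a short forward seek inside the current block.
        if (pos <= targetPos && targetPos <= blockEnd) {
            long diff = targetPos - pos;
            if (diff <= TCP_WINDOW_SIZE) {
                pos += diff; // stands in for blockReader.skip(diff) succeeding fully
                done = true;
            }
        }
        // Otherwise only record the new position and invalidate the current block.
        if (!done) {
            pos = targetPos;
            blockEnd = -1;
            reconnects++;
        }
    }

    public static void main(String[] args) {
        SeekSketch s = new SeekSketch(1_000_000L, 0L, 499_999L);
        s.seek(1_000L);    // short forward seek: handled by skipping
        s.seek(400_000L);  // too far ahead: block is invalidated
        System.out.println(s.pos + " " + s.blockEnd + " " + s.reconnects);
    }
}
```

In this model, backward seeks and long forward seeks do no I/O at all: they set blockEnd to -1, which leaves the work of finding the right block to the next read.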