Hadoop: The Definitive Guide, Chapter 3 Notes (HDFS)

1. HDFS Design

HDFS is designed for: very large files (terabyte-scale and beyond), streaming data access (write once, read many times), and commodity hardware (inexpensive machines).

HDFS is not suited to: low-latency data access, large numbers of small files, multiple writers, or arbitrary modification of files.

(Filesystem metadata is held in namenode memory, so the more small files there are, the more memory is consumed. As a rule of thumb, each file, directory, or block takes about 150 bytes of namenode memory; a million files, each occupying one block, therefore needs at least 1,000,000 x 2 x 150 bytes, or roughly 300 MB, of memory.)

2. HDFS Concepts

1). Data Block (block)

The default block size in HDFS is 64 MB. A file smaller than a block does not occupy the block's full space; it only occupies its actual size (for example, a 1 KB file stored in a filesystem with a 64 MB block size takes up 1 KB, not 64 MB). A file larger than 64 MB is split across multiple blocks. HDFS blocks are made this large to minimize addressing (seek) overhead.

The benefits of using blocks: a file can be larger than any single disk in the network, because its blocks can be spread across many disks; making the unit of abstraction a block rather than a file greatly simplifies the storage subsystem (it simplifies data management, and block storage needs no file-level metadata); and blocks fit well with replication, which provides fault tolerance and availability.

The following command lists the blocks that make up each file in the filesystem:

% hadoop fsck / -files -blocks

2). Namenode and Datanode

HDFS has two types of nodes. The namenode is the manager: it manages the filesystem namespace, maintains the filesystem tree and the metadata for all files and directories in the tree, and records the datanodes on which the blocks of each file are located. The namespace is persisted on the local disk in two files: the namespace image (fsimage) and the edit log. The namenode does not persist block locations, however, because that information is rebuilt from the datanodes when the system starts (during which time the cluster is in safe mode and HDFS is read-only). Datanodes are the workers: they store and retrieve blocks when told to (by clients or the namenode), and they periodically report back to the namenode with lists of the blocks they are storing.

A client accesses the filesystem on behalf of the user by interacting with the namenode and the datanodes.

Without the namenode the filesystem cannot be used, so Hadoop provides two mechanisms for namenode fault tolerance. The first is to back up the files that make up the persistent state of the filesystem metadata; the usual configuration writes the fsimage and edit log both to the local disk and to a remotely mounted network filesystem (NFS). The second is to run a secondary namenode, whose main role is to periodically merge the namespace image with the edit log so that the edit log does not grow too large. However, the secondary namenode's state lags behind the primary's, so if the primary fails, some data loss is almost certain. The usual course of action is therefore to copy the metadata files from the NFS mount (method one) to the secondary namenode and run it as the new primary namenode.

3). HDFS Federation

Federation uses multiple independent namenodes/namespaces so that the namenode layer can scale horizontally. The federated namenodes are independent of each other and do not need to coordinate; each manages its own portion of the namespace. The datanodes are used as common block storage: every datanode registers with all namenodes in the cluster, periodically sends heartbeats and block reports to all of them, and carries out commands from any of them. A block pool consists of the blocks belonging to a single namespace, and each datanode may store blocks from every block pool in the cluster. Block pools are managed independently and do not communicate with one another, so the failure of one namenode does not affect the others. A namespace on a namenode together with its block pool is called a namespace volume; it is the basic unit of management. When a namenode/namespace is deleted, the corresponding block pool is deleted on all datanodes, and each namespace volume is upgraded as a unit when the cluster is upgraded.


4). Hadoop High Availability

Hadoop 2.0.0 introduced a mechanism for keeping a standby namenode synchronized with the active namenode. The implementation requires that the two namenodes both have access to a directory on a shared storage device (for example, an NFS mount from a NAS). (Recent versions of Hadoop can be configured for high availability in other ways; see the official documentation for details.)

1). When the active namenode modifies the namespace, the modification is persisted to an edit log in the shared directory. The standby namenode continuously watches this edit log and, when it sees a change, applies it to its own namespace.

2). When the active namenode crashes, the standby namenode takes over as the active one. Before doing so, it ensures that it has read all remaining records from the edit log in the shared directory, so that the two namespaces are fully synchronized before failover.

3). To support fast failover, the standby namenode must also have up-to-date information about block locations in the cluster. To achieve this, every datanode managed by the pair sends block reports and heartbeats to both namenodes.

4). For an HA cluster to operate correctly, it is critical that only one of the two namenodes is active at any moment; otherwise the namespace would become inconsistent, leading to data loss or other undefined results.

3. Command-line interface

Basic commands:

1). Copying local data to HDFS:

% hadoop fs -copyFromLocal input/docs/quangle.txt hdfs://localhost/user/tom/quangle.txt
% hadoop fs -copyFromLocal input/docs/quangle.txt /user/tom/quangle.txt      ---the hdfs:// scheme can be omitted
% hadoop fs -copyFromLocal input/docs/quangle.txt quangle.txt                ---using a relative path

2). Copying data from HDFS back to the local disk and checking that the files are identical:

% hadoop fs -copyToLocal quangle.txt quangle.copy.txt
% md5 input/docs/quangle.txt quangle.copy.txt

3). Listing HDFS files:

% hadoop fs -mkdir books
% hadoop fs -ls .
Found 2 items
drwxr-xr-x   -   tom  supergroup    0  2009-04-02  22:41  /user/tom/books
-rw-r--r--   1   tom  supergroup  118  2009-04-02  22:29  /user/tom/quangle.txt
The columns in this listing are:

file mode, replication factor ("-" for directories), file owner, owning group, file size in bytes (0 for directories), last modified date, last modified time, and the absolute path of the file or directory.

Permission checking is enabled by the dfs.permissions property.

4. Hadoop Filesystems

Hadoop has an abstract notion of a filesystem, of which HDFS is just one implementation. The Java abstract class org.apache.hadoop.fs.FileSystem represents a filesystem in Hadoop, and there are several other concrete implementations (see p. 58 of the book).

The URI scheme is generally used to select an appropriate file system instance for interaction.

To list the files in the root directory of the local filesystem, enter the following command:

% hadoop fs -ls file:///

Interfaces:
Hadoop is written in Java, and most interactions with Hadoop filesystems go through the Java API. Other interfaces are also available, including Thrift, a C library, FUSE, and WebDAV.

5. Java Interface

1). Reading data from a Hadoop URL

Use a java.net.URL object to open a stream and then read data from it:

InputStream in = null;
try {
    in = new URL("hdfs://host/path").openStream();
    // process in
} finally {
    IOUtils.closeStream(in);
}

Code example:

The following program displays files from a Hadoop filesystem on standard output. It works by calling URL.setURLStreamHandlerFactory with an FsUrlStreamHandlerFactory instance. This method can be called at most once per JVM, so it is executed in a static block.

import java.io.InputStream;
import java.net.URL;

import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;

public class URLCat {
    static {
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
    }

    public static void main(String[] args) throws Exception {
        InputStream in = null;
        try {
            in = new URL(args[0]).openStream();
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}


2). Reading data with the FileSystem API

FileSystem is the general filesystem API; its open() method returns an FSDataInputStream, which supports random access.

A Configuration object encapsulates a client's or server's configuration, which is set using configuration files read from the classpath (such as core-site.xml).

import java.io.InputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class FileSystemCat {
    public static void main(String[] args) throws Exception {
        String uri = args[0];
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        InputStream in = null;
        try {
            in = fs.open(new Path(uri));
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}


Sample output:

% hadoop FileSystemCat hdfs://localhost/user/tom/quangle.txt

On the top of the Crumpetty Tree
The Quangle Wangle sat,
But his face you could not see,
On account of his Beaver Hat.

package org.apache.hadoop.fs;

public class FSDataInputStream extends DataInputStream
        implements Seekable, PositionedReadable {
    // implementation elided
}


The Seekable interface supports seeking to a position in the file: seek() can move to an arbitrary absolute position in the file, whereas skip() only advances relative to the current position. seek() is a relatively expensive operation and should be used sparingly.

public interface Seekable {
    void seek(long pos) throws IOException;

    long getPos() throws IOException;
}

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class FileSystemDoubleCat {
    public static void main(String[] args) throws Exception {
        String uri = args[0];
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        FSDataInputStream in = null;
        try {
            in = fs.open(new Path(uri));
            IOUtils.copyBytes(in, System.out, 4096, false);
            in.seek(0); // go back to the start of the file
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}

Sample output:

% hadoop FileSystemDoubleCat hdfs://localhost/user/tom/quangle.txt

On the top of the Crumpetty Tree
The Quangle Wangle sat,
But his face you could not see,
On account of his Beaver Hat.
On the top of the Crumpetty Tree
The Quangle Wangle sat,
But his face you could not see,
On account of his Beaver Hat.

In the interface below, read() reads up to length bytes from the given position in the file into the buffer, starting at the given offset in the buffer, and returns the number of bytes actually read. The first readFully() reads exactly length bytes into the buffer, and the second overload reads buffer.length bytes. None of these methods changes the file's current offset.

public interface PositionedReadable {
    public int read(long position, byte[] buffer, int offset, int length)
            throws IOException;

    public void readFully(long position, byte[] buffer, int offset, int length)
            throws IOException;

    public void readFully(long position, byte[] buffer) throws IOException;
}
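
As a minimal sketch of how these methods might be used (this hypothetical PositionedReadDemo class is not from the book; the offsets and buffer size are arbitrary):

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class PositionedReadDemo {
    public static void main(String[] args) throws Exception {
        String uri = args[0]; // e.g. hdfs://localhost/user/tom/quangle.txt
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        FSDataInputStream in = null;
        try {
            in = fs.open(new Path(uri));
            byte[] buffer = new byte[16];
            // Read up to 16 bytes starting at file offset 4; returns the bytes actually read.
            int n = in.read(4L, buffer, 0, buffer.length);
            System.out.println("read " + n + " bytes at offset 4");
            // Read exactly buffer.length bytes from offset 0 (throws EOFException if the file is shorter).
            in.readFully(0L, buffer);
            // Positioned reads do not move the stream's current position.
            System.out.println("current position: " + in.getPos()); // still 0
        } finally {
            IOUtils.closeStream(in);
        }
    }
}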

3). Writing Data

The simplest way to create a file with the FileSystem class is the create() method, which takes a Path object for the file to be created:

public FSDataOutputStream create(Path f) throws IOException;

There is also an important overload that takes a Progressable, which lets the application be notified of the progress of data being written to the datanodes:

package org.apache.hadoop.util;

public interface Progressable {
    public void progress();
}

As an alternative to creating a new file, append() adds data to the end of an existing file:

public FSDataOutputStream append(Path f) throws IOException;

Example program: copying a local file to a Hadoop filesystem

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.util.Progressable;

public class FileCopyWithProgress {
    public static void main(String[] args) throws Exception {
        String localSrc = args[0];
        String dst = args[1];
        InputStream in = new BufferedInputStream(new FileInputStream(localSrc));
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(dst), conf);
        OutputStream out = fs.create(new Path(dst), new Progressable() {
            public void progress() {
                System.out.print(".");
            }
        });
        IOUtils.copyBytes(in, out, 4096, true);
    }
}

Sample output:

% hadoop FileCopyWithProgress input/docs/1400-8.txt hdfs://localhost/user/tom/1400-8.txt
...............

The create() method returns an FSDataOutputStream which, like FSDataInputStream, has a getPos() method, but it does not permit seeking, because HDFS only allows sequential writes to an open file (or appends to an existing file):

package org.apache.hadoop.fs;

public class FSDataOutputStream extends DataOutputStream implements Syncable {
    public long getPos() throws IOException {
        // implementation elided
    }
    // implementation elided
}
4). Directories

FileSystem provides a mkdirs() method to create a directory; it creates any necessary parent directories and returns true if the directory (and all of its parents) was created successfully:

public boolean mkdirs(Path f) throws IOException;
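
A small sketch of its use (the MakeDir class name and the path are illustrative, not from the book):

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MakeDir {
    public static void main(String[] args) throws Exception {
        String uri = args[0]; // e.g. hdfs://localhost/user/tom/books
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        // Creates the directory and any missing parents, like mkdir -p.
        boolean created = fs.mkdirs(new Path(uri));
        System.out.println("created: " + created);
    }
}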
5). Querying the file system

File metadata: FileStatus. FileSystem's getFileStatus() method returns a FileStatus object for a file or directory, which contains metadata such as the file length, block size, replication, modification time, owner, group, and permissions.

Example code:

public class ShowFileStatusTest {

    private FileSystem fs; // set up in the test fixture (see the original book)

    @Test
    public void fileStatusForFile() throws IOException {
        Path file = new Path("/dir/file");
        FileStatus stat = fs.getFileStatus(file);
        assertThat(stat.getPath().toUri().getPath(), is("/dir/file"));
        assertThat(stat.isDir(), is(false));
        assertThat(stat.getLen(), is(7L));
        assertThat(stat.getModificationTime(),
                is(lessThanOrEqualTo(System.currentTimeMillis())));
        assertThat(stat.getReplication(), is((short) 1));
        assertThat(stat.getBlockSize(), is(64 * 1024 * 1024L));
        assertThat(stat.getOwner(), is("tom"));
        assertThat(stat.getGroup(), is("supergroup"));
        assertThat(stat.getPermission().toString(), is("rw-r--r--"));
    }

    // other code omitted; please see the original book
}


Listing files: the listStatus() methods below list the contents of a directory (or return the status of a single file); the argument can be a file, a directory, or an array of paths, optionally filtered by a PathFilter:

public FileStatus[] listStatus(Path f) throws IOException;
public FileStatus[] listStatus(Path f, PathFilter filter) throws IOException;
public FileStatus[] listStatus(Path[] files) throws IOException;
public FileStatus[] listStatus(Path[] files, PathFilter filter) throws IOException;

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class ListStatus {
    public static void main(String[] args) throws Exception {
        String uri = args[0];
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        Path[] paths = new Path[args.length];
        for (int i = 0; i < paths.length; i++) {
            paths[i] = new Path(args[i]);
        }
        FileStatus[] status = fs.listStatus(paths);
        Path[] listedPaths = FileUtil.stat2Paths(status);
        for (Path p : listedPaths) {
            System.out.println(p);
        }
    }
}

Sample output:

% hadoop ListStatus hdfs://localhost/ hdfs://localhost/user/tom
hdfs://localhost/user
hdfs://localhost/user/tom/books
hdfs://localhost/user/tom/quangle.txt

File patterns: to process a batch of files in a single operation, Hadoop supports globbing with wildcard characters. The globStatus() methods return an array of FileStatus objects for all files matching the given path pattern, sorted by path:

public FileStatus[] globStatus(Path pathPattern) throws IOException;
public FileStatus[] globStatus(Path pathPattern, PathFilter filter) throws IOException;

See the original book for the table of wildcard characters and their meanings.

PathFilter objects: glob patterns are not always expressive enough to describe the set of files you want, so listStatus() and globStatus() accept an optional PathFilter, which allows matching to be controlled programmatically:

package org.apache.hadoop.fs;

public interface PathFilter {
    boolean accept(Path path);
}
Example program: a PathFilter that excludes paths matching a regular expression

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

public class RegexExcludePathFilter implements PathFilter {
    private final String regex;

    public RegexExcludePathFilter(String regex) {
        this.regex = regex;
    }

    public boolean accept(Path path) {
        return !path.toString().matches(regex);
    }
}

Invoking the filter:

fs.globStatus(new Path("/2007/*/*"), new RegexExcludePathFilter("^.*/2007/12/31$"));
6). Deleting Data

Use FileSystem's delete() method to permanently remove a file or directory. If f is a file or an empty directory, the value of recursive is ignored. For a non-empty directory, the directory and its contents are deleted when recursive is true; otherwise an IOException is thrown.

public boolean delete(Path f, boolean recursive) throws IOException;
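
A small sketch of its use (the DeletePath class name and arguments are illustrative, not from the book):

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DeletePath {
    public static void main(String[] args) throws Exception {
        String uri = args[0]; // e.g. hdfs://localhost/user/tom/books
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        // With recursive = true, a non-empty directory is deleted along with its
        // contents; with false, deleting a non-empty directory throws IOException.
        boolean deleted = fs.delete(new Path(uri), true);
        System.out.println("deleted: " + deleted);
    }
}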
6. Data Flow

1). Anatomy of a File Read

The figure in the book shows the main sequence of events when reading a file. The steps are as follows:

1. The client opens the file it wants to read by calling open() on the FileSystem object, which for HDFS is an instance of DistributedFileSystem.
2. DistributedFileSystem calls the namenode over RPC to determine the locations of the first few blocks of the file. For each block, the namenode returns the addresses of the datanodes that hold a copy, sorted by their distance from the client (how distance is determined is described below). If the client is itself a datanode holding a copy, it reads from the local datanode.
3. DistributedFileSystem returns an FSDataInputStream to the client, which the client reads from. FSDataInputStream wraps a DFSInputStream, which manages the datanode and namenode I/O; the client calls read() on the stream.
4. DFSInputStream, which has stored the datanode addresses for the first few blocks, connects to the closest datanode for the first block, and the client streams data from it by calling read() repeatedly.
5. When the end of a block is reached, DFSInputStream closes the connection to that datanode and finds the best datanode for the next block (requesting further block locations from the namenode as needed). This is transparent to the client, which simply sees a continuous stream.
6. When the client has finished reading, it calls close() on the FSDataInputStream.

Fault tolerance during reads:

During a read, if the client encounters an error communicating with a datanode, it tries the next closest datanode holding the block, and it remembers the failed datanode so that it does not retry it for later blocks. The client also verifies checksums on the data it receives from datanodes; if it detects a corrupt block, it reports this to the namenode before reading a replica from another datanode.

An important aspect of this design is that clients contact datanodes directly to retrieve data, with the namenode only directing them to the best datanode for each block. This allows HDFS to scale to a large number of concurrent clients, because data traffic is spread across all the datanodes in the cluster. The namenode only serves block-location requests (which it answers from memory, so it is very efficient) and does not serve data itself, which would quickly become a bottleneck as the number of clients grows.

Network topology and Hadoop

Hadoop models the network as a tree, and the distance between two nodes is the distance from the first node up to their closest common ancestor plus the distance from the second node up to that same ancestor. A small sketch after the examples below illustrates the calculation.

Here are some examples:

Distance (/d1/r1/n1,/d1/r1/n1) = 0 (process on the same node)

Distance (/d1/r1/n1,/d1/r1/n2) = 2 (different nodes on the same rack)

Distance (/d1/r1/n1,/d1/r2/n3) = 4 (nodes on different racks in the same data center)

Distance (/d1/r1/n1,/d2/r3/n4) = 6 (nodes in different data centers)
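
To make the calculation concrete, here is a minimal sketch (my own illustration, not Hadoop's actual NetworkTopology class) that computes the distance for node locations written as /datacenter/rack/node paths:

public class TopologyDistance {

    // Distance = (levels from a up to the closest common ancestor)
    //          + (levels from b up to that ancestor).
    static int distance(String a, String b) {
        String[] pa = a.substring(1).split("/");
        String[] pb = b.substring(1).split("/");
        int common = 0;
        while (common < pa.length && common < pb.length && pa[common].equals(pb[common])) {
            common++;
        }
        return (pa.length - common) + (pb.length - common);
    }

    public static void main(String[] args) {
        System.out.println(distance("/d1/r1/n1", "/d1/r1/n1")); // 0: same node
        System.out.println(distance("/d1/r1/n1", "/d1/r1/n2")); // 2: same rack
        System.out.println(distance("/d1/r1/n1", "/d1/r2/n3")); // 4: same data center
        System.out.println(distance("/d1/r1/n1", "/d2/r3/n4")); // 6: different data centers
    }
}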

2). Anatomy of a File Write

The figure in the book shows the main sequence of events when writing a file. The steps are as follows:

1. The client creates the file by calling create() on DistributedFileSystem.
2. DistributedFileSystem makes an RPC call to the namenode to create a new file in the namenode's namespace, with no blocks associated with it yet. The namenode performs various checks to make sure the file does not already exist and that the client has permission to create it. If the checks pass, the namenode makes a record of the new file; otherwise creation fails and the client receives an IOException. DistributedFileSystem returns an FSDataOutputStream for the client to start writing to; as in the read case, it wraps a DFSOutputStream, which handles communication with the datanodes and the namenode.
3. As the client writes data, DFSOutputStream splits it into packets, which it writes to an internal queue called the data queue. The data queue is consumed by the DataStreamer, whose job is to ask the namenode to allocate new blocks by choosing a list of suitable datanodes to store the replicas.
4-5. The chosen datanodes form a pipeline; assuming a replication factor of three, there are three nodes in the pipeline. The DataStreamer streams the packets to the first datanode in the pipeline, which stores each packet and forwards it to the second datanode, which in turn stores it and forwards it to the third. DFSOutputStream also maintains an internal queue of packets waiting to be acknowledged by the datanodes, called the ack queue; a packet is removed from the ack queue only when it has been acknowledged by all the datanodes in the pipeline.
6. When the client has finished writing data, it calls close() on the stream. This flushes all the remaining packets to the pipeline and waits for acknowledgements before notifying the namenode that the file is complete.
7. The namenode already knows which blocks the file is made up of (because the DataStreamer asked it for block allocations), so it only has to wait for the blocks to be minimally replicated before returning successfully.
