HDFS: Hadoop Distributed File System
HDFS abstracts the storage resources of an entire cluster into a single file system and can hold very large files.
Files are stored as replicated blocks. The default block size is 64 MB.
Access is streaming: write once (append is now supported), read many times.
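Appending to an existing file uses FileSystem.append(). The sketch below is illustrative, not runnable standalone: it assumes a reachable cluster at hdfs://namenode:9000 (the same placeholder address the main example uses) and an existing file; on older releases, append must also be enabled on the cluster.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder cluster address; adjust to your NameNode.
        conf.set("fs.default.name", "hdfs://namenode:9000");
        FileSystem fs = FileSystem.get(conf);
        // Open an existing file for append and add one more line.
        FSDataOutputStream out = fs.append(new Path("/user/hadoop/test/demo2.txt"));
        out.writeBytes("one more appended line\n");
        out.close();
        fs.close();
    }
}
```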
Workloads HDFS is not suited for:
Low-latency data access
Solution: HBase
Large numbers of small files
Solution: use CombineFileInputFormat, or merge the small files into a SequenceFile and store that in HDFS.
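The SequenceFile approach can be sketched as follows: each small file becomes one key/value record, with the file name as the key and the file bytes as the value. This is a hedged sketch, not runnable standalone: it needs a Hadoop cluster, and the input directory /user/hadoop/small and output path /user/hadoop/packed.seq are hypothetical.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class PackSmallFiles {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Hypothetical output path for the packed SequenceFile.
        Path out = new Path("/user/hadoop/packed.seq");
        SequenceFile.Writer writer = SequenceFile.createWriter(
                fs, conf, out, Text.class, BytesWritable.class);
        try {
            // Hypothetical directory of small files to pack.
            for (FileStatus st : fs.listStatus(new Path("/user/hadoop/small"))) {
                byte[] buf = new byte[(int) st.getLen()];
                FSDataInputStream in = fs.open(st.getPath());
                in.readFully(buf);  // read the whole small file
                in.close();
                // Key = file name, value = raw file contents.
                writer.append(new Text(st.getPath().getName()),
                              new BytesWritable(buf));
            }
        } finally {
            writer.close();
        }
        fs.close();
    }
}
```

One SequenceFile then occupies a handful of large blocks instead of thousands of tiny ones, which keeps the NameNode's in-memory metadata small.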
Blocks in HDFS
A block is an independent unit of storage. However, a file smaller than the block size (e.g. 64 MB) does not occupy a whole block's worth of space.
HDFS blocks are much larger than disk blocks in order to minimize addressing (seek) overhead.
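A back-of-the-envelope calculation makes the seek-overhead argument concrete. The numbers below are illustrative assumptions, not from the source: with a ~10 ms seek and a ~100 MB/s transfer rate, keeping seek time to about 1% of transfer time requires roughly 100 MB per seek, which is why HDFS blocks are tens of megabytes rather than a disk's 512-byte sectors.

```java
public class BlockSizeMath {
    public static void main(String[] args) {
        double seekSeconds = 0.010;         // ~10 ms average seek (assumed)
        double transferBytesPerSec = 100e6; // ~100 MB/s sustained transfer (assumed)
        double overheadTarget = 0.01;       // keep seek cost at 1% of transfer time
        // Transfer time must be seek/0.01 = 1 s, so block = rate * seek / target.
        double blockBytes = transferBytesPerSec * seekSeconds / overheadTarget;
        System.out.println((long) (blockBytes / 1e6) + " MB");
    }
}
```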
The NameNode manages the file system namespace, i.e. the metadata of each file, including which DataNodes hold each of its blocks. When a file is requested, the NameNode uses this metadata to locate the data on the DataNode nodes.
Sample: writing and reading data in HDFS with the Java API:

package myexamples;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsExample {

    // Show the file's metadata: the locations of its blocks.
    static void showBlock(FileSystem fs, Path file) throws IOException {
        FileStatus fileStatus = fs.getFileStatus(file);
        BlockLocation[] blocks =
                fs.getFileBlockLocations(fileStatus, 0, fileStatus.getLen());
        for (BlockLocation bl : blocks)
            System.out.println(bl.toString());
    }

    // Read the file line by line.
    static void read(FileSystem fs, Path file) throws IOException {
        FSDataInputStream inStream = fs.open(file);
        BufferedReader br = new BufferedReader(new InputStreamReader(inStream));
        String data;
        while ((data = br.readLine()) != null)
            System.out.println(data);
        br.close();
    }

    // Write 100 lines to the file.
    static void write(FileSystem fs, Path file) throws IOException {
        FSDataOutputStream outStream = fs.create(file);
        BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(outStream));
        for (int i = 1; i < 101; i++) {
            bw.write("line " + i + " welcome to HDFS Java API");
            bw.newLine();
        }
        bw.close();
    }

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // This is important for connecting to Hadoop HDFS;
        // otherwise you get file:///, the local file system.
        conf.set("fs.default.name", "hdfs://namenode:9000");
        FileSystem fs = FileSystem.get(conf);
        System.out.println(fs.getUri());
        Path file = new Path("/user/hadoop/test/demo2.txt");
        if (fs.exists(file))
            fs.delete(file, false);
        write(fs, file);
        read(fs, file);
        showBlock(fs, file);
        fs.close();
    }
}