Sample: Write and Read Data from HDFS with the Java API


HDFS: Hadoop Distributed File System

It abstracts the storage resources of the entire cluster and can hold large files.

Files are stored as replicated blocks. The default block size is 64 MB (128 MB in newer Hadoop releases).
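
As an illustration, the default block size a cluster will use can be queried through the same Java API used later in this article. A minimal sketch, assuming a NameNode at hdfs://namenode:9000 as in the main example:

package myexamples;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class BlockSizeCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://namenode:9000"); // assumed NameNode address
        FileSystem fs = FileSystem.get(conf);
        // Default block size (in bytes) used for newly created files.
        System.out.println("default block size: " + fs.getDefaultBlockSize() + " bytes");
        fs.close();
    }
}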

Streaming data access: write once (append is now supported), read many times.
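
Append is exposed through FileSystem.append. A minimal sketch, assuming the target file already exists and the cluster permits appends (older releases require dfs.support.append to be enabled):

package myexamples;

import java.io.BufferedWriter;
import java.io.OutputStreamWriter;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://namenode:9000"); // assumed NameNode address
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/hadoop/test/demo2.txt"); // assumed existing file
        // Reopen the existing file for append rather than overwriting it.
        FSDataOutputStream out = fs.append(file);
        BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(out));
        bw.write("one more line, appended after the initial write");
        bw.newLine();
        bw.close();
        fs.close();
    }
}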

Workloads HDFS is not well suited for:

Low-latency data access

Solution: HBase

A large number of small files

Solution: CombineFileInputFormat, or directly merge the small files into a SequenceFile stored in HDFS (a sketch follows).
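
A minimal sketch of the SequenceFile approach, storing each small file as one record (key = file name, value = raw bytes); the directory and output paths are assumptions:

package myexamples;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SmallFileMerge {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://namenode:9000"); // assumed NameNode address
        FileSystem fs = FileSystem.get(conf);
        Path srcDir = new Path("/user/hadoop/smallfiles"); // assumed input directory
        Path seqFile = new Path("/user/hadoop/merged.seq"); // assumed output path

        // One record per small file: key = file name, value = file contents.
        SequenceFile.Writer writer =
                SequenceFile.createWriter(fs, conf, seqFile, Text.class, BytesWritable.class);
        for (FileStatus st : fs.listStatus(srcDir)) {
            if (st.isDirectory())
                continue;
            byte[] buf = new byte[(int) st.getLen()];
            FSDataInputStream in = fs.open(st.getPath());
            in.readFully(0, buf);
            in.close();
            writer.append(new Text(st.getPath().getName()), new BytesWritable(buf));
        }
        writer.close();
        fs.close();
    }
}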

Blocks in HDFS

A block is an independent unit of storage. However, if a file is smaller than the default block size (for example, 64 MB), it does not occupy an entire block's worth of space.

HDFS blocks are much larger than disk blocks in order to keep the cost of seeking (addressing) small relative to the cost of transferring data.
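
An illustrative calculation (typical figures, not from the original text): with a 10 ms seek time and a 100 MB/s transfer rate, reading a full 100 MB block costs about 1 s of transfer against 10 ms of seeking, roughly 1% overhead; a 64 MB block still keeps the overhead under 2%. With very small blocks, seek time would dominate.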

The NameNode manages the file system namespace, that is, the metadata of each file, including which DataNode holds each of the file's blocks. When a file is requested, the NameNode uses this metadata to locate the data content on the DataNodes. The example below writes a file to HDFS, reads it back, and prints the locations of its blocks.

package myexamples;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsExample {

    // Show the file's metadata: the location of each of its blocks.
    static void showBlock(FileSystem fs, Path file) throws IOException {
        FileStatus fileStatus = fs.getFileStatus(file);
        BlockLocation[] blocks = fs.getFileBlockLocations(fileStatus, 0, fileStatus.getLen());
        for (BlockLocation bl : blocks)
            System.out.println(bl.toString());
    }

    // Read the file from HDFS line by line and print it.
    static void read(FileSystem fs, Path file) throws IOException {
        FSDataInputStream inStream = fs.open(file);
        String data = null;
        BufferedReader br = new BufferedReader(new InputStreamReader(inStream));
        while ((data = br.readLine()) != null)
            System.out.println(data);
        br.close();
    }

    // Create the file and write 100 lines to it.
    static void write(FileSystem fs, Path file) throws IOException {
        FSDataOutputStream outStream = fs.create(file);
        BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(outStream));
        for (int i = 1; i < 101; i++) {
            bw.write("line " + i + " welcome to HDFS Java API");
            bw.newLine();
        }
        bw.close();
    }

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // This is important for connecting to Hadoop HDFS;
        // otherwise you get file:///, the local file system.
        conf.set("fs.default.name", "hdfs://namenode:9000");
        FileSystem fs = FileSystem.get(conf);
        System.out.println(fs.getUri());
        Path file = new Path("/user/hadoop/test/demo2.txt");
        if (fs.exists(file))
            fs.delete(file, false);
        write(fs, file);
        read(fs, file);
        showBlock(fs, file);
        fs.close();
    }
}
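
A note on running it: fs.default.name still works but is deprecated in Hadoop 2.x in favor of fs.defaultFS, and the Hadoop client jars must be on the classpath; for example, compile against the jars reported by the hadoop classpath command, or package the class into a jar and launch it with hadoop jar.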
