SequenceFile read and write operations

Source: Internet
Author: User

SequenceFile can serve as a container for a large number of small files on HDFS. HDFS and MapReduce are optimized for large files, so wrapping small files in a SequenceFile allows for more efficient storage and processing.
The keys and values stored in a SequenceFile do not have to be Writable types: any type can be used, as long as it can be serialized and deserialized by a Serialization implementation.

The advantages of SequenceFile are that it stores data as key-value pairs, supports compression, and consolidates large numbers of small files.
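
To make the container idea concrete, here is a minimal JDK-only sketch that packs several small files into one byte stream as length-prefixed key/value records. It is only an illustration of the concept and is not the SequenceFile on-disk format (a real SequenceFile also carries a header, sync markers, and optional compression). The class name SmallFilePacker and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: NOT the real SequenceFile format.
class SmallFilePacker {

    // Pack each (name, content) pair as: key length, key bytes, value length, value bytes.
    static byte[] pack(Map<String, byte[]> files) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> e : files.entrySet()) {
                byte[] key = e.getKey().getBytes(StandardCharsets.UTF_8);
                out.writeInt(key.length);
                out.write(key);
                out.writeInt(e.getValue().length);
                out.write(e.getValue());
            }
        }
        return buffer.toByteArray();
    }

    // Read records back, in order, until the stream is exhausted.
    static Map<String, byte[]> unpack(byte[] packed) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(packed))) {
            while (in.available() > 0) {
                byte[] key = new byte[in.readInt()];
                in.readFully(key);
                byte[] value = new byte[in.readInt()];
                in.readFully(value);
                files.put(new String(key, StandardCharsets.UTF_8), value);
            }
        }
        return files;
    }
}
```

Calling SmallFilePacker.pack on a map of file names to contents yields one byte array, and SmallFilePacker.unpack recovers the same entries in the same order. One large container replaces many tiny files, which is the property that makes SequenceFile attractive on HDFS.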


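// Hadoop classes come from org.apache.hadoop.conf, org.apache.hadoop.fs and org.apache.hadoop.io; FileUtils is org.apache.commons.io.FileUtils.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(new URI("hdfs://single32:9000"), conf);
Path targetPath = new Path("/sfs");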

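// Create an uncompressed SequenceFile on HDFS
final SequenceFile.Writer.Option optPath = SequenceFile.Writer.file(targetPath);
final SequenceFile.Writer.Option optKeyClass = SequenceFile.Writer.keyClass(Text.class);
final SequenceFile.Writer.Option optValueClass = SequenceFile.Writer.valueClass(BytesWritable.class);
// Request no compression explicitly instead of relying on the configured default
final SequenceFile.Writer.Option optCompression = SequenceFile.Writer.compression(SequenceFile.CompressionType.NONE);
final SequenceFile.Writer writer = SequenceFile.createWriter(conf, optPath, optKeyClass, optValueClass, optCompression);
// Collect the .txt files directly under /usr/local/ (non-recursive)
final Collection<File> listFiles = FileUtils.listFiles(new File("/usr/local/"), new String[]{"txt"}, false);
Text key = null;
BytesWritable value = null;
for (File file : listFiles) {
    // Key: the file path; value: the raw file content
    key = new Text(file.getPath());
    value = new BytesWritable(FileUtils.readFileToByteArray(file));
    writer.append(key, value);
}
IOUtils.closeStream(writer);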
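
Since compression is listed among the advantages, the following is a hedged sketch of the same kind of writer with block compression switched on. It assumes hadoop-common is on the classpath; the class name CompressedSequenceFileWriter and the write method are hypothetical, and this is one possible variant rather than the article's own code.

```java
import java.io.IOException;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.SequenceFile.CompressionType;
import org.apache.hadoop.io.Text;

// Hypothetical helper: writes small files into one block-compressed SequenceFile.
class CompressedSequenceFileWriter {

    // Writes every (name, content) entry as one Text/BytesWritable record.
    static void write(Configuration conf, Path target, Map<String, byte[]> files)
            throws IOException {
        // try-with-resources closes the writer even if append() throws.
        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(target),
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(BytesWritable.class),
                // BLOCK compresses batches of records together; no codec is
                // named here, so the library picks its default codec.
                SequenceFile.Writer.compression(CompressionType.BLOCK))) {
            for (Map.Entry<String, byte[]> e : files.entrySet()) {
                writer.append(new Text(e.getKey()), new BytesWritable(e.getValue()));
            }
        }
    }
}
```

Block compression tends to suit many small, similar records because it compresses groups of records at once, whereas CompressionType.RECORD compresses each value on its own.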

// Read the SequenceFile at the specified path on HDFS
final SequenceFile.Reader reader = new SequenceFile.Reader(fs, targetPath, conf);
final Text outputKey = new Text();
final BytesWritable outputValue = new BytesWritable();
while (reader.next(outputKey, outputValue)) {
    // The key holds the original file path; restore the file under /usr/
    final File file = new File("/usr/" + outputKey.toString());
    // copyBytes() returns only the valid bytes; getBytes() may include unused padding from the backing buffer
    FileUtils.writeByteArrayToFile(file, outputValue.copyBytes());
}
IOUtils.closeStream(reader);
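
The read loop can also be written with the option-based Reader constructor and try-with-resources. The sketch below assumes hadoop-common is on the classpath and that the file holds Text keys and BytesWritable values; the class name SequenceFileContents and the readAll method are hypothetical. It loads everything into memory, so it only suits small files.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Hypothetical helper: reads a Text/BytesWritable SequenceFile into a map.
class SequenceFileContents {

    // Loads all records into memory, keeping the order in which they were written.
    static Map<String, byte[]> readAll(Configuration conf, Path source) throws IOException {
        Map<String, byte[]> records = new LinkedHashMap<>();
        try (SequenceFile.Reader reader =
                new SequenceFile.Reader(conf, SequenceFile.Reader.file(source))) {
            // The same key/value objects are reused for every record.
            Text key = new Text();
            BytesWritable value = new BytesWritable();
            while (reader.next(key, value)) {
                // copyBytes() returns exactly getLength() bytes; getBytes() exposes
                // the reusable backing array, which may be longer than the record.
                records.put(key.toString(), value.copyBytes());
            }
        }
        return records;
    }
}
```

Because the reader reuses the key and value objects, each value is copied before it is stored; keeping a reference to value.getBytes() instead would let later records overwrite earlier ones.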

