SequenceFile can serve as a container for a large number of small files on HDFS. HDFS and MapReduce are optimized for large files, so packing small files into a SequenceFile allows them to be stored and processed more efficiently.
The keys and values in a SequenceFile do not have to be Writable types. Any type can be used, as long as a Serialization implementation can serialize and deserialize it.
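As a rough illustration of that requirement, the sketch below round-trips a key type that is `java.io.Serializable` but not a Writable, using only JDK object streams. It does not call Hadoop; `SerializableKeySketch`, `FileKey`, `serialize`, and `deserialize` are hypothetical names introduced here. In Hadoop itself, the available serializers are listed in the `io.serializations` configuration property, which is where a serializer for non-Writable types, such as `org.apache.hadoop.io.serializer.JavaSerialization`, would be registered.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// JDK-only sketch: this models the serialize/deserialize contract, not Hadoop's API.
final class SerializableKeySketch {

    // Hypothetical key type: Serializable, but not a Hadoop Writable.
    static final class FileKey implements Serializable {
        private static final long serialVersionUID = 1L;
        final String path;
        final long size;

        FileKey(String path, long size) {
            this.path = path;
            this.size = size;
        }
    }

    // Turns an object into bytes with standard Java serialization.
    static byte[] serialize(Serializable obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuilds the object from the bytes produced by serialize().
    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FileKey original = new FileKey("/usr/local/a.txt", 42L);
        FileKey copy = (FileKey) deserialize(serialize(original));
        System.out.println(copy.path + " " + copy.size);
    }
}
```

Any type that survives this kind of round trip is a candidate key or value type, provided a matching serializer is configured.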
The advantages of SequenceFile are that it stores data as key-value pairs, supports compression, and can consolidate large numbers of small files.
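To make the container idea concrete, here is a minimal JDK-only model of packing small files into one stream of key-value records. It is a simplified sketch and not the real SequenceFile on-disk format (there is no header, sync marker, or compression); the class name `SmallFileContainerSketch` and the `pack` and `unpack` methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of a key-value container; NOT the real SequenceFile format.
final class SmallFileContainerSketch {

    // Writes each (name, content) pair as a length-prefixed key and a length-prefixed value.
    static byte[] pack(Map<String, byte[]> files) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> entry : files.entrySet()) {
                byte[] key = entry.getKey().getBytes(StandardCharsets.UTF_8);
                out.writeInt(key.length);
                out.write(key);
                out.writeInt(entry.getValue().length);
                out.write(entry.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads records back in order until the container is exhausted.
    static Map<String, byte[]> unpack(byte[] container) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(container))) {
            while (in.available() > 0) {
                byte[] key = new byte[in.readInt()];
                in.readFully(key);
                byte[] value = new byte[in.readInt()];
                in.readFully(value);
                files.put(new String(key, StandardCharsets.UTF_8), value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    public static void main(String[] args) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        files.put("/usr/local/a.txt", "alpha".getBytes(StandardCharsets.UTF_8));
        files.put("/usr/local/b.txt", "beta".getBytes(StandardCharsets.UTF_8));
        byte[] container = pack(files);
        for (Map.Entry<String, byte[]> entry : unpack(container).entrySet()) {
            System.out.println(entry.getKey() + " -> "
                    + new String(entry.getValue(), StandardCharsets.UTF_8));
        }
    }
}
```

Many small inputs become one large object, which is the property that suits HDFS and MapReduce. The real format adds the pieces this model leaves out.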
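import java.io.File;
import java.net.URI;
import java.util.Collection;
import org.apache.commons.io.FileUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(new URI("hdfs://single32:9000"), conf);
// An unqualified path is resolved against the default file system in conf
Path targetPath = new Path("/sfs");
// Create a SequenceFile on HDFS that does not use compression
final SequenceFile.Writer.Option optPath = SequenceFile.Writer.file(targetPath);
final SequenceFile.Writer.Option optKeyClass = SequenceFile.Writer.keyClass(Text.class);
final SequenceFile.Writer.Option optValueClass = SequenceFile.Writer.valueClass(BytesWritable.class);
// Request no compression explicitly rather than relying on the configured default
final SequenceFile.Writer.Option optCompression = SequenceFile.Writer.compression(SequenceFile.CompressionType.NONE);
final SequenceFile.Writer writer = SequenceFile.createWriter(conf, optPath, optKeyClass, optValueClass, optCompression);
// Collect the .txt files directly under /usr/local/ (non-recursive)
final Collection<File> listFiles = FileUtils.listFiles(new File("/usr/local/"), new String[]{"txt"}, false);
Text key = null;
BytesWritable value = null;
for (File file : listFiles) {
    // Key: the file path; value: the file contents
    key = new Text(file.getPath());
    value = new BytesWritable(FileUtils.readFileToByteArray(file));
    writer.append(key, value);
}
IOUtils.closeStream(writer);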
// Read the SequenceFile at the specified path on HDFS
final SequenceFile.Reader reader = new SequenceFile.Reader(fs, targetPath, conf);
final Text outputKey = new Text();
final BytesWritable outputValue = new BytesWritable();
while (reader.next(outputKey, outputValue)) {
    // Each key is a stored file path; restore the file under /usr/
    final File file = new File("/usr/" + outputKey.toString());
    // copyBytes() returns only the valid bytes; getBytes() may include stale padding
    FileUtils.writeByteArrayToFile(file, outputValue.copyBytes());
}
IOUtils.closeStream(reader);
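One detail of the read loop deserves care: the same value object is reused for every record, and a reused buffer can be longer than the record it currently holds. The JDK-only sketch below models that behavior with a hypothetical `ReusedBufferSketch` class. It is an assumption-level model of how a reusable byte container such as `BytesWritable` behaves, not Hadoop code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Models a reusable byte container whose backing array grows but never shrinks.
final class ReusedBufferSketch {
    private byte[] buffer = new byte[0];
    private int length = 0;

    // Copies data into the buffer, allocating a larger array only when needed.
    void set(byte[] data) {
        if (data.length > buffer.length) {
            buffer = new byte[data.length];
        }
        System.arraycopy(data, 0, buffer, 0, data.length);
        length = data.length;
    }

    // The whole backing array, which may contain bytes left over from an earlier record.
    byte[] rawBuffer() {
        return buffer;
    }

    // Only the bytes that belong to the current record.
    byte[] copyValid() {
        return Arrays.copyOf(buffer, length);
    }

    public static void main(String[] args) {
        ReusedBufferSketch value = new ReusedBufferSketch();
        value.set("long record".getBytes(StandardCharsets.US_ASCII));
        value.set("short".getBytes(StandardCharsets.US_ASCII));
        // prints "raw=11 valid=5"
        System.out.println("raw=" + value.rawBuffer().length + " valid=" + value.copyValid().length);
    }
}
```

Writing the raw array to disk would append leftover bytes from the longer record, so the output should be limited to the valid length.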
SequenceFile read and write operations