Original work by Inkfish. Not to be reprinted for commercial purposes; when reproducing, please credit the source (http://blog.csdn.net/inkfish).
Hadoop's default output format is TextOutputFormat, and its output file names cannot be customized. Hadoop 0.19.X provides org.apache.hadoop.mapred.lib.MultipleOutputFormat, which can write multiple files and lets you customize the file names. As of Hadoop 0.20.X, however, all classes in the package containing MultipleOutputFormat are marked as deprecated, so code that keeps using MultipleOutputFormat may stop working in future versions of Hadoop. In this article we implement a simple MultipleOutputFormat ourselves and modify the WordCount sample program that ships with Hadoop to test the result.
Environment:
Ubuntu 8.04 Server 32bit
Hadoop 0.20.1
JDK 1.6.0_16-b01
Eclipse 3.5
All code is divided into 3 classes:
1.LineRecordWriter:
An implementation of RecordWriter that converts each <key, value> pair into one line of text. In Hadoop this class exists as a nested class of TextOutputFormat with protected access, so ordinary programs cannot use it directly. Here we simply extract LineRecordWriter from TextOutputFormat as a separate public class.
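Before the Hadoop listing, a minimal standalone sketch of the line layout such a writer produces may help: key, separator, value, newline, all encoded as UTF-8. The class LineFormatSketch and its methods are hypothetical names used only for illustration. It depends only on the JDK, assumes non-null keys and values, and uses Java 7+ conveniences that the JDK 1.6 environment above does not have.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in (not a Hadoop class) that shows the line layout only. */
class LineFormatSketch {
    private static final byte[] NEWLINE = "\n".getBytes(StandardCharsets.UTF_8);

    /** Writes key, separator, value and a newline as UTF-8 bytes. Assumes non-null arguments. */
    static void writeLine(DataOutputStream out, Object key, Object value, String separator)
            throws IOException {
        out.write(key.toString().getBytes(StandardCharsets.UTF_8));
        out.write(separator.getBytes(StandardCharsets.UTF_8));
        out.write(value.toString().getBytes(StandardCharsets.UTF_8));
        out.write(NEWLINE);
    }

    /** Formats one pair with a tab separator into an in-memory buffer and returns the text. */
    static String format(Object key, Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            writeLine(out, key, value, "\t");
        } catch (IOException e) {
            // An in-memory stream should not fail; rethrow unchecked just in case.
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Prints "hadoop", a tab, "3", then a newline.
        System.out.print(format("hadoop", 3));
    }
}
```

Writing raw bytes to a DataOutputStream, rather than going through a character Writer, mirrors how the listing below handles its output stream.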
package inkfish.hadoop.study;

import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UnsupportedEncodingException;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

/** Excerpted from the LineRecordWriter in {@link TextOutputFormat}. */
public class LineRecordWriter<K, V> extends RecordWriter<K, V> {
    private static final String utf8 = "UTF-8";
    private static final byte[] newline;
    static {
        try {
            newline = "\n".getBytes(utf8);
        } catch (UnsupportedEncodingException uee) {
            throw new IllegalArgumentException("can't find " + utf8 + " encoding");
        }
    }

    protected DataOutputStream out;
    private final byte[] keyValueSeparator;

    // Writes to the given stream, separating key and value with keyValueSeparator.
    public LineRecordWriter(DataOutputStream out, String keyValueSeparator) {
        this.out = out;
        try {
            this.keyValueSeparator = keyValueSeparator.getBytes(utf8);
        } catch (UnsupportedEncodingException uee) {
            throw new IllegalArgumentException("can't find " + utf8 + " encoding");
        }
    }

    // Uses a tab as the default key/value separator.
    public LineRecordWriter(DataOutputStream out) {
        this(out, "\t");
    }

    // Text objects are written directly from their byte buffer; anything else goes through toString().
    private void writeObject(Object o) throws IOException {
        if (o instanceof Text) {
            Text to = (Text) o;
            out.write(to.getBytes(), 0, to.getLength());
        } else {
            out.write(o.toString().getBytes(utf8));