The source code for this example is included in the Hadoop release package. The main function of the WordCount.java class is as follows:
public static void main(String[] args) throws Exception {
    int res = ToolRunner.run(new Configuration(), new WordCount(), args);
    System.exit(res);
}
Let's start from the main function and, in something like a depth-first traversal, take this WordCount word-frequency tool apart to see how it runs in Hadoop.
We begin with the run method of ToolRunner. It takes three parameters: the first is an instance of the Configuration class, the second is an instance of the WordCount class, and args is the array of command-line arguments received from the console. Clearly it will be a while before the analysis reaches WordCount itself, because the Configuration class and the args array alone take some time to trace.
The following is the implementation of the ToolRunner run method:
public static int run(Configuration conf, Tool tool, String[] args)
        throws Exception {
    if (conf == null) {
        // Even if the incoming conf is null, a Configuration object is still instantiated
        conf = new Configuration();
    }
    // A GenericOptionsParser object is built from the given conf and args array;
    // it parses the configuration options common to all Hadoop tools
    GenericOptionsParser parser = new GenericOptionsParser(conf, args);
    // Tool is an interface, and WordCount implements it. The Tool interface defines only
    // a run method, so every Tool implementation must know how to run itself.
    // Because the Tool interface extends the Configurable interface, a tool's initial
    // configuration can be set through Configurable's setConf() method
    tool.setConf(conf);
    // get the args w/o generic hadoop args
    String[] toolArgs = parser.getRemainingArgs(); // the command-line arguments left over after the generic options
    // Runs the WordCount instance with the arguments in the toolArgs array and
    // returns the exit status code of the Tool implementation
    return tool.run(toolArgs);
}
The run method above is the highest-level, most abstract step in executing the WordCount example.
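To make the order of these steps easier to see, here is a minimal sketch that imitates the same flow using only the JDK. Everything in it is a hypothetical stand-in rather than Hadoop code: Conf replaces Configuration, stripGenericArgs replaces GenericOptionsParser and understands only "-D key=value", and EchoTool is an invented tool. Unlike the real Tool interface, run here declares no checked exception, to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ToolRunnerSketch {

    /** Hypothetical stand-in for Configuration: a plain key/value map. */
    static class Conf {
        final Map<String, String> values = new HashMap<>();
    }

    /** Something that can be handed a configuration. */
    interface Configurable {
        void setConf(Conf conf);
    }

    /** A Configurable that also knows how to run itself. */
    interface Tool extends Configurable {
        int run(String[] args);
    }

    /** Simplified generic-option handling: only "-D key=value" is recognized. */
    static String[] stripGenericArgs(Conf conf, String[] args) {
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if ("-D".equals(args[i]) && i + 1 < args.length) {
                String[] kv = args[++i].split("=", 2);
                conf.values.put(kv[0], kv.length > 1 ? kv[1] : "");
            } else {
                remaining.add(args[i]);
            }
        }
        return remaining.toArray(new String[0]);
    }

    /** Same sequence as the run method above: default conf, parse, setConf, run. */
    static int run(Conf conf, Tool tool, String[] args) {
        if (conf == null) {
            conf = new Conf();
        }
        String[] toolArgs = stripGenericArgs(conf, args);
        tool.setConf(conf);
        return tool.run(toolArgs);
    }

    /** Hypothetical tool: succeeds only when it receives exactly two arguments. */
    static class EchoTool implements Tool {
        Conf conf;

        public void setConf(Conf conf) {
            this.conf = conf;
        }

        public int run(String[] args) {
            System.out.println("tool args: " + String.join(" ", args));
            return args.length == 2 ? 0 : 1;
        }
    }

    public static void main(String[] args) {
        int res = run(null, new EchoTool(),
                new String[] {"-D", "mapred.reduce.tasks=2", "in", "out"});
        System.out.println("exit code: " + res);
    }
}
```

The point of the sketch is that the tool never sees the generic option: by the time run is called on it, the "-D" pair has already been folded into the configuration and only "in" and "out" remain.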
At the start of the program, the Hadoop configuration files are dealt with first; they live in the conf directory under the Hadoop root directory. The class responsible is Configuration, and its constructor is as follows:
public Configuration() {
    if (LOG.isDebugEnabled()) {
        LOG.debug(StringUtils.stringifyException(new IOException("config()")));
    }
    resources.add("hadoop-default.xml");
    resources.add("hadoop-site.xml");
}
Instantiating a Configuration object adds the hadoop-default.xml and hadoop-site.xml configuration files in the conf directory to the private ArrayList<Object> resources, so that they can be parsed later.
What actually parses the Hadoop configuration options is GenericOptionsParser, a generic option parser class that must be given a Configuration object and an array of command-line arguments.
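The idea of recording resource names first and reading them only later can be sketched as follows. This is an illustration under stated assumptions, not the Configuration implementation: the "files" are in-memory maps instead of XML, the class and method names are invented, and the sketch assumes that a resource added later overrides the same key from an earlier one. The property values are made up for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ResourceListSketch {
    // Resource names, remembered in the order they were added.
    private final List<String> resources = new ArrayList<>();
    // Hypothetical in-memory "files"; the real class reads XML resources.
    private final Map<String, Map<String, String>> fakeFiles;
    // Merged properties, built lazily on the first lookup.
    private Map<String, String> properties;

    public ResourceListSketch(Map<String, Map<String, String>> fakeFiles) {
        this.fakeFiles = fakeFiles;
        // The constructor only records names; nothing is parsed yet.
        resources.add("hadoop-default.xml");
        resources.add("hadoop-site.xml");
    }

    public String get(String name) {
        if (properties == null) {
            properties = new HashMap<>();
            // Assumption: later resources override earlier ones.
            for (String resource : resources) {
                Map<String, String> content = fakeFiles.get(resource);
                if (content != null) {
                    properties.putAll(content);
                }
            }
        }
        return properties.get(name);
    }

    /** Builds a small sample with one overridden and one inherited property. */
    static ResourceListSketch sample() {
        Map<String, String> defaults = new HashMap<>();
        defaults.put("fs.default.name", "local");
        defaults.put("io.file.buffer.size", "4096");
        Map<String, String> site = new HashMap<>();
        site.put("fs.default.name", "namenode:9000");

        Map<String, Map<String, String>> files = new HashMap<>();
        files.put("hadoop-default.xml", defaults);
        files.put("hadoop-site.xml", site);
        return new ResourceListSketch(files);
    }

    public static void main(String[] args) {
        ResourceListSketch conf = sample();
        System.out.println(conf.get("fs.default.name"));     // value from the site file
        System.out.println(conf.get("io.file.buffer.size")); // value from the defaults
    }
}
```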
The following is the constructor of the GenericOptionsParser class:
public GenericOptionsParser(Configuration conf, String[] args) {
    this(conf, new Options(), args); // an extra Options object is passed along as a parameter here
}
The Options class is a collection of Option objects that describe the command-line arguments an application may use. The constructor of the Options class looks like this:
public Options()
{
    // nothing to do
}
In fact, it does nothing. However, options can be added to an Options object dynamically afterwards.
That constructor in turn calls another constructor of the GenericOptionsParser class, shown below:
public GenericOptionsParser(Configuration conf, Options options, String[] args) {
    parseGeneralOptions(options, conf, args);
}
This goes on to call the GenericOptionsParser member method parseGeneralOptions() to parse the configuration options further:
/**
 * Parse the user-specified options, get the generic options, and modify
 * configuration accordingly
 * @param conf Configuration to be modified
 * @param args User-specified arguments
 * @return Command-specific arguments
 */
private String[] parseGeneralOptions(Options opts, Configuration conf,
        String[] args) {
    opts = buildGeneralOptions(opts);
    CommandLineParser parser = new GnuParser();
    try {
        commandLine = parser.parse(opts, args, true);
        processGeneralOptions(conf, commandLine);
        return commandLine.getArgs();
    } catch (ParseException e) {
        LOG.warn("options parsing failed: " + e.getMessage());
        HelpFormatter formatter = new HelpFormatter();
        formatter.printHelp("general options are: ", opts);
    }
    return args;
}
Here commandLine is a private member variable of the GenericOptionsParser class.
The parseGeneralOptions() member method above can be seen as the high-level abstraction of how the Hadoop configuration options are parsed.
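The behaviour worth noticing in parseGeneralOptions() is that the generic options are consumed, the rest of the command line is returned, and the original args come back untouched if parsing fails. Here is a rough JDK-only sketch of that contract. It is not how GnuParser works internally: the class name is invented, every generic option is assumed to take exactly one separate value, and an IllegalArgumentException stands in for ParseException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class GenericParseSketch {
    // Option names taken from buildGeneralOptions(); each is assumed to take one value.
    private static final Set<String> GENERIC =
            new HashSet<>(Arrays.asList("-fs", "-jt", "-conf", "-D"));

    /** Values collected for each generic option, filled in by parse(). */
    final Map<String, List<String>> parsed = new HashMap<>();

    /** Returns the command-specific arguments, or the original args on failure. */
    String[] parse(String[] args) {
        try {
            int i = 0;
            // Stop at the first token that is not a generic option.
            while (i < args.length && GENERIC.contains(args[i])) {
                String name = args[i];
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("missing argument for option: " + name);
                }
                parsed.computeIfAbsent(name, k -> new ArrayList<>()).add(args[i + 1]);
                i += 2;
            }
            return Arrays.copyOfRange(args, i, args.length);
        } catch (IllegalArgumentException e) {
            System.err.println("options parsing failed: " + e.getMessage());
        }
        return args;
    }

    public static void main(String[] args) {
        GenericParseSketch parser = new GenericParseSketch();
        String[] rest = parser.parse(
                new String[] {"-fs", "namenode:9000", "-D", "a=b", "in", "out"});
        System.out.println("generic: " + parser.parsed.keySet());
        System.out.println("remaining: " + Arrays.toString(rest));
    }
}
```

Stopping at the first unrecognized token is what lets a tool such as WordCount define its own arguments without the generic parser rejecting them.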
buildGeneralOptions() receives the Options object opts and then returns it, as follows:
/**
 * Specify properties of each generic option
 */
@SuppressWarnings("static-access")
private Options buildGeneralOptions(Options opts) {
    Option fs = OptionBuilder.withArgName("local|namenode:port")
        .hasArg()
        .withDescription("specify a namenode")
        .create("fs");
    Option jt = OptionBuilder.withArgName("local|jobtracker:port")
        .hasArg()
        .withDescription("specify a job tracker")
        .create("jt");
    Option oconf = OptionBuilder.withArgName("configuration file")
        .hasArg()
        .withDescription("specify an application configuration file")
        .create("conf");
    Option property = OptionBuilder.withArgName("property=value")
        .hasArgs()
        .withArgPattern("=", 1)
        .withDescription("use value for given property")
        .create('D');
    opts.addOption(fs);
    opts.addOption(jt);
    opts.addOption(oconf);
    opts.addOption(property);
    return opts;
}
The following describes the Option class and how an Option instance is set up.
The buildGeneralOptions() method receives the Options object opts and returns it, but along the way it changes the contents of opts, as the listing above shows.
The opts passed in at first has no content (that is, it holds no objects of the Option class, each of which represents one option), because Options opts was not configured when it was instantiated. The second half of the code above, however, gives opts its content by adding Option objects to it.
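The chained OptionBuilder calls are static methods invoked through a returned instance, which is why the method carries @SuppressWarnings("static-access"). The sketch below imitates that style with invented, cut-down classes (Option, Options and Builder here are hypothetical and much smaller than the Commons CLI ones), mainly to show that the pending settings live in static fields until create() is called, and that an Options object starts out empty.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OptionBuilderSketch {

    /** Hypothetical, cut-down Option: a name plus a few descriptive fields. */
    static class Option {
        final String name;
        final String argName;
        final boolean takesArg;
        final String description;

        Option(String name, String argName, boolean takesArg, String description) {
            this.name = name;
            this.argName = argName;
            this.takesArg = takesArg;
            this.description = description;
        }
    }

    /** Hypothetical, cut-down Options: starts empty, options are added later. */
    static class Options {
        private final Map<String, Option> byName = new LinkedHashMap<>();

        Options addOption(Option option) {
            byName.put(option.name, option);
            return this;
        }

        boolean hasOption(String name) {
            return byName.containsKey(name);
        }

        int size() {
            return byName.size();
        }
    }

    /** Builder whose pending settings are static and are reset by create(). */
    static class Builder {
        private static final Builder INSTANCE = new Builder();
        private static String pendingArgName;
        private static boolean pendingTakesArg;
        private static String pendingDescription;

        static Builder withArgName(String name) {
            pendingArgName = name;
            return INSTANCE;
        }

        static Builder hasArg() {
            pendingTakesArg = true;
            return INSTANCE;
        }

        static Builder withDescription(String text) {
            pendingDescription = text;
            return INSTANCE;
        }

        static Option create(String name) {
            Option option = new Option(name, pendingArgName, pendingTakesArg, pendingDescription);
            // Reset so the next chain starts from a clean state.
            pendingArgName = null;
            pendingTakesArg = false;
            pendingDescription = null;
            return option;
        }
    }

    /** Adds two of the generic options to an initially empty Options object. */
    @SuppressWarnings("static-access")
    static Options buildGeneralOptions(Options opts) {
        Option fs = Builder.withArgName("local|namenode:port")
                .hasArg()
                .withDescription("specify a namenode")
                .create("fs");
        Option jt = Builder.withArgName("local|jobtracker:port")
                .hasArg()
                .withDescription("specify a job tracker")
                .create("jt");
        opts.addOption(fs);
        opts.addOption(jt);
        return opts;
    }

    public static void main(String[] args) {
        Options opts = new Options();
        System.out.println("before: " + opts.size());
        buildGeneralOptions(opts);
        System.out.println("after: " + opts.size() + ", has fs: " + opts.hasOption("fs"));
    }
}
```

Because the builder state is static, a design like this is not safe to use from several threads at once; that is an observation about the sketch, and it is worth checking before relying on the same pattern elsewhere.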
Let's look at the details of what gets added. Take this one, for example:
Option fs = OptionBuilder.withArgName("local|namenode:port")
    .hasArg()
    .withDescription("specify a namenode")
    .create("fs");
opts.addOption(fs);
An Option represents a single command-line option. Let's take a look at the definition of the Option class:
package org.apache.commons.cli;
import java.util.ArrayList;
import java.util.regex.Pattern;
public class Op