Hadoop: reading environment variables and the setup function

Source: Internet
Author: User
Tags: map class

The setup function's source code (excerpt from "Hadoop in Action"):

/**
 * Called once at the start of the task.
 */
protected void setup(Context context) throws IOException, InterruptedException {
}

As the comment indicates, the setup function is called once when the task starts.
A MapReduce job is broken down into map tasks and reduce tasks, and each task uses its Mapper or Reducer class as the body of its processing logic.
The task's input split is the input to that processing logic, and the task is torn down once its own split has been processed.
So setup is called once, before the task starts processing data, whereas the map function is called once for every key/value pair of the input split and the reduce function once for every key.
The setup function can therefore be thought of as task-level initialization: work that would otherwise be repeated on every map or reduce call can be hoisted into setup (for example, reading the "name" parameter in the teacher's exercise_2).
Note, however, that a call to setup is only a global operation for its own task, not for the entire job: every map task and every reduce task runs its own setup once.
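
To make this concrete, here is a minimal sketch (the class name ParamMapper and the parameter key are my own illustration, not from the original) of a Mapper that reads a job parameter once in setup() instead of on every map() call:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ParamMapper extends Mapper<LongWritable, Text, Text, Text> {

    private String param; // loaded once per task, not once per record

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Runs once when the map task starts, before any map() call.
        Configuration conf = context.getConfiguration();
        param = conf.get("job_parms", "");
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Runs once per input key/value pair and reuses the value loaded in setup().
        context.write(new Text(param), value);
    }
}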





You can first use the API to upload a local file to /user/hadoop/test in HDFS.
The following function uploads a local file to HDFS:
public static void upload(String src, String dst) throws FileNotFoundException, IOException {

    InputStream in = new BufferedInputStream(new FileInputStream(src));
    // Get the configuration object
    Configuration conf = new Configuration();
    // Get the file system
    FileSystem fs = FileSystem.get(URI.create(dst), conf);
    // Open the output stream; the Progressable callback reports progress during the upload
    OutputStream out = fs.create(new Path(dst), new Progressable() {
        public void progress() {
            System.out.println("Uploaded one buffer of data!");
        }
    });
    // Connect the two streams into a channel so data flows from the input stream to the output stream
    IOUtils.copyBytes(in, out, 4096, true);
}
You can call this function to perform the upload, for example:
upload("/home/jack/test/test.txt", "/user/hadoop/test/test");
The first argument is the file in the local directory, the second is the destination file in HDFS.
Note that both must be "path + file name"; neither may omit the file name.
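
For reference, a self-contained version with the imports the snippet above relies on might look like this (the class name HdfsUpload and the main() wrapper are my own illustration):

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.util.Progressable;

public class HdfsUpload {

    public static void upload(String src, String dst)
            throws FileNotFoundException, IOException {
        InputStream in = new BufferedInputStream(new FileInputStream(src));
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(dst), conf);
        OutputStream out = fs.create(new Path(dst), new Progressable() {
            public void progress() {
                System.out.println("Uploaded one buffer of data!");
            }
        });
        IOUtils.copyBytes(in, out, 4096, true);
    }

    public static void main(String[] args) throws IOException {
        upload("/home/jack/test/test.txt", "/user/hadoop/test/test");
    }
}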




To make a parameter visible to every task, set it on the Configuration in the driver before the Job is created:

Configuration conf = new Configuration();

conf.setStrings("job_parms", "AAABBC"); // The key line: this stores the parameter in the job configuration
Job job = new Job(conf, "load analysis");
job.setJarByClass(LoadAnalysis.class);
job.setMapperClass(LoadMapper.class);
job.setReducerClass(LoadIntoHbaseReduce.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);

FileInputFormat.addInputPath(job, new Path(otherArgs[0]));




Then, in the Mapper or Reducer:

@Override
protected void setup(Context context)
        throws IOException, InterruptedException {
    try {
        // Read the parameter back from the global configuration
        Configuration conf = context.getConfiguration();
        String parmStr = conf.get("job_parms"); // and here it is retrieved

        ......

    } catch (SQLException e) {
        e.printStackTrace();
    }
}
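
As a side note, setStrings() can store several values under one key; a minimal sketch of reading them all back inside setup() (assuming the driver called conf.setStrings("job_parms", "aaa", "bbb")) might be:

@Override
protected void setup(Context context) throws IOException, InterruptedException {
    // getStrings() splits the stored comma-separated list back into an array
    String[] parms = context.getConfiguration().getStrings("job_parms");
    if (parms != null) {
        for (String p : parms) {
            System.out.println("job parameter: " + p);
        }
    }
}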




Global files: Hadoop provides a distributed cache for saving global files and making sure every node can access them; the class to use is DistributedCache.
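
A minimal sketch of the usual DistributedCache pattern with the old-style API used above (the file path and reading logic are my own illustration): the driver registers an HDFS file before the Job is created, and each task opens its node-local copy in setup().

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;

// In the driver, before the Job is created:
// DistributedCache.addCacheFile(new URI("/user/hadoop/cache/stopwords.txt"), conf);

// In the Mapper or Reducer:
@Override
protected void setup(Context context) throws IOException, InterruptedException {
    Path[] cached = DistributedCache.getLocalCacheFiles(context.getConfiguration());
    if (cached != null && cached.length > 0) {
        // cached[0] is the node-local path of the cached file
        BufferedReader reader = new BufferedReader(new FileReader(cached[0].toString()));
        String line;
        while ((line = reader.readLine()) != null) {
            // ... use each line of the global file ...
        }
        reader.close();
    }
}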

