Well, I admit it's cool to use Hadoop to handle big data, but sometimes I get frustrated when working on a data-marshalling project.
Many times we use a join in a MapReduce job, so the entire job's input may consist of two or more files (in other words, the mappers have to process more than one input file).
There are two ways to handle multiple inputs on the mapper side:
Multiple mappers: each mapper processes its corresponding input file (see https://github.com/zhouhao/Hadoop_Project1/blob/master/MapReduceQueries/Query3/query3.java):
MultipleInputs.addInputPath(conf, new Path(args[0]), TextInputFormat.class, CustomerMap.class);
MultipleInputs.addInputPath(conf, new Path(args[1]), TextInputFormat.class, TransactionMap.class);
FileOutputFormat.setOutputPath(conf, new Path(args[2]));
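For context, here is a minimal, self-contained driver sketch using the old org.apache.hadoop.mapred API (which the snippet above targets). The CustomerMap and TransactionMap bodies and the comma-separated field layouts are assumptions for illustration, not the exact code from the linked query:

import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.mapred.lib.MultipleInputs;

public class JoinJob {
    // Mapper for the customers file: emits (customerId, "C:" + name)
    public static class CustomerMap extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {
        public void map(LongWritable key, Text value,
                        OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
            String[] f = value.toString().split(",");   // assumed layout: id,name,...
            output.collect(new Text(f[0]), new Text("C:" + f[1]));
        }
    }

    // Mapper for the transactions file: emits (customerId, "T:" + amount)
    public static class TransactionMap extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {
        public void map(LongWritable key, Text value,
                        OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
            String[] f = value.toString().split(",");   // assumed layout: txId,custId,amount,...
            output.collect(new Text(f[1]), new Text("T:" + f[2]));
        }
    }

    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(JoinJob.class);
        conf.setJobName("customer-transaction-join");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);

        // each input path gets its own mapper class
        MultipleInputs.addInputPath(conf, new Path(args[0]), TextInputFormat.class, CustomerMap.class);
        MultipleInputs.addInputPath(conf, new Path(args[1]), TextInputFormat.class, TransactionMap.class);
        FileOutputFormat.setOutputPath(conf, new Path(args[2]));

        // no reducer set: the default identity reducer just groups the tagged
        // values by customer id; a real join would combine "C:" and "T:" values here
        JobClient.runJob(conf);
    }
}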
One mapper: a single mapper processes all the different files (in the code snippet below, inside the mapper we can detect which file the current record comes from, and then process it accordingly):
public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, Text> {
    public void map(LongWritable key, Text value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
        // get the name of the file this split came from via the Reporter
        FileSplit fileSplit = (FileSplit) reporter.getInputSplit();
        String fileName = fileSplit.getPath().getName();
        String line = value.toString();
        output.collect(new Text(fileName), value);
    }
}
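To make "process it accordingly" concrete, inside map() we can branch on the file name once we have it. A minimal sketch, where the customers/transactions file-name prefixes and the field positions are illustrative assumptions:

// inside map(), after fileName and line have been extracted as above
String[] fields = line.split(",");
if (fileName.startsWith("customers")) {
    // assumed layout: customerId,name,... -> tag with "C:" for the reducer
    output.collect(new Text(fields[0]), new Text("C:" + fields[1]));
} else if (fileName.startsWith("transactions")) {
    // assumed layout: txId,customerId,amount,... -> tag with "T:"
    output.collect(new Text(fields[1]), new Text("T:" + fields[2]));
}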
PS: the mapper input can also be a folder: FileInputFormat.setInputPaths(conf, new Path("/tmp/"));