MapReduce example -- querying for missing cards

Source: Internet
Author: User

Problem:

Solution:

 

1. Code

1) Map code

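    // Split each line on "-"; only lines with exactly two parts are processed.
    String line = value.toString();
    String[] strs = line.split("-");
    if (strs.length == 2) {
        int number = Integer.parseInt(strs[1]);
        // Emit only records whose number is greater than 10, keyed by the first part.
        if (number > 10) {
            context.write(new Text(strs[0]), value);
        }
    }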

 

 

2) Reduce code

    // Count the values received for this key.
    Iterator<Text> iter = values.iterator();
    int count = 0;
    while (iter.hasNext()) {
        iter.next();
        count++;
    }
    // Output the key only if it received fewer than 3 values.
    if (count < 3) {
        context.write(key, NullWritable.get());
    }

 

 

3) Runner code

    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf);
    job.setJobName("poker mr");
    job.setJarByClass(pokerRunner.class);

    job.setMapperClass(pakerMapper.class);
    job.setReducerClass(pakerRedue.class);

    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(Text.class);

    job.setOutputKeyClass(Text.class);
    // The reducer writes NullWritable values, so the output value class is NullWritable.
    job.setOutputValueClass(NullWritable.class);

    // args[0] is the input path, args[1] is the output path.
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    job.waitForCompletion(true);
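To check the map and reduce logic above without a cluster, the two steps can be simulated in plain Java. This is only a sketch under assumptions: the class name PokerLocalSim, the helper methods, and the sample suit names are hypothetical. The input format "name-number" is inferred from the split on "-" in the map code. A TreeMap stands in for Hadoop's shuffle and grouping by key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local simulation; it does not use any Hadoop classes.
public class PokerLocalSim {

    // Map step plus grouping: keep lines of the form "name-number" whose
    // number is greater than 10, collected per name (the map output key).
    static Map<String, List<String>> mapAndGroup(List<String> lines) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] strs = line.split("-");
            if (strs.length == 2) {
                int number = Integer.parseInt(strs[1]);
                if (number > 10) {
                    grouped.computeIfAbsent(strs[0], k -> new ArrayList<>()).add(line);
                }
            }
        }
        return grouped;
    }

    // Reduce step: report every key that received fewer than 3 values.
    static List<String> reduce(Map<String, List<String>> grouped) {
        List<String> incomplete = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            if (e.getValue().size() < 3) {
                incomplete.add(e.getKey());
            }
        }
        return incomplete;
    }

    public static void main(String[] args) {
        // Assumed sample data, not taken from the original job's input.
        List<String> lines = Arrays.asList(
            "spade-11", "spade-12", "spade-13",
            "heart-11", "heart-13",
            "club-12", "club-3",
            "diamond-5");
        System.out.println(reduce(mapAndGroup(lines)));
    }
}
```

In this sample, "diamond" has no record above 10, so it never reaches the reduce step and is not reported. The same holds for the MapReduce job as written: a key is only reported if at least one of its records passes the map filter.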

 

2. Running result

File System Counters
    FILE: Number of bytes read=87
    FILE: Number of bytes written=211167
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=366
    HDFS: Number of bytes written=6
    HDFS: Number of read operations=6
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
Job Counters
    Launched map tasks=1
    Launched reduce tasks=1
    Data-local map tasks=1
    Total time spent by all maps in occupied slots (ms)=109577
    Total time spent by all reduces in occupied slots (ms)=42668
    Total time spent by all map tasks (ms)=109577
    Total time spent by all reduce tasks (ms)=42668
    Total vcore-seconds taken by all map tasks=109577
    Total vcore-seconds taken by all reduce tasks=42668
    Total megabyte-seconds taken by all map tasks=112206848
    Total megabyte-seconds taken by all reduce tasks=43692032
Map-Reduce Framework
    Map input records=49
    Map output records=9
    Map output bytes=63
    Map output materialized bytes=87
    Input split bytes=110
    Combine input records=0
    Combine output records=0
    Reduce input groups=4
    Reduce shuffle bytes=87
    Reduce input records=9
    Reduce output records=3
    Spilled Records=18
    Shuffled Maps=1
    Failed Shuffles=0
    Merged Map outputs=1
    GC time elapsed (ms)=992
    CPU time spent (ms)=3150
    Physical memory (bytes) snapshot=210063360
    Virtual memory (bytes) snapshot=652480512
    Total committed heap usage (bytes)=129871872
Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
File Input Format Counters
    Bytes Read=256
File Output Format Counters
    Bytes Written=6


3. Running Method

Compile the code in Eclipse, package it as a jar, upload the jar to the Linux system, and run it on the cluster.

Run the command: bin/hadoop jar **.jar <package name>.<class name> <input path> <output path>

Example: bin/hadoop jar **.jar com.test.mr.pokerRunner <input path> <output path>

 
