Business Logic
The logic is simple. The job takes two input files: a base data file (the student information file) and a score information file.
Student information file: stores student data, including the student ID and student name.
Score file: stores student scores, including the student ID, subject, and score.
We will use MapReduce (M/R) to join the two files on the student ID. Each record in the final result contains the student name, subject, and score.
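The join described above can be sketched in plain Java without any Hadoop classes. This is only an in-memory illustration of the idea, not the job from this post: the class name StudentScoreJoin, the "S"/"C" source tags, and the inner-join behavior (scores with no matching student are dropped) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of a reduce-side join; no Hadoop classes are used.
// All names here are hypothetical and chosen for illustration only.
class StudentScoreJoin {

    // "Map" step: tag each line with its source ("S" = student, "C" = score)
    // and group by student ID. The grouping stands in for the shuffle that
    // happens between map and reduce.
    static Map<Integer, List<String>> mapAndShuffle(List<String> students, List<String> scores) {
        Map<Integer, List<String>> groups = new TreeMap<>();
        for (String line : students) {
            String[] f = line.trim().split("\\s+");   // student ID, name
            groups.computeIfAbsent(Integer.parseInt(f[0]), k -> new ArrayList<>())
                  .add("S\t" + f[1]);
        }
        for (String line : scores) {
            String[] f = line.trim().split("\\s+");   // student ID, subject, score
            groups.computeIfAbsent(Integer.parseInt(f[0]), k -> new ArrayList<>())
                  .add("C\t" + f[1] + "\t" + f[2]);
        }
        return groups;
    }

    // "Reduce" step: for each student ID, pair the name with every score record.
    static List<String> reduce(Map<Integer, List<String>> groups) {
        List<String> out = new ArrayList<>();
        for (List<String> values : groups.values()) {
            String name = null;
            List<String> subjectScores = new ArrayList<>();
            for (String v : values) {
                String[] f = v.split("\t");
                if (f[0].equals("S")) {
                    name = f[1];
                } else {
                    subjectScores.add(f[1] + "\t" + f[2]);
                }
            }
            if (name == null) {
                continue;  // assumed inner join: skip scores with no matching student
            }
            for (String s : subjectScores) {
                out.add(name + "\t" + s);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> students = List.of("1 Randy", "2 Tom");
        List<String> scores = List.of("1 English 89", "2 English 77", "1 math 79", "2 math 37");
        for (String row : reduce(mapAndShuffle(students, scores))) {
            System.out.println(row);  // tab-separated: name, subject, score
        }
    }
}
```

Tagging each record with its source is what lets the reduce step tell a name apart from a score once both arrive under the same student ID.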
Sample data
Student Data
$ cat students.txt
1 Randy
2 Tom
3 kitty
4 Lucy
5 Lily
6 Bruce
7 king
8 Jay
9 melody
10 Kimy
//////////////////////////////////////// //////////////////////////////////////// //////////////////////////////////////// /////////////
Score data
$ cat scores.txt
1 English 89
2 English 77
3 English 54
4 English 98
5 English 83
6 English 99
7 English 30
8 English 76
9 English 56
10 English 88
1 math 79
2 math 37
3 math 65
4 math 88
5 math 89
6 math 59
7 math 60
8 math 86
9 math 56
10 math 68
1 China 89
2 China 67
3 China 84
4 China 68
5 China 43
6 China 89
7 China 70
8 China 96
9 China 56
10 China 78
//////////////////////////////////////// //////////////////////////////////////// ///////////////////////
Implementation
1) Two text parsers, one per input file, parse the student information file and the score file respectively.
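A minimal sketch of what two such parsers might look like follows. The original post does not show its parser code, so the class RecordParsers, its nested Student and Score types, and the method names are hypothetical. The sketch also assumes that fields are separated by whitespace, as in the sample files, and that names and subjects contain no spaces.

```java
// Hypothetical line parsers for the two input files; the names are
// illustrative and not taken from the original post.
class RecordParsers {

    // One line of students.txt: "<student ID> <name>"
    static final class Student {
        final int id;
        final String name;

        Student(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // One line of scores.txt: "<student ID> <subject> <score>"
    static final class Score {
        final int studentId;
        final String subject;
        final int score;

        Score(int studentId, String subject, int score) {
            this.studentId = studentId;
            this.subject = subject;
            this.score = score;
        }
    }

    // Parses a student line such as "3 kitty".
    static Student parseStudent(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != 2) {
            throw new IllegalArgumentException("expected '<id> <name>': " + line);
        }
        return new Student(Integer.parseInt(f[0]), f[1]);
    }

    // Parses a score line such as "3 English 54".
    static Score parseScore(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != 3) {
            throw new IllegalArgumentException("expected '<id> <subject> <score>': " + line);
        }
        return new Score(Integer.parseInt(f[0]), f[1], Integer.parseInt(f[2]));
    }

    public static void main(String[] args) {
        Student s = parseStudent("3 kitty");
        Score c = parseScore("3 English 54");
        System.out.println(s.id + " " + s.name);
        System.out.println(c.studentId + " " + c.subject + " " + c.score);
    }
}
```

Keeping one parser per file format means a malformed line fails early with a clear message, before the record reaches the join step.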
This article is from the "simple" blog. Please keep this source link when reposting: http://dba10g.blog.51cto.com/764602/1565697
Hadoop native MapReduce for data join