Understanding Aggregations
In a relational database, aggregation means merging data records that share the same value in some field. For example, to compute the total amount of each order from the order details table, you would usually follow these steps:
- Extract the order number and amount of each record from the order details table;
- Group the amounts by order number, so that all amounts in a group share the same order number;
- Sum the amounts in each group and return the result as (order number, total amount) pairs.
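The three steps above can be sketched in plain Python. This is only an illustration: the sample rows and the names `order_details` and `total_per_order` are made up for the example and do not come from any real schema.

```python
from collections import defaultdict

# Hypothetical sample rows: (order number, amount)
order_details = [
    ("A001", 10.0),
    ("A002", 5.0),
    ("A001", 2.5),
    ("A003", 7.0),
    ("A002", 1.0),
]

def total_per_order(rows):
    # Step 1: extract the order number and amount of each record
    pairs = [(order_no, amount) for order_no, amount in rows]

    # Step 2: group the amounts by order number
    groups = defaultdict(list)
    for order_no, amount in pairs:
        groups[order_no].append(amount)

    # Step 3: sum each group and return order number -> total amount
    return {order_no: sum(amounts) for order_no, amounts in groups.items()}

totals = total_per_order(order_details)
```

In SQL the same result would typically come from a `GROUP BY` on the order number combined with `SUM` on the amount.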
Understanding MapReduce
In MapReduce, the map function covers the first two steps of this process, and the emit call inside it implements step one. The result of the map phase is a collection of key-values pairs. Note that each key corresponds to multiple values; in the example above, one order number corresponds to multiple amounts.
Next, the output of map is passed as the input to the reduce function, which combines the values of each group. This is the third step of the process above: the summation.
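To make the flow concrete, here is a minimal in-memory sketch in Python. It is not the API of any particular database or framework: `map_reduce`, `map_order`, `reduce_order`, and the sample records are all hypothetical names chosen for illustration, and the grouping by key is done here by a small driver function.

```python
from collections import defaultdict

def map_reduce(records, map_fn, reduce_fn):
    # Values emitted under the same key are collected into one list
    grouped = defaultdict(list)

    def emit(key, value):
        grouped[key].append(value)

    # Map phase: each record emits (order number, amount)
    for record in records:
        map_fn(record, emit)

    # Reduce phase: each key's list of values is combined into one result
    return {key: reduce_fn(key, values) for key, values in grouped.items()}

def map_order(record, emit):
    # Step 1: extract the order number and the amount
    emit(record["order_no"], record["amount"])

def reduce_order(order_no, amounts):
    # Step 3: sum the amounts that belong to one order number
    return sum(amounts)

# Hypothetical sample records
orders = [
    {"order_no": "A001", "amount": 10.0},
    {"order_no": "A002", "amount": 5.0},
    {"order_no": "A001", "amount": 2.5},
]

result = map_reduce(orders, map_order, reduce_order)
```

Between the two phases, `grouped` holds exactly the key-values structure described above: one order number mapped to a list of amounts.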