A month ago, someone asked me: what is functional programming? Although I was familiar with some of its concepts, and had even read the early chapters of The Little Schemer I bought in Canada six months earlier, I could not give a good answer that day. Functional programming is unfamiliar territory for programmers raised on procedural programming, and concepts such as closures, continuations, and currying can feel like a nightmare to them.

Without understanding functional programming, you can't invent MapReduce, the algorithm that makes Google so massively scalable. The terms Map and Reduce come from Lisp and functional programming. MapReduce is, in retrospect, obvious to anyone who remembers from their 6.001-equivalent programming class that purely functional programs have no side effects and are thus trivially parallelizable. The very fact that Google invented MapReduce, and Microsoft didn't, says something about why Microsoft is still playing catch up trying to get basic search features to work, while Google has moved on to the next problem: building Skynet^H^H^H^H^H^H the world's largest massively parallel supercomputer. I don't think Microsoft completely understands just how far behind they are in that wave.
The previous paragraph, from Joel Spolsky's blog, clearly explains that the functional programming model is the inspiration for MapReduce.
MapReduce's name derives from two core operations in the functional programming model: map and reduce. Readers familiar with functional programming (FP) will probably feel at home with these two words, because the terms map and reduce come from the Lisp language and functional programming. Map is a one-to-one mapping of one set of data to another, with the mapping rule specified by a function; reduce folds a set of data down according to a rule that is also specified by a function. In other words, map is the process of splitting data apart, and reduce is the process of merging the separated data back together. Take the WordCount example from Hadoop: map turns [one, word, one, dream] into [{one,1}, {word,1}, {one,1}, {dream,1}], and reduce then turns [{one,1}, {word,1}, {one,1}, {dream,1}] into the result set [{one,2}, {word,1}, {dream,1}].
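To make that transformation concrete, here is a minimal sketch (my own illustration, not code from the original post) using JavaScript's built-in Array map and reduce:

var words = ["one", "word", "one", "dream"];

// map: a one-to-one mapping of each word to a {key, value: 1} pair
var pairs = words.map(function (w) { return { key: w, value: 1 }; });
// -> [{one,1}, {word,1}, {one,1}, {dream,1}]

// reduce: merge the pairs into a single result set of counts
var counts = pairs.reduce(function (acc, p) {
    acc[p.key] = (acc[p.key] || 0) + p.value;
    return acc;
}, {});
// -> { one: 2, word: 1, dream: 1 }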
The map and reduce operations treat each element independently and have no side effects. Reduce is a fold over the n results of map; that is, the results of map[1..n] become the arguments of the function passed to reduce. In an imperative language the order of evaluation matters, because any function may change or depend on external state, so functions must be executed in a fixed order. As long as no function modifies or depends on global state, the n map invocations can be executed in any order, or in parallel, which is what makes the MapReduce model suitable for large-scale processing of relatively independent pieces of data. Furthermore, the data needed by the map and reduce functions does not have to be read all at once by the higher-order function; each call reads only the piece of data it currently needs, which is, in a loose sense, the idea of lazy evaluation. The sketch below illustrates why order does not matter when map is side-effect free.
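This is my own illustration of the order-independence argument, with a hypothetical mapChunk/reduceAll split (not part of the original code): because mapChunk is a pure function, the chunks can be mapped in any order, or on different machines, and the merged counts come out the same.

var chunks = [["one", "word"], ["one", "dream"]];

// Pure: the output depends only on the input chunk, never on external state.
function mapChunk(chunk) {
    return chunk.map(function (w) { return { key: w, value: 1 }; });
}

// Fold a flat list of {key, value} pairs into per-key totals.
function reduceAll(pairs) {
    return pairs.reduce(function (counts, p) {
        counts[p.key] = (counts[p.key] || 0) + p.value;
        return counts;
    }, {});
}

// Map the chunks in two different orders; the totals are identical
// (one: 2, word: 1, dream: 1), which is what lets a MapReduce runtime
// schedule the map tasks in parallel.
var inOrder  = reduceAll([].concat.apply([], chunks.map(mapChunk)));
var reversed = reduceAll([].concat.apply([], chunks.slice().reverse().map(mapChunk)));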
In functional programming, a function that operates on other functions is called a higher-order function. The MapReduce model is exactly such a function: it operates on user code. As long as the two higher-order operations, map and reduce, are processed in parallel, you get a parallel computing framework built on map and reduce. The application-specific logic lives in the user code, which is then combined with MapReduce to gain parallel processing ability, while all of the complexity stays hidden inside the model.
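Here is a tiny sketch of that idea (my own illustration; runJob is a hypothetical name, not part of the framework below): the higher-order function owns the iteration and grouping, and the user only supplies two small pure functions.

// A higher-order function: it takes the user's map and reduce functions
// as arguments and hides the iteration/grouping plumbing.
function runJob(input, mapFn, reduceFn) {
    var groups = {};
    input.forEach(function (record) {
        mapFn(record).forEach(function (pair) {          // user-supplied map
            (groups[pair.key] = groups[pair.key] || []).push(pair.value);
        });
    });
    var result = {};
    Object.keys(groups).forEach(function (key) {
        result[key] = reduceFn(key, groups[key]);        // user-supplied reduce
    });
    return result;
}

// Usage: the "user code" is just two small side-effect-free functions.
runJob(
    ["one word", "one dream"],
    function (line) {
        return line.split(" ").map(function (w) { return { key: w, value: 1 }; });
    },
    function (key, values) {
        return values.reduce(function (a, b) { return a + b; }, 0);
    }
);
// -> { one: 2, word: 1, dream: 1 }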
So it is the concepts of functional programming (software that is easier to compose, side-effect-free operations that parallelize naturally, lazy evaluation, and higher-order functions) that add up to the MapReduce model. Hence the sentence: without understanding functional programming, you can't invent MapReduce.

I am only a beginner here, and I hope more experienced readers will offer plenty of guidance. Having learned this much, it is time to hand in some homework: implement MapReduce (JavaScript edition) with functional programming.
/**
 * MAPRED - A MapReduce Model implement of Javascript.
 * Author: Josh Ma
 * Email: beijing.josh@gmial.com
 * Copyright (c) 2008 hadoop.org.cn
 * Dual licensed under the MIT (MIT-LICENSE.txt)
 * and GPL (GPL-LICENSE.txt) licenses.
 *
 * $Date: 2008-8-11 14:05 AM
 */

// If the client does not have Firebug, fall back to alert().
if (!window["console"]) {
    console = { log: function (s) { alert(s); } };
}

/* A currying function.
 * Original url: http://www.coryhudson.com/blog/2007/03/10/javascript-currying-redux/
 * I modified some things. */
function curry(f, n) {
    if (typeof n == 'undefined') n = f.length;
    var _this = this;
    return function () {
        if (arguments.length >= n) {
            return f.apply(_this, arguments);
        } else {
            var a = [].slice.call(arguments, 0);
            return curry.call(_this, function () {
                return f.apply(_this, a.concat([].slice.call(arguments, 0)));
            }, n - a.length);
        }
    };
}

// Extend: copy the properties and functions of s onto d.
function extend(d, s) {
    for (var p in s) { d[p] = s[p]; }
    return d;
}

// The iterator for Object and Array.
var each = function (o, f, args) {
    if (Object.prototype.toString.apply(o) == '[object Array]') {
        for (var i = 0, ol = o.length, val = o[0];
             i < ol && f.apply(val, args ? args : [i, val]) !== false;
             val = o[++i]);
    } else {
        for (var i in o) f.apply(o[i], args ? args : [i, o[i]]);
    }
};

// The MapReduce object.
var mapred = {};

extend(mapred, {
    pro: {
        load: function () { return []; },
        save: function () { return; },
        map: null,
        reduce: null,
        combiner: null
    },

    // MapReduce job configuration.
    configuration: function (pro) { this.pro = extend(this.pro, pro); },

    // Cache for the intermediate data.
    cache: {},

    // Curried: the first parameter (the stage name) is given by this.run,
    // the remaining parameters wait for map/reduce to write data.
    emit: curry.call(mapred, function (met, key, value) {
        if (!this.cache[met]) { this.cache[met] = {}; }
        if (!this.cache[met][key]) { this.cache[met][key] = []; }
        this.cache[met][key].push(value);
    }),

    /* The MapReduce main function, a higher-order function expecting more than
     * one parameter. If the client executes it without giving more than one
     * parameter, it returns a function that waits until at least one parameter
     * is given before running. */
    run: curry.call(mapred, function (map, reduce, combine) {
        var emit = this.emit;

        // Iterate the input data.
        each(this.pro.load(), function (key, value) {
            map(key, value, emit("map"));
        });

        // Get the cached map data.
        var data = this.cache["map"];

        if (combine) {
            // Iterate the map data.
            each(data, function (key, value) {
                combine(key, value, emit("combine"));
            });
            delete this.cache["map"];
        }

        // Get the cache of the map or combiner data.
        data = combine ? this.cache["combine"] : this.cache["map"];

        if (reduce) {
            // Iterate the map or combiner data.
            each(data, function (key, value) {
                reduce(key, value, emit("reduce"));
            });
            delete this.cache[combine ? "combine" : "map"];
        }

        // Call the save function to save the result.
        this.pro.save(reduce ? this.cache["reduce"] : this.cache["map"]);

        // After saving the data, release the cache.
        this.cache = {};
    }, 1)
}); // end mapred

// The input of data, like the InputFormat of Hadoop.
var outload = function () {
    return ["One World,one dream!", "hello,world!"];
};

// Save the result, like the OutputFormat of Hadoop.
var outsave = function (data) {
    each(data, function (index, value) {
        console.log(index + "\t" + value + "\n");
    });
};

// A map/reduce job configuration, like JobConf of Hadoop.
mapred.configuration({ load: outload, save: outsave });

// Map function: split each line into words and emit (word, 1).
var map = function (key, value, emit) {
    var strs = value.split(/\s+|!|,/);
    each(strs, function (index, value) {
        if (value) emit(value, 1);
    });
};

// Reduce function: sum the counts for each word.
var reduce = function (key, value, emit) {
    var v = 0;
    each(value, function (index, value) { v += value; });
    emit(key, v);
};

// Submit the job, like the JobClient.runJob method of Hadoop.
// mapred.run()()()(map);
// mapred.run(map);
mapred.run(map, reduce);
// mapred.run(map, reduce, reduce);
Running the code example:
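With the reconstruction above, calling mapred.run(map, reduce) on the input ["One World,one dream!", "hello,world!"] should print roughly the following in the console (note that keys are case-sensitive here, so "One" and "one" stay separate):

One	1
World	1
one	1
dream	1
hello	1
world	1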
References:
Functional programming: An Alternative Guide to Functional Programming
Dream Storm's article: MapReduce
Functional programming: Functional Programming and MapReduce