A Brief Analysis of How MapReduce Works, Using JavaScript

Source: Internet
Author: User

Between 2003 and 2006 Google published three of its most influential papers: GFS at SOSP in 2003, MapReduce at OSDI in 2004, and BigTable at OSDI in 2006. GFS is a file system, and it shaped the design of later distributed file systems; MapReduce is a programming model for parallel computation and job scheduling; BigTable is a distributed storage system for managing structured data, built on GFS, Chubby, SSTable, and other Google technologies. Many Google applications use these three technologies, including Google Search, Google Earth, and Google Analytics, which is why the three are known as Google's "three treasures." Today D-Melon takes a close look at MapReduce.

MapReduce Introduction
MapReduce is a programming model, together with an associated implementation, for processing and generating large datasets. The user first writes a map function that processes a set of key/value pairs and outputs a set of intermediate key/value pairs, and then writes a reduce function that merges all intermediate values that share the same intermediate key.
A picture is worth a thousand words, so the original post illustrated MapReduce with a diagram at this point (the image is not available in this copy).
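The map and reduce roles described above can also be sketched in a few lines of JavaScript. This is only a minimal, single-threaded illustration of the data flow (map, group by key, reduce); the names mapReduce, mapFn and reduceFn are made up for this sketch and do not come from any library or from the code discussed later.

```javascript
// Minimal single-threaded sketch of the MapReduce data flow.
// mapReduce, mapFn and reduceFn are illustrative names, not a real API.
function mapReduce(inputs, mapFn, reduceFn) {
  // Map phase: every input record yields zero or more {key, value} pairs.
  const intermediate = inputs.flatMap((record) => mapFn(record));

  // Shuffle phase: collect all intermediate values that share a key.
  const groups = new Map();
  for (const { key, value } of intermediate) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }

  // Reduce phase: merge the list of values for each key into one result.
  const output = new Map();
  for (const [key, values] of groups) {
    output.set(key, reduceFn(key, values));
  }
  return output;
}

// Word count expressed with the sketch above.
const counts = mapReduce(
  ["a b a", "b c"],
  (line) => line.split(" ").map((word) => ({ key: word, value: 1 })),
  (key, values) => values.reduce((sum, v) => sum + v, 0)
);
console.log(Object.fromEntries(counts)); // { a: 2, b: 2, c: 1 }
```

The grouping step in the middle is what guarantees that a reduce call sees every value for its key, which is the property the reduce function relies on.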


Programming Practice
As the saying goes, practice is the test of truth: you only find out whether it is a mule or a horse by taking it out for a walk. So if you really want to understand how this works, writing the code yourself is what counts.
I have been learning JavaScript with a couple of friends recently, so JavaScript is what interests me most at the moment. While browsing the web yesterday, I was surprised to find that some expert had implemented the MapReduce algorithm in JavaScript. I am sharing it here, along with some modest commentary of my own, in the hope that it helps you understand MapReduce. The code is as follows:


var Job = {
    // Data to be processed
    data: [
        "We are glad to the. This site are dedicated to ",
        "Poetry and to the people who make poetry possible ",
        "Poets and their readers. Famouspoetsandpoems.com is ",
        "A free poetry site. On my site you can find a large ",
        "Collection of poems and quotes from over 631 poets",
        "Read and Enjoy poetry",
        "I, too, sing America",
        "I am the darker brother",
        "They send me to eat in the kitchen",
        "When company Comes",
        "But I Laugh",
        "And eat",
        "And Grow Strong",
        "Tomorrow",
        "Ill is at the table",
        "When company Comes",
        "Nobodyll Dare",
        "Say to Me",
        "Eat in the Kitchen",
        "Then",
        "Besides",
        "Theyll to beautiful I am",
        "And be Ashamed",
        "I, too, am America"
    ],
    // Split each line of data on spaces and "reorganize" the words into
    // objects of the form {key: word, value: 1}; returns an array of objects.
    map: function (line) {
        var splits = line.split(" ");
        var temp = [];
        for (var i = 0; i < splits.length; i++) {
            // Skip the empty tokens produced by trailing spaces.
            if (splits[i] === "") continue;
            temp.push({key: splits[i], value: 1});
        }
        return temp;
    },
    // Count how many times each word appears in data.
    reduce: function (allSteps) {
        var result = {};
        for (var i = 0; i < allSteps.length; i++) {
            var step = allSteps[i];
            result[step.key] = result[step.key] ? (result[step.key] + 1) : 1;
        }
        return result;
    },
    // Initialization, which also serves as the entry point.
    init: function () {
        var allSteps = [];
        for (var i = 0; i < Job.data.length; i++) {
            // It would be more realistic if Job.map could be called from multiple threads here.
            allSteps = allSteps.concat(Job.map(Job.data[i]));
        }
        // The fly in the ointment: can Job.reduce really not be called from multiple threads here?
        var result = Job.reduce(allSteps);
        console.log(JSON.stringify(result));
    }
}; // Job
// Start execution
Job.init();

Copy the code and paste it directly into the browser console, or put it in an HTML file and open that file in a browser; the word counts are printed to the console as a JSON string.

The Fly in the Ointment
After this article was published, some readers "roared": "JS doesn't even have multithreading, so what kind of MapReduce is this?" In fact, D-Melon had noticed the same problem. On first seeing this code explained, D-Melon wondered: isn't JavaScript single-threaded? How can it simulate MapReduce? After reading the code carefully and stepping through it in a debugger, D-Melon was even more convinced. (D-Melon's questions are noted in the code comments.)
Still, none of this gets in the way of understanding the principles of MapReduce. This is simply the most basic, single-threaded version; once it is understood, a multithreaded version should be easier to follow.
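On the question of whether the reduce step could be split up at all: because word counting only adds numbers, each slice of the input can be mapped and reduced on its own and the partial results merged afterwards. The sketch below only simulates this in a single thread, with each slice standing in for one worker; the function names (mapLine, reducePairs, mergeCounts, runPartitioned) are hypothetical and are not part of the code above.

```javascript
// Illustrative names only; none of these functions appear in the original code.

// Map: one line -> [{key: word, value: 1}, ...], skipping empty tokens.
function mapLine(line) {
  return line.split(" ").filter(Boolean).map((word) => ({ key: word, value: 1 }));
}

// Reduce: sum the values of identical keys within one partition.
function reducePairs(pairs) {
  const counts = new Map();
  for (const { key, value } of pairs) {
    counts.set(key, (counts.get(key) || 0) + value);
  }
  return counts;
}

// Merge: combine the partial counts produced by every partition.
function mergeCounts(partials) {
  const total = new Map();
  for (const partial of partials) {
    for (const [word, n] of partial) {
      total.set(word, (total.get(word) || 0) + n);
    }
  }
  return total;
}

// Cut the lines into at most `partitions` slices; each slice plays the role
// of one worker that maps and reduces its own share of the data.
function runPartitioned(lines, partitions) {
  const size = Math.ceil(lines.length / partitions);
  const partials = [];
  for (let start = 0; start < lines.length; start += size) {
    const slice = lines.slice(start, start + size);
    partials.push(reducePairs(slice.flatMap(mapLine)));
  }
  return mergeCounts(partials);
}

const lines = ["I am the darker brother", "I too am America", "And eat"];
const single = runPartitioned(lines, 1);      // everything in one partition
const partitioned = runPartitioned(lines, 3); // one line per partition
console.log(partitioned.get("I"), partitioned.get("am"), partitioned.get("America")); // 2 2 1
```

Both calls give the same counts, which is the property that would let real threads or machines take over the slices. Actually running them in parallel is outside the scope of this sketch.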

To be continued
In fact, D-Melon is now considering a multithreaded version in Java, built on this example, so that the MapReduce simulation is more realistic. The code will be posted once D-Melon has thought a few questions through. Stay tuned!
