Shekhar Gulati, a Java programmer from India, posted "How I explained MapReduce to my wife?" on his blog. The article explains the concept of MapReduce. The translation, by Huang Huiyu, follows.
Yesterday, I gave a talk about MapReduce at Xebia's India office. The talk went smoothly, and judging by their feedback, the audience was able to understand the concept of MapReduce. I was happy to have explained MapReduce to a technical audience (mainly Java programmers, some Flex programmers, and a few testers). After all our hard work, we had a big dinner at the Xebia India office, and then I went home.
When I got home, my wife (Supriya) asked, "How did your talk go?" I said it went well. Then she asked me what the talk was about (she does not work in software or programming). I told her it was about MapReduce. "MapReduce, what is that?" she asked. "Is it related to topographic maps?" "No, no, it has nothing to do with topographic maps." "So what is it?" my wife asked. "Well... let's go to Domino's (the pizza chain). I will explain it to you at the table." My wife said, "Okay." So we went to the pizza shop.
After we ordered at Domino's, the guy at the counter told us that it would take 15 minutes to prepare the pizza. So I asked my wife, "Do you really want to know what MapReduce is?" She firmly replied, "Yes." So I asked:
Me: How do you prepare onion chili sauce? (The following is not an accurate recipe. Do not try it at home.)
Wife: I take an onion, chop it, mix it with salt and water, and finally put it into the mixer grinder and grind it. That gives you onion chili sauce. But what does this have to do with MapReduce?
Me: Wait. Let me build up the whole story so that you can understand MapReduce within 15 minutes.
Wife: Okay.
Me: Now, suppose you want to use mint, onion, tomato, chili, and garlic to make a bottle of mixed chili sauce. What would you do?
Wife: I would take a handful of mint leaves, one onion, one tomato, one chili, and one garlic, chop them, add some salt and water, and then put everything into the mixer grinder and grind it. That gives you a bottle of mixed chili sauce.
Me: That's right. Now let's apply the concept of MapReduce to the recipe. Map and reduce are two kinds of operations. I will explain each of them in detail.
Map: Chopping the onion, tomato, chili, and garlic is a map operation applied to each of these items. So if you give map an onion, map chops the onion. Similarly, if you hand the chili, garlic, and tomato to map one by one, you get chopped pieces of each. So whenever you chop a vegetable such as an onion, you are performing a map operation. The map operation is applied to every vegetable and produces one or more pieces from it; in our example, it produces vegetable pieces. During the map operation you may come across a rotten onion. You just throw it away. So when a bad onion turns up, the map operation filters it out and produces no pieces for it.
Reduce: In this stage, you put all the chopped vegetables into the mixer grinder and grind them, and you get a bottle of chili sauce. In other words, to make a bottle of chili sauce, you have to grind all the ingredients together. The grinder therefore aggregates the pieces produced by the map operation.
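The map and reduce steps described above can be sketched in a few lines of Python. This is only an illustration of the analogy, not how a real MapReduce framework is written: the `chop` and `grind` functions, the `rotten` flag, and the number of pieces per vegetable are all made-up names and values.

```python
from functools import reduce

PIECES_PER_VEGETABLE = 4  # hypothetical: every vegetable is cut into 4 pieces


def chop(vegetable):
    """Map step: turn one vegetable into zero or more pieces.

    A rotten vegetable is thrown away, so it produces no pieces at all.
    """
    if vegetable["rotten"]:
        return []
    return [vegetable["name"]] * PIECES_PER_VEGETABLE


def grind(bottle, piece):
    """Reduce step: fold one more piece into the bottle of sauce."""
    return {
        "ingredients": bottle["ingredients"] | {piece},
        "pieces": bottle["pieces"] + 1,
    }


vegetables = [
    {"name": "onion", "rotten": False},
    {"name": "tomato", "rotten": False},
    {"name": "onion", "rotten": True},   # the bad onion gets filtered out
    {"name": "garlic", "rotten": False},
]

# Map: chop every vegetable independently of the others.
pieces = [piece for vegetable in vegetables for piece in chop(vegetable)]

# Reduce: aggregate all the pieces into a single bottle.
empty_bottle = {"ingredients": frozenset(), "pieces": 0}
bottle = reduce(grind, pieces, empty_bottle)
```

Note that `chop` looks at only one vegetable at a time and needs nothing from the others. That independence is what later allows the chopping to be handed out to many workers.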
Wife: So this is MapReduce?
Me: Yes and no. In fact, this is only part of MapReduce. The real strength of MapReduce lies in distributed computing.
Wife: Distributed computing? What is that? Please explain it to me.
Me: No problem.
Suppose you entered a chili sauce competition and your recipe won the Best Chili Sauce award. After the award, your recipe becomes very popular, so you want to start selling chili sauce under your own brand. Suppose you need to produce 10,000 bottles of chili sauce every day. What would you do?
Wife: I would find a supplier who can provide raw materials in large quantities.
Me: Yes... exactly. But can you complete the production on your own? That is, can you chop all the raw materials by yourself? Can a single mixer grinder meet the demand? And now we also need to supply different kinds of chili sauce, such as onion chili sauce, green pepper chili sauce, tomato chili sauce, and so on.
Wife: Of course not. I would hire more workers to chop the vegetables, and I would need more mixer grinders so that I can produce chili sauce faster.
Me: That's right, so now you have to distribute the work. You will need several people chopping vegetables together. Each person handles one full bag of vegetables, and each person is in effect performing a simple map operation. Each person takes vegetables out of the bag and deals with only one vegetable at a time, that is, chops it, until the bag is empty.
In this way, once all the workers have finished chopping, the workbench (where everyone works) holds chopped onion, chopped tomato, chopped garlic, and so on.
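The idea of several workers each emptying their own bag can be sketched as follows. This is a rough illustration only: threads on a single machine stand in for the separate machines of a real cluster, and `chop_bag`, the bag contents, and the worker count are made-up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def chop_bag(bag):
    """One worker's map task: chop the vegetables in a bag, one at a time.

    Each piece is labeled with the vegetable's name so it can be sorted later.
    """
    return [(name, f"chopped {name}") for name in bag]


# Hypothetical input: one bag of vegetables per worker.
bags = [
    ["onion", "tomato"],
    ["garlic", "onion"],
    ["tomato", "tomato"],
]

# Three workers chop their bags at the same time.
with ThreadPoolExecutor(max_workers=3) as workers:
    chopped_bags = list(workers.map(chop_bag, bags))

# Everything the workers produced ends up on the shared workbench.
workbench = [piece for chopped_bag in chopped_bags for piece in chopped_bag]
```

Because no worker needs anything from another worker's bag, adding more workers (and more bags) speeds up the chopping without changing `chop_bag` itself.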
Wife: But how can I make different kinds of chili sauce?
Me: Now you will see the stage of MapReduce that we skipped: the shuffle stage. MapReduce groups together all the vegetable pieces output by the map operations, based on their key. The shuffle happens automatically. You can think of the key as the name of an ingredient, such as onion. So all the pieces with the onion key are gathered together and sent to the grinder that grinds onions, and you get onion chili sauce. Similarly, all the tomato pieces are sent to the grinder labeled tomato, which produces tomato chili sauce.
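The shuffle stage can be sketched as grouping key-value pairs by key before reducing each group separately. This is a minimal single-process illustration, assuming the map step emitted `(ingredient name, piece)` pairs; in a real framework the grouping is done for you across machines. The `shuffle` and `grind` functions and the sample data are hypothetical.

```python
from collections import defaultdict


def shuffle(pairs):
    """Shuffle step: gather all pieces that share the same key."""
    grouped = defaultdict(list)
    for key, piece in pairs:
        grouped[key].append(piece)
    return grouped


def grind(key, pieces):
    """Reduce step: one grinder per key turns its pieces into one sauce."""
    return f"{key} chili sauce ({len(pieces)} pieces)"


# Hypothetical output of the map step: (key, piece) pairs on the workbench.
workbench = [
    ("onion", "chopped onion"),
    ("tomato", "chopped tomato"),
    ("onion", "chopped onion"),
    ("garlic", "chopped garlic"),
    ("tomato", "chopped tomato"),
    ("tomato", "chopped tomato"),
]

grouped = shuffle(workbench)
sauces = {key: grind(key, pieces) for key, pieces in grouped.items()}
```

Each key ends up at exactly one grinder, so the onion grinder never sees tomato pieces and the grinders can all run independently.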
Finally, the pizza was ready. She nodded and said she had understood what MapReduce was. I only hope that the next time she hears "MapReduce", she will have a better idea of what I do.
Note: The following passage, found on the Internet, explains MapReduce in the simplest possible language:
We want to count all the books in the library. You count up shelf #1, I count up shelf #2. That's map. The more people we get, the faster it goes.
Now we get together and add our individual counts. That's reduce.
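The library example translates almost directly into code. A small sketch, with made-up shelf names and book titles:

```python
# Hypothetical library: each shelf is a list of book titles.
shelves = {
    "shelf 1": ["book a", "book b", "book c"],
    "shelf 2": ["book d", "book e"],
}


def count_shelf(books):
    """Map: one person counts the books on one shelf."""
    return len(books)


# Map: every shelf is counted independently (one person per shelf).
counts = [count_shelf(books) for books in shelves.values()]

# Reduce: get together and add up the individual counts.
total = sum(counts)
```

With the two shelves above, `counts` is `[3, 2]` and `total` is `5`.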
Address: http://cloud.csdn.net/a/20110826/303688.html