Hadoop FAQ
1. What is Hadoop?
Hadoop is a distributed computing platform written in Java. It incorporates features similar to those of the Google File System and of MapReduce. For some details, see HadoopMapReduce.
2. What platforms does Hadoop run on?
Java 1.5.x or higher, preferably from Sun.
Linux and Windows are the supported operating systems, but BSD, Mac OS/X, and OpenSolaris are known to work. (Windows requires the installation of Cygwin.)
2.1 Building/testing Hadoop on Windows
The Hadoop build on Windows can be run from inside a Windows (not Cygwin) command prompt window.
Whether you set environment variables in a batch file or in System->Properties->Advanced->Environment Variables, the following environment variables need to be set:
set ANT_HOME=c:\apache-ant-1.7.1
set JAVA_HOME=c:\jdk1.6.0.4
set PATH=%PATH%;%ANT_HOME%\bin
Then open a command prompt window, cd to your workspace directory (in my case it is c:\workspace\hadoop) and run ant. Since I am interested in running the contrib test cases I do the following:
ant -l build.log -Dtest.output=yes test-contrib
Other targets work similarly. I just wanted to document this because I spent some time trying to figure out why the ant build would not run from a Cygwin command prompt window. If you are building/testing on Windows and haven't figured it out yet, this should help you get started.
3. How well does Hadoop scale?
Hadoop has been demonstrated on clusters of up to 2000 nodes. Sort performance on 900 nodes is good (sorting 9TB of data on 900 nodes takes around 1.8 hours) and improving using these non-default configuration values:
dfs.block.size = 134217728
dfs.namenode.handler.count = 40
mapred.reduce.parallel.copies = 20
mapred.child.java.opts = -Xmx512m
fs.inmemory.size.mb = 200
io.sort.factor = 100
io.sort.mb = 200
io.file.buffer.size = 131072
Sort performance on 1400 nodes and 2000 nodes is pretty good too - sorting 14TB of data on a 1400-node cluster takes 2.2 hours; sorting 20TB on a 2000-node cluster takes 2.5 hours. The updates to the above configuration being:
mapred.job.tracker.handler.count = 60
mapred.reduce.parallel.copies = 50
tasktracker.http.threads = 50
mapred.child.java.opts = -Xmx1024m
JRE 1.6 or higher is highly recommended; for example, it deals with large numbers of connections much more efficiently.
4. Do I have to write my application in Java?
No. There are several ways to incorporate non-Java code.
HadoopStreaming permits any shell command to be used as a map or reduce function (see the example command below).
libhdfs, a JNI-based C API for talking to HDFS.
Hadoop Pipes, a SWIG-compatible C++ API (non-JNI) to write map-reduce jobs.
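For example, a minimal streaming job can be launched with a shell mapper and reducer like this (a sketch only; the streaming jar name and location, and the input/output paths, depend on your installation):
bin/hadoop jar contrib/streaming/hadoop-streaming.jar -input /user/me/input -output /user/me/output -mapper /bin/cat -reducer /usr/bin/wc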
5. Can I help to make Hadoop better?
If you have trouble figuring out how to use Hadoop, then, once you've figured something out (perhaps with the help of the mailing lists), pass that knowledge on to others by adding something to this wiki.
If you find something that you wish were done better, and know how to fix it, read HowToContribute, and contribute a patch.
6. HDFS. If I add new data-nodes to the cluster will HDFS move the blocks to the newly added nodes in order to balance disk space utilization between the nodes?
No, HDFS will not move blocks to new nodes automatically. However, newly created files will likely have their blocks placed on the new nodes.
There are several ways to rebalance the cluster manually.
Select a subset of files that take up a good percentage of your disk space; copy them to new locations in HDFS; remove the old copies of the files; rename the new copies to their original names.
A simpler way, with no interruption of service, is to turn up the replication of the files, wait for the transfers to stabilize, and then turn the replication back down (see the example commands at the end of this list).
Yet another way to re-balance blocks is to turn off the data-node that is full, wait until its blocks are replicated, and then bring it back again. The over-replicated blocks will be randomly removed from different nodes, so you really get them rebalanced, not just removed from the current node.
Finally, you can use the bin/start-balancer.sh command to run a balancing process to move blocks around the cluster automatically.
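For example, the temporary-replication approach mentioned above can be driven from the shell like this (a sketch: the path is a placeholder and the normal replication factor is assumed to be 3):
bin/hadoop dfs -setrep -w 4 -R /path/to/data
bin/hadoop dfs -setrep -w 3 -R /path/to/data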
7. HDFS. What is the purpose of the secondary name-node?
The term "secondary name-node" is somewhat misleading. It is isn't a name-node in the sense so data-nodes cant connect to the secondary name-node, and in no event it can replace the PRI Mary Name-node in the case of the its failure.
The only purpose of the secondary name-node are to perform periodic checkpoints. The secondary Name-node periodically downloads current name-node image and edits log files, joins to new image and Uploads the new image back to the (primary and the) Name-node. The User Guide.
So if the name-node fails and your can restart it on the Mahouve physical node then there be no need to shutdown data-nodes, ethically the NA Me-node need to be restarted. If you cant use the "old" node anymore you'll need to copy the latest image somewhere else. The latest image can be found either on the node, used to be the primary unreported if failure; Or on the secondary name-node. The latter would be the latest checkpoint without subsequent edits, which is the logs most name spaces recent may be mis Sing there. You'll also need to restart the whole cluster into this case.
8. What is the distributed Cache used for?
The distributed cache is used to distribute large read-only files that are needed by map/reduce jobs to the cluster. The framework will copy the necessary files from a URL (either hdfs: or http:) on to the slave node before any tasks for the job are executed on that node. The files are only copied once per job and so should not be modified by the application.
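A minimal sketch of using the cache with the old org.apache.hadoop.mapred API (the file path and class name are examples only):
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;

public class CacheAwareTask extends MapReduceBase {
  private Path[] localFiles;

  // At job submission time: register an HDFS file to be shipped to every slave node.
  public static void registerFiles(JobConf conf) {
    DistributedCache.addCacheFile(URI.create("/user/me/lookup.dat"), conf);
  }

  // In the task: look up the local copies that the framework placed on this node.
  public void configure(JobConf job) {
    try {
      localFiles = DistributedCache.getLocalCacheFiles(job);
    } catch (IOException e) {
      throw new RuntimeException("could not locate cached files", e);
    }
  }
}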
9. Can I create/write-to HDFS files directly from my map/reduce tasks?
Yes. (Clearly, you want this since you need to create/write-to files other than the output-file written out by OutputCollector.)
Caveats:
<glossary>
${mapred.output.dir} is the eventual output directory for the job (JobConf.setOutputPath / JobConf.getOutputPath).
${taskid} is the actual id of the individual task-attempt (e.g. task_200709221812_0001_m_000000_0); a TIP is a bunch of ${taskid}s (e.g. task_200709221812_0001_m_000000).
</glossary>
With speculative-execution on, one could face issues with two instances of the same TIP (running simultaneously) trying to open/write-to the same file (path) on HDFS. Hence the app-writer will have to pick unique names (e.g. using the complete taskid, i.e. task_200709221812_0001_m_000000_0) per task-attempt, not just per TIP. (Clearly, this needs to be done even if the user doesn't create/write-to files directly via reduce tasks.)
To get around this, the framework helps the application-writer out by maintaining a special ${mapred.output.dir}/_${taskid} sub-dir for each task-attempt on HDFS where the output of the reduce task-attempt goes. On successful completion of the task-attempt, the files in ${mapred.output.dir}/_${taskid} (of the successful task-attempt only) are moved to ${mapred.output.dir}. Of course, the framework discards the sub-directory of unsuccessful task-attempts. This is completely transparent to the application.
The application-writer can take advantage of this by creating any side-files required in ${mapred.output.dir} during execution of the reduce-task, and the framework will move them out similarly - thus you don't have to pick unique paths per task-attempt.
Fine-print: the value of ${mapred.output.dir} during execution of a particular task-attempt is actually ${mapred.output.dir}/_${taskid}, not the value set by JobConf.setOutputPath. So, just create any HDFS files you want in ${mapred.output.dir} from your reduce task to take advantage of this feature.
The entire discussion holds true for maps of jobs with reducer=NONE (i.e. 0 reduces), in which case the output of the map goes directly to HDFS.
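For example, a reduce task could create a side-file like this (a sketch; the file name is illustrative, old org.apache.hadoop.mapred API):
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;

public class SideFileReducerBase extends MapReduceBase {
  public void configure(JobConf job) {
    try {
      // During execution this resolves to ${mapred.output.dir}/_${taskid},
      // so every task-attempt writes into its own private sub-directory.
      Path outDir = new Path(job.get("mapred.output.dir"));
      FileSystem fs = outDir.getFileSystem(job);
      FSDataOutputStream side = fs.create(new Path(outDir, "side-data"));
      side.writeBytes("example side output\n");
      side.close();
    } catch (IOException e) {
      throw new RuntimeException("could not create side-file", e);
    }
  }
}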
MR. How do I get each of my maps to work on one complete input-file and not allow the framework to split-up my files?
Essentially a job's input is represented by the InputFormat (interface) / FileInputFormat (base class).
For this purpose one would need a 'non-splittable' FileInputFormat, i.e. an input-format which essentially tells the map-reduce framework that it cannot be split-up and processed. To do this you need your particular input-format to return false for the isSplitable call.
E.g. org.apache.hadoop.mapred.SortValidator.RecordStatsChecker.NonSplitableSequenceFileInputFormat in src/test/org/apache/hadoop/mapred/SortValidator.java
In addition to implementing the InputFormat interface and having isSplitable(...) return false, it is also necessary to implement the RecordReader interface for returning the whole content of the input file. (The default is LineRecordReader, which splits the file into separate lines.)
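A minimal sketch of such an input format (the class name is an example; as noted above, returning the whole file as a single record would additionally require a custom RecordReader):
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.TextInputFormat;

public class NonSplittableTextInputFormat extends TextInputFormat {
  // Tell the framework never to split files handled by this input format.
  protected boolean isSplitable(FileSystem fs, Path file) {
    return false;
  }
}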
The other, quick-fix option is to set mapred.min.split.size to a large enough value.
Why do I see broken images in the jobdetails.jsp page?
In hadoop-0.15, map/reduce task completion graphics were added. The graphs are produced as SVG (Scalable Vector Graphics) images, which are basically XML files, embedded in HTML content. The graphics are tested successfully in Firefox 2 on Ubuntu and Mac OS. However, for other browsers, one should install an additional plugin to the browser to see the SVG images. Adobe's SVG Viewer can be found at http://www.adobe.com/svg/viewer/install/.
HDFS. Does the name-node stay in safe mode till all under-replicated files are fully replicated?
No. During safe mode replication of blocks is prohibited. The name-node waits until all or a majority of data-nodes report their blocks.
Depending on how safe mode parameters are configured, the name-node will stay in safe mode until a specific percentage of blocks of the system is minimally replicated, as defined by dfs.replication.min. If the safe mode threshold dfs.safemode.threshold.pct is set to 1 then all blocks of all files should be minimally replicated.
Minimal replication does not mean full replication. Some replicas may be missing, and in order to replicate them the name-node needs to leave safe mode.
Learn more about safe mode.
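The current safe mode status can be checked, and safe mode can be left manually if necessary, with the dfsadmin command:
bin/hadoop dfsadmin -safemode get
bin/hadoop dfsadmin -safemode leave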
MR. I see a maximum of 2 maps/reduces spawned concurrently on each TaskTracker, how do I increase that?
Use the configuration knobs mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum to control the number of maps/reduces spawned simultaneously on a TaskTracker. By default, it is set to 2, hence one sees a maximum of 2 maps and 2 reduces at a given instance on a TaskTracker.
You can set those on a per-tasktracker basis to accurately reflect your hardware (i.e. set those to higher numbers on a beefier tasktracker, etc.).
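For example, to allow 4 concurrent maps and 4 concurrent reduces per TaskTracker (the value 4 is only an illustration), the following could be added to the configuration:
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>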
MR. Submitting map/reduce jobs as a different user doesn't work.
The problem is that you haven't configured your map/reduce system directory to a fixed value. The default works for single node systems, but not for 'real' clusters. I like to use:
<property>
  <name>mapred.system.dir</name>
  <value>/hadoop/mapred/system</value>
  <description>The shared directory where MapReduce stores control files.</description>
</property>
Note that this directory is in your default file system and must be accessible from both the client and server machines, and is typically in HDFS.
HDFS. How do I set up a Hadoop node to use multiple volumes?
Data-nodes can store blocks in multiple directories typically allocated on different local disk drives. In order to set up multiple directories one needs to specify a comma-separated list of pathnames as the value of the configuration parameter dfs.data.dir. Data-nodes will attempt to place an equal amount of data in each of the directories.
The name-node also supports multiple directories, which in this case store the name space image and the edits log. The directories are specified via the dfs.name.dir configuration parameter. The name-node directories are used for the name space data replication so that the image and the log could be restored from the remaining volumes if one of them fails.
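For example, a data-node with two local drives and a name-node with two image directories might use entries like these (all paths are placeholders):
<property>
  <name>dfs.data.dir</name>
  <value>/disk1/hdfs/data,/disk2/hdfs/data</value>
</property>
<property>
  <name>dfs.name.dir</name>
  <value>/disk1/hdfs/name,/disk2/hdfs/name</value>
</property>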
HDFS. What happens if one Hadoop client renames a file or a directory containing this file while another client is still writing into it?
Starting with release hadoop-0.15, a file will appear in the name space as soon as it is created. If a writer is writing to a file and another client renames either the file itself or any of its path components, then the original writer will get an IOException either when it finishes writing to the current block or when it closes the file.
HDFS. I want to make a large cluster smaller by taking out a bunch of nodes. How can this be done?
On a large cluster removing one or two data-nodes will not lead to any data loss, because the name-node will replicate their blocks as long as it detects that the nodes are dead. With a large number of nodes getting removed or dying, the probability of losing data is higher.
Hadoop offers the decommission feature to retire a set of existing data-nodes. The nodes to be retired should be included in the exclude file, and the exclude file name should be specified as the configuration parameter dfs.hosts.exclude. This file should have been specified during namenode startup. It could be a zero length file. You must use the full hostname, ip or ip:port format in this file. Then the shell command
bin/hadoop dfsadmin -refreshNodes
should be called, which forces the name-node to re-read the exclude file and start the decommission process.
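The corresponding configuration entry might look like this (the path is a placeholder):
<property>
  <name>dfs.hosts.exclude</name>
  <value>/path/to/hadoop/conf/exclude</value>
</property>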
Decommission does not happen momentarily since it requires replication of potentially a large number of blocks, and we do not want the cluster to be overwhelmed with just this one job. The decommission progress can be monitored on the name-node Web UI. Until all blocks are replicated the node will be in the "Decommission In Progress" state. When decommission is done the state will change to "Decommissioned". The nodes can be removed whenever decommission is finished.
The decommission process can be terminated at any time by editing the configuration or the exclude files and repeating the -refreshNodes command.
What kind of hardware scales best for Hadoop?
The short answer is dual processor/dual core machines with 4-8GB of RAM using ECC memory. Machines should be moderately high-end commodity machines to be most cost-effective; they typically cost 1/2 - 2/3 the cost of normal production application servers but are not desktop-class machines. This tends to be $2-5K. For a more detailed discussion, see the MachineScaling page.
Wildcard characters don't work correctly in FsShell.
When you issue a command in FsShell, you may want to apply that command to more than one file. FsShell provides a wildcard character to help you do so. The * (asterisk) character can be used to take the place of any set of characters. For example, if you would like to list all of the files in your account which begin with the letter x, you could use the ls command with the * wildcard:
bin/hadoop dfs -ls x*
Sometimes, the native OS wildcard support causes unexpected results. To avoid this problem, enclose the expression in single or double quotes and it should work correctly.
bin/hadoop dfs -ls 'in*'
How does GridGain compare to Hadoop?
GridGain does not support data intensive jobs. For more details, see HadoopVsGridGain.
HDFS. Can I have multiple files in HDFS with different block sizes?
Yes. HDFS provides an API to specify block size when you create a file.
FileSystem.create(Path, overwrite, bufferSize, replication, blockSize, progress)
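A sketch of calling this API (the path, buffer size, replication factor and 128MB block size are example values):
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockSizeExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Create a file whose blocks are 128MB instead of the configured default.
    FSDataOutputStream out = fs.create(new Path("/user/me/bigfile"),
        true,                // overwrite if the file already exists
        4096,                // io buffer size
        (short) 3,           // replication factor
        128L * 1024 * 1024,  // block size in bytes
        null);               // no progress reporter
    out.writeBytes("data written with a custom block size\n");
    out.close();
  }
}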
Does HDFS make block boundaries between records?
No, HDFS does not provide a record-oriented API and therefore is not aware of records and boundaries between them.
How do map/reduce InputSplits handle record boundaries correctly?
It is the responsibility of the InputSplit's RecordReader to start and end at a record boundary. For SequenceFiles, every 2k bytes has a 20-byte sync mark between the records. These sync marks allow the RecordReader to seek to the start of the InputSplit, which contains a file, offset and length, and find the first sync mark after the start of the split. The RecordReader continues processing records until it reaches the first sync mark after the end of the split. The first split of each file naturally starts immediately and not after the first sync mark. In this way, it is guaranteed that each record will be processed by exactly one mapper.
Text files are handled similarly, using newlines instead of sync marks.
HDFS. What happens when two clients try to write into the same HDFS file?
HDFS supports exclusive writes only.
When the "the" Name-node to open the "file for writing", the Name-node Pell a lease to the "client to" create this File. When the second client tries to open the Mahouve file for writing, the Name-node'll, and the lease for the "file is already grante D to another client, and would reject the open request for the second client.
I have a new node I want to add to a running Hadoop cluster; how do I start services on just one node?
This also applies to the case where a machine has crashed and rebooted, etc, and you need to get it to rejoin the cluster. You do not need to shut down and/or restart the entire cluster in this case.
First, add the new node's DNS name to the conf/slaves file on the master node.
Then log in to the new slave node and execute:
$ cd path/to/hadoop
$ bin/hadoop-daemon.sh start datanode
$ bin/hadoop-daemon.sh start tasktracker
Is there an easy way to see the status and health of my cluster?
There are web-based interfaces to both the JobTracker (MapReduce master) and the NameNode (HDFS master) which display status pages about the state of the entire system. By default, these are located at http://job.tracker.addr:50030/ and http://name.node.addr:50070/.
The JobTracker status page will display the state of all nodes, as well as the job queue and status about all currently running jobs and tasks. The NameNode status page will display the state of all nodes and the amount of free space, and provides the ability to browse the DFS via the web.
You can also see some basic HDFS cluster health data by running:
$ bin/hadoop dfsadmin -report
Last edited 2008-12-13 00:08:38 by Aaronkimball