Build a Hadoop Source Code Learning Environment


The Hadoop directory layout

The previous article served as a simple primer: we downloaded Hadoop, wrote a HelloWorld job, briefly analyzed the programming essentials, and then built a more complex example. Next we turn to the source code itself and study how it is implemented.

Before researching the source, let's first look at the overall layout of the hadoop-0.20.2 directory. This is the directory listing right after the source archive is unpacked:

Directory/File   Description
bin              Executable shell scripts; all Hadoop operations are launched from here
conf             Directory where the configuration files live
ivy              Apache Ivy manages the project's jar dependencies; this is Ivy's main directory
lib              Library directory holding the referenced jar packages
src              The main source code
build.xml        The build configuration file; we compile with Ant (see the sketch after this table)
CHANGES.txt      Text file recording the change history of this version
ivy.xml          Ivy's configuration file
LICENSE.txt      The license for this release
NOTICE.txt       Text file recording legal notices that must be observed
README.txt       Description file
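Since build.xml is an Ant build file, compiling the tree is just a matter of running ant from the top-level directory. A minimal sketch follows; the ~/hadoop-0.20.2 location is an assumption:

# Compile the source tree; the default target builds the core classes
# and lets Ivy fetch the dependency jars into build/ivy/lib.
cd ~/hadoop-0.20.2
ant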

Going into the src directory, you can see its contents, including the core, hdfs, and mapred directories that we will work with:

Build a learning environment for the Hadoop source

First create an ordinary Java project in Eclipse:

Click Next, enter the project name Hadoopsrcstudy, and click Next again.

Accept all the remaining defaults, then click Finish:

Next, add the source code. Open the src folder under the Hadoop directory and copy the core, hdfs, and mapred directories into the project (select the three folders and press Ctrl+C, then go back to Eclipse, select the Hadoopsrcstudy project, and press Ctrl+V).

After the folders are added, they cannot yet be compiled as source code, so right-click the project and open its properties:

Select Java Build Path, switch to the Source tab on the right, and click Add Folder:

In the pop-up dialog, check the core, hdfs, and mapred directories, then click OK twice to complete the setup.

Create a jar folder in the project, then copy in the jar files from the following locations (a shell sketch of this step follows the list):

hadoop-0.20.2/build/ivy/lib/hadoop/common/*.jar

hadoop-0.20.2/lib/jsp-2.1/*.jar

hadoop-0.20.2/lib/kfs-0.2.2.jar

hadoop-0.20.2/lib/hsqldb-1.8.0.10.jar
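If you prefer to do this copy from a terminal, here is a minimal shell sketch. The ~/hadoop-0.20.2 and ~/workspace/Hadoopsrcstudy locations are assumptions; adjust them to your own layout.

# Assumed paths -- adjust to your own setup.
HADOOP_SRC=~/hadoop-0.20.2
PROJECT=~/workspace/Hadoopsrcstudy

# Gather the dependency jars into the project's jar folder.
mkdir -p "$PROJECT/jar"
cp "$HADOOP_SRC"/build/ivy/lib/hadoop/common/*.jar "$PROJECT/jar/"
cp "$HADOOP_SRC"/lib/jsp-2.1/*.jar "$PROJECT/jar/"
cp "$HADOOP_SRC"/lib/kfs-0.2.2.jar "$PROJECT/jar/"
cp "$HADOOP_SRC"/lib/hsqldb-1.8.0.10.jar "$PROJECT/jar/"

Note that the jars under build/ivy/lib only exist after the source has been compiled once with Ant, since Ivy downloads them during the build.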

Then right-click the project, open the Properties page, and on the Java Build Path page select the Libraries tab:

Click Add JARs:

Select all the jar files under the jar folder and click OK twice.

At this point the RccTask.java file still has compile errors (it depends on Ant classes that are not on our build path):

Right-click the file and choose Build Path -> Exclude to keep it out of the build.

Then take the core-site.xml, hdfs-site.xml, mapred-site.xml, and log4j.properties files from the conf folder under the hadoop-0.20.2 directory and place them in the src directory.

Also copy the webapps folder from under hadoop-0.20.2/src into the src directory.

In Eclipse, create a package named org.apache.hadoop under the src directory, then copy the package-info.java file from hadoop-0.20.2/build/src/org/apache/hadoop/ into that package (a consolidated shell sketch of these copy steps follows). The resulting directory layout is as follows:
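If you would rather do these three copy steps from a terminal as well, here is a minimal sketch under the same assumed paths as before:

# Assumed paths -- adjust to your own setup.
HADOOP_SRC=~/hadoop-0.20.2
PROJECT=~/workspace/Hadoopsrcstudy

# The site config files and log4j settings go at the root of src.
cp "$HADOOP_SRC"/conf/core-site.xml "$HADOOP_SRC"/conf/hdfs-site.xml \
   "$HADOOP_SRC"/conf/mapred-site.xml "$HADOOP_SRC"/conf/log4j.properties "$PROJECT/src/"

# The web UI resources ship under src/webapps in the source tree.
cp -r "$HADOOP_SRC"/src/webapps "$PROJECT/src/"

# package-info.java is generated by the Ant build and carries the version annotation.
mkdir -p "$PROJECT/src/org/apache/hadoop"
cp "$HADOOP_SRC"/build/src/org/apache/hadoop/package-info.java "$PROJECT/src/org/apache/hadoop/"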

With that, the source debugging environment is complete.

Let Hadoop run in Eclipse

The source code has been added and compiles cleanly; next we run it from Eclipse and verify that it works normally.

Here we start the NameNode from the command line, run the DataNode from Eclipse, and then open another command line and use the fs commands to check whether the content created earlier can still be found.

1. Open a terminal, cd into the hadoop-0.20.2 directory, and execute the bin/hadoop namenode command.

The following error occurred; as the log shows, the DataNode keeps retrying because it cannot reach a NameNode at localhost:9000:

14/12/15 17:31:47 INFO datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = ubuntu/127.0.1.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = -r ; compiled by 'wu' on Sun Nov 07:50:30 PST
************************************************************/
14/12/15 17:31:49 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s).
14/12/15 17:31:50 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s).
14/12/15 17:31:51 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 2 time(s).
14/12/15 17:31:52 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 3 time(s).
14/12/15 17:31:53 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 4 time(s).
14/12/15 17:31:54 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 5 time(s).
14/12/15 17:31:55 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 6 time(s).

To fix this, enter the following command in the terminal:

bin/start-all.sh

The console output is then normal:

14/12/15 17:34:25 INFO datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = ubuntu/127.0.1.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = -r ; compiled by 'wu' on Sun Nov 07:50:30 PST 2014
************************************************************/
14/12/15 17:34:25 INFO common.Storage: Storage directory /tmp/hadoop-wu/dfs/data is not formatted.
14/12/15 17:34:25 INFO common.Storage: Formatting ...
14/12/15 17:34:26 INFO datanode.DataNode: Registered FSDatasetStatusMBean
14/12/15 17:34:26 INFO datanode.DataNode: Opened info server at 50010
14/12/15 17:34:26 INFO datanode.DataNode: Balancing bandwith is 1048576 bytes/s
14/12/15 17:34:26 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
14/12/15 17:34:26 INFO http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50075
14/12/15 17:34:26 INFO http.HttpServer: listener.getLocalPort() returned 50075 webServer.getConnectors()[0].getLocalPort() returned 50075
14/12/15 17:34:26 INFO http.HttpServer: Jetty bound to port 50075
14/12/15 17:34:26 INFO mortbay.log: jetty-6.1.14
14/12/15 17:34:33 INFO mortbay.log: Started SelectChannelConnector@0.0.0.0:50075
14/12/15 17:34:34 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=DataNode, sessionId=null
14/12/15 17:34:34 INFO metrics.RpcMetrics: Initializing RPC Metrics with hostName=DataNode, port=50020
14/12/15 17:34:34 INFO ipc.Server: IPC Server Responder: starting
14/12/15 17:34:34 INFO ipc.Server: IPC Server listener on 50020: starting
14/12/15 17:34:34 INFO ipc.Server: IPC Server handler 0 on 50020: starting

2. In Eclipse, go into the hdfs directory, then into the org.apache.hadoop.hdfs.server.datanode package, open the DataNode.java file, and click Run. You can then see the normal output in Eclipse, with no errors. The same information can be found in the DataNode log under the logs folder. Also, in the earlier command-line window, you can see that the NameNode received an access request from this DataNode.

3. Open another command-line window, cd into the hadoop-0.20.2 directory, and run bin/hadoop fs -ls; you can see the file listing.

4. Then run bin/hadoop fs -cat out/* to see the data that the earlier program run generated in the out directory.

If both commands succeed, the NameNode and the DataNode running in Eclipse are working together. Notice that while the cat command executes, new response output appears in the Eclipse console, indicating that the DataNode is serving the request.

Conversely, we can run the NameNode in Eclipse and the DataNode on the command line, with the same effect; a sketch follows.
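A minimal sketch of the reverse arrangement: in Eclipse, run NameNode.java from the org.apache.hadoop.hdfs.server.namenode package, then start the DataNode in a terminal:

# Start only the DataNode from the shell; the NameNode is already running in Eclipse.
cd ~/hadoop-0.20.2    # assumed location
bin/hadoop datanode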

To see more debug output, open the log4j.properties file under src and change INFO to DEBUG on the line near the top that sets the root logging level; the log output then becomes much more detailed.
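As a sketch, assuming the stock 0.20.2 log4j.properties where the root level is set by the hadoop.root.logger property near the top of the file, the change can be scripted like this:

# Bump the root logger from INFO to DEBUG in the copy of log4j.properties under src.
sed -i 's/hadoop.root.logger=INFO,console/hadoop.root.logger=DEBUG,console/' ~/workspace/Hadoopsrcstudy/src/log4j.properties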

Resources

http://www.cnblogs.com/zjfstudio/p/3919331.html
