Reprinted from: http://jbm3072.iteye.com/blog/1113827
Hadoop is a framework for distributed storage and computing. In day-to-day use, stock Hadoop does not always meet our needs, so we may have to modify the Hadoop source code and then recompile and repackage it.
The following describes in detail how to check out the Hadoop source from SVN, compile it, and import it into Eclipse.
Because our project uses Hadoop 0.20.2, our secondary development is based on hadoop-0.20.2.
(1) First, check out the source code from SVN. SVN address:
- http://svn.apache.org/repos/asf/hadoop/common/tags/release-0.20.2/
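The checkout can also be done from the command line. The following is a minimal sketch, assuming the svn command-line client is installed. The names SVN_BASE, hadoop_tag_url and checkout_hadoop, and the target directory name, are illustrative and not part of the original post.

```shell
#!/bin/sh
# Illustrative names only: SVN_BASE, hadoop_tag_url and checkout_hadoop
# are assumptions made for this sketch.
HADOOP_VERSION="0.20.2"
SVN_BASE="http://svn.apache.org/repos/asf/hadoop/common/tags"

# Print the SVN tag URL for a given release version.
hadoop_tag_url() {
  printf '%s/release-%s/\n' "$SVN_BASE" "$1"
}

# Check out the tag into a local directory named hadoop-<version>.
# Requires the svn client and network access.
checkout_hadoop() {
  svn checkout "$(hadoop_tag_url "$1")" "hadoop-$1"
}

# Usage: checkout_hadoop "$HADOOP_VERSION"
```

Keeping the version in one variable makes it easy to point the same function at a different release tag.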
(2) While the download is running, prepare the development and build environment. On Windows, install at least the following software:
- JDK 6: set the PATH environment variable after installation.
- Ant: after downloading and unzipping, add Ant's bin directory to the PATH environment variable.
- Cygwin: see http://ebiquity.umbc.edu/Tutorials/Hadoop/03%20-%20Prerequistes.html for installation and configuration. Select as many packages as possible during the Cygwin installation to make later development easier.
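Before building, it can help to confirm that the tools listed above are reachable from the Cygwin shell. This is a small sketch, not part of the original post. The function name missing_tools is hypothetical, and the exact list of tools to check is an assumption.

```shell
#!/bin/sh
# Print each named tool that cannot be found on PATH, one per line.
# missing_tools is a hypothetical helper written for this sketch.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage (prints nothing when everything is installed):
#   missing_tools java javac ant svn sed
```

An empty result means every tool was found. Any name printed points to a missing installation or a missing PATH entry.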
(3) After the checkout finishes, open a command line, change to the root directory of the checked-out Hadoop source, and run:
- $ ant
Ant now starts to download dependencies and compile the source. I hit a compilation error at this point. The cause turned out to be the package-info.java file generated by $HADOOP_HOME/src/saveVersion.sh, which made the compilation fail. Modify saveVersion.sh:
- unset LANG
- unset LC_CTYPE
- version=$1
- user=`whoami`   # change it to a fixed value, for example, jbm3072
- date=`date`
- if [ -d .git ]; then
-   revision=`git log -1 --pretty=format:"%H"`
-   hostname=`hostname`
-   branch=`git branch | sed -n -e 's/^* //p'`
-   url="git://$hostname/$cwd on branch $branch"
- else
-   revision=`svn info | sed -n -e 's/Last Changed Rev: \(.*\)/\1/p'`
-   url=`svn info | sed -n -e 's/URL: \(.*\)/\1/p'`
- fi
- mkdir -p build/src/org/apache/hadoop
- cat << EOF | \
-   sed -e "s/VERSION/$version/" -e "s/USER/$user/" -e "s/DATE/$date/" \
-       -e "s|URL|$url|" -e "s/REV/$revision/" \
-       > build/src/org/apache/hadoop/package-info.java
- /*
-  * Generated by src/saveVersion.sh
-  */
- @HadoopVersionAnnotation(version="VERSION", revision="REV",
-                          user="USER", date="DATE", url="URL")
- package org.apache.hadoop;
- EOF
After this modification, the compilation succeeds.
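To see what the heredoc and sed pipeline in saveVersion.sh produce, here is a minimal sketch. It renders the same template from fixed values instead of calling whoami, date, or svn. The function name render_package_info and all sample values, including the date, URL, and revision number, are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Sketch only: render the package-info.java template from fixed values.
# render_package_info and every sample value below are hypothetical.
render_package_info() {
  # $1=version  $2=user  $3=date  $4=url  $5=revision
  sed -e "s/VERSION/$1/" -e "s/USER/$2/" -e "s/DATE/$3/" \
      -e "s|URL|$4|" -e "s/REV/$5/" << 'EOF'
/*
 * Generated by src/saveVersion.sh
 */
@HadoopVersionAnnotation(version="VERSION", revision="REV",
                         user="USER", date="DATE", url="URL")
package org.apache.hadoop;
EOF
}

# Print the rendered file to standard output.
render_package_info "0.20.2" "jbm3072" "Mon Jul 4 2011" \
  "http://example.invalid/repo" "12345"
```

A value containing `/` would break the `s/.../.../` expressions, which is why the URL substitution uses `|` as its delimiter. A fixed, plain user name avoids depending on whatever whoami prints on a given machine.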
(4) Copy the Eclipse files to the project directory
Run the following command:
- ant eclipse-files
This copies the Eclipse project files into the project directory.
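A quick way to confirm that the target did its job is to look for the Eclipse metadata files. The sketch below assumes that the eclipse-files target places .project and .classpath in the project root. The function name has_eclipse_files is hypothetical.

```shell
#!/bin/sh
# Succeed only if the given directory holds both Eclipse metadata files.
# has_eclipse_files is a hypothetical helper; the file names assume the
# standard Eclipse Java project layout.
has_eclipse_files() {
  [ -f "$1/.project" ] && [ -f "$1/.classpath" ]
}

# Usage, from the Hadoop source root:
#   has_eclipse_files . && echo "ready to import into Eclipse"
```

If the check fails, Eclipse will most likely not offer the directory as an existing project in the import dialog.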
(5) Import the project
Open Eclipse, choose Import from the File menu, select General -> Existing Projects into Workspace, click Next, and select the Hadoop source directory. Eclipse now recognizes Hadoop as an Eclipse project; click Finish. After a short while, the project should build in Eclipse without errors.
(6) You can now modify the Hadoop source code in Eclipse.