This article describes how to build a Hadoop distribution, that is, a runnable version of Hadoop, from the Hadoop 2.4.1 source code on CentOS 7.
Why build it ourselves instead of using Apache's binary [bin] distribution? Because Hadoop depends on the underlying implementation of the Linux system. For example:
hadoop fs -ls /
is implemented through Java native code, so the native libraries bundled with Hadoop should be built for the system it runs on. [Personal opinion]
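One rough way to see whether a given Hadoop installation ships a native library at all is to look for `libhadoop.so` under its `lib/native` directory. The sketch below assumes that layout; the helper name `has_native_lib` and the default install path are made up for illustration.

```shell
#!/usr/bin/env bash
# Return 0 if a libhadoop shared object exists under <dir>/lib/native.
# $1 = Hadoop installation directory
has_native_lib() {
    local dir="$1/lib/native"
    local f
    for f in "$dir"/libhadoop.so*; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        if [ -e "$f" ]; then
            return 0
        fi
    done
    return 1
}

# Hypothetical install location; override by exporting HADOOP_HOME.
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop-2.4.1}"

if has_native_lib "$HADOOP_HOME"; then
    echo "native library present in $HADOOP_HOME/lib/native"
else
    echo "no libhadoop.so found in $HADOOP_HOME/lib/native"
fi
```

This only checks that the file exists, not that it matches the machine's architecture; running `file` on the library would be the next step for that.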
The related files can be downloaded from the link at the end of this article.
Directory layout of the download:
- Hadoop
  - hadoop-2.4.1.tar.gz: the distribution I built on CentOS 7. You can download it, configure it, and run it directly; refer to the article:
  - hadoop-2.4.1-src.tar.gz: the Hadoop source code
- Relative_tools
  - Maven [you only need to configure the maven_hadoop environment variable]
  - Protocolbuf [that is, Google Protocol Buffers]: must be built yourself; check that its environment variables are configured
  - Findbugs: must be built yourself; check that its environment variables are configured
  - CMake: find it yourself, build it, and configure its environment variables
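Before starting the build, it can help to confirm that the tools above are actually reachable through the environment variables you configured. The following is a minimal sketch, assuming the usual command names `java`, `mvn`, `protoc`, and `cmake`; the helper `check_tool` is a made-up name, not part of any of these tools.

```shell
#!/usr/bin/env bash
# Report whether a command is visible on PATH.
# $1 = command name; prints a status line and returns 0 if found, 1 otherwise.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
        return 0
    fi
    echo "missing: $1"
    return 1
}

missing=0
# Assumed command names for the JDK, Maven, Protocol Buffers and CMake.
for tool in java mvn protoc cmake; do
    check_tool "$tool" || missing=$((missing + 1))
done
echo "$missing prerequisite(s) missing"
```

If `protoc` is found, `protoc --version` should also be checked by hand against the 2.5.0 requirement listed in the build instructions.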
Requirements:

* Unix System
* JDK 1.6+
* Maven 3.0 or later
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* CMake 2.6 or newer (if compiling native code)
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)

----------------------------------------------------------------------------------
Maven build goals:

 * Clean                     : mvn clean
 * Compile                   : mvn compile [-Pnative]
 * Run tests                 : mvn test [-Pnative]
 * Create JAR                : mvn package
 * Run findbugs              : mvn compile findbugs:findbugs
 * Run checkstyle            : mvn compile checkstyle:checkstyle
 * Install JAR in M2 cache   : mvn install
 * Deploy JAR to Maven repo  : mvn deploy
 * Run clover                : mvn test -Pclover [-DcloverLicenseLocation=${user.name}/.clover.license]
 * Run Rat                   : mvn apache-rat:check
 * Build javadocs            : mvn javadoc:javadoc
 * Build distribution        : mvn package [-Pdist][-Pdocs][-Psrc][-Pnative][-Dtar]
 * Change Hadoop version     : mvn versions:set -DnewVersion=NEWVERSION

 Build options:

  * Use -Pnative to compile/bundle native code
  * Use -Pdocs to generate & bundle the documentation in the distribution (using -Pdist)
  * Use -Psrc to create a project source TAR.GZ
  * Use -Dtar to create a TAR with the distribution (using -Pdist)

----------------------------------------------------------------------------------
Importing projects to eclipse

When you import the project to eclipse, install hadoop-maven-plugins at first.

  $ cd hadoop-maven-plugins
  $ mvn install

Then, generate eclipse project files.

  $ mvn eclipse:eclipse -DskipTests

At last, import to eclipse by specifying the root directory of the project via
[File] > [Import] > [Existing Projects into Workspace].

----------------------------------------------------------------------------------
Building distributions:

Create binary distribution without native code and without documentation:

  $ mvn package -Pdist -DskipTests -Dtar

Create binary distribution with native code and with documentation:

  $ mvn package -Pdist,native,docs -DskipTests -Dtar

Create source distribution:

  $ mvn package -Psrc -DskipTests

Create source and binary distributions with native code and documentation:

  $ mvn package -Pdist,native,docs,src -DskipTests -Dtar

Create a local staging version of the website (in /tmp/hadoop-site)

  $ mvn clean site; mvn site:stage -DstagingDirectory=/tmp/hadoop-site
- As long as the prerequisites above (a Linux system, CMake, Maven, Findbugs, Protocol Buffers, network connectivity, and Java) have been built or installed successfully and the other conditions are met, run the commands from the code area above in the Hadoop source code directory.
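As a rough sketch of that last step, the script below runs a native binary build and then looks for the resulting tarball. It assumes the tarball lands in `hadoop-dist/target/` under the source root; the profile combination `-Pdist,native` is one choice built from the options listed above, and the helper `find_dist_tarball` and the `SRC_DIR` variable are made up for illustration.

```shell
#!/usr/bin/env bash
# Print the path of the distribution tarball if the build produced one.
# $1 = source root, $2 = Hadoop version
find_dist_tarball() {
    local tarball="$1/hadoop-dist/target/hadoop-$2.tar.gz"
    if [ -f "$tarball" ]; then
        echo "$tarball"
        return 0
    fi
    return 1
}

# Assumed: root of the unpacked hadoop-2.4.1-src tree (defaults to the current directory).
SRC_DIR="${SRC_DIR:-$PWD}"

# Only attempt the build when this really looks like a Maven source tree.
if [ -f "$SRC_DIR/pom.xml" ] && command -v mvn >/dev/null 2>&1; then
    (cd "$SRC_DIR" && mvn package -Pdist,native -DskipTests -Dtar)
fi

if tarball=$(find_dist_tarball "$SRC_DIR" 2.4.1); then
    echo "distribution: $tarball"
else
    echo "no tarball under $SRC_DIR/hadoop-dist/target yet"
fi
```

Run it from the source directory, or set `SRC_DIR` first. The first build needs network access because Maven downloads all dependencies.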
Download link: http://pan.baidu.com/s/1pJJQkPl