Compiling Hadoop from Source: Version 2.5.0


My compulsion must be satisfied:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

I want to get rid of this warning!

The native library needs to be compiled for the current environment; otherwise this warning appears, although it has no effect on our job processing.

(However, some bloggers point out that the official pre-compiled package is built against a 32-bit JVM, so when running on a 64-bit JVM it cannot exploit the platform's full performance. And if it cannot exploit full performance, that is unbearable.)

1 Download the source package and basic environment

https://archive.apache.org/dist/hadoop/common/

This time I compiled hadoop-2.5.0.

Download the hadoop-2.5.0-src.tar.gz file and upload it to the directory /home/xuan/opt/softwares.

Give it execute permission:

chmod u+x hadoop-2.5.0-src.tar.gz

Extract:

tar -zxf hadoop-2.5.0-src.tar.gz

Enter the extracted directory and you will see a BUILDING.txt file. Viewing its contents, you can see:

Requirements:

* Unix System

* JDK 1.6+

* Maven 3.0 or later

* Findbugs 1.3.9 (if running findbugs)

* ProtocolBuffer 2.5.0

* CMake 2.6 or newer (if compiling native code)

* Zlib devel (if compiling native code)

* openssl devel (if compiling native hadoop-pipes)

* Internet connection for first build (to fetch all Maven and Hadoop dependencies)

Condition 1: The system I use here is CentOS 6.4 (64-bit).

Condition 2: JDK already installed, java version "1.7.0_67".

Condition 3: Maven also installed, Apache Maven 3.0.5.

(Maven downloads: https://archive.apache.org/dist/maven/maven-3/. Configuration can follow the official site or blogs, and is similar to configuring the Java environment variables.)

Condition 9: Our virtual machine must have network access.
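The three conditions above can be verified with a quick script. A minimal sketch, assuming only that uname is available, with java and mvn checked optionally:

```shell
#!/bin/sh
# Sanity check for build conditions 1-3 before starting the compile.
echo "OS: $(uname -s) $(uname -m)"            # condition 1: a Unix system
if command -v java >/dev/null 2>&1; then
    java -version 2>&1 | head -n 1            # condition 2: JDK 1.6+
else
    echo "java: NOT FOUND"
fi
if command -v mvn >/dev/null 2>&1; then
    mvn -version 2>/dev/null | head -n 1      # condition 3: Maven 3.0+
else
    echo "mvn: NOT FOUND"
fi
```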

2 Preparing for compilation

Switch to root user

Install svn:

yum install svn

(This step is actually optional: our source has already been downloaded, so we do not need svn to fetch it, and it may not be faster than our own download anyway.)

Condition 4: Not mandatory; only needed for running findbugs.

Condition 6: Install autoconf, automake, libtool and cmake (conditions 6, 7 and 8 are required for compiling native code):

yum install autoconf automake libtool cmake

Condition 7: Install ncurses-devel:

yum install ncurses-devel

Condition 8: Install openssl-devel:

yum install openssl-devel

Install gcc:

yum install gcc*

If yum prompts y/n during these installs, just press y on the keyboard; it is asking whether to download some dependency packages. Strictly speaking, the -y flag should be added to the commands. (Some bloggers skipped this step entirely.)
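The installs above can be collected in one place with the -y flag the text mentions. A dry-run sketch that only echoes the commands (the package grouping is my own; remove the leading echo and run as root to actually install):

```shell
#!/bin/sh
# Dry run of the dependency installs; -y answers the y/n prompt automatically.
for pkg in "svn" "autoconf automake libtool cmake" "ncurses-devel" \
           "openssl-devel" "gcc gcc-c++"; do
    echo yum install -y $pkg   # drop "echo" to really install (as root)
done
```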

Condition 5: Install protobuf (required).

Protocol Buffers is a communication (serialization) tool used by Hadoop.

The required version is 2.5.0, but the original download page https://code.google.com/p/protobuf/downloads/list

on Google Code is no longer available, and getting over the firewall is also difficult, so I downloaded this historical version from the address given in:

http://www.tuicool.com/articles/jM7Nn2/

(The project has actually moved to GitHub, where you can also download protobuf: click the Branch dropdown, then Tags, and you can find 2.5.0. Some bloggers report that version 3.0 causes build errors.)

Upload it to /home/xuan/opt/softwares on Linux, give it execute permission, extract it, and enter the extracted directory.

Then run the following commands to install (remember to enter the directory first and execute them there):

./configure

make

make check

make install

This is the usual way a source package is installed; see http://zyjustin9.iteye.com/blog/2026579.

The make check step is not strictly necessary; it runs the package's self-tests to verify the build.

This process takes a while, so don't be nervous.

(Some blogs say to configure a PROTOC_HOME environment variable after installation; the document I followed did not set it, and most people don't.)

Without setting it, running protoc --version directly returns:

libprotoc 2.5.0
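To confirm the installed version matches the required 2.5.0, the output of protoc --version can be parsed and compared. check_protoc_version is a hypothetical helper of my own, not part of protobuf:

```shell
#!/bin/sh
# check_protoc_version: compare `protoc --version` output (e.g. "libprotoc 2.5.0")
# against a required version string, printing ok or mismatch.
check_protoc_version() {
    got=$(echo "$1" | awk '{print $2}')
    if [ "$got" = "$2" ]; then
        echo "ok: libprotoc $got"
    else
        echo "mismatch: found $got, need $2"
    fi
}

check_protoc_version "libprotoc 2.5.0" "2.5.0"
# Only meaningful once protoc is actually installed:
if command -v protoc >/dev/null 2>&1; then
    check_protoc_version "$(protoc --version)" "2.5.0"
fi
```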

3 Compiling the source code

Enter the extracted source directory /home/xuan/opt/softwares/hadoop-2.5.0-src and directly execute the following command:

mvn package -Pdist,native -DskipTests -Dtar

BUILDING.txt itself tells us about this command:

Create source and binary distributions with native code and documentation:

$ mvn package -Pdist,native,docs,src -DskipTests -Dtar

We mainly want native, not the source and documentation distributions, so we save time by omitting docs,src.

Running this command takes a long time and requires the network to stay available (my build ran from 11:30 to nearly 13:00).

(The first time I ran it, I went off to eat and came back to find something had gone wrong, so the second time I watched a hacker show for the 1.5 hours while it ran. ~\(≧▽≦)/~)

Well, that is just how slow my internet connection is...
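When the long build finally finishes, it helps to confirm the distribution tarball was actually produced. check_artifact is a hypothetical helper of my own, and the path below is the blog's example layout:

```shell
#!/bin/sh
# check_artifact: given the hadoop-dist target directory, report whether the
# binary distribution tarball from the build is present.
check_artifact() {
    if [ -f "$1/hadoop-2.5.0.tar.gz" ]; then
        echo "artifact present: $1/hadoop-2.5.0.tar.gz"
    else
        echo "no artifact in $1 (build may have failed)"
    fi
}

check_artifact /home/xuan/opt/softwares/hadoop-2.5.0-src/hadoop-dist/target
```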

4 After the compilation is complete

Under the hadoop-2.5.0-src/hadoop-dist/target directory inside /home/xuan/opt/softwares (that is, /home/xuan/opt/softwares/hadoop-2.5.0-src/hadoop-dist/target), you can see:

[root@... target]$ ll
total 390200
drwxr-xr-x 2 root root      4096 1 13:07 antrun
-rw-r--r-- 1 root root      1637 1 13:07 dist-layout-stitching.sh
-rw-r--r-- 1 root root       654 1 13:07 dist-tar-stitching.sh
drwxr-xr-x 9 root root      4096 1 13:07 hadoop-2.5.0
-rw-r--r-- 1 root root 132941160 1 13:07 hadoop-2.5.0.tar.gz
-rw-r--r-- 1 root root      2745 1 13:07 hadoop-dist-2.5.0.jar
-rw-r--r-- 1 root root 266585127 1 13:07 hadoop-dist-2.5.0-javadoc.jar
drwxr-xr-x 2 root root      4096 1 13:07 javadoc-bundle-options
drwxr-xr-x 2 root root      4096 1 13:07 maven-archiver
drwxr-xr-x 2 root root      4096 1 13:07 test-dir

(You should change the user and the group when necessary)

Going further down the path, you can find the native directory under /home/xuan/opt/softwares/hadoop-2.5.0-src/hadoop-dist/target/hadoop-2.5.0/lib.

Use it to replace the native directory under /home/xuan/opt/modules/hadoop-2.5.0/lib (of course, back up the original one first, or it will be too late for regrets if something goes wrong).
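The backup-and-replace step above can be sketched as a small function. replace_native is a hypothetical helper of my own, and the commented-out paths are the blog's example layout:

```shell
#!/bin/sh
# replace_native: back up the deployed lib/native directory, then copy the
# freshly built one over it.
replace_native() {
    src="$1"   # freshly built native dir (from hadoop-dist/target)
    dst="$2"   # deployed Hadoop's lib/native dir
    [ -d "$dst" ] && mv "$dst" "$dst.bak"   # keep a backup, just in case
    cp -r "$src" "$dst"
}

# Example invocation with the blog's paths:
# replace_native \
#   /home/xuan/opt/softwares/hadoop-2.5.0-src/hadoop-dist/target/hadoop-2.5.0/lib/native \
#   /home/xuan/opt/modules/hadoop-2.5.0/lib/native
```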

sbin/start-dfs.sh

sbin/start-yarn.sh

Then run a wordcount example to test it:

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar wordcount /user/xuan/mapreduce/wordcount/input /user/xuan/mapreduce/wordcount/outputtest

bin/hadoop dfs -cat /user/xuan/mapreduce/wordcount/outputtest/part*

Check the output, and everything is OK.
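Hadoop 2.x can also report directly whether the native libraries now load, via bin/hadoop checknative -a, which lists hadoop/zlib/snappy/lz4/bzip2 support. A guarded sketch (the wrapper function is my own) that degrades gracefully if hadoop is not on the PATH:

```shell
#!/bin/sh
# check_native: run Hadoop's native-library self-check if hadoop is available,
# otherwise print a reminder of the command to run from the install directory.
check_native() {
    if command -v hadoop >/dev/null 2>&1; then
        hadoop checknative -a
    else
        echo "hadoop not on PATH; run bin/hadoop checknative -a from the install dir"
    fi
}

check_native
```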

There is one more pitfall. After the compiled build was tested, while writing up homework notes I found that the JDK version was wrong. Looking at /etc/profile, JAVA_HOME was configured correctly, but checking with echo ${PATH} showed no trace of it in the PATH. The fix is to change /etc/profile to export PATH=$JAVA_HOME/bin:$PATH. The reason is that PATH is searched from front to back, so the system's own java was being found first; I would not have thought to look at PATH otherwise, since it showed no obvious signs. And in fact:

[root@... bin]# pwd
/usr/bin
[root@... bin]# ll /usr/bin | grep java
-rwxr-xr-x 1 root root 17:38 gjavah
lrwxrwxrwx 1 root root 1 10:48 java -> /etc/alternatives/java

Reference:

http://hyz301.iteye.com/blog/2235331

Then run source /etc/profile, check java -version again, and everything is OK.
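Why prepending works: the shell searches the PATH directories from left to right, so $JAVA_HOME/bin must come before /usr/bin for our JDK's java to win. A tiny demonstration, using the blog's JDK path as an illustrative JAVA_HOME:

```shell
#!/bin/sh
# The first PATH entry is searched first, so prepending JAVA_HOME/bin makes
# our JDK shadow the system /usr/bin/java.
JAVA_HOME=/home/xuan/opt/modules/jdk1.7.0_67   # illustrative path
PATH="$JAVA_HOME/bin:$PATH"
echo "$PATH" | cut -d: -f1   # the first directory the shell will search
```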

Main references:

http://www.cnblogs.com/shishanyuan/p/4164104.html#undefined (main reference for this article)

http://www.cnblogs.com/hanganglin/p/4349919.html (auxiliary, for cross-checking)

https://www.zybuluo.com/awsekfozc/note/213815 (how to change the jar download source to speed up compilation)

http://linxiao.iteye.com/blog/2269047 (downloads protobuf from GitHub and installs findbugs but not gcc; presumably a difference in the system)

http://liubao0312.blog.51cto.com/2213529/1557657 (from a cluster perspective)

http://my.oschina.net/jeeker/blog/619275
