Install and configure Hadoop 2.6 and integrate the Eclipse development environment

Last Update:2015-01-27 Source: Internet

Author: User

Tags hdfs dfs


Install Java and Hadoop on Ubuntu 14.04

Java is installed in /usr/lib/jvm/jdk1.7.0_72.

1. Download the archive.

2. Use sudo to create the jvm folder and cp the archive into it.

3. Decompress it with tar -zxvf.

4. Change the owner: sudo chown -R castle:castle hadoop-2.6.0

5. Configure the environment variables.

Add the following to ~/.profile (it can also go in ~/.bashrc):

# Set Java env
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_72
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH

# Set Hadoop env
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.6.0
export PATH=$PATH:$HADOOP_HOME/bin

Run source ~/.profile so that the file takes effect without logging out.
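After sourcing the profile, one way to confirm that the two bin directories really ended up on PATH is a small check like the sketch below. The function names path_has_dir and check_hadoop_env are hypothetical helpers, not part of Hadoop or of the original article; they only inspect the JAVA_HOME, HADOOP_HOME and PATH variables set above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article): succeed if directory $2
# is one of the entries of the colon-separated list $1.
path_has_dir() {
  case ":$1:" in
    *":$2:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print "ok" or "missing" for the two bin directories the profile adds to PATH.
check_hadoop_env() {
  for dir in "${JAVA_HOME:-}/bin" "${HADOOP_HOME:-}/bin"; do
    if path_has_dir "$PATH" "$dir"; then
      echo "ok: $dir"
    else
      echo "missing: $dir"
    fi
  done
}
```

Calling check_hadoop_env in a new terminal shows whether the profile was picked up; a "missing" line usually means the file was edited but not sourced.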

Hadoop is installed in /usr/local/hadoop/hadoop-2.6.0.

The installation steps are the same as the ones for Java above.

1. Configure etc/hadoop/hadoop-env.sh

# Set to the root of your Java installation
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_72

# Hadoop
export HADOOP_PREFIX=/usr/local/hadoop/hadoop-2.6.0

2. Pseudo-distributed configuration

etc/hadoop/core-site.xml:

<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/hadoop-2.6.0/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

etc/hadoop/hdfs-site.xml:

<configuration>
<property>
<name>dfs.replication</name>
<!-- The value is missing from this copy of the article; 1 is the usual setting for a single-node, pseudo-distributed setup. -->
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/hadoop-2.6.0/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/hadoop-2.6.0/dfs/data</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
<!-- This property keeps Eclipse from being denied read/write access. -->
</property>
</configuration>
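Because a typo in these files is easy to miss, it can help to read a property back before starting any daemon. The sketch below is a minimal, assumption-laden example: site_prop is a hypothetical helper, it does plain text matching rather than real XML parsing, and it assumes the value element directly follows the name element and contains no whitespace.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> that directly follows <name>KEY</name>
# in a Hadoop *-site.xml file. Plain-text matching only, not an XML parser.
#   $1 = path to the XML file, $2 = property name
site_prop() {
  file=$1
  # Escape the dots in the property name so sed treats them literally.
  key_re=$(printf '%s' "$2" | sed 's/\./\\./g')
  # Join the file into one line, then capture the value after the name.
  tr -d '\n' < "$file" |
    sed -n "s|.*<name>[[:space:]]*${key_re}[[:space:]]*</name>[[:space:]]*<value>[[:space:]]*\([^<[:space:]]*\)[[:space:]]*</value>.*|\1|p"
}
```

For example, site_prop etc/hadoop/core-site.xml fs.defaultFS should print the URI configured above. On a machine where Hadoop is already on the PATH, hdfs getconf -confKey fs.defaultFS reports the value that Hadoop itself resolves, which is the more reliable check.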

etc/hadoop/mapred-site.xml:

<configuration>
<property>
<!-- MapReduce parameter -->
<!-- The new framework supports third-party MapReduce development frameworks, including non-YARN architectures such as SmartTalk and DGSG. This value is usually set to yarn. If it is not configured, submitted jobs run only in local mode, not in distributed mode. -->
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

Note: earlier versions of MapReduce required the following configuration:

<property>
<name>mapred.job.tracker</name>
<!-- The value is missing from this copy of the article. -->
</property>

In the new framework this has been replaced by the ResourceManager and NodeManager configuration items in yarn-site.xml, and the querying of historical jobs has been split out of the JobTracker into separate mapreduce.jobtracker.jobhistory settings.

Therefore, you do not need to configure this option here. Configure the relevant properties in yarn-site.xml.

etc/hadoop/yarn-site.xml:

<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>

For the differences between the old and new versions of MapReduce, see the following:

The most classic cluster configuration method of Xiami.

Other modified articles

3. Configure passwordless SSH login

If Ubuntu does not have the SSH-related software installed:

$ sudo apt-get install ssh
$ sudo apt-get install rsync

Set up passphraseless ssh

Now check that you can ssh to the localhost without a passphrase:

$ ssh localhost

If you cannot ssh to localhost without a passphrase, execute the following commands:

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

ssh-keygen generates the SSH key.

localhost still had a problem; ssh could not connect:

ssh: connect to host localhost port 22: Connection refused

The solution, found on the Internet:

1. First check whether an sshd process is running:

$ ps -e | grep ssh

2. If not, start it:

$ sudo /etc/init.d/ssh start

3. If it cannot be started, sshd is not installed, so install it:

$ sudo apt-get install openssh-server

4. Restart.

5. Check again:

1695 ? 00:00:00 ssh-agent
12407 ? 00:00:00 sshd

castle@castle-X550VC:~$ ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is ae:23:4a:95:bc:37:dd:1a:5b:48:4f:66:e2:87:12:1c.
Are you sure you want to continue connecting (yes/no)? y
Please type 'yes' or 'no': yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.20.-43-generic x86_64)

 * Documentation: https://help.ubuntu.com/

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.
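When the key setup above is repeated, for example while chasing the "Connection refused" problem, the same public key can end up in authorized_keys several times. The sketch below shows one way to make that step repeatable. authorize_key is a hypothetical helper, not something from the original article or from OpenSSH, and the chmod reflects the assumption that sshd's StrictModes check is enabled, as it is by default.

```shell
#!/bin/sh
# Hypothetical helper: append the public key in file $1 to the authorized_keys
# file $2, but only if that exact line is not already there.
authorize_key() {
  pub=$1
  auth=$2
  mkdir -p "$(dirname "$auth")"
  touch "$auth"
  key=$(cat "$pub")
  # -x: whole-line match, -F: fixed string, so the key is not read as a regex.
  if ! grep -qxF -- "$key" "$auth"; then
    printf '%s\n' "$key" >> "$auth"
  fi
  # sshd's StrictModes check can reject key files that others may write to.
  chmod 600 "$auth"
}
```

With the key generated above, the call would be authorize_key ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys; running it twice leaves a single entry.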
$ bin/hdfs namenode -format

bin/hdfs namenode -format only needs to be executed once. If you run it a second time, a problem appears: each namenode format creates a new namenodeId under /usr/local/hadoop/hadoop-2.6.0/tmp/dfs/name, but the datanode's copy is not updated, so the clusterID of the datanode no longer matches the clusterID of the namenode.

The solution to this problem is to modify the namenodeId under .../tmp/dfs/name.
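The mismatch described above can be confirmed before editing anything. The sketch below assumes the usual HDFS storage layout, in which each storage directory holds a current/VERSION file with a clusterID= line; cluster_id and check_cluster_ids are hypothetical helper names, not Hadoop commands.

```shell
#!/bin/sh
# Hypothetical helper: print the clusterID recorded in an HDFS VERSION file.
cluster_id() {
  sed -n 's/^clusterID=//p' "$1"
}

# Compare the IDs stored for the namenode ($1) and the datanode ($2).
# Both arguments are storage directories that contain current/VERSION.
check_cluster_ids() {
  nn=$(cluster_id "$1/current/VERSION")
  dn=$(cluster_id "$2/current/VERSION")
  if [ -n "$nn" ] && [ "$nn" = "$dn" ]; then
    echo "match: $nn"
  else
    echo "mismatch: namenode=$nn datanode=$dn"
    return 1
  fi
}
```

A call such as check_cluster_ids /usr/local/hadoop/hadoop-2.6.0/tmp/dfs/name /usr/local/hadoop/hadoop-2.6.0/tmp/dfs/data illustrates the idea; the datanode path here is an assumption, so substitute the directories your own configuration uses.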
As for why I ran the format every time in Hadoop 0.20.2: I think it was because a single format had failed for me.

Create a folder in HDFS:

$ bin/hdfs dfs -mkdir /user

Start HDFS:

$ sbin/start-dfs.sh

Check the running processes with the jps command:

2855 org.eclipse.equinox.launcher_1.3.0.v20140415-2008.jar
11127 DataNode
10975 NameNode
11432 Jps
11284 SecondaryNameNode

This indicates that HDFS started successfully.
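Reading the jps listing by eye works for one machine, but the same check can be scripted. The sketch below is a hypothetical helper, missing_daemons, which reads jps output on standard input and names any of the three HDFS daemons shown above that is absent; it assumes the two-column "pid name" format of the listing.

```shell
#!/bin/sh
# Hypothetical helper: read `jps` output on stdin and print the name of each
# expected HDFS daemon that does not appear. Prints nothing if all are present.
missing_daemons() {
  listing=$(cat)
  for name in NameNode DataNode SecondaryNameNode; do
    # Compare the second column exactly, so that NameNode does not match
    # SecondaryNameNode.
    printf '%s\n' "$listing" |
      awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' ||
      echo "$name"
  done
}
```

Typical use is jps | missing_daemons; empty output means all three processes are running.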
$ sbin/start-yarn.sh

$ sbin/stop-dfs.sh

$ sbin/stop-yarn.sh

If you run a hello world job such as WordCount in Eclipse and the console does not print the job's progress, copy etc/hadoop/log4j.properties from the Hadoop installation folder into the src folder of the Eclipse project. The console then shows output like this:

15/01/24 10:30:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

15/01/24 10:30:13 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
15/01/24 10:30:13 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
15/01/24 10:30:13 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
15/01/24 10:30:13 INFO input.FileInputFormat: Total input paths to process : 2
15/01/24 10:30:14 INFO mapreduce.JobSubmitter: number of splits:2
15/01/24 10:30:14 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local632218717_0001
15/01/24 10:30:14 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
15/01/24 10:30:14 INFO mapreduce.Job: Running job: job_local632218717_0001
15/01/24 10:30:14 INFO mapred.LocalJobRunner: OutputCommitter set in config null
15/01/24 10:30:14 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
15/01/24 10:30:15 INFO mapred.LocalJobRunner: Waiting for map tasks
15/01/24 10:30:15 INFO mapred.LocalJobRunner: Starting task: attempt_local632218717_0001_m_000000_0
15/01/24 10:30:15 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
15/01/24 10:30:15 INFO mapred.MapTask: Processing split: hdfs://localhost:9000/user/castle/wordcount_input/input1:0+32
15/01/24 10:30:15 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
15/01/24 10:30:15 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
15/01/24 10:30:15 INFO mapred.MapTask: soft limit at 83886080
15/01/24 10:30:15 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
15/01/24 10:30:15 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
15/01/24 10:30:15 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
15/01/24 10:30:15 INFO mapred.LocalJobRunner:
15/01/24 10:30:15 INFO mapred.MapTask: Starting flush of map output
15/01/24 10:30:15 INFO mapred.MapTask: Spilling map output
15/01/24 10:30:15 INFO mapred.MapTask: bufstart = 0; bufend = 52; bufvoid = 104857600
15/01/24 10:30:15 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214380(104857520); length = 17/6553600
15/01/24 10:30:15 INFO mapred.MapTask: Finished spill 0
15/01/24 10:30:15 INFO mapred.Task: Task:attempt_local632218717_0001_m_000000_0 is done. And is in the process of committing
15/01/24 10:30:15 INFO mapred.LocalJobRunner: map
15/01/24 10:30:15 INFO mapred.Task: Task 'attempt_local632218717_0001_m_000000_0' done.
15/01/24 10:30:15 INFO mapred.LocalJobRunner: Finishing task: attempt_local632218717_0001_m_000000_0
15/01/24 10:30:15 INFO mapred.LocalJobRunner: Starting task: attempt_local632218717_0001_m_000001_0
15/01/24 10:30:15 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
15/01/24 10:30:15 INFO mapred.MapTask: Processing split: hdfs://localhost:9000/user/castle/wordcount_input/input2:0+29
15/01/24 10:30:15 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
15/01/24 10:30:15 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
15/01/24 10:30:15 INFO mapred.MapTask: soft limit at 83886080
15/01/24 10:30:15 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
15/01/24 10:30:15 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
15/01/24 10:30:15 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
15/01/24 10:30:15 INFO mapred.LocalJobRunner:
15/01/24 10:30:15 INFO mapred.MapTask: Starting flush of map output
15/01/24 10:30:15 INFO mapred.MapTask: Spilling map output
15/01/24 10:30:15 INFO mapred.MapTask: bufstart = 0; bufend = 49; bufvoid = 104857600
15/01/24 10:30:15 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214380(104857520); length = 17/6553600
15/01/24 10:30:15 INFO mapred.MapTask: Finished spill 0
15/01/24 10:30:15 INFO mapred.Task: Task:attempt_local632218717_0001_m_000001_0 is done. And is in the process of committing
15/01/24 10:30:15 INFO mapred.LocalJobRunner: map
15/01/24 10:30:15 INFO mapred.Task: Task 'attempt_local632218717_0001_m_000001_0' done.
15/01/24 10:30:15 INFO mapred.LocalJobRunner: Finishing task: attempt_local632218717_0001_m_000001_0
15/01/24 10:30:15 INFO mapred.LocalJobRunner: map task executor complete.
15/01/24 10:30:15 INFO mapred.LocalJobRunner: Waiting for reduce tasks
15/01/24 10:30:15 INFO mapred.LocalJobRunner: Starting task: attempt_local632218717_0001_r_000000_0
15/01/24 10:30:15 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
15/01/24 10:30:15 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@158e338a
15/01/24 10:30:15 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=626471744, maxSingleShuffleLimit=156617936, mergeThreshold=413471360, ioSortFactor=10, memToMemMergeOutputsThreshold=10
15/01/24 10:30:15 INFO reduce.EventFetcher: attempt_local632218717_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
15/01/24 10:30:15 INFO mapreduce.Job: Job job_local632218717_0001 running in uber mode : false
15/01/24 10:30:15 INFO mapreduce.Job: map 100% reduce 0%
15/01/24 10:30:16 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local632218717_0001_m_000000_0 decomp: 40 len: 44 to MEMORY
15/01/24 10:30:16 INFO reduce.InMemoryMapOutput: Read 40 bytes from map-output for attempt_local632218717_0001_m_000000_0
15/01/24 10:30:16 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 40, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory -> 40
15/01/24 10:30:16 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local632218717_0001_m_000001_0 decomp: 51 len: 55 to MEMORY
15/01/24 10:30:16 INFO reduce.InMemoryMapOutput: Read 51 bytes from map-output for attempt_local632218717_0001_m_000001_0
15/01/24 10:30:16 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 51, inMemoryMapOutputs.size() -> 2, commitMemory -> 40, usedMemory -> 91
15/01/24 10:30:16 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
15/01/24 10:30:16 INFO mapred.LocalJobRunner: 2 / 2 copied.
15/01/24 10:30:16 INFO reduce.MergeManagerImpl: finalMerge called with 2 in-memory map-outputs and 0 on-disk map-outputs
15/01/24 10:30:16 INFO mapred.Merger: Merging 2 sorted segments
15/01/24 10:30:16 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 71 bytes
15/01/24 10:30:16 INFO reduce.MergeManagerImpl: Merged 2 segments, 91 bytes to disk to satisfy reduce memory limit
15/01/24 10:30:16 INFO reduce.MergeManagerImpl: Merging 1 files, 93 bytes from disk
15/01/24 10:30:16 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
15/01/24 10:30:16 INFO mapred.Merger: Merging 1 sorted segments
15/01/24 10:30:16 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 79 bytes
15/01/24 10:30:16 INFO mapred.LocalJobRunner: 2 / 2 copied.
15/01/24 10:30:16 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
15/01/24 10:30:16 INFO mapred.Task: Task:attempt_local632218717_0001_r_000000_0 is done. And is in the process of committing
15/01/24 10:30:16 INFO mapred.LocalJobRunner: 2 / 2 copied.
15/01/24 10:30:16 INFO mapred.Task: Task attempt_local632218717_0001_r_000000_0 is allowed to commit now
15/01/24 10:30:16 INFO output.FileOutputCommitter: Saved output of task 'attempt_local632218717_0001_r_000000_0' to hdfs://localhost:9000/user/castle/wordcount_output/_temporary/0/task_local632218717_0001_r_000000
15/01/24 10:30:16 INFO mapred.LocalJobRunner: reduce > reduce
15/01/24 10:30:16 INFO mapred.Task: Task 'attempt_local632218717_0001_r_000000_0' done.
15/01/24 10:30:16 INFO mapred.LocalJobRunner: Finishing task: attempt_local632218717_0001_r_000000_0
15/01/24 10:30:16 INFO mapred.LocalJobRunner: reduce task executor complete.
15/01/24 10:30:16 INFO mapreduce.Job: map 100% reduce 100%
15/01/24 10:30:16 INFO mapreduce.Job: Job job_local632218717_0001 completed successfully
15/01/24 10:30:16 INFO mapreduce.Job: Counters: 38
    File System Counters
        FILE: Number of bytes read=1732
        FILE: Number of bytes written=754881
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=154
        HDFS: Number of bytes written=42
        HDFS: Number of read operations=25
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=5
    Map-Reduce Framework
        Map input records=10
        Map output records=10
        Map output bytes=101
        Map output materialized bytes=99
        Input split bytes=242
        Combine input records=10
        Combine output records=7
        Reduce input groups=5
        Reduce shuffle bytes=99
        Reduce input records=7
        Reduce output records=5
        Spilled Records=14
        Shuffled Maps=2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=0
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=855638016
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=61
    File Output Format Counters
        Bytes Written=42

Hadoop 2.6 and Eclipse integrated development configuration

Compile the Eclipse plug-in. First clone the repository:

$ git clone https://github.com/winghc/hadoop2x-eclipse-plugin.git

Then use ant to compile it:

$ cd src/contrib/eclipse-plugin
$ ant jar -Dversion=2.6.0 -Declipse.home=/usr/local/eclipse -Dhadoop.home=/usr/local/hadoop-2.6.0

Eclipse must be installed manually; a one-click installation from the command line does not work. Set eclipse.home and hadoop.home to the paths in your own environment.

The generated plug-in is located at: /home/hunter/hadoop2x-eclipse-plugin/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-2.6.0.jar

Unfortunately, this did not succeed for me: the build hung during compilation and no error was reported. Later I found that the git repository includes a prebuilt hadoop 2.2.0 version of the plug-in under release. That one works; the others do not.

In the plug-in settings, the configuration on the right should be consistent with core-site.xml. The one on the left does not need to be configured; in the previous version of MapReduce it matched the configuration in mapred-site.xml.

Install and configure Hadoop2.2.0 on CentOS

Build a Hadoop environment on Ubuntu 13.04

Cluster configuration for Ubuntu 12.10 + Hadoop 1.2.1

Build a Hadoop environment on Ubuntu (standalone mode + pseudo Distribution Mode)

Configuration of Hadoop environment in Ubuntu

Detailed tutorial on creating a Hadoop environment for standalone Edition

Build a Hadoop environment (using virtual machines to build two Ubuntu systems in a Windows environment)

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
