Author: Er Qing (Weibo: http://weibo.com/xtfggef)
It's time to learn Hadoop systematically. It may be a bit late, but if you want to pick up a hot technology, the place to start is setting up the environment (see also the official documentation).
The software and versions used in this article are as follows:
- Ubuntu 14.10 64-bit Server Edition
- Hadoop 2.6.0
- JDK 1.7.0_71
- ssh
- rsync
First, prepare a machine running Linux; either a physical machine or a virtual machine will do, and Oracle VirtualBox is recommended for building a VM. This article uses Windows 7 + VirtualBox + Ubuntu 14.10 Server Edition.
Download a Hadoop release from the Apache home page (Apache Hadoop Mirror), and download the JDK from the Oracle website (JDK download).
1. Build the base environment and download the Hadoop and JDK installation packages
2. Sign in to Ubuntu using PuTTY
Execute the following two commands to install ssh and rsync:
$ sudo apt-get install ssh
$ sudo apt-get install rsync
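To confirm that both tools are in place (a quick sanity check, not part of the original steps), query their versions:
$ ssh -V
$ rsync --version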
3. Use WinSCP to copy the downloaded Hadoop and JDK packages to Ubuntu
Use tar -zxvf xxx.tar.gz to extract the two packages separately, and copy them to the /opt directory.
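For example, assuming the default archive names for these versions (your file names may differ):
$ tar -zxvf hadoop-2.6.0.tar.gz
$ tar -zxvf jdk-7u71-linux-x64.tar.gz
$ sudo cp -r hadoop-2.6.0 jdk1.7.0_71 /opt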
4. Configure the Java environment
With root permission, open the /etc/profile file and add at the end:
JAVA_HOME=/opt/jdk1.7.0_71
PATH=$JAVA_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
export JAVA_HOME PATH CLASSPATH
Execute . /etc/profile to make the changes take effect immediately. (Note the space between the dot and the path.) The purpose of this configuration is simply to set PATH and CLASSPATH, just like the environment variables we set on Windows. Test with javac or java -version to see whether it succeeded.
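If the configuration took effect, the version check should report the JDK you installed, along these lines (exact build strings may differ):
$ java -version
java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)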
5. Configure Hadoop
After copying the unpacked package to /opt, we need to make some simple configuration of Hadoop.
Edit etc/hadoop/hadoop-env.sh and add the following configuration:
# set to the root of your Java installation
export JAVA_HOME=/opt/jdk1.7.0_71

# assuming your installation directory is /opt/hadoop-2.6.0
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/opt/hadoop-2.6.0"}
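A quick way to confirm the setup (an extra check, not in the original steps) is to ask Hadoop for its version from the installation directory:
$ cd /opt/hadoop-2.6.0
$ ./bin/hadoop version
Hadoop 2.6.0
...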
With this, Hadoop is minimally configured, and a little more configuration lets us enable one of three modes:
- Standalone mode
- Pseudo-distributed mode
- Fully distributed mode
Here we configure pseudo-distributed mode: on a single node, pseudo-distributed means that each Hadoop daemon runs in its own Java process.
1. Edit the configuration files etc/hadoop/core-site.xml and etc/hadoop/hdfs-site.xml
etc/hadoop/core-site.xml:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
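To verify that the settings are picked up (a sanity check beyond the original article), hdfs getconf can echo individual keys back:
$ ./bin/hdfs getconf -confKey fs.defaultFS
hdfs://localhost:9000
$ ./bin/hdfs getconf -confKey dfs.replication
1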
2. Set up passwordless SSH to localhost
First check whether you can already ssh to localhost without a password:
$ ssh localhost
If it fails, use the following commands to generate a key pair and authorize it:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
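After that, ssh localhost should log you in without prompting. If it still asks for a password, overly open permissions on ~/.ssh are a common cause (a troubleshooting note, not from the original article):
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys
$ ssh localhost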
3. Run Hadoop
First format the filesystem: in the bin directory, execute ./hdfs namenode -format
adam@ubuntu:/opt/hadoop-2.6.0/bin$ ./hdfs namenode -format
15/01/11 11:37:08 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ubuntu/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.0
STARTUP_MSG:   classpath = /opt/hadoop-2.6.0/etc/hadoop:/opt/hadoop-2.6.0/share/hadoop/common/lib/...
STARTUP_MSG:   build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99... ; compiled on 2014-11-13T21:10Z
STARTUP_MSG:   java = 1.7.0_71
************************************************************/
15/01/11 11:37:08 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
15/01/11 11:37:08 INFO namenode.NameNode: createNameNode [-format]
Formatting using clusterid: CID-6645a7aa-b5c4-4b8c-a0b7-ece148452be5
15/01/11 11:37:10 INFO namenode.FSNamesystem: No KeyProvider found.
15/01/11 11:37:10 INFO namenode.FSNamesystem: fsLock is fair:true
15/01/11 11:37:10 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
15/01/11 11:37:10 INFO blockmanagement.BlockManager: defaultReplication = 1
15/01/11 11:37:10 INFO blockmanagement.BlockManager: maxReplication = 512
15/01/11 11:37:10 INFO blockmanagement.BlockManager: minReplication = 1
15/01/11 11:37:10 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
15/01/11 11:37:10 INFO namenode.FSNamesystem: HA Enabled: false
15/01/11 11:37:10 INFO namenode.FSNamesystem: Append Enabled: true
... (GSet capacity, safemode, and retry-cache lines omitted) ...
15/01/11 11:37:11 INFO namenode.FSImage: Allocated new BlockPoolId: BP-...
15/01/11 11:37:11 INFO common.Storage: Storage directory /tmp/hadoop-ad...
15/01/11 11:37:11 INFO util.ExitUtil: Exiting with status 0
15/01/11 11:37:11 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/
adam@ubuntu:/opt/hadoop-2.6.0/bin$
Start the NameNode daemon
Switch to the hadoop-2.6.0/sbin directory and execute ./start-dfs.sh:
adam@ubuntu:/opt/hadoop-2.6.0/sbin$ ./start-dfs.sh
Starting namenodes on [localhost]
adam@localhost's password:
localhost: starting namenode, logging to /opt/hadoop-2.6.0/logs/hadoop-adam-namenode-ubuntu.out
adam@localhost's password:
localhost: starting datanode, logging to /opt/hadoop-2.6.0/logs/hadoop-adam-datanode-ubuntu.out
Starting secondary namenodes [0.0.0.0]
adam@0.0.0.0's password:
0.0.0.0: starting secondarynamenode, logging to /opt/hadoop-2.6.0/logs/hadoop-adam-secondarynamenode-ubuntu.out
adam@ubuntu:/opt/hadoop-2.6.0/sbin$
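Once the daemons are up, the JDK's jps tool should list all three (the PIDs below are illustrative), and the NameNode web interface becomes reachable at http://localhost:50070/:
$ jps
2473 NameNode
2600 DataNode
2783 SecondaryNameNode
2905 Jps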
With that, we have installed a simple single-node, pseudo-distributed Hadoop environment.
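As a final smoke test (these commands follow the official single-node guide; the paths assume /opt/hadoop-2.6.0), create a directory in HDFS and list the root:
$ ./bin/hdfs dfs -mkdir /user
$ ./bin/hdfs dfs -ls /
If the new directory shows up in the listing, HDFS is working.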
- This article is from: Linux Tutorial Network