Hadoop Installation in Pseudo-Distributed Mode


Pseudo-distributed mode:

Hadoop can run in pseudo-distributed mode on a single node, where separate Java processes simulate the various nodes of a distributed deployment.

1. Install Hadoop

Make sure that the JDK and SSH are installed on the system.
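A quick way to confirm both prerequisites is to query the tools directly. This is only a sanity check on a Red Hat-style system like the one used here; the service name and versions on your machine may differ.

[root@localhost ~]# java -version
[root@localhost ~]# ssh -V
[root@localhost ~]# service sshd status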

1) Download Hadoop from the official website: http://hadoop.apache.org/. The release used here is hadoop-1.1.1-bin.tar.gz.

2) Put the downloaded archive in the /softs directory.

3) Extract hadoop-1.1.1-bin.tar.gz to the /usr directory:

 
[root@localhost usr]# tar -zxvf /softs/hadoop-1.1.1-bin.tar.gz

[root@localhost usr]# ls
bin  etc  games  hadoop-1.1.1  include  java  lib  libexec  local  lost+found  sbin  share  src  tmp
[root@localhost usr]#

 

2. Configure Hadoop

1) Edit the /usr/hadoop-1.1.1/conf/hadoop-env.sh file: find the export JAVA_HOME line and change it to the JDK installation path:

 
export JAVA_HOME=/usr/java/jdk1.6.0_38

2) Configure /usr/hadoop-1.1.1/conf/core-site.xml with the following content:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

3) Configure /usr/hadoop-1.1.1/conf/hdfs-site.xml with the following content:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

4) Configure /usr/hadoop-1.1.1/conf/mapred-site.xml with the following content:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
</configuration>
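With all three files edited, you can optionally check that they are still well-formed XML before moving on. This assumes xmllint (part of libxml2, commonly present on CentOS/RHEL) is installed; Hadoop itself does not require it.

[root@localhost ~]# xmllint --noout /usr/hadoop-1.1.1/conf/core-site.xml /usr/hadoop-1.1.1/conf/hdfs-site.xml /usr/hadoop-1.1.1/conf/mapred-site.xml

No output means the files parsed cleanly.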

 

3. Password-free SSH settings

While Hadoop is running, SSH is used to manage the remote Hadoop daemons, and this access should not require typing a password. Therefore, use key-based authentication.

1) Generate the key pair by executing the following command:

 
[root@localhost ~]# ssh-keygen -t rsa

When prompted with Enter passphrase (empty for no passphrase): and Enter same passphrase again:, just press Enter without typing anything.

[root@localhost ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
16:54:ed:23:0c:04:fa:74:1b:b0:b5:eb:c3:87:43:52 root@localhost.localdomain
The key's randomart image is:
+--[ RSA 2048]----+
|oo+...           |
|.=...            |
|.o Eo.           |
| o = o           |
|  o S ..         |
|     *.          |
|      *.         |
+-----------------+
[root@localhost ~]#

The output shows that the generated key has been saved to /root/.ssh/id_rsa.

2) Go to the /root/.ssh directory and run the following command:

 
[root@localhost .ssh]# cp id_rsa.pub authorized_keys
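If sshd later rejects the key, it is usually a file-permission problem: sshd refuses keys whose directory or authorized_keys file is too open. Tightening the permissions is a harmless precaution, though often not strictly necessary:

[root@localhost .ssh]# chmod 700 /root/.ssh
[root@localhost .ssh]# chmod 600 /root/.ssh/authorized_keys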

3) Then execute:

[root@localhost .ssh]# ssh localhost

You should now be able to connect over SSH without entering a password.

 
[root@localhost .ssh]# ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is e5:44:06:97:b4:66:ba:89:40:95:ba:23:0a:06:2a:74.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Last login: Tue Jan 15 22:08:06 2013 from 192.168.0.101
Hello, MAN
[root@localhost ~]#

 

4. Run Hadoop

1) Format the distributed file system:

 
[root@localhost hadoop-1.1.1]# bin/hadoop namenode -format

[root@localhost hadoop-1.1.1]# bin/hadoop namenode -format
13/01/15 23:56:53 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = localhost.localdomain/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.1.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1411108; compiled by 'hortonfo' on Mon Nov 19 10:48:11 UTC 2012
************************************************************/
13/01/15 23:56:54 INFO util.GSet: VM type       = 32-bit
13/01/15 23:56:54 INFO util.GSet: 2% max memory = 19.33375 MB
13/01/15 23:56:54 INFO util.GSet: capacity      = 2^22 = 4194304 entries
13/01/15 23:56:54 INFO util.GSet: recommended=4194304, actual=4194304
13/01/15 23:56:55 INFO namenode.FSNamesystem: fsOwner=root
13/01/15 23:56:55 INFO namenode.FSNamesystem: supergroup=supergroup
13/01/15 23:56:55 INFO namenode.FSNamesystem: isPermissionEnabled=true
13/01/15 23:56:55 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
13/01/15 23:56:55 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
13/01/15 23:56:55 INFO namenode.NameNode: Caching file names occuring more than 10 times
13/01/15 23:56:55 INFO common.Storage: Image file of size 110 saved in 0 seconds.
13/01/15 23:56:55 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/tmp/hadoop-root/dfs/name/current/edits
13/01/15 23:56:55 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/tmp/hadoop-root/dfs/name/current/edits
13/01/15 23:56:55 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
13/01/15 23:56:55 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
************************************************************/
[root@localhost hadoop-1.1.1]#
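As a quick check, the freshly formatted NameNode storage directory (the path reported in the log above, under /tmp/hadoop-root by default since no dfs.name.dir was configured) should now contain the image and edits files:

[root@localhost hadoop-1.1.1]# ls /tmp/hadoop-root/dfs/name/current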

2) Start the Hadoop daemons:

 
[root@localhost hadoop-1.1.1]# bin/start-all.sh

 

[root@localhost hadoop-1.1.1]# bin/start-all.sh
starting namenode, logging to /usr/hadoop-1.1.1/libexec/../logs/hadoop-root-namenode-localhost.localdomain.out
localhost: starting datanode, logging to /usr/hadoop-1.1.1/libexec/../logs/hadoop-root-datanode-localhost.localdomain.out
localhost: starting secondarynamenode, logging to /usr/hadoop-1.1.1/libexec/../logs/hadoop-root-secondarynamenode-localhost.localdomain.out
starting jobtracker, logging to /usr/hadoop-1.1.1/libexec/../logs/hadoop-root-jobtracker-localhost.localdomain.out
localhost: starting tasktracker, logging to /usr/hadoop-1.1.1/libexec/../logs/hadoop-root-tasktracker-localhost.localdomain.out
[root@localhost hadoop-1.1.1]#
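You can confirm that all five daemons actually started with the jps tool that ships with the JDK. In a healthy pseudo-distributed setup the listing should include NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker (plus Jps itself); the process IDs will of course vary.

[root@localhost hadoop-1.1.1]# jps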

After startup, you can access the NameNode web UI at http://localhost:50070:
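If no browser is available on the machine, a rough reachability check can be done from the shell; the exact status line (200 or a redirect) depends on the release, but any HTTP response indicates the embedded web server is up:

[root@localhost hadoop-1.1.1]# curl -sI http://localhost:50070/ | head -n 1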

 

The NameNode is the daemon that looks after HDFS: it records how files are split into data blocks and which DataNodes store those blocks. Its main function is the centralized management of memory and I/O.
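A simple way to confirm that the NameNode and DataNode are working together is to write a file into HDFS and list it back. The /test directory below is just an example name:

[root@localhost hadoop-1.1.1]# bin/hadoop fs -mkdir /test
[root@localhost hadoop-1.1.1]# bin/hadoop fs -put conf/core-site.xml /test
[root@localhost hadoop-1.1.1]# bin/hadoop fs -ls /test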

You can access the JobTracker at http://localhost:50030/:

 

The JobTracker daemon connects applications to Hadoop. After user code is submitted to the cluster, the JobTracker decides which files are processed and assigns nodes to the different tasks.
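To see the JobTracker schedule real work, you can submit the WordCount example bundled with the release. The jar name below matches the 1.1.1 layout but may differ in other versions, and the /test input directory is assumed to contain the file uploaded above; /test-out must not exist yet.

[root@localhost hadoop-1.1.1]# bin/hadoop jar hadoop-examples-1.1.1.jar wordcount /test /test-out
[root@localhost hadoop-1.1.1]# bin/hadoop fs -cat /test-out/part-* | head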

3) Stop the Hadoop daemons:

[root@localhost hadoop-1.1.1]# bin/stop-all.sh
