Hadoop Installation in Pseudo-Distributed Mode


Pseudo-distributed mode:

Hadoop can run in pseudo-distributed mode on a single node, where separate Java processes simulate the various nodes of a distributed deployment.

1. Install Hadoop

Make sure that the JDK and SSH are installed on the system.
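A quick way to confirm both prerequisites is to query the tools directly. This is only a sanity check on a Red Hat-style system like the one used here; the service name and versions on your machine may differ.

[root@localhost ~]# java -version
[root@localhost ~]# ssh -V
[root@localhost ~]# service sshd status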

1) Download Hadoop from the official website: http://hadoop.apache.org/. The release used here is hadoop-1.1.1-bin.tar.gz.

2) Put the downloaded archive in the /softs directory.

3) Extract hadoop-1.1.1-bin.tar.gz to the /usr directory:

 
[root@localhost usr]# tar -zxvf /softs/hadoop-1.1.1-bin.tar.gz

[root@localhost usr]# ls
bin  etc  games  hadoop-1.1.1  include  java  lib  libexec  local  lost+found  sbin  share  src  tmp
[root@localhost usr]#

 

2. Configure Hadoop

1) Edit the /usr/hadoop-1.1.1/conf/hadoop-env.sh file: find the export JAVA_HOME line and change it to the JDK installation path:

 
export JAVA_HOME=/usr/java/jdk1.6.0_38

2) Configure /usr/hadoop-1.1.1/conf/core-site.xml with the following content:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

3) Configure /usr/hadoop-1.1.1/conf/hdfs-site.xml with the following content:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

4) Configure /usr/hadoop-1.1.1/conf/mapred-site.xml with the following content:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
</configuration>
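With all three files edited, you can optionally check that they are still well-formed XML before moving on. This assumes xmllint (part of libxml2, commonly present on CentOS/RHEL) is installed; Hadoop itself does not require it.

[root@localhost ~]# xmllint --noout /usr/hadoop-1.1.1/conf/core-site.xml /usr/hadoop-1.1.1/conf/hdfs-site.xml /usr/hadoop-1.1.1/conf/mapred-site.xml

No output means the files parsed cleanly.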

 

3. Password-free SSH settings

While Hadoop is running, SSH is used to manage the remote Hadoop daemons, and this access should not require typing a password. Therefore, use key-based authentication.

1) Generate the key pair by executing the following command:

 
[root@localhost ~]# ssh-keygen -t rsa

When prompted with Enter passphrase (empty for no passphrase): and Enter same passphrase again:, just press Enter without typing anything.

[root@localhost ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
16:54:ed:23:0c:04:fa:74:1b:b0:b5:eb:c3:87:43:52 root@localhost.localdomain
The key's randomart image is:
+--[ RSA 2048]----+
|oo+...           |
|.=...            |
|.o Eo.           |
| o = o           |
|  o S ..         |
|     *.          |
|      *.         |
+-----------------+
[root@localhost ~]#

The output shows that the generated key has been saved to /root/.ssh/id_rsa.

2) Go to the /root/.ssh directory and run the following command:

 
[root@localhost .ssh]# cp id_rsa.pub authorized_keys
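If sshd later rejects the key, it is usually a file-permission problem: sshd refuses keys whose directory or authorized_keys file is too open. Tightening the permissions is a harmless precaution, though often not strictly necessary:

[root@localhost .ssh]# chmod 700 /root/.ssh
[root@localhost .ssh]# chmod 600 /root/.ssh/authorized_keys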

3) Then execute:

[root@localhost .ssh]# ssh localhost

You should now be able to connect over SSH without entering a password.

 
[root@localhost .ssh]# ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is e5:44:06:97:b4:66:ba:89:40:95:ba:23:0a:06:2a:74.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Last login: Tue Jan 15 22:08:06 2013 from 192.168.0.101
Hello, MAN
[root@localhost ~]#

 

4. Run Hadoop

1) Format the distributed file system:

 
[root@localhost hadoop-1.1.1]# bin/hadoop namenode -format

[root@localhost hadoop-1.1.1]# bin/hadoop namenode -format
13/01/15 23:56:53 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = localhost.localdomain/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.1.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1411108; compiled by 'hortonfo' on Mon Nov 19 10:48:11 UTC 2012
************************************************************/
13/01/15 23:56:54 INFO util.GSet: VM type       = 32-bit
13/01/15 23:56:54 INFO util.GSet: 2% max memory = 19.33375 MB
13/01/15 23:56:54 INFO util.GSet: capacity      = 2^22 = 4194304 entries
13/01/15 23:56:54 INFO util.GSet: recommended=4194304, actual=4194304
13/01/15 23:56:55 INFO namenode.FSNamesystem: fsOwner=root
13/01/15 23:56:55 INFO namenode.FSNamesystem: supergroup=supergroup
13/01/15 23:56:55 INFO namenode.FSNamesystem: isPermissionEnabled=true
13/01/15 23:56:55 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
13/01/15 23:56:55 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
13/01/15 23:56:55 INFO namenode.NameNode: Caching file names occuring more than 10 times
13/01/15 23:56:55 INFO common.Storage: Image file of size 110 saved in 0 seconds.
13/01/15 23:56:55 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/tmp/hadoop-root/dfs/name/current/edits
13/01/15 23:56:55 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/tmp/hadoop-root/dfs/name/current/edits
13/01/15 23:56:55 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
13/01/15 23:56:55 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
************************************************************/
[root@localhost hadoop-1.1.1]#
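As a quick check, the freshly formatted NameNode storage directory (the path reported in the log above, under /tmp/hadoop-root by default since no dfs.name.dir was configured) should now contain the image and edits files:

[root@localhost hadoop-1.1.1]# ls /tmp/hadoop-root/dfs/name/current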

2) Start the Hadoop daemons:

 
[root@localhost hadoop-1.1.1]# bin/start-all.sh

 

[root@localhost hadoop-1.1.1]# bin/start-all.sh
starting namenode, logging to /usr/hadoop-1.1.1/libexec/../logs/hadoop-root-namenode-localhost.localdomain.out
localhost: starting datanode, logging to /usr/hadoop-1.1.1/libexec/../logs/hadoop-root-datanode-localhost.localdomain.out
localhost: starting secondarynamenode, logging to /usr/hadoop-1.1.1/libexec/../logs/hadoop-root-secondarynamenode-localhost.localdomain.out
starting jobtracker, logging to /usr/hadoop-1.1.1/libexec/../logs/hadoop-root-jobtracker-localhost.localdomain.out
localhost: starting tasktracker, logging to /usr/hadoop-1.1.1/libexec/../logs/hadoop-root-tasktracker-localhost.localdomain.out
[root@localhost hadoop-1.1.1]#
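You can confirm that all five daemons actually started with the jps tool that ships with the JDK. In a healthy pseudo-distributed setup the listing should include NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker (plus Jps itself); the process IDs will of course vary.

[root@localhost hadoop-1.1.1]# jps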

After startup, you can access the NameNode web UI at http://localhost:50070:
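If no browser is available on the machine, a rough reachability check can be done from the shell; the exact status line (200 or a redirect) depends on the release, but any HTTP response indicates the embedded web server is up:

[root@localhost hadoop-1.1.1]# curl -sI http://localhost:50070/ | head -n 1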

 

The NameNode is the daemon that looks after HDFS: it records how files are split into data blocks and which DataNodes store those blocks. Its main function is the centralized management of memory and I/O.
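A simple way to confirm that the NameNode and DataNode are working together is to write a file into HDFS and list it back. The /test directory below is just an example name:

[root@localhost hadoop-1.1.1]# bin/hadoop fs -mkdir /test
[root@localhost hadoop-1.1.1]# bin/hadoop fs -put conf/core-site.xml /test
[root@localhost hadoop-1.1.1]# bin/hadoop fs -ls /test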

You can access the JobTracker at http://localhost:50030/:

 

The JobTracker daemon connects applications to Hadoop. After user code is submitted to the cluster, the JobTracker decides which files are processed and assigns nodes to the different tasks.
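To see the JobTracker schedule real work, you can submit the WordCount example bundled with the release. The jar name below matches the 1.1.1 layout but may differ in other versions, and the /test input directory is assumed to contain the file uploaded above; /test-out must not exist yet.

[root@localhost hadoop-1.1.1]# bin/hadoop jar hadoop-examples-1.1.1.jar wordcount /test /test-out
[root@localhost hadoop-1.1.1]# bin/hadoop fs -cat /test-out/part-* | head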

3) Stop the Hadoop daemons:

[root@localhost hadoop-1.1.1]# bin/stop-all.sh
