Pseudo-distributed mode:
Hadoop can run in pseudo-distributed mode on a single node, where each Hadoop daemon runs in its own Java process to simulate the nodes of a distributed cluster.
1. Install Hadoop
Make sure that the JDK and SSH are installed on the system.
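A quick way to confirm both prerequisites is to check them from the shell (version numbers and install paths will differ on your system):
[root@localhost ~]# java -version
[root@localhost ~]# ssh -V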
1) Download Hadoop from the official website: http://hadoop.apache.org/. The version used here is hadoop-1.1.1-bin.tar.gz.
2) Place the downloaded archive in the /softs directory.
3) Extract hadoop-1.1.1-bin.tar.gz into the /usr directory.
[root@localhost usr]# tar -zxvf /softs/hadoop-1.1.1-bin.tar.gz
[root@localhost usr]# ls
bin  etc  games  hadoop-1.1.1  include  java  lib  libexec  local  lost+found  sbin  share  src  tmp
[root@localhost usr]#
2. Configure Hadoop
1) Edit /usr/hadoop-1.1.1/conf/hadoop-env.sh: find the export JAVA_HOME line and change it to the JDK installation path.
export JAVA_HOME=/usr/java/jdk1.6.0_38
2) Configure /usr/hadoop-1.1.1/conf/core-site.xml with the following content:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
3) Configure /usr/hadoop-1.1.1/conf/hdfs-site.xml with the following content:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
4) Configure /usr/hadoop-1.1.1/conf/mapred-site.xml with the following content:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
3. Password-free SSH settings
While Hadoop is running, SSH is used to manage the Hadoop daemons. To avoid entering a password every time a daemon is accessed, use key-based authentication.
1) Generate the key pair by running the following command:
[root@localhost ~]# ssh-keygen -t rsa
When prompted with Enter passphrase (empty for no passphrase): and Enter same passphrase again:, just press Enter without typing anything.
[root@localhost ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
16:54:ed:23:0c:04:fa:74:1b:b0:b5:eb:c3:87:43:52 root@localhost.localdomain
The key's randomart image is:
+--[ RSA 2048]----+
|  oo+...         |
|   .=...         |
|  . o eo.        |
|   o = o         |
|    o S ..       |
|     *.          |
|      *.         |
+-----------------+
[root@localhost ~]#
The output shows that the generated key has been saved to /root/.ssh/id_rsa.
2) Go to the /root/.ssh directory and run the following command:
[root@localhost .ssh]# cp id_rsa.pub authorized_keys
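On some systems sshd ignores an authorized_keys file with loose permissions; if the next step still prompts for a password, tightening the permissions usually helps:
[root@localhost .ssh]# chmod 600 authorized_keys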
3) Then execute:
[root@localhost .ssh]# ssh localhost
You should now be able to connect over SSH without entering a password.
[root@localhost .ssh]# ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is e5:44:06:97:b4:66:ba:89:40:95:ba:23:0a:06:2a:74.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Last login: Tue Jan 15 22:08:06 2013 from 192.168.0.101
Hello, MAN
[root@localhost ~]#
4. Run Hadoop
1) Format the distributed file system:
[root@localhost hadoop-1.1.1]# bin/hadoop namenode -format
[root@localhost hadoop-1.1.1]# bin/hadoop namenode -format
13/01/15 23:56:53 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = localhost.localdomain/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.1.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1411108; compiled by 'hortonfo' on Mon Nov 19 10:48:11 UTC 2012
************************************************************/
13/01/15 23:56:54 INFO util.GSet: VM type       = 32-bit
13/01/15 23:56:54 INFO util.GSet: 2% max memory = 19.33375 MB
13/01/15 23:56:54 INFO util.GSet: capacity      = 2^22 = 4194304 entries
13/01/15 23:56:54 INFO util.GSet: recommended=4194304, actual=4194304
13/01/15 23:56:55 INFO namenode.FSNamesystem: fsOwner=root
13/01/15 23:56:55 INFO namenode.FSNamesystem: supergroup=supergroup
13/01/15 23:56:55 INFO namenode.FSNamesystem: isPermissionEnabled=true
13/01/15 23:56:55 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
13/01/15 23:56:55 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
13/01/15 23:56:55 INFO namenode.NameNode: Caching file names occuring more than 10 times
13/01/15 23:56:55 INFO common.Storage: Image file of size 110 saved in 0 seconds.
13/01/15 23:56:55 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/tmp/hadoop-root/dfs/name/current/edits
13/01/15 23:56:55 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/tmp/hadoop-root/dfs/name/current/edits
13/01/15 23:56:55 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
13/01/15 23:56:55 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
************************************************************/
[root@localhost hadoop-1.1.1]#
2) Start the Hadoop daemons:
[root@localhost hadoop-1.1.1]# bin/start-all.sh
[root@localhost hadoop-1.1.1]# bin/start-all.sh
starting namenode, logging to /usr/hadoop-1.1.1/libexec/../logs/hadoop-root-namenode-localhost.localdomain.out
localhost: starting datanode, logging to /usr/hadoop-1.1.1/libexec/../logs/hadoop-root-datanode-localhost.localdomain.out
localhost: starting secondarynamenode, logging to /usr/hadoop-1.1.1/libexec/../logs/hadoop-root-secondarynamenode-localhost.localdomain.out
starting jobtracker, logging to /usr/hadoop-1.1.1/libexec/../logs/hadoop-root-jobtracker-localhost.localdomain.out
localhost: starting tasktracker, logging to /usr/hadoop-1.1.1/libexec/../logs/hadoop-root-tasktracker-localhost.localdomain.out
[root@localhost hadoop-1.1.1]#
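A simple way to confirm that all five daemons came up is the JDK's jps tool; on a healthy pseudo-distributed node it should list NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker (process IDs will differ):
[root@localhost hadoop-1.1.1]# jps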
After startup, the NameNode web interface is available at http://localhost:50070:
The NameNode is the HDFS daemon. It records how files are split into data blocks and which DataNodes store those blocks. Its main function is centralized management of memory and I/O.
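Besides the web interface, HDFS can be checked from the command line; the standard Hadoop 1.x shell commands below list the root of the file system and print a short cluster report (output depends on your cluster state):
[root@localhost hadoop-1.1.1]# bin/hadoop fs -ls /
[root@localhost hadoop-1.1.1]# bin/hadoop dfsadmin -report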
The JobTracker can be accessed at http://localhost:50030/:
The JobTracker daemon is the link between applications and Hadoop. After user code is submitted to the cluster, the JobTracker decides which files will be processed and assigns nodes to the different tasks.
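To see the JobTracker schedule a real job, you can run the WordCount example bundled with the distribution; the jar name and the input/output paths below are only an illustration based on the default Hadoop 1.1.1 layout, so adjust them as needed:
[root@localhost hadoop-1.1.1]# bin/hadoop fs -mkdir input
[root@localhost hadoop-1.1.1]# bin/hadoop fs -put conf/*.xml input
[root@localhost hadoop-1.1.1]# bin/hadoop jar hadoop-examples-1.1.1.jar wordcount input output
[root@localhost hadoop-1.1.1]# bin/hadoop fs -cat output/part-*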
3) Stop the Hadoop daemons:
[root@localhost hadoop-1.1.1]# bin/stop-all.sh