Hadoop includes a distributed filesystem, HDFS (Hadoop Distributed File System). Hadoop is a software framework for the distributed processing of large amounts of data, and it processes data in a reliable, efficient, and scalable way. It is reliable because it assumes that compute and storage elements will fail, so it keeps multiple copies of the working data and redistributes the processing of failed nodes. Hadoop is a framework written in the Java language.
The master node of Hadoop runs the NameNode, the secondary NameNode, and the JobTracker daemons, along with the utilities and browsers used to manage the cluster. The slave nodes run the TaskTracker and DataNode daemons. In other words, the master node hosts the daemons that manage and orchestrate the Hadoop cluster, while the slave nodes host the daemons that implement HDFS storage and the MapReduce data-processing functionality.
The NameNode is the primary server in Hadoop. It typically runs on a separate machine in an HDFS instance and manages the filesystem namespace and access to the files stored in the cluster. Each Hadoop cluster has one NameNode and one secondary NameNode. When an external client sends a request to create a file, the NameNode responds with the block identity and the IP address of the DataNode that will hold the first copy of the block. The NameNode also notifies the other DataNodes that will receive copies of the block.
DataNodes: a Hadoop cluster consists of one NameNode and a large number of DataNodes. DataNodes are usually organized into racks, with a rack switch connecting all the systems together. DataNodes respond to read and write requests from HDFS clients, and they also respond to commands from the NameNode to create, delete, and replicate blocks.
The JobTracker is a master service. Once started, it receives jobs, dispatches each of a job's subtasks to run on TaskTrackers, and monitors them; if a task fails, the JobTracker runs it again.
The TaskTracker is a slave service that runs on multiple nodes. It actively communicates with the JobTracker, receives tasks, and is responsible for executing each task directly. TaskTrackers are required to run on the DataNodes of HDFS.
The NameNode, secondary NameNode, and JobTracker run on the master node, while each slave node runs a DataNode and a TaskTracker, so that data processing runs on the server that holds the data and stays as local as possible.
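Once a cluster is running (as set up below), the block-to-DataNode mapping that the NameNode maintains can be inspected with the fsck tool. A minimal sketch, assuming a file has already been uploaded under the HDFS path /user/hadoop/test:
bin/hadoop fsck /user/hadoop/test -files -blocks -locations    ## lists each block, its replicas, and the DataNodes that hold them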
server2.example.com 172.25.45.2 (master)
server3.example.com 172.25.45.3 (slave)
server4.example.com 172.25.45.4 (slave)
server5.example.com 172.25.45.5 (slave)
1. Configure the legacy Hadoop version:
On server2, server3, server4, and server5, add the hadoop user:
useradd -u 900 hadoop
echo westos | passwd --stdin hadoop
Server2:
sh jdk-6u32-linux-x64.bin    ## install the JDK
mv jdk1.6.0_32/ /home/hadoop/java
mv hadoop-1.2.1.tar.gz /home/hadoop/
su - hadoop
vim .bash_profile
export JAVA_HOME=/home/hadoop/java
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
source .bash_profile
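A quick sanity check that the new environment variables took effect (the exact version string depends on the JDK installed above):
echo $JAVA_HOME    ## should print /home/hadoop/java
java -version      ## should report the JDK just installed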
tar zxf hadoop-1.1.2.tar.gz    ## set up a single-node hadoop
ln -s hadoop-1.1.2 hadoop
cd /home/hadoop/hadoop/conf
vim hadoop-env.sh
export JAVA_HOME=/home/hadoop/java
cd ..
mkdir input
cp conf/*.xml input/
bin/hadoop jar hadoop-examples-1.1.2.jar
bin/hadoop jar hadoop-examples-1.1.2.jar grep input output 'dfs[a-z.]+'
cd output/
cat *
1 dfsadmin
Set up passwordless SSH login from the master to the slaves:
Server2:
su - hadoop
ssh-keygen
ssh-copy-id localhost
ssh-copy-id 172.25.45.3
ssh-copy-id 172.25.45.4
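Before continuing, it is worth checking that passwordless login really works, for example:
ssh 172.25.45.3 hostname    ## should print server3.example.com without prompting for a password
ssh 172.25.45.4 hostname    ## should print server4.example.com without prompting for a password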
cd /home/hadoop/hadoop/conf
vim core-site.xml    ## specify the NameNode
<property>
  <name>fs.default.name</name>
  <value>hdfs://172.25.45.2:9000</value>
</property>
vim mapred-site.xml    ## specify the JobTracker
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>172.25.45.2:9001</value>
  </property>
</configuration>
vim hdfs-site.xml    ## specify the number of replicas for stored files
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
cd ..
bin/hadoop namenode -format    ## format a new filesystem
ls /tmp
hadoop-hadoop  hsperfdata_hadoop  hsperfdata_root  yum.log
bin/start-dfs.sh    ## start the Hadoop daemons
jps
bin/start-mapred.sh
jps
Open in Browser: 172.25.45.2:50030
Open 172.25.45.2:50070
bin/hadoop fs -put input test    ## copy the newly created input files into the distributed filesystem
bin/hadoop jar hadoop-examples-1.2.1.jar wordcount test output
At the same time, the running job can be watched on the web page.
The uploaded files can be viewed on the web page; the job output can also be downloaded and inspected locally:
bin/hadoop fs -get output test
cat test/*
rm -fr test/    ## delete the downloaded files
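The job output can also be read directly from HDFS without downloading it first, for example:
bin/hadoop fs -cat output/*    ## print the result files stored in HDFS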
2. Server2:
Shared File system:
su - root
yum install nfs-utils -y
/etc/init.d/rpcbind start
/etc/init.d/nfs start
vim /etc/exports
/home/hadoop *(rw,anonuid=900,anongid=900)
exportfs -rv
exportfs -v
Server3 and Server4:
yum install nfs-utils -y
/etc/init.d/rpcbind start
showmount -e 172.25.45.2
Export list for 172.25.45.2:
/home/hadoop *
mount 172.25.45.2:/home/hadoop/ /home/hadoop/
df
Server2:
su - hadoop
cd hadoop/conf
vim hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
vim slaves    ## IPs of the slave nodes
172.25.45.3
172.25.45.4
vim masters    ## IP of the master node
172.25.45.2
Hint: if processes from the previous setup are still running, they must be stopped before formatting, so that jps shows no Hadoop processes.
Steps to stop the processes:
bin/stop-all.sh    ## after this finishes, the tasktracker and datanode are sometimes still up, so stop them explicitly
bin/hadoop-daemon.sh stop tasktracker
bin/hadoop-daemon.sh stop datanode
As the hadoop user, delete the old files under /tmp, so that no leftover files remain that the hadoop user has no permission over.
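A minimal cleanup sketch, assuming the data directories sit under the default hadoop.tmp.dir in /tmp (matching the /tmp listing shown earlier):
jps                                                 ## should list no Hadoop daemons before formatting
rm -rf /tmp/hadoop-hadoop /tmp/hsperfdata_hadoop    ## remove leftover data owned by the hadoop user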
su - hadoop
bin/hadoop namenode -format
bin/start-dfs.sh
bin/start-mapred.sh
bin/hadoop fs -put input test
bin/hadoop jar hadoop-examples-1.2.1.jar grep test output 'dfs[a-z.]+'
While the upload and the job run, open 172.25.45.2:50030 in the browser to watch the progress.
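Besides the web page, the JobTracker's view of running jobs can also be queried from the command line. A minimal sketch (the job id is a placeholder taken from the -list output):
bin/hadoop job -list              ## list the jobs currently tracked by the JobTracker
bin/hadoop job -status <job_id>   ## show the progress of one job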
su - hadoop
bin/hadoop dfsadmin -report
dd if=/dev/zero of=bigfile bs=1M count=200
bin/hadoop fs -put bigfile test
Open 172.25.45.2:50070 in the browser
3. Add server5.example.com 172.25.45.5 as a new slave node:
su - root    ## the package install and user creation below need root
yum install nfs-utils -y
/etc/init.d/rpcbind start
useradd -u 900 hadoop
echo westos | passwd --stdin hadoop
mount 172.25.45.2:/home/hadoop/ /home/hadoop/
su - hadoop
vim hadoop/conf/slaves
172.25.45.3
172.25.45.4
172.25.45.5
cd /home/hadoop/hadoop
bin/hadoop-daemon.sh start datanode
bin/hadoop-daemon.sh start tasktracker
jps
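From the master (server2), it can be confirmed that the new DataNode has registered. A quick check using the report command already shown above (the exact report format varies by version):
bin/hadoop dfsadmin -report | grep -A 2 "Name: 172.25.45.5"    ## the new node should appear in the report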
Remove a slave node:
Server2:
su - hadoop
cd /home/hadoop/hadoop/conf
vim hdfs-site.xml    ## add the exclude file used for decommissioning DataNodes
<property>
  <name>dfs.hosts.exclude</name>
  <value>/home/hadoop/hadoop/conf/datanode-excludes</value>
</property>
vim /home/hadoop/hadoop/conf/datanode-excludes
172.25.45.3    ## decommission 172.25.45.3 so it no longer serves as a slave
cd /home/hadoop/hadoop
bin/hadoop dfsadmin -refreshNodes    ## refresh the node list
bin/hadoop dfsadmin -report    ## check node status; the data on server3 is transferred to server5
On server3:
su - hadoop
bin/stop-all.sh
cd /home/hadoop/hadoop
bin/hadoop-daemon.sh stop tasktracker
bin/hadoop-daemon.sh stop datanode
Server2:
vim /home/hadoop/hadoop/conf/slaves
172.25.45.4
172.25.45.5
4. Configure the new version of Hadoop:
Server2:
su - hadoop
cd /home/hadoop
tar zxf jdk-7u79-linux-x64.tar.gz
ln -s jdk1.7.0_79/ java
tar zxf hadoop-2.6.4.tar.gz
ln -s hadoop-2.6.4 hadoop
cd /home/hadoop/hadoop/etc/hadoop
vim hadoop-env.sh
export JAVA_HOME=/home/hadoop/java
export HADOOP_PREFIX=/home/hadoop/hadoop
cd /home/hadoop/hadoop
mkdir input
cp etc/hadoop/*.xml input
tar -tf hadoop-native-64-2.6.0.tar
tar -xf hadoop-native-64-2.6.0.tar -C hadoop/lib/native/
cd /home/hadoop/hadoop
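Optionally, Hadoop 2.x can report whether the native libraries just unpacked are actually loadable; a quick check from /home/hadoop/hadoop:
bin/hadoop checknative -a    ## each available native library should be reported as true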
rm -fr output/
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar grep input output 'dfs[a-z.]+'
cd /home/hadoop/hadoop/etc/hadoop/
vim slaves
172.25.45.3
172.25.45.4
vim core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://172.25.45.2:9000</value>
  </property>
</configuration>
vim mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>172.25.45.2:9001</value>
  </property>
</configuration>
vim hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
cd /home/hadoop/hadoop
bin/hdfs namenode -format
sbin/start-dfs.sh
jps
bin/hdfs dfs -mkdir -p /user/hadoop    ## the target directory must be created before files are uploaded
bin/hdfs dfs -put input/ test
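To confirm the upload before running the example (the listing output will vary):
bin/hdfs dfs -ls test    ## the copied *.xml files should appear under /user/hadoop/test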
rm -fr input/
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar grep test output 'dfs[a-z.]+'
bin/hdfs dfs -cat output/*
1       dfsadmin
Open 172.25.45.2:50070 in the browser