Deployment environment
OS: Ubuntu 12.04 Server
Hadoop: CDH3u6
Machine list: namenode 192.168.71.46; datanodes 192.168.71.202, 192.168.71.203, 192.168.71.204
Installing Hadoop
Add a software source
Create /etc/apt/sources.list.d/cloudera-3u6.list and insert:
deb http://192.168.52.100/hadoop maverick-cdh3 contrib
deb-src http://192.168.52.100/hadoop maverick-cdh3 contrib
Add the GPG key; execute:
curl -s http://archive.cloudera.com/debian/archive.key | sudo apt-key add -
Update the package index:
apt-get update
Install hadoop-0.20-namenode and hadoop-0.20-jobtracker on the namenode:
apt-get install -y --force-yes hadoop-0.20-namenode hadoop-0.20-jobtracker
Install hadoop-0.20-datanode and hadoop-0.20-tasktracker on each datanode:
apt-get install -y --force-yes hadoop-0.20-datanode hadoop-0.20-tasktracker
Configure passwordless SSH login
Execute on the namenode machine:
ssh-keygen -t rsa
Press Enter through all the prompts, then append the contents of the generated ~/.ssh/id_rsa.pub to the /root/.ssh/authorized_keys file on each datanode machine, creating the file manually if it does not already exist.
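A minimal sketch of that copy step, assuming root SSH access to the datanode IPs listed above (adjust the user and hosts as needed):
for host in 192.168.71.202 192.168.71.203 192.168.71.204; do
    # append the namenode's public key; create .ssh and authorized_keys if they are missing
    cat ~/.ssh/id_rsa.pub | ssh root@$host 'mkdir -p /root/.ssh && cat >> /root/.ssh/authorized_keys'
done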
Set up the Hadoop storage directory and change its owner
mkdir /opt/hadoop
chown hdfs:hadoop /opt/hadoop
mkdir /opt/hadoop/mapred
chown mapred:hadoop /opt/hadoop/mapred
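The datanodes and tasktrackers need the same directories, since hadoop.tmp.dir and dfs.data.dir (configured below) point under /opt/hadoop. A possible sketch for creating them remotely, assuming the passwordless SSH configured above:
for host in 192.168.71.202 192.168.71.203 192.168.71.204; do
    # same mkdir/chown steps as on the namenode, run over SSH
    ssh root@$host 'mkdir -p /opt/hadoop/mapred && chown hdfs:hadoop /opt/hadoop && chown mapred:hadoop /opt/hadoop/mapred'
done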
Modify the configuration files and distribute them (see the sketch after the two files below)
Modify /etc/hadoop/conf/core-site.xml to:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.71.46:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop</value>
  </property>
</configuration>
Modify /etc/hadoop/conf/hdfs-site.xml to:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.balance.bandwidthPerSec</name>
    <value>10485760</value>
  </property>
  <property>
    <name>dfs.block.size</name>
    <value>134217728</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/opt/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>100</value>
  </property>
</configuration>
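The heading above also calls for distributing the modified files to the other machines; a minimal sketch, assuming the datanode IPs listed earlier and the passwordless SSH already configured:
for host in 192.168.71.202 192.168.71.203 192.168.71.204; do
    # copy both edited config files into the same location on each datanode
    scp /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/hdfs-site.xml root@$host:/etc/hadoop/conf/
done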