Installing a Cloudera Hadoop cluster on Ubuntu 12.04 Server


Deployment environment

OS: Ubuntu 12.04 Server

Hadoop: CDH3u6

Machine list: namenode 192.168.71.46; datanodes 192.168.71.202, 192.168.71.203, 192.168.71.204
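Name resolution between the nodes should work before installing anything; one common way is matching /etc/hosts entries on every machine. A sketch (the hostnames are illustrative; only the IPs come from the machine list above, and the file is written to a scratch path here rather than /etc/hosts):

```shell
# Example /etc/hosts entries for the cluster; hostnames are hypothetical.
# Written to a scratch file here; on a real node, append to /etc/hosts.
cat > hosts.example <<'EOF'
192.168.71.46   namenode
192.168.71.202  datanode1
192.168.71.203  datanode2
192.168.71.204  datanode3
EOF
```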

Installing Hadoop

Add the software source

Create /etc/apt/sources.list.d/cloudera-3u6.list and insert:

deb http://192.168.52.100/hadoop maverick-cdh3 contrib

deb-src http://192.168.52.100/hadoop maverick-cdh3 contrib
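The file can also be created in one step from the shell. A minimal sketch, writing to the current directory so it runs unprivileged; on a real node the target is /etc/apt/sources.list.d/cloudera-3u6.list, written as root:

```shell
# Write the two apt source lines (mirror URL taken from the text above).
# Target here is a scratch file; the real path is
# /etc/apt/sources.list.d/cloudera-3u6.list (needs root).
cat > cloudera-3u6.list <<'EOF'
deb http://192.168.52.100/hadoop maverick-cdh3 contrib
deb-src http://192.168.52.100/hadoop maverick-cdh3 contrib
EOF
```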

Add the GPG key:

curl -s http://archive.cloudera.com/debian/archive.key | sudo apt-key add -

Update the package index:

apt-get update

Install hadoop-0.20-namenode and hadoop-0.20-jobtracker on the namenode:

apt-get install -y --force-yes hadoop-0.20-namenode hadoop-0.20-jobtracker

Install hadoop-0.20-datanode and hadoop-0.20-tasktracker on each datanode:

apt-get install -y --force-yes hadoop-0.20-datanode hadoop-0.20-tasktracker

Configure passwordless SSH login

Execute on the namenode machine:

ssh-keygen -t rsa

Press Enter at each prompt, then append the contents of the generated ~/.ssh/id_rsa.pub to the end of the /root/.ssh/authorized_keys file on each datanode machine; create the file manually on any machine where it does not exist.
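The copy step can be scripted from the namenode with ssh-copy-id, which appends id_rsa.pub to the remote authorized_keys and creates the file if it is missing. Shown as a dry run since it needs live hosts; remove the echo to actually run it (datanode IPs from the machine list above):

```shell
# Dry run: print the command for each datanode instead of executing it.
for node in 192.168.71.202 192.168.71.203 192.168.71.204; do
    echo ssh-copy-id root@"$node"
done
```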

Set up the Hadoop storage directories and change their owners

mkdir /opt/hadoop

chown hdfs:hadoop /opt/hadoop

mkdir /opt/hadoop/mapred

chown mapred:hadoop /opt/hadoop/mapred
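The same layout can be tried unprivileged under a scratch prefix; a sketch that skips the chown calls, since the hdfs and mapred users only exist once the Hadoop packages are installed:

```shell
# Recreate the storage layout under a scratch prefix; on a real node,
# drop PREFIX, run as root, and apply the chown commands above.
PREFIX=./demo_opt
mkdir -p "$PREFIX/hadoop/mapred"
ls -ld "$PREFIX/hadoop" "$PREFIX/hadoop/mapred"
```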

Modify the configuration files and distribute them

Change /etc/hadoop/conf/core-site.xml to:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.71.46:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop</value>
  </property>
</configuration>

Change /etc/hadoop/conf/hdfs-site.xml to:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.balance.bandwidthPerSec</name>
    <value>10485760</value>
  </property>
  <property>
    <name>dfs.block.size</name>
    <value>134217728</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/opt/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>100</value>
  </property>
</configuration>
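The text says to distribute the edited files but shows no command; a dry-run sketch of pushing both files from the namenode to every datanode (remove the echo to actually copy; assumes the passwordless root SSH set up earlier):

```shell
# Dry run: print the scp command for each datanode instead of executing it.
for node in 192.168.71.202 192.168.71.203 192.168.71.204; do
    echo scp /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/hdfs-site.xml \
        root@"$node":/etc/hadoop/conf/
done
```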
