Hadoop -- Building a Hadoop Environment on Linux (Simplified)


For reprints, please credit the source: http://blog.csdn.net/l1028386804/article/details/45771619
1. Install the JDK (JDK 1.6 is used as the example here; the specific JDK version is not restricted) (1) Download and install the JDK: make sure the computer is connected to the network, then enter the following command at the command line to install the JDK
sudo apt-get install sun-java6-jdk
(2) Configure the Java environment: open /etc/profile and add the following at the end of the file
export JAVA_HOME=(Java installation directory)
export CLASSPATH=".:$JAVA_HOME/lib:$CLASSPATH"
export PATH="$JAVA_HOME/bin:$PATH"
(3) Verify that Java was installed successfully
Enter java -version; if the Java version information is displayed, the installation succeeded.
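To make the new environment variables take effect in the current shell and double-check them, a minimal sketch (assuming the JAVA_HOME set above) is:

$ source /etc/profile
$ echo $JAVA_HOME
$ java -version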
2. Install and configure SSH (1) Download and install SSH: likewise, enter the following command at the command line to install SSH
sudo apt-get install ssh
(2) Configure passwordless login to the local machine: enter the following two commands at the command line
$ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
Press Enter directly at the prompts; when finished, two files will be generated in ~/.ssh/: id_rsa and id_rsa.pub. These two form a pair, like a lock and its key.

Append id_rsa.pub to the authorized keys (at this point the authorized_keys file does not exist yet):
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

(3) Verify that SSH was installed successfully
Enter ssh localhost. If it logs in to the local machine successfully, the installation succeeded.
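If ssh localhost still asks for a password, a common cause (not covered above) is overly open permissions on the key files; tightening them usually resolves it:

$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys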
3. Disable the firewall: $ sudo ufw disable
Note: this step is very important. If the firewall is not disabled, you will run into the problem of the DataNode not being found.
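To confirm the firewall is really off, you can query its state (assuming ufw, as used above); it should report that the firewall is inactive:

$ sudo ufw status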
4. Install and run Hadoop (version 0.20.2 is used as the example) (1) Download Hadoop: download Hadoop from the http://www.apache.org/dyn/closer.cgi/hadoop/core/ page.
(2) Install and configure Hadoop
Single-node (standalone) configuration:
A single-node Hadoop installation requires no configuration; in this mode, Hadoop runs as a single Java process.
Pseudo-distributed configuration:
Pseudo-distributed Hadoop is a cluster with only one node. In this cluster, the machine is both master and slave,
both NameNode and DataNode, and both JobTracker and TaskTracker.

The configuration process is as follows:

A. Enter the conf folder and modify the following files.
Add the following to hadoop-env.sh:
export JAVA_HOME=(Java installation directory)
The contents of the core-site.xml file are modified as follows:
<configuration>
  <!-- Global properties -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/liuyazhuang/tmp</value>
  </property>
  <!-- File system properties -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
The contents of the hdfs-site.xml file are modified as follows (dfs.replication defaults to 3; if it is not changed and there are fewer than three DataNodes, an error is reported):
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

The contents of the mapred-site.xml file are modified as follows:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

B. Format the Hadoop file system by entering the following command at the command line: bin/hadoop namenode -format

To format more than once, delete the /home/liuyazhuang/tmp folder (this is the same directory as the one configured in core-site.xml) before performing the format operation again.
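Putting that together, a minimal re-format sequence under the paths configured above would be:

$ bin/stop-all.sh
$ rm -rf /home/liuyazhuang/tmp
$ bin/hadoop namenode -format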

C. Start Hadoop by entering the following command at the command line: bin/start-all.sh

D. Verify that Hadoop was installed successfully: open the following URLs in a browser; if they open normally, the installation succeeded.

http://localhost:50030 (Web page for MapReduce) http://localhost:50070 (HDFS Web page)
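Besides the web pages, a quick way to check that all the daemons started is jps (it ships with the JDK); in a healthy pseudo-distributed setup the list should include NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker:

$ jps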

5. Run an example

(1) First create two input files, file01 and file02, on the local disk:
$ echo "Hello World Bye World" > file01
$ echo "Hello Hadoop Goodbye hadoop" > file02

(2) Create an input directory in HDFS: $ hadoop fs -mkdir input

(3) Copy file01 and file02 into HDFS: $ hadoop fs -copyFromLocal /home/liuyazhuang/file0* input

(4) Run wordcount: $ hadoop jar hadoop-0.20.2-examples.jar wordcount input output

(5) When it finishes, view the results: $ hadoop fs -cat output/part-r-00000
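For the two input lines created above, the counts in part-r-00000 should look roughly like this (exact ordering and spacing depend on the Hadoop version):

Bye      1
Goodbye  1
Hadoop   1
Hello    2
World    2
hadoop   1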

For reference, a complete set of /etc/profile settings:
export JAVA_HOME=/home/chuanqing/profile/jdk-6u13-linux-i586.zip_files/jdk1.6.0_13
export CLASSPATH=".:$JAVA_HOME/lib:$CLASSPATH"
export PATH="$JAVA_HOME/bin:$PATH"
export HADOOP_INSTALL=/home/chuanqing/profile/hadoop-0.20.203.0
export PATH=$PATH:$HADOOP_INSTALL/bin
export HADOOP_INSTALL=/home/zhoulai/profile/hadoop-0.20.203.0
export PATH=$PATH:$HADOOP_INSTALL/bin

