Hadoop -- Building a Hadoop Environment on Linux (Simplified)


Reprinting: please credit the source: http://blog.csdn.net/l1028386804/article/details/45771619
1. Install the JDK (JDK 1.6 is used as the example here; the exact JDK version is not critical). (1) Download and install the JDK: make sure the machine is connected to the network, then run the following command to install the JDK:
sudo apt-get install sun-java6-jdk
(2) Configure the system's Java environment: open /etc/profile and add the following at the end of the file:
export JAVA_HOME=(Java installation directory)
export CLASSPATH=".:$JAVA_HOME/lib:$CLASSPATH"
export PATH="$JAVA_HOME/bin:$PATH"
(3) Verify that Java installed successfully:
Run java -version; if Java version information is printed, the installation succeeded.
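For reference, a minimal sketch of the finished /etc/profile entries, assuming the sun-java6-jdk package installs under /usr/lib/jvm/java-6-sun (adjust the path to your actual installation):

export JAVA_HOME=/usr/lib/jvm/java-6-sun   # assumed install path; verify on your machine
export CLASSPATH=".:$JAVA_HOME/lib:$CLASSPATH"
export PATH="$JAVA_HOME/bin:$PATH"

Then reload the profile in the current shell and re-check:

$ source /etc/profile
$ java -version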
2. Install and configure SSH. (1) Download and install SSH: likewise, run the following command at the command line:
sudo apt-get install ssh
(2) Configure passwordless login to the local machine: enter the following two commands at the command line:
$ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
Press Enter through any prompts; when the command completes it generates two files in ~/.ssh/: id_rsa and id_rsa.pub. The two form a key pair, like a lock and its key.

Append id_rsa.pub to the authorized keys (the authorized_keys file does not exist yet at this point):
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

(3) Verify that SSH is set up correctly:
Run ssh localhost. If it logs in to the local machine without asking for a password, the setup succeeded.
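If ssh localhost still prompts for a password, overly permissive file modes are a common cause; a quick fix worth trying:

$ chmod 700 ~/.ssh                    # sshd ignores keys in a world-accessible directory
$ chmod 600 ~/.ssh/authorized_keys    # the key file itself must be private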
3. Disable the firewall: $ sudo ufw disable
Note: this step is important. If the firewall stays enabled, you may hit errors where the DataNode cannot be found.
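You can confirm the firewall state before and after (ufw is Ubuntu's default firewall frontend):

$ sudo ufw status
Status: inactive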
4. Install and run Hadoop (version 0.20.2 is used as the example). (1) Download Hadoop from the http://www.apache.org/dyn/closer.cgi/hadoop/core/ page.
(2) Install and configure Hadoop
Standalone (single-node) configuration:
A standalone Hadoop installation needs no configuration; in this mode, Hadoop runs as a single Java process.
Pseudo-distributed configuration:
A pseudo-distributed Hadoop installation is a cluster with only one node. In this cluster, the machine is both master and slave,
both NameNode and DataNode, and both JobTracker and TaskTracker.

The configuration process is as follows:

a. Enter the conf folder and modify the following files.
Add the following to hadoop-env.sh:
export JAVA_HOME=(Java installation directory)
Modify the contents of the core-site.xml file to the following:
<configuration>
  <!-- Global properties -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/liuyazhuang/tmp</value>
  </property>
  <!-- File system properties -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
Modify the contents of the hdfs-site.xml file to the following (dfs.replication defaults to 3; if it is not lowered, HDFS reports errors whenever there are fewer than three DataNodes):
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Modify the contents of the mapred-site.xml file to the following:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

b. Format the Hadoop file system by entering the following at the command line: bin/hadoop namenode -format

To format again later, first delete the /home/liuyazhuang/tmp folder (the same directory configured in core-site.xml above), then rerun the format operation.
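A sketch of the re-format sequence (the tmp path follows the core-site.xml above; adjust if yours differs):

$ bin/stop-all.sh                # stop any running daemons first
$ rm -rf /home/liuyazhuang/tmp   # clear the old HDFS state
$ bin/hadoop namenode -format    # format afresh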

c. Start Hadoop by entering the following at the command line: bin/start-all.sh

d. Verify that Hadoop installed successfully: open the following URLs in a browser; if the pages load normally, the installation succeeded.

http://localhost:50030 (the MapReduce JobTracker web UI)
http://localhost:50070 (the HDFS NameNode web UI)
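As an additional sanity check, the JDK's jps tool lists the running Java daemons; for a healthy pseudo-distributed setup on Hadoop 0.20.x you would expect roughly the following (process IDs will differ):

$ jps
# NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker, Jps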

5. Running an Example

(1) First create two input files, file01 and file02, on the local disk:
$ echo "Hello World Bye World" > file01
$ echo "Hello Hadoop Goodbye Hadoop" > file02

(2) Create an input directory in HDFS: $ hadoop fs -mkdir input
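Note that a relative HDFS path such as input resolves under the current user's HDFS home directory (e.g. /user/<your username>/input); you can confirm the directory was created with:

$ hadoop fs -ls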

(3) Copy file01 and file02 into HDFS: $ hadoop fs -copyFromLocal /home/liuyazhuang/file0* input

(4) Run wordcount: $ hadoop jar hadoop-0.20.2-examples.jar wordcount input output

(5) When the job finishes, view the results: $ hadoop fs -cat output/part-r-00000
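For the two files created above, the word counts should come out as follows (WordCount emits its output sorted by key):

Bye      1
Goodbye  1
Hadoop   2
Hello    2
World    2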

For reference, the complete set of /etc/profile exports from one working setup (the paths are specific to that machine):

export JAVA_HOME=/home/chuanqing/profile/jdk-6u13-linux-i586.zip_files/jdk1.6.0_13
export CLASSPATH=".:$JAVA_HOME/lib:$CLASSPATH"
export PATH="$JAVA_HOME/bin:$PATH"
export HADOOP_INSTALL=/home/chuanqing/profile/hadoop-0.20.203.0
export PATH=$PATH:$HADOOP_INSTALL/bin

