Linux: Building a Hadoop Environment

1. Install the JDK
(1) Download and install the JDK: make sure the computer is connected to the network, then install the JDK from the command line:
    sudo apt-get install sun-java6-jdk
(2) Configure the Java environment: open /etc/profile and add the following at the end of the file:
    export JAVA_HOME=(JAVA installation directory)
    export CLASSPATH=".:$JAVA_HOME/lib:$CLASSPATH"
    export PATH="$JAVA_HOME/bin:$PATH"
(3) Verify that Java was installed successfully: enter java -version; if Java version information is printed, the installation succeeded.

2. Install and configure SSH
(1) Download and install SSH: likewise, enter the following command at the command line:
    sudo apt-get install ssh
(2) Configure passwordless login to the local machine: at the command line enter
    ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
and press Enter through the prompts. When it completes, two files are generated in ~/.ssh/: id_rsa and id_rsa.pub. They form a key pair, like a lock and its key. Append id_rsa.pub to the authorized keys (there is no authorized_keys file yet):
    $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
(3) Verify that SSH was installed successfully: enter ssh localhost. If it logs you in to the local machine, the installation succeeded.

3. Disable the firewall
    $ sudo ufw disable
Note: this step is very important. If the firewall is not disabled, you may run into "cannot find the DataNode" problems.

4. Install and run Hadoop (using version 0.20.2 as the example)
(1) Download Hadoop from http://www.apache.org/dyn/closer.cgi/hadoop/core/
(2) Install and configure Hadoop.
Single-node configuration: a single-node installation of Hadoop needs no configuration; in this mode Hadoop runs as a single Java process.
Pseudo-distributed configuration: a pseudo-distributed Hadoop is a cluster with only one node.
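A side note on step 2(2): the cat >> command there appends unconditionally, so running the setup twice duplicates the key. Below is a minimal sketch of an idempotent variant, assuming a POSIX shell; append_key is an illustrative helper name, not a standard command.

```shell
# Sketch of step 2(2), made safe to re-run: authorize a public key for
# passwordless SSH logins, but only if it is not already authorized.
append_key() {
    pub="$1"    # path to the public key, e.g. ~/.ssh/id_rsa.pub
    auth="$2"   # path to the authorized_keys file
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    chmod 600 "$auth"   # sshd refuses authorized_keys with loose permissions
    # grep -qxF: quiet, whole-line, fixed-string match -- append only if absent
    grep -qxF "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"
}
```

After generating the key pair with ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa, you would call append_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys"; re-running it leaves authorized_keys unchanged.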
In this cluster, the computer is both master and slave: it is the NameNode as well as the DataNode, and both the JobTracker and the TaskTracker. The configuration process is as follows:
A. Enter the conf folder and modify the following files.
Add the following to hadoop-env.sh:
    export JAVA_HOME=(JAVA installation directory)
Modify the contents of core-site.xml to the following:
<configuration>
  <!-- global properties -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/zhongping/tmp</value>
  </property>
  <!-- file system properties -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
Modify the contents of hdfs-site.xml to the following (dfs.replication defaults to 3; if it is not changed, having fewer than three DataNodes will cause errors):
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
Modify the contents of mapred-site.xml to the following:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
B. Format the Hadoop file system; at the command line enter:
    bin/hadoop namenode -format
C. Start Hadoop; at the command line enter:
    bin/start-all.sh
D. Verify that Hadoop was installed successfully: open the following URLs in a browser; if they load correctly, the installation succeeded.
    http://localhost:50030 (web page for MapReduce)
    http://localhost:50070 (web page for HDFS)

5. Run an example
(1) First create two input files, file01 and file02, on the local disk:
    $ echo "Hello World Bye World" > file01
    $ echo "Hello Hadoop Goodbye Hadoop" > file02
(2) Create an input directory in HDFS:
    $ hadoop fs -mkdir input
(3) Copy file01 and file02 into HDFS:
    $ hadoop fs -copyFromLocal /home/zhongping/file0* input
(4) Run wordcount:
    $ hadoop jar hadoop-0.20.2-examples.jar wordcount input output
(5) When the job completes, view the results:
    $ hadoop fs -cat output/part-r-00000

For reference, the /etc/profile entries used in this setup, cleaned up (the JDK and Hadoop install paths are the author's own):
    export JAVA_HOME=/home/chuanqing/profile/jdk-6u13-linux-i586.zip_files/jdk1.6.0_13
    export CLASSPATH=".:$JAVA_HOME/lib:$CLASSPATH"
    export PATH="$JAVA_HOME/bin:$PATH"
    export HADOOP_INSTALL=/home/chuanqing/profile/hadoop-0.20.203.0
    export PATH=$PATH:$HADOOP_INSTALL/bin
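As a quick sanity check, the result expected from the wordcount run in step 5 can be reproduced with plain shell tools (no Hadoop involved), since the two input files are tiny:

```shell
# Recreate the two input files from step 5(1)
echo "Hello World Bye World" > file01
echo "Hello Hadoop Goodbye Hadoop" > file02
# Split on spaces, sort, and count duplicates -- the same word counts the
# wordcount example computes: Bye 1, Goodbye 1, Hadoop 2, Hello 2, World 2
cat file01 file02 | tr ' ' '\n' | LC_ALL=C sort | uniq -c
```

The counts printed here should match what hadoop fs -cat output/part-r-00000 shows after the job, modulo formatting (uniq -c prints the count before the word; wordcount prints word then count).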
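As a closing note, the three pseudo-distributed configuration files from section 4 can also be written by script instead of edited by hand. This is a sketch using here-documents; the values (tmp directory, ports) are the examples from the text, and hadoop.tmp.dir should be changed to a directory you own.

```shell
# Write core-site.xml, hdfs-site.xml, and mapred-site.xml into the conf/
# directory (override the target with the CONF environment variable).
CONF="${CONF:-conf}"
mkdir -p "$CONF"

cat > "$CONF/core-site.xml" <<'EOF'
<configuration>
  <!-- global properties -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/zhongping/tmp</value>
  </property>
  <!-- file system properties -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

cat > "$CONF/hdfs-site.xml" <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

cat > "$CONF/mapred-site.xml" <<'EOF'
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
EOF
```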