Hadoop-1.2.0 cluster installation and configuration

1. Overview
The build-out of the cloud platform for our university started a few days ago. Installing and configuring the Hadoop cluster test environment took about two days; I have finally finished the basic outline and am sharing the experience here.

2. Hardware environment
1. Windows 7 Ultimate, 64-bit
2. VMware Workstation ACE edition 6.0.2
3. RedHat Linux 5
4. Hadoop-1.2.0

Windows            VM                    Virtual Machine Linux   IP               Function
Windows 7 64-bit   VMware Workstation    Redhat1                 192.168.24.250   Namenode, master, jobtracker
Windows 7 64-bit   VMware Workstation    Redhat2                 192.168.24.249   Datanode, slave, tasktracker
Windows 7 64-bit   VMware Workstation    Redhat3                 192.168.24.248   Datanode, slave, tasktracker

3. Install VMware Workstation and RedHat Linux 5
Guides for installing VMware Workstation and RedHat Linux 5 are everywhere on the Internet, where you can find more detailed and accurate information, so they are not described here.

4. Install and configure Hadoop
1. Configure Linux before installing Hadoop.
(1) Change the network connection mode of the three virtual machines: select the virtual machine to change, right-click, and choose Settings.
(2) Log on to Linux as the root user and set the IP address and default gateway (set on all three machines).
Enter vi /etc/sysconfig/network-scripts/ifcfg-eth0 (vi usage is not covered here; look it up online if you are not familiar with it) and modify the file content to:
DEVICE=eth0
BOOTPROTO=static
IPADDR=192.168.24.250
GATEWAY=192.168.27.254
NETMASK=255.255.255.0
ONBOOT=yes
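The new settings do not take effect until the network service is restarted. A quick way to apply and check them, assuming the standard RedHat init scripts and that eth0 is the interface you edited:

# apply the new static IP configuration
service network restart
# confirm the address, netmask and default gateway
ifconfig eth0
route -n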

Set the IP address and default gateway as needed on each machine.
(3) Configure the host name of the virtual machine (set on all three machines). Enter vi /etc/sysconfig/network and modify the content to:
NETWORKING=yes
NETWORKING_IPV6=yes
HOSTNAME=redhat1
(4) Configure the mapping between host names and IP addresses (set on all three machines). Enter vi /etc/hosts and modify the content to:
127.0.0.1 localhost
192.168.24.250 redhat1
192.168.24.249 redhat2
192.168.24.248 redhat3
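If you do not want to reboot, the host name can also be applied for the current session and name resolution checked right away; a small sketch, assuming the /etc/hosts entries above are present on every machine (run the matching hostname command on each box):

# apply the host name for the running system (matches HOSTNAME in /etc/sysconfig/network)
hostname redhat1
# check that the other nodes resolve by name
ping -c 1 redhat2
ping -c 1 redhat3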
Keep /etc/hosts to exactly this content and remove anything unnecessary; otherwise the Hadoop master may show Live Nodes: 0.

(5) Disable the firewall (set on all three machines). Enter chkconfig iptables off so the firewall does not start at boot, and enter service iptables stop to stop the currently running firewall service. Of course you can also configure the firewall to allow Hadoop's traffic instead; for simplicity I just turn the firewall off here.
(6) After the network settings are complete, run ping between the virtual machines to make sure the network between them works, for example: ping 192.168.24.249

2. Establish SSH password-less login between the Linux systems. Tutorials for this are also everywhere on the Internet.

3. Install and configure the JDK (all three machines must have it). See my article http://www.jialinblog.com/?p=74

4. Install Hadoop (install on all three machines).
(1) Download hadoop-1.2.0 from the Hadoop website.
(2) Upload it to Linux with FTP; if you are not sure how, see my articles http://blog.csdn.net/shan9liang/article/details/9110559 and http://www.jialinblog.com/?p=64
(3) Decompress the package: in the directory where hadoop-1.2.0.tar.gz is located, enter tar -zvxf hadoop-1.2.0.tar.gz. Hadoop is now installed.

5. Configure Hadoop (set on all three machines).
(1) Configure the Hadoop environment variables, with the same command used for the JDK environment variables: vi /etc/profile, and at the end of the file add:
export HADOOP_HOME=/usr/local/hadoop-1.2.0
export PATH=$PATH:$HADOOP_HOME/bin
Run source /etc/profile to make the profile take effect.
(2) Configure the Hadoop run parameters.
Edit conf/hadoop-env.sh under the Hadoop installation path (set on all three machines) and add on line 9:
export JAVA_HOME=/usr/java/jdk1.7.0_21
Edit conf/masters and conf/slaves under the Hadoop installation path (configure only on the 192.168.24.250 virtual machine).
In masters enter:
192.168.24.250
In slaves enter:
192.168.24.249
192.168.24.248
Configure the three files core-site.xml, hdfs-site.xml and mapred-site.xml under the conf directory of the Hadoop installation path.
core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.24.250:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp</value>
  </property>
</configuration>
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.24.250:9001</value>
  </property>
</configuration>
(3) Format the file system with the command: hadoop namenode -format
Hadoop is now installed and configured.

5. Test
1. Start Hadoop. On the 192.168.24.250 machine, run start-all.sh from the bin directory of the Hadoop installation. For Hadoop it is simplest to start all processes at once, but if necessary you can still start only HDFS (start-dfs.sh) or only MapReduce (start-mapred.sh). You can monitor the HDFS file system status and running MapReduce jobs from a web browser: for HDFS open http://192.168.24.250:50070/ and for MapReduce open http://192.168.24.250:50030/
2. Run the wordcount example shipped with Hadoop by executing the following commands in sequence:
echo "it is a dog" > input1
echo "it is not a dog" > input2
hadoop fs -mkdir input
hadoop fs -copyFromLocal /root/input* input
hadoop jar /usr/local/hadoop-1.2.0/hadoop-examples-1.2.0.jar wordcount input output
You can watch the job's progress at http://192.168.24.250:50030 and check that everything ran successfully!
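As a quick extra sanity check after start-all.sh, you can run the JDK's jps tool on every node. On a Hadoop 1.x cluster laid out as above, the master would normally show NameNode, SecondaryNameNode and JobTracker, and each slave would show DataNode and TaskTracker:

# on the master (192.168.24.250) jps should list something like:
#   NameNode, SecondaryNameNode, JobTracker, Jps
jps
# on each slave (192.168.24.249 and 192.168.24.248) jps should list something like:
#   DataNode, TaskTracker, Jps
jps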
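For step 2 of section 4 above (SSH password-less login between the Linux systems), a minimal sketch of the usual approach, assuming you work as root and OpenSSH is installed on all three machines (users, host names and paths may differ in your setup):

# on 192.168.24.250: generate an RSA key pair with an empty passphrase
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
# authorize the key for the master itself
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
# copy the public key to a slave and append it to its authorized_keys
scp ~/.ssh/id_rsa.pub root@192.168.24.249:/tmp/master.pub
ssh root@192.168.24.249 "mkdir -p ~/.ssh; cat /tmp/master.pub >> ~/.ssh/authorized_keys; chmod 700 ~/.ssh; chmod 600 ~/.ssh/authorized_keys"
# repeat the scp and ssh commands for 192.168.24.248, then test:
ssh 192.168.24.249 hostname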
6. Summary
The Hadoop cluster environment simulated with multiple virtual machines is now basically done; the rest is detailed configuration as needed. If you want to move it to physical machines, you only need to repeat the installation steps used on the virtual machines. Next I will write an article on connecting Eclipse to a remote Hadoop cluster for development, which also involves some troublesome problems; fortunately I have already solved them and will write them up soon, so stay tuned.
