Reposted from: http://www.cyblogs.com/ (my own blog ~)
First of all, we need 3 machines. Here I created 3 VMs in VMware, so my Hadoop cluster is fully distributed with the most basic configuration. I chose CentOS because the Red Hat family is comparatively popular in enterprise environments. After installation, the final environment information:
Hostname   IP Address
h1         192.168.230.133
h2         192.168.230.160
h3         192.168.230.161
A small point to note here: modify each machine's hostname.

vim /etc/sysconfig/network

Set the HOSTNAME line to h1, h2, or h3 respectively, then restart the machine and that part is done. The most basic environment is now in place: 3 machines.
Actually, setting up Hadoop is not very troublesome. Let's take a holistic look at the steps involved:
- Configure the hosts file
- Create a Hadoop run account
- Configure SSH password-free login
- Download and unzip the Hadoop installation package
- Configure the NameNode and modify the site files
- Configure hadoop-env.sh
- Configure the masters and slaves files
- Replicate Hadoop to each node
- Format the NameNode
- Start Hadoop
- Use jps to verify that the background processes started successfully
Let's break these down step by step. There may be mistakes or omissions along the way, and I hope more experienced readers will correct me. I am also a beginner here, so criticism is welcome.
Configure the hosts
We need to configure the hosts file on all 3 machines, so that a hostname stands in for a string of IP addresses; suddenly the world is clean.
sudo vim /etc/hosts

192.168.230.133 h1
192.168.230.160 h2
192.168.230.161 h3
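The same edit can also be scripted. A minimal sketch, writing to a local hosts.demo file so it can be dry-run safely; on the real nodes you would point it at /etc/hosts as root:

```shell
#!/bin/sh
# Sketch: append the cluster's host entries only if they are missing.
# hosts.demo stands in for /etc/hosts so the dry run touches nothing important.
HOSTS_FILE=hosts.demo
touch "$HOSTS_FILE"
for entry in "192.168.230.133 h1" "192.168.230.160 h2" "192.168.230.161 h3"; do
    # -x matches whole lines, -F takes the entry literally, so reruns add nothing
    grep -qxF "$entry" "$HOSTS_FILE" || echo "$entry" >> "$HOSTS_FILE"
done
cat "$HOSTS_FILE"
```

Because of the grep guard, running it twice still leaves exactly three entries.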
Create a Hadoop run account
Some people may feel this is unprofessional, because I am going to install Hadoop under the hadoop user's home directory; the better practice is to install into a common directory, for example /usr/local. I am just experimenting, so Hadoop goes under the hadoop user.
sudo groupadd hadoop
sudo useradd -s /bin/bash -d /home/hadoop -m hadoop -g hadoop
sudo passwd hadoop
Here is what the 3 commands mean:
- Create a hadoop group
- Add a hadoop user, create its home directory under /home, and add it to the hadoop group
- Set a password for the hadoop user
Configuring SSH password-free connection
This step is also important. With many machines, letting the same user log in to each other without a password is great for efficiency, especially in a cluster of N machines: imagine one script finishing everything for you. How cool is that.
Before this, an ops colleague shared a password rule with me. It goes roughly like this: offline environment -> springboard (jump host) -> production environment. Crossing each of these layers should require a password, for security reasons, while within each layer the same user logging in password-free is no problem. It is a really nice pattern.
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
What these do:
1. Generate a public/private key pair
2. Append the public key to authorized_keys
One catch here: you need to change the file's permissions to 600.

[hadoop@h1 .ssh]$ chmod 600 authorized_keys
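To check that the permission change took, you can read the mode back. A tiny demo against a scratch file (a stand-in for ~/.ssh/authorized_keys; sshd's StrictModes check rejects the file if it is writable by anyone else):

```shell
#!/bin/sh
# Demo: tighten permissions on a scratch file and read them back.
# authorized_keys.demo stands in for ~/.ssh/authorized_keys.
touch authorized_keys.demo
chmod 600 authorized_keys.demo
stat -c '%a %n' authorized_keys.demo   # GNU stat; prints the octal mode and name
```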
On h1:

ssh-copy-id -i ~/.ssh/id_dsa.pub hadoop@h2
ssh-copy-id -i ~/.ssh/id_dsa.pub hadoop@h3

On h2:

ssh-copy-id -i ~/.ssh/id_dsa.pub hadoop@h1
ssh-copy-id -i ~/.ssh/id_dsa.pub hadoop@h3

On h3:

ssh-copy-id -i ~/.ssh/id_dsa.pub hadoop@h1
ssh-copy-id -i ~/.ssh/id_dsa.pub hadoop@h2
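These commands all follow one pattern: every node pushes its key to the other two. A small dry-run sketch that prints the commands one node should run (ME is fixed to h1 here for the demo; on a real node it would be $(hostname), and the hostnames and hadoop user come from this setup):

```shell
#!/bin/sh
# Sketch: print (not run) the ssh-copy-id commands for one node.
# ME would normally be $(hostname); fixed to h1 for this dry run.
ME=${ME:-h1}
for host in h1 h2 h3; do
    if [ "$host" != "$ME" ]; then
        echo "ssh-copy-id -i ~/.ssh/id_dsa.pub hadoop@$host"
    fi
done > sshcopy.demo
cat sshcopy.demo
```

Piping the output to sh would actually run the commands, prompting once per target host for the hadoop password.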
Download and unzip the Hadoop installation package
Download the latest version here: http://mirrors.cnnic.cn/apache/hadoop/common/current/2.7.0
We configure everything that needs configuring on h1, and then copy it all to h2 and h3.
The Hadoop 2.7.0 configuration files are under /home/hadoop/hadoop/etc/hadoop; we need to configure each of the following.
[hadoop@h1 hadoop]$ vim core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://h1:9000</value>
    <final>true</final>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop/tmp</value>
    <description>A base for other temporary directories</description>
  </property>
</configuration>
Then edit hadoop-env.sh:

[hadoop@h1 hadoop]$ vim hadoop-env.sh

export JAVA_HOME=/opt/java/jdk1.7.0_55
If you don't know where JAVA_HOME is, just type echo $JAVA_HOME in the terminal to see it.
Then again:
[hadoop@h1 hadoop]$ vim hdfs-site.xml

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop/hdfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
Then again:

[hadoop@h1 hadoop]$ vim mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>h1:9001</value>
    <final>true</final>
  </property>
</configuration>
Then again:
[hadoop@h1 hadoop]$ touch masters
[hadoop@h1 hadoop]$ vim masters

Put h1 in it.
Then again:
[hadoop@h1 hadoop]$ vim slaves

Put h2 and h3 in it, one per line.
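Since masters and slaves are just newline-separated hostname lists, both edits can also be done non-interactively. A sketch, run from the Hadoop conf directory (/home/hadoop/hadoop/etc/hadoop in this setup):

```shell
#!/bin/sh
# Sketch: write the masters and slaves files without opening an editor.
# masters holds the master hostname (h1 here); slaves lists one worker per line.
printf 'h1\n' > masters
printf 'h2\nh3\n' > slaves
cat masters slaves
```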
Replicate Hadoop to each node
[hadoop@h1 ~]$ scp -r hadoop hadoop@h2:~
[hadoop@h1 ~]$ scp -r hadoop hadoop@h3:~
You also need to configure the environment variables on all 3 machines.
vim /etc/profile

export HADOOP_INSTALL=/home/hadoop/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin
export HADOOP_COMMON_LIB_NATIVE_DIR=/home/hadoop/hadoop/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/home/hadoop/hadoop/lib"

source /etc/profile
Two steps from our checklist remain: format the NameNode (only needed the first time), then start Hadoop. On h1:

[hadoop@h1 hadoop]$ bin/hdfs namenode -format
[hadoop@h1 hadoop]$ sbin/start-all.sh

Well, now it's all up.
Finally, use the jps command to check that the Hadoop-related processes are up: on h1 you should see processes such as NameNode and SecondaryNameNode, and on h2/h3 the DataNode.
Hadoop also has many admin web UIs to look at, for example:
Http://192.168.230.133:50070/dfshealth.html#tab-overview
Next, I'll continue recording more blog posts about Hadoop.