01_note_hadoop: introduction to the origin and system; Hadoop cluster; CDH family
Install the JDK from the tar package and configure environment variables
tar -xzvf jdkxxx.tar.gz -C /usr/app/ (a custom directory for storing apps after installation)
java -version — view the current system Java version and environment
rpm -qa | grep java — view installed Java packages and dependencies
yum -y remove xxxx (remove each package found by the grep above)
Configure the environment variables in /etc/profile, then apply the configuration with source /etc/profile
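For example, a minimal JDK entry in /etc/profile might look like the sketch below (the unpack path /usr/app/jdk1.8.0 is an assumption; use the actual directory):
export JAVA_HOME=/usr/app/jdk1.8.0   # assumed unpack path
export PATH=$JAVA_HOME/bin:$PATH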
Note:
Turn off the firewall to avoid unexpected errors (CentOS 7 uses the systemctl command)
sudo service iptables stop
sudo chkconfig iptables off (disable iptables at boot)
chkconfig — check the status of all services
Password-free access between hosts (note the direction of password-free access: copy A's RSA public key to B, then A can access B without a password)
ssh-keygen -t rsa
scp .ssh/id_rsa.pub [email protected]:/home/henry/.ssh/authorized_keys
ssh-copy-id -i id_rsa.pub [email protected] (appends the key to authorized_keys, so more than one host can have password-free access)
id_rsa >> private key
id_rsa.pub >> public key
The local host's own public key also needs to be added to authorized_keys
Password-free login by hostname and by IP may not both work; try both until one succeeds
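A minimal end-to-end sketch of the password-free setup, assuming user henry on the master pushing a key to a host named slave01:
ssh-keygen -t rsa                                  # generates ~/.ssh/id_rsa and id_rsa.pub; accept the defaults
ssh-copy-id -i ~/.ssh/id_rsa.pub henry@slave01     # appends the public key to slave01's authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys    # also add the local key so ssh to localhost works
ssh slave01                                        # should now log in without a password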
Note:
scp remote copy: scp -r /usr/jdk1.8.0 [email protected]:/usr/ (-r copies folder contents recursively)
Watch out for Permission denied: if a normal user writes directly to a path such as /usr without write permission, it will fail; the solution is to copy as [email protected] or to write to /home/user instead
/etc/hosts maps hostnames to IP addresses (must be configured on all related master/slave hosts)
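A sketch of the /etc/hosts entries for this cluster (the master IP 10.0.0.11 appears later in these notes; the slave IPs here are assumed for illustration):
10.0.0.11   master
10.0.0.12   slave01
10.0.0.13   slave02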
Install and configure Hadoop
Download hadoop-2.8.0.tar.gz and extract it to /home/henry
cd etc/hadoop (configuration directory for 2.8; conf/ for 1.x versions)
Reference for 2.x:
http://www.cnblogs.com/edisonchou/p/5929094.html
http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html
vi hadoop-env.sh — configure JAVA_HOME
vi core-site.xml
vi hdfs-site.xml
vi yarn-site.xml
vi mapred-site.xml (copy from the template first: cp mapred-site.xml.template mapred-site.xml)
(minimal sketches of core-site.xml and hdfs-site.xml are shown after the slaves step below)
vi masters (configure the NameNode hostname; 2.x may only need slaves to be configured)
vi slaves (configure the DataNode hostnames; remove localhost, otherwise the master itself will also act as a DataNode)
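Minimal sketches of the two core configuration files, assuming the NameNode host is named master and there are two DataNodes (property names are the standard 2.x ones; the port and values are illustrative):
core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>  <!-- NameNode address; hostname assumed -->
  </property>
</configuration>
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>  <!-- one replica per DataNode in this two-slave setup -->
  </property>
</configuration>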
sudo vi /etc/profile — configure HADOOP_HOME
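A corresponding /etc/profile sketch for Hadoop, assuming the extraction path /home/henry/hadoop-2.8.0 from the step above:
export HADOOP_HOME=/home/henry/hadoop-2.8.0   # assumed extraction path
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH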
hadoop namenode -format
Try starting Hadoop:
sbin/start-dfs.sh (may need to type yes to continue; wait until the prompt returns to $ before proceeding)
sbin/start-yarn.sh
Verify startup: /usr/jdkxxx/bin/jps (lists Java-related processes and their states)
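On a healthy 2.x cluster the jps output is typically along these lines (process IDs omitted):
on the master: NameNode, SecondaryNameNode, ResourceManager
on each slave: DataNode, NodeManager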
Access the web interface at http://10.0.0.11:50070, or at http://master.henry:50070 if the hostname is configured in /etc/hosts
Note:
Note: you need to shut down the firewall: systemctl stop firewalld.service (CentOS 7)
Disable the firewall at boot: systemctl disable firewalld.service (CentOS 7)
View firewall status: systemctl status firewalld.service (CentOS 7)
env — list the environment variable configuration
Setting up the Hadoop 2.8.1 build environment - http://blog.csdn.net/bjtudujunlin/article/details/50728581
yum install svn (!!! note that the host firewall/antivirus, such as Avast, may cause some network requests to fail)
yum install autoconf automake libtool cmake (dependency packages)
Download and install Maven
Download and install Google protoc (needs to be compiled with a specified install path: ./configure --prefix=/usr/app/protoc); a build sketch is shown after the verification steps below
Configure /etc/profile
mvn -v OK
protoc --version OK
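A typical protoc build sketch for the step above (Hadoop 2.x builds generally required protobuf 2.5.0; the version and paths here are assumptions):
tar -xzvf protobuf-2.5.0.tar.gz
cd protobuf-2.5.0
./configure --prefix=/usr/app/protoc   # install into the custom app directory
make && make install
export PATH=/usr/app/protoc/bin:$PATH  # so that protoc --version works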
Download the source with svn and compile Hadoop
mvn package -DskipTests -Pdist,native,docs -Dtar (-Dtar also generates a .tar installation package)
svn checkout http://svn.apache.org/repos/asf/hadoop/common/trunk/ (Hadoop trunk; use /common/tags/x.x.x for an older version)
The build output directory is defined in the pom.xml file:
<outputDirectory>hadoop-dist/target</outputDirectory>
Hadoop 1.2.1 lab environment: run a sample algorithm
mkdir input
echo "Hello world!" > test1.txt (Echo Standard output to screen,> output to file)
echo "Hello hadoop!" > Test2.txt
HADOOP_HOME/bin/hadoop fs -ls (view HDFS files)
HADOOP_HOME/bin/hadoop fs -put ../input ./in (the HDFS home directory ./ is /user/henry)
HADOOP_HOME/bin/hadoop fs -ls
HADOOP_HOME/bin/hadoop fs -ls ./in/*
HADOOP_HOME/bin/hadoop fs -cat ./in/test1.txt (if this works, the HDFS upload is OK)
bin/hadoop jar hadoop-examples-1.2.1.jar wordcount in out (Hadoop built-in example: count words)
HADOOP_HOME/bin/hadoop fs -ls (an out folder has been generated)
HADOOP_HOME/bin/hadoop fs -ls ./out
HADOOP_HOME/bin/hadoop fs -cat ./out/part-xxx (a MapReduce job has run successfully)
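If the job succeeds, the part file should contain counts roughly like the following (WordCount splits on whitespace, so the punctuation stays attached; exact formatting may differ):
Hello    2
hadoop!  1
world!   1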
Note:
(If you see the error org.apache.hadoop.mapred.SafeModeException: JobTracker is in safe mode, leave safe mode:)
hadoop dfsadmin -safemode leave
Hadoop 2.8.1 lab environment: run a sample algorithm
Note:
It seems that a MapReduce sample (e.g. hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar pi 5 5) has to be run before the hadoop fs -xxx commands such as -ls, -put ... can be used
The hadoop fs home directory is /user/henry
(If error: could only be replicated to 0 nodes, instead of 1)
!!! Check the firewall: systemctl disable firewalld.service only takes effect after a reboot; to turn the firewall off immediately, run systemctl stop firewalld.service first
http://localhost:50070/dfshealth.jsp for 1.2.1
http://localhost:50070/dfshealth.html for 2.x (can browse the HDFS file system)
http://localhost:50030/jobtracker.jsp for 1.2.1
hadoop fs -put (upload) / -get (download) / -rm / -rmr (in 2.x.x the command form is hdfs dfs -xxx)
hdfs dfsadmin -report (get basic HDFS statistics)
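Equivalent 2.x forms of the common operations with hdfs dfs (paths assumed for illustration) look like:
hdfs dfs -ls /user/henry             # list the home directory
hdfs dfs -put localfile.txt ./in/    # upload
hdfs dfs -get ./out/part-r-00000 .   # download
hdfs dfs -rm -r ./out                # recursive delete (replaces 1.x -rmr)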
The test environment currently being built:
master - NameNode
slave01 - DataNode 1
slave02 - DataNode 2
CentOS 7 installation and configuration of Hadoop 2.8.x: JDK installation, password-free login, running the Hadoop Java sample programs