Hadoop Linux Installation

Step flow:
1. Hardware preparation
2. Software preparation (the CDH distribution is recommended)
3. Distribute the Hadoop installation package to each node
4. Install the JDK
5. Modify the /etc/hosts configuration file
6. Set up passwordless SSH login
7. Modify the configuration files
8. Start the services
9. Verify the installation

1. Download the software
1.1 Apache releases: http://www.apache.org/
1.2 CDH releases: http://www.cloudera.com/

2. Install the JDK, extract the Hadoop installation package, and distribute it to each node

3. Modify the /etc/hosts file so every node can resolve every other node's hostname

4. Set up passwordless SSH login between the nodes

5. Directory overview
bin — Hadoop's most basic management and usage scripts
etc — the directory holding Hadoop's configuration files
include — header files for external programming
lib — the dynamic and static libraries Hadoop provides externally, used together with the include directory
libexec — the directory holding each service's shell configuration files, used to configure settings such as log output
sbin — the scripts that start and stop each Hadoop service (the compiled jar packages for each module live under share)

6. Configuration files (after editing, distribute them to each node with the scp command)
1. hadoop-env.sh
2. mapred-site.xml
3. core-site.xml
4. yarn-site.xml
5. hdfs-site.xml
6. slaves

7. Start the services
1. Format HDFS: bin/hadoop namenode -format
2. Start HDFS: sbin/start-dfs.sh
3. Start YARN: sbin/start-yarn.sh

8. Verify
Run jps (it should show the five service processes) or open the web UI at http://hostname:port, where the port is set in the XML configuration files.

9. Problems
1. Hadoop does not start successfully after a virtual machine restart: add the appropriate configuration (hadoop.tmp.dir) in core-site.xml, because the /tmp directory is cleared on every restart and Hadoop stores its working data there by default.
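A minimal core-site.xml sketch for the configuration step, which also covers the restart problem noted at the end: pointing hadoop.tmp.dir away from /tmp keeps Hadoop's working data from being wiped on reboot. The hostname node1, port 9000, and path /opt/hadoop/tmp are placeholder assumptions, not values from this document:

```xml
<!-- core-site.xml: a minimal sketch; node1 and /opt/hadoop/tmp are placeholders -->
<configuration>
  <property>
    <!-- URI of the NameNode; every node must be able to resolve this hostname -->
    <name>fs.defaultFS</name>
    <value>hdfs://node1:9000</value>
  </property>
  <property>
    <!-- keep Hadoop's working data out of /tmp so it survives reboots -->
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/tmp</value>
  </property>
</configuration>
```

Create the directory (and make it writable by the Hadoop user) on every node before formatting HDFS, then distribute the edited file with scp as described above.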
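The start-and-verify steps can be sketched as the command sequence below. It requires an already-configured cluster; HADOOP_HOME and node1 are placeholder assumptions:

```shell
# Start-and-verify sketch, assuming HADOOP_HOME points at the unpacked install.
cd "$HADOOP_HOME"

# One-time only, before the very first start: re-formatting erases all HDFS data.
bin/hadoop namenode -format

# Start HDFS (NameNode, DataNodes, SecondaryNameNode), then YARN.
sbin/start-dfs.sh
sbin/start-yarn.sh

# Verify: jps should list the five daemons --
# NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager.
jps

# Or check the web UIs (Hadoop 2.x default ports; yours come from the XML configs):
#   http://node1:50070   HDFS NameNode UI
#   http://node1:8088    YARN ResourceManager UI
```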
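The passwordless SSH step above can be sketched as the following commands, run on the master node. The hostnames node1, node2, node3 and the sample IP addresses are placeholders — substitute the entries from your own /etc/hosts:

```shell
# Passwordless SSH sketch. Assumes /etc/hosts on every node maps each hostname, e.g.:
#   192.168.1.101  node1
#   192.168.1.102  node2
#   192.168.1.103  node3

# Generate an RSA key pair without a passphrase, unless one already exists.
[ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N '' -f "$HOME/.ssh/id_rsa" -q

# Push the public key to every node (including the master itself).
for host in node1 node2 node3; do
  ssh-copy-id -o ConnectTimeout=5 "$host" \
    || echo "could not reach $host (expected while the nodes are not up yet)"
done
echo "ssh key distribution attempted"
```

After this, `ssh node1` from the master should log in without prompting for a password, which is what start-dfs.sh and start-yarn.sh rely on to launch daemons on the worker nodes.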