1. Environment and Tool Versions
CentOS 6.4 (Final)
jdk-7u60-linux-i586.gz
hadoop-1.1.2.tar.gz
sqoop-1.4.3.bin__hadoop-1.0.0.tar.gz
mysql-5.6.11.tar.gz
2. Install CentOS
I used UltraISO (as described in many online guides) to create a bootable USB drive and installed the system directly; there is plenty of material online. Two tips: it is best not to change the hostname during installation, and best not to add users through the graphical interface. I had to reinstall the system because of a problem this caused, and everything can be done from the terminal anyway.
3. Install JDK
The installed CentOS already ships with a Java runtime, but it only includes the JRE; since debugging and compilation are needed, it is better to install the full JDK, so uninstall the bundled version first. Run rpm -qa | grep jdk to see which JDK packages are installed, then remove each one with yum -y remove <package name> (make sure the machine is connected to the network and its IP address is configured first). Then install the JDK. A download whose file name contains "bin" is a self-installing package; the .gz package used here can simply be decompressed. I use the directory layout shown in the environment variables below: after decompression, put the JDK under /usr/java. Then configure the environment variables with vim /etc/profile:
export JAVA_HOME=/usr/java/jdk1.7.0_60
export JRE_HOME=/usr/java/jdk1.7.0_60
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
Then press ESC and :wq to save and exit, and run:
source /etc/profile
java -version
If the Java version is displayed, the JDK configuration is successful.
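As a sketch, the uninstall-and-install steps above could look like the following (run as root; the exact package names reported by rpm vary from system to system, and java-1.6.0-openjdk is only an example):

```shell
# List any preinstalled JDK/JRE packages
rpm -qa | grep -i jdk

# Remove each package the previous command listed, for example:
yum -y remove java-1.6.0-openjdk

# Decompress the downloaded JDK archive under /usr/java
mkdir -p /usr/java
tar -xzf jdk-7u60-linux-i586.gz -C /usr/java

# After adding the export lines to /etc/profile, reload and verify
source /etc/profile
java -version
```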
4. Install Hadoop
Before that, make sure that SSH is installed on the machine, e.g. rpm -qa | grep ssh.
If SSH packages are listed, SSH is already present; otherwise install it with yum: yum -y install openssh-server openssh-clients.
Then create a hadoop user (the name is up to you): groupadd hadoop, then useradd -g hadoop hadoop (creates a hadoop user in the hadoop group). You can give hadoop sudo rights in /etc/sudoers: below the line root ALL=(ALL) ALL, add hadoop ALL=(ALL) ALL. Pay attention to the access permissions on the sudoers file; if you change them in order to edit it, change them back afterwards. I forgot to mention earlier: it is best to install CentOS in English, which saves a lot of trouble (you will see once you try). Finally, switch to the hadoop user: su - hadoop.
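The user-creation steps above, as a sketch (run as root; the user and group name hadoop follow the text):

```shell
# Create the hadoop group and a hadoop user whose primary group is hadoop
groupadd hadoop
useradd -g hadoop hadoop
passwd hadoop               # set a password for the new user

# Grant sudo rights: in /etc/sudoers, below
#   root    ALL=(ALL)       ALL
# add
#   hadoop  ALL=(ALL)       ALL
# (editing with visudo handles the file permissions for you)

# Switch to the new user
su - hadoop
```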
Then configure passwordless login. Since this is a pseudo-distributed setup, everything is configured on one machine: sudo service sshd restart, then ssh-keygen -t rsa -P ''
Press Enter at the prompts.
The generated key files are saved to /home/hadoop/.ssh/ by default. Then cd /home/hadoop/.ssh and
cat id_rsa.pub >> authorized_keys
After that you can log in without a password!
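The passwordless-login steps can be sketched as follows (the chmod 600 on authorized_keys is an addition not mentioned in the text above, but sshd usually requires it):

```shell
# Restart sshd, then generate an RSA key pair with an empty passphrase
sudo service sshd restart
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

# Authorize the public key for logins to this machine
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
chmod 600 authorized_keys   # sshd rejects group/world-writable key files

# Verify: this should log in without prompting for a password
ssh localhost exit
```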
Next install Hadoop. First decompress the package and put it in /usr/local/hadoop. Then a series of files needs to be configured: the environment variables, plus hadoop-env.sh, core-site.xml, hdfs-site.xml, and mapred-site.xml (the configuration details are omitted here). After configuring, you must give the hadoop user ownership of the Hadoop folder: sudo chown -R hadoop:hadoop /usr/local/hadoop/. Then source /etc/profile and format the namenode: hadoop namenode -format. For the first format, do not create the tmp, name, data, or other directories in advance; the system creates them automatically, and creating them yourself causes node startup failures. Then start everything with start-all.sh and check the running daemons with the jps command; there should be five, six counting Jps itself:
NameNode
DataNode
JobTracker
TaskTracker
SecondaryNameNode
Jps
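The configuration details were omitted above; purely as a hedged reference, a minimal pseudo-distributed Hadoop 1.x configuration often looks roughly like this (the port numbers and the hadoop.tmp.dir path are common conventions, not taken from this setup):

```xml
<!-- core-site.xml -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>

<!-- hdfs-site.xml -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

<!-- mapred-site.xml -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
```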
I hit countless errors during this process. One thing I can confirm: if the DataNode is missing, it is most likely caused by formatting the namenode more than once. The fix: go to hadoop/data/current/, open the VERSION file with vim, and check whether its namespaceID matches the one in the name directory; if not, change them to be the same, then restart the cluster without reformatting. Try to avoid formatting more than once. Once the cluster starts successfully, you can run the bundled examples.
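The namespaceID check can be sketched like this (the exact name and data directory locations depend on how dfs.name.dir and dfs.data.dir were configured; /usr/local/hadoop is assumed here):

```shell
# Compare the namespaceID recorded by the namenode and the datanode
grep namespaceID /usr/local/hadoop/name/current/VERSION
grep namespaceID /usr/local/hadoop/data/current/VERSION

# If the two values differ, edit the datanode's VERSION file so its
# namespaceID matches the namenode's, then restart the cluster
# WITHOUT running hadoop namenode -format again
```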
5. Install MySQL
For this step I will just point to a good blog post, which is exactly how I installed it; note that yum needs to install some build tools first: centosmysqlinstall.
6. Install Sqoop
Decompress the package and put it in /usr/local/sqoop. Then configure the environment variables and check the sqoop version to confirm the configuration succeeded. Some necessary extra configuration: sqoop depends on the hadoop-core.jar from Hadoop and on the MySQL connector jar, both of which go into sqoop/lib, and the sqoop configuration file also needs to be modified, as shown in the attachment. Hadoop must be running first; then you can import from MySQL into HDFS:
sqoop import --connect jdbc:mysql://localhost:3306/databasename --table tablename --username <user> --password <password> -m 1
By default the data is imported into HDFS; later you can configure HBase, Hive, and so on. To view the imported data:
hadoop fs -cat /user/hadoop/test/part-m-00000
If problems occur during the import: a version mismatch shows up as a method-not-found exception, while other exceptions can be resolved by modifying the HDFS configuration file. That is how I solved mine, and it worked fine afterwards.
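A sketch of the Sqoop preparation and import steps above (the jar file names are assumptions based on the Hadoop 1.1.2 and MySQL versions listed at the top; adjust them to whatever you actually downloaded):

```shell
# Copy the jars Sqoop depends on into its lib directory
cp /usr/local/hadoop/hadoop-core-1.1.2.jar /usr/local/sqoop/lib/
cp mysql-connector-java-5.1.25-bin.jar /usr/local/sqoop/lib/

# Import one MySQL table into HDFS with a single map task
sqoop import --connect jdbc:mysql://localhost:3306/databasename \
  --table tablename --username <user> --password <password> -m 1

# The result lands under the importing user's HDFS home directory
hadoop fs -cat /user/hadoop/test/part-m-00000
```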
Add two properties to the hdfs-site.xml: one for permissions and one for safe mode.
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
<property>
  <name>dfs.safemode.threshold.pct</name>
  <value>0</value>
</property>
This article was written up after the fact, so there may be mistakes. If anything is unclear, feel free to ask me: 374492359
This article is from the "Java notes" blog, please be sure to keep this source http://maidoujava.blog.51cto.com/7607166/1533071