Environment: CentOS 7 + Hadoop 2.5.2 + Hive 1.2.1 + MySQL 5.6.22 + Eclipse Indigo Service Release 2
Approach: Hive loads the logs → Hadoop executes the analysis in a distributed fashion → the required data is exported to MySQL
Note: There is a lot of material on Hadoop log analysis systems on the internet, but most of it contains small problems and cannot be run smoothly end to end. The steps in this article have been verified personally and run coherently from start to finish. Detailed explanations of the exceptions you may encounter, together with their solutions, are also included.
1) Log format analysis
First, analyze the format of the Hadoop logs. It is the simplest possible format: one record per line, where each line consists, in order, of the date, the time, the category (log level), the related class, and the hint message.
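The field order can be seen by splitting a sample line on spaces. A minimal sketch (the log line below is illustrative, not taken from the article's data):

```java
public class LogLineSketch {
    public static void main(String[] args) {
        // Illustrative Hadoop log line: date, time, category, related class, message
        String line = "2015-10-11 14:02:28,744 INFO org.apache.hadoop.hdfs.StateChange: STATE* Network topology has 1 racks";
        // Split into at most 5 pieces so the trailing message stays intact
        String[] parts = line.split(" ", 5);
        System.out.println(parts[0]); // date
        System.out.println(parts[1]); // time, with the millisecond part after the comma
        System.out.println(parts[2]); // category (log level)
        System.out.println(parts[3]); // related class
        System.out.println(parts[4]); // hint message
    }
}
```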
2) Hive storage table design

    Column:  rdate   time           type    relateclass  information1  information2  information3
    Type:    string  array<string>  string  string       string        string        string

3) Hive table definition
create table if not exists loginfo(
    rdate string,
    time array<string>,
    type string,
    relateclass string,
    information1 string,
    information2 string,
    information3 string)
row format delimited fields terminated by ' '
collection items terminated by ','
map keys terminated by ':';
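One design detail worth noting: time is declared as array<string> because a Hadoop timestamp such as 14:02:28,744 contains a comma, which is exactly the collection-item terminator above, so Hive splits it into the clock time and the millisecond part (this is why the query later selects time[0]). A plain-Java sketch of that split:

```java
public class TimeSplitSketch {
    public static void main(String[] args) {
        // Hive splits this raw field on ',' (the collection item terminator)
        String rawTime = "14:02:28,744";
        String[] time = rawTime.split(",");
        System.out.println(time[0]); // clock time, what the query later reads as time[0]
        System.out.println(time[1]); // millisecond part
    }
}
```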
4) MySQL table definition
drop table if exists hadooplog;
create table hadooplog(
    id int(11) not null auto_increment,
    rdate varchar(50) default null,
    time varchar(50) default null,
    type varchar(50) default null,
    relateclass tinytext,
    information longtext,
    primary key (id)
) engine=InnoDB default charset=utf8;
5) MySQL database operations
[root@master /]# cd /usr/bin
[root@master bin]# service mysql start
Starting MySQL. SUCCESS!
[root@master bin]# mysql -uroot -p
mysql> CREATE DATABASE hive;
Query OK, 1 row affected (0.03 sec)
mysql> use hive
Database changed
mysql> source /usr/local/mysql/sql/hadooplog.sql
Query OK, 0 rows affected, 1 warning (0.05 sec)
Query OK, 0 rows affected (0.18 sec)
mysql> desc hadooplog;
6) DBHelper: responsible for establishing the connections to Hive and MySQL
package com.smq.hive;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class DBHelper {
    private static Connection connToHive = null;
    private static Connection connToMySQL = null;

    private DBHelper() {}

    // Hive connection
    public static Connection getHiveConn() throws SQLException {
        if (connToHive == null) {
            try {
                Class.forName("org.apache.hive.jdbc.HiveDriver");
            } catch (ClassNotFoundException err) {
                err.printStackTrace();
                System.exit(1);
            }
            connToHive = DriverManager.getConnection(
                    "jdbc:hive2://192.168.2.10:10000/default", "", "");
        }
        return connToHive;
    }

    // MySQL connection
    public static Connection getMySQLConn() throws SQLException {
        if (connToMySQL == null) {
            try {
                Class.forName("com.mysql.jdbc.Driver");
            } catch (ClassNotFoundException err) {
                err.printStackTrace();
                System.exit(1);
            }
            connToMySQL = DriverManager.getConnection(
                    "jdbc:mysql://192.168.2.10:3306/hive", "root", "xxxx");
        }
        return connToMySQL;
    }

    public static void closeHiveConn() throws SQLException {
        if (connToHive != null) {
            connToHive.close();
        }
    }

    public static void closeMySQLConn() throws SQLException {
        if (connToMySQL != null) {
            connToMySQL.close();
        }
    }

    public static void main(String[] args) throws SQLException {
        System.out.println(getMySQLConn());
        closeMySQLConn();
    }
}
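DBHelper keeps a single lazily-initialized connection per database: the connection is opened on the first call and reused on every later one. A self-contained sketch of that pattern (a String stands in for the JDBC Connection here, since no database is assumed):

```java
public class LazyConnSketch {
    private static String conn = null; // stands in for the JDBC Connection

    // Same lazy-initialization shape as DBHelper.getHiveConn()
    public static String getConn() {
        if (conn == null) {
            conn = "opened"; // DBHelper calls DriverManager.getConnection here instead
        }
        return conn;
    }

    public static void main(String[] args) {
        // The second call returns the very same object; nothing is reopened
        System.out.println(getConn() == getConn());
    }
}
```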
7) HiveUtil: utility class
package com.smq.hive;

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveUtil {
    // Create the table
    public static void createTable(String sql) throws SQLException {
        Connection conn = DBHelper.getHiveConn();
        Statement stmt = conn.createStatement();
        stmt.execute(sql);
    }

    // Query data based on the given criteria
    public static ResultSet queryData(String sql) throws SQLException {
        Connection conn = DBHelper.getHiveConn();
        Statement stmt = conn.createStatement();
        ResultSet res = stmt.executeQuery(sql);
        return res;
    }

    // Load the data
    public static void loadData(String sql) throws SQLException {
        Connection conn = DBHelper.getHiveConn();
        Statement stmt = conn.createStatement();
        stmt.execute(sql);
    }

    // Store the query results in MySQL
    public static void hiveToMySQL(ResultSet res) throws SQLException {
        Connection conn = DBHelper.getMySQLConn();
        Statement stmt = conn.createStatement();
        while (res.next()) {
            String rdate = res.getString(1);
            String time = res.getString(2);
            String type = res.getString(3);
            String relateclass = res.getString(4);
            String information = res.getString(5) + res.getString(6) + res.getString(7);
            StringBuffer sql = new StringBuffer();
            sql.append("insert into hadooplog values(0,'");
            sql.append(rdate + "','");
            sql.append(time + "','");
            sql.append(type + "','");
            sql.append(relateclass + "','");
            sql.append(information.replaceAll("'", "\"") + "')");
            System.out.println(sql);
            stmt.executeUpdate(sql.toString());
        }
    }
}
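Because hiveToMySQL assembles the INSERT statement by string concatenation, a single quote inside the log message would terminate the SQL string early; the replaceAll call swaps single quotes for double quotes before concatenating. A standalone sketch of that escaping step:

```java
public class QuoteEscapeSketch {
    public static void main(String[] args) {
        String information = "Can't connect to node";
        // Replace single quotes so the concatenated INSERT statement stays valid
        String safe = information.replaceAll("'", "\"");
        StringBuffer sql = new StringBuffer();
        sql.append("insert into hadooplog values(0,'");
        sql.append(safe + "')");
        System.out.println(sql);
    }
}
```

A PreparedStatement with ? placeholders would avoid manual escaping altogether; the concatenation above simply mirrors the article's approach.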
8) AnalyszeHadoopLog: main class
package com.smq.hive;

import java.sql.ResultSet;
import java.sql.SQLException;

public class AnalyszeHadoopLog {
    public static void main(String[] args) throws SQLException {
        StringBuffer sql = new StringBuffer();

        // Step 1: create the table in Hive
        sql.append("create table if not exists loginfo(");
        sql.append("rdate string,");
        sql.append("time array<string>,");
        sql.append("type string,");
        sql.append("relateclass string,");
        sql.append("information1 string,");
        sql.append("information2 string,");
        sql.append("information3 string) ");
        sql.append("row format delimited fields terminated by ' ' ");
        sql.append("collection items terminated by ',' ");
        sql.append("map keys terminated by ':'");
        System.out.println(sql);
        HiveUtil.createTable(sql.toString());

        // Step 2: load the Hadoop log file
        sql.delete(0, sql.length());
        sql.append("load data local inpath ");
        sql.append("'/usr/local/mysql/sql/hadoop-root-namenode-master.log' ");
        sql.append("overwrite into table loginfo");
        System.out.println(sql);
        HiveUtil.loadData(sql.toString());

        // Step 3: query the useful information
        sql.delete(0, sql.length());
        sql.append("select rdate,time[0],type,relateclass,");
        sql.append("information1,information2,information3 ");
        sql.append("from loginfo where type='INFO'");
        System.out.println(sql);
        ResultSet res = HiveUtil.queryData(sql.toString());

        // Step 4: save the extracted information into MySQL
        HiveUtil.hiveToMySQL(res);

        // Step 5: close the Hive connection
        DBHelper.closeHiveConn();

        // Step 6: close the MySQL connection
        DBHelper.closeMySQLConn();
    }
}
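As a quick sanity check of step three, the StringBuffer fragments must concatenate into one valid HiveQL statement; note the trailing space after information3, without which it would run straight into the from keyword. A standalone sketch that prints the assembled query:

```java
public class QuerySketch {
    public static void main(String[] args) {
        StringBuffer sql = new StringBuffer();
        // Same fragments as step three of the main class
        sql.append("select rdate,time[0],type,relateclass,");
        sql.append("information1,information2,information3 ");
        sql.append("from loginfo where type='INFO'");
        System.out.println(sql);
    }
}
```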
9) Eclipse operation steps
a) Import all the jar packages under hive/lib
b) Import the MySQL JDBC driver package
c) Start the Hadoop and MySQL services
d) Start the Hive remote service:
]# hive --service hiveserver &
If this exception occurs:
Exception in thread "main" java.lang.ClassNotFoundException: org.apache.hadoop.hive.service.HiveServer
the solution is to start HiveServer2 instead:
]# hive --service hiveserver2 &
e) Run the program: Run As → Run on Hadoop
Exception 1:
java.lang.ClassNotFoundException: org.apache.hadoop.hive.jdbc.HiveDriver
Solution:
In Hive 1.2 the package layout changed, so change
"org.apache.hadoop.hive.jdbc.HiveDriver"
to
"org.apache.hive.jdbc.HiveDriver"
If you use HiveServer2, you must also change
"jdbc:hive://localhost:10002/default", "", ""
to
"jdbc:hive2://localhost:10002/default", "", ""
Exception 2:
Exception in thread "main" java.sql.SQLException: null, message from server: "Host '192.168.2.1' is not allowed to connect to this MySQL server"
Solution:
mysql> GRANT ALL PRIVILEGES ON *.* TO 'myuser'@'%' IDENTIFIED BY 'MyPassword' WITH GRANT OPTION;
mysql> FLUSH PRIVILEGES;
f) Log in to MySQL and verify the results: select * from hadooplog;
10) Exception statistics
The statistics show that exceptions occurred at a low rate during the study of Hadoop, and that the learning process went smoothly. They also show that the DataNode is more prone to exceptions than the NameNode.
11) Workload statistics
The charts show that the heaviest study months were June and July, and that study time was concentrated in the early morning hours and on weekends.