Deploying the Spark SQL distributed SQL engine on Linux: installing MySQL and configuring the Hive metastore (Tutorial)
● Deploy MySQL
# Find and remove any locally installed MySQL
rpm -qa | grep mysql
rpm -e mysql-libs-5.1.66-2.el6_3.i686 --nodeps

# Install the specified version of MySQL
rpm -ivh MySQL-server-5.1.73-1.glibc23.i386.rpm
rpm -ivh MySQL-client-5.1.73-1.glibc23.i386.rpm

# Change the MySQL root password (run the following command directly)
/usr/bin/mysql_secure_installation
# (Note: set the root password, remove anonymous users, and allow remote connections)

# Log in to MySQL
mysql -u root -p
This completes the basic MySQL installation.
Next, add an account for Spark SQL and grant it the required privileges. The metastore database used in this tutorial is named hiveMetastore. Assuming both the username and the password are spark, the authorization SQL can be written as follows:
mysql> grant all on hiveMetastore.* to spark@'localhost' identified by 'spark';
mysql> flush privileges;
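To confirm the grant took effect, you can log in as the new account from the shell and list its privileges. This is a sketch assuming MySQL is running locally and the grant above was issued; note that the hiveMetastore database itself may not exist yet, since the JDBC URL's createDatabaseIfNotExist=true creates it lazily on first use.

```shell
# Log in as the spark account and show its privileges
# (fails with an access-denied error if the grant was not applied)
mysql -u spark -pspark -e "SHOW GRANTS FOR 'spark'@'localhost';"
```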
● Prepare the configuration file conf/hive-site.xml
Next, we are going to start the JDBC/ODBC server. Before starting it, we need to prepare the following configuration file.
If you already have a working Hive installation, you can reuse its conf/hive-site.xml directly. Otherwise, create a file named hive-site.xml under the conf directory with the following content:
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hiveMetastore?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>spark</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>spark</value>
    <description>password to use against metastore database</description>
  </property>
  <property>
    <name>hive.hwi.war.file</name>
    <value>lib/hive-hwi-0.12.0.war</value>
    <description>This sets the path to the HWI war file, relative to ${HIVE_HOME}.</description>
  </property>
  <property>
    <name>hive.hwi.listen.host</name>
    <value>0.0.0.0</value>
    <description>This is the host address the Hive Web Interface will listen on</description>
  </property>
  <property>
    <name>hive.hwi.listen.port</name>
    <value>9999</value>
    <description>This is the port the Hive Web Interface will listen on</description>
  </property>
</configuration>
● Start the JDBC/ODBC server
Now you can start the JDBC/ODBC server with the following command:
./sbin/start-thriftserver.sh
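start-thriftserver.sh also accepts the same command-line options as spark-submit, plus --hiveconf overrides for Hive properties. As a sketch, the invocation below pins the Thrift port and bind address explicitly; the local[*] master URL is an illustrative assumption and should be replaced with your cluster's master URL:

```shell
# Start the Thrift server with an explicit port and bind address
# (--master local[*] is illustrative; 10000 is the default HiveServer2 port)
./sbin/start-thriftserver.sh \
  --master local[*] \
  --hiveconf hive.server2.thrift.port=10000 \
  --hiveconf hive.server2.thrift.bind.host=0.0.0.0
```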
● Use the beeline interactive tool
After the JDBC/ODBC server is started, we can use beeline to test whether it started normally:
./bin/beeline
Within beeline, the following command connects to the JDBC/ODBC server:
beeline> !connect jdbc:hive2://localhost:10000
You can also specify the JDBC server URL directly when starting beeline:
./bin/beeline -u 'jdbc:hive2://localhost:10000'
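Once connected, you can issue HiveQL statements against the server. Beeline's -e option runs a single statement non-interactively, which is handy for a quick sanity check; the query below is illustrative:

```shell
# Connect, run one statement, and exit
# (SHOW DATABASES should at least list the built-in "default" database)
./bin/beeline -u 'jdbc:hive2://localhost:10000' -e 'SHOW DATABASES;'
```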
To prevent garbled Chinese characters in SQL query results, set the system locale before starting beeline:
LANG=zh_CN.UTF-8; ./bin/beeline -u 'jdbc:hive2://localhost:10000'
● Run the Spark SQL command-line interface
./bin/spark-sql
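The spark-sql CLI picks up the same conf/hive-site.xml, so it talks to the same MySQL-backed metastore. Like beeline, it supports running a single statement non-interactively with -e; the query shown is illustrative:

```shell
# Execute one statement against the shared metastore and exit
./bin/spark-sql -e "SHOW DATABASES;"
```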