Hive [1]: A First Look and Installation
This article assumes that Hadoop, Java, and MySQL are already installed and configured, and that their environment variables are set.
I. Basic Introduction to Hive
Hive is a data warehouse product in the Hadoop family. Its most notable feature is a SQL-like syntax that encapsulates the underlying MapReduce process, so business staff who already know SQL can run big data operations on Hadoop directly. This removes the bottleneck that data analysts previously faced when analyzing big data.
Hive originated at Facebook. It makes SQL queries possible on Hadoop, so that non-programmers can use the platform conveniently. Hive is a Hadoop-based data warehouse tool that maps structured data files to database tables and provides SQL query functionality, converting SQL statements into MapReduce jobs for execution.
Hive is a data warehouse infrastructure built on Hadoop. It provides a set of tools for data extraction, transformation, and loading (ETL), and a mechanism for storing, querying, and analyzing large-scale data held in Hadoop. Hive defines a simple SQL-like query language called HQL, which lets users familiar with SQL query the data. The language also lets developers familiar with MapReduce plug in custom mappers and reducers for complex analysis tasks that the built-in mappers and reducers cannot handle.
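To make the "SQL statement becomes a MapReduce job" idea concrete, here is a small shell analogy. It is only an illustration, not Hive's actual execution engine: the query in the comment, the sample rows, and all function names are made up for this sketch. It mimics how a GROUP BY count could be split into a map phase, a shuffle (sort) phase, and a reduce phase.

```shell
#!/bin/sh
# Illustrative analogy only, not Hive's real execution plan.
# Hypothetical query being mimicked:
#   SELECT dept, COUNT(*) FROM employees GROUP BY dept;

# Hypothetical input rows in the form name,dept
employees() {
  printf '%s\n' 'alice,sales' 'bob,engineering' 'carol,sales' \
                'dave,engineering' 'erin,sales'
}

# Map phase: emit one (key, 1) pair per row, keyed on the GROUP BY column.
map_phase() {
  awk -F',' '{ print $2 "\t" 1 }'
}

# Shuffle phase: sorting brings identical keys next to each other.
shuffle_phase() {
  LC_ALL=C sort
}

# Reduce phase: sum the values for each run of identical keys.
reduce_phase() {
  awk -F'\t' '
    NR > 1 && $1 != key { print key "\t" total; total = 0 }
    { key = $1; total += $2 }
    END { if (NR > 0) print key "\t" total }
  '
}

group_by_count() {
  employees | map_phase | shuffle_phase | reduce_phase
}

group_by_count
```

On a real cluster the three phases run in parallel across many machines. The point of Hive is that the user writes only the one-line query and never the phases themselves.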
Advantage: Hive is best suited to data warehouse applications. It can maintain massive amounts of data and mine that data to produce insights and reports. It is easy to get started with only a basic understanding of SQL syntax.
Disadvantage: In its default configuration, Hive does not support record-level update, insert, or delete operations. This follows from the design constraints of Hadoop and HDFS, and it limits what Hive can do. You can, however, generate a new table from a query or write query results to a file. Hive also does not support transactions by default.
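The workaround mentioned above (build a new table from a query, or write the results to files) can be sketched as follows. The table and column names (orders, orders_fixed, id, status) and the export path are hypothetical. The script only runs the statements when a hive command is found on the PATH, and otherwise just prints them.

```shell
#!/bin/sh
# Sketch: "update" one row by rewriting the data into a new table, then
# export the result to local files. All table names, column names, and
# paths here are hypothetical.

build_query() {
  cat <<'EOF'
CREATE TABLE orders_fixed AS
SELECT id,
       CASE WHEN id = 42 THEN 'shipped' ELSE status END AS status
FROM orders;

INSERT OVERWRITE LOCAL DIRECTORY '/tmp/orders_export'
SELECT * FROM orders_fixed;
EOF
}

if command -v hive >/dev/null 2>&1; then
  # hive -e runs the quoted HiveQL and then exits.
  hive -e "$(build_query)"
else
  # Without Hive on the PATH, only show what would be run.
  build_query
fi
```

The first statement rewrites every row and changes only the one that matches, which is how a record-level change is usually expressed when in-place updates are unavailable.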
Graphical management tools for Hive are also available, for example Cloudera's Hue project (https://github.com/cloudera/hue).
[Figure: Hive functional modules]
II. Install Hive
1) Download the latest stable release of Hive

    wget http://mirror.bit.edu.cn/apache/hive/stable/apache-hive-0.14.0-bin.tar.gz    # fetch the latest release
    tar -zxvf apache-hive-0.14.0-bin.tar.gz                 # decompress
    cp -r apache-hive-0.14.0-bin /usr/local/software/       # copy to the installation directory /usr/local/software
    cd /usr/local/software
    mv apache-hive-0.14.0-bin hive-0.14.0                   # rename

2) Configure Hive

    cd hive-0.14.0/conf
    cp hive-default.xml.template hive-site.xml
    cp hive-log4j.properties.template hive-log4j.properties

Edit the configuration file so that Hive stores its metadata in MySQL:

    vi hive-site.xml

    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://192.168.128.129:3306/hive_metadata?createDatabaseIfNotExist=true</value>
      <description>JDBC connect string for a JDBC metastore</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
      <description>Driver class name for a JDBC metastore</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>root</value>
      <description>username to use against metastore database</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>911</value>
      <description>password to use against metastore database</description>
    </property>
    <property>
      <name>hive.metastore.warehouse.dir</name>
      <value>/user/hive/warehouse</value>
      <description>location of default database for the warehouse</description>
    </property>

Because the metastore lives in MySQL, the MySQL JDBC driver JAR must also be present in Hive's lib directory. The Hive distribution does not ship it.

Edit hive-log4j.properties, commenting out the old EventCounter line and adding the new one:

    #log4j.appender.EventCounter=org.apache.hadoop.metrics.jvm.EventCounter
    log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter

3) Set the environment variables

    vi /etc/profile

    export HIVE_INSTALL=/usr/local/software/hive-0.14.0
    export PATH=$PATH:$HIVE_INSTALL/bin
    export CLASS_PATH=$CLASS_PATH:$HIVE_INSTALL/lib

    source /etc/profile    # make the changes take effect immediately
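Since hive-site.xml starts as a copy of a large template, it is easy to edit the wrong entry or misspell the file name. The following sketch prints the values the file actually carries for the metastore settings. The helper get_prop is hypothetical and not part of Hive. It is a line-oriented reader rather than a real XML parser, so it assumes each <name> and <value> element sits on a single line, and it reports only the first match. The default path assumes the installation directory used in this article.

```shell
#!/bin/sh
# Sketch: read selected property values out of hive-site.xml.
# get_prop is a hypothetical helper, not a Hive tool. It assumes that each
# <name>...</name> and <value>...</value> element fits on one line.

get_prop() {
  # $1 = property name, $2 = path to hive-site.xml
  awk -v want="$1" '
    /<name>/ {
      name = $0
      sub(/.*<name>[ \t]*/, "", name)
      sub(/[ \t]*<\/name>.*/, "", name)
    }
    /<value>/ && name == want {
      val = $0
      sub(/.*<value>[ \t]*/, "", val)
      sub(/[ \t]*<\/value>.*/, "", val)
      print val
      exit
    }
  ' "$2"
}

# Assumes the install location from this article when HIVE_INSTALL is unset.
SITE="${HIVE_INSTALL:-/usr/local/software/hive-0.14.0}/conf/hive-site.xml"

if [ -f "$SITE" ]; then
  # The password property is deliberately left out of the printout.
  for prop in javax.jdo.option.ConnectionURL \
              javax.jdo.option.ConnectionDriverName \
              javax.jdo.option.ConnectionUserName \
              hive.metastore.warehouse.dir; do
    printf '%s = %s\n' "$prop" "$(get_prop "$prop" "$SITE")"
  done
else
  echo "hive-site.xml not found at $SITE"
fi
```

If a printed value still shows the template default, the property was probably added a second time further down the file instead of being edited in place.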
4) Create the Hive directories on HDFS

    $HADOOP_HOME/bin/hadoop fs -ls /
    $HADOOP_HOME/bin/hadoop fs -mkdir /user/hive/warehouse
    $HADOOP_HOME/bin/hadoop fs -chmod g+w /tmp
    $HADOOP_HOME/bin/hadoop fs -chmod g+w /user/hive/warehouse

On Hadoop 2.x, add -p to the -mkdir command so that missing parent directories are created as well.
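The HDFS commands in step 4 can be wrapped in a small script that is safe to run more than once. This is a sketch under a few assumptions: it targets Hadoop 2.x or later (where -mkdir accepts -p), the function names run and setup_hive_dirs are made up here, and when no hadoop binary is found it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: create the HDFS directories Hive needs and make them group-writable.
# Assumes Hadoop 2.x or later, where "fs -mkdir -p" is available.

# Use $HADOOP_HOME/bin/hadoop when HADOOP_HOME is set, else plain "hadoop".
HADOOP_BIN="${HADOOP_HOME:+$HADOOP_HOME/bin/}hadoop"

# Run the command if the hadoop binary exists, otherwise just print it.
run() {
  if command -v "$HADOOP_BIN" >/dev/null 2>&1; then
    "$@"
  else
    echo "dry-run: $*"
  fi
}

setup_hive_dirs() {
  for dir in /tmp /user/hive/warehouse; do
    run "$HADOOP_BIN" fs -mkdir -p "$dir"    # -p: no error if it already exists
    run "$HADOOP_BIN" fs -chmod g+w "$dir"   # group write access, as in step 4
  done
}

setup_hive_dirs
```

Unlike the plain command list, this version also creates /tmp when it is missing, so the chmod on /tmp cannot fail on a fresh cluster.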