Hive [1]: Introduction and Installation

Source: Internet
Author: User

This article assumes that Hadoop, Java, and a MySQL database have already been installed and configured, and that the environment variables are set up.
I. Basic Introduction to Hive
Hive is a data warehouse product in the Hadoop family. Its most notable feature is a SQL-like syntax that encapsulates the underlying MapReduce process, so that business personnel who know SQL can also work with big data on Hadoop directly. This removes a bottleneck that data analysts previously faced in big data analysis.

Hive originated at Facebook. It makes it possible to run SQL queries on Hadoop, so that non-programmers can use it conveniently. Hive is a Hadoop-based data warehouse tool that maps structured data files to database tables and provides SQL-like query functionality by converting SQL statements into MapReduce tasks.

Hive is a data warehouse infrastructure built on Hadoop. It provides a set of tools for data extraction, transformation, and loading (ETL), and a mechanism for storing, querying, and analyzing large-scale data stored in Hadoop. Hive defines a simple SQL-like query language called HQL, which allows users familiar with SQL to query data. The language also allows developers familiar with MapReduce to plug in custom mappers and reducers for complex analysis tasks that the built-in mappers and reducers cannot handle.
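To make the idea of HQL more concrete, here is a minimal sketch that writes a small query to a file and passes it to the `hive` command-line client when one is on the PATH. The table `page_views` and its columns are hypothetical and do not come from this article; the file location is likewise only an assumption.

```shell
#!/bin/sh
# Hypothetical table and column names, used only for illustration.
HQL_FILE="${TMPDIR:-/tmp}/top_pages.hql"

# Save an HQL query; Hive turns a statement like this into one or more MapReduce jobs.
cat > "$HQL_FILE" <<'EOF'
SELECT page_url, COUNT(*) AS views
FROM page_views
GROUP BY page_url
ORDER BY views DESC
LIMIT 10;
EOF

# Run the file only when a hive client is actually installed.
if command -v hive >/dev/null 2>&1; then
    hive -f "$HQL_FILE"
else
    echo "hive not found on PATH; query saved to $HQL_FILE"
fi
```

The query reads like ordinary SQL, which is the point: nobody writing it needs to know how the grouping and sorting are spread across the cluster.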

Advantage: Hive is best suited to data warehouse applications. It can maintain massive amounts of data and mine that data to produce insights and reports. It is easy to get started with only a basic understanding of SQL syntax.

Disadvantage: Hive does not support record-level update, insert, or delete operations. This constraint comes from the design of Hadoop and HDFS and limits what Hive can do. However, you can generate a new table from a query or write query results out to a file. Hive also does not support transactions.
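The two workarounds mentioned above (building a new table from a query, and writing query results to files) might look roughly like the following sketch. The tables `orders` and `orders_clean`, the `status` column, and the export directory are all hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical tables (orders, orders_clean) and paths, for illustration only.
HQL_FILE="${TMPDIR:-/tmp}/rebuild_orders.hql"

cat > "$HQL_FILE" <<'EOF'
-- Instead of deleting rows in place, build a new table without them.
CREATE TABLE orders_clean AS
SELECT * FROM orders WHERE status <> 'cancelled';

-- Write the query result to files in a local directory.
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/orders_export'
SELECT * FROM orders_clean;
EOF

# Run the statements only when a hive client is actually installed.
if command -v hive >/dev/null 2>&1; then
    hive -f "$HQL_FILE"
else
    echo "hive not found on PATH; statements saved to $HQL_FILE"
fi
```

Both statements rewrite data in bulk rather than touching individual records, which fits the append-oriented design of HDFS described above.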

Graphical management tools for Hive are also available, including from commercial vendors, for example Cloudera's Hue project (https://github.com/cloudera/hue).

[Figure: Hive functional modules]

II. Install Hive

1) Download the latest stable version of Hive:

    wget http://mirror.bit.edu.cn/apache/hive/stable/apache-hive-0.14.0-bin.tar.gz    # get the latest stable release
    tar -zxvf apache-hive-0.14.0-bin.tar.gz              # decompress
    cp -r apache-hive-0.14.0-bin /usr/local/software/    # copy to the installation directory /usr/local/software
    cd /usr/local/software
    mv apache-hive-0.14.0-bin hive-0.14.0                # rename

2) Configure Hive:

    cd hive-0.14.0/conf
    cp hive-default.xml.template hive-site.xml
    cp hive-log4j.properties.template hive-log4j.properties

Modify the configuration file so that Hive metadata is stored in MySQL:

    vi hive-site.xml

    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://192.168.128.129:3306/hive_metadata?createDatabaseIfNotExist=true</value>
      <description>JDBC connect string for a JDBC metastore</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
      <description>Driver class name for a JDBC metastore</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>root</value>
      <description>username to use against metastore database</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>911</value>
      <description>password to use against metastore database</description>
    </property>
    <property>
      <name>hive.metastore.warehouse.dir</name>
      <value>/user/hive/warehouse</value>
      <description>location of default database for the warehouse</description>
    </property>

Modify hive-log4j.properties by commenting out the old EventCounter line and adding the new one:

    #log4j.appender.EventCounter=org.apache.hadoop.metrics.jvm.EventCounter
    log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter

3) Set the environment variables:

    vi /etc/profile

    export HIVE_INSTALL=/usr/local/software/hive-0.14.0
    export PATH=$PATH:$HIVE_INSTALL/bin
    export CLASS_PATH=$CLASS_PATH:$HIVE_INSTALL/lib

    source /etc/profile    # make the change take effect immediately

4) Create the directories on HDFS (on Hadoop 2.x, add -p to -mkdir if /user/hive does not exist yet):

    $HADOOP_HOME/bin/hadoop fs -ls /
    $HADOOP_HOME/bin/hadoop fs -mkdir /user/hive/warehouse
    $HADOOP_HOME/bin/hadoop fs -chmod g+w /tmp
    $HADOOP_HOME/bin/hadoop fs -chmod g+w /user/hive/warehouse
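As an alternative to hand-editing the copied template, the metastore settings from step 2 could be generated by a small script. The following is only a sketch under stated assumptions: the variable names (OUT_DIR, DB_HOST, DB_NAME, DB_USER, DB_PASS) and the helper `emit_property` are made up for illustration, the defaults are the example values used in this article, and the output goes to a scratch directory rather than to the real conf directory. It relies on hive-site.xml only needing the properties that override Hive's built-in defaults.

```shell
#!/bin/sh
# Sketch: generate a minimal hive-site.xml for a MySQL-backed metastore.
# Variable names are hypothetical; defaults are this article's example values.
OUT_DIR="${OUT_DIR:-${TMPDIR:-/tmp}/hive-conf-demo}"
DB_HOST="${DB_HOST:-192.168.128.129}"
DB_NAME="${DB_NAME:-hive_metadata}"
DB_USER="${DB_USER:-root}"
DB_PASS="${DB_PASS:-911}"

mkdir -p "$OUT_DIR"

# emit_property NAME VALUE: print one <property> element.
# Values are not XML-escaped, so they must not contain &, < or >.
emit_property() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

{
    echo '<?xml version="1.0"?>'
    echo '<configuration>'
    emit_property javax.jdo.option.ConnectionURL \
        "jdbc:mysql://${DB_HOST}:3306/${DB_NAME}?createDatabaseIfNotExist=true"
    emit_property javax.jdo.option.ConnectionDriverName com.mysql.jdbc.Driver
    emit_property javax.jdo.option.ConnectionUserName "$DB_USER"
    emit_property javax.jdo.option.ConnectionPassword "$DB_PASS"
    emit_property hive.metastore.warehouse.dir /user/hive/warehouse
    echo '</configuration>'
} > "$OUT_DIR/hive-site.xml"

echo "wrote $OUT_DIR/hive-site.xml"
```

After reviewing the generated file, it can be copied into hive-0.14.0/conf. Overriding a value for one run, for example `DB_HOST=10.0.0.5 sh gen-hive-site.sh` (the script name is hypothetical), keeps the password and host out of the script itself.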
