Compared with many tutorials, Hive has introduced concepts first. I like to install them first, and then use examples to introduce concepts. Install Hive first. Check whether the corresponding yum source has been installed. If the yum source blog. csdn. netnsrainbowarticledetails42429339hive is not installed according to the yum source file written in this tutorial
Compared with many tutorials,
I. Installation of Hive
The components are arranged as follows:
172.16.57.75 bd-ops-test-75 mysql-server
172.16.57.77 bd-ops-test-77 Hiveserver2
1. Install Hive
Install the Hive on 77:
# Yum Install hive Hive-metastore
Label: First, an overview of the task map: The process is to first delete the files on HDFs with Thdfsdelete, then import the data from the organization tables in Oracle into HDFS, establish hive connection-"Hive Build Table-" Tjava Get system Time-" Thiveload Import the files on HDFs into the hive table. The settings for each of these components are described b
Hive Installation Deployment(Installation will have version issue hadoop1.0 version above please install hive-0.90 testhadoop2.0 above Please install hive-0.12.0 or the latest version of the test)Hive-0.9.0:http://pan.baidu.com/s/1rj6f8hive-0.12.0:http://mirrors.hust.edu.cn/apache/
[A "," B "," C "]
3
[D "," E "," F "]
4
[D "," E "," F "]
Add a lateral view:
Select mycol1, mycol2 from basetable lateral view explode (col1) mytable1 as mycol1 lateral view explode (col2) mytable2 as mycol2;
The execution result is:
Int mycol1
String mycol2
1
""
1
"B"
1
"C"
2
""
2
"B"
2
"C"
3
"D"
Union syntax
Select_statement Union all select_statement...
Union is used to combine the result sets of multiple select statements into an independent result set. Currently, only union all (BAG Union) is supported ). Duplicate rows are not eliminated. The number and name of columns returned by each select statement must be the same. Otherwise, a syntax error is thrown.
If some extra processing is required for the Union result, the entire statement can be embedded in the from clause, as
Label: Start Thriftserver in spark after modification and then connect in beeline mode under Spark's Bin or write a. sh file every time you execute it directly . sh file contents such as:./beeline-u jdbc:hive2://yangsy132:10000/default-n root-p YangsiyiSpark on Hive configures Hive's metastore to MySQL
2.3 Hive Internal Introduction: P44The jar file under the $HIVE _home/lib is a specific functional part; (CLI module) Other components, Thrift services, remote access to other process features, and the ability to access hive using JDBC and ODBC; all hive clients need a Metastore Service (meta-data Service), which is us
Label: Environment:hadoop2.2.0 hive0.13.1 Ubuntu 14.04 LTS java version "1.7.0_60"oracle10g * * * Welcome reprint. Please indicate source * * * http://blog.csdn.net/u010967382/article/details/38709751Download the installation package at the following addressHttp://mirrors.cnnic.cn/apache/hive/stable/apache-hive-0.13.1-bin.tar.gz The installation package is extracted to the server/home/fulong/
First, the installation mode introduction:Hive official on-line introduces 3 kinds of hive installation methods, corresponding to different application scenarios. 1, in-line mode (meta data to protect the village in the embedded derby species, allow a session link, try multiple session links will be error) 2. Local mode (install MySQL locally instead of Derby store metadata) 3. Remote mode (remotely install MySQL instead of Derby storage metadata)
Alex's Hadoop cainiao Tutorial: tutorial 10th Hive getting started, hadoophiveInstall Hive
Compared to many tutorials, I first introduced concepts. I like to install them first, and then use examples to introduce them. Install Hive first.
First confirm whether the corresponding yum source has been installed, if not as written in this tutorial install cdh yum sour
[Spark] [Hive] [Python] [SQL] A small example of Spark reading a hive table$ cat Customers.txt1Alius2Bsbca3Carlsmx$ hiveHive>> CREATE TABLE IF not EXISTS customers (> cust_id String,> Name string,> Country String>)> ROW FORMAT delimited fields TERMINATED by ' \ t ';hive> Load Data local inpath '/home/training/customers.txt ' into table customers;
How to configure remote MetaStore in hive:
1) Configure hive to use local MySQL to store MetaStore (server a 111.121.21.23) (Remote MySQL storage can also be used)
2) After the configuration is complete, start the service bin/hive -- service MetaStore (default listening port: 9083) on server)
3) configure the hive clie
UNARCHIVE PARTITION(ds=‘2008-04-08‘, hr=‘12‘)
Cautions and limitations warnings and restrictions
In some older versions of Hadoop, HAR had a few bugs the could cause data loss or other errors. Be sure that these patches is integrated into your version of Hadoop:
https://issues.apache.org/jira/browse/HADOOP-6591 (fixed in HADOOP 0.21.0)https://issues.apache.org/jira/browse/MAPREDUCE-1548 (fixed in Hadoop 0.22.0)https://issues.
First, control the number of maps in the hive task:1. Typically, the job produces one or more map tasks through the directory of input.The main determinants are: The total number of input files, the file size of input, the size of the file block set by the cluster (currently 128M, can be set dfs.block.size in hive; command to see, this parameter can not be customized modification);2. For example:A) Assuming
By default, NULL is saved as \ n in the hive table, and you can view the table's source file (Hadoop fs-cat or Hadoop fs-text), where a large amount of \ n is stored in the file,
resulting in a lot of wasted space. And in Java, Python directly into the path to manipulate the source data, the resolution should also be noted. In
addition, in the source file of the hive table, the default column delimiter i
Directory structure
Hadoop cluster (CDH4) practice (0) PrefaceHadoop cluster (CDH4) Practice (1) Hadoop (HDFS) buildHadoop cluster (CDH4) Practice (2) Hbasezookeeper buildHadoop cluster (CDH4) Practice (3) Hive BuildHadoop cluster (CHD4) Practice (4) Oozie build
Hadoop cluster (CDH4) practice (0) Preface
During my time as a beginner of Hadoop, I wrote a series of introductory Hadoop articles, the first of which is "Hadoop cluster practice (0) Compl
The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion;
products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the
content of the page makes you feel confusing, please write us an email, we will handle the problem
within 5 days after receiving your email.
If you find any instances of plagiarism from the community, please send an email to:
info-contact@alibabacloud.com
and provide relevant evidence. A staff member will contact you within 5 working days.