Hive Programming Guide (1)

Source: Internet
Author: User
Hive command:
$ hive -e "select * from mytable limit 1;"
OK
name1 1
name2 2
Time taken: 3.935 seconds
$ hive -e "select * from mytable limit 1;" > /tmp/myfile
$ cat /tmp/myfile
OK
name1 1
name2 2
Time taken: 3.935 seconds

Silent mode (the -S option suppresses the OK and Time taken lines):
$ hive -S -e "select * from mytable limit 1;" > /tmp/myfile
$ cat /tmp/myfile
name1 1
name2 2

Execute Hive queries from a file:
$ hive -f /tmp/queries.hql
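
When these invocations are driven from a script, the -e, -S, and -f options shown above can be assembled programmatically. The following is a minimal Python sketch, not part of Hive itself: the helper name build_hive_command is hypothetical, and it assumes the hive executable is on the PATH. It only builds the argument list and does not run Hive.

```python
import shlex


def build_hive_command(query=None, script_path=None, silent=False):
    """Build an argument list for the hive CLI (hypothetical helper).

    Exactly one of `query` (passed with -e) or `script_path` (passed with -f)
    must be given. `silent=True` adds -S, the silent mode option.
    """
    if (query is None) == (script_path is None):
        raise ValueError("pass exactly one of query or script_path")
    argv = ["hive"]  # assumption: the hive executable is on PATH
    if silent:
        argv.append("-S")
    if query is not None:
        argv += ["-e", query]
    else:
        argv += ["-f", script_path]
    return argv


# Build the silent-mode command from the example above.
silent_argv = build_hive_command(query="select * from mytable limit 1;", silent=True)

# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(silent_argv))
```

The resulting list could be handed to subprocess.run(silent_argv, capture_output=True, text=True) on a machine where Hive is installed.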

In the Hive shell, you can use the source command to execute a script file:
$ cat /tmp/queries.hql
select * from mytable limit 1;
$ hive
hive> source /tmp/queries.hql;
...

Run shell commands:
You can run simple bash shell commands without leaving the Hive CLI. Put an exclamation mark (!) before the command and end it with a semicolon (;).
hive> ! /bin/echo "hello, world!";
"hello, world!"

The Hive CLI does not support interactive commands that require user input, and it does not support shell pipes or file name globbing (wildcard expansion). For example, ! ls *.hql; looks for a file literally named *.hql rather than listing all files ending in .hql.

Use Hadoop's dfs commands in Hive
You can run hadoop dfs commands from the Hive CLI. Drop the leading keyword hadoop from the command and end it with a semicolon.
hive> dfs -ls /;
Found 2 items
drwxr-xr-x - root supergroup 0 2014-11-06 /flag
drwxr-xr-x - root supergroup 0 2014-11-07 /user

How to comment Hive scripts
A line that starts with -- is a comment. For example:
-- Copyright 2012
-- This is a test
select * from mytable limit 1;

Supported data types:
Data type    Length / description
tinyint      1-byte signed integer
smallint     2-byte signed integer
int          4-byte signed integer
bigint       8-byte signed integer
boolean      true or false
float        single-precision floating point number
double       double-precision floating point number
string       sequence of characters
timestamp    integer, floating point number, or string
binary       byte array

Timestamp
Integer: the number of seconds since the Unix epoch (1970-01-01 00:00:00 UTC)
Floating point number: the number of seconds since the Unix epoch, with nanosecond precision (9 digits after the decimal point)
String: YYYY-MM-DD hh:mm:ss.fffffffff
Timestamp values are interpreted as UTC times. Hive provides built-in functions for converting between time zones: to_utc_timestamp and from_utc_timestamp.
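
To make the three timestamp representations concrete, here is a small Python sketch that reads each form as the same kind of UTC instant. It is an illustration only, not Hive's implementation: the function names are hypothetical, Python datetimes keep microseconds rather than nanoseconds, and the fixed +8 hour offset stands in for a named time zone such as the ones from_utc_timestamp accepts.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_seconds(seconds):
    """Interpret an integer or float count of seconds since the epoch as UTC."""
    return UNIX_EPOCH + timedelta(seconds=seconds)


def from_timestamp_string(text):
    """Parse 'YYYY-MM-DD hh:mm:ss[.f...]' as a UTC time.

    Digits beyond the sixth fractional place are dropped, because Python
    datetimes only store microseconds.
    """
    date_part, _, fraction = text.partition(".")
    parsed = datetime.strptime(date_part, "%Y-%m-%d %H:%M:%S")
    micros = int((fraction + "000000")[:6]) if fraction else 0
    return parsed.replace(tzinfo=timezone.utc) + timedelta(microseconds=micros)


def to_wall_clock(ts_utc, utc_offset_hours):
    """Show a UTC instant as wall-clock time at a fixed offset (assumption)."""
    return ts_utc.astimezone(timezone(timedelta(hours=utc_offset_hours)))


as_integer = from_epoch_seconds(1415232000)
as_float = from_epoch_seconds(1415232000.5)
as_string = from_timestamp_string("2014-11-06 00:00:00.500000000")
local = to_wall_clock(as_integer, 8)

print(as_integer.isoformat())
print(local.isoformat())
```

The float and string values describe the same instant, half a second after the integer value; the shifted value is still the same instant as as_integer, only displayed with a different wall-clock time.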

A binary column can hold arbitrary bytes from the record, which prevents Hive from trying to parse them as numbers or strings.


Collection data types
1) struct
Similar to a struct in C or an object. Elements are accessed with the "." notation. For example, if the data type of a column is struct {first, last}, the first element can be referenced as field_name.first.
2) map
A collection of key-value pairs. For example, if a map contains the key-value pair 'first' -> 'name', you can access this element as field_name['first'].
3) array
An ordered sequence of elements of the same type, indexed from zero. For example, if the array value is ['name'], the first element can be accessed as field_name[0].
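
The three access styles have close analogues in most programming languages. The Python sketch below models one row with a struct, a map, and an array column; the column names and sample values are hypothetical and only illustrate the access paths, including the fact that array indexes start at zero.

```python
from typing import NamedTuple


class Name(NamedTuple):
    """Stands in for a Hive column of type struct<first:string, last:string>."""
    first: str
    last: str


# Hypothetical sample values for three columns of a single row.
row = {
    "name_struct": Name(first="John", last="Doe"),  # struct column
    "name_map": {"first": "name"},                  # map column
    "name_array": ["name"],                         # array column
}

first_from_struct = row["name_struct"].first  # like name_struct.first
first_from_map = row["name_map"]["first"]     # like name_map['first']
first_from_array = row["name_array"][0]       # like name_array[0]

print(first_from_struct, first_from_map, first_from_array)
```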


The following is an example of a statement that creates a table:
create table employee (
  name string,
  age tinyint,
  salary float,
  subordinates array<string>,
  address struct<country:string, province:string, city:string, street:string>
);

Default record and field delimiters in Hive
\n: separates records; each line is one record.
^A (Ctrl+A): separates fields (columns). Written as the octal escape \001 in a create table statement.
^B: separates the elements of an array or struct, or the key-value pairs of a map. Written as the octal escape \002 in a create table statement.
^C: separates the key from its value within a map key-value pair. Written as the octal escape \003 in a create table statement.
The preceding create table statement is equivalent to the following statement:
create table employee (
  name string,
  age tinyint,
  salary float,
  subordinates array<string>,
  address struct<country:string, province:string, city:string, street:string>
)
row format delimited
fields terminated by '\001'
collection items terminated by '\002'
map keys terminated by '\003'
lines terminated by '\n'
stored as textfile;
The row format delimited keywords must appear before the other clauses, with the exception of the stored as clause.
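
To see how these delimiters carve up a line of a text file, the following Python sketch splits one record of the employee table by hand. It is a simplified illustration rather than Hive's own reader: the function names and sample values are hypothetical, and it ignores null markers and nested collections.

```python
FIELD_SEP = "\x01"       # ^A, written '\001' in a create table statement
COLLECTION_SEP = "\x02"  # ^B, written '\002'
MAP_KEY_SEP = "\x03"     # ^C, written '\003'

ADDRESS_FIELDS = ("country", "province", "city", "street")


def parse_employee_line(line):
    """Split one text record into the five columns of the employee table."""
    name, age, salary, subordinates, address = line.rstrip("\n").split(FIELD_SEP)
    return {
        "name": name,
        "age": int(age),
        "salary": float(salary),
        # Array elements are separated by ^B.
        "subordinates": subordinates.split(COLLECTION_SEP) if subordinates else [],
        # Struct members are also separated by ^B, in declaration order.
        "address": dict(zip(ADDRESS_FIELDS, address.split(COLLECTION_SEP))),
    }


def parse_map(text):
    """Split a map column: entries are separated by ^B, key from value by ^C."""
    if not text:
        return {}
    return dict(entry.split(MAP_KEY_SEP, 1) for entry in text.split(COLLECTION_SEP))


# Hypothetical sample record, assembled with the default delimiters.
sample_line = FIELD_SEP.join([
    "Alice",
    "30",
    "5000.0",
    COLLECTION_SEP.join(["Bob", "Carol"]),
    COLLECTION_SEP.join(["China", "Zhejiang", "Hangzhou", "1 Main St"]),
]) + "\n"

employee = parse_employee_line(sample_line)
print(employee["subordinates"], employee["address"]["city"])
```

The employee table has no map column, so parse_map is shown separately to illustrate the role of ^C.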

HiveQL: Data Definition
HiveQL is the Hive query language. It does not fully conform to any revision of the ANSI SQL standard. In the releases this guide covers, Hive does not support row-level insert, update, and delete operations, and it does not support transactions either.

In Hive, a database is essentially a catalog or namespace of tables. On large clusters with many teams and users, databases help avoid table name collisions.
If you do not explicitly specify a database, the default database, named default, is used.
The following shows how to create a database:
hive> create database financial;

If the database financial already exists, an error is raised. Use the following statement to avoid the error in that case:
hive> create database if not exists financial;

You can use the following statement to list the databases in Hive:
hive> show databases;
default
financial

If there are many databases, you can use a regular expression to filter the database names:
hive> show databases like 'f.*';
financial
The preceding example lists the databases whose names start with f.
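
The effect of the filter can be reproduced outside Hive. The Python sketch below assumes, as the text describes, that the pattern is treated as a regular expression that must match the whole name; Hive's own pattern matching may differ in detail, and the helper name filter_names is hypothetical.

```python
import re


def filter_names(names, pattern):
    """Return the names that the pattern matches in full (case-sensitive)."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


databases = ["default", "financial"]
matching = filter_names(databases, "f.*")
print(matching)
```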


Hive creates a directory for each database. The tables in a database are stored in subdirectories of the database directory. The only exception is the tables in the default database, because that database does not have its own directory.
The database directory is created under the top-level directory specified by the property hive.metastore.warehouse.dir. Assuming the user has set this property to /user/hive/warehouse, when you create the database financial, Hive creates the corresponding directory /user/hive/warehouse/financial.db.
Note that the database directory name ends with .db.
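
The directory layout described above can be written down as a small path rule. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, the warehouse directory is the /user/hive/warehouse value used in the text, and each table is assumed to live in a subdirectory named after the table, with no custom location.

```python
import posixpath

WAREHOUSE_DIR = "/user/hive/warehouse"  # assumed value of hive.metastore.warehouse.dir


def database_directory(database, warehouse_dir=WAREHOUSE_DIR):
    """Return the directory that holds a database's tables."""
    if database == "default":
        # The default database has no directory of its own.
        return warehouse_dir
    return posixpath.join(warehouse_dir, database + ".db")


def table_directory(database, table, warehouse_dir=WAREHOUSE_DIR):
    """Return the directory of a table, assuming no custom location."""
    return posixpath.join(database_directory(database, warehouse_dir), table)


print(database_directory("financial"))
print(table_directory("financial", "employee"))
print(table_directory("default", "mytable"))
```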
You can override the default location when you create the database:
hive> create database financial location '/user/hive/mywarehouse';
You can also add a description to the database. The statement is as follows:
hive> create database financial comment 'Holds all financial tables';
You can use the following statement to view the database description:
hive> describe database financial;
financial Holds all financial tables hdfs://master-server/user/hive/warehouse/financial.db
You can use the following statements to switch databases:
hive> use financial;
...
hive> use default;
...

You can drop a database:
hive> drop database if exists financial;

By default, Hive does not allow you to drop a database that contains tables. Either drop all the tables in the database first and then drop the database, or append the keyword cascade to the command so that Hive drops the tables before dropping the database:
hive> drop database if exists financial cascade;

Modify a database:
You can use the alter database command to set key-value pairs in a database's dbproperties, which describe the database. No other database metadata can be changed, including the database name and directory location:
hive> alter database financial set dbproperties ('created by' = 'Aaron');

Create a table:
The create table statement follows SQL syntax conventions. For example:
create table if not exists financial.employee (
  name string,
  age tinyint,
  salary float,
  subordinates array<string>,
  address struct<country:string, province:string, city:string, street:string>
);

List the tables in a database:
hive> use financial;
hive> show tables;
employee

Even when financial is not the current database, you can list its tables:
hive> use default;
hive> show tables in financial;
employee

Similarly, you can use a regular expression to filter the table names:
hive> use financial;
hive> show tables like 'emp.*';
employee

When reprinting, please credit the source: http://blog.csdn.net/iAm333
