Hive Command-Line Interface


The command-line interface (CLI) is the most common way to interact with Hive. Using the CLI, users can create tables, inspect schemas, query tables, and so on.

CLI Options

The following command shows a list of options provided by the CLI:

[hadoop@localhost hive]$ hive --help --service cli
usage: hive
 -d,--define <key=value>          Variable substitution to apply to hive
                                  commands. e.g. -d a=b or --define a=b
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable substitution to apply to hive
                                  commands. e.g. --hivevar a=b
 -i <filename>                    Initialization SQL file
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)

Variables and Properties

The option --define key=value is equivalent to --hivevar key=value. Both let users define custom variables on the command line that can then be referenced in Hive scripts to suit different situations. This feature is supported only in Hive v0.8.0 and later.

When this feature is used, Hive places the key-value pairs in the hivevar namespace, which keeps them distinct from the other three built-in namespaces (hiveconf, system, and env).

Namespace   Access       Description
hivevar     Read/write   User-defined variables (Hive v0.8.0 and later)
hiveconf    Read/write   Hive-specific configuration properties
system      Read/write   Configuration properties defined by Java
env         Read-only    Environment variables defined by the shell environment (for example, bash)

Hive variables are stored internally as Java strings. Users can reference variables in queries; Hive replaces each variable reference with the variable's value before it submits the query to the query processor.

In the CLI, the SET command displays or modifies the value of a variable. For example:

hive> set env:HOME;
env:HOME=/home/hadoop

Entering SET on its own prints all variables in the hivevar, hiveconf, system, and env namespaces. With the -v option (set -v), it also prints all properties defined by Hadoop, such as those that control HDFS and MapReduce.

The SET command can also assign a new value to a variable. Let's look specifically at the hivevar namespace and at how to define a variable from the command line:

[hadoop@localhost hive]$ hive --define foo=bar

hive> set foo;
foo=bar

hive> set hivevar:foo;
hivevar:foo=bar

hive> set hivevar:foo=bar2;

hive> set foo;
foo=bar2

hive> set hivevar:foo;
hivevar:foo=bar2

As we can see, the hivevar: prefix is optional. The --hivevar flag and the --define flag are the same.

Variable references in queries entered in the CLI are replaced before the query is submitted to the query processor. Consider the following CLI session:

hive> CREATE TABLE toss1 (i int, ${hivevar:foo} string);
OK
Time taken: 1.94 seconds
hive> DESCRIBE toss1;
OK
i                       int
bar2                    string
Time taken: 0.356 seconds, Fetched: 2 row(s)
hive> CREATE TABLE toss2 (i2 int, ${foo} string);
OK
Time taken: 0.285 seconds
hive> DESCRIBE toss2;
OK
i2                      int
bar2                    string
Time taken: 0.05 seconds, Fetched: 2 row(s)
hive> DROP TABLE toss1;
OK
Time taken: 0.817 seconds
hive> DROP TABLE toss2;
OK
Time taken: 0.162 seconds
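
The replacement step in the session above can be pictured as plain text substitution. The following shell sketch imitates it with sed. It is only an illustration, not Hive's actual implementation: the helper name substitute_hivevar is hypothetical, and the sketch assumes the variable value contains no characters that are special to sed (such as / or &).

```shell
#!/bin/sh
# Illustration only: mimic how a ${hivevar:NAME} or ${NAME} reference is
# replaced by plain text before the query is parsed.
# substitute_hivevar is a hypothetical helper, not part of Hive.
substitute_hivevar() {
    name=$1    # variable name, e.g. foo
    value=$2   # variable value, e.g. bar2 (assumed free of / and &)
    query=$3   # query text containing the reference
    printf '%s\n' "$query" |
        sed -e "s/\${hivevar:$name}/$value/g" -e "s/\${$name}/$value/g"
}

# Single quotes stop the shell itself from expanding ${...}.
substitute_hivevar foo bar2 'CREATE TABLE toss1 (i int, ${hivevar:foo} string);'
# prints: CREATE TABLE toss1 (i int, bar2 string);
```

Because the substitution is purely textual, the same variable can stand in for a column name, a table name, or a literal value.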

Now let's look at the --hiveconf option, which can set any property that configures Hive's behavior. Here we use it to set the hive.cli.print.current.db property. Turning this property on prints the current database name in the CLI prompt; the default database is named default. The default value of this property is false.

[hadoop@localhost hive]$ hive --hiveconf hive.cli.print.current.db=true

hive (default)> set hive.cli.print.current.db;
hive.cli.print.current.db=true

hive (default)> set hiveconf:hive.cli.print.current.db;
hiveconf:hive.cli.print.current.db=true

hive (default)> set hiveconf:hive.cli.print.current.db=false;

hive> set hiveconf:hive.cli.print.current.db=true;

We can even add a new hiveconf property:

[hadoop@localhost hive]$ hive --hiveconf y=5

hive> set y;
y=5

hive> CREATE TABLE whatsit (i int);
OK
Time taken: 0.883 seconds

hive> INSERT INTO TABLE whatsit VALUES (5);
OK
Time taken: 3.206 seconds

hive> SELECT * FROM whatsit WHERE i = ${hiveconf:y};
OK
5
Time taken: 0.127 seconds, Fetched: 1 row(s)

We also need to know about the system namespace, which gives read-write access to Java system properties, and the env namespace, which gives read-only access to environment variables:

hive> set system:user.name;
system:user.name=hadoop

hive> set system:user.name=yourusername;

hive> set system:user.name;
system:user.name=yourusername

hive> set env:HOME;
env:HOME=/home/hadoop

hive> set env:HOME=/home;
env:* variables can not be set.
Query returned non-zero code: 1, cause: null

Unlike hivevar variables, system properties and environment variables must always be referenced with the system: and env: prefixes.

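The env namespace offers an alternative way to pass variables to Hive. Consider the following example:

[hadoop@localhost hive]$ export YEAR=2012
[hadoop@localhost hive]$ hive -e 'SELECT * FROM mytable WHERE year = ${env:YEAR}'

The variable is exported so that it is part of the Hive process's environment, and the query is single-quoted so that the shell passes ${env:YEAR} to Hive unchanged. The query processor then sees the literal value 2012 in the WHERE clause.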
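
The env namespace reflects the environment of the Hive process, so a plain shell variable has to be exported (or set on the same command line) before Hive can see it. The sketch below demonstrates that rule with an ordinary child shell standing in for hive; child_sees_year is a hypothetical helper written for this demonstration.

```shell
#!/bin/sh
# A child process (standing in for hive here) sees only exported variables.
# child_sees_year is a hypothetical helper for this demonstration.
child_sees_year() {
    sh -c 'printf "%s\n" "${YEAR:-<not set>}"'
}

unset YEAR
YEAR=2012            # plain shell variable: not passed to child processes
child_sees_year      # prints: <not set>

export YEAR          # now part of the environment
child_sees_year      # prints: 2012

# With Hive installed, the exported value would be reachable as ${env:YEAR}:
#   hive -e 'SELECT * FROM mytable WHERE year = ${env:YEAR}'
```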

Tip: All of Hive's built-in properties are listed in $HIVE_HOME/conf/hive-default.xml.template, which is a "sample" configuration file. The file also documents the default value of each property.

Hive "One-Shot" Commands

Users sometimes want to run one or more queries (separated by semicolons) and have the Hive CLI exit immediately afterward. The CLI supports this through the -e option. If the table mytable has a string field and an integer field, the output looks like this:

[hadoop@localhost hive]$ hive -e "SELECT * FROM mytable LIMIT 2"
OK
name1   10
name2   20
Time taken: 1.088 seconds, Fetched: 2 row(s)

This is a quick and convenient way to save query results to a file. Adding the -S option turns on silent mode, which removes lines such as "OK" and "Time taken" from the output, along with other inessential messages, as in the following example:

[hadoop@localhost hive]$ hive -S -e "SELECT * FROM mytable LIMIT 2"
name1   10
name2   20
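
Silent mode pairs naturally with output redirection. The following is a minimal sketch of a wrapper that saves a query's rows to a file. The function name save_query and the HIVE_BIN override are assumptions made for this example; the override exists only so the sketch can be tried without a cluster.

```shell
#!/bin/sh
# save_query QUERY OUTFILE
# Run a one-shot query in silent mode and write its rows to OUTFILE.
# HIVE_BIN is a hypothetical override; it defaults to the hive found on PATH.
save_query() {
    query=$1
    outfile=$2
    "${HIVE_BIN:-hive}" -S -e "$query" > "$outfile"
}

# Example (requires a working Hive installation and a table named mytable):
#   save_query "SELECT * FROM mytable LIMIT 2" /tmp/myquery
#   cat /tmp/myquery
```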

Finally, here is a handy trick for finding a property whose name you only partly remember, without scrolling through the full output of the SET command. Suppose you do not remember which property specifies the warehouse path for managed tables. You can find it with the following command:

[hadoop@localhost hive]$ hive -S -e "set" | grep warehouse
hive.metastore.warehouse.dir=/user/hive/warehouse
hive.warehouse.subdir.inherit.perms=true
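
A plain grep also matches the pattern inside property values. As a variation, the sketch below matches only against the property name (the text before the first =) and ignores case. The function name find_property is hypothetical, and the sketch assumes the input consists of name=value lines like those shown above.

```shell
#!/bin/sh
# find_property PATTERN
# Read "name=value" lines on stdin and print those whose property NAME
# (the part before the first "=") contains PATTERN, ignoring case.
# find_property is a hypothetical helper, not a Hive command.
find_property() {
    awk -F= -v pat="$1" 'index(tolower($1), tolower(pat)) > 0'
}

# Example (requires Hive):
#   hive -S -e "set" | find_property warehouse
```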

Executing Hive Queries from Files

You can use the -f option, followed by a file name, to execute one or more queries stored in that file. By convention, Hive query files are saved with the .q or .hql extension.

[hadoop@localhost hive]$ hive -f query.hql
OK
name1   10
name2   20
name3
Time taken: 1.124 seconds, Fetched: 3 row(s)

[hadoop@localhost hive]$ cat query.hql
SELECT * FROM mytable LIMIT 3;

Inside the Hive shell, users can run a script file with the SOURCE command. For example:

hive> SOURCE query.hql;
OK
name1   10
name2   20
name3
Time taken: 0.04 seconds, Fetched: 3 row(s)
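
Script files become more reusable when they are combined with the --hivevar option described earlier. The sketch below wraps hive -f so that each key=value argument is passed as a --hivevar definition. The function name run_hql and the HIVE_BIN override are assumptions made for this example.

```shell
#!/bin/sh
# run_hql SCRIPT [key=value ...]
# Run SCRIPT with hive -f, turning every key=value argument into a
# "--hivevar key=value" option. run_hql and HIVE_BIN are hypothetical names.
run_hql() {
    script=$1
    shift
    n=$#
    # Rotate through the arguments, re-adding each one with its flag.
    while [ "$n" -gt 0 ]; do
        set -- "$@" --hivevar "$1"
        shift
        n=$((n - 1))
    done
    "${HIVE_BIN:-hive}" "$@" -f "$script"
}

# Example (requires Hive), for a script that contains
#   SELECT * FROM ${hivevar:tbl} LIMIT ${hivevar:n};
# run it as:
#   run_hql query.hql tbl=mytable n=3
```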

The .hiverc File

The -i option, followed by a file name, lets the user specify a file that the CLI executes at startup, before the prompt appears. Hive also automatically looks for a file named .hiverc in the user's home directory and, if the file exists, executes the commands in it.

This file is convenient for commands that need to be run frequently, such as setting system properties or adding Java archives (JAR files) of custom Hive extensions to Hadoop's distributed cache.

The following example shows the contents of a typical $HOME/.hiverc file:

ADD JAR /path/to/custom_hive_extensions.jar;
set hive.cli.print.current.db=true;
set hive.exec.mode.local.auto=true;

The first line adds a JAR file to Hadoop's distributed cache. The second line modifies the CLI prompt to show the current working database. The last line "encourages" Hive to run in local mode whenever it can, even when Hadoop is running in distributed or pseudo-distributed mode, which speeds up queries over small data sets.

Warning: An easy mistake to make is forgetting the semicolon at the end of each line. If you do, the property's value will include all the text from the following lines until a semicolon is found.
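
A quick check can catch this mistake before Hive starts. The sketch below flags lines in a .hiverc file that do not end in a semicolon. The function name check_hiverc is hypothetical, and the check assumes one statement per line, so a statement deliberately split across several lines would be reported as well.

```shell
#!/bin/sh
# check_hiverc FILE
# Print "LINE_NUMBER: text" for every non-blank, non-comment line that does
# not end in a semicolon. Assumes one statement per line (a simplification).
# check_hiverc is a hypothetical helper, not a Hive command.
check_hiverc() {
    awk '
        /^[ \t]*$/   { next }                           # skip blank lines
        /^[ \t]*--/  { next }                           # skip comment lines
        !/;[ \t]*$/  { printf "%d: %s\n", NR, $0 }      # missing semicolon
    ' "$1"
}

# Example:
#   check_hiverc "$HOME/.hiverc"
```

No output means every statement line ends with a semicolon.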

More on Using the Hive CLI

The CLI supports some other useful features.

Auto-complete function

If the user presses the Tab key while typing, the CLI autocompletes possible keywords and function names. For example, if you type SELE and then press Tab, the CLI completes the word to SELECT.

Command History

Users can use the up and down arrow keys to scroll through previous commands. Each previous line of input is shown separately; the CLI does not combine multiline commands and queries into a single history entry. Hive saves the most recent 10,000 lines of commands to the file $HOME/.hivehistory.
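
Since the history is kept in a file, it can also be searched from the shell. The following sketch assumes the history file is plain text with one entered line per row; the function name hive_history_grep is hypothetical.

```shell
#!/bin/sh
# hive_history_grep PATTERN [FILE]
# Show the last five history lines that contain PATTERN (case-insensitive),
# each prefixed with its line number in the history file.
# FILE defaults to $HOME/.hivehistory. The function name is hypothetical.
hive_history_grep() {
    pattern=$1
    file=${2:-$HOME/.hivehistory}
    grep -n -i -- "$pattern" "$file" | tail -n 5
}

# Example:
#   hive_history_grep "create table"
```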

To rerun a previous command, simply scroll to it and press Enter. To modify the line before running it, use the left and right arrow keys to move the cursor to the place that needs changing and edit it there. After editing, you can press Enter to submit the command without moving the cursor to the end of the line.

Executing Shell Commands

Users do not need to exit the Hive CLI to run simple bash shell commands. Just type ! followed by the command, and end it with a semicolon (;):

hive> ! /bin/echo "what up dog";
"what up dog"

hive> ! pwd;
/home/hadoop/hive

Interactive commands that require user input do not work in the Hive CLI, and shell pipes and file name globbing are not supported. For example, ! ls *.hql; looks for a file named *.hql rather than listing all files that end in .hql.

Hadoop dfs Commands from Inside Hive

Users can run Hadoop dfs commands from within the Hive CLI by dropping the hdfs keyword from the command and ending it with a semicolon:

hive> dfs -ls /;
Found 4 items
drwxr-xr-x   - hadoop supergroup          0 2016-01-26 16:09 /hbase
drwxrwxr-x   - hadoop supergroup          0 2016-01-26 15:31 /tmp
drwxr-xr-x   - hadoop supergroup          0 2016-01-26 15:13 /user
drwxr-xr-x   - hadoop supergroup          0 2016-01-26 15:47 /zookeeper

Running Hadoop commands this way is actually more efficient than running the equivalent hdfs dfs ... command in the bash shell, because the latter starts a new JVM instance each time, whereas Hive runs the command in its existing process.

Comments in Hive Scripts

Lines that begin with -- are treated as comments. For example:

-- Copyright (c)
-- This is a Hive script

SELECT * FROM mytable;

Showing Field Names

We can have the CLI print field names (this feature is off by default) by setting the hiveconf property hive.cli.print.header to true:

hive> set hive.cli.print.header=true;

hive> SELECT * FROM mytable;
OK
mytable.str     mytable.i
name1   10
name2   20
name3
name4
Time taken: 1.367 seconds, Fetched: 4 row(s)

If you always want to see field names, add that set command to your $HOME/.hiverc file.
