One. Hive command line
1. Commands supported by the Hive CLI
Command                    Description
quit, exit                 Use quit or exit to leave the interactive shell.
set key=value              Sets the value of a particular configuration variable. Note that if you misspell the variable name, the CLI does not show an error.
set                        Prints the list of configuration variables that are overridden by the user or by Hive.
set -v                     Prints all Hadoop and Hive configuration variables.
add FILE [file] [file]*    Adds one or more files to the list of resources.
add JAR [jarname]          Adds a JAR file to the list of resources.
list FILE                  Lists all the files added to the distributed cache.
list FILE [file]*          Checks whether the given resources are already added to the distributed cache.
! [cmd]                    Executes a shell command from the Hive shell.
dfs [dfs cmd]              Executes a dfs command from the Hive shell.
[query]                    Executes a Hive query and prints the results to standard output.
source [file]              Executes a script file inside the CLI.
2. Command syntax
hive [-hiveconf x=y]* [<-i filename>]* [<-f filename>|<-e query-string>] [-S]
Description
1. -i <filename>: runs the HQL in the given file as initialization
2. -e <query-string>: executes the specified HQL from the command line
3. -f <filename>: executes the HQL script in the given file
4. -v: verbose mode; echoes the executed HQL statements to the console
5. -p <port>: connects to the Hive Server on the given port number
6. -hiveconf x=y: sets a Hive/Hadoop configuration variable
7. -S: silent mode; log messages are not printed
3. Examples
(1) Running a query
hive -e "SELECT * FROM cookie.cookie1;"
(2) Running a file
First write a hive.sql file.
Then run the file you wrote.
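A minimal sketch of these two steps. Only the file name hive.sql and the table cookie.cookie1 come from this article; the query itself and its LIMIT clause are assumptions made for illustration. The hive call is guarded so the snippet also runs on a machine without Hive installed.

```shell
# Step 1: write the HQL script (the query is only an illustration).
cat > hive.sql <<'EOF'
SELECT * FROM cookie.cookie1 LIMIT 10;
EOF

# Step 2: run the script with -f; skipped when the hive CLI is not installed.
if command -v hive >/dev/null 2>&1; then
  hive -f hive.sql
else
  echo "hive not found; would run: hive -f hive.sql"
fi
```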
(3) Running with a parameter file
Start Hive with a configuration file so that the configuration parameters in that file are loaded at startup.
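One way to do this is the -i option described earlier. The sketch below assumes a hypothetical file name, hive-init.hql, and two arbitrary settings; neither comes from the original article.

```shell
# Hypothetical init file: each line is an ordinary SET statement.
cat > hive-init.hql <<'EOF'
set hive.cli.print.header=true;
set mapreduce.job.reduces=3;
EOF

# -i runs the init file first; the -e command then prints one of the
# values that the init file loaded.
if command -v hive >/dev/null 2>&1; then
  hive -i hive-init.hql -e "set mapreduce.job.reduces;"
else
  echo "hive not found; would run: hive -i hive-init.hql"
fi
```

Without -e or -f, the same command opens the interactive shell with the parameters already loaded.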
Two. Hive parameter configuration
1. Complete list of Hive configuration parameters
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties
2. How to set Hive parameters
When developing a Hive application, you inevitably need to set Hive parameters. Doing so lets you tune the execution efficiency of your HQL code or helps you locate problems. In practice, however, a frequent question is why a parameter that was set has no effect. This is usually caused by setting it in the wrong way.
For general parameters, there are three ways to set them:
1. Configuration file (valid globally)
2. Command-line parameters (valid for the Hive instance being launched)
3. Parameter declaration (valid for the Hive connection session)
(1) Configuration file
Configuration files for Hive include:
A. User-defined configuration file: $HIVE_CONF_DIR/hive-site.xml
B. Default configuration file: $HIVE_CONF_DIR/hive-default.xml
The user-defined configuration overrides the default configuration.
In addition, Hive reads Hadoop's configuration, because Hive is started as a Hadoop client; Hive's configuration overrides Hadoop's configuration.
Settings in the configuration files are valid for all Hive processes started on the local machine.
(2) Command-line parameters
When you start Hive (in client or server mode), you can add -hiveconf param=value on the command line to set a parameter.
This setting is valid for the session started by that launch (in server mode, for the sessions of all requests served by that instance).
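A minimal sketch of a launch with -hiveconf. The parameter chosen and the value 10 are arbitrary illustrations, not values from the original article.

```shell
# Parameter to pass at launch; the value 10 is an arbitrary illustration.
param="mapreduce.job.reduces=10"

if command -v hive >/dev/null 2>&1; then
  # "set <name>;" prints the current value, confirming what -hiveconf set.
  hive -hiveconf "$param" -e "set mapreduce.job.reduces;"
else
  echo "hive not found; would run: hive -hiveconf $param"
fi
```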
(3) Parameter declaration
Parameters can be set with the SET keyword in HQL.
The scope of this setting is also the session level.
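A short sketch of a parameter declaration, written as a script so it can be run with -f. The file name set-demo.hql and the value 3 are assumptions for illustration.

```shell
# Hypothetical script: declare a parameter, then print it back.
cat > set-demo.hql <<'EOF'
-- Takes effect for the rest of this session only.
set mapreduce.job.reduces=3;
-- "set <name>;" without a value prints the current setting.
set mapreduce.job.reduces;
EOF

if command -v hive >/dev/null 2>&1; then
  hive -f set-demo.hql
else
  echo "hive not found; would run: hive -f set-demo.hql"
fi
```

The same two SET statements can be typed directly at the interactive hive> prompt.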
set hive.exec.reducers.bytes.per.reducer=<number>  Average amount of data handled by each reduce task. Hive estimates the total amount of input data and divides it by this value to get the number of reduce tasks.
set hive.exec.reducers.max=<number>  Upper limit on the number of reduce tasks.
set mapreduce.job.reduces=<number>  Fixed number of reduce tasks.
However, Hive ignores the last parameter when the business logic dictates that only one reduce task can be used. For example, after set mapreduce.job.reduces=3, an HQL statement that uses ORDER BY still runs with a single reduce task, and the setting is ignored.
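The estimate described above can be sketched as plain arithmetic. This is only an illustration of that description (total input divided by bytes per reducer, rounded up, capped by the maximum), not a copy of Hive's planner; the function name estimate_reducers and the sample sizes are assumptions.

```shell
# Estimated reducers = ceil(total_bytes / bytes_per_reducer),
# at least 1 and at most max_reducers.
estimate_reducers() {
  total_bytes=$1
  bytes_per_reducer=$2
  max_reducers=$3
  n=$(( (total_bytes + bytes_per_reducer - 1) / bytes_per_reducer ))
  if [ "$n" -lt 1 ]; then n=1; fi
  if [ "$n" -gt "$max_reducers" ]; then n=$max_reducers; fi
  echo "$n"
}

per_reducer=$((256 * 1024 * 1024))   # assumed: 256 MB per reducer
max=1009                             # assumed upper limit

# 10 GB of input: 10240 MB / 256 MB
estimate_reducers $((10 * 1024 * 1024 * 1024)) "$per_reducer" "$max"    # prints 40

# 1 TB of input would need 4096 reducers, so the limit applies
estimate_reducers $((1024 * 1024 * 1024 * 1024)) "$per_reducer" "$max"  # prints 1009
```

Lowering hive.exec.reducers.bytes.per.reducer therefore raises the estimated reducer count, until hive.exec.reducers.max is reached.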
The priority of the three setting methods above increases in that order: a parameter declaration overrides command-line parameters, and command-line parameters override the configuration file. Note that some system-level parameters, such as log4j-related settings, must be set in one of the first two ways, because those parameters are read before the session is established.
Hive Learning Path (18): Hive shell operations