2.3 Hive internals: P44
Each jar file under $HIVE_HOME/lib implements a specific piece of functionality, for example the CLI module. Other components include the Thrift service, which gives other processes remote access to Hive and makes it possible to reach Hive through JDBC and ODBC. All Hive clients need a metastore service (metadata service), which Hive uses to store table schemas and other metadata. By default, the built-in Derby SQL server provides a limited, single-process storage service. HWI, the Hive Web Interface, provides remote access to Hive services. The conf directory holds Hive's configuration files.

2.4 Starting Hive in CLI mode
    $HIVE_HOME/bin/hive
The session echoes the commands the user executes, reports where on the local file system the query log is kept, and prints OK and the time taken for each query.
Note: Hive keywords are case-insensitive. If Derby is used as the metadata store, Derby creates a metastore_db directory in the user's current working directory when the Hive session starts. If the user later starts Hive from a different directory, a new metastore_db is created there and the metadata in the previous directory is no longer seen, which looks like data loss. It is therefore best to configure the metadata store with a permanent path.
The property hive.metastore.warehouse.dir specifies where in Hadoop the data of Hive tables is stored; its default value is /user/hive/warehouse. Setting a different value lets each user define a personal warehouse directory and avoid affecting other users of the system, for example:
    set hive.metastore.warehouse.dir=/user/myname/hive/warehouse;
To avoid typing this every time Hive starts, put the statement in the $HOME/.hiverc file, which is executed each time Hive starts.

2.5 Configuring the metastore with a JDBC connection
The metastore is the one component Hive needs that Hadoop does not already provide, so it has to be supplied externally. It stores metadata such as table schemas and partition information, which the user specifies when running statements such as CREATE TABLE and ALTER TABLE. Because several users of a multiuser system may access the metastore concurrently, the default embedded database is not suitable for production.
To use MySQL as the metadata store, suppose MySQL is running on port 3306 of the server db1.mydomain.pvt and the database is named hive_db. Configure the metastore database in hive-site.xml:
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://db1.mydomain.pvt:3306/hive_db?createDatabaseIfNotExist=true</value>
      <description>JDBC connect string for a JDBC metastore</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
      <description>Driver class name for a JDBC metastore</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>root</value>
      <description>username to use against metastore database</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>911</value>
      <description>password to use against metastore database</description>
    </property>
For Hive to connect to MySQL, the JDBC driver must be on the classpath. After downloading the MySQL JDBC driver (Connector/J: http://www.mysql.com/downloads/connector/j/), place it in the Hive library path, $HIVE_HOME/lib. Once this configuration is complete, Hive stores its metadata in MySQL.

2.6 The hive command
$HIVE_HOME/bin/hive is the entry point to the Hive services, including the CLI.
    bin/hive --help
shows the help for the hive command. The Service List in the output names the available services, including the CLI that is used most often. A service is started with --service name. The Hive services are:
    cli         Command-line interface; used to define tables, run queries, and so on
    hiveserver  Hive Server, a daemon that listens for Thrift connections from other processes
    hwi         Hive Web Interface, a simple web interface for running queries and other commands without logging in to a cluster machine and using the CLI
    jar         An extension of the hadoop jar command for running an application that needs the Hive environment
    metastore   Starts an external Hive metastore service that can be used by multiple clients
    rcfilecat   A tool that prints the contents of an RCFile-format file
The --auxpath option lets the user specify a colon-separated list of auxiliary Java archive (JAR) files that contain custom extensions the user may need. The --config directory option lets the user override the default property configuration in $HIVE_HOME/conf and point to a new configuration directory.

2.7 Command-line interface
    $ hive --help --service cli
displays the list of options provided by the CLI:
    usage: hive
     -d,--define <key=value>     Variable substitution to apply to hive commands. e.g. -d A=B or --define A=B
        --database <databasename> Specify the database to use
     -e <quoted-query-string>    SQL from command line
     -f <filename>               SQL from files
     -h
    set env:HOME;
displays the value of the HOME environment variable.
    set;
    set -v;
Without -v, set prints all the variables; with -v it also prints all the properties defined by Hadoop.
    $ hive --define foo=bar
defines the variable foo when the CLI is started. Then, inside the CLI:
    set foo;
displays foo=bar;
    set hivevar:foo;
displays hivevar:foo=bar;
    set hivevar:foo=bar2;
sets the variable;
    set foo;
displays foo=bar2. As this shows, the hivevar: prefix is optional. In the CLI, variable references in a query are substituted first, and only then is the query submitted to the query processor.
    CREATE TABLE toss1 (i int, ${hivevar:foo} string);
    DESCRIBE toss1;
The second column of the table is named after the current value of ${hivevar:foo} (bar2 in the session above).
    set hive.cli.print.current.db;
shows the value of this setting. The default is false; when it is set to true, the CLI prompt shows the name of the current database. The default database is named default.
    hive> set system:user.name;
displays system:user.name=root;
    hive> set env:HOME;
displays env:HOME=/root.
Note: system properties and environment variables must be referenced with the system: or env: prefix.
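The substitution step described above can also be tried from a script file instead of the interactive prompt. The following is a minimal sketch, not a definitive recipe: the temporary directory and the file name toss1.hql are hypothetical, and the hive call is skipped when no hive executable is on the PATH. It assumes only the --define, -S and -f options listed in these notes.

```shell
#!/bin/sh
# Sketch: pass a variable on the command line and reference it in a query file.
# workdir and toss1.hql are illustrative names, not part of Hive.
workdir=$(mktemp -d)
query_file="$workdir/toss1.hql"

# The quoted 'EOF' stops the shell from expanding ${hivevar:foo};
# Hive has to see the reference literally and substitute it itself.
cat > "$query_file" <<'EOF'
CREATE TABLE toss1 (i int, ${hivevar:foo} string);
DESCRIBE toss1;
EOF

if command -v hive >/dev/null 2>&1; then
    # --define foo=bar places foo in the hivevar namespace before the file runs
    hive -S --define foo=bar -f "$query_file"
else
    echo "hive not found on PATH; wrote $query_file only"
fi
```

The quoted here-document matters: with an unquoted delimiter the shell would try to expand ${hivevar:foo} itself and Hive would never see the reference.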
"One-shot" commands in Hive work like this:
    $ hive -e "SELECT * FROM mytable LIMIT 3";
The command is executed once and Hive then exits.
    $ hive -S -e "SELECT * FROM mytable LIMIT 3" > /tmp/myquery
The -S option turns on silent mode, so the output does not contain OK, Time taken, and similar lines. The result is stored in the local file /tmp/myquery, not in HDFS.
    $ hive -S -e "set" | grep warehouse
If you do not remember a property name exactly, this command can be used to look it up.
To run Hive queries from a file, use hive -f with a file name; Hive executes the one or more statements in that file. By convention such query files are saved with a .q or .hql suffix.
    $ hive -f /path/to/file/withqueries.hql
    hive> source /path/to/file/withqueries.hql;
The second form executes the statements in the file from inside the CLI.
    $ hive -e "LOAD DATA LOCAL INPATH '/tmp/myfile' INTO TABLE src";

2.7.5 The .hiverc file
    $ hive -i '/tmp/myfile'
The -i option lets the user specify a file that the CLI executes at startup, before the prompt appears. The CLI also looks for a .hiverc file in the user's HOME directory and automatically executes the commands in it. This is useful for commands the user needs and runs frequently, such as setting system properties and variables, for example:
    ADD JAR /path/to/custom_hive_extensions.jar;
adds a jar file to the distributed cache;
    set hive.cli.print.current.db=true;
shows the current working database in the CLI prompt;
    set hive.exec.mode.local.auto=true;
encourages Hive to run a query in local mode whenever that is possible, which speeds up queries on small data sets.
Note: remember to end every line with a semicolon (;).

2.7.6 More about the Hive CLI
1) Autocomplete: press Tab while typing a command and the CLI completes the possible keywords.
2) Command history: use the up and down arrow keys to browse previous commands. The history is recorded in $HOME/.hivehistory, which keeps up to 10,000 lines.

2.7.8 Executing shell commands
The user does not need to leave the Hive CLI to run a simple bash shell command: prefix the command with ! and end it with a semicolon (;).
    hive> ! /bin/echo "what up dog";
    hive> ! pwd;
Note: shell pipes and file name globbing cannot be used.

2.7.9 Using Hadoop dfs commands in Hive
Running Hadoop commands inside Hive is faster than running them from the bash shell, because Hive executes them in its own process.
    hive> dfs -ls /;
Simply drop the hadoop keyword from the Hadoop command and end it with a semicolon.
    hive> dfs -help;
shows the list of all options provided by dfs.

2.7.10 Comments in Hive scripts
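Statements in a file that holds Hive queries can be annotated with comment lines, which start with --. Comments cannot be used in the CLI itself.

2.7.11 Displaying field names
    set hive.cli.print.header=true;
makes the CLI print the field names as a header line above the query results.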
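The settings collected in sections 2.7.5 to 2.7.11 can be gathered in one initialization file. The sketch below is an illustration under stated assumptions: it writes the file to a temporary directory (the name my.hiverc is hypothetical) rather than touching a real $HOME/.hiverc, uses only the -S, -i and -e options and the three properties mentioned in these notes, and skips the hive call when no hive executable is on the PATH.

```shell
#!/bin/sh
# Sketch: build an init file with comments and settings, then load it with -i.
# workdir and my.hiverc are illustrative names, not part of Hive.
workdir=$(mktemp -d)
init_file="$workdir/my.hiverc"

cat > "$init_file" <<'EOF'
-- Show the current database in the CLI prompt
set hive.cli.print.current.db=true;
-- Let Hive use local mode for small queries when it can
set hive.exec.mode.local.auto=true;
-- Print field names above query results
set hive.cli.print.header=true;
EOF

if command -v hive >/dev/null 2>&1; then
    # The init file runs first; the -e statement then prints the property,
    # which should reflect the value set in the init file.
    hive -S -i "$init_file" -e "set hive.cli.print.header;"
else
    echo "hive not found on PATH; wrote $init_file only"
fi
```

Each comment sits on its own line starting with --, and every statement ends with a semicolon, matching the two notes above. Copying the same lines into $HOME/.hiverc would apply them to every CLI session.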