Hive table information query: Table structure, table query ...

Source: Internet
Author: User
Tags file size

1.hive Fuzzy Search Table

Show tables like ' *name* ';

2. View table structure Information
DESC formatted table_name;
DESC table_name;

3. View partition information
Show partitions table_name;

4. Querying data based on partitions
Select Table_coulm from table_name where partition_name = ' 2017-02-25 ';

5. View HDFs File information
DFS-LS/USER/HIVE/WAREHOUSE/TABLE02;

6. Load data into table from file (overwrite overwrite, append not need overwrite keyword)
LOAD DATA LOCAL inpath ' dim_csl_rule_config.txt ' OVERWRITE into table dim.dim_csl_rule_config;
--insert data from a query statement to table
INSERT OVERWRITE TABLE test_h02_click_log PARTITION (dt) SELECT *
From Stage.s_h02_click_log where dt= ' 2014-01-22 ' limit 100;

7. Exporting data to a file
Insert Overwrite directory '/tmp/csl_rule_cfg ' select a.* from Dim.dim_csl_rule_config A;
Hive-e "Select Day_id,pv,uv,ip_count,click_next_count,second_bounce_rate,return_visit,pg_type from Tmp.tmp_h02_ Click_log_baitiao_ag_sum where day_id in (' 2017-03-06 ', ' 2017-03-07 ', ' 2017-03-08 ', ' 2017-03-09 ', ' 2017-03-10 '); >/home/jrjt/testan/baitiao.dat;

8. Custom UDF functions
1. Inheriting the UDF class
2. Overriding the Evaluate method
3. Make the project into a jar package
Execute command in 4.hive add Jar/home/jrjt/dwetl/pub/udf/udf/getproperty.jar;
5. Creating the function create temporary function Get_pro as ' JD. Get_property '//jd.jd.get_property as the classpath;

9. Query display column name and row to column display
Set hive.cli.print.header=true; Print column names
Set hive.cli.print.row.to.vertical=true; Turn on row to column function, if you must turn on the Print column name function
Set hive.cli.print.row.to.vertical.num=1; Set the number of columns to display per row

10. View the table file size, download the file to a directory, show how many lines to a file
Dfs-du Hdfs://bjyzh3-hd-jrjt-4137.jd.com:54310/user/jrjt/warehouse/stage.db/s_h02_click_log;
dfs-get/user/jrjt/warehouse/ods.db/o_h02_click_log_i_new/dt=2014-01-21/000212_0/home/jrjt/testan/;
Head-n 1000 file name > file name

11. Killing a task does not execute in hive shell
Hadoop Job-kill job_201703041453_58315

12.hive-wui Path
http://192.168.8.111/jobtracker.jsp

13. Deleting a partition
ALTER TABLE Tmp_h02_click_log_baitiao drop partition (dt= ' 2017-03-01 ');
ALTER TABLE D_h02_click_log_basic_d_fact drop partition (dt= ' 2017-01-17 ');


14.hive command-line Operations
Execute a query, show the progress of MapReduce on the terminal, after execution, finally output the query results to the terminal, and then the hive process exits without entering interactive mode.
Hive-e ' Select Table_cloum from table '
-S, the output on the terminal will not have the MapReduce progress, executed, will only output the query results to the terminal. This mute mode is useful, and third-party programs get a result set through the standard output of hive via a third-party program call.
Hive-s-E ' select Table_cloum from table '
Execute SQL file
Hive-f Hive_sql.sql

Basic commands for operating Hadoop files on 15.hive
View File Size
dfs-du/user/jrjt/warehouse/tmp.db/tmp_h02_click_log/dt=2014-02-15;
deleting files
dfs-rm/user/jrjt/warehouse/tmp.db/tmp_h02_click_log/dt=2014-02-15;

16. Insert data SQL, Export data sql
The 1.insert syntax format is:
Basic Insert Syntax:
INSERT OVERWRITE TABLE tablename [Partiton (Partcol1=val1,partclo2=val2)]select_statement from From_statement
Insert Overwrite table Test_insert select * from test_table;

Insert operations on multiple tables:
From Fromstatte
INSERT OVERWRITE TABLE tablename1 [Partiton (Partcol1=val1,partclo2=val2)]select_statement1
INSERT OVERWRITE TABLE tablename2 [Partiton (Partcol1=val1,partclo2=val2)]select_statement2

From test_table
Insert Overwrite table Test_insert1
Select key
Insert Overwrite table Test_insert2
Select value;

Insert, the FROM clause can be placed after the SELECT clause, or it can be placed before the INSERT clause.
Hive does not support inserting a single line of INSERT statements, nor does IT support update operations. The data is loaded into the built-in table in load mode. Once the data is imported, it cannot be modified.

2. Save data to filesystem with a query
INSERT OVERWRITE [LOCAL] Directory directory SELECT .... From .....

To import data to a local directory:
Insert overwrite local directory '/home/zhangxin/hive ' select * from Test_insert1;
The resulting file overwrites other files in the specified directory and deletes files that already exist in the directory.

To export data into HDFs:
Insert Overwrite directory '/user/zhangxin/export_test ' select value from test_table;

The same query results can be inserted into multiple tables or directories at the same time:
From Test_insert1
Insert overwrite local directory '/home/zhangxin/hive ' SELECT *
Insert Overwrite directory '/user/zhangxin/export_test ' select value;

17.mapjoin usage Scenarios: 1. There is a table in the association operation is very small 2. Link operation with no equivalent
Select/*+ Mapjoin (A) */f.a,f.b from A T joins B F on (F.A=T.A and f.ftime=20110802)

18.perl Start Task
perl/home/jrjt/dwetl/app/app/a_h02_click_log_credit_user/bin/a_h02_click_log_credit_user.pl
App_a_h02_click_log_credit_user_20140215.dir >&/home/jrjt/dwetl/log/app/20140306/a_h02_click_log_credit_ User.pl.4.log

19. View the Perl process
Ps-ef|grep Perl

20.hive command Move table data to another table directory and add partitions
dfs-cp/user/jrjt/warehouse/tmp.db/tmp_h02_click_log/dt=2014-02-18/user/jrjt/warehouse/ods.db/o_h02_click_log/;
dfs-cp/user/jrjt/warehouse/tmp.db/tmp_h02_click_log_baitiao/*/user/jrjt/warehouse/dw.db/d_h02_click_log_ baitiao_basic_d_fact/;--Copy all partition data
ALTER TABLE D_h02_click_log_baitiao_basic_d_fact Add partition (dt= ' 2014-03-11 ') location '/user/jrjt/warehouse/dw.db /d_h02_click_log_baitiao_basic_d_fact/dt=2014-03-11 ';

21. Export White stripe data
Hive-e "Select Day_id,pv,uv,ip_count,click_next_count,second_bounce_rate,return_visit,pg_type from Tmp.tmp_h02_ Click_log_baitiao_ag_sum where day_id like ' 2014-03% '; >/home/jrjt/testan/baitiao.xlsx;

22.hive Modifying table names
ALTER TABLE o_h02_click_log_i RENAME to O_H02_CLICK_LOG_I_BK;

23.hive Copy table structure
CREATE TABLE d_h02_click_log_baitiao_ag_sum like tmp.tmp_h02_click_log_baitiao_ag_sum;

24.hive Add Field
ALTER TABLE tmp_h02_click_log_baitiao_ag_sum Add columns (current_session_timelenth_count bigint comment ' page total length of stay ');
ALTER TABLE tmp_h02_click_log_baitiao Change current_session_timelenth current_session_timelenth bigint comment ' Current session dwell time ';

25.hive Open Simple mode does not enable Mr
Set Hive.fetch.task.conversion=more;

26. Output input table and input partition messages that execute statements are read in JSON format
Explain Dependency Query

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.