Transferred from: http://www.aboutyun.com/forum.php?mod=viewthread&tid=8590&highlight=hive
Questions Guide:
1. How to view the hive table structure.
2. How to view table structure information.
3. How to view the partition information.
4. Which command can blur the search table
1.hive Fuzzy Search Table
Show tables like ' *name* ';
2. View table structure Information
DESC formatted table_name;
DESC table_name;
3. View partition information
Show partitions table_name;
4. Querying data based on partitions
Select Table_coulm from table_name where partition_name = ' 2014-02-25 ';
5. View HDFs File information
DFS-LS/USER/HIVE/WAREHOUSE/TABLE02;
6. Load data into table from file (overwrite overwrite, append not need overwrite keyword)
LOAD DATA LOCAL inpath ' dim_csl_rule_config.txt ' OVERWRITE into table dim.dim_csl_rule_config;
--Insert
OVERWRITE table Test_h02_click_log PARTITION (DT) SELECT * from Stage.s_h02_click_log by inserting data into table from query statement
where dt= ' 2014-01-22 ' limit 100;
7. Exporting data to a file
Insert Overwrite directory '/tmp/csl_rule_cfg ' select a.* from Dim.dim_csl_rule_config A;
Hive-e "Select Day_id,pv,uv,ip_count,click_next_count,second_bounce_rate,return_visit,pg_type
From Tmp.tmp_h02_click_log_baitiao_ag_sum where day_id in (' 2014-03-06 ', ' 2014-03-07 ', ' 2014-03-08 ', ' 2014-03-09 ', ' 2014-03-10 '); " >
/home/jrjt/testan/baitiao.dat;
8. Custom UDF functions 1. Inheriting UDF classes
2. Overriding the Evaluate method
3. Make the project into a jar package
Execute command in 4.hive add Jar/home/jrjt/dwetl/pub/udf/udf/getproperty.jar;
5. Creating the function create temporary function Get_pro as ' JD. Get_property '//jd.jd.get_property as the classpath;
9. Query display column name and row to column display
Set hive.cli.print.header=true; Print column name
set hive.cli.print.row.to.vertical=true;//Turn on row to column function, if you must turn on the Print column name function
set Hive.cli.print.row.to.vertical.num=1; Set the number of columns to display per row
10. View the table file size, download the file to a directory, show how many lines to a file
Dfs-du Hdfs://bjyzh3-hd-jrjt-4137.jd.com:54310/user/jrjt/warehouse/stage.db/s_h02_click_log;
dfs-get/user/jrjt/warehouse/ods.db/o_h02_click_log_i_new/dt=2014-01-21/000212_0/home/jrjt/testan/;
Head-n 1000 file name > file name
11. Killing a task does not execute in hive shell
Hadoop Job-kill job_201403041453_58315
12.hive-wui Path http://172.17.41.38/jobtracker.jsp
13. Deleting a partition
ALTER TABLE Tmp_h02_click_log_baitiao drop partition (dt= ' 2014-03-01 ');
ALTER TABLE D_h02_click_log_basic_d_fact drop partition (dt= ' 2014-01-17 ');
14.hive command line operation on 15.hive operation Hadoop file basic commands
View File Size
dfs-du/user/jrjt/warehouse/tmp.db/tmp_h02_click_log/dt=2014-02-15;
deleting files
dfs-rm/user/jrjt/warehouse/tmp.db/tmp_h02_click_log/dt=2014-02-15;
16. Insert data SQL, Export data sql
The 1.insert syntax format is:
Basic Insert Syntax:
INSERT OVERWRITE TABLE tablename [Partiton (Partcol1=val1,partclo2=val2)]select_statement from From_statement
Insert Overwrite table Test_insert select * from test_table;
Insert operations on multiple tables:
From Fromstatte
INSERT OVERWRITE TABLE tablename1 [Partiton (partcol1=val1,partclo2=val2)]select_statement1
INSERT OVERWRITE TABLE tablename2 [Partiton (Partcol1=val1,partclo2=val2)]select_statement2 from
test_table
Insert Overwrite table Test_insert1
Select key
Insert Overwrite table test_insert2
select value;
Insert, the FROM clause can be placed after the SELECT clause, or it can be placed before the INSERT clause. Hive does not support inserting a single line of INSERT statements, nor does IT support update operations. The data is loaded into the built-in table in load mode. Once the data is imported, it cannot be modified.
2. Save data to filesystem with a query
INSERT OVERWRITE [LOCAL] Directory directory SELECT .... From .....
To import data to a local directory:
Insert overwrite local directory '/home/zhangxin/hive ' select * from Test_insert1;
The resulting file overwrites other files in the specified directory and deletes files that already exist in the directory.
To export data into HDFs:
Insert Overwrite directory '/user/zhangxin/export_test ' select value from test_table;
The same query results can be inserted into multiple tables or directories at the same time:
From Test_insert1
Insert overwrite local directory '/home/zhangxin/hive ' select *
insert overwrite directory '/ User/zhangxin/export_test ' select value;
17.mapjoin usage Scenarios: 1. There is a table in the association operation is very small 2. Link operation with no equivalent
Select/*+ Mapjoin (A) */f.a,f.b from A t join B F
18.perl Start Task
perl/home/jrjt/dwetl/app/app/a_h02_click_log_credit_user/bin/a_h02_click_log_credit_user.pl
APP_A_H02_ Click_log_credit_user_20140215.dir >&/home/jrjt/dwetl/log/app/20140306/
A_h02_click_log_credit_user.pl.4.log
19. View the Perl process
Ps-ef|grep Perl
20.hive command Move table data to another table directory and add partitions
dfs-cp/user/jrjt/warehouse/tmp.db/tmp_h02_click_log/dt=2014-02-18/user/jrjt/warehouse/ods.db/o_h02_click_log/;
dfs-cp/user/jrjt/warehouse/tmp.db/tmp_h02_click_log_baitiao/*/user/jrjt/warehouse/dw.db/
d_h02_click_log_baitiao_basic_d_fact/;--Copy all partition data
ALTER TABLE D_h02_click_log_baitiao_basic_d_fact add Partition (dt= ' 2014-03-11 ')
Location '/user/jrjt/warehouse/dw.db/d_h02_click_log_baitiao_basic_d_fact/dt=2014-03-11 ';
21. Export White stripe data
Hive-e "Select Day_id,pv,uv,ip_count,click_next_count,second_bounce_rate,return_visit,pg_type
From Tmp.tmp_h02_click_log_baitiao_ag_sum where day_id like ' 2014-03% '; >/home/jrjt/testan/baitiao.xlsx;
22.hive Modifying table names
ALTER TABLE o_h02_click_log_i RENAME to O_H02_CLICK_LOG_I_BK;
23.hive Copy table structure
CREATE TABLE d_h02_click_log_baitiao_ag_sum like tmp.tmp_h02_click_log_baitiao_ag_sum;
24.hive website Https://cwiki.apache.org/conflue ... ionandconfiguration
Http://www.360doc.com/content/12/0111/11/7362_178698714.shtml
25.hive Add Field
ALTER TABLE Tmp_h02_click_log_baitiao_ag_sum
Add columns (current_session_timelenth_count bigint comment ' page total length of stay ');
ALTER TABLE Tmp_h02_click_log_baitiao Change Current_session_timelenth
Current_session_timelenth bigint Comment ' current session dwell time ';
26.hive Open Simple mode does not enable Mr
Set Hive.fetch.task.conversion=more;
27. Output input table and input partition messages that execute statements are read in JSON format
Explain Dependency Query