Hadoop Hive Basic SQL syntax

Source: Internet
Author: User

Contents

1. DDL operations
   1. Create a table
   2. Create a simple table
   3. Create an external table
   4. Create a partitioned table
   5. Create a bucketed table
   6. Create a table partitioned by the field ds
   7. Copy an empty table
   8. Show all tables
   9. Display tables matching a regular expression
   10. Modify table structure
   11. Add a column to a table
   12. Add a column with a column comment
   13. Change a table name
   14. Delete a column
   15. Add and delete partitions
   16. Rename a table
   17. Change the name, type, and comment of a column
   18. Add and update columns
   19. Add table metadata information
   20. Change a table's file format and organization
   21. Create and delete views
   22. Create a database
   23. SHOW commands
2. DML operations: manipulating data storage
   1. Load a file into a table
   2. Load local data with given partition information
   3. OVERWRITE
   4. Insert query results into a Hive table
   5. Write query results to the HDFS file system
   6. INSERT INTO
3. DQL operations: querying data with SQL
   1. Basic SELECT operations
   2. Examples
   3. Queries
   4. Output query results to a directory
   5. Output query results to a local directory
   6. Select all columns to a local directory
   7. Insert one table's statistics into another table
   8. Insert data from multiple tables into the same table
   9. Insert a file stream directly into a file
   10. Partition-based queries
   11. Joins
4. Habits to change when moving from SQL to HiveQL
   1. Equality joins
   2. The semicolon character
   3. IS [NOT] NULL
   4. Inserting data into an existing table or partition
   5. Hive does not support INSERT INTO, UPDATE, or DELETE operations
   6. Hive supports embedding MapReduce programs to handle complex logic
   7. Hive can write transformed data directly to different tables, as well as to partitions, HDFS, and local directories
5. Practical examples
   1. Create a table
   2. Load data into the table
   3. Aggregate statistics
   4. More complex data analysis
   5. Generate weekly information from the data
   6. Using mapping scripts
   7. Divide the data by week
   8. Processing Apache weblog data

Hive is a data warehouse analysis system built on Hadoop. It maps structured data files stored in the Hadoop distributed file system to database tables and provides a rich, SQL-like query capability over them: SQL statements are converted into MapReduce jobs to run, so analysis requirements can be expressed directly in SQL. This dialect is called Hive SQL (HiveQL) for short, and it makes it very convenient for users unfamiliar with MapReduce to query, summarize, and analyze data with a SQL-like language. Meanwhile, MapReduce developers can write their own mappers and reducers as plug-ins, so that Hive can handle more complex analysis that the built-in capabilities cannot express.
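The mapper/reducer plug-in mechanism mentioned above is exposed in HiveQL through the TRANSFORM clause, which streams rows through an external script. The sketch below is illustrative only; the table page_view and the script my_mapper.py are assumed names, not from this article:

```sql
-- Illustrative only: 'page_view' and 'my_mapper.py' are assumed names.
-- TRANSFORM writes each input row to the script on stdin (tab-delimited)
-- and reads tab-delimited output rows back from stdout.
ADD FILE my_mapper.py;

SELECT TRANSFORM (userid, page_url)
       USING 'python my_mapper.py'
       AS (userid, url_domain)
FROM page_view;
```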
HiveQL differs slightly from the SQL of relational databases, but it supports most statements, such as DDL and DML, as well as common aggregate functions, join queries, and conditional queries. Hive is not suited to online transaction processing and does not provide real-time query functionality; it works best with batch jobs over large amounts of immutable data.

Hive's characteristics: scalable (devices can be added to the Hadoop cluster dynamically), extensible, fault tolerant, and loosely coupled to input formats.

The official Hive documentation describes the query language in detail; see http://wiki.apache.org/hadoop/Hive/LanguageManual. Most of the content of this article is translated from that page, together with a number of points worth noting that came up during actual use.

1. DDL Operations

• Create tables
• Delete tables
• Modify table structure
• Create/delete views
• Create a database
• Show commands
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
  [(col_name data_type [COMMENT col_comment], ...)]
  [COMMENT table_comment]
  [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
  [CLUSTERED BY (col_name, col_name, ...)
    [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
  [ROW FORMAT row_format]
  [STORED AS file_format]
  [LOCATION hdfs_path]

CREATE TABLE creates a table with the specified name. An exception is thrown if a table of the same name already exists; the user can ignore the exception with the IF NOT EXISTS option.

The EXTERNAL keyword lets the user create an external table and specify a path (LOCATION) to the actual data while the table is being created.

LIKE allows users to copy an existing table definition, but does not copy its data.

COMMENT adds a description to a table or a column.

ROW FORMAT
  DELIMITED [FIELDS TERMINATED BY char] [COLLECTION ITEMS TERMINATED BY char]
    [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
  | SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)]

Users can use a custom SerDe, or Hive's built-in SerDe, when creating a table. If ROW FORMAT is not specified, or ROW FORMAT DELIMITED is specified, the built-in SerDe is used. When creating a table, the user must also specify the table's columns; while specifying the columns, the user may also specify a custom SerDe. Hive uses the SerDe to determine the data for the table's specific columns.
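As a hedged illustration of a custom SerDe (the table name and regular expression below are assumed examples, not from this article): Hive ships a RegexSerDe that parses each text line with a regular expression, mapping capture groups to columns. In older Hive releases the class lives under org.apache.hadoop.hive.contrib.serde2 instead:

```sql
-- Illustrative sketch: 'access_log' and the regex are assumed examples.
-- Each line is matched against input.regex; capture group 1 becomes
-- 'host' and capture group 2 becomes 'request'.
CREATE TABLE access_log (host STRING, request STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "([^ ]*) \"([^\"]*)\".*"
)
STORED AS TEXTFILE;
```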

STORED AS
  SEQUENCEFILE
  | TEXTFILE
  | RCFILE
  | INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classname

If the file data is plain text, use STORED AS TEXTFILE. If the data needs to be compressed, use STORED AS SEQUENCEFILE.

1.3 Creating a simple table:

hive> CREATE TABLE pokes (foo INT, bar STRING);

1.4 Create an external table:

CREATE EXTERNAL TABLE page_view (
  viewTime INT,
  userid BIGINT,
  page_url STRING,
  referrer_url STRING,
  ip STRING COMMENT 'IP Address of the User',
  country STRING COMMENT 'country of origination'
) COMMENT 'This is the staging page view table'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054'
STORED AS TEXTFILE
LOCATION '<hdfs_location>';

1.5 Building a partitioned table

CREATE TABLE par_table (
  viewTime INT,
  userid BIGINT,
  page_url STRING,
  referrer_url STRING,
  ip STRING COMMENT 'IP Address of the User'
) COMMENT 'This is the page view table'
PARTITIONED BY (date STRING, pos STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS SEQUENCEFILE;

1.6 Create a bucketed table

CREATE TABLE par_table (
  viewTime INT,
  userid BIGINT,
  page_url STRING,
  referrer_url STRING,
  ip STRING COMMENT 'IP Address of the User'
) COMMENT 'This is the page view table'
PARTITIONED BY (date STRING, pos STRING)
CLUSTERED BY (userid) SORTED BY (viewTime) INTO 32 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS SEQUENCEFILE;
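One reason to bucket a table is efficient sampling: Hive can read just one bucket instead of scanning the whole table. The query below is an illustrative sketch, not from this article, and assumes the table was clustered on userid into 32 buckets:

```sql
-- Illustrative: reads only the 1st of 32 buckets hashed on userid.
SELECT * FROM par_table
TABLESAMPLE (BUCKET 1 OUT OF 32 ON userid);
```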

1.7 Create a table partitioned by the field ds

hive> CREATE TABLE invites (foo INT, bar STRING) PARTITIONED BY (ds STRING);
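As a hedged follow-on (the file path and partition value below are assumed examples, not from this article), data can then be loaded into a specific ds partition of the table:

```sql
-- Illustrative: '/tmp/kv2.txt' and ds='2008-08-15' are assumed values.
-- The PARTITION clause names the partition the rows are written into.
LOAD DATA LOCAL INPATH '/tmp/kv2.txt'
OVERWRITE INTO TABLE invites PARTITION (ds='2008-08-15');
```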

1.8 Copying an empty table

CREATE TABLE empty_key_value_store
LIKE key_value_store;

Example

CREATE TABLE user_info (user_id INT, ...)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n';

The data format of the table being imported is: fields separated by tab characters, lines terminated by newlines.

And the content of our data file looks like this:

100636 100890 C5c86f4cddc15eb7 YYYVYBTVT
100612 100865 97cc70d411c18b6f Gyvcycy
100078 100087 Ecd6026a15ffddf5 qa000100
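A minimal sketch of loading such a file into the table (the local path '/tmp/user_info.txt' is an assumed example, not from this article):

```sql
-- Illustrative: the path is an assumed example.
-- LOAD DATA LOCAL INPATH copies the local file into the table's
-- warehouse directory; OVERWRITE replaces any existing data.
LOAD DATA LOCAL INPATH '/tmp/user_info.txt'
OVERWRITE INTO TABLE user_info;
```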

1.9 Show All tables:

hive> SHOW TABLES;

1.10 Display tables matching a regular expression:

hive> SHOW TABLES '.*s';
