Research on Hive, part 2: the data model

Last Update:2015-10-03 Source: Internet

Author: User


1. Hive data types:

Basic data types: tinyint, smallint, int, bigint, float, double, boolean, string

Composite data types:

Array: An ordered collection of elements, all of which must be of the same type

Map: An unordered set of key/value pairs; the key must be of an atomic (primitive) type

Struct: A named set of fields that can be of different types
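
As a rough illustration of the three composite types, one value of each can be built with Hive's built-in constructor functions array(), map(), and named_struct(). This is only a sketch: the literal values are made up for illustration, and a SELECT without a FROM clause assumes Hive 0.13 or later.

```sql
-- Build one value of each composite type from literals (values are illustrative only).
SELECT
  array(1, 2, 3)       AS arr,  -- ARRAY<INT>: ordered, all elements share one type
  map('a', 1, 'b', 2)  AS m,    -- MAP<STRING,INT>: keys are of a primitive type
  named_struct('a', 'x', 'b', 1, 'c', CAST(2.5 AS DOUBLE)) AS s;  -- STRUCT<a:STRING,b:INT,c:DOUBLE>
```

Elements are then read back with arr[0], m['b'], and s.c respectively.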

The composite data types are used as follows:

CREATE TABLE complex (
  col1 ARRAY<INT>,
  col2 MAP<STRING, INT>,
  col3 STRUCT<a:STRING, b:INT, c:DOUBLE>
);

SELECT col1[0], col2['b'], col3.c FROM complex;

2. Hive data model:

The data model mainly consists of: database, table, partition, bucket

(1) Database: Equivalent to a namespace in a relational database. Its role is to isolate applications into separate databases or schemas. Hive provides statements such as CREATE DATABASE dbname, USE dbname, and DROP DATABASE dbname.

(2) Table: A table consists of the stored data and the metadata that describes the table. The data is stored in a distributed file system, and the metadata is stored in a relational database. Before any data is loaded, only a directory is created on HDFS; for a table named a, the path on HDFS is ${hive warehouse path}/a. When data is loaded, the data file is copied into that HDFS directory under the same file name as the loaded file, for example ${hive warehouse path}/a/empinfo.txt.

Hive has two kinds of tables:

1> Managed table: The data files for this table are loaded into the data warehouse directory configured for Hive.

2> External table: The data for this table is stored in an HDFS directory other than the Hive data warehouse directory.

To create a managed table in Hive's data warehouse:

hive> CREATE TABLE tuoguan_tbl (field string);

hive> LOAD DATA LOCAL INPATH 'home/hadoop/test.txt' INTO TABLE tuoguan_tbl;

To create an external table:

hive> CREATE EXTERNAL TABLE external_tbl (field string)
    > LOCATION '/user/username/input/tb_wordcount';

If LOCATION is omitted, the table's data is stored in Hive's data warehouse directory.

hive> LOAD DATA LOCAL INPATH 'test.txt' INTO TABLE external_tbl;

Besides the directory in which the data is stored, managed and external tables also differ in how the drop command behaves: dropping a managed table deletes both the stored data and the metadata, whereas dropping an external table removes only the metadata and leaves the stored data in place.
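
The drop behavior described above can be sketched as follows, reusing the two tables created earlier. Treat this as an illustration rather than a recipe: run it only against test data, since the first statement deletes files.

```sql
-- Managed table: DROP removes the metadata AND the data files
-- under the Hive warehouse directory.
DROP TABLE tuoguan_tbl;

-- External table: DROP removes only the metadata; the files under
-- /user/username/input/tb_wordcount remain on HDFS.
DROP TABLE external_tbl;

-- Because the files survive, the external table can be re-created
-- over the same location and the data becomes queryable again.
CREATE EXTERNAL TABLE external_tbl (field string)
LOCATION '/user/username/input/tb_wordcount';
```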

To view detailed information about a table, use:

DESC tablename or DESC FORMATTED tablename
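
For example, the formatted description is a quick way to check whether a table is managed or external and where its data lives. The sketch below reuses the table names from the earlier examples; the exact layout of the output varies between Hive versions.

```sql
-- Short form: lists column names and types only.
DESC tuoguan_tbl;

-- Long form: also prints table metadata. In the output, look for the
-- "Location:" line (the table's HDFS directory) and the "Table Type:"
-- line, which reads MANAGED_TABLE or EXTERNAL_TABLE.
DESC FORMATTED external_tbl;
```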

(3) Partition:

A Hive table is partitioned by the values of one or more partition columns, and each partition corresponds to a directory on HDFS. For example:

Suppose there are two directories, /user/username/input/2015/01 and /user/username/input/2015/02, and the table should be partitioned by year and month. The table can be created as:

CREATE TABLE logs (id int, line string)
PARTITIONED BY (year string, month string);

A query such as select * from logs where month='02' then scans only the directory of the matching partition, /user/username/input/2015/02 in this example (assuming that directory has been registered as the partition year='2015', month='02').

(4) Bucket: To use buckets, first enable bucketing enforcement in Hive:

hive> set hive.enforce.bucketing = true;

Rows are assigned to buckets by hashing the value of the specified column, and each bucket is a file in the table directory.
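
A minimal sketch of both features follows. It assumes the logs table and the directories from the partition example; the table name logs_bucketed and the bucket count of 4 are hypothetical, chosen only for illustration.

```sql
-- Register an existing HDFS directory as a partition of logs, so that
-- queries filtering on year/month read only that directory.
ALTER TABLE logs ADD PARTITION (year = '2015', month = '02')
LOCATION '/user/username/input/2015/02';

-- Partition columns are strings here, so the values are quoted.
SELECT * FROM logs WHERE year = '2015' AND month = '02';

-- Hypothetical bucketed table: rows are hashed on id into 4 buckets,
-- each bucket being one file in the table directory.
CREATE TABLE logs_bucketed (id int, line string)
CLUSTERED BY (id) INTO 4 BUCKETS;

-- With enforcement enabled, Hive writes one file per bucket on insert.
SET hive.enforce.bucketing = true;
INSERT OVERWRITE TABLE logs_bucketed
SELECT id, line FROM logs;
```

Bucketing on a column that is often used in joins or sampling is the usual design choice, since rows with the same value always land in the same bucket file.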


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
