Hive Data Model and storage

Last Update:2018-07-26 Source: Internet

Author: User


In the previous article I walked through a simple example of Hive operations: creating a table named test and loading data into it. These steps resemble relational database operations, and Hive is often compared with relational databases precisely because so many of its concepts are similar.

Relational databases have tables and partitions, and so does Hive; in Hive these constructs make up the data model. This article covers Hive's data types, data model, and file storage formats, all of which can be understood by analogy with relational databases.

First, the data types of Hive.

Hive supports two kinds of data types: atomic data types and complex data types.

The atomic data types include numeric, Boolean, and string types, as shown in the following table:

Basic data types
Type	Description	Example
TINYINT	1-byte (8-bit) signed integer	1
SMALLINT	2-byte (16-bit) signed integer	1
INT	4-byte (32-bit) signed integer	1
BIGINT	8-byte (64-bit) signed integer	1
FLOAT	4-byte (32-bit) single-precision floating-point number	1.0
DOUBLE	8-byte (64-bit) double-precision floating-point number	1.0
BOOLEAN	true/false	TRUE
STRING	String	'xia', "xia"

The table above contains no date type. In early versions of Hive, dates are represented as strings, and common date format conversions are performed through functions. Later versions added dedicated types: TIMESTAMP in Hive 0.8.0 and DATE in Hive 0.12.0.
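As a rough illustration of working with dates stored as strings, the sketch below uses Hive's built-in unix_timestamp, from_unixtime, to_date, and datediff functions. The literal dates are made up for illustration, and running SELECT without a FROM clause assumes Hive 0.13 or later.

```sql
-- Reformat a date string from yyyyMMdd to yyyy-MM-dd.
-- unix_timestamp parses the string using the given pattern;
-- from_unixtime formats the resulting epoch seconds back into a string.
SELECT from_unixtime(unix_timestamp('20180726', 'yyyyMMdd'), 'yyyy-MM-dd');

-- Extract the date part of a timestamp string.
SELECT to_date('2018-07-26 10:30:00');

-- Number of days between two date strings (first argument minus second).
SELECT datediff('2018-07-26', '2018-07-01');
```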

Hive is developed in Java, and its atomic data types correspond one-to-one with Java's primitive types, except for STRING. The signed integer types TINYINT, SMALLINT, INT, and BIGINT are equivalent to Java's byte, short, int, and long primitive types, which are 1-byte, 2-byte, 4-byte, and 8-byte signed integers respectively. Hive's floating-point types FLOAT and DOUBLE correspond to Java's float and double, and Hive's BOOLEAN is equivalent to Java's boolean.

Hive's STRING type is comparable to a database VARCHAR: it is a variable-length string, but you cannot declare a maximum number of characters for it. In theory it can store up to 2 GB of characters.
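The correspondence with Java primitives can be sketched as a table definition with one column per atomic type. The table name atomic_demo and its column names are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table: one column per atomic type,
-- with the corresponding Java primitive noted alongside.
CREATE TABLE atomic_demo (
  tiny_col   TINYINT,   -- Java byte,    1 byte
  small_col  SMALLINT,  -- Java short,   2 bytes
  int_col    INT,       -- Java int,     4 bytes
  big_col    BIGINT,    -- Java long,    8 bytes
  float_col  FLOAT,     -- Java float,   4 bytes
  double_col DOUBLE,    -- Java double,  8 bytes
  bool_col   BOOLEAN,   -- Java boolean
  str_col    STRING     -- no Java primitive counterpart
);
```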

Hive supports implicit conversion between atomic types: a narrower type can be converted to a wider one. For example, TINYINT, SMALLINT, and INT can be converted to FLOAT, and all integer types, FLOAT, and STRING can be converted to DOUBLE. Because Hive is written in Java, these conversions can be understood by analogy with type conversion in Java. Converting a wider type to a narrower one is also supported, but it must be done explicitly with Hive's CAST function.
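A minimal sketch of both directions of conversion follows. The literal values are illustrative only, and running SELECT without a FROM clause assumes Hive 0.13 or later.

```sql
-- Implicit widening: the TINYINT operand is promoted to DOUBLE
-- before the addition is evaluated.
SELECT CAST(1 AS TINYINT) + CAST(2.5 AS DOUBLE);

-- Explicit narrowing: a DOUBLE must be cast to become an INT.
SELECT CAST(CAST(300.7 AS DOUBLE) AS INT);

-- Strings can be cast to numeric types as well.
SELECT CAST('42' AS INT);

-- A cast that cannot be performed yields NULL rather than an error.
SELECT CAST('abc' AS INT);
```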

Complex data types include arrays (ARRAY), maps (MAP), and structs (STRUCT), as shown in the following table:

Complex data types
Type	Description	Example
ARRAY	An ordered collection of fields. All fields must be of the same type	array(1, 2)
MAP	An unordered collection of key/value pairs. Keys must be of an atomic type; values can be of any type. Within one map, all keys must share the same type and all values must share the same type	map('a', 1, 'b', 2)
STRUCT	A collection of named fields. Field types can differ	struct('a', 1, 1.0)
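The constructors in the Example column can be tried directly in a query. The sketch below also uses named_struct, a built-in that lets you choose the field names; the aliases are illustrative, and SELECT without a FROM clause assumes Hive 0.13 or later.

```sql
SELECT
  array(1, 2)                              AS arr,  -- ARRAY<INT>
  map('a', 1, 'b', 2)                      AS m,    -- MAP<STRING, INT>
  struct('a', 1, 1.0)                      AS s,    -- fields are named col1, col2, col3
  named_struct('a', 'x', 'b', 1, 'c', 1.0) AS ns;   -- fields are named a, b, c
```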

Now let's look at an example that uses complex data types in Hive. First, create the table:

CREATE TABLE complex (col1 ARRAY<INT>, col2 MAP<STRING, INT>, col3 STRUCT<a:STRING, b:INT, c:DOUBLE>);

Query statement:

select
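Elements of complex columns are generally read with index notation for arrays, key notation for maps, and dot notation for structs. The following is a minimal sketch against the complex table defined above, not necessarily the query the original article used; the key 'b' is an assumed example value.

```sql
-- First array element, the map value stored under key 'b',
-- and field c of the struct.
SELECT col1[0], col2['b'], col3.c FROM complex;

-- Built-in helpers: number of array elements and the list of map keys.
SELECT size(col1), map_keys(col2) FROM complex;
```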

