Hive Data Model and Storage

Source: Internet
Author: User

In the last article I walked through a simple example of Hive operations: creating a table named test and loading data into it. These operations resemble those of a relational database, and Hive is often compared with relational databases precisely because so many of its concepts are similar.

Relational databases have tables and partitions, and Hive has them too; in Hive they are collectively called the Hive data model. This article describes Hive's data types, data model, and file storage formats. All of this can be understood by analogy with relational databases.

First, let's look at Hive's data types.

Hive supports two kinds of data types: atomic (primitive) types and complex types.

The atomic types include numeric, Boolean, and string types, as shown in the following table:

Basic data types

Type        Description                                               Example
TINYINT     1-byte (8-bit) signed integer                             1
SMALLINT    2-byte (16-bit) signed integer                            1
INT         4-byte (32-bit) signed integer                            1
BIGINT      8-byte (64-bit) signed integer                            1
FLOAT       4-byte (32-bit) single-precision floating-point number    1.0
DOUBLE      8-byte (64-bit) double-precision floating-point number    1.0
BOOLEAN     true/false                                                TRUE
STRING      Character string                                          'Xia', "Xia"

The table above contains no date type. In the Hive version described here, dates are represented as strings, and the common date-format conversions are performed with functions. (Later Hive releases added TIMESTAMP and DATE types.)
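As a rough illustration of working with dates stored as strings, the sketch below uses Hive's built-in functions unix_timestamp, from_unixtime, and datediff. The table events and its column event_day are hypothetical names introduced only for this example, and the column is assumed to hold values in 'yyyy-MM-dd' form.

```sql
-- Hypothetical table: event_day stores dates as 'yyyy-MM-dd' strings.
CREATE TABLE IF NOT EXISTS events (event_day STRING);

-- Reformat each date string from 'yyyy-MM-dd' to 'yyyyMMdd' by going
-- through a Unix timestamp, and count the days up to a fixed end date.
SELECT
  event_day,
  from_unixtime(unix_timestamp(event_day, 'yyyy-MM-dd'), 'yyyyMMdd') AS compact_day,
  datediff('2013-01-31', event_day) AS days_until_end
FROM events;
```

Because the column is a plain STRING, nothing stops a badly formatted value from being loaded; it is up to the query to parse it with the right pattern.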

Hive is developed in Java, and its primitive types correspond one-to-one with Java's primitive types, except for STRING. The signed integer types TINYINT, SMALLINT, INT, and BIGINT are equivalent to Java's byte, short, int, and long: 1-byte, 2-byte, 4-byte, and 8-byte signed integers respectively. Hive's floating-point types FLOAT and DOUBLE correspond to Java's float and double, and Hive's BOOLEAN is equivalent to Java's boolean.

Hive's STRING type is comparable to a database VARCHAR: it is a variable-length string, but you cannot declare the maximum number of characters it may hold. In theory it can store up to 2 GB of characters.
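To make the primitive types concrete, here is a minimal sketch of a table that declares one column of each type described so far. The table name primitive_demo and the column names are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table with one column per primitive type.
CREATE TABLE IF NOT EXISTS primitive_demo (
  tiny_col    TINYINT,
  small_col   SMALLINT,
  int_col     INT,
  big_col     BIGINT,
  float_col   FLOAT,
  double_col  DOUBLE,
  bool_col    BOOLEAN,
  -- STRING takes no length argument, unlike VARCHAR(n) in many databases.
  string_col  STRING
);
```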

Hive supports implicit conversion between primitive types: a narrower type can be converted to a wider one. For example, TINYINT, SMALLINT, and INT can be converted to FLOAT, and all integer types, FLOAT, and STRING can be converted to DOUBLE. These conversions are easy to reason about from Java's type-conversion rules, since Hive is written in Java. Converting a wider type to a narrower one is also supported, but it must be requested explicitly with Hive's CAST function.
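The difference between implicit widening and explicit narrowing might look like the following sketch. The table type_demo and its columns are hypothetical names used only for this example.

```sql
-- Hypothetical table used only to illustrate conversions.
CREATE TABLE IF NOT EXISTS type_demo (
  small_val TINYINT,
  big_val   BIGINT,
  text_val  STRING
);

-- Implicit widening: small_val (TINYINT) is promoted so it can be
-- added to big_val (BIGINT) without any explicit conversion.
SELECT small_val + big_val FROM type_demo;

-- Explicit narrowing: going from BIGINT or STRING down to INT needs CAST.
SELECT CAST(big_val AS INT), CAST(text_val AS INT) FROM type_demo;
```

When a CAST cannot be carried out, for instance when text_val does not contain a number, Hive returns NULL rather than raising an error, so results of narrowing casts are worth checking.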

Complex data types include arrays (ARRAY), maps (MAP), and structs (STRUCT), as shown in the following table:

Complex data types

Type      Description                                                       Example
ARRAY     An ordered collection of fields. All fields must be of the        array(1, 2)
          same type.
MAP       An unordered collection of key/value pairs. Keys must be of an    map('a', 1, 'b', 2)
          atomic type; values may be of any type. Within one map, all
          keys must have the same type and all values must have the
          same type.
STRUCT    A collection of named fields. The fields may be of different      struct('a', 1, 1.0)
          types.

Now let's look at an example of using complex data types in Hive. First, create a table:

CREATE TABLE complex (col1 ARRAY<INT>, col2 MAP<STRING, INT>, col3 STRUCT<a:STRING, b:INT, c:DOUBLE>);

  

Query statement:

select 
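As a sketch of what a query against this table could look like (not necessarily the author's original statement), array elements are read by index, map values by key, and struct fields with dot notation. The index 0, the key 'b', and the field c are arbitrary choices for illustration.

```sql
-- Assumes the complex table defined above exists and holds data.
-- col1[0]   : first element of the ARRAY column
-- col2['b'] : value stored under key 'b' in the MAP column
-- col3.c    : field c of the STRUCT column
SELECT col1[0], col2['b'], col3.c FROM complex;
```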
