HIVE[3] Data types and file formats

Source: Internet
Author: User

Hive supports most of the basic data types found in relational databases, and it also supports three collection types.

3.1 Primitive data types

Hive supports several sizes of integer and floating-point types, as listed below (all type names are reserved words):

    Type        Description
    TINYINT     1-byte signed integer
    SMALLINT    2-byte signed integer
    INT         4-byte signed integer
    BIGINT      8-byte signed integer
    BOOLEAN     Boolean, TRUE or FALSE
    FLOAT       Single-precision floating-point number
    DOUBLE      Double-precision floating-point number
    STRING      Character string
    TIMESTAMP   Timestamp, given as an integer, floating-point number, or string
    BINARY      Byte array

Note that these types are implemented in Java, so their detailed behavior is exactly the same as that of the corresponding Java types. If a query compares a FLOAT column with a DOUBLE column, Hive implicitly converts the value to the larger of the two types. An explicit conversion is written with cast; for example, cast(s AS INT) converts s to the INT type.

3.2 Collection data types

Columns in Hive can use the collection data types STRUCT, MAP, and ARRAY:

    Type     Description                                                   Literal example
    STRUCT   A group of named fields, for example a column of type         struct('John', 'Doe')
             STRUCT{first STRING, last STRING}
    MAP      A collection of key-value pairs; a value is retrieved by      map('first', 'John', 'last', 'Doe')
             key, for example field_name['last']
    ARRAY    An ordered sequence of elements of the same type,             array('John', 'Doe')
             for example ['John', 'Doe']

For example, an employees table:

    CREATE TABLE employees (
      name         STRING,               -- employee name
      salary       FLOAT,                -- salary
      subordinates ARRAY<STRING>,        -- names of subordinate staff
      deductions   MAP<STRING, FLOAT>,   -- amounts withheld from each paycheck (tax, social security, provident fund, etc.)
      address      STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>  -- employee home address
    );

3.3 Text file encoding of data values

Text files are commonly encoded as comma-separated values (CSV) or tab-separated values (TSV). The drawback of both formats is that you have to watch out for commas or tabs that appear inside the text and are not meant to be delimiters. Hive therefore uses the following record and field delimiters by default:

    Delimiter   Description
    \n          Line break; each line of a text file is one record
    ^A          Separates fields (columns); written with the octal code \001 in a CREATE TABLE statement
    ^B          Separates the elements of an ARRAY or STRUCT, or the key-value pairs of a MAP; written with the octal code \002
    ^C          Separates a key from its value within a MAP; written with the octal code \003

You can override the defaults and specify the delimiters explicitly, for example:

    CREATE TABLE employees (
      name         STRING,
      salary       FLOAT,
      subordinates ARRAY<STRING>,
      deductions   MAP<STRING, FLOAT>,
      address      STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>
    )
    ROW FORMAT DELIMITED                        -- must come before the clauses below (except STORED AS)
    FIELDS TERMINATED BY '\001'                 -- Hive uses ^A as the column delimiter
    COLLECTION ITEMS TERMINATED BY '\002'       -- Hive uses ^B as the delimiter between collection elements
    MAP KEYS TERMINATED BY '\003'               -- Hive uses ^C as the delimiter between a MAP key and its value
    LINES TERMINATED BY '\n'
    STORED AS TEXTFILE;

The last two clauses, LINES TERMINATED BY and STORED AS, do not require the ROW FORMAT DELIMITED keywords. STORED AS TEXTFILE is rarely written out.

Note: at the time of writing, Hive supports only '\n' for LINES TERMINATED BY, so the delimiter between rows can only be a line break. Hive also supports other text formats, which are covered in more detail in Chapter 15.

A table whose fields are separated by commas can be defined like this:

    CREATE TABLE some_data (
      first  FLOAT,
      second FLOAT,
      third  FLOAT
    )
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ',';

3.4 Schema on read

Hive does not check data against the schema when the data is written, and it does not validate the data when it is loaded. Validation happens at query time instead, which is known as schema on read. If the schema and the file contents do not match, Hive tries to recover from the mismatch as well as it can:

    - If a record has fewer fields than the schema defines, the missing fields appear as NULL in the query results.
    - If a field is declared numeric but Hive finds a non-numeric string when reading it, NULL is returned for that field.
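
The default delimiters from section 3.3 and the NULL behavior from section 3.4 can be illustrated with a small simulation. The sketch below is written in Python rather than HiveQL so that it runs without a Hive installation; it is not Hive's actual code. The three-column schema, the helper names (to_float, to_array, to_map, read_row), and the sample records are assumptions made up for this illustration.

```python
# Hive's default delimiters for text files (section 3.3).
FIELD_SEP = "\x01"  # ^A, octal \001: separates columns
ITEM_SEP = "\x02"   # ^B, octal \002: separates collection elements
KEY_SEP = "\x03"    # ^C, octal \003: separates a MAP key from its value


def to_float(text):
    """Return text as a float, or None (standing in for NULL) if it is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def to_array(text):
    """Split an ARRAY field into its elements."""
    return text.split(ITEM_SEP) if text else []


def to_map(text):
    """Split a MAP<STRING, FLOAT> field into a dict of key -> float (or None)."""
    result = {}
    for pair in to_array(text):
        key, _, value = pair.partition(KEY_SEP)
        result[key] = to_float(value)
    return result


# Hypothetical schema for this sketch: (column name, converter) pairs.
SCHEMA = [("name", str), ("salary", to_float), ("deductions", to_map)]


def read_row(line):
    """Apply SCHEMA to one record at read time; nothing was checked at load time."""
    fields = line.rstrip("\n").split(FIELD_SEP)
    row = {}
    for i, (column, convert) in enumerate(SCHEMA):
        # A record with fewer fields than the schema yields None for the rest.
        row[column] = convert(fields[i]) if i < len(fields) else None
    return row


# Made-up sample records.
good = "John Doe\x01100000.0\x01Federal Taxes\x030.2\x02State Taxes\x030.05\n"
bad_number = "Mary Smith\x01not-a-number\x01Federal Taxes\x030.2\n"
short = "Todd Jones\n"

for record in (good, bad_number, short):
    print(read_row(record))
```

The second record keeps its name and deductions but gets None for salary, and the third record gets None for every column after name, which mirrors the NULL values a Hive query would show for mismatched data.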
