Hive Quick Start

Source: Internet
Author: User
Tags null null prepare hadoop fs
Hive Quick Start

@ (HIVE) [HIVE] (a) Simple introduction

1. Create a table

CREATE table if not exists ljh_emp (
name string,
salary float,
gender string)
comment ' Basic information of a employee '
row format delimited fields terminated by ', ';

2. Prepare data files
Create the test directory with only one file in the directory, and the contents of the file are as follows:

Ljh,25000,male
Jediael,25000,male
Llq,15000,female

3, import the data into the table

Load data local inpath '/home/ljhn1829/test ' into table ljh_emp;

4, the contents of the query table

SELECT * from Ljh_emp;
OK
ljh    25000.0    male
jediael    25000.0
male llq 15000.0 Female Time taken:0.159 seconds, Fetched:3 row (s)
(ii) about separators

1. Default Separator
The line default delimiter in hive is \ n, the field delimiter is CTRL + A, and ctrl+b,ctrl+c can be used to separate array,struct,map, as detailed in the Hive Programming Guide P44.
Therefore, if the table is in the form without specifying row format delimited fields terminated by ', the default field delimiter is considered Ctrl + A.
There are 2 different solutions:
The first is to specify the separator when creating the table, as shown in the previous example,
The second is to use CTRL + A in the data file, see the following example

2. Use CTRL + a full separator in the data file
(1) Create a table
CREATE table Ljh_test_emp (name string, salary float, gender string);
(2) Prepare data files
Create the Test2 directory, with only one file in the directory, the contents of the file are as follows:
Ljh^a25000^amale
Jediael^a25000^amale
Llq^a15000^afemale
One of the ^a characters can only be seen in VI, cat can not see.
The output ^a method is: In VI insert mode, first press CTRL + V, then press CTRL + A
(3) Import the data into the table
CREATE table Ljh_test_emp (name string, salary float, gender string);
(4) Query data

Hive> select * from Ljh_test_emp;
OK
ljh    25000.0    male
jediael    25000.0
male llq 15000.0 female time taken : 0.2 seconds, Fetched:3 row (s)

3, the separator is not specified, and CTRL + A is not used as a separator in the file, the following error occurred
(1) Create a table

CREATE table if not exists ljh_emp_test (
name string,
salary float,
gender string)
comment ' Basic Information of a employee ';

(2) Preparing data

Ljh,25000,male
Jediael,25000,male
Llq,15000,female

(3) Import the data into the table

Load data local inpath '/home/ljhn1829/test ' into table ljh_emp_test;

(4) View the data in the table

SELECT * from Ljh_emp_test;
OK
ljh,25000,male null null
jediael,25000,male    null
Llq,15000,female    NULL    NULL time
taken:0.185 seconds, Fetched:3 row (s)

As you can see, because the delimiter is CTRL + A, each row in the file is treated as the first field when the data is imported, resulting in the subsequent 2 fields being null.

(iii) a more complex table
1. Create a table

CREATE TABLE Employees (
    name string,
    slalary float,
    suboddinates array<string>,
    deductions Map <string,float>, Address
    struct<stree:string, city:string, state:string, zip:int>
)
Partitioned by (Country string, state string);

2. Prepare data

John doe^a100001.1^amary smith^btodd jones^afederal taxes^c.2^bstatetaxes^c.05^binsurance^c.1^a1 Michigan Ave.^ bchicago^bil^b60600
Mary smith^a80000.0^abill king^afederal taxes^c.2^bstate taxes^c.05^binsurance^c.1^a100 Ontario st.^bchicago^bil^b60601
Todd jones^a70000.0^a^afederal taxes^c.15^bstate taxes^c.03^binsurance^c.1^ A200 Chicago ave.^boak park^bil^b60700
Bill king^a60001.0^a^afederal taxes^c.15^bstate taxes^c.03^binsurance^ c.1^a300 obscure dr.^bobscuria^bil^b60100

Note ^a: Separating fields ^b: separating elements in Array/struct/map ^c: Separating the KV in the map
See "Hive Programming Guide" P44.

3, import the data into the table

Load data local inpath '/home/ljhn1829/phd ' into table employees partition (country= ' Us ', state= ' Ca ');

4. View Table Data

Hive> SELECT * FROM Employees;
OK
John Doe    100001.1    ["Mary Smith", "Todd Jones"]    {"The taxes": 0.2, "statetaxes": 0.05, "Insurance ": 0.1}    {" Stree ":" 1 Michigan Ave. "," City ":" Chicago "," state ":" IL "," Zip ": 60600}    US    ca
Mary Smith    80000.0    ["Bill King"]    {"The taxes": 0.2, "state taxes": 0.05, "Insurance": 0.1}    {"Stree": "Ontario St.", "City": "Chicago", "state": "IL", "Zip": 60601}    US    CA
Todd Jones    70000.0    []    {"Taxes": 0.15, "state taxes": 0.03, "Insurance": 0.1}    {"Stree": "Chicago Ave.", "City": "Oak Park", "state": "IL", "Zip": 60700}    US    ca
Bill King    60001.0    []    {"Taxes": 0.15, "state taxes": 0.03, "Insurance": 0.1}    {"Stree": "Obscure Dr.", " City ': ' Obscuria ', ' state ': ' IL ', ' Zip ': 60100}    US    ca time
taken:0.312 seconds, Fetched:4 row (s)

5, view the files in the HDFs

Hadoop fs-ls/data/gamein/g4_us/meta/employees/country=us/state=ca
Found 1 Items
-rwxr-x---   3 ljhn1829 G4_us        428 2015-05-12 12:49/data/gamein/g4_us/meta/employees/country=us/state=ca/progamming_hive_data.txt

The content in this file is consistent with the original file.

(iv) Inserting data through a SELECT clause
1. Create a table

CREATE TABLE Employees2 (
    name string,
    slalary float,
    suboddinates array<string>,
    deductions Map<string,float>, Address
    struct<stree:string, city:string, state:string, zip:int>
)
Partitioned by (Country string, state string);

2. Inserting data

Hive>  set hive.exec.dynamic.partition.mode=nonstrict;

Otherwise, the following exceptions occur:

 failed:semanticexception [Error 10096]: Dynamic partition Strict mode requires at least one static partition col Umn. To turn this off set hive.exec.dynamic.partition.mode=nonstrict inserts into table Employees2 partition (country,state) SE
Lect name,slalary,suboddinates,deductions,address, E.country, e.state from Employees e; 

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.