Hive RegexSerDe and Views

Source: Internet
Author: User

The EXTERNAL keyword lets the user create an external table and, while creating it, specify a path (LOCATION) that points to the actual data. When Hive creates an internal (managed) table, the data is moved to the path that the data warehouse points to. When it creates an external table, Hive only records the path where the data is located and does not change the location of the data. When you drop a table, the metadata and data of an internal table are deleted together. For an external table, only the metadata is removed and the data is not deleted.

1. LIKE allows the user to copy the structure of an existing table, but it does not copy the data.
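A minimal sketch of the LIKE clause follows. Both table names are hypothetical: src_table is assumed to exist already, and src_table_copy is made up for illustration.

```sql
-- Hypothetical names: src_table is assumed to exist already.
CREATE TABLE src_table_copy LIKE src_table;

-- The copy has the same column definitions as src_table...
DESCRIBE src_table_copy;

-- ...but it holds no rows, because LIKE copies structure only.
SELECT COUNT(*) FROM src_table_copy;
```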

2. Use of RegexSerDe in Hive

RegexSerDe is a serializer/deserializer (SerDe) that ships with Hive. It is mainly used to parse rows with a regular expression.

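CREATE TABLE test_serde (
c0 STRING,
c1 STRING,
c2 STRING)
ROW FORMAT
SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES
-- Each capture group in input.regex maps to one column, in order.
('input.regex' = '([^ ]*) ([^ ]*) ([^ ]*)',
-- output.format.string describes how the columns are written back out.
'output.format.string' = '%1$s %2$s %3$s')
STORED AS TEXTFILE;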
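To make the column mapping concrete, here is a small sketch that assumes the table above has been created and that its data consists of space-separated lines. The local file path and its contents are hypothetical. Depending on the installation, the contrib SerDe class may also need the hive-contrib JAR to be added to the session first.

```sql
-- Hypothetical local file /tmp/test_serde.txt containing one line:
--   a b c
LOAD DATA LOCAL INPATH '/tmp/test_serde.txt' INTO TABLE test_serde;

-- Capture group 1 feeds c0, group 2 feeds c1, group 3 feeds c2.
SELECT c0, c1, c2 FROM test_serde;
```

A line that does not match the whole regular expression is not split at all: RegexSerDe returns NULL for every column of that row, so a result full of NULLs usually points at a pattern problem rather than missing data.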

3. Table and column names are case-insensitive.

4. Create an external table and specify the data storage path

CREATE EXTERNAL TABLE exter_trl (
id INT,
name STRING,
age INT,
tel STRING)
-- LOCATION belongs after the column list, not inside it.
LOCATION '/user/data/trl/external';

1. When data is imported into an external table, it is not moved into Hive's own data warehouse directory, which means the data in an external table is not managed by Hive itself. A managed table is different: its data is moved into the warehouse directory.

2. When a managed table is dropped, Hive deletes all of the metadata and data that belong to it. When an external table is dropped, Hive deletes only the metadata for the external table, and the data is not deleted.

So, how do you choose which kind of table to use? In most cases there is not much difference, so the choice is largely a matter of personal preference.

As a rule of thumb, if all processing will be done by Hive, create a managed table; otherwise, use an external table.
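The difference in drop behavior can be sketched as follows. The table names and the HDFS path are hypothetical and exist only for illustration.

```sql
-- Hypothetical table names and path, for illustration only.
CREATE TABLE managed_demo (id INT, name STRING);

CREATE EXTERNAL TABLE external_demo (id INT, name STRING)
LOCATION '/user/data/demo/external';

-- The "Table Type" field in the output reports MANAGED_TABLE or EXTERNAL_TABLE.
DESCRIBE FORMATTED external_demo;

-- Removes the metadata and the table's files in the warehouse directory.
DROP TABLE managed_demo;

-- Removes only the metadata; files under /user/data/demo/external remain.
DROP TABLE external_demo;
```

Because the files survive the drop, an external table with the same LOCATION can be created again later and will see the same data.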

5. Loading data into a specified partition

LOAD DATA INPATH
'/user/data/clickstat_gp_fatdt0/0' OVERWRITE INTO TABLE c02_clickstat_fatdt1
PARTITION (dt='20140820');
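A short sketch for checking the result of such a load is shown below. It assumes that c02_clickstat_fatdt1 was created with a partition column dt (for example PARTITIONED BY (dt STRING)), which the original text does not show.

```sql
-- Assumption: c02_clickstat_fatdt1 is partitioned by a column named dt.
-- List the partitions Hive knows about; dt=20140820 should be among them.
SHOW PARTITIONS c02_clickstat_fatdt1;

-- Read a few rows from the partition that was just loaded.
SELECT * FROM c02_clickstat_fatdt1 WHERE dt = '20140820' LIMIT 10;
```

Without the LOCAL keyword, LOAD DATA INPATH moves the source files within HDFS instead of copying them, so the source path no longer contains them after the load.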

5. Specify the HDFS path when creating the table

CREATE EXTERNAL TABLE page_view (viewtime INT, userid BIGINT,
page_url STRING, referrer_url STRING,
ip STRING COMMENT 'IP Address of the User',
country STRING COMMENT 'country of origination')
COMMENT 'This is the staging page view table'
-- '\054' is the octal escape for a comma.
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054'
STORED AS TEXTFILE
LOCATION '

6. Hive views (a view simplifies complex queries)

CREATE VIEW test_trl AS
SELECT * FROM t1 JOIN t2
ON (t1.id = t2.id) WHERE t1.name = 'TRL';
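As a variation, the sketch below lists the view's columns explicitly. The column sets t1(id, name) and t2(id, age) and the view name test_trl_explicit are assumptions made for illustration. Naming the columns avoids ambiguity when both joined tables contain a column with the same name, as id suggests here.

```sql
-- Hypothetical columns: t1(id, name) and t2(id, age).
CREATE VIEW test_trl_explicit AS
SELECT t1.id, t1.name, t2.age
FROM t1 JOIN t2 ON (t1.id = t2.id)
WHERE t1.name = 'TRL';

-- A view is queried like a table; the underlying join runs at query time.
SELECT id, age FROM test_trl_explicit;

-- Dropping the view removes only its definition, not the data in t1 or t2.
DROP VIEW test_trl_explicit;
```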

Copyright notice: This is an original article by the blog author and may not be reproduced without consent.
