A Solution for Loading JSON Data into Hive

Source: Internet
Author: User

Hive does not support loading JSON-format data out of the box; by default it loads delimited text files such as CSV. This post focuses on how to parse JSON data in Hive without relying on an external JAR package.

First, create the base table, which stores each JSON record as a single string column:

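CREATE TABLE access_log (content string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 'hdfs://sps1:9090/data/accesslog';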

Then create a view:

CREATE VIEW access_log_view AS
SELECT eventtime, ip, appname, fp, username, target
FROM access_log
LATERAL VIEW json_tuple(content, "eventtime", "ip", "appname", "fp", "username", "target") t1
AS eventtime, ip, appname, fp, username, target;

The view uses json_tuple to extract the fields from the JSON object held in the content column, which splits each record into separate columns.

However, some log files are stored in partitioned directories, such as /user/aaa/dt=2013-12-01/ds=01/access.log. This layout requires a partitioned table.

To create a partitioned table:

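CREATE TABLE access_log (content string)
PARTITIONED BY (dt int, ds int)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 'hdfs://sps1:9090/data/accesslog4';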

But now a problem appears: queries against the table find no data. What can be done about that?

The partitions have to be added manually:

ALTER TABLE access_log ADD PARTITION (dt=?, ds=?);

After this, queries can find the data. Remember that every partition must be added explicitly; otherwise the data in its directory will not be visible.
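As a rough illustration of how a partition directory maps to an ADD PARTITION statement, the sketch below derives the statement from a path like the one shown earlier. The function name partition_stmt_from_path is hypothetical and not part of Hive. It assumes every key=value path component is a partition column, and it emits the values as quoted strings, so the quoting may need adjusting if the partition columns are declared as int. IF NOT EXISTS is added so that re-running the statement is harmless.

```shell
#!/bin/bash
# Hypothetical helper (not part of Hive): build an ADD PARTITION statement
# from a partition directory path. Assumes each key=value path component
# names a partition column and its value.
partition_stmt_from_path() {
  local table="$1" rest="$2" spec="" part
  while [ -n "$rest" ]; do
    part="${rest%%/*}"              # first path component
    case "$rest" in
      */*) rest="${rest#*/}" ;;     # drop it and the following slash
      *)   rest="" ;;               # last component reached
    esac
    case "$part" in
      *=*) spec="${spec:+$spec, }${part%%=*}='${part#*=}'" ;;
    esac
  done
  printf "ALTER TABLE %s ADD IF NOT EXISTS PARTITION (%s);\n" "$table" "$spec"
}

partition_stmt_from_path access_log /user/aaa/dt=2013-12-01/ds=01/access.log
```

The printed statement can then be reviewed and passed to the Hive CLI with hive -e.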

Then create the view:

Same as the CREATE VIEW statement above.

But the number of partitions keeps growing over time, and adding them by hand is not practical, so we need an automated script to do it for us:

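#!/bin/bash
# Load the environment so the hive command is on the PATH
. ~/.bashrc
# Current date (e.g. 2013-12-01) and hour (00-23)
date=`date +%Y-%m-%d`
hour=`date +%H`
cmd="ALTER TABLE databasename.tablename ADD PARTITION (dt='$date', ht='$hour');"
# Run the statement through the Hive CLI
hive -e "$cmd"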
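A possible variation, sketched under a few assumptions, builds the statement in a function so that a specific date and hour can be passed in for backfilling, and uses IF NOT EXISTS so that re-runs do not fail on partitions that are already registered. The function name add_partition_sql and the command-line arguments are illustrative only; databasename.tablename and the dt/ht partition columns are the placeholders from the script above.

```shell
#!/bin/bash
# Illustrative helper: build the ADD PARTITION statement for a table, date and hour.
# IF NOT EXISTS keeps re-runs harmless when the partition is already registered.
add_partition_sql() {
  local table="$1" day="$2" hour="$3"
  printf "ALTER TABLE %s ADD IF NOT EXISTS PARTITION (dt='%s', ht='%s');\n" \
    "$table" "$day" "$hour"
}

# Default to the current date and hour; optional arguments allow backfilling.
day="${1:-$(date +%Y-%m-%d)}"
hour="${2:-$(date +%H)}"

# Print the statement for inspection.
add_partition_sql databasename.tablename "$day" "$hour"

# To apply it, pass the statement to the Hive CLI, for example:
#   hive -e "$(add_partition_sql databasename.tablename "$day" "$hour")"
```

Scheduling the script hourly, for example from cron, would register each new partition as its directory starts to fill.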

That covers loading JSON data into Hive and working with partitioned tables. If anything is unclear, leave a message below and we can continue the discussion.
