I have used Hive for a while but never written about it, because I mostly used it to create tables, upload data, and run CRUD operations. Later, my work called for some frequently used helper methods, and I learned that Hive supports UDFs (user-defined functions). After reading a few articles, I found that writing a UDF is also very simple: inherit the UDF class and over
I have already set up the Hadoop and Hive environments, created a table, page, in Hive, and loaded the data into it. Now I want to count the traffic of each URL in this table and either store the result in a relational database or display it on a page. What should I do?
According to the official website, this can be implemented in Java, Python, or PHP. The following is a simple script written in Python.
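The original script is cut off in this excerpt, so the following is only a minimal sketch of the idea: counting hits per URL from tab-separated rows, which is the default format Hive uses when it streams rows to an external script. The URL column position (`url_index`), the function names, and the sample rows are assumptions made for illustration.

```python
from collections import Counter


def count_url_hits(lines, url_index=0):
    """Count how many rows mention each URL.

    `lines` is any iterable of tab-separated rows; `url_index` is the
    (assumed) zero-based position of the URL column.
    """
    hits = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= url_index or not fields[url_index]:
            continue  # skip blank or malformed rows
        hits[fields[url_index]] += 1
    return hits


def format_counts(hits):
    """Render counts as tab-separated 'url<TAB>count' lines, busiest first."""
    return [f"{url}\t{count}" for url, count in hits.most_common()]


# Hypothetical sample rows: url <TAB> visitor ip
sample_rows = [
    "/index.html\t10.0.0.1\n",
    "/about.html\t10.0.0.2\n",
    "/index.html\t10.0.0.3\n",
]

for row in format_counts(count_url_hits(sample_rows)):
    print(row)
```

In a streaming setup the same function could be fed `sys.stdin` instead of `sample_rows`, and the printed lines could then be loaded into a relational database or rendered on a page.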
Overview
Hive metadata is usually stored in a relational database, most commonly a MySQL database that serves as the metastore. The earlier Hive installation also stored its metadata in MySQL.
Hive's metadata spans 57 tables in the MySQL database.
1. The metadata table (VERSION) that stores the
1. Remote mode
This storage mode requires a MySQL server running on a remote machine, and the metastore service must be started on the Hive server. Here a MySQL test server with IP 192.168.1.214 is used; create a new hive_remote database with the latin1 character set.
$ vim hive-site.xml
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/wareho
http://web.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-14-2.pdf (Auxiliary reference: https://cwiki.apache.org/confluence/display/Hive/Correlation+Optimizer)
Introduction
Primary deficiencies of Hive: storage and query plan execution. The paper proposes three main improvements.
New file format: ORC
Query planning component optimization (correlation optimizer, correlati
Getting Started with Hive (II): Metadata in the Hive Architecture
Hive stores its metadata in a database (the metastore). Supported databases include MySQL, Derby, and Oracle; the default is Derby.
The metadata in Hive includes the name of the table
On 32-bit and 64-bit Windows systems, for which several versions of the Hive ODBC Connector exist, the connector version must match the system exactly at installation time (that is, the 32-bit connector can only run on 32-bit systems, and the 64-bit connector only on 64-bit systems). Reference: http://doc.mapr.com/display/MapR/Hive+ODBC+Connector#HiveODBCConnector-HiveODBCConnectoronWindows
Contents:
Package
Install Hive and Hive
Hive installation is relatively simple, because few configuration files need to be modified.
1. Download and decompress
I put it in /usr/hadoop/hive
2. Set the environment variables. (It seems that they are not set)
vim /etc/profile
export JAVA_HOME=/usr/java/jdk8
export HADOOP_HOME=/usr/
VIII. The SELECT Query Statement in Hive
In any database system, SELECT is the most frequently used statement and also the most complex one. The SELECT syntax that Hive supports is likewise fairly complex, so this article only attempts an introduction.
8.1 Basic Query Syntax
The basic SELECT syntax in Hive is largely consistent with standard SQL syntax, which supports whe
1. Create the lib121 directory under the hive0.13.1 version
cd /opt/cloudera/parcels/cdh/lib/hive; mkdir lib121
2. Download Hive 1.2.1 and copy all files from its lib directory to lib121
3. Modify the HIVE_LIB variable in /opt/cloudera/parcels/cdh/lib/hive/bin/hive
HIVE_LIB=${HIVE_HOME}/lib121
4. Update the JLine jar package on Hadoop and remove the ol
Logs record how a program runs and are a powerful tool for finding problems. Hive has two types of logs: 1. The system log records Hive's running status and errors. 2. The job log records the execution history of Hive jobs. Where are system logs stored? The storage
Hive custom functions come in three kinds: UDF, UDAF, and UDTF.
UDF (user-defined function): one row in, one row out.
UDAF (user-defined aggregation function): an aggregate function, many rows in, one row out, such as count/max/min.
UDTF (user-defined table-generating function): one row in, many rows out, such as lateral view explode().
How to use: add the custom function's jar file in a Hive session, then create a function to use it.
UDF
1. The UDF
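The row-count contracts of the three kinds can be illustrated with plain Python functions. This is only an analogy under assumed names (`udf_upper`, `udaf_max`, `udtf_explode`); real Hive functions are written against Hive's Java APIs and registered from a jar.

```python
def udf_upper(value):
    """UDF analogue: one value in, one value out."""
    return value.upper()


def udaf_max(values):
    """UDAF analogue: many values in, one value out."""
    return max(values)


def udtf_explode(items):
    """UDTF analogue: one value in (a list), many rows out."""
    for item in items:
        yield item


rows = ["a,b", "c"]

# One output per input row.
print([udf_upper(r) for r in rows])
# One output for the whole column.
print(udaf_max(rows))
# Possibly several output rows per input row.
print([piece for r in rows for piece in udtf_explode(r.split(","))])
```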
Hive UDF operation
Procedure for using a UDF: add the custom function's jar file in the Hive session, create the function, and then use it. An example follows.
Task: compute the PV and UV of each activity.
First, use a regular expression in Java to extract the title name. Take a link and extract the string marked in red.
http://cms.yhd.com/sale/vtxqclczfto?tc=ad.0.0.
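A minimal sketch of this task, written in Python rather than as a Java UDF so that the logic is easy to follow. It assumes the activity name is the path segment after `/sale/` (the highlighting that marked the target string is lost in this excerpt) and that each record carries a visitor id; `extract_activity`, `pv_uv`, and the sample records are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: the activity name is the path segment that follows "/sale/".
ACTIVITY_RE = re.compile(r"/sale/([^/?#]+)")


def extract_activity(url):
    """Return the activity name from a URL, or None if the pattern is absent."""
    match = ACTIVITY_RE.search(url)
    return match.group(1) if match else None


def pv_uv(records):
    """Compute page views (PV) and unique visitors (UV) per activity.

    `records` is an iterable of (url, visitor_id) pairs.
    """
    views = defaultdict(int)
    visitors = defaultdict(set)
    for url, visitor_id in records:
        activity = extract_activity(url)
        if activity is None:
            continue  # not an activity page
        views[activity] += 1
        visitors[activity].add(visitor_id)
    return {name: (views[name], len(visitors[name])) for name in views}


# Hypothetical sample records: (url, visitor id)
records = [
    ("http://cms.yhd.com/sale/vtxqclczfto?tc=ad.0.0.", "u1"),
    ("http://cms.yhd.com/sale/vtxqclczfto?tc=ad.0.0.", "u1"),
    ("http://cms.yhd.com/sale/vtxqclczfto?tc=ad.0.0.", "u2"),
    ("http://cms.yhd.com/index", "u3"),
]

print(pv_uv(records))
```

PV counts every visit, while UV counts each visitor once per activity, which is why the sketch keeps a set of visitor ids alongside the running view count.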
What is Hive?
Data warehousing: storing, querying, and analyzing large-scale data
SQL language: an easy-to-use SQL-like query language
Programming model: allows developers to write custom UDFs, Transforms, Mappers, and Reducers, making it easier to do work that is hard to express in plain MapReduce
Data format: process data in any format on Hadoop, or store data on Hadoop in an optimized format such as RCFile, ORCFile, or Parquet
Data services: HiveServer2, multiple API
1. Hive internal tables
An internal (managed) table is a table created in the normal way, as already described in http://www.cnblogs.com/raphael5200/p/5208437.html.
2. Hive external tables
To create an external table, use the EXTERNAL keyword:
CREATE EXTERNAL TABLE [IF NOT EXISTS] [db_name.]table_name [(col_name data_type [COMMENT col_comment], ...)] [COMMENT table_comment] [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
Description:
The Hive table pms.cross_sale_path is partitioned by date. The data in the HDFS directory /user/pms/workspace/ouyangyewei/testusertrack/job1output/crosssale is written to the table's $yesterday partition.
Table structure:
hive -e "set mapred.job.queue.name=pms;
drop table if exists pms.cross_sale_path;
create external table pms.cross_sale_path (track_id string, track_time string, session_id string, gu_id string, end_user_id string, page_category_id bigint, algorithm_id i
Reading the table structure in Hive. This article contains a Table class and a Field class, which are used to encapsulate the table structure; a quick read-through is enough to follow them.
(Change the code format)
1. Table class
import java.util.List;

// Holds a Hive table's name together with its list of fields.
public class Table {
    private String tableName;
    private List<Field> field;
    public Table() {
    }
    public Table(String tableName, List<Field> field) {
        this.tableName = tableName;
        this.field = field;
    }
    public String getTableName() {
        return tableName;
    }
    public void setTa
//server110:3306/hive?createDatabaseIfNotExist=true
hive-site.xml configuration for replacing Hive's default Derby with MySQL as the metadata store
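The configuration itself is not included in this excerpt. As a hedged sketch of what it might contain, the following builds a hive-site.xml with Python's standard `xml.etree.ElementTree`. The four `javax.jdo.option.*` names are Hive's standard metastore connection properties; the host, port, and database come from the URL fragment above, while the `jdbc:mysql:` scheme, the driver version, the credentials, and the helper `build_hive_site` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def build_hive_site(properties):
    """Build a hive-site.xml document from a dict of property names to values."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


metastore_properties = {
    # Host, port, and database taken from the fragment above; the
    # "jdbc:mysql:" scheme is assumed.
    "javax.jdo.option.ConnectionURL":
        "jdbc:mysql://server110:3306/hive?createDatabaseIfNotExist=true",
    # Driver class for MySQL Connector/J 5.x (assumed connector version).
    "javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
    # Placeholder credentials, not from the source.
    "javax.jdo.option.ConnectionUserName": "hive",
    "javax.jdo.option.ConnectionPassword": "change-me",
}

hive_site_xml = build_hive_site(metastore_properties)
print(hive_site_xml)
```

Generating the file from a dictionary keeps the property list in one place; writing the same four properties into hive-site.xml by hand works equally well.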
I. Hive command line
1. Commands supported by Hive
Command          Description
quit             Use quit or exit to leave the interactive shell.
set key=value    Sets the value of a particular configuration variable. Note that if you misspell the variable name, the CLI will not show an error.
set              Prints a list of configuration variables that are overridden by the user or