Hadoop (VI) -- Sub-project Pig


Earlier we covered the two pillars of Hadoop, HDFS and MapReduce: we put big-data files on HDFS and write MapReduce programs in Java to perform all kinds of data analysis and prediction, realizing the business value of big data and, with it, the value of Hadoop itself.

But in traditional systems we analyze data through a database, whether a relational database such as Oracle, SQL Server, or MySQL, or a NoSQL database such as MongoDB. When moving to Hadoop for big-data analysis, how can we make the transition quickly and smoothly without having to hand-write Java MapReduce programs? Hadoop's sub-projects Pig and Hive exist for exactly this purpose; this post looks at how Pig is used to operate Hadoop.

One, What is Pig:


Pig is one of the projects that Yahoo donated to Apache. It is essentially a client of Hadoop, an upper-layer derivative architecture that wraps MapReduce. Through the Pig client, users write Pig Latin, a SQL-like data flow language, to process data stored on HDFS, which is simply a bit easier. Pig is the translator between the Pig Latin data flow language and MapReduce, much like an interface between two different languages, so some people say Pig consists of two parts: the Pig interface and Pig Latin.



Two, Pig's two modes of operation:


1, Local mode: all files and execution happen locally; this is typically used for testing programs. Start local mode with: pig -x local


2, MapReduce mode: the actual working mode. Pig translates queries into MapReduce jobs and then executes them on the Hadoop cluster.
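As a quick illustration, both modes are started from the shell with the -x flag; this is a minimal sketch and assumes the pig launcher is already on your PATH (installation is covered in the next section):

pig -x local        # start Grunt in local mode (uses the local file system)
pig -x mapreduce    # start Grunt in MapReduce mode (also the default when running pig with no -x flag)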

Three, Installing Pig:


1, Download and unzip: download Pig from the Apache website. I downloaded pig-0.15.0.tar.gz and placed it under the same path as Hadoop (choose your own location). Unzip it with: tar -xvf pig-0.15.0.tar.gz
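For example, a minimal sketch of this step; the download URL and the /home/ljh install path are assumptions based on the paths used later in this article:

# download Pig 0.15.0 from the Apache archive (URL assumed) and unpack it next to Hadoop
wget https://archive.apache.org/dist/pig/pig-0.15.0/pig-0.15.0.tar.gz
tar -xvf pig-0.15.0.tar.gz -C /home/ljh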


2, Set environment variables: edit /etc/profile with vi and set Pig's bin path and the JDK path, then reload the profile (or log in again) to complete the local-mode installation.

export PATH=$PATH:/home/ljh/pig-0.15.0/bin

export JAVA_HOME=/usr/jdk1.8.0_51

3, MapReduce mode configuration: also set the Hadoop-related environment variables and reload the profile; you can then run Pig directly in MapReduce mode.

export PATH=$PATH:/home/ljh/pig-0.15.0/bin:/home/ljh/hadoop-1.2.1/bin

export JAVA_HOME=/usr/jdk1.8.0_51

export PIG_CLASSPATH=/home/ljh/hadoop-1.2.1/conf/
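A quick way to check the setup, sketched under the assumption that the exports above have been added to /etc/profile:

source /etc/profile   # reload the profile so the new variables take effect
pig -version          # should print the Pig 0.15.0 banner
pig                   # with Hadoop configured, this starts Grunt in MapReduce mode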

Four, Grunt shell commands:


After entering Grunt you can use Linux-like commands to perform various operations; they behave very much like their Linux counterparts, so try them out:

For example: ls, cd, and cat work exactly as in Linux, only inside the Hadoop environment;

copyToLocal copies a file from the Hadoop environment to the local file system, and copyFromLocal copies a local file into the Hadoop environment;

sh executes operating-system commands, for example: sh /usr/jdk/bin/jps. Anything unfamiliar can be looked up as needed.
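A short Grunt session sketch tying these commands together; the /user/ljh path and file names are illustrative only:

grunt> ls /user/ljh
grunt> cd /user/ljh
grunt> cat part-r-00000
grunt> copyToLocal part-r-00000 /tmp/part-r-00000
grunt> copyFromLocal /tmp/words.txt words.txt
grunt> sh /usr/jdk/bin/jps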



Five, Pig's data model, briefly compared with the traditional data model:

Pig      Database
Bag      Table
Tuple    Row
Field    Column


Note: in Pig, the tuples inside a single bag can have different numbers of fields and fields of different types, which is unlike a traditional database, where every row of a table has the same columns.
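For instance, in Pig's own bag/tuple notation a single bag could look like the following, with tuples of different lengths and field types side by side (the values are made up):

{ (Tom, 19), (Anna, 22, Beijing), (3.5) }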

Six, Pig Latin (a SQL-like data flow language):


1, Common statements (a small sketch follows this list):


2, For how Pig Latin corresponds to various SQL statements, this blog post is very good: http://guoyunsky.iteye.com/blog/1317084


3, For more syntax, this blog post is very detailed: http://blackproof.iteye.com/blog/1791980
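The sketch referenced under item 1 above: a minimal Pig Latin script showing the most common statements; the input path, field names, and schema are illustrative assumptions:

-- load a comma-separated file from HDFS with an assumed schema
records = LOAD '/user/ljh/students.txt' USING PigStorage(',') AS (name:chararray, age:int, score:double);
adults  = FILTER records BY age >= 18;                          -- WHERE-style filtering
grouped = GROUP adults BY name;                                 -- GROUP BY
stats   = FOREACH grouped GENERATE group, AVG(adults.score);    -- projection / aggregation
DUMP stats;                                                     -- print the result to the console
STORE stats INTO '/user/ljh/stats_out' USING PigStorage(',');   -- write the result back to HDFS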

Pig Latin is a language, and a fairly simple one; in day-to-day use you can look things up as you go, so the various queries, filters, and other operations on your data are nothing to worry about.

Summary: Pig involves relatively little to learn; it is easy to install and easy to use. What takes more effort is becoming proficient in the Pig Latin scripting language, and its many syntax details can be looked up as you use them. Pig, as an upper-layer derivative framework of Hadoop, gives us one more way to operate Hadoop: we now have Java MapReduce and Pig's Pig Latin, and of course there is also Hive's HiveQL, which I will cover later. In a word: understand more, read more material, and practice more...



Copyright notice: This is an original article by the blogger; it may not be reproduced without the blogger's permission.
