Install Pig and test in local mode.

Source: Internet
Author: User

Pig is an open-source Apache project that simplifies MapReduce development. I have been studying it for a while and have picked up a little experience. Without further ado, let's go straight to a hands-on test.

Pig runs in two modes: local (single-machine) mode and cluster mode. For now I only want to test how Pig runs and learn its syntax, so I do not need the distributed mode; working in distributed mode is similar.

My environment:

1. System: Ubuntu 12.04 64-bit

2. JDK: Oracle JDK 1.7.0_15

3. Pig: 0.9.2

Like other Apache projects, Pig is simple to install: extract the package to any directory on the system and set the environment variables.

export PIG_HOME=path
export PATH=$PATH:$PIG_HOME/bin
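
As a slightly fuller sketch of the same setup, the snippet below sets the two variables and then checks whether the pig launcher can be found. The install location /opt/pig-0.9.2 is only an assumed example, not a path from this article; substitute the directory where you extracted Pig.

```shell
# Hypothetical install location; replace with the directory where Pig was extracted.
export PIG_HOME="${PIG_HOME:-/opt/pig-0.9.2}"

# Append Pig's bin directory to PATH only if it is not already present.
case ":$PATH:" in
  *":$PIG_HOME/bin:"*) ;;
  *) export PATH="$PATH:$PIG_HOME/bin" ;;
esac

# Report whether the pig launcher is now reachable.
if command -v pig >/dev/null 2>&1; then
  pig -version
else
  echo "pig not found on PATH; check PIG_HOME=$PIG_HOME"
fi
```

Guarding the PATH update this way keeps the entry from being added twice if the profile is sourced more than once.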

After setting the environment variables, either log out and log back in, or open a terminal and run source /etc/profile so that the newly added variables take effect. Finally, run pig -version in the terminal. Normally, output like the following appears:

Warning: $HADOOP_HOME is deprecated.

Apache Pig version 0.9.2 (r1232772)
Compiled Jan 18 2012, 07:57:19

At this point, Pig is installed successfully. (If the check fails, verify that your JDK installation and environment variables are correct.) You can now enter:

pig -x local

to start Pig's interactive shell (the Grunt shell) in local mode.

The usual introductory book for learning Hadoop is the Chinese edition of O'Reilly's Hadoop: The Definitive Guide. The classic first program for testing MapReduce counts how many times each word appears in a text file. Pig is designed to simplify MapReduce development, so it can certainly do this too. I will use this example as my test.


I have prepared a file named nie.txt, an ordinary English article about 52 KB in size.
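
One possible way to express the word count in Pig Latin is sketched below. It is an illustrative sketch, not necessarily the script the author goes on to use: the file name wordcount.pig and the relation names (lines, words, grouped, counts) are assumptions, and it assumes nie.txt is in the current directory. The shell wrapper only writes the script and runs it if pig is available.

```shell
# Write a minimal Pig Latin word-count script; wordcount.pig is a hypothetical name.
cat > wordcount.pig <<'EOF'
-- Load each line of the input file as a single chararray field.
lines = LOAD 'nie.txt' USING TextLoader() AS (line:chararray);
-- Split every line into words, one word per record.
words = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
-- Group identical words together and count each group.
grouped = GROUP words BY word;
counts = FOREACH grouped GENERATE group AS word, COUNT(words) AS total;
-- Print the (word, total) pairs to the terminal.
DUMP counts;
EOF

# Run the script in local mode only if Pig and the input file are present.
if command -v pig >/dev/null 2>&1 && [ -f nie.txt ]; then
  pig -x local wordcount.pig
else
  echo "wrote wordcount.pig; run it with: pig -x local wordcount.pig"
fi
```

The same statements can also be typed one at a time at the prompt opened by pig -x local, which is convenient while learning the syntax.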
