Pig installation, deployment, and testing in MapReduce mode

Source: Internet
Author: User

Pig installation and configuration

1. Download the Pig package (pig-0.9.1):

Apache version: http://pig.apache.org/

2. Decompress the archive:

# tar -zxvf pig-0.9.1.tar.gz

3. Configure /etc/profile

export PIG_INSTALL=/usr/pig-0.9.1
export PATH=$PATH:$PIG_INSTALL/bin
export PIG_HADOOP_VERSION=20   # supported Hadoop version; mine is hadoop-0.20.2
Run source /etc/profile to make the configuration take effect.
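After sourcing the profile, a quick check like the following can confirm that the variables took effect. This is only a sketch: the function name check_pig_env is hypothetical and not part of Pig, and it only inspects the shell environment.

```shell
# Hypothetical helper (not part of Pig): verify that PIG_INSTALL is set
# and that its bin directory is an entry of PATH.
check_pig_env() {
    if [ -z "${PIG_INSTALL:-}" ]; then
        echo "PIG_INSTALL is not set"
        return 1
    fi
    case ":$PATH:" in
        *":$PIG_INSTALL/bin:"*) ;;
        *) echo "PATH does not include $PIG_INSTALL/bin"; return 1 ;;
    esac
    echo "Pig environment looks OK: $PIG_INSTALL"
}
```

Calling check_pig_env in a new shell prints either the confirmation line or the first problem it finds.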

4. I will not say much about Pig's local mode here; this step covers the configuration needed for Hadoop (MapReduce) mode.

You can tell Pig where the NameNode and JobTracker are in either of the following ways:

Method 1: Add the following line to /etc/profile, so that Pig reads the cluster settings from Hadoop's own configuration directory:

export PIG_CLASSPATH=$HADOOP_INSTALL/conf/

Method 2: Add a pig.properties file to the conf folder under the Pig directory, containing:

fs.default.name=hdfs://hadoop149:9000/
mapred.job.tracker=hadoop149:9004
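Method 2 could be scripted roughly as follows. This is a sketch under assumptions: the function name write_pig_properties and the PIG_CONF_DIR variable are hypothetical, the default directory is derived from the install path used above, and the host hadoop149 and its ports come from this article's cluster, so replace them with your own.

```shell
# Assumed location of Pig's conf directory (derived from the install path above)
PIG_CONF_DIR=${PIG_CONF_DIR:-/usr/pig-0.9.1/conf}

# Hypothetical helper: append the two cluster settings to pig.properties
# in the given directory. It appends rather than overwrites, so any
# existing entries in the file are kept.
write_pig_properties() {
    cat >> "$1/pig.properties" <<'EOF'
fs.default.name=hdfs://hadoop149:9000/
mapred.job.tracker=hadoop149:9004
EOF
}
```

Run it as write_pig_properties "$PIG_CONF_DIR", then restart Pig so the new settings are read.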

5. Start Pig

[root@localhost conf]# pig

2011-12-06 17:57:48,357 [main] INFO org.apache.pig.Main - Logging error messages to: /usr/pig-0.9.1/conf/pig_1323165468355.log

2011-12-06 17:57:48,528 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://hadoop149:9000/

2011-12-06 17:57:48,634 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: hadoop149:9004

grunt>

When the messages above are followed by the grunt> prompt, Pig has started successfully and is connected to the cluster.

6. Test running a Pig job in MapReduce mode

Step 1: Upload the passwd file to HDFS. The commands below assume it is stored at /passwd.

Step 2: Run the following commands in order at the grunt prompt:
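The upload in Step 1 might look like the sketch below, which uses the hadoop fs -put command. The article does not say which local file was uploaded, so /etc/passwd is an assumption; the upload_passwd function and the HADOOP_CMD override are hypothetical conveniences, the latter only there to allow a dry run.

```shell
# Hypothetical override so the command can be dry-run (e.g. HADOOP_CMD=echo)
HADOOP_CMD=${HADOOP_CMD:-hadoop}

# Copy a local file (assumed default: /etc/passwd) to /passwd on HDFS.
# General form: hadoop fs -put <local source> <HDFS destination>
upload_passwd() {
    $HADOOP_CMD fs -put "${1:-/etc/passwd}" /passwd
}
```

After calling upload_passwd, hadoop fs -ls /passwd should list the uploaded file.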

A = load '/passwd' using PigStorage(':');

B = foreach A generate $0 as id;

dump B;

Once the job finishes, the result (the first colon-separated field of every line, that is, the user name) is printed directly to the screen.
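To cross-check what dump B should contain, the same projection can be reproduced locally without Hadoop. This sketch uses awk instead of Pig and assumes the uploaded file is a standard colon-delimited passwd file; the function name first_field_tuples is hypothetical. Pig prints each record as a tuple in parentheses, which the sketch imitates.

```shell
# Print the first colon-separated field of each line as a one-field tuple,
# mirroring: B = foreach A generate $0 as id;
first_field_tuples() {
    awk -F: '{ print "(" $1 ")" }' "$1"
}
```

For example, an input line root:x:0:0:root:/root:/bin/bash yields (root). The order of records in Pig's own output is not guaranteed to match the local file order.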
