Pig and Python

Source: Internet
Author: User

Pig 0.9 supports Python as an embedded scripting language. Scripts run on the Jython interpreter, which provides Python 2.5 features. The top-level entry point of this interface is org.apache.pig.scripting.Pig.
A Python script first compiles a Pig Latin script, then binds variables defined in Python to it, and finally executes it.

1) Pig.compile or Pig.compileFromFile pre-compiles the Pig Latin code and returns a Pig object.
2) The bind method binds variables in the control flow to parameters in the Pig Latin script and returns a BoundScript object.
3) Calling the runSingle method on the BoundScript object executes it and returns a PigStats object. If the Pig object was bound to a list of maps of parameters, call the run method instead; it executes the script once per map and returns a list of PigStats objects.
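
The three steps above can be sketched as follows. This is a minimal sketch, assuming the file is launched through Pig (for example `pig embed_sketch.py`) so that the org.apache.pig.scripting package is importable. The Pig Latin script, the file paths, the parameter names, and the helpers build_params, run_once and run_many are hypothetical and exist only for illustration.

```python
# Intended to be launched through Pig, which runs the file under Jython.
try:
    from org.apache.pig.scripting import Pig
except ImportError:
    Pig = None  # outside Pig/Jython the helpers below still import cleanly

# Hypothetical Pig Latin script; $input, $output and $threshold are
# parameters that bind() fills in from Python.
SCRIPT = """
records = LOAD '$input' AS (name:chararray, score:double);
good = FILTER records BY score >= $threshold;
STORE good INTO '$output';
"""


def build_params(threshold):
    """Return the Python dict handed to bind() for one run."""
    return {
        'input': 'scores.txt',            # hypothetical input path
        'output': 'out_%s' % threshold,   # hypothetical output path
        'threshold': str(threshold),
    }


def run_once(pig, threshold):
    """Compile, bind and run the script once; return True on success."""
    compiled = pig.compile(SCRIPT)                  # step 1: pre-compile
    bound = compiled.bind(build_params(threshold))  # step 2: bind a dict
    stats = bound.runSingle()                       # step 3: one PigStats
    return stats.isSuccessful()


def run_many(pig, thresholds):
    """Bind a list of dicts, then run() the script once per dict."""
    compiled = pig.compile(SCRIPT)
    bound = compiled.bind([build_params(t) for t in thresholds])
    return [stats.isSuccessful() for stats in bound.run()]


if __name__ == '__main__' and Pig is not None:
    if not run_once(Pig, 0.5):
        raise SystemExit('Pig job failed')
```

Passing the Pig class into the helpers as an argument is a choice made for this sketch only; it lets the control flow be exercised with a stand-in object when Pig is not installed.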

A separate instance of a user-written UDF is constructed and run in each map or reduce task, and constructor arguments are one way of passing information to the UDF.
Correspondence between Pig types and Python types

Pig type     Python type
int          number
long         number
float        number
double       number
chararray    string
bytearray    string
map          dictionary
tuple        tuple
bag          list of tuples
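
To make the type correspondence concrete, here is a small sketch of two Python UDFs: one receives a bag (a list of tuples) and the other a map (a dictionary). The function names and schemas are hypothetical. As far as I know, Pig's Jython engine makes the outputSchema decorator available to registered UDF files without an import; the fallback definition below is an assumption added only so the file also runs as ordinary Python.

```python
# Fallback decorator for running outside Pig (assumption for this sketch).
# Under Pig's Jython engine, outputSchema should already be defined.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema  # record the declared schema
            return func
        return wrap


@outputSchema('avg:double')
def average_score(bag):
    """bag arrives as a list of tuples, e.g. [('alice', 3.0), ('bob', 4.0)]."""
    if not bag:
        return None              # an empty or missing bag yields None
    total = 0.0
    for row in bag:
        total += row[1]          # second field of each tuple
    return total / len(bag)


@outputSchema('value:chararray')
def lookup(mapping, key):
    """A Pig map arrives as a dictionary; a missing key yields None."""
    return mapping.get(key)
```

Outside Pig the functions can be called directly with plain Python values, for example average_score([('alice', 3.0), ('bob', 4.0)]).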




Pig's load functions are built on Hadoop's InputFormat. The base class is LoadFunc, whose default implementation targets HDFS. Pig calls the prepareToRead method to give a load function a way to initialize itself before reading. Once a load function implements the getSchema method, the LOAD statement no longer needs to declare a schema.
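
The life cycle of a load function can be modelled in a few lines of Python. This is a conceptual toy, not the real LoadFunc API: real load functions are written in Java against Hadoop's InputFormat and RecordReader. The class LineLoader, its field names, and the helper read_all are hypothetical; only the method names prepareToRead, getNext and getSchema are borrowed from Pig.

```python
class LineLoader(object):
    """Toy stand-in for a load function that splits lines into tuples."""

    def __init__(self, delimiter='\t'):
        # Constructor arguments carry configuration into the instance.
        self.delimiter = delimiter
        self.reader = None

    def prepareToRead(self, reader):
        # Called once before reading; here the "reader" is any iterable
        # of text lines instead of a Hadoop RecordReader.
        self.reader = iter(reader)

    def getNext(self):
        # Return the next record as a tuple, or None when input is exhausted.
        for line in self.reader:
            return tuple(line.rstrip('\n').split(self.delimiter))
        return None

    def getSchema(self):
        # Hypothetical fixed schema, so a caller need not declare one.
        return ['name', 'score']


def read_all(loader, lines):
    """Drive the loader the way a task would: prepare, then pull tuples."""
    loader.prepareToRead(lines)
    rows = []
    row = loader.getNext()
    while row is not None:
        rows.append(row)
        row = loader.getNext()
    return rows
```

For instance, read_all(LineLoader(','), ['alice,3\n', 'bob,4\n']) yields one tuple of strings per input line.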

Similarly, store functions are built on Hadoop's OutputFormat, and the base class is StoreFunc. A store function accepts Pig tuples and writes them to the store as key-value pairs in the form the OutputFormat expects. Pig calls the store function's prepareToWrite method in each map or reduce task, and putNext is the core method of the store function.
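
The same idea can be modelled for the store side. Again this is a conceptual toy in Python, not the real StoreFunc API, which is Java and writes through a Hadoop RecordWriter. The class LineStorer is hypothetical, a plain list stands in for the writer, and each tuple is written as one delimited line instead of a key-value pair; only the method names prepareToWrite and putNext are borrowed from Pig.

```python
class LineStorer(object):
    """Toy stand-in for a store function that writes tuples as text lines."""

    def __init__(self, delimiter=','):
        self.delimiter = delimiter
        self.writer = None

    def prepareToWrite(self, writer):
        # Called once per task before any tuple is written; the "writer"
        # here is any object with an append() method, such as a list.
        self.writer = writer

    def putNext(self, tup):
        # Core method: serialize one tuple and hand it to the writer.
        fields = [str(field) for field in tup]
        self.writer.append(self.delimiter.join(fields))


# Usage: one storer instance per task, one putNext call per tuple.
output = []
storer = LineStorer('|')
storer.prepareToWrite(output)
for record in [('alice', 3), ('bob', 4)]:
    storer.putNext(record)
```

After the loop, output holds one formatted string per tuple.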
