NULL in Pig

Source: Internet
Author: User

Tag: Pig null

Comparison operators (==, !=, >, <, >=, <=), matches, and arithmetic operators (+, -, *, /, %, and the bincond ? :): if either operand is null, the result is null.
COUNT_STAR: counts all values, including nulls; it does not filter them out.
Cast: casting a null from one data type to another yields null.
AVG, MIN, MAX, SUM, COUNT: these functions ignore nulls.
CONCAT: if any subexpression is null, the result is null.
SIZE: if the object being measured is null, the result is null.
Tuple (.) or map (#) dereference: if the dereferenced tuple or map is null, the result is null.
FILTER: if the filter expression evaluates to null, the row is rejected rather than passed through. For example, given B = FILTER A BY x != 5; if x is null, then x != 5 is null (and NOT applied to a null is also null), so the filter drops that row.
Ternary operator (? :): if the Boolean expression evaluates to null, the result is null.
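The rules above can be seen in a short Pig Latin sketch. The file name data.txt and the field names id and x are assumptions made for illustration; the input is assumed to be tab-separated, with some x values missing.

```pig
-- Hypothetical input: data.txt with two tab-separated int columns; x is sometimes empty.
A = LOAD 'data.txt' AS (id:int, x:int);

-- Rows where x is null are dropped: x != 5 evaluates to null, not true.
B = FILTER A BY x != 5;

-- To keep the null rows as well, test for null explicitly.
C = FILTER A BY (x IS NULL) OR (x != 5);

-- COUNT skips null values of x, COUNT_STAR counts every row.
G = GROUP A ALL;
N = FOREACH G GENERATE COUNT(A.x) AS non_null, COUNT_STAR(A) AS total;
DUMP N;
```

If the two counts in N differ, the difference is the number of rows whose x is null.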

The following operations generate nulls:
1. Division by zero.
2. User-defined functions (UDFs), which may return null.
3. Referencing a field that does not exist.
4. Referencing a key that does not exist in a map.
5. Referencing a field that does not exist in a tuple.
6. Loading data that is missing: the load functions produce null wherever data is absent, and empty strings are not loaded but replaced with null. null can also be used as a constant.
7. Loading data whose type does not match the declared type.
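A minimal sketch of two of these cases, division by zero and null used as a constant. The file name nums.txt and the field names a, b, and ratio are hypothetical.

```pig
-- Hypothetical input: nums.txt with two tab-separated int columns.
A = LOAD 'nums.txt' AS (a:int, b:int);

-- ratio is null wherever b is 0, and wherever a or b is itself null.
B = FOREACH A GENERATE a, b, a / b AS ratio;

-- null used as a constant: every row gets a null third field.
C = FOREACH A GENERATE a, b, null;

-- Keep only the rows where the division produced a value.
D = FILTER B BY ratio IS NOT NULL;
DUMP D;
```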

GROUP/COGROUP/JOIN: When GROUP is applied to a single relation, records whose group key is null are grouped together under one null key. When COGROUP is applied to multiple relations, null keys from different relations are considered different, so the null-keyed records of each relation are grouped separately.
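The difference between the two cases can be sketched as follows, assuming two tab-separated input files in which some key values are missing. The aliases G and CG are hypothetical names.

```pig
A = LOAD './t1.txt' AS (a1:int, a2:int);
B = LOAD './t2.txt' AS (b1:int, b2:int);

-- Single relation: all null-keyed rows of A end up in one group.
G = GROUP A BY a1;
DUMP G;

-- Two relations: the null keys of A and the null keys of B form separate groups,
-- so the output contains two groups with a null key, one per relation.
CG = COGROUP A BY a1, B BY b1;
DUMP CG;
```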
JOIN:
Data (a blank field is loaded as null):
A: (1, 5), (null, 4), (3, 6)
B: (1, 7), (2, 8), (null, 10)

Inner join: null keys never match each other, so rows with a null key are filtered out. Filtering out rows with null keys before the join helps improve join speed.

    A = LOAD './t1.txt' AS (a1:int, a2:int);
    B = LOAD './t2.txt' AS (b1:int, b2:int);
    C = JOIN A BY a1, B BY b1;
    DUMP C;

Outer join:

    D = JOIN A BY a1 LEFT, B BY b1;
    DUMP D;
    (1,5,1,7)
    (3,6,,)
    (,4,,)

    D = JOIN A BY a1 RIGHT, B BY b1;
    DUMP D;
    (1,5,1,7)
    (,,2,8)
    (,,,10)

    D = JOIN A BY a1 FULL, B BY b1;
    DUMP D;
    (1,5,1,7)
    (,,2,8)
    (3,6,,)
    (,4,,)
    (,,,10)
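The advice to filter out null keys before an inner join could look like the sketch below. The aliases A_nn and B_nn are hypothetical names for the filtered relations.

```pig
A = LOAD './t1.txt' AS (a1:int, a2:int);
B = LOAD './t2.txt' AS (b1:int, b2:int);

-- Null keys can never match in an inner join, so drop them up front.
A_nn = FILTER A BY a1 IS NOT NULL;
B_nn = FILTER B BY b1 IS NOT NULL;

C = JOIN A_nn BY a1, B_nn BY b1;
DUMP C;
```

The result is the same as joining A and B directly; the filters only reduce the amount of data passed to the join. This does not apply to outer joins, where the null-keyed rows are part of the output.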


