Tag: Pig null
Comparison operators (==, !=, >, <, >=, <=), matches, and arithmetic operators (+, -, *, /, %, and the bincond ?:): if either operand is null, the result is null.
COUNT_STAR: does not filter out null values; nulls are counted.
Cast: casting a null from one data type to another yields null.
AVG, MIN, MAX, SUM, COUNT: these functions ignore null values.
CONCAT: if any subexpression is null, the result is null.
SIZE: if the object being measured is null, the result is null.
Tuple (.) or map (#) dereference: if the dereferenced tuple or map is null, the result is null.
FILTER: if the filter expression evaluates to null, the row does not pass the filter. (E.g. B = FILTER A BY x != 5; if x is null, x != 5 is also null, so the filter drops that row. Likewise, if X is null then !X is also null, so the filter rejects the row in both cases.)
Bincond operator (?:): if the Boolean test expression is null, the result is null.
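The FILTER and bincond rules above can be sketched in Pig Latin as follows. This is only a minimal illustration: the file name 'data.txt' and the field name x are hypothetical, and the OR example assumes Pig's three-valued logic, in which null OR true evaluates to true.

```pig
-- 'data.txt' and the field x are hypothetical, for illustration only.
A = LOAD 'data.txt' AS (x:int);

-- x != 5 evaluates to null when x is null, so those rows are dropped.
B = FILTER A BY x != 5;

-- To keep the null rows as well, test for null explicitly.
C = FILTER A BY x != 5 OR x IS NULL;

-- A comparison never matches null; use IS NULL / IS NOT NULL instead.
D = FILTER A BY x IS NULL;

-- Bincond: a null test expression yields null, so guard with IS NULL
-- when a default value is wanted.
E = FOREACH A GENERATE (x IS NULL ? 0 : x) AS x_or_zero;
```

Writing the null test explicitly keeps the intent visible, instead of relying on rows silently disappearing from the filtered relation.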
The following operations generate nulls:
1. Division by 0.
2. User-defined functions (UDFs) that return null.
3. Referencing a field that does not exist.
4. Referencing a key that does not exist in a map.
5. Referencing a field that does not exist in a tuple.
6. Loading data that does not exist: missing values and empty strings are loaded as null. (null can also be used as a constant.)
7. A data type mismatch during load.
GROUP/COGROUP/JOIN: When GROUP is applied to a single relation, rows with a null key are grouped together and treated as one null group. When COGROUP is applied to multiple relations, null keys from different relations are considered distinct, so each relation's null-keyed rows are grouped separately.
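A minimal sketch of the grouping rule, reusing the file names and schemas from the join example in this document (the alias names G and CG are chosen here for illustration):

```pig
A = LOAD './t1.txt' AS (a1:int, a2:int);
B = LOAD './t2.txt' AS (b1:int, b2:int);

-- Single relation: all rows of A whose a1 is null land in one group
-- whose group key is null.
G = GROUP A BY a1;
DUMP G;

-- Multiple relations: the null-keyed rows of A and the null-keyed rows
-- of B are not combined; they appear as separate null-key groups.
CG = COGROUP A BY a1, B BY b1;
DUMP CG;
```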
Join:
Data (an empty field denotes null):
A: (1,5) (,4) (3,6)
B: (1,7) (2,8) (,10)
Join [inner]:
In an inner join, a null key never matches another null key, so rows with a null key are dropped. Filtering out rows with a null key before the join helps improve join speed.
A = LOAD './t1.txt' AS (a1:int, a2:int);
B = LOAD './t2.txt' AS (b1:int, b2:int);
C = JOIN A BY a1, B BY b1;
DUMP C;
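The pre-filtering suggested above might look like the following sketch. The alias names A_nn and B_nn are made up here; the result of the inner join is unchanged, because null keys would not have matched anyway.

```pig
A = LOAD './t1.txt' AS (a1:int, a2:int);
B = LOAD './t2.txt' AS (b1:int, b2:int);

-- Drop rows whose join key is null before the join, so they are not
-- carried through the join only to be discarded.
A_nn = FILTER A BY a1 IS NOT NULL;
B_nn = FILTER B BY b1 IS NOT NULL;

C = JOIN A_nn BY a1, B_nn BY b1;
DUMP C;
```

This applies to inner joins only. In an outer join, null-keyed rows from the preserved side still appear in the output, so filtering them first would change the result.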
Join [outer]:
D = JOIN A BY a1 LEFT, B BY b1;
DUMP D;
(1,5,1,7)
(3,6,,)
(,4,,)
D = JOIN A BY a1 RIGHT, B BY b1;
DUMP D;
(1,5,1,7)
(,,2,8)
(,,,10)
D = JOIN A BY a1 FULL, B BY b1;
DUMP D;
(1,5,1,7)
(,,2,8)
(3,6,,)
(,4,,)
(,,,10)