Ix. Degradation Dimensions
This section discusses a technique called a degenerate dimension. This technology reduces the number of dimensions and simplifies the dimension Data Warehouse model. Simple patterns are easier to understand than complex and have better query performance. This dimension can be degraded when there is no data required for the data warehouse in a dimension, and the dependent data for the degraded dimension needs to be migrated to the fact table, and then the degraded dimension is deleted.
1. Degraded order Dimension
This section describes how to degrade order dimensions, including modifications to the Data Warehouse schema and periodic mount scripts. When using dimension degradation techniques, you first need to identify the data and analyze the data columns that you never use. For example, the Order_number column of an order dimension might be such a column. But if the user wants to see the details of the transaction, the order number is also required. Therefore, the order number is migrated to the Sales_order_fact table before the order dimension is degraded. The following figure shows the migrated pattern.
Perform the following four-step degraded order_dim dimension table sequentially:
(1) Add order_number column to Sales_order_fact table
(2) Transfer the order number of the Order_dim table to the Sales_order_fact
(3) Delete the Order_sk column in the Sales_order_fact table
(4) Delete Order_dim table
The following script completes the steps required for all degraded order dimensions.
Use DW;
ALTER TABLE sales_order_fact Rename to Sales_order_fact_old;
CREATE TABLE Sales_order_fact (
order_number int COMMENT ' order number ',
Customer_sk int COMMENT ' Customer surrogate key ',
Product_sk int COMMENT ' product surrogate key ',
Order_date_sk int COMMENT ' Order Date surrogate key ',
Allocate_date_sk int COMMENT ' Allocate date surrogate key ',
allocate_quantity int COMMENT ' Allocate quantity ',
Packing_date_sk int COMMENT ' Packing date surrogate key ',
packing_quantity int COMMENT ' Packing quantity ',
Ship_date_sk int COMMENT ' Ship date surrogate key ',
ship_quantity int COMMENT ' ship quantity ',
Receive_date_sk int COMMENT ' Receive date surrogate key ',
receive_quantity int COMMENT ' Receive Quantity ',
Request_delivery_date_sk int COMMENT ' request delivery date surrogate key ',
Order_amount decimal (10,2) COMMENT ' Order Amount ',
order_quantity int COMMENT ' order quantity ')
Clustered by (Order_number) into 8 buckets
Stored as Orc tblproperties (' transactional ' = ' true ');
INSERT INTO Table Sales_order_fact
Select T2.order_number,
T1.customer_sk,
T1.product_sk,
T1.order_date_sk,
T1.allocate_date_sk,
T1.allocate_quantity,
T1.packing_date_sk,
T1.packing_quantity,
T1.ship_date_sk,
T1.ship_quantity,
T1.receive_date_sk,
T1.receive_quantity,
T1.request_delivery_date_sk,
T1.order_amount,
T1.order_quantity
From Sales_order_fact_old T1
Inner join Order_dim t2 on t1.order_sk = T2.order_sk;
drop table sales_order_fact_old;
drop table order_dim;2. Modify a recurring Mount script
Another thing to do after a dimension is degraded is to modify the regular Mount script. The modified script needs to add the order number to the Sales Order fact table, and no longer need to import the order dimension. The contents of the modified Regular_etl.sql script file are shown below.
--Set variables to support transactions
Set hive.support.concurrency=true;
Set hive.exec.dynamic.partition.mode=nonstrict;
Set Hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.dbtxnmanager;
Set hive.compactor.initiator.on=true;
Set Hive.compactor.worker.threads=1;
Use DW;
--Set the effective time and expiration time of the SCD
SET hivevar:cur_date = Current_date ();
SET hivevar:pre_date = Date_add (${hivevar:cur_date},-1);
SET hivevar:max_date = CAST (' 2200-01-01 ' as date);
--Set the CDC time limit
INSERT OVERWRITE TABLE rds.cdc_time SELECT last_load, ${hivevar:cur_date} from Rds.cdc_time;
--Load the Customer dimension
--Sets the expiration of SCD2 on deleted records and address-related columns, handling null values with the <=> operator.
UPDATE Customer_dim
SET expiry_date = ${hivevar:pre_date}
WHERE Customer_dim.customer_sk in
(SELECT A.customer_sk
From (SELECT Customer_sk,
Customer_number,
Customer_street_address,
Customer_zip_code,
Customer_city,
Customer_state,
Shipping_address,
Shipping_zip_code,
Shipping_city,
Shipping_state
From Customer_dim WHERE expiry_date = ${hivevar:max_date}) a left JOIN
Rds.customer b on a.customer_number = B.customer_number
WHERE B.customer_number is NULL OR
( ! (A.customer_street_address <=> b.customer_street_address)
OR! (A.customer_zip_code <=> B.customer_zip_code)
OR! (A.customer_city <=> b.customer_city)
OR! (A.customer_state <=> b.customer_state)
OR! (A.shipping_address <=> b.shipping_address)
OR! (A.shipping_zip_code <=> B.shipping_zip_code)
OR! (A.shipping_city <=> b.shipping_city)
OR! (A.shipping_state <=> b.shipping_state)
));
--New lines to handle SCD2 on the customer_street_addresses column
INSERT into Customer_dim
SELECT
Row_number () over (order by T1.customer_number) + T2.sk_max,
T1.customer_number,
T1.customer_name,
T1.customer_street_address,
T1.customer_zip_code,
T1.customer_city,
T1.customer_state,
T1.shipping_address,
T1.shipping_zip_code,
T1.shipping_city,
T1.shipping_state,
T1.version,
T1.effective_date,
T1.expiry_date
From
(
SELECT
T2.customer_number Customer_number,
T2.customer_name Customer_name,
T2.customer_street_address customer_street_address,
T2.customer_zip_code Customer_zip_code,
T2.customer_city Customer_city,
T2.customer_state Customer_state,
T2.shipping_address shipping_address,
T2.shipping_zip_code Shipping_zip_code,
T2.shipping_city Shipping_city,
T2.shipping_state Shipping_state,
T1.version + 1 version,
${hivevar:pre_date} effective_date,
${hivevar:max_date} expiry_date
From Customer_dim T1
INNER JOIN Rds.customer T2
On t1.customer_number = T2.customer_number
and t1.expiry_date = ${hivevar:pre_date}
Left JOIN Customer_dim T3
On t1.customer_number = T3.customer_number
and t3.expiry_date = ${hivevar:max_date}
WHERE (!) ( T1.customer_street_address <=> t2.customer_street_address)
OR! (T1.customer_zip_code <=> T2.customer_zip_code)
OR! (T1.customer_city <=> t2.customer_city)
OR! (T1.customer_state <=> t2.customer_state)
OR! (T1.shipping_address <=> t2.shipping_address)
OR! (T1.shipping_zip_code <=> T2.shipping_zip_code)
OR! (T1.shipping_city <=> t2.shipping_city)
OR! (T1.shipping_state <=> t2.shipping_state)
)
and T3.customer_sk is NULL) t1
CROSS JOIN
(SELECT Coalesce (MAX (Customer_sk), 0) Sk_max from Customer_dim) T2;
--handling the SCD1 on the customer_name column
-because the SET clause of Hive's update does not support subqueries, a temporary table is used to store records that need to be updated, with the first delete and insert instead of update
-because SCD1 itself does not preserve historical data, it updates all customer_name changed records in the dimension table, instead of just updating the current version of the record
DROP TABLE IF EXISTS tmp;
CREATE TABLE tmp as
SELECT
A.customer_sk,
A.customer_number,
B.customer_name,
A.customer_street_address,
A.customer_zip_code,
A.customer_city,
A.customer_state,
A.shipping_address,
A.shipping_zip_code,
A.shipping_city,
A.shipping_state,
A.version,
A.effective_date,
A.expiry_date
From Customer_dim A, Rds.customer b
WHERE A.customer_number = B.customer_number and! (A.customer_name <=> b.customer_name);
DELETE from Customer_dim WHERE Customer_dim.customer_sk in (SELECT Customer_sk from TMP);
INSERT into Customer_dim SELECT * from TMP;
--processing of new customer records
INSERT into Customer_dim
SELECT
Row_number () over (order by T1.customer_number) + T2.sk_max,
T1.customer_number,
T1.customer_name,
T1.customer_street_address,
T1.customer_zip_code,
T1.customer_city,
T1.customer_state,
T1.shipping_address,
T1.shipping_zip_code,
T1.shipping_city,
T1.shipping_state,
1,
${hivevar:pre_date},
${hivevar:max_date}
From
(
SELECT t1.* from Rds.customer T1 left JOIN customer_dim t2 on t1.customer_number = T2.customer_number
WHERE T2.customer_sk is NULL) t1
CROSS JOIN
(SELECT Coalesce (MAX (Customer_sk), 0) Sk_max from Customer_dim) T2;
--Overload the PA customer dimension
TRUNCATE TABLE Pa_customer_dim;
INSERT into Pa_customer_dim
SELECT
Customer_sk
, Customer_number
, customer_name
, customer_street_address
, Customer_zip_code
, customer_city
, Customer_state
, shipping_address
, Shipping_zip_code
, shipping_city
, Shipping_state
, version
, Effective_date
, Expiry_date
From Customer_dim
WHERE customer_state = ' PA ';
--Load Product dimension
--set expiration of deleted records and SCD2 on Product_Name, product_category columns
UPDATE Product_dim
SET expiry_date = ${hivevar:pre_date}
WHERE Product_dim.product_sk in
(SELECT A.product_sk
From (SELECT product_sk,product_code,product_name,product_category
From Product_dim WHERE expiry_date = ${hivevar:max_date}) a left JOIN
Rds.product b on a.product_code = B.product_code
WHERE B.product_code is NULL OR (a.product_name <> b.product_name OR a.product_category <> b.product_category ));
--New lines to handle SCD2 on Product_Name, product_category columns
INSERT into Product_dim
SELECT
Row_number () over (order by T1.product_code) + T2.sk_max,
T1.product_code,
T1.product_name,
T1.product_category,
T1.version,
T1.effective_date,
T1.expiry_date
From
(
SELECT
T2.product_code Product_code,
T2.product_name Product_Name,
T2.product_category product_category,
T1.version + 1 version,
${hivevar:pre_date} effective_date,
${hivevar:max_date} expiry_date
From Product_dim T1
INNER JOIN rds.product T2
On t1.product_code = T2.product_code
and t1.expiry_date = ${hivevar:pre_date}
Left JOIN Product_dim T3
On t1.product_code = T3.product_code
and t3.expiry_date = ${hivevar:max_date}
WHERE (t1.product_name <> t2.product_name OR t1.product_category <> t2.product_category) and T3.product_sk is NULL) T1
CROSS JOIN
(SELECT Coalesce (MAX (Product_sk), 0) Sk_max from Product_dim) T2;
--Process the new product record
INSERT into Product_dim
SELECT
Row_number () over (order by T1.product_code) + T2.sk_max,
T1.product_code,
T1.product_name,
T1.product_category,
1,
${hivevar:pre_date},
${hivevar:max_date}
From
(
SELECT t1.* from rds.product T1 left JOIN product_dim t2 on t1.product_code = T2.product_code
WHERE T2.product_sk is NULL) t1
CROSS JOIN
(SELECT Coalesce (MAX (Product_sk), 0) Sk_max from Product_dim) T2;
--Load Sales order fact table
--New sales orders from the previous day
INSERT into Sales_order_fact
SELECT
A.order_number,
Customer_sk,
Product_sk,
E.order_date_sk,
Null
Null
Null
Null
Null
Null
Null
Null
F.request_delivery_date_sk,
Order_amount,
Quantity
From
Rds.sales_order A,
Customer_dim C,
Product_dim D,
Order_date_dim E,
Request_delivery_date_dim F,
Rds.cdc_time g
WHERE
A.order_status = ' N '
and A.customer_number = C.customer_number
and A.status_date >= c.effective_date
and A.status_date < C.expiry_date
and A.product_code = D.product_code
and A.status_date >= d.effective_date
and A.status_date < D.expiry_date
and To_date (a.status_date) = E.order_date
and To_date (a.request_delivery_date) = F.request_delivery_date
and A.entry_date >= G.last_load and A.entry_date < g.current_load;
--handling four status of distribution warehouse, packing, distribution and receiving
DROP TABLE IF EXISTS tmp;
CREATE TABLE tmp as
Select T0.order_number Order_number,
T0.customer_sk Customer_sk,
T0.product_sk Product_sk,
T0.order_date_sk Order_date_sk,
T2.allocate_date_sk Allocate_date_sk,
T1.quantity Allocate_quantity,
T0.packing_date_sk Packing_date_sk,
T0.packing_quantity Packing_quantity,
T0.ship_date_sk Ship_date_sk,
T0.ship_quantity Ship_quantity,
T0.receive_date_sk Receive_date_sk,
T0.receive_quantity Receive_quantity,
T0.request_delivery_date_sk Request_delivery_date_sk,
T0.order_amount Order_amount,
T0.order_quantity order_quantity
From Sales_order_fact t0,
Rds.sales_order T1,
Allocate_date_dim T2,
rds.cdc_time T4
where T0.order_number = t1.order_number and t1.order_status = ' A '
and To_date (t1.status_date) = T2.allocate_date
and T1.entry_date >= T4.last_load and T1.entry_date < t4.current_load;
DELETE from Sales_order_fact WHERE sales_order_fact.order_number in (SELECT order_number from TMP);
INSERT into Sales_order_fact SELECT * from TMP;
DROP TABLE IF EXISTS tmp;
CREATE TABLE tmp as
Select T0.order_number Order_number,
T0.customer_sk Customer_sk,
T0.product_sk Product_sk,
T0.order_date_sk Order_date_sk,
T0.allocate_date_sk Allocate_date_sk,
T0.allocate_quantity Allocate_quantity,
T2.packing_date_sk Packing_date_sk,
T1.quantity Packing_quantity,
T0.ship_date_sk Ship_date_sk,
T0.ship_quantity Ship_quantity,
T0.receive_date_sk Receive_date_sk,
T0.receive_quantity Receive_quantity,
T0.request_delivery_date_sk Request_delivery_date_sk,
T0.order_amount Order_amount,
T0.order_quantity order_quantity
From Sales_order_fact t0,
Rds.sales_order T1,
Packing_date_dim T2,
rds.cdc_time T4
where T0.order_number = t1.order_number and t1.order_status = ' P '
and To_date (t1.status_date) = T2.packing_date
and T1.entry_date >= T4.last_load and T1.entry_date < t4.current_load;
DELETE from Sales_order_fact WHERE sales_order_fact.order_number in (SELECT order_number from TMP);
INSERT into Sales_order_fact SELECT * from TMP;
DROP TABLE IF EXISTS tmp;
CREATE TABLE tmp as
Select T0.order_number Order_number,
T0.customer_sk Customer_sk,
T0.product_sk Product_sk,
T0.order_date_sk Order_date_sk,
T0.allocate_date_sk Allocate_date_sk,
T0.allocate_quantity Allocate_quantity,
T0.packing_date_sk Packing_date_sk,
T0.packing_quantity Packing_quantity,
T2.ship_date_sk Ship_date_sk,
T1.quantity Ship_quantity,
T0.receive_date_sk Receive_date_sk,
T0.receive_quantity Receive_quantity,
T0.request_delivery_date_sk Request_delivery_date_sk,
T0.order_amount Order_amount,
T0.order_quantity order_quantity
From Sales_order_fact t0,
Rds.sales_order T1,
Ship_date_dim T2,
rds.cdc_time T4
where T0.order_number = t1.order_number and t1.order_status = ' S '
and To_date (t1.status_date) = T2.ship_date
and T1.entry_date >= T4.last_load and T1.entry_date < t4.current_load;
DELETE from Sales_order_fact WHERE sales_order_fact.order_number in (SELECT order_number from TMP);
INSERT into Sales_order_fact SELECT * from TMP;
DROP TABLE IF EXISTS tmp;
CREATE TABLE tmp as
Select T0.order_number Order_number,
T0.customer_sk Customer_sk,
T0.product_sk Product_sk,
T0.order_date_sk Order_date_sk,
T0.allocate_date_sk Allocate_date_sk,
T0.allocate_quantity Allocate_quantity,
T0.packing_date_sk Packing_date_sk,
T0.packing_quantity Packing_quantity,
T0.ship_date_sk Ship_date_sk,
T0.ship_quantity Ship_quantity,
T2.receive_date_sk Receive_date_sk,
T1.quantity Receive_quantity,
T0.request_delivery_date_sk Request_delivery_date_sk,
T0.order_amount Order_amount,
T0.order_quantity order_quantity
From Sales_order_fact t0,
Rds.sales_order T1,
Receive_date_dim T2,
rds.cdc_time T4
where T0.order_number = t1.order_number and t1.order_status = ' R '
and To_date (t1.status_date) = T2.receive_date
and T1.entry_date >= T4.last_load and T1.entry_date < t4.current_load;
DELETE from Sales_order_fact WHERE sales_order_fact.order_number in (SELECT order_number from TMP);
INSERT into Sales_order_fact SELECT * from TMP;
--Update the Last_load field of the timestamp table
INSERT OVERWRITE TABLE rds.cdc_time SELECT current_load, current_load from Rds.cdc_time;3. Test the modified periodic mount
(1) Prepare test data
The test uses two new orders with a distribution warehouse, packaging, distribution, and receipt milestones. So each order needs to add five lines. The following script adds 10 rows to the Sales_order table in the source database.
Use source;
DROP TABLE IF EXISTS temp_sales_order_data;
CREATE TABLE Temp_sales_order_data as SELECT * from Sales_order WHERE 1=0;
SET @start_date: = Unix_timestamp (' 2016-07-25 ');
SET @end_date: = Unix_timestamp (' 2016-07-26 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (1, 131, 1, 1, @order_date, ' N ', ' 2016-08-01 ', @order_date, @amount, @quantity);
SET @start_date: = Unix_timestamp (' 2016-07-25 ');
SET @end_date: = Unix_timestamp (' 2016-07-26 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (2, 132, 2, 2, @order_date, ' N ', ' 2016-08-01 ', @order_date, @amount, @quantity);
SET @start_date: = Unix_timestamp (' 2016-07-26 ');
SET @end_date: = Unix_timestamp (' 2016-07-27 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (3, 131, 1, 1, @order_date, ' A ', ' 2016-08-01 ', @order_date, @amount, @quantity);
SET @start_date: = Unix_timestamp (' 2016-07-26 ');
SET @end_date: = Unix_timestamp (' 2016-07-27 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (4, 132, 2, 2, @order_date, ' A ', ' 2016-08-01 ', @order_date, @amount, @quantity);
SET @start_date: = Unix_timestamp (' 2016-07-27 ');
SET @end_date: = Unix_timestamp (' 2016-07-28 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (5, 131, 1, 1, @order_date, ' P ', ' 2016-08-01 ', @order_date, @amount, @quantity);
SET @start_date: = Unix_timestamp (' 2016-07-27 ');
SET @end_date: = Unix_timestamp (' 2016-07-28 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (6, 132, 2, 2, @order_date, ' P ', ' 2016-08-01 ', @order_date, @amount, @quantity);
SET @start_date: = Unix_timestamp (' 2016-07-28 ');
SET @end_date: = Unix_timestamp (' 2016-07-29 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (7, 131, 1, 1, @order_date, ' S ', ' 2016-08-01 ', @order_date, @amount, @quantity);
SET @start_date: = Unix_timestamp (' 2016-07-28 ');
SET @end_date: = Unix_timestamp (' 2016-07-29 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (8, 132, 2, 2, @order_date, ' S ', ' 2016-08-01 ', @order_date, @amount, @quantity);
SET @start_date: = Unix_timestamp (' 2016-07-29 ');
SET @end_date: = Unix_timestamp (' 2016-07-30 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (9, 131, 1, 1, @order_date, ' R ', ' 2016-08-01 ', @order_date, @amount, @quantity);
SET @start_date: = Unix_timestamp (' 2016-07-29 ');
SET @end_date: = Unix_timestamp (' 2016-07-30 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (132, 2, 2, @order_date, ' R ', ' 2016-08-01 ', @order_date, @amount, @quantity);
INSERT into Sales_order
Select NULL,
Order_number,
Customer_number,
Product_code,
Status_date,
Order_status,
Request_delivery_date,
Entry_date,
Order_amount,
Quantity
From Temp_sales_order_data T1
Order BY T1.status_date;
COMMIT; (2) perform five periodic loads
Use RDS;
Insert OVERWRITE TABLE rds.cdc_time SELECT ' 2016-07-25 ', ' 2016-07-26 ' from Rds.cdc_time; set regular_etl.sql in File Hivevar : Cur_date = Current_date (); row changed to set hivevar:cur_date = ' 2016-07-26 ';
./regular_etl.sh
Use RDS;
Insert OVERWRITE TABLE rds.cdc_time SELECT ' 2016-07-26 ', ' 2016-07-27 ' from Rds.cdc_time; set regular_etl.sql in File Hivevar : Cur_date = Current_date (); row changed to set hivevar:cur_date = ' 2016-07-27 ';
./regular_etl.sh
Use RDS;
Insert OVERWRITE TABLE rds.cdc_time SELECT ' 2016-07-27 ', ' 2016-07-28 ' from Rds.cdc_time; set regular_etl.sql in File Hivevar : Cur_date = Current_date (); row changed to set hivevar:cur_date = ' 2016-07-28 ';
./regular_etl.sh
Use RDS;
Insert OVERWRITE TABLE rds.cdc_time SELECT ' 2016-07-28 ', ' 2016-07-29 ' from Rds.cdc_time; set regular_etl.sql in File Hivevar : Cur_date = Current_date (); row changed to set hivevar:cur_date = ' 2016-07-29 ';
./regular_etl.sh
Use RDS;
Insert OVERWRITE TABLE rds.cdc_time SELECT ' 2016-07-29 ', ' 2016-07-30 ' from Rds.cdc_time; set regular_etl.sql in File Hivevar : Cur_date = Current_date (); row changed to set hivevar:cur_date = ' 2016-07-30 ';
./regular_etl.sh
Check the Sales_order_fact table for two orders.
Use DW;
Select T1.order_number Orn,
T2.order_date OD,
T1.order_quantity OQ,
T3.allocate_date AD,
T1.allocate_quantity AQ,
T4.packing_date PD,
T1.packing_quantity PQ,
T5.ship_date SD,
t1.ship_quantity Sq,
T6.receive_date Rd,
T1.receive_quantity RQ
From Sales_order_fact T1
Inner join Order_date_dim t2 on t1.order_date_sk = T2.order_date_sk
Left join Allocate_date_dim t3 on t1.allocate_date_sk = T3.allocate_date_sk
Left join Packing_date_dim t4 on t1.packing_date_sk = T4.packing_date_sk
Left join Ship_date_dim T5 on t1.ship_date_sk = T5.ship_date_sk
Left join Receive_date_dim T6 on t1.receive_date_sk = T6.receive_date_sk
where T1.order_number in (131, 132); The query results are shown in the following figure.
After testing the set hivevar:cur_date = Current_date () in the Regular_etl.sql file, the line recovery.