The practice of data Warehouse based on Hadoop Ecological Circle Learning Notes

Source: Internet
Author: User
Tags rand

Ix. Degradation Dimensions
This section discusses a technique called a degenerate dimension. This technology reduces the number of dimensions and simplifies the dimension Data Warehouse model. Simple patterns are easier to understand than complex and have better query performance. This dimension can be degraded when there is no data required for the data warehouse in a dimension, and the dependent data for the degraded dimension needs to be migrated to the fact table, and then the degraded dimension is deleted.
1. Degraded order Dimension
This section describes how to degrade order dimensions, including modifications to the Data Warehouse schema and periodic mount scripts. When using dimension degradation techniques, you first need to identify the data and analyze the data columns that you never use. For example, the Order_number column of an order dimension might be such a column. But if the user wants to see the details of the transaction, the order number is also required. Therefore, the order number is migrated to the Sales_order_fact table before the order dimension is degraded. The following figure shows the migrated pattern.

Perform the following four-step degraded order_dim dimension table sequentially:
(1) Add order_number column to Sales_order_fact table
(2) Transfer the order number of the Order_dim table to the Sales_order_fact
(3) Delete the Order_sk column in the Sales_order_fact table
(4) Delete Order_dim table

The following script completes the steps required for all degraded order dimensions.

Use DW;


ALTER TABLE sales_order_fact Rename to Sales_order_fact_old;


CREATE TABLE Sales_order_fact (


order_number int COMMENT ' order number ',


Customer_sk int COMMENT ' Customer surrogate key ',


Product_sk int COMMENT ' product surrogate key ',


Order_date_sk int COMMENT ' Order Date surrogate key ',


Allocate_date_sk int COMMENT ' Allocate date surrogate key ',


allocate_quantity int COMMENT ' Allocate quantity ',


Packing_date_sk int COMMENT ' Packing date surrogate key ',


packing_quantity int COMMENT ' Packing quantity ',


Ship_date_sk int COMMENT ' Ship date surrogate key ',


ship_quantity int COMMENT ' ship quantity ',


Receive_date_sk int COMMENT ' Receive date surrogate key ',


receive_quantity int COMMENT ' Receive Quantity ',


Request_delivery_date_sk int COMMENT ' request delivery date surrogate key ',


Order_amount decimal (10,2) COMMENT ' Order Amount ',


order_quantity int COMMENT ' order quantity ')


Clustered by (Order_number) into 8 buckets


Stored as Orc tblproperties (' transactional ' = ' true ');

INSERT INTO Table Sales_order_fact


Select T2.order_number,


T1.customer_sk,


T1.product_sk,


T1.order_date_sk,


T1.allocate_date_sk,


T1.allocate_quantity,


T1.packing_date_sk,


T1.packing_quantity,


T1.ship_date_sk,


T1.ship_quantity,


T1.receive_date_sk,


T1.receive_quantity,


T1.request_delivery_date_sk,


T1.order_amount,


T1.order_quantity


From Sales_order_fact_old T1


Inner join Order_dim t2 on t1.order_sk = T2.order_sk;

drop table sales_order_fact_old;
drop table order_dim;2. Modify a recurring Mount script
Another thing to do after a dimension is degraded is to modify the regular Mount script. The modified script needs to add the order number to the Sales Order fact table, and no longer need to import the order dimension. The contents of the modified Regular_etl.sql script file are shown below.

--Set variables to support transactions


Set hive.support.concurrency=true;


Set hive.exec.dynamic.partition.mode=nonstrict;


Set Hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.dbtxnmanager;


Set hive.compactor.initiator.on=true;


Set Hive.compactor.worker.threads=1;





Use DW;





--Set the effective time and expiration time of the SCD


SET hivevar:cur_date = Current_date ();


SET hivevar:pre_date = Date_add (${hivevar:cur_date},-1);


SET hivevar:max_date = CAST (' 2200-01-01 ' as date);





--Set the CDC time limit


INSERT OVERWRITE TABLE rds.cdc_time SELECT last_load, ${hivevar:cur_date} from Rds.cdc_time;





--Load the Customer dimension


--Sets the expiration of SCD2 on deleted records and address-related columns, handling null values with the <=> operator.


UPDATE Customer_dim


SET expiry_date = ${hivevar:pre_date}


WHERE Customer_dim.customer_sk in


(SELECT A.customer_sk


From (SELECT Customer_sk,


Customer_number,


Customer_street_address,


Customer_zip_code,


Customer_city,


Customer_state,


Shipping_address,


Shipping_zip_code,


Shipping_city,


Shipping_state


From Customer_dim WHERE expiry_date = ${hivevar:max_date}) a left JOIN


Rds.customer b on a.customer_number = B.customer_number


WHERE B.customer_number is NULL OR


(  ! (A.customer_street_address <=> b.customer_street_address)


OR! (A.customer_zip_code <=> B.customer_zip_code)


OR! (A.customer_city <=> b.customer_city)


OR! (A.customer_state <=> b.customer_state)


OR! (A.shipping_address <=> b.shipping_address)


OR! (A.shipping_zip_code <=> B.shipping_zip_code)


OR! (A.shipping_city <=> b.shipping_city)


OR! (A.shipping_state <=> b.shipping_state)


));





--New lines to handle SCD2 on the customer_street_addresses column


INSERT into Customer_dim


SELECT


Row_number () over (order by T1.customer_number) + T2.sk_max,


T1.customer_number,


T1.customer_name,


T1.customer_street_address,


T1.customer_zip_code,


T1.customer_city,


T1.customer_state,


T1.shipping_address,


T1.shipping_zip_code,


T1.shipping_city,


T1.shipping_state,


T1.version,


T1.effective_date,


T1.expiry_date


From


(


SELECT


T2.customer_number Customer_number,


T2.customer_name Customer_name,


T2.customer_street_address customer_street_address,


T2.customer_zip_code Customer_zip_code,


T2.customer_city Customer_city,


T2.customer_state Customer_state,


T2.shipping_address shipping_address,


T2.shipping_zip_code Shipping_zip_code,


T2.shipping_city Shipping_city,


T2.shipping_state Shipping_state,


T1.version + 1 version,


${hivevar:pre_date} effective_date,


${hivevar:max_date} expiry_date


From Customer_dim T1


INNER JOIN Rds.customer T2


On t1.customer_number = T2.customer_number


and t1.expiry_date = ${hivevar:pre_date}


Left JOIN Customer_dim T3


On t1.customer_number = T3.customer_number


and t3.expiry_date = ${hivevar:max_date}


WHERE (!) ( T1.customer_street_address <=> t2.customer_street_address)


OR! (T1.customer_zip_code <=> T2.customer_zip_code)


OR! (T1.customer_city <=> t2.customer_city)


OR! (T1.customer_state <=> t2.customer_state)


OR! (T1.shipping_address <=> t2.shipping_address)


OR! (T1.shipping_zip_code <=> T2.shipping_zip_code)


OR! (T1.shipping_city <=> t2.shipping_city)


OR! (T1.shipping_state <=> t2.shipping_state)


)


and T3.customer_sk is NULL) t1


CROSS JOIN


(SELECT Coalesce (MAX (Customer_sk), 0) Sk_max from Customer_dim) T2;





--handling the SCD1 on the customer_name column


-because the SET clause of Hive's update does not support subqueries, a temporary table is used to store records that need to be updated, with the first delete and insert instead of update


-because SCD1 itself does not preserve historical data, it updates all customer_name changed records in the dimension table, instead of just updating the current version of the record


DROP TABLE IF EXISTS tmp;


CREATE TABLE tmp as


SELECT


A.customer_sk,


A.customer_number,


B.customer_name,


A.customer_street_address,


A.customer_zip_code,


A.customer_city,


A.customer_state,


A.shipping_address,


A.shipping_zip_code,


A.shipping_city,


A.shipping_state,


A.version,


A.effective_date,


A.expiry_date


From Customer_dim A, Rds.customer b


WHERE A.customer_number = B.customer_number and! (A.customer_name <=> b.customer_name);


DELETE from Customer_dim WHERE Customer_dim.customer_sk in (SELECT Customer_sk from TMP);


INSERT into Customer_dim SELECT * from TMP;





--processing of new customer records


INSERT into Customer_dim


SELECT


Row_number () over (order by T1.customer_number) + T2.sk_max,


T1.customer_number,


T1.customer_name,


T1.customer_street_address,


T1.customer_zip_code,


T1.customer_city,


T1.customer_state,


T1.shipping_address,


T1.shipping_zip_code,


T1.shipping_city,


T1.shipping_state,


1,


${hivevar:pre_date},


${hivevar:max_date}


From


(


SELECT t1.* from Rds.customer T1 left JOIN customer_dim t2 on t1.customer_number = T2.customer_number


WHERE T2.customer_sk is NULL) t1


CROSS JOIN


(SELECT Coalesce (MAX (Customer_sk), 0) Sk_max from Customer_dim) T2;





--Overload the PA customer dimension


TRUNCATE TABLE Pa_customer_dim;


INSERT into Pa_customer_dim


SELECT


Customer_sk


, Customer_number


, customer_name


, customer_street_address


, Customer_zip_code


, customer_city


, Customer_state


, shipping_address


, Shipping_zip_code


, shipping_city


, Shipping_state


, version


, Effective_date


, Expiry_date


From Customer_dim


WHERE customer_state = ' PA ';





--Load Product dimension


--set expiration of deleted records and SCD2 on Product_Name, product_category columns


UPDATE Product_dim


SET expiry_date = ${hivevar:pre_date}


WHERE Product_dim.product_sk in


(SELECT A.product_sk


From (SELECT product_sk,product_code,product_name,product_category


From Product_dim WHERE expiry_date = ${hivevar:max_date}) a left JOIN


Rds.product b on a.product_code = B.product_code


WHERE B.product_code is NULL OR (a.product_name <> b.product_name OR a.product_category <> b.product_category ));





--New lines to handle SCD2 on Product_Name, product_category columns


INSERT into Product_dim


SELECT


Row_number () over (order by T1.product_code) + T2.sk_max,


T1.product_code,


T1.product_name,


T1.product_category,


T1.version,


T1.effective_date,


T1.expiry_date


From


(


SELECT


T2.product_code Product_code,


T2.product_name Product_Name,


T2.product_category product_category,


T1.version + 1 version,


${hivevar:pre_date} effective_date,


${hivevar:max_date} expiry_date


From Product_dim T1


INNER JOIN rds.product T2


On t1.product_code = T2.product_code


and t1.expiry_date = ${hivevar:pre_date}


Left JOIN Product_dim T3


On t1.product_code = T3.product_code


and t3.expiry_date = ${hivevar:max_date}


WHERE (t1.product_name <> t2.product_name OR t1.product_category <> t2.product_category) and T3.product_sk is NULL) T1


CROSS JOIN


(SELECT Coalesce (MAX (Product_sk), 0) Sk_max from Product_dim) T2;





--Process the new product record


INSERT into Product_dim


SELECT


Row_number () over (order by T1.product_code) + T2.sk_max,


T1.product_code,


T1.product_name,


T1.product_category,


1,


${hivevar:pre_date},


${hivevar:max_date}


From


(


SELECT t1.* from rds.product T1 left JOIN product_dim t2 on t1.product_code = T2.product_code


WHERE T2.product_sk is NULL) t1


CROSS JOIN


(SELECT Coalesce (MAX (Product_sk), 0) Sk_max from Product_dim) T2;





--Load Sales order fact table


--New sales orders from the previous day


INSERT into Sales_order_fact


SELECT


A.order_number,


Customer_sk,


Product_sk,


E.order_date_sk,


Null


Null


Null


Null


Null


Null


Null


Null


F.request_delivery_date_sk,


Order_amount,


Quantity


From


Rds.sales_order A,


Customer_dim C,


Product_dim D,


Order_date_dim E,


Request_delivery_date_dim F,


Rds.cdc_time g


WHERE


A.order_status = ' N '


and A.customer_number = C.customer_number


and A.status_date >= c.effective_date


and A.status_date < C.expiry_date


and A.product_code = D.product_code


and A.status_date >= d.effective_date


and A.status_date < D.expiry_date


and To_date (a.status_date) = E.order_date


and To_date (a.request_delivery_date) = F.request_delivery_date


and A.entry_date >= G.last_load and A.entry_date < g.current_load;

--handling four status of distribution warehouse, packing, distribution and receiving


DROP TABLE IF EXISTS tmp;


CREATE TABLE tmp as


Select T0.order_number Order_number,


T0.customer_sk Customer_sk,


T0.product_sk Product_sk,


T0.order_date_sk Order_date_sk,


T2.allocate_date_sk Allocate_date_sk,


T1.quantity Allocate_quantity,


T0.packing_date_sk Packing_date_sk,


T0.packing_quantity Packing_quantity,


T0.ship_date_sk Ship_date_sk,


T0.ship_quantity Ship_quantity,


T0.receive_date_sk Receive_date_sk,


T0.receive_quantity Receive_quantity,


T0.request_delivery_date_sk Request_delivery_date_sk,


T0.order_amount Order_amount,


T0.order_quantity order_quantity


From Sales_order_fact t0,


Rds.sales_order T1,


Allocate_date_dim T2,


rds.cdc_time T4


where T0.order_number = t1.order_number and t1.order_status = ' A '


and To_date (t1.status_date) = T2.allocate_date


and T1.entry_date >= T4.last_load and T1.entry_date < t4.current_load;

DELETE from Sales_order_fact WHERE sales_order_fact.order_number in (SELECT order_number from TMP);
INSERT into Sales_order_fact SELECT * from TMP;

DROP TABLE IF EXISTS tmp;


CREATE TABLE tmp as


Select T0.order_number Order_number,


T0.customer_sk Customer_sk,


T0.product_sk Product_sk,


T0.order_date_sk Order_date_sk,


T0.allocate_date_sk Allocate_date_sk,


T0.allocate_quantity Allocate_quantity,


T2.packing_date_sk Packing_date_sk,


T1.quantity Packing_quantity,


T0.ship_date_sk Ship_date_sk,


T0.ship_quantity Ship_quantity,


T0.receive_date_sk Receive_date_sk,


T0.receive_quantity Receive_quantity,


T0.request_delivery_date_sk Request_delivery_date_sk,


T0.order_amount Order_amount,


T0.order_quantity order_quantity


From Sales_order_fact t0,


Rds.sales_order T1,


Packing_date_dim T2,


rds.cdc_time T4


where T0.order_number = t1.order_number and t1.order_status = ' P '


and To_date (t1.status_date) = T2.packing_date


and T1.entry_date >= T4.last_load and T1.entry_date < t4.current_load;





DELETE from Sales_order_fact WHERE sales_order_fact.order_number in (SELECT order_number from TMP);


INSERT into Sales_order_fact SELECT * from TMP;

DROP TABLE IF EXISTS tmp;


CREATE TABLE tmp as


Select T0.order_number Order_number,


T0.customer_sk Customer_sk,


T0.product_sk Product_sk,


T0.order_date_sk Order_date_sk,


T0.allocate_date_sk Allocate_date_sk,


T0.allocate_quantity Allocate_quantity,


T0.packing_date_sk Packing_date_sk,


T0.packing_quantity Packing_quantity,


T2.ship_date_sk Ship_date_sk,


T1.quantity Ship_quantity,


T0.receive_date_sk Receive_date_sk,


T0.receive_quantity Receive_quantity,


T0.request_delivery_date_sk Request_delivery_date_sk,


T0.order_amount Order_amount,


T0.order_quantity order_quantity


From Sales_order_fact t0,


Rds.sales_order T1,


Ship_date_dim T2,


rds.cdc_time T4


where T0.order_number = t1.order_number and t1.order_status = ' S '


and To_date (t1.status_date) = T2.ship_date


and T1.entry_date >= T4.last_load and T1.entry_date < t4.current_load;





DELETE from Sales_order_fact WHERE sales_order_fact.order_number in (SELECT order_number from TMP);


INSERT into Sales_order_fact SELECT * from TMP;

DROP TABLE IF EXISTS tmp;


CREATE TABLE tmp as


Select T0.order_number Order_number,


T0.customer_sk Customer_sk,


T0.product_sk Product_sk,


T0.order_date_sk Order_date_sk,


T0.allocate_date_sk Allocate_date_sk,


T0.allocate_quantity Allocate_quantity,


T0.packing_date_sk Packing_date_sk,


T0.packing_quantity Packing_quantity,


T0.ship_date_sk Ship_date_sk,


T0.ship_quantity Ship_quantity,


T2.receive_date_sk Receive_date_sk,


T1.quantity Receive_quantity,


T0.request_delivery_date_sk Request_delivery_date_sk,


T0.order_amount Order_amount,


T0.order_quantity order_quantity


From Sales_order_fact t0,


Rds.sales_order T1,


Receive_date_dim T2,


rds.cdc_time T4


where T0.order_number = t1.order_number and t1.order_status = ' R '


and To_date (t1.status_date) = T2.receive_date


and T1.entry_date >= T4.last_load and T1.entry_date < t4.current_load;





DELETE from Sales_order_fact WHERE sales_order_fact.order_number in (SELECT order_number from TMP);


INSERT into Sales_order_fact SELECT * from TMP;

--Update the Last_load field of the timestamp table
INSERT OVERWRITE TABLE rds.cdc_time SELECT current_load, current_load from Rds.cdc_time;3. Test the modified periodic mount
(1) Prepare test data
The test uses two new orders with a distribution warehouse, packaging, distribution, and receipt milestones. So each order needs to add five lines. The following script adds 10 rows to the Sales_order table in the source database.

Use source;
DROP TABLE IF EXISTS temp_sales_order_data;
CREATE TABLE Temp_sales_order_data as SELECT * from Sales_order WHERE 1=0;

SET @start_date: = Unix_timestamp (' 2016-07-25 ');
SET @end_date: = Unix_timestamp (' 2016-07-26 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (1, 131, 1, 1, @order_date, ' N ', ' 2016-08-01 ', @order_date, @amount, @quantity);

SET @start_date: = Unix_timestamp (' 2016-07-25 ');
SET @end_date: = Unix_timestamp (' 2016-07-26 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (2, 132, 2, 2, @order_date, ' N ', ' 2016-08-01 ', @order_date, @amount, @quantity);

SET @start_date: = Unix_timestamp (' 2016-07-26 ');
SET @end_date: = Unix_timestamp (' 2016-07-27 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (3, 131, 1, 1, @order_date, ' A ', ' 2016-08-01 ', @order_date, @amount, @quantity);

SET @start_date: = Unix_timestamp (' 2016-07-26 ');
SET @end_date: = Unix_timestamp (' 2016-07-27 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (4, 132, 2, 2, @order_date, ' A ', ' 2016-08-01 ', @order_date, @amount, @quantity);

SET @start_date: = Unix_timestamp (' 2016-07-27 ');
SET @end_date: = Unix_timestamp (' 2016-07-28 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (5, 131, 1, 1, @order_date, ' P ', ' 2016-08-01 ', @order_date, @amount, @quantity);

SET @start_date: = Unix_timestamp (' 2016-07-27 ');
SET @end_date: = Unix_timestamp (' 2016-07-28 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (6, 132, 2, 2, @order_date, ' P ', ' 2016-08-01 ', @order_date, @amount, @quantity);

SET @start_date: = Unix_timestamp (' 2016-07-28 ');
SET @end_date: = Unix_timestamp (' 2016-07-29 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (7, 131, 1, 1, @order_date, ' S ', ' 2016-08-01 ', @order_date, @amount, @quantity);

SET @start_date: = Unix_timestamp (' 2016-07-28 ');
SET @end_date: = Unix_timestamp (' 2016-07-29 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (8, 132, 2, 2, @order_date, ' S ', ' 2016-08-01 ', @order_date, @amount, @quantity);

SET @start_date: = Unix_timestamp (' 2016-07-29 ');
SET @end_date: = Unix_timestamp (' 2016-07-30 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (9, 131, 1, 1, @order_date, ' R ', ' 2016-08-01 ', @order_date, @amount, @quantity);

SET @start_date: = Unix_timestamp (' 2016-07-29 ');
SET @end_date: = Unix_timestamp (' 2016-07-30 ');
SET @order_date: = From_unixtime (@start_date + rand () * (@end_date-@start_date));
SET @amount: = Floor (1000 + rand () * 9000);
SET @quantity: = Floor (+ rand () * 90);
INSERT into Temp_sales_order_data VALUES (132, 2, 2, @order_date, ' R ', ' 2016-08-01 ', @order_date, @amount, @quantity);

INSERT into Sales_order
Select NULL,
Order_number,
Customer_number,
Product_code,
Status_date,
Order_status,
Request_delivery_date,
Entry_date,
Order_amount,
Quantity
From Temp_sales_order_data T1
Order BY T1.status_date;

COMMIT; (2) perform five periodic loads

Use RDS;
Insert OVERWRITE TABLE rds.cdc_time SELECT ' 2016-07-25 ', ' 2016-07-26 ' from Rds.cdc_time; set regular_etl.sql in File Hivevar : Cur_date = Current_date (); row changed to set hivevar:cur_date = ' 2016-07-26 ';

./regular_etl.sh

Use RDS;
Insert OVERWRITE TABLE rds.cdc_time SELECT ' 2016-07-26 ', ' 2016-07-27 ' from Rds.cdc_time; set regular_etl.sql in File Hivevar : Cur_date = Current_date (); row changed to set hivevar:cur_date = ' 2016-07-27 ';

./regular_etl.sh

Use RDS;
Insert OVERWRITE TABLE rds.cdc_time SELECT ' 2016-07-27 ', ' 2016-07-28 ' from Rds.cdc_time; set regular_etl.sql in File Hivevar : Cur_date = Current_date (); row changed to set hivevar:cur_date = ' 2016-07-28 ';

./regular_etl.sh

Use RDS;
Insert OVERWRITE TABLE rds.cdc_time SELECT ' 2016-07-28 ', ' 2016-07-29 ' from Rds.cdc_time; set regular_etl.sql in File Hivevar : Cur_date = Current_date (); row changed to set hivevar:cur_date = ' 2016-07-29 ';

./regular_etl.sh

Use RDS;
Insert OVERWRITE TABLE rds.cdc_time SELECT ' 2016-07-29 ', ' 2016-07-30 ' from Rds.cdc_time; set regular_etl.sql in File Hivevar : Cur_date = Current_date (); row changed to set hivevar:cur_date = ' 2016-07-30 ';

./regular_etl.sh
Check the Sales_order_fact table for two orders.

Use DW;


Select T1.order_number Orn,


T2.order_date OD,


T1.order_quantity OQ,


T3.allocate_date AD,


T1.allocate_quantity AQ,


T4.packing_date PD,


T1.packing_quantity PQ,


T5.ship_date SD,


t1.ship_quantity Sq,


T6.receive_date Rd,


T1.receive_quantity RQ


From Sales_order_fact T1


Inner join Order_date_dim t2 on t1.order_date_sk = T2.order_date_sk


Left join Allocate_date_dim t3 on t1.allocate_date_sk = T3.allocate_date_sk


Left join Packing_date_dim t4 on t1.packing_date_sk = T4.packing_date_sk


Left join Ship_date_dim T5 on t1.ship_date_sk = T5.ship_date_sk


Left join Receive_date_dim T6 on t1.receive_date_sk = T6.receive_date_sk


where T1.order_number in (131, 132); The query results are shown in the following figure.


After testing the set hivevar:cur_date = Current_date () in the Regular_etl.sql file, the line recovery.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.