Many companies use data warehouses for data analysis. Generally, they extract data from online backup databases (such as mirror, logshipping, and slave) to the ods layer.
From the ods layer to dw to dm, especially from the ods layer to dw, data cleaning and loading takes some time and hardware resources.
But when the hardware becomes a bottleneck, how can we quickly complete cleaning and reprinting and timely provide data analysis?
The following provides a method to load data to the ods layer using Ssis directly through a partition table
1 Preparation
- /*create filegroup*/
- ALTER DATABASE [testxwj] ADD FILEGROUP [account_1]
- go
- ALTER DATABASE [testxwj] ADD FILEGROUP [account_2]
- go
- ALTER DATABASE [testxwj] ADD FILEGROUP [account_3]
-
- /*create file to filegroup*/
-
- ALTER DATABASE [testxwj] ADD FILE ( NAME = N'account_1', FILENAME = N'E:\account_1.ndf' , SIZE = 409600KB , FILEGROWTH = 20480KB ) TO FILEGROUP [account_1]
- GO
- ALTER DATABASE [testxwj] ADD FILE ( NAME = N'account_2', FILENAME = N'E:\account_2.ndf' , SIZE = 409600KB , FILEGROWTH = 20480KB ) TO FILEGROUP [account_2]
- GO
- ALTER DATABASE [testxwj] ADD FILE ( NAME = N'account_3', FILENAME = N'E:\account_3.ndf' , SIZE = 409600KB , FILEGROWTH = 20480KB ) TO FILEGROUP [account_3]
- GO16
2 Use ssis copy table
- sp_spaceused accountdetail;
- /* delete EarnTime is not null*/
-
- /*23 sec*/
- delete from accountdetail where EarnTime is null
- /*26 sec*/
- delete from accountdetail where isnull(CommitStatus,0)<1
- /*12 sec*/
- delete from accountdetail where isnull(EarnStatus,0) =0
Partition the transmitted table
- /*create partition function*/
- declare @bdate char(8),@edate varchar(8),@sql varchar(500)
- select
- @bdate=convert(char(8),GETDATE()-1 ,112)
- ,@edate=convert(char(8),GETDATE() ,112)
- select @bdate,@edate;
- set @sql='
- CREATE PARTITION FUNCTION ac_EarnTime (datetime)
- AS11 RANGE RIGHT FOR VALUES ( '''+@bdate+''' ,'''+@edate+''')'
- execute(@sql)
- /*create partition schema*/
- CREATE PARTITION SCHEME ac_schema_ac_EarnTime
- AS PARTITION ac_EarnTime TO (account_1,account_2,account_3);
- /*create partition table */
- alter table accountdetail
- alter column EarnTime datetime not null;
- alter TABLE accountdetail
- add CONSTRAINT [PK_PARTITIONmis] PRIMARY KEY
- ( id,EarnTime
- )ON ac_schema_ac_EarnTime(EarnTime)
Point partition 2 to dw. It is worth noting that accountdetail_dw must be in the same file group as partition 2.
- /*switch accountdetail to accountdetail_dwl*/
- ALTER TABLE accountdetail SWITCH PARTITION 2 TO accountdetail_dw ;
- /**/
The entire process is within five minutes. The most important part of the data warehouse is the original design and selection.
Original article title: how to improve data warehouse loading when hardware becomes a bottleneck?
Link: http://www.cnblogs.com/xwj1985/archive/2010/08/19/1803272.html