This section describes how the ETL (data extraction, transformation, and loading) for my game transaction data analysis project is implemented.
Let's start with the source system. The server for our main trading site is not hosted at our company, so we cannot extract data directly from the source system. In fact, we already have a simple data analysis system, so this is not something we need to worry about. That system was not built on the SQL Server 2005 BI platform; we wrote it directly as a WinForms program. The raw data arrives as Excel files provided by the main site and is then imported into our analysis database.
For convenience, and because this is really only a demo in preparation for future projects, we extract data directly from the existing analysis database mentioned above. Since its data structure is similar to the target's, the ETL process is relatively simple.
Let's first look at the game dimension table:
First, create an Integration Services project.
Create an SSIS package named importdimgamepackage.dtsx.
Drag a Data Flow task onto the control flow panel.
Double-click the Data Flow task to go to the data flow panel.
Next, drag an OLE DB source onto the data flow panel, double-click it to edit it, create a connection, and select the source table to extract data from.
Then drag an OLE DB destination onto the data flow panel, connect the OLE DB source to it, and double-click the destination to edit it: create a new connection and select the target table that will store the data.
Create the mapping between the columns of the source table and the target table.
That is everything. Run the package, and it completes successfully.
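For readers who think in code, the source-to-destination copy above can be sketched outside SSIS. This is only an illustration in Python with an in-memory SQLite database standing in for both connections; the table names (src_game, dim_game), the column names, and the copy_with_mapping helper are all hypothetical and are not taken from the project.

```python
import sqlite3

# Hypothetical column mapping: source column -> destination column.
COLUMN_MAPPING = {"id": "game_id", "name": "game_name"}


def copy_with_mapping(conn, source_table, target_table, mapping):
    """Copy every row from source_table into target_table, renaming columns per mapping.

    Table and column names are interpolated into the SQL text, so they must be
    trusted constants, never user input.
    """
    source_cols = ", ".join(mapping.keys())
    target_cols = ", ".join(mapping.values())
    conn.execute(
        f"INSERT INTO {target_table} ({target_cols}) "
        f"SELECT {source_cols} FROM {source_table}"
    )
    conn.commit()


# Demo: one in-memory database plays the role of both source and destination.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src_game (id TEXT, name TEXT)")
conn.execute("CREATE TABLE dim_game (game_id TEXT, game_name TEXT)")
conn.executemany(
    "INSERT INTO src_game VALUES (?, ?)",
    [("G001", "Game A"), ("G002", "Game B")],
)
copy_with_mapping(conn, "src_game", "dim_game", COLUMN_MAPPING)
loaded = conn.execute(
    "SELECT game_id, game_name FROM dim_game ORDER BY game_id"
).fetchall()
print(loaded)
```

The dictionary plays the role of the column mapping set up in the destination editor: each source column is paired with exactly one destination column.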
The department and item dimensions are loaded in the same way. The time dimension I generate myself: for now, I have written a date dimension generator as a WinForms program.
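The generator itself is not shown in this post. As a rough idea of what such a tool does, here is a minimal Python sketch that produces one row per calendar day. The column names and the yyyymmdd integer used as DateKey are assumptions made for illustration, not the layout of the actual time dimension.

```python
from datetime import date, timedelta


def generate_date_dimension(start, end):
    """Yield one dictionary per calendar day from start to end, inclusive."""
    current = start
    while current <= end:
        yield {
            # Assumed key format: the date written as the integer yyyymmdd.
            "DateKey": current.year * 10000 + current.month * 100 + current.day,
            "FullDate": current.isoformat(),
            "Year": current.year,
            "Quarter": (current.month - 1) // 3 + 1,
            "Month": current.month,
            "Day": current.day,
            "DayOfWeek": current.isoweekday(),  # 1 = Monday ... 7 = Sunday
        }
        current += timedelta(days=1)


# Demo: build the rows for one full year.
date_rows = list(generate_date_dimension(date(2008, 1, 1), date(2008, 12, 31)))
print(len(date_rows), date_rows[0])
```

The resulting rows could then be inserted into the time dimension table in the same way as any other dimension.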
Next comes the fact table for the transaction data, which is a little more complex than the dimensions. Because the data structures of the source and target systems are very similar, my ETL is very simple. In a real project you are unlikely to be this lucky.
Create a new SSIS package named importfactgametradedatapackage.dtsx.
The basic steps are the same as above. In the database design, every dimension uses a surrogate key. That is, in our data warehouse the foreign key relationships between the dimension tables and the fact table go through the surrogate keys, and the key from the source system is only recorded as a single field. Therefore, when we extract the data, we need to replace the key relationships of the source system with the surrogate keys.
The key point is that when we edit the data flow, we use a component called Lookup.
[Figure: the edited data flow]
Next we explain in detail how to use the Lookup component, taking the lookup of gamekey as the example.
1. Double-click the gamekey Lookup component and, on the Reference Table tab, select the table to look up in. Here that is the game dimension table in our data warehouse.
2. On the Columns tab, create the mapping between the field to look up and the corresponding field from the source system's fact table.
3. Add the looked-up column as a new column and give it a unique alias (this column is used later when the mapping to the OLE DB destination is set up).
The lookups for the surrogate keys of the other dimensions are similar. The field mapping for the OLE DB destination is done in the same way as for the dimension table.
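The effect of the lookup steps above can be sketched in a few lines of Python. This is not how SSIS implements the Lookup component; it only shows the idea of replacing a source system key with the warehouse surrogate key. The table dim_game, the columns game_id and amount, and both helper functions are hypothetical names chosen for the example.

```python
import sqlite3


def build_lookup(conn, dim_table, business_key_col, surrogate_key_col):
    """Load a dimension table into a dict: source system key -> surrogate key."""
    sql = f"SELECT {business_key_col}, {surrogate_key_col} FROM {dim_table}"
    return dict(conn.execute(sql).fetchall())


def resolve_keys(source_rows, game_lookup):
    """Swap each row's source game id for the warehouse gamekey.

    Rows whose game id is not in the dimension are returned separately
    instead of being loaded.
    """
    matched, unmatched = [], []
    for row in source_rows:
        key = game_lookup.get(row["game_id"])
        if key is None:
            unmatched.append(row)
        else:
            matched.append({"gamekey": key, "amount": row["amount"]})
    return matched, unmatched


# Demo: a tiny game dimension and three source fact rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_game (gamekey INTEGER PRIMARY KEY, game_id TEXT UNIQUE)")
conn.executemany(
    "INSERT INTO dim_game (gamekey, game_id) VALUES (?, ?)",
    [(1, "G001"), (2, "G002")],
)
game_lookup = build_lookup(conn, "dim_game", "game_id", "gamekey")
source_rows = [
    {"game_id": "G002", "amount": 30.0},
    {"game_id": "G001", "amount": 12.5},
    {"game_id": "G999", "amount": 5.0},  # no matching dimension row
]
matched, unmatched = resolve_keys(source_rows, game_lookup)
print(matched)
print(unmatched)
```

The matched rows carry gamekey as a new column, which is what gets mapped to the fact table's foreign key column. How unmatched rows should be handled is a design decision that this post does not cover; the sketch simply sets them aside.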
Well, today I mainly introduced the basic use of SQL Server 2005 Integration Services. Along the way I covered several concepts: the Data Flow task, the OLE DB source, the OLE DB destination, and Lookup.
In the next section, I want to write about the process of creating an OLAP database using SQL Server 2005 Analysis Services.
http://www.cnblogs.com/lijun4017/archive/2008/04/23/1166812.html