Lesson 1st: create a simple ETL package, create a package for extracting data from a single flat file source, and then convert the data using the search conversion function, finally, load the data to the factcurrency fact data table of the adventureworksdw sample database.
However, a single flat file is rarely used in the extract, transform, and load (ETL) process. A typical ETL process extracts data from multiple flat file sources. Iterative control flow is required to extract data from multiple sources. One of the most likely features of Microsoft SQL Server 2005 integration services (SSIS) is that you will easily add iterations or loops to the package.
Integration Services provides two types of containers for the loop traversal package: foreach loop container and for loop container. The foreach loop container uses the enumerator to execute a loop, while the for loop usually uses a variable expression. This lesson uses the foreach loop container.
The foreach loop container enables the package to repeatedly execute the control flow for each member of the specified enumerator. You can use the foreach loop container to enumerate:
- Ado record set row and Architecture Information
- File and directory structure
- Systems, packages, and user Variables
- SQL management object (SMO)
In this lesson, you will modify the simple ETL package created in Lesson 1st to use the foreach loop container. The user-defined package variables will also be set so that the tutorial package can iterate through all the flat files in the folder.
In this task, a copy of the lesson 1. dtsx package contained in the SSIS Tutorial Project will be created. You will use this new copy to complete the remaining part of Lesson 2nd.
Create Lesson 2 Package
If business intelligence development studio is not enabled yet, click Start, point to all programs, point to Microsoft SQL Server 2005, and then click business intelligence development studio.
On the File menu, click open, Project/solution, and SSIS tutorial folders, click open again, and double-click SSIS tutorial. sln.
In Solution Explorer, right-click Lesson 1. dtsx and then click Copy ".
In Solution Explorer, right-click "SSIS package" and click "Paste ".
By default, the copied package is named Lesson 2. dtsx.
In Solution Explorer, double-click Lesson 2. dtsx to open the package.
In this task, you will add the folder function for circular access to flat files, and apply the same data flow conversion used in Lesson 1st to each flat file. The method is to add the foreach loop container to the control flow and configure it.
The added foreach loop container must be able to connect to each flat file in the folder. Because all files in this folder have the same format, the foreach loop container can use the same flat file Connection Manager to connect each file. The flat file connection manager used by the container is the same as the flat file connection manager you created in Lesson 1st.
Currently, the flat file Connection Manager in Lesson 1st only connects to a specific flat file. To connect to each flat file in this folder cyclically, you must configure both the foreach loop container and the flat file connection manager as follows:
- The foreach loop container maps the container's enumerated values to user-defined package variables. Then, the container uses this user-defined variable to dynamically modify the connectionstring attribute of the flat file Connection Manager and cyclically connect each flat file in the folder.
- The flat file Connection Manager uses User-Defined variables to fill in the connectionstring attribute of the Connection Manager created in Lesson 1st to modify the Connection Manager.
The process in this task shows you how to create and modify a foreach loop container to use user-defined package variables, and how to add a Data Flow task to this loop. You will learn how to modify the flat file Connection Manager to use User-Defined variables in the next task.
After these changes are made to the package, when the package runs, the foreach loop container cyclically accesses the file set in the sample data folder. Every time you find a file that matches the condition, the foreach loop container will fill in the User-Defined variables with the file name, map User-Defined variables to the connectionstring attribute of the samplecurrencydata flat file connection manager, and then run the data stream on the file. Therefore, in each iteration of A foreach loop, a data flow task uses a different flat file.
Note: |
Since Microsoft SQL Server 2005 integration services (SSIS) distinguishes between the control flow and the data flow, any loop added to the control flow does not need to modify the data flow. Therefore, you do not need to change the data streams created in Lesson 1st. |
Add a foreach loop container
In business intelligence development studio, click the control flow tab.
In the toolbox, expand control flow items and drag the foreach loop container to the design surface on the control flow tab.
Right-click the newly added foreach loop container and select edit ".
On the "General" page of the "foreach loop Editor" dialog box, enter the "foreach file in folder" for "name.
Configure the enumerator for the foreach loop container
Click set ".
On the "set" Page, select "foreach file enumerator ".
In the enumeration Configuration group, Click Browse ".
In the Browse folder dialog box, find the sample data folder that contains the sample data of the tutorial.
By default, the sample data of the tutorial is installed in the C:/program files/Microsoft SQL Server/90/samples/integration services/tutorial/creating a simple ETL package/sample data folder.
In the file box, type currency _ *. txt.
Map enumerators to User-Defined variables
Click "variable ing ".
On the "variable" column on the "variable ing" page, click an empty cell and select "<new variable...> ".
In the "add variable" dialog box, type varfilename for "name.
Click OK ".
Click OK again to exit the foreach loop editor dialog box.
Add a Data Flow task to a loop
In this task, you will modify the flat file Connection Manager created and configured in Lesson 1st. The flat file connection manager was configured to load a single file statically when it was initially created. To enable the flat file Connection Manager to reload files, you must modify the connectionstring attribute of the Connection Manager to accept User-Defined variables.User:varFileName
The variable contains the path of the file to be loaded at runtime.
By modifying the Connection Manager to use User-Defined variablesUser::varFileName
And fill in the connectionstring attribute of the Connection Manager. The Connection Manager can connect to different flat files. During running, each iteration of the foreach loop container will be dynamically updated.User::varFileName
Variable. When updating a variable, the Connection Manager is connected to different flat files and the data flow task is used to process other datasets.
Configure the flat file Connection Manager to use the variables of the connection string
In the Connection Manager pane, click samplecurrencydata.
In the "properties" window, click an empty cell for the "expression", and then click the ellipsis (…).
In the attribute column of the attribute expression editor dialog box, type or select connectionstring.
In the expression column, click the ellipsis (…) button. To open the expression generator dialog box.
In the expression generator dialog box, expand the variable node.
Drag the variable User: varfilename to the expression box.
Click OK to close the expression generator dialog box.
Click OK again to close the attribute expression editor dialog box.