6. ETL learning: Using loops in SSIs

Source: Internet
Author: User
Tags microsoft sql server 2005 ssis

Lesson 1st: create a simple ETL package, create a package for extracting data from a single flat file source, and then convert the data using the search conversion function, finally, load the data to the factcurrency fact data table of the adventureworksdw sample database.

However, a single flat file is rarely used in the extract, transform, and load (ETL) process. A typical ETL process extracts data from multiple flat file sources. Iterative control flow is required to extract data from multiple sources. One of the most likely features of Microsoft SQL Server 2005 integration services (SSIS) is that you will easily add iterations or loops to the package.

Integration Services provides two types of containers for the loop traversal package: foreach loop container and for loop container. The foreach loop container uses the enumerator to execute a loop, while the for loop usually uses a variable expression. This lesson uses the foreach loop container.

The foreach loop container enables the package to repeatedly execute the control flow for each member of the specified enumerator. You can use the foreach loop container to enumerate:

  • Ado record set row and Architecture Information
  • File and directory structure
  • Systems, packages, and user Variables
  • SQL management object (SMO)

In this lesson, you will modify the simple ETL package created in Lesson 1st to use the foreach loop container. The user-defined package variables will also be set so that the tutorial package can iterate through all the flat files in the folder.

In this task, a copy of the lesson 1. dtsx package contained in the SSIS Tutorial Project will be created. You will use this new copy to complete the remaining part of Lesson 2nd.

Create Lesson 2 Package
  1. If business intelligence development studio is not enabled yet, click Start, point to all programs, point to Microsoft SQL Server 2005, and then click business intelligence development studio.

  2. On the File menu, click open, Project/solution, and SSIS tutorial folders, click open again, and double-click SSIS tutorial. sln.

  3. In Solution Explorer, right-click Lesson 1. dtsx and then click Copy ".

  4. In Solution Explorer, right-click "SSIS package" and click "Paste ".

    By default, the copied package is named Lesson 2. dtsx.

  5. In Solution Explorer, double-click Lesson 2. dtsx to open the package.

 

In this task, you will add the folder function for circular access to flat files, and apply the same data flow conversion used in Lesson 1st to each flat file. The method is to add the foreach loop container to the control flow and configure it.

The added foreach loop container must be able to connect to each flat file in the folder. Because all files in this folder have the same format, the foreach loop container can use the same flat file Connection Manager to connect each file. The flat file connection manager used by the container is the same as the flat file connection manager you created in Lesson 1st.

Currently, the flat file Connection Manager in Lesson 1st only connects to a specific flat file. To connect to each flat file in this folder cyclically, you must configure both the foreach loop container and the flat file connection manager as follows:

  • The foreach loop container maps the container's enumerated values to user-defined package variables. Then, the container uses this user-defined variable to dynamically modify the connectionstring attribute of the flat file Connection Manager and cyclically connect each flat file in the folder.
  • The flat file Connection Manager uses User-Defined variables to fill in the connectionstring attribute of the Connection Manager created in Lesson 1st to modify the Connection Manager.

The process in this task shows you how to create and modify a foreach loop container to use user-defined package variables, and how to add a Data Flow task to this loop. You will learn how to modify the flat file Connection Manager to use User-Defined variables in the next task.

After these changes are made to the package, when the package runs, the foreach loop container cyclically accesses the file set in the sample data folder. Every time you find a file that matches the condition, the foreach loop container will fill in the User-Defined variables with the file name, map User-Defined variables to the connectionstring attribute of the samplecurrencydata flat file connection manager, and then run the data stream on the file. Therefore, in each iteration of A foreach loop, a data flow task uses a different flat file.

Note:
Since Microsoft SQL Server 2005 integration services (SSIS) distinguishes between the control flow and the data flow, any loop added to the control flow does not need to modify the data flow. Therefore, you do not need to change the data streams created in Lesson 1st.

 

 

Add a foreach loop container

  1. In business intelligence development studio, click the control flow tab.

  2. In the toolbox, expand control flow items and drag the foreach loop container to the design surface on the control flow tab.

  3. Right-click the newly added foreach loop container and select edit ".

  4. On the "General" page of the "foreach loop Editor" dialog box, enter the "foreach file in folder" for "name.

Configure the enumerator for the foreach loop container
  1. Click set ".

  2. On the "set" Page, select "foreach file enumerator ".

  3. In the enumeration Configuration group, Click Browse ".

  4. In the Browse folder dialog box, find the sample data folder that contains the sample data of the tutorial.

    By default, the sample data of the tutorial is installed in the C:/program files/Microsoft SQL Server/90/samples/integration services/tutorial/creating a simple ETL package/sample data folder.

  5. In the file box, type currency _ *. txt.

Map enumerators to User-Defined variables
  1. Click "variable ing ".

  2. On the "variable" column on the "variable ing" page, click an empty cell and select "<new variable...> ".

  3. In the "add variable" dialog box, type varfilename for "name.

  4. Click OK ".

  5. Click OK again to exit the foreach loop editor dialog box.

Add a Data Flow task to a loop
  • Drag the extract sample currency data flow task to the foreach loop container that is now renamed as foreach file in folder.

 

In this task, you will modify the flat file Connection Manager created and configured in Lesson 1st. The flat file connection manager was configured to load a single file statically when it was initially created. To enable the flat file Connection Manager to reload files, you must modify the connectionstring attribute of the Connection Manager to accept User-Defined variables.User:varFileNameThe variable contains the path of the file to be loaded at runtime.

By modifying the Connection Manager to use User-Defined variablesUser::varFileNameAnd fill in the connectionstring attribute of the Connection Manager. The Connection Manager can connect to different flat files. During running, each iteration of the foreach loop container will be dynamically updated.User::varFileNameVariable. When updating a variable, the Connection Manager is connected to different flat files and the data flow task is used to process other datasets.

Configure the flat file Connection Manager to use the variables of the connection string
  1. In the Connection Manager pane, click samplecurrencydata.

  2. In the "properties" window, click an empty cell for the "expression", and then click the ellipsis (…).

  3. In the attribute column of the attribute expression editor dialog box, type or select connectionstring.

  4. In the expression column, click the ellipsis (…) button. To open the expression generator dialog box.

  5. In the expression generator dialog box, expand the variable node.

  6. Drag the variable User: varfilename to the expression box.

  7. Click OK to close the expression generator dialog box.

  8. Click OK again to close the attribute expression editor dialog box.

    Use the following procedure to test the new loop function that has been added to the package.

    Test Lesson 2 tutorial package
    1. On the Debug menu, click Start debug ".

      The package will run. You can click the progress tab in the output window to verify the status of each loop. For example, you can see 1097 rows from the file currency_veb.txt added to the target table.

    2. After running the package, click "Stop debugging" on the "debug" menu ".

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.