ETL learning 5: creating a new integration services project

Source: Internet
Author: User
Tags microsoft sql server 2005 ssis

The first step to create a package in Microsoft SQL Server 2005 integration services (SSIS) is to create an integration services project. This project contains templates for objects such as data sources, data source views, and packages used in the data conversion solution.

Create a new integration services project
  1. On the Start Menu, point to allProgram"," Microsoft SQL Server 2005 ", and then click SQL server business intelligence studio.

  2. On the File menu, point to new and click project to create a new integration services project.

  3. In the "template" pane of the "new project" dialog box, select "integration services project ".

  4. In the "name" box, change the default name to SSIs tutorial.

  5. Click OK ".

    By default, an empty package named package. dtsx is created and added to the project.

  6. On the Solution Explorer toolbar, right-click package. dtsx, and then click "RENAME" to rename the default package to lesson 1. dtsx.

  7. (Optional) when the system prompts you to rename the package object, click "yes ".

 

 

In this task, a flat file Connection Manager is added to the package you just created. Through the flat file connection manager, the package can extract data from flat files. Using the flat file connection manager, you can specify the name and location of the file to be applied when the package extracts data from the flat file, and set the region andCodePage and file format, including column separator. In addition, you can manually specify the data type for each column. You can also use the "provide column type suggestions" dialog box to automatically map the extracted data columns to the integration services data type.

You must create a new flat file Connection Manager for each file format you want to use. Because this tutorial extracts data from multiple flat files with identical data types, you only need to add and configure a flat file connection manager in the package.

In this tutorial, the following attributes are configured in the flat file Connection Manager:

    • Column names because the flat file does not have a column name, the flat file connection manager creates the default column name. These default names cannot be used to identify the content represented by each column. To make these default names more useful, you need to change the default name to the name of the fact table that you want to load the flat file data.

    • Data mappings is the data type ing specified by the flat file connection manager. It is used by all flat file data source components that reference the Connection Manager. You can use the flat file connection manager or use the "provide column type suggestions" dialog box to manually map data types. In this tutorial, you will view the suggested ing in the "provide column type suggestions" dialog box, and then manually set the necessary ing in the "flat file Connection Manager Editor" dialog box.
Add a flat file Connection Manager
    1. Right-click any location in the Connection Manager area, and then click New flat file connection ".

    2. In the "Connection Manager name" Field of the "flat file Connection Manager Editor" dialog box, type sample flat file source data.

    3. Click Browse ".

    4. In the open dialog box, browse and find the sample data folder, and then open the samplecurrencydata.txt file. By default, the sample data of the tutorial is installed in the c: \ Program Files \ Microsoft SQL Server \ 90 \ samples \ integration services \ tutorial \ creating a simple ETL package \ sample data folder.

Rename columns in the flat file Connection Manager
    1. In the flat file Connection Manager editor dialog box, click Advanced ".

    2. In the properties pane, make the following changes:

      • Change the column 0 name attribute to averagerate.

      • Change the column 1 Name attribute to currencyid.
      • Change the column 2 Name attribute to currencydate.
      • Change the column 3 name attribute to endofdayrate.
      Note:
      By default, all four columns are initially set to the string data type [dt_str], and its outputcolumnwidth is 50.

       

       

Remap column data type
  1. In the flat file Connection Manager editor dialog box, click "recommended type ".

    Integration services automatically recommends the data type based on the first 100 rows of data, but you can change the recommended options to increase or decrease the sampling data and specify the default data type of integer or boolean data, you can also add spaces to the string column as the fill volume.

    However, do not make any changes now. Click Cancel to return to the advanced pane of the flat file Connection Manager editor dialog box to view the recommended column data type.

    In this tutorial, integration services is the data type shown in the second column of the following table in the samplecurrencydata.txt file. However, the data types required by the target columns (which will be defined in subsequent steps) are displayed in the last column of the following table.

    flat file column recommended type Target column target type

    averagerate

    float [dt_r4]

    factcurrencyrate. averagerate

    float

    currencyid

    string [dt_str]

    dimcurrency, currencyalternatekey

    nchar (3)

    currencydate

    date [dt_date]

    dimtime. fulldatealternatekey

    datetime

    endofdayrate

    float [dt_r4]

    factcurrencyrate. endofdayrate

    float

    The recommended data types for the currencyid and currencydate columns are incompatible with the Data Types of fields in the target table. BecauseDimcurrency. currencyalternatekeyThe data type of is nchar (3), and currencyid needs to be changed from string type [dt_str] to string type [dt_wstr]. In additionDimtime. fulldatealternatekeyIt is defined as the datatime data type. Therefore, currencydate must be changed from the date type [dt_date] to the database timestamp type [dt_dbtimestamp].

  2. In the properties pane, change the data type of the column currencyid from string type [dt_str] to Unicode string type [dt_wstr].

  3. In the "properties" pane, change the data type of the column currencydate from the date type [dt_date] to the database timestamp type [dt_dbtimestamp].

  4. Click OK ".

 

After the flat file Connection Manager for connecting to the data source is added, the next task is to add the ole db Connection Manager for connecting to the target. Through the ole db Connection Manager, the package can extract or load data from any ole db-compatible data source. You can use the ole db Connection Manager to specify the server, authentication method, and default database for the connection.

In this lesson, you will create an ole db Connection Manager that uses Windows authentication to connect to a local instance of adventureworksdb. Other components to be created later in this tutorial (such as finding the conversion and ole db target) will also reference the ole db Connection Manager created here.

Add and configure the ole db Connection Manager
  1. Right-click any location in the Connection Manager Area and click "create ole db connection ".

  2. In the "Configure ole db Connection Manager" dialog box, click "new ".

  3. In "server name", enter localhost.

    When localhost is specified as the server name, The Connection Manager connects to the default instance of Microsoft SQL Server 2005 on the local computer. To use a remote instance of SQL Server 2005, replace localhost with the name of the server to be connected.

  4. In the "log on to server" group, confirm that "use Windows Authentication" is selected ".

  5. In the "connect to Database" group's "select or enter database name" box, type or select adventureworksdw.

  6. Click test connection to verify whether the specified connection settings are valid.

  7. Click OK ".

  8. Click OK ".

  9. In the "Data Connection" pane of the "Configure ole db Connection Manager" dialog box, confirm that localhost. adventureworksdw is selected.

  10. Click OK ".

      After the Connection Manager is created for the source and target data, the next task is to add a Data Flow task to the package. A Data Flow task encapsulates the data flow engine that moves data between the source and target, and provides the function of converting, clearing, and modifying data when moving data. Most data extraction, conversion, and loading (ETL) processes are completed in Data Flow tasks.

      Note:
      Microsoft SQL Server 2005 integration services separates data streams from the control flow. Separating data streams from the control flow is one of the important differences between integration services and Microsoft SQL Server 2000 data transformation services.

       

       

      Add a Data Flow task
        1. Click the control flow tab.

        2. In the toolbox, expand control flow items and drag a Data Flow task to the design surface of the control flow tab.

        3. On the Control Flow Design surface, right-click the newly added data flow task, and then click Rename to change the name to extract sample currency data.

           

          A good practice is to provide a unique name for all components added to the design drawing. Considering ease of use and maintainability, the name should indicate the functions executed by each component. According to these naming guidelines, the integration services package can be self-described. Another way to describe the package is to use annotations. For more information about annotations, see .

       

       

      After the flat file source for extracting data from the source file is configured, the next task is to define the search conversion needed to get the values of currencykey and timekey. Query conversion joins data in a specified input column to a column in the referenced dataset for query. The referenced dataset can be the result of an existing table or view, a new table, or an SQL statement. Find the database that the conversion uses ole db Connection Manager to connect to the source data that contains the referenced dataset.

      In this tutorial, you will add the following two find conversion components to the package and configure them:

      • A conversion is to query the value in the currencykey column of the dimcurrency dimension table based on the value of the matched currencyid column in the flat file.

      • A conversion is to query the value in the timekey column of the dimtime dimension table based on the value of the matched currencydate column in the flat file.

    In either case, the search conversion uses the previously created ole db Connection Manager.

    Add and configure the currencykey lookup and conversion
    1. In the toolbox, expand "Data Flow conversion" and drag "Search" to the design surface of the "data flow" tab.

    2. Click the extract sample currency data flat file source and drag the green arrow to the newly added "Search" conversion to connect the two components.

    3. On the "data flow" design surface, right-click the newly added "Search" conversion, click "RENAME", and change the name to lookup currency key.

    4. Double-click lookup currency key conversion.

    5. In the "ole db Connection Manager" box in the "Find conversion Editor" dialog box, make sure that localhost. adventureworksdw is displayed.

    6. In the Use table or view box, type or select [DBO]. [dimcurrency].

    7. Click the columns tab.

    8. On the "available input columns" Panel, Drag and Drop currencyid to the currencyalternatekey on the "available search columns" panel.

    9. Select currencykey.

    10. Click OK ".

    Add and configure datekey lookup and conversion
    1. In the toolbox, Drag search to the data flow design surface.

    2. Click lookup currency key conversion, and drag the green arrow to the newly added "Search" conversion to connect the two components.

    3. On the "data flow" design surface, right-click the newly added "Search" conversion, click "RENAME", and change the name to lookup date key.

    4. Double-click lookup date key conversion.

    5. In the "ole db Connection Manager" box in the "Find conversion Editor" dialog box, make sure that localhost. adventureworksdw is displayed.

    6. In the Use table or view box, type or select [DBO]. [dimtime].

    7. Click the columns tab.

    8. On the "available input columns" Panel, Drag and Drop currencydate to the fulldatealternatekey on the "available search columns" panel.

    9. Select timekey.

    10. Click OK ".

     

    In this lesson, you have completed the following tasks:

      • A new Microsoft SQL Server 2005 integration services (SSIS) project is created.

      • Configure the Connection Manager required to connect the package to the source and target data.
      • A data stream is added, which extracts data from the flat file source, performs necessary search and conversion on the data, and configures data for the target.

    The package is now complete! The package is tested.

    Run Lesson 1 tutorial package

      1. On the Debug menu, click Start debug ".

        The package starts to run, and 1097 rows are successfully added to the factcurrency fact data table in adventureworksdw.

      2. After running the package, click "Stop debugging" on the "debug" menu ".

     

    Contact Us

    The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

    If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

    A Free Trial That Lets You Build Big!

    Start building with 50+ products and up to 12 months usage for Elastic Compute Service

    • Sales Support

      1 on 1 presale consultation

    • After-Sales Support

      24/7 Technical Support 6 Free Tickets per Quarter Faster Response

    • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.