SSIS advanced conversion task-import Column

Source: Internet
Author: User
Tags ole ssis

In the SSIS advanced conversion task-export column article, we mainly export the file columns in the database. Here we will discuss how to import files to the database, it is a pair of frequently used tasks with the export Column task. When we figure out what functions they implement, we will find that the original names are more appropriate. This type of conversion converts physical files in the system file path into table data in the database, and vice versa. The key to understanding the input Column task is to have at least one column in the input source to save the path of the file to be imported to the database. You also need to save the file and file path in the target column. This conversion still uses advanced editing.

It is not intuitive enough because it is an advanced editing, and it does not have the same settings as the wizard, but it will be very convenient to use it after you are familiar with it. In advanced editing, there is no such setting as merging two columns into one column. Therefore, if the file path and file name in the source data are separated, you need to use merge to connect them.

Import column example

Now let's take an example to import an image file to the adventureworks database. Create a new package named importcolumnexample, add a Data Flow task, drag and drop an import column transformation on the data flow interface, and set the package according to the following steps:

  1. Create path c: \ Users \ pictures \ mypicture
  2. Paste 3 smaller image files in path C: \ Users \ pictures \ mypicturewith the names 1.jpg, 2.jpg, and 3.jpg.
  3. Create a TXT file named filelist.txt in path C: \ Users \ pictures \ mypicture. the file content is as follows:
    Imagefilepath
    C: \ Users \ pictures \ mypicture \1. Jpg
    C: \ Users \ pictures \ mypicture \2. Jpg
    C: \ Users \ pictures \ mypicture \3. Jpg
  4. Run the following SQL statement to create a table
     Use  Adventureworks
    Go
    Create Table DBO. tblmyimages
    (
    [ Storedfilepath ] [ Varchar ] ( 50 ) Not Null ,
    [ Document ] Image
    )
  5. Use filelist.txt as the stream of files to be loaded into the database. Add a flat file sourceand filelist.txt as the data source. Right-click the flat file source and select Show advanced editor. It is different from the editing interface opened by double-clicking. If no setting wizard is available, you can directly set properties. It may be difficult to identify the issue, but in the worst case, You can directly Delete and reset the configuration. The advanced editing page is shown in Figure 1:
    Figure 1
  6. In this example, the flat file source advanced editing interface is similar to the import column transormation advanced editing interface. On the import and Output Properties tab page, there are two folder nodes: external columns and output columns. They both have a subnode: imagefilepath, the name of the data column to be output by the flat file source is the same as that of the data source in imagefilepath. We can also see that the attribute ID is 16 and externalmetadatacolumnid is 15. This indicates that the conversion task connects the source input and output. We noticed that there are two buttons: Add output and remove output, but this option cannot be configured in this task. Here, his role is to convert the characters in the file to the character stream. The column mappings label displays the ing relationship. If you change the column name in filelist.txt to myimagefilepath, the column will also be displayed as myimagefilepath.

    Figure 2 txtfilepath is another text file column that I added later.
  7. Connect flat file source and import column. Open the advanced edit option of the import Column task and select the input column label. The input stream is the output stream of the flat file source. Click Select imagefilepath to switch to the input and output propertitis interface, 6-6, there is an imagefilepath column in the input column set, but it is not in the output column set. You can ignore the input in the flat file source, but all the input here must have corresponding output. In fact, if there is no output, the following error message is displayed:
    Validation Error. Data Flow task: Import column [1]: The "input column" imagefilepath "(164)" references output column ID 0, and that column is not found
  8. Select the output columns folder and click Add column to add a new column named myimage. Note that the datatype attribute is image [dt_image]. This is because the input stream contains image files. Here, you can select dt_text, dt_ntext, or dt_image as the night course. Connect the input and output. Note that the ID attribute of the myimage column is 77. Set the filedatacolumnid attribute of the imagefilepath in the input column set to 77, otherwise, the following error message may be displayed:
    Validation Error. data Flow task: Import column [1]: The "output column" myimage "(207)" is not referenced by any input column. each output column must be referenced by exactly one input column. this means that myimage is not referenced by the input column, and each output must have a corresponding input. Note that if there are multiple columns, you should also set the mappedcolumnid attribute in the input columns.
  9. Add ole db desination, connect import column and ole db destination, edit ole db destination and set it to the table created above, and click mappings setting to set the column ing to the final running package, page 3:

    Figure 3

Finally, the data in the target table in the database is as follows:

Fullfilename document

---------------------------------------------------------

C: \ import \ images \ 1.jpg 0xffd8ffe120ee4578696600004920.a00...

C: \ import \ images \ 2.jpg 0xffd8ffe125fe4578696600004920.a00...

C: \ import \ images \ 3.jpg 0xffd8ffe1269b4578696600004920.a00...

(3 row (s) affected)

Use a loop to import Columns

 

In the real environment, you may rarely encounter the above situation. You may need to import a series of files into the database, and the paths of these files are neatly written into a text file. The actual situation is to get a file from an FTP folder, the folder in the clipboard, and load the file into the database. Here you need to build your own file path. You can use foreach loop iner to retrieve files. The key point is to find the volume stream that can pass the file path. Foreach loop container can get a series of neat files, but it cannot be converted into a data source. You can use SQL statements to generate neat file paths, but there is a better way here, read the files in the folder and save the file path to the database. Then, use ole db source to access the data and save the file in the database.

  1. Use the following statement to create a table in the database use adventureworks
    Go
    Create Table stgfilelist
    (
    [Fullfilename] [varchar] (50) not null,
    )
  2. Drag and Drop a foreach loop container in the control flow area. The foreach container enumerator attribute is set to foreach file enumerator, the folder attribute is set to c: \ Users \ pictures \ mypicture, and the files is set to .jpg, set the retrieve file name attribute to fully qualified
  3. Click variable mappings to create a variable in the package range of the string type, change the name to myfilepath, and SET index to 0 by default. This saves the names of all files in the path to the database. 4

    Figure 4
  4. Drag and Drop an execute SQL statement in foreach loop container and set the connection attribute to database adventure.
  5. Click the expression option, find the sqlstatementsource attribute, and write an SQL statement to write the path of the file retrieved by the current variable myfilepath to the database. The expression is as follows, including double quotation marks: "insert into stgfilelist select '" + @ [user: myfilepath] + "'"
  6. Connect the foreach loop iner with the data flow task in the previous exercise. It is directly used here and will not be described in detail.
  7. Add an execute SQL task to the task and delete the history. The statement is as follows:
     If     Exists  (  Select     *     From  SYS. Objects  Where    Object_id     =     Object_id  (N  '  [DBO]. [stgfilelist]  '  )  And  Type  In  (N  '  U  '  ))
    Begin
    Truncate Table DBO. stgfilelist
    End
    If Exists ( Select * From SYS. Objects Where Object_id = Object_id (N ' [DBO]. [tblmymp3] ' ) And Type In (N ' U ' ))
    Begin
    Truncate Table DBO. tblmymp3
    End
  8. Note that the import column has an error because the following error message is displayed when the data source is modified: validation error. data Flow task: DTs. pipeline: Input column "fullfilename" (336) has lineage ID 319 that was not previusly used in the data flow task
  9. Open the import column advanced editing interface and click Edit again. Invalid column reference information 5 is displayed. Here, you can select invalid ing and delete, and reselect ole db source as the source. Each column has a separate ID. The column name is similar to setting a new column in the drop-down list box in columns, and saving it. Open ole db destination to ensure that the data is correctly mapped to the data table.

    Figure 5
  10. the execution package shows that the result file is saved to the tblmyimages table in the database, and the path information is saved to the stgfilelist table. Result 6:
    figure 6

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.