SQL Server 2005: Construct the simplest Fuzzy Lookup pack

Source: Internet
Author: User
Tags sql ole pack reference table name

1. Open DTS Designer.

2. Create a new ETL project, add a new package, click the Data Flow tab, and then accept the add a Data flow item option.

3. On the data flow diagram, drag the OLE DB source and destination transformations from Toolbox, and then connect them by using an instance of a fuzzy lookup.

4. Point the OLE DB source to your new data by selecting a connection and an input table containing the problematic data. Your data must contain some string columns.

5. Double-click Fuzzy Lookup to open the custom user interface (UI). Select the connection and table that you want to convert from the Reference table name drop-down menu to point to the referenced data that is already stored.

6. On the Columns tab, drag the item you want to compare from the Available Input Columns (from the OLE DB source) to Available Lookup Columns (from the reference table). For example, you might want to compare the streetaddress in the input with the address in the referencing table.

7. Select the check box for all items in the Available Lookup Columns, and then click OK.

8. Point the OLE DB target to the connection to which you can write a new table, and then click New. Accept the default creation statement, and now you are ready to run the Blur lookup.

9. To run the package you just created, right-click its name in the Solution Explorer window, and then select Execute.

DTS Designer runs this package and provides detailed visual feedback about the pipeline. Depending on the size of the reference data, you may notice a delay when the fault tolerant index (Error-tolerant index,eti) is created. ETI is the main data structure used by the fuzzy lookup at run time.

After ETI is created, all input rows are processed and the results are written to the target. By displaying the number of rows processed by each component, DTS Designer provides you with feedback about the pipeline's progress. You can also place a dataviewer on the pipe by right-clicking the connector between the Blur lookup and the OLE DB target. This allows you to see in real time those rows where the blur lookup matches your input row. In addition to matching tuples, Fuzzy Lookup also outputs a percentage of confidence and similarity.



Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.