The difference between fuzzy search conversion and search conversion is that fuzzy match is used. Query and convert matching records in the reference table using the same connection. It returns the exact match in the referenced table. Or no content is returned. In comparison, fuzzy search conversion uses fuzzy match to return one or more matching items in the referenced table.
The referenced data source for fuzzy search conversion must be a database of SQL2000 or higher. In fuzzy match, only dt_wstr and dt_str data types can be used as input columns. Full match applies to all data types except dt_text, dt_ntext, and dt_image. The edit box for fuzzy search conversion has three tabs. Let's introduce them one by one.
1. Reference Table Tab
Attribute |
Description |
Oledb Connection Manager |
Use the list box to select an existing ole db Connection Manager or use the "new" button to create a new connection. |
Generate new index |
Specify the new index to be created during conversion for search |
Name of the referenced table |
Select the table to be referenced in the database |
Store New Indexes |
Select this option if you want to save the new search index. |
New Index name |
If you have selected the new storage Index option, you can set the name for this item. |
Maintain storage Indexes |
If you have selected the new storage Index option, you can select this option. |
Use existing Indexes |
Use existing indexes for search during conversion |
Existing index name |
Select the table to be referenced from the database |
2. Column Tab
Attribute |
Description |
Available input Columns |
Drag the input column to connect it to the available search column. These columns must have supported data types that match each other. Select a ing row and right-click it to edit the ing in the create link dialog box. |
Name |
View the names of available input Columns |
Transfer |
Specifies whether to include an input column in the converted output. |
Available lookup Columns |
Use these check boxes to select the columns for which you want to perform fuzzy search. |
Search Columns |
Select a search column from the list of available columns in the referenced table. Select the check box in the "available query columns" table to select a query column. When you select a column in the "available search columns" table, an output column is created, which contains the reference column values returned for each matching row. |
Output alias |
Enter an alias for the output of each query column. The default value is the name of the query column and a numeric index value is appended. However, you can select a unique descriptive name. |
3. Advanced card selection
Attribute |
Description |
Maximum number of matching outputs for each query |
Specify the maximum number of matching conversions returned for each input line. The default value is1.If the value is greater than 1, the output of the conversion may include multiple rows for each query, and some rows may be repeated. |
Similarity threshold |
Use the slider to set the similarity threshold at the component level. The closer this value is to 1, the closer the similarity between the search value and the source value must be to be considered a match. Because fewer candidate records need to be considered, increasing the threshold can increase the matching speed. |
Delimiter |
Specifies the delimiter used by conversion to split the column values into words. |
The following example describes how to use fuzzy search conversion. We use the TXT text provided in the previous section as the data source.
1. Create a table using the following T-SQL statement in the adventureworks database 2. Create a package named "fuzzylookupdemo" and drag and drop a Data Flow task into the control flow. Create a flat file connector named "employdata" on the Connection Manager interface to connect "E: \ Bi materials \ integrationservices \ empolydata.txt"
3. On the regular tab, set the format attribute to the un-aligned (ragged right) on the Right and the code page attribute to 1252 (ANSI-Latin I ). For other settings, see:
4. On the column tab, click the column boundary to place two split lines at the 5th characters and 28th characters respectively.
5. Click the Advanced tab to set the title's outputcolumnwidth to 50 and save the connection.
6. Drag and Drop a flat file data source on the data flow tab. The connector attribute of the flat file is set to employdata.
7. Drag and Drop a fuzzy search conversion task. Link to the flat file source, and the arrow points to the fuzzy conversion task.
8. Click the new button in Connection Manager properties on the reference tab of the fuzzy search conversion task. Create an ole db Connection Manager. Used to connect to the adventureworks database. For other settings, see
9. In advanced options, the similarity threshold is set to 0.5, and others are default values. For example, Set
10. For the attribute settings on the column tab and the Advanced tab, see the following two diagrams:
11. Drag and Drop an ole db target source and set the ole db Connection Manager to the Database Connector you just created. Set the data access mode to a table or try-fast loading and click New. the dialog box is displayed.
Make the following settings for the SQL script in the box:
Click OK.
12. Click the ing tab to make the following settings.
13. Run the package. For example:
14. You can view the results in the adventureworks database: