Sort data in SSIS

Source: Internet
Author: User
Tags: ole, ssis

There are two ways to sort data in SSIS: use the Sort transformation, or use an ORDER BY clause in the SQL command of the source.

One, sort by using the Sort transformation

SortType: the sort direction, either ascending or descending.

SortOrder: the position of the column in the sort sequence, starting from 1 and incrementing by 1.

Remove rows with duplicate sort values: whether to delete rows whose sort-column values repeat. This differs from DISTINCT, which compares all output columns; checking this option only ensures that the values of the sort columns (a subset of the output columns) are not duplicated.

These properties can be viewed and set in the Advanced Editor of the Sort transformation.
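The difference between DISTINCT and removing duplicate sort values can be illustrated with a small sketch. This is plain Python that models the two behaviours, not SSIS code; the sample rows, column names, and both helper functions are hypothetical. Which of the duplicate rows survives is an assumption of this sketch (it keeps the first one it sees).

```python
# Hypothetical sample rows: "id" is the sort column, "city" is a pass-through column.
rows = [
    {"id": 1, "city": "Oslo"},
    {"id": 1, "city": "Bergen"},
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": "Oslo"},
]


def distinct(rows):
    """Drop a row only when every column matches an earlier row."""
    seen, out = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out


def remove_duplicate_sort_values(rows, sort_columns):
    """Sort by sort_columns and keep one row per distinct sort-key value."""
    seen, out = set(), []
    for row in sorted(rows, key=lambda r: tuple(r[c] for c in sort_columns)):
        key = tuple(row[c] for c in sort_columns)
        if key not in seen:  # assumption: the first row with a given key is kept
            seen.add(key)
            out.append(row)
    return out


distinct_rows = distinct(rows)
dedup_rows = remove_duplicate_sort_values(rows, ["id"])
print(len(distinct_rows), len(dedup_rows))
```

DISTINCT keeps both the Oslo and the Bergen row for id 1, because they differ in a non-sort column, while the sort-key variant keeps only one row per id.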

Two, sort by using an ORDER BY clause in the SQL command

Step 1: use an OLE DB source to provide the data. The query must actually return the data in sorted order.

SELECT * FROM <table_name> WITH (NOLOCK) ORDER BY <column_list> [ASC | DESC]

Step 2: open the Advanced Editor for the OLE DB source and go to the Input and Output Properties tab.

1. Click OLE DB Source Output and set the IsSorted property to True. Setting this property to True does not sort the data; it only tells downstream components that the output is already sorted.

If IsSorted is set to True but the data is not actually sorted, downstream components such as Merge or Merge Join can produce missing or incorrect results when the package runs, so you must supply sorted data (for example, by using ORDER BY in the SQL command).

2. Expand Output Columns and set the SortKeyPosition property of each column in the ORDER BY column list, one by one.

The SortKeyPosition property carries two pieces of metadata, the sort position and the sort direction:

A positive integer means the column is sorted in ascending order, a negative integer means it is sorted in descending order, and 0 means the column is not a sort column. The absolute value of the number is the column's position in the sort sequence.

For example, consider the following SQL statement:

SELECT col_1, col_2, col_3, col_4 FROM dbo.TableName ORDER BY col_1 ASC, col_2 DESC, col_3 ASC

SortKeyPosition must be set individually on each of the output columns col_1, col_2, col_3, and col_4.
Because col_1, col_2, and col_3 are sort columns, their sequence numbers increment from 1; col_4 is not a sort column. The configuration of SortKeyPosition is therefore as follows:

The SortKeyPosition of col_1 is 1: it is the first sort column and is sorted in ascending order.

The SortKeyPosition of col_2 is -2: it is the second sort column and is sorted in descending order.

The SortKeyPosition of col_3 is 3: it is the third sort column and is sorted in ascending order.

The SortKeyPosition of col_4 is 0: it is not a sort column.
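The mapping from an ORDER BY list to SortKeyPosition values can be sketched in a few lines. This is a plain Python illustration of the rule described above, not an SSIS API; the function name `sort_key_positions` and its argument format are hypothetical.

```python
def sort_key_positions(select_columns, order_by):
    """Map each output column to a SortKeyPosition value.

    order_by is a list of (column, direction) pairs in ORDER BY order,
    where direction is "ASC" or "DESC".
    """
    positions = {col: 0 for col in select_columns}  # 0 = not a sort column
    for index, (col, direction) in enumerate(order_by, start=1):
        if col not in positions:
            raise ValueError(f"{col} is not an output column")
        direction = direction.upper()
        if direction not in ("ASC", "DESC"):
            raise ValueError(f"unknown sort direction: {direction}")
        # Positive = ascending, negative = descending; magnitude = position.
        positions[col] = index if direction == "ASC" else -index
    return positions


positions = sort_key_positions(
    ["col_1", "col_2", "col_3", "col_4"],
    [("col_1", "ASC"), ("col_2", "DESC"), ("col_3", "ASC")],
)
print(positions)
```

For the statement above this yields 1, -2, 3, and 0 for col_1 through col_4, matching the values listed.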

MSDN Official Documentation

Sort Data for the Merge and Merge Join Transformations

In Integration Services, the Merge and Merge Join transformations require sorted data for their inputs. The input data must be sorted physically, and sort options must be set on the outputs and the output columns in the source or in the upstream transformation. If the sort options indicate that the data is sorted, but the data is not actually sorted, the results of the merge or merge join operation are unpredictable.
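Why unsorted input leads to unpredictable results can be seen in a simplified model of a merge join. The sketch below is plain Python and is not the SSIS implementation; it assumes unique keys and ascending order, and the function name and sample data are hypothetical. Because the algorithm makes a single forward pass and never looks back, rows that arrive out of order are silently skipped.

```python
def merge_join(left, right):
    """Inner join of two lists of (key, value) pairs in one forward pass.

    Assumes both inputs are sorted ascending by key and that keys are unique.
    """
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key == right_key:
            out.append((left_key, left[i][1], right[j][1]))
            i += 1
            j += 1
        elif left_key < right_key:
            i += 1  # left row has no match; advance left
        else:
            j += 1  # right row has no match; advance right
    return out


right = [(1, "x"), (2, "y"), (3, "z")]
sorted_left = [(1, "a"), (2, "b"), (3, "c")]
unsorted_left = [(3, "c"), (1, "a"), (2, "b")]

good = merge_join(sorted_left, right)
bad = merge_join(unsorted_left, right)
print(len(good), len(bad))
```

With sorted input all three keys match. With the unsorted left input only key 3 is matched, and no error is raised, which is why the sort metadata must describe the data truthfully.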

You can sort this data by using one of the following methods:

    • In the source, use an ORDER BY clause in the statement that is used to load the data.

    • In the data flow, insert a Sort transformation before the Merge or Merge Join transformation.

If the data is string data, both the Merge and Merge Join transformations expect the string values to have been sorted by using Windows collation. To provide string values to the Merge and Merge Join transformations that are sorted by using Windows collation, use the following procedure.

To provide string values that are sorted by using Windows collation
  • Use a Sort transformation to sort the data.

    The Sort transformation uses Windows collation to sort string values.

    -or-

  • Use the Transact-SQL CAST operator to first cast varchar values to nvarchar values, and then use the Transact-SQL ORDER BY clause to sort the data.

    Important

    You cannot use the ORDER BY clause alone because the ORDER BY clause uses a SQL Server collation to sort string values. The use of the SQL Server collation might result in a different sort order than Windows collation, which can cause the Merge or Merge Join transformation to produce unexpected results.

Setting Sort Options on the Data

There are two important sort properties that must be set for the source or upstream transformation that supplies data to the Merge and Merge Join transformations:
  • The IsSorted property of the output that indicates whether the data has been sorted. This property must be set to True.

    Important

    Setting the value of the IsSorted property to True does not sort the data. This property only provides a hint to downstream components that the data has been previously sorted.

  • The SortKeyPosition property of output columns that indicates whether a column is sorted, the column's sort order, and the sequence in which multiple columns are sorted. This property must be set for each column of sorted data.

If you use a Sort transformation to sort the data, the Sort transformation sets both of these properties as required by the Merge or Merge Join transformation. That is, the Sort transformation sets the IsSorted property of its output to True, and sets the SortKeyPosition properties of its output columns.

However, if you do not use a Sort transformation to sort the data, you must set these sort properties manually on the source or the upstream transformation. To manually set the sort properties on the source or upstream transformation, use the following procedure.

To manually set sort attributes on a source or transformation component
  1. In SQL Server Data Tools (SSDT), open the Integration Services project that contains the package you want.

  2. In Solution Explorer, double-click the package to open it.

  3. On the Data Flow tab, locate the appropriate source or upstream transformation, or drag it from the Toolbox to the design surface.

  4. Right-click the component and click Show Advanced Editor.

  5. Click the Input and Output Properties tab.

  6. Click <component name> Output, and set the IsSorted property to True.

    Note

    If you manually set the IsSorted property of the output to True and the data is not sorted, there might be missing data or bad data comparisons in the downstream Merge or Merge Join transformation when you run the package.

  7. Expand Output Columns.

  8. Click the column that you want to indicate is sorted and set its SortKeyPosition property to a nonzero integer value by following these guidelines:

      • The integer value must represent a numeric sequence, starting with 1 and incremented by 1.

      • A positive integer value indicates an ascending sort order.

      • A negative integer value indicates a descending sort order. (If set to a negative number, the absolute value of the number determines the column's position in the sort sequence.)

      • The default value of 0 indicates that the column is not sorted. Leave the value of 0 for output columns that do not participate in the sort.

    As an example of how to set the SortKeyPosition property, consider the following Transact-SQL statement that loads data in a source:

    SELECT * FROM MyTable ORDER BY ColumnA, ColumnB DESC, ColumnC

    For this statement, you would set the SortKeyPosition property for each column as follows:

      • Set the SortKeyPosition property of ColumnA to 1. This indicates that ColumnA is the first column to be sorted and is sorted in ascending order.

      • Set the SortKeyPosition property of ColumnB to -2. This indicates that ColumnB is the second column to be sorted and is sorted in descending order.

      • Set the SortKeyPosition property of ColumnC to 3. This indicates that ColumnC is the third column to be sorted and is sorted in ascending order.

  9. Repeat Step 8 for each sorted column.

  10. Click OK.

  11. To save the updated package, click Save Selected Items on the File menu.
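Because setting these properties by hand does not sort or verify anything, it can help to check a sample of the data against the declared sort order before trusting it. The following is a minimal sketch in plain Python, not an SSIS feature; the function names and the sample rows are hypothetical, and it compares values with Python's default ordering rather than Windows collation, so it is only meaningful for numeric keys as written.

```python
def validate_positions(positions):
    """Check that nonzero SortKeyPosition magnitudes run 1, 2, 3, ... without gaps.

    Returns the sort columns in sequence as (column, descending) pairs.
    """
    keyed = sorted(
        (abs(pos), col, pos < 0) for col, pos in positions.items() if pos != 0
    )
    ranks = [rank for rank, _, _ in keyed]
    if ranks != list(range(1, len(ranks) + 1)):
        raise ValueError(f"positions must run 1..n without gaps, got {ranks}")
    return [(col, descending) for _, col, descending in keyed]


def is_sorted(rows, positions):
    """Return True if rows (a list of dicts) follow the declared sort order."""
    sort_columns = validate_positions(positions)

    def in_order(a, b):
        for col, descending in sort_columns:
            if a[col] != b[col]:
                return a[col] > b[col] if descending else a[col] < b[col]
        return True  # equal on every sort column

    return all(in_order(a, b) for a, b in zip(rows, rows[1:]))


# SortKeyPosition values from the ORDER BY ColumnA, ColumnB DESC, ColumnC example.
positions = {"ColumnA": 1, "ColumnB": -2, "ColumnC": 3}
rows = [
    {"ColumnA": 1, "ColumnB": 9, "ColumnC": 1},
    {"ColumnA": 1, "ColumnB": 5, "ColumnC": 2},
    {"ColumnA": 2, "ColumnB": 7, "ColumnC": 0},
]
print(is_sorted(rows, positions))
print(is_sorted(rows[::-1], positions))
```

The first call checks rows that follow the declared order; the second checks the same rows reversed, which violates the ascending order on ColumnA.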
