Understanding synchronous and asynchronous components, non-blocking, semi-blocking, and fully blocking transformations, and buffer concepts in the Data Flow Task
Components in the SSIS data flow can be divided into synchronous and asynchronous components.
Synchronous components
A synchronous component has one very important characteristic: its output shares the same buffer as its input, so the number of rows that leave the component equals the number of rows that entered it. During a synchronous transformation, one row goes in and one row comes out; input and output stay in step and happen at the same time.
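The buffer-sharing idea can be sketched in plain Python. This is only an analogy, not SSIS or its API: the "buffer" is assumed to be an ordinary list of dictionaries, and the function name uppercase_names is made up for this illustration.

```python
def uppercase_names(buffer):
    """Synchronous-style transform: change each row inside the buffer it arrived in."""
    for row in buffer:
        row["name"] = row["name"].upper()
    # Hand back the very same list object; no new buffer is allocated.
    return buffer


input_buffer = [{"name": "ann"}, {"name": "bob"}]
output_buffer = uppercase_names(input_buffer)

# True when input and output are one and the same buffer.
same_buffer = output_buffer is input_buffer
```

Because the rows are modified in place, the row count cannot change and nothing is copied, which is the property the paragraph above describes.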
Asynchronous components
The characteristic of an asynchronous component is that its output uses a new buffer rather than reusing the input buffer, and the output may contain more rows or fewer rows than the input. An asynchronous transformation needs a new buffer to do its work. The Sort component, for example, must process the entire rowset in a single operation. Merging components such as Merge and Merge Join must first examine multiple rows from each input and then merge the input rows in sorted order. The Aggregate component is another example: it needs a new row to hold each calculated aggregate value.
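By way of contrast, an Aggregate-like step can be sketched in plain Python as follows. Again this is an analogy rather than SSIS code; the function name total_by_region and the column names region and amount are assumptions made for the example.

```python
from collections import defaultdict


def total_by_region(input_buffer):
    """Asynchronous-style transform: the result goes into a newly built buffer."""
    totals = defaultdict(int)
    for row in input_buffer:
        totals[row["region"]] += row["amount"]
    # A new list with a different shape and row count than the input.
    return [{"region": region, "total": total} for region, total in sorted(totals.items())]


input_buffer = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 7},
]
output_buffer = total_by_region(input_buffer)
```

Here three input rows become two output rows, and the output list is a separate object from the input list, so the input buffer cannot simply be reused.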
By comparison, a synchronous component is generally faster than an asynchronous one, because it can reuse the input buffer, whereas an asynchronous component needs a new buffer to produce its output.
Note that data source components are asynchronous, because they need to create two buffers, one for the successful output and one for the error output. All destination components are synchronous.
Besides these two categories, components can also be divided into three types: non-blocking, semi-blocking, and fully blocking.
Non-blocking transformations
Non-blocking transformations are also called row transformations, and they are synchronous transformation components. The component receives a row, processes it, and passes it on. No rows are created or deleted as data passes through the component. For example, when 1000 rows are loaded from an upstream source and pass through a non-blocking transformation, each row is processed as it arrives and handed directly to the downstream component; the transformation does not wait until all 1000 rows have been processed before passing them on.
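The row-at-a-time behaviour can be imitated with Python generators. This is a conceptual sketch only, not how SSIS is implemented; source, double, and the log list are hypothetical names introduced to make the order of events visible.

```python
def source(count, log):
    """Stand-in for an upstream source that yields one row at a time."""
    for i in range(count):
        log.append(f"read {i}")
        yield i


def double(rows, log):
    """Non-blocking-style transform: emit each row as soon as it is processed."""
    for row in rows:
        result = row * 2
        log.append(f"emit {result}")
        yield result


log = []
result = list(double(source(3, log), log))
# log is now:
# ['read 0', 'emit 0', 'read 1', 'emit 2', 'read 2', 'emit 4']
```

The reads and emits alternate in the log, which shows that each row moves downstream before the next one is read.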
Semi-blocking transformations
Semi-blocking means that the transformation holds on to its input rows for a period of time. For example, a component that receives 1000 rows from upstream might summarize them every 10 or 100 rows and then output either those 10 or 100 rows or a single summarized row, so output is not produced immediately when a row is received. After it has output one batch, it continues to accept further rows, process them, and output them. Semi-blocking transformations are also asynchronous transformation components.
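A batch-wise summary of this kind might look like the following in plain Python. It is a sketch under assumptions: the function name sum_in_batches and the batch_size parameter are invented for the example and do not correspond to any SSIS setting.

```python
def sum_in_batches(rows, batch_size):
    """Semi-blocking-style transform: hold rows until a batch is full, then emit one summary."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield sum(batch)
            batch = []
    # Emit whatever is left once the input is exhausted.
    if batch:
        yield sum(batch)


# Ten input rows (1..10) summarized four at a time.
batch_totals = list(sum_in_batches(range(1, 11), 4))
# batch_totals is [10, 26, 19]
```

Output appears after every fourth row rather than after every row or only at the very end, which is what places this pattern between the non-blocking and fully blocking cases.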
Fully blocking transformations
Like semi-blocking transformations, fully blocking transformations are asynchronous transformation components. A fully blocking transformation, however, holds all of the data: given 1000 upstream rows, it must receive all 1000 before it can process them and produce any output. The Sort and Aggregate components are examples; they need all of the data before they can sort or aggregate it.
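A Sort-like step can be sketched in the same generator style. As before, this is an analogy in plain Python, not SSIS code; blocking_sort and the log list are hypothetical names used to record the order of events.

```python
def blocking_sort(rows, log):
    """Fully-blocking-style transform: read every row before emitting the first one."""
    held = []
    for row in rows:
        log.append(f"read {row}")
        held.append(row)
    # Only now, with the whole input in hand, can sorted output begin.
    for row in sorted(held):
        log.append(f"emit {row}")
        yield row


log = []
result = list(blocking_sort(iter([3, 1, 2]), log))
# log is now:
# ['read 3', 'read 1', 'read 2', 'emit 1', 'emit 2', 'emit 3']
```

All reads appear in the log before any emit, because the smallest row could be the last one to arrive.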
Categorization of SSIS data flow components