Q: I want to update a large table in batches of 5,000 rows at a time, but I don't know how to split the data. The table has no identity column or integer primary key. How can I keep performance good while performing the updates?
A: If you can tell which rows have not yet been updated, you can use a simple predicate to exclude the rows that are already done, and the SET ROWCOUNT setting can split the work into batches. The following code listing shows how to use this setting:
SET ROWCOUNT 1000
WHILE (1 = 1)
BEGIN
    BEGIN TRANSACTION
    -- Update 1000 not-yet-updated rows per iteration
    UPDATE ... SET ..., mylastupdate = 'date', ... WHERE
        mylastupdate < 'date'
    IF @@ROWCOUNT = 0
    BEGIN
        COMMIT TRANSACTION
        BREAK
    END
    COMMIT TRANSACTION
END
SET ROWCOUNT 0
SET ROWCOUNT causes SQL Server to stop processing a query once the specified number of rows has been affected. The technique is useful because it avoids the concurrency impact of one massive update: the fewer rows each batch touches, the less likely the update is to block other users from reaching the data. Combined with regular transaction log backups, committing in batches also keeps the transaction log from growing excessively.
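To make the pattern concrete, here is a minimal sketch of the same batching loop. The table `dbo.Orders` and the columns `Status` and `LastUpdate` are hypothetical, standing in for the ellipses in the listing above:

```sql
-- Hypothetical example: stamp every row in dbo.Orders in 1000-row batches.
DECLARE @batchDate datetime
SET @batchDate = GETDATE()

SET ROWCOUNT 1000             -- limit each UPDATE to 1000 rows
WHILE (1 = 1)
BEGIN
    BEGIN TRANSACTION

    UPDATE dbo.Orders
    SET    Status = 'Processed',
           LastUpdate = @batchDate
    WHERE  LastUpdate < @batchDate

    -- @@ROWCOUNT = 0 means no qualifying rows remain; we are done.
    IF @@ROWCOUNT = 0
    BEGIN
        COMMIT TRANSACTION
        BREAK
    END

    COMMIT TRANSACTION        -- commit this batch, releasing locks and log space
END
SET ROWCOUNT 0                -- restore the default (no row limit)
```

Note that on SQL Server 2005 and later, using SET ROWCOUNT to limit INSERT, UPDATE, and DELETE statements is deprecated; writing the limit directly into the statement, as in UPDATE TOP (1000) ..., expresses the same batch size without relying on the session setting.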
If there is no way to recognize which rows have already been updated, you can use a cursor to traverse all the data and commit after every X rows. However, cursors typically consume more server resources than set-based statements.
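The cursor approach can be sketched as follows. Again the table `dbo.Orders`, its key `OrderID`, and the batch size of 5,000 are hypothetical; a STATIC cursor is used so the key list is snapshotted up front and survives the intermediate commits:

```sql
-- Hypothetical sketch: walk dbo.Orders row by row, committing every 5000 rows.
DECLARE @key int, @counter int
SET @counter = 0

DECLARE order_cursor CURSOR STATIC FOR
    SELECT OrderID FROM dbo.Orders

OPEN order_cursor
BEGIN TRANSACTION
FETCH NEXT FROM order_cursor INTO @key
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE dbo.Orders SET Status = 'Processed' WHERE OrderID = @key

    SET @counter = @counter + 1
    IF @counter % 5000 = 0    -- commit after every X (here 5000) rows
    BEGIN
        COMMIT TRANSACTION
        BEGIN TRANSACTION
    END

    FETCH NEXT FROM order_cursor INTO @key
END
COMMIT TRANSACTION
CLOSE order_cursor
DEALLOCATE order_cursor
```

This keeps each transaction small, but every row costs a fetch and a singleton update, which is why the set-based batching shown earlier is preferred whenever a predicate can identify the remaining rows.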
-Microsoft SQL Server Development Team