In transactional replication, data synchronization delays are often encountered. Sometimes these delays are due to the execution of an update in publication, such as Update TA set col=? Where?, this update contains a huge amount of data. At the subscription end, this update is decomposed into multiple commands (one command per data row by default) and applied to the subscription. In the last resort, we need to skip this big business and let replication continue to run.
Now introduce some of the principles and specific methods of transactional replication
When the article of the publication database is updated, the corresponding log is generated, and log reader reads the logs and writes them to the MSrepl_transactions and MSREPL of the distribution database In _commands.
Each record in the MSrepl_transactions has a unique identity xact_seqno,xact_seqno the LSN in the corresponding log. Therefore, we can infer the order of their generation in publication database by xact_seqno, the generation time of the large number is late, and the generation time of small number is early.
Distributionagent contains two sub processes, reader and writer. Reader is responsible for reading data from the distribution database, writer writes the data read by reader to the subscription database.
Reader reads data from the distribution database (reading MSrepl_transactions tables and MSrepl_commands tables) through Sp_msget_repl_commands
The following is the Sp_msget_repl_commands parameter definition
CREATE PROCEDURE sys.sp_msget_repl_commands
(
@agent_id int,
@last_xact_seqno varbinary (),
@get_ Count tinyint = 0,--0 = no count, 1 = cmd and Tran (Legacy), 2 = cmd only
@compatibility_level int = 7000000,
@s ubdb_version int = 0,
@read_query_size int =-1
)
This stored procedure has 6 parameters, and in transactional replication, only the first 4 (and the values of the third and fourth parameters are fixed. 0 and 10000000 respectively). Here is an example:
Execsp_msget_repl_commands 46,0x0010630f000002a900ea00000000,0,10000000
@agent_id represents Distributionagentid, each subscription will have a separate distributionagent to process the data. Once you've brought in the @agent_id, you can find the subscription publication and all the article.
The @last_xact_seqno represents the LSN that was last passed to the subscription.
The general logic is that reader reads the Transaction_timestamp column of the MSreplication_subscriptions table in subscription database and gets the last LSN number of the update. The data in the distribution database is then read with an LSN greater than this number. Writer writes the read data to the subscription and updates the Transaction_timestamp column of the MSreplication_subscriptions table. Reader will then continue to use the new LSN to read the subsequent data and pass it on to the writer, which goes back and forth.
If we manually update the Transaction_timestamp column and set this value to the LSN of the large transaction that is currently executing, the Distribution Agent will skip the large transaction instead of reading it.
Here's an example to illustrate
Environment is as follows
Publisher:sql108w2k8r21
Distributor:sql108w2k8r22
Subscriber:sql108w2k8r23
The highlighted publication in the figure contains 3 ARITCLES,TA,TB,TC
Where TA contains 182.182 billion data, and then we do a bit of the operation
The UPDATE statement was made at 11:00,
Update TA set c=-11
Follow up on the table TA,TB,TC to perform some insert operations
Insert TB VALUES (0,0)
Insert TC VALUES (0,0)
After we start Replication Monitor, we find that there is a great delay, the Distribution Agent has been passing the data of a) operation
Execute the following statement in the subscription database to get the transaction number of the current latest record
declare @publisher sysname
declare @publicationDB sysname
declare @publication sysname
set @publisher = ' Sql108w2k8r22 '
Set @publicationDB = ' Pubdb '
set @publication = ' Pubdbtest2 '
select Transaction_timestamp From MSreplication_subscriptions
where
publisher= @publisher and
publisher_db= @publicationDB
and publication= @publication
In my environment, the transaction number is 0x0000014900004e9a0004000000000000
Return to distribution database, execute the following statement, and get the transaction number immediately following the large transaction. Replace the parameter with the data in your actual environment. (Note that if you encounter performance problems with the following statements, replace the arguments directly with values)
declare @publisher sysname declare @publicationDB sysname declare @publication sysname declare @transaction_timesta MP [varbinary] set @publisher = ' Sql108w2k8r21 ' Set @publicationDB = ' PUBLICATIONDB2 ' Set @publication = ' Pubtest ' ET @transaction_timestamp = 0x0000014900004e9a0004000000000000 select top 1 xact_seqno to MSrepl_commands with (NOLOCK) where Xact_seqno> @transaction_timestamp and article_id in (select article_id from MSarticles a inner join mspub Lications p on a.publication_id=p.publication_id and a.publisher_id=p.publisher_id and a.publisher_db=p.publisher_db I Nner join Sys.servers s on s.server_id=p.publisher_id where p.publication= @publication and p.publisher_db= @publication DB and S.name= @publisher) and publisher_database_id = (SELECT id from mspublisher_databases PD inner JOIN mspub Lications p on pd.publisher_id=p.publisher_id inner join sys.servers s on pd.publisher_id=s.server_id and Pd.publish
er_db=p.publisher_db where S.name= @publisher and p.publication= @publication and pd.publisher_db= @publicationDB) Order by xact_seqno
In my environment, the transaction number is 0x0000018c000001000171
Execute the following statement in the subscription database, skipping large transactions. Replace the parameter with the data in your actual environment
declare @publisher sysname
declare @publicationDB sysname
declare @publication sysname
@ Transaction_timestamp [varbinary]
set @publisher = ' Sql108w2k8r22 '
set @publicationDB = ' pubdb '
set @ publication= ' Pubdbtest2 '
set @transaction_timestamp = 0x0000018c000001000171
update msreplication_ Subscriptions set transaction_timestamp= @transaction_timestamp
where publisher= @publisher and publisher_db=@ PublicationDB and publication= @publication
When the execution is complete, open the Distribution Agent job.
You will then find that the transaction has been successfully skipped, TA will not be updated at the Subscriber, and subsequent updates will be gradually passed to the subscription and the delay disappears.