During SQL Server replication, theprocesscouldnotexecute & #39; sp appears.

Source: Internet
Author: User
Transaction Replication involves three jobs (proxies). 1. snapshots are executed only once when Initialization is required. 2. logs are read and data changes after configuration replication are read, and generate the Copy command, which is always running. 3. The distribution agent is responsible for synchronizing the Copy command to the subscription server, which is also running. 4. In addition, there are clear

Transaction Replication involves three jobs (proxies). 1. snapshots are executed only once when Initialization is required. 2. logs are read and data changes after configuration replication are read, and generate the Copy command, which is always running. 3. The distribution agent is responsible for synchronizing the Copy command to the subscription server, which is also running. 4. In addition, there are clear

Transaction Replication involves three jobs (proxies)

1. snapshot, which is executed only once when Initialization is required

2. Read logs, read data changes after configuration replication, and generate a copy command, which is always running

3. the distribution agent is responsible for synchronizing the Copy command to the subscription server, which is still running.

4. In addition, there are also jobs that clear expired commands (REPL-clear distribution) and jobs that clear snapshots (REPL-clear snapshots ).

Bytes --------------------------------------------------------------------------------------------------------------------

(1) Too many transaction commands cause the log reader to read the transaction command timeout (or the server performance may degrade, leading to the log reader to read the transaction command timeout)

Bytes --------------------------------------------------------------------------------------------------------------------

A few days ago, there was a problem with the replication of a server. The job of the log reader failed and failed after retry. In the history of the job, the error message only contains the following error message:

The process cocould not execute 'SP _ replcmds 'on 'server name'

After the replication monitor is turned on, the error message is still: the process cocould not execute 'SP _ replcmds 'on 'server name', but the following error message is added:

Error message:

Timeout has expired (Source: MSSQLServer, error code: 1003)

Let's take a look at the use of the sp_replcmds command.

Sp_replcmds: runs in a given database.Sp_replcmdsThe first client is considered as a log reader. Returns the command marked as a copy transaction. This stored procedure is executed on the Publishing Database of the Publishing Server.

It seems that sp_replcmds is a copy command. Combined with the above timeout error that has expired, we can think that the log reader times out when reading the log, that is, the SQL statement of the query run by the Log Reader times out during running. I think that my server has a performance problem recently, so it takes a long time to query, and it is not impossible. Of course, there may also be many MSrepl_commands commands. You can run the following code in the distribution library to query the data volume:

Select count (1) from MSrepl_commands with (nolock) -- currently, my database has nearly million data records, a large number, and other servers are 20 million ~ 30 W data volume.

After determining the cause, you need to change the configured timeout time a little longer so that the log reader can take the command marked as a copy transaction before the query timeout.

1. Start the replication Monitor

2. Right-click the proxy configuration file

3. Create a new proxy configuration file and change the querytimeout value to 65533. This is the maximum position.

Restart the Log Reader job to run the job normally.

For more information, see:

Http://blogs.msdn.com/ B /repltalk/archive/2010/07/13/using-verbose-history-agent-profile-while-troubleshooting-replication.aspx

Http://www.tuicool.com/articles/IV3QRr

Bytes --------------------------------------------------------------------------------------------------------------------

(2) REPL-clearing the lock table caused by a long running time of the distribution job, resulting in timeout for the log reader to read the transaction command

Bytes --------------------------------------------------------------------------------------------------------------------

Log reading fails all the time and reports the process cocould not execute 'SP _ replcmds 'on 'server name' error.

In this case, try to restart the SQL proxy of the server, and then the log reader can work normally for a period of time (I am experiencing a few hours), but it will fail every day, the Log Reader job fails. Because the subscription server may fail at night, but data needs to be obtained in time, the subscriber wrote a job to restart the SQL agent of the Publishing Server at three o'clock every day, in this way, the transaction replication can work normally.

The Publishing server is SQL server 2000. The script for restarting the SQL server agent is as follows:

-- Enable xp_cmdshell

Sp_configure 'show advanced options', 1

Reconfigure

Go

Sp_configure 'xp _ Your shell', 1

Reconfigure

Go

-- Restart the SQL server agent

-- Stop

Use master

Go

Xp_cmdshell 'net stop sqlserveragent'

-- Start

Use master

Go

Xp_cmdshell 'net start sqlserveragent'

However, this method is only a temporary method. After du Niang and Google, find the following reasons:

The reason for the previous release server to restart the Agent to copy data to the subscription server is as follows:

1. in step 2, the log reading operation stores the insert, update, and delete commands on the publishing server in the MSrepl_commands table. (For the "Step 2" in this sentence, please refer to the beginning of this article)

2. The job for clearing expired commands (REPL-clear distribution) runs every 10 minutes. The expired commands in the MSrepl_commands table are automatically cleared each time.

3. The REPL-clear distribution job is to call the sp_MSdelete_publisherdb_trans stored procedure to clear expired commands. The sp_MSdelete_publisherdb_trans stored procedure uses a cursor to retrieve only 2000 pieces of data each time and then delete it.

It is found that the MSrepl_commands table already contains 11106751 pieces of data. Therefore, the amount of data to be deleted is too large, resulting in REPL-clear that the distribution job has been running and thus deleting the data in the MSrepl_commands table, the MSrepl_commands table is often locked.

4. In this step, you need to insert the new command into the MSrepl_commands table, and the new command cannot be inserted because the delete operation is locked for a long time. The MSrepl_commands table is not locked until the 2000 data records obtained from the cursor are deleted and before the cursor retrieves 2000 data records again, in this case, the new command can be inserted into the MSrepl_commands table. If the table MSrepl_commands is locked for a long time, the log read operation will time out and fail, and re-run the job until the job fails after 10 retries. At this time, no new data will be copied to the Business Database on the subscription server.

5. Restart the SQL server agent. Log reading will restart the job and repeat the preceding steps. New data will be copied to the Service database on the subscription server.

Solution:

In the Stored Procedure sp_MSdelete_publisherdb_trans, the deletion logic is to retrieve 2000 pieces of data each time and then delete the data until the deletion is completed.

Change the 2000 data records of this stored procedure to 100000 data records each time. After a day's deletion operation, the data volume changes to 11106751 again. In this case, it takes only a few seconds for the REPL-clear distribution job to complete. The MSrepl_commands table is no longer locked, and other jobs can be smoothly executed.

Note:

In the Stored Procedure sp_MSdelete_publisherdb_trans, sp_MSdelete_publisherdb_trans is called to delete the stored procedure. Therefore, the modified stored procedure is actually sp_MSdelete_publisherdb_trans. After most of the data in the table MSrepl_commands is deleted, I changed the parameter 100000 back to 2000.

Some blogs of the Microsoft Asia Pacific Database Technical Support Group also detail the replication issues. For details, refer:

Http://blog.itpub.net/25175503/viewspace-705405/-performance troubleshooting tools

Http://blog.itpub.net/25175503/viewspace-705413/-Log Reader thread latency

Http://blog.itpub.net/25175503/viewspace-705826/ ---- log reader writer thread Delay

Http://blog.itpub.net/25175503/viewspace-705827/-distribution agent reader thread Delay

Http://blog.itpub.net/25175503/viewspace-706127/-distribution proxy writer thread Delay

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.