SQLServer Bug Analysis · Agent Link Leak analysis (reprint)

Source: Internet
Author: User
Tags server memory

Background

SQL Server Agent provides the user with regular administrative tasks as Windows services, which are called jobs, and how the scenario for applying mirroring solves the job synchronization problem, and AWS RDS does this by ignoring the user's maintenance job, If a switch user is required to create a job on the new principal side, the other is to keep the synchronization job on the mirror side, and to try to make the user unaware of any unnecessary maintenance actions after switching, but this practice can encounter very serious problems in some cases-memory exhaustion.

Troubleshooting analysis

First-time analysis

When the problem occurs, the errorlog of the instance occurs:

Error: 701, Severity: 17, State: 123.

and recorded Memorystatus, according to Memorystatus's information itself can be inferred, but since there is a field we may wish to use a DAC (dedicated Administrator Connection) to see:

SELECTTOP 10 [type],sum ( single_pages_kb) as [SPA MEM/KB],SUM(multi_pages_kb) AS [MPA MEM/KB]FROM sys.dm_os_memory_clerksGROUP BY typeORDER BY SUM ( single_pages_kb+multi_pages_kb) DESC ;

Memory consumption

The most occupied objectstore_sni_packet, Memoryclerk_sqlconnectionpool generally with the number of connections, network PACKET size;

selectc.session_id,c.net_packet_size,s.host_name as client_host_name,s.program_name,s.client_interface_namefrom sys.dm_exec_connections cjoin sys.dm_exec_sessions son c.session_id = s.session_idjoin sys.endpoints eon c.endpoint_id = e.endpoint_idorder by c.net_packet_size desc

Net_packet_size

select value_in_use from sys.configurations where configuration_id=1544select program_name,count(*) as conn_num from sys.sysprocesses where spid>50 group by program_name order by conn_num desc

Memory specifications

1w+ link hold, according sys.dm_exec_sessions.program_name to can be confirmed from SQL Server Agent, each session packet size is 4K, the instance of max server memory is 1G, so memory exhaustion;

So what's the 1W link doing? According to the sys.dm_exec_sessions.program_name exposure of the job_id we find the corresponding job, first look at what the job to do;

select * from msdb.dbo.sysjobs where job_id=0x825F84340AFD5B4BA1D5AD82A8E76C1A

Job Information

First inference

This part of the business logic I no longer post, the point is that the database used by Job_step is in the Mirror (RESTORING), because the mirrored db is not available, schedule is set too frequently, so SQL Server The agent failed to release this part of the link in time to cause memory exhaustion.

First time verification

Re-built a test environment to create a similar scenario but the problem did not replicate. Interested in the small partner can be tested under: Mirror + mirror-side job + job link Image Library and frequency 10s + job content unlimited (why not open to follow the profiler can be seen).

With profiler and session-related views, we can see that when Job-step's link db is set to mirrored db, it appears:

Error: 18456, Severity: 14, State: 38.

Indicates that the account verification was successful, but the database is not accessible or has insufficient logon rights, the SQL Server Agent retries multiple times, but eventually releases the link, which is inconsistent with the previous inference.

Profiler information

select program_name,count(*) as conn_num from sys.sysprocesses where spid>50 group by program_name order by conn_num desc

Connection

Second analysis

Back to the original site, we do the analysis; When we try to reproduce, we find that the complete link is not established, but how this 1W link is done:

select spid,program_name,loginame,login_time as time,db_name(dbid) as dbname from sys.sysprocesses where spid>50

dbname

According to the SPID, let's look at the last SQL executed by the non-freed link:

InputBuffer

(@P1 nvarchar(max),@P2 uniqueidentifier,@P3 int,@P4 int)DECLARE @logTextWithPreAndPost nvarchar(max) set @logTextWithPreAndPost = N‘‘ + @P1 + N‘‘; EXECUTE msdb.dbo.sp_write_sysjobstep_log @job_id = @P2, @step_id = @P3, @[email protected], @[email protected]

Look at msdb.dbo.sp_write_sysjobstep_log this stored procedure:

exec sp_helptext ' Sp_write_sysjobstep_log ' Text-------------------------------------------------------------------------------CREATE PROCEDURE sp_write_  Sysjobstep_log @job_id uniqueidentifier, @step_id int, @log_text NVARCHAR (MAX), @append_to_last int = 0ASBEGIN  DECLARE @step_uid uniqueidentifier DECLARE @log_already_exists int Set @log_already_exists = 0 Set @step_uid = (SELECT Step_uid from Msdb.dbo.sysjobsteps WHERE (job_id = @job_id) and (step_id = @step_id)) IF (EXISTS (SELECT * F ROM msdb.dbo.sysjobstepslogs WHERE step_uid = @step_uid)) BEGIN SET @log_already_exists = 1 EN  D--need Create log If "Overwrite is selected or log does not exists. IF (@append_to_last = 0) OR (@log_already_exists = 0) BEGIN--Flag is overwrite--if overwrite and log exists, D        Elete it IF (@append_to_last = 0 and @log_already_exists = 1) BEGIN-Remove previous logs entries EXEC Sp_delete_jobsteplog @job_id, NULL, @step_id, NULL END INSERT into msdb.dbo.sysjobstepslogs (log, log_size, Step_uid) VALUES (@log_text, datalength (@log_text), @step_uid) END ELSE BEGIN Declar     E @log_id INT--selecting TOP is just a safety net-there are only one log entry row per step. SET @log_id = (SELECT TOP 1 log_id from Msdb.dbo.sysjobstepslogs WHERE (step_uid = @step_uid) ORDER by log_id DESC)--Append @log_text to the existing log record.  Note If this-action would make the value of the the log column longer than-nvarchar (max) and then the engine      would raise error 599. UPDATE msdb.dbo.sysjobstepslogs SET Log. WRITE (@log_text, null,0), log_size = datalength (log) + datalength (@log_text), date_modified = getd Ate () WHERE log_id = @log_id END RETURN (@ @error)--0 means Successend

MSDN did not find an exhaustive Document, but reading the definition can also confirm that it is a stored procedure used by Job-step for advanced Configuration, which is to write job-step logs into table msdb.dbo.sysjobstepslogs. Depending on the parameters may be overwrite or append;

Log_to_table

Second deduction

Based on the previous information we can extrapolate that this 1w+ idle link is due to the internal update of the msdb log table after the completion of Job-step, the link is not released after the update is complete.

Second-time verification

The construction of the scene is basically the same as the first time, only one additional point is that Job-step has log to table turned on.

Validation successful

The results are stable and reproducible, a new link is added around 40s (Schedule 10s does not turn on step Retry attempts and Retry interval), according to Msdb.dbo.sysjobstepslogs.log_ Size and Msdb.dbo.sysjobstepslogs.log can confirm the log update frequency at 40s.

Open Profiler, constantly adjust schedule time, monitor msdb.dbo.sysjobstepslogs.log_size size, you can find that when the dispatch frequency is greater than 40s, the new link in accordance with the frequency increase, when the frequency is less than 40s, Added links by 40s one increase.

Conclusion

If the link defined in Job-step is a mirrored library (RESTORING) and a log to Table is configured, Then each log session (Msdb.dbo.sysjobstepslogs) does not automatically close, that is, the agent in this scenario there is a link leak.

SQLServer Bug Analysis · Agent Link Leak analysis (reprint)

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.