Cause analysis of "lost" events when using Ring_buffer target for SQL Server extended events and Ring_buffer target potential issues

Source: Internet
Author: User
Tags memory usage xquery

Cause of the matter:
Troubleshoot deadlock problems on SQL Server, the first thing you'll think about is the extended events,
The first kind of solution, open the profile to wait for it, appear too low, as for the profile of the variant trace bar, old, has not been how to use.
The second option is to turn on TRACEON (DBCC TRACEON (3605,1204,1222,-1)) to write the deadlock to the error log, which is also a good choice.
However, it is thought that the system default extended event Sysem_health has captured the deadlock information (Sqlserver.xml_deadlock_report),
There is no need to go back to error log once, theoretically from system_health to get the deadlock information, so try to get deadlock information from System_health's ring_buffer target.

Troubleshooting Passes:
From the target in the ring_buffer of Sysem_health, the expected deadlock event information was not captured.
Why is Sysem_health "not able to capture" when it is fully confirmed that a deadlock has occurred (and is recently, or just)?
Helpless to open a separate event to capture the deadlock event session (capture Sqlserver.xml_deadlock_report only), target is stored in the Ring_buffer
After artificially creating a deadlock, examine the deadlock information captured by Sysem_health and the custom capture extension event
At this point the unexpected happens, and the custom extension event captures the deadlock information completely, while System_health still does not capture the corresponding deadlock information.
What's the situation?

The following custom extended event script

CREATEEVENT SESSION[Deadlock_monitor]  onSERVERADDEVENT Sqlserver.xml_deadlock_reportADDTARGET Package0.ring_buffer (SETMax_events_limit=( the), Max_memory=(4096)) with(Max_memory=4096Kb,event_retention_mode=Allow_single_event_loss,max_dispatch_latency=Tenseconds,max_event_size=0Kb,memory_partition_mode=None,track_causality=OFF, Startup_state= on) ALTEREVENT SESSION[Deadlock_monitor]  onSERVER State=start;

Deadlock captures the result, Sysem_health does not capture the expected deadlock event, although he includes the Sqlserver.xml_deadlock_report event

Since Sysem_health has two output target, a ring_buffer, one is target_file, helpless under the Event_file query captured deadlock information, here is no problem, normal capture to.


Therefore, it can be said that the system defaults to the Sysem_health extension event, the capture of the deadlock itself is not a problem, the problem is on the output target of the extended event Ring_buffer.
In the case of not filtering all extended events, the data parsed from the Ring_buffer has a feature that does not contain any event information for the most recent period of time.
That is, the event information that is solved in Ring_buffer is the event information for a period of time before the current time, does not contain all the event information, and all the event information of the most recent event.
Of course, you can say that Ring_buffer is a FIFO queue model, which should also leave new events, rather than parsing the latest event information.

Cause Analysis:
Referring to an article on Sqlskill, this article analyzes this problem in depth,
Simply put:
Ring_buffer does not have "lost" event information, as to why the resolution does not come out, to start from the Ring_buffer parsing mode, Ring_buffer extended events from sys.dm_xe_session_targets this DMV parsing,
Subject to sys.dm_xe_session_targets the target data column of this DMV target_data field can only hold about 4 MB of XML data limit.
When the event captured by Ring_buffer (binary data in memory) is converted to an XML format greater than (approximately) 4MB, other events over 4MB are "truncated",
The XML file parsed from Sys.dm_xe_session_targets takes precedence over the output of earlier events, so we anticipate that the most recent events are not visible.
Therefore, as in the previous scenario, there is a "missing" part of the event information and no recent event information.

In this case, using Ring_buffer as the target for the extended event has the potential for the following issues
1, the parsed result is unreliable (complete) and may not resolve to the most recent part of the event.
2, the following translation will also mention that Ring_buffer as target may burst memory, so use caution.
3, as mentioned below, SSMs UI is not good for events in Ring_buffer, and for Ring_buffer Target,ui It is just a show XML file that must be parsed by itself, not like Event_ Tabular presentation (readability) as in file
Try to avoid using Ring_buffer target in the extended time.

The following is the translation, the original address:

(title) Why I hate Ring_buffer target in an extended event

I have encountered many of the same problems with Ring_buffer target in the extended event.
I think I'll write a blog post explaining all the information I've taught about Ring_buffer target and the issues associated with it.
Since the release of SQL Server 2012 and the update of the new UI for extended events, I will never use Ring_buffer target again.
In fact, as the article title says, I really hate ring_buffer target, and in this article I'll explain why I hate ring_buffer target, and I want to persuade you to use File_target instead.

Missing events

This is the most common question I have explained about ring_buffer goals by email. In general, the following are typical problem descriptions:

I got the following code from an article in the SQL Server center, and it doesn't work. The problem I encountered was that when I ran the code, even though I knew that a deadlock event had just occurred in the application, it would not show any deadlock information.
It seems that I only see the older deadlocks in the System_health session, but never see the latest deadlocks. I opened trace 1222 and obtained the information in this way, so why not.

The reality is that the event is actually there and you can't see (the event you expect) because the sys.dm_xe_session_targets is the DMV limit.
The Target data column for this DMV can only output approximately 4 MB of XML data.
Bob Ward20009 explained the DMV's 4 MB format XML limit information in the CSS SQL Server engineer Blog.
To demonstrate the results of this limitation, let's look at the number of events included in the system health event session on the SQL Server SP1+CU7 server, and I can use the following query to view the information.

SELECTRing_buffer_event_count, Event_node_count, Ring_buffer_event_count-Event_node_count asEvents_not_in_xml from(    SELECTTarget_data.value ('(ringbuffertarget/@eventCount) [1]','int') asRing_buffer_event_count, Target_data.value ('count (ringbuffertarget/event)','int') asEvent_node_count from    (    SELECT CAST(Target_data asXML) asTarget_data fromSys.dm_xe_sessions assINNER JOINSys.dm_xe_session_targets asSt'System_health'             andSt.target_name=N'Ring_buffer') asN) asT

Ring_buffer_event_count is the EventCount attribute of the XML document returned by the Ringbuffertarget root element (Translator Note: Ring_buffer_event_ Count is the total number of events captured by Ringbuffertarget)
Event_node_count is the number of events in the Ingbuffertarget/event nodes returned by sys.dm_xe_session_targets in this DMV (the difference between the two is the number of so-called missing events)
Here you can see a total of 5,000 events in Ring_buffer target, (because) the System_health session is set at 5000 based on the 2012 new max_events_limit option.
However, only 3,574 events were output by the XML in the DMV, with 1426 remaining events still unavailable (invisible, unresolved), even though they were stuck in memory.
The sys.dm_xe_session_targets XML file takes precedence over the output of earlier events, so we anticipate that the most recent events are not visible.

The translator note, the following is the value of System_health in the Ring_buffer max_events_limit option set at 5000:

We can observe the memory consumed by the extended event target and the XML binary data in the DMV, using the following query

SELECTTarget_data.value ('(ringbuffertarget/@memoryUsed) [1]','int') asBuffer_memory_used_bytes,ROUND(Target_data.value ('(ringbuffertarget/@memoryUsed) [1]','int')/1024x768.,1) asbuffer_memory_used_kb,ROUND(Target_data.value ('(ringbuffertarget/@memoryUsed) [1]','int')/1024x768/1024x768.,1) asBUFFER_MEMORY_USED_MB,datalength(Target_data) asXml_length_bytes,ROUND(datalength(Target_data)/1024x768.,1) asxml_length_kb,ROUND(datalength(Target_data)/1024x768./1024x768,1) asXML_LENGTH_MB from (SELECT CAST(Target_data asXML) asTarget_data fromSys.dm_xe_sessions assINNER JOINSys.dm_xe_session_targets asSt'System_health'  andSt.target_name=N'Ring_buffer') astab (TARGET_DATA)

Here we can see that the binary data consumes approximately 1.7MB of memory, but once serialized into XML, the size of the file becomes about 4.7MB, which is larger than the binary data space
The essence of the problem is that the feature of extended event generation determines his compact binary format, but serialized formatted XML adds storage space for these events.

System_health event sessions are especially prone to collecting up to 5,000 events but RING_BUFFER,DMV can actually only output a small subset of event sessions.
Most likely, sp_server_diagnostics_component_output and Xml_deadlock_report have quite a large event (occupied space), Because the size of the XML that these two events return depends on the specifics of when they trigger the condition.

I found that when a problem occurred on a busy server, the size of the Sp_server_diagnostics_component_output event exceeded 512KB.
Therefore, when one of the events is included in the output XML, the data returned by the DMV to the Ring_buffer target may be subject to significant limitations.

No support for UI

This may be the biggest reason I don't use ring_buffer as an output target in my daily work.
The Extended events UI does not support the decomposition of the information contained in the Ring_buffer target, and the only function of the UI is to display the XML output by the Target_data column of the sys.dm_xe_session_targets DMV.

This means that to use the data, you must open the XML and scan the events, or write XQuery to parse the XML into tabular form, which requires you to know the events, columns, and operation definitions used in the event session to actually access the data.
If I am working on a short-term data collection and do not want it to be saved to a file in the file system on SQL Server 2012, I simply select Live view to stream the data back to the list in the UI.
In this case, I don't have to deal with XML and can quickly find the information I'm interested in.
For any long-term task, and even to view the information in the System_health event session, I use File_target,ui to read and handle events without manually performing any XQuery.

Observe the use of memory

It is unheard of for me to be very careful about how to configure Ring_buffer targets on a production server.
Two weeks ago, Andy Galbraith encountered an error with all the connections reported to the 701 system with low memory,
After analysis, Andy found that in memory 16GB, max memory (max server memories) configured to 11000MB server, Memoryclerk_xe memory clerk occupy 10GB of RAM,
The problem is that an extended event is configured to collect maximum (max_events_limit) 1,000,000 events, but no maximum memory limit is configured.
So the use of memory is based on the number of events collected by the extended event, and there is no maximum memory usage limit, so it can use unlimited memory, in case of limited memory, causing problems on the server.

Cause analysis of "lost" events when using Ring_buffer target for SQL Server extended events and Ring_buffer target potential issues

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.