SQL Server XML performance optimization (Best Practices)

Source: Internet
Author: User
Tags bulk insert microsoft sql server 2005 xquery

1. XML Data Model
The performance of XML storage and query processing depends on the database design and structure and granularity of XML data. Whether to use the XML data model.

Whether there is semi-structured data, the document structure and hierarchical markup language data must be retained, or a variable structure.

1.1 mark Markups
The same data can be marked in different ways, depending on the content (element value) and metadata (attribute value ). More specific

The easier it is to read element names, and it is helpful for generating an effective query plan. Lengthy tags increase storage costs. For example:
A.
<Item type = "book"> <title> Writing Secure Code </title> </item>
<Item type = "DVD"> <title> The Godfather </title> </item>
To query a book, enter/item [@ type = "book"].
B.
<Book> <title> Writing Secure Code </title> </book>
<DVD> <title> The Godfather </title> </DVD>

If you want to query the book, you only need to write:/book, which makes it more efficient and easier, and reduces the primary XML index (

(The item layer is missing ). This is the relative General Mark (Generic

Markups.

For typed XML (typed XML), you can also reduce it to two layers:
<Book title = "Writing Secure Code"/>
<DVD title = "The Godfather"/>

Such as <DVD> <title> The Godfather </title> </DVD>, which is called element-centric markup.

For/DVD [title = "The Godfather"]
In The format of <DVD title = "The Godfather"/>, attribute-centric markup. The Query format is/DVD [@ title =

"The Godfather"] to reduce a JOIN.

1.2 Typed and Untyped XML (Typed and Untyped XML)
Untyped XML is not expressed in XML schema. It is saved as a Unicode string in SQL Server.

Storage. For their operations, data needs to be converted to the corresponding type. For example (/book/price) [1]> 19.99, <price> is converted

Decimal process. A large number of similar comparisons consume a lot of resources, which leads to the importance of type information in XML schema.
Type information works in the following aspects:
A. Insert the updated XML data to verify whether it complies with the schema and then store it in binary format for faster conversion.
B. the typed values are stored in the XML index.
C. typed data can also reduce the number of range scans. For example, the sequence number [1] In (/book/price) [1] specifies the <price>

It is unnecessary when it is a single value.

1.3 attributes
You can use udfs to obtain computed columns ). You can add an index to a computed column. Because the calculated column is expected to be good, the query speed is faster.
Example:
Books all have ISBN numbers. The process of using ISBN as a separate calculation column is as follows:
A. Define the function for obtaining the ISBN number
Create function udf_get_book_ISBN (@ xData xml) RETURNS varchar (20)
WITH SCHEMABINDING
BEGIN
RETURN @ xData. value ('(/book/@ ISBN) [1]', 'varchar (20 )')
END
B. Add a computed Column
Create table docs (id int primary key, xCol XML)
Alter table docs add isbn as dbo. udf_get_book_ISBN (xCol)
C. Add a non-clustering index
Create index COMPUTED_IDX ON docs (ISBN)
D. Write Query
If the ISBN column is not used, the query is as follows:
SELECT xCol
FROM docs
WHERE xCol. exist ('/book/@ ISBN [. = "0-2016-3361-2"]') = 1
If the ISBN column is used
SELECT xCol
FROM docs
Where isbn = '0-2016-3361-2'

You can also create a Property Table)

2. load XML data in batches
-- The bcp in, bulk insert, and OPENROWSET methods can be used, because they have nothing to do with this case and are skipped for the moment.

3. XML Index
3.1 normal index
We recommend that you create a primary XML index (primary XML index), which is actually a B + tree. You can also create secondary XML indexes (secondary XML indexes) for PATH, PROPERTY, and VALUE ).
A. PATH INDEX
Applicable to path expressions similar to/book [@ ISBN = "0-2016-3361-2"]. The longer the path, the more effective it is.
B. PROPERTY Index
Applicable to multi-properties queries in XML
C. VALUE Index
Applicable to queries with subaxes (// operator) and wildcards (similar to/book [@ * = "novel "])
The cost of maintaining indexes also needs to be considered as a whole.

-- Some XML updates have nothing to do with this case and are skipped for the moment.
-- The XML full-text index has nothing to do with this case and is skipped for the moment.

4. Query Optimization
4.1 Use the exist () method to check whether the exist exists.
Use exist () instead of value () whenever possible ()
For example:
SELECT *
FROM docs
WHERE xCol. exist ('(/book/title/text () [. = "Writing Secure Code"]') = 1
Ratio
SELECT *
FROM docs
WHERE xCol. value ('(/book/title) [1]', 'varchar (50) ') = 'writing Secure Code'

4.2 optimize XML blobs (Binary XML)
More tempDB files provide better scalability.
Reduce extra XML data type conversion

4.3 specify Singleton Elements (single-piece element)
For typed XML, the default value is singleton.
For non-typed XML, add [1], xCol. query ('/book/title') is automatically converted to xCol. query ('(/book/title) [1]').

4.4 text aggregation for non-typed XML
(/Book/title [. = "Writing Secure Code"]) [1] All text nodes under <title> need to be aggregated
If <title> element has only one text node, you can use (/book/title/text () [1] [. = "Writing Secure Code"] to make it more effective.

4.5 parameterization of XQuery and xml dml expressions
XQuery and xml dml are not automatically parameterized. It is best to use SQL: column () or SQL: variable ().

4.6 ordinal number and estimated Optimization
Reduce the statement branches, such as/book [@ ISBN = "1-8610-0157-6"]/author/first-name, it is best to use a format like/book [author/first-name = "Davis.
Move the ordinal number to the end of the path./book [1]/title [1] is equivalent to (/book/title) [1]
Use Context Node)
Example:
SELECT *
FROM docs
WHERE xCol. exist ('/book [@ subject = "security"]') = 1
The/book and/book/@ subject are evaluated respectively, and the latter is checked to see if the value "security" is included"
If the following expression is used:
SELECT *
FROM docs
WHERE xCol. exist ('/book/@ subject [. = "security"]') = 1
Only/book/@ subject will be evaluated

Dynamic query has nothing to do with this case and is skipped for the moment.

The specific case will be detailed after I finish this case.

References:
Performance Optimizations for the XML Data Type in SQL Server 2005
Http://technet.microsoft.com/en-us/library/ms345118.aspx
XML Best Practices for Microsoft SQL Server 2005
Http://technet.microsoft.com/en-us/library/ms345115.aspx

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.