An in-depth explanation of XQuery query optimization in the Dameng (DM) database

Source: Internet
Author: User
Tags: xpath, xquery

XML (Extensible Markup Language) data is self-describing, which makes it highly structured and gives it a good storage format, and XML documents are as easy to transfer over a network as HTML. As a result, XML is widely used as a key standard for data definition, data exchange, and data sharing.

Current research on XML databases falls into two main directions. One is the native XML database system (also called a pure XML database), such as Tamino and Ipedo. The other is the XML-enabled database system, which extends an existing relational or object-oriented database system with XML support; its advantage is that it makes full use of existing, very mature relational database technology. An important part of extending a relational database system with XML support is implementing XQuery queries.

The DM XML Support project at Dameng builds on the Dameng relational database. It extends the database with XML support, and studies and implements XQuery optimization techniques on top of the relational engine in order to achieve efficient XQuery queries.

XQuery Related Concepts

XQuery is the query language for XML; the relationship between XQuery and XML documents is comparable to that between SQL and relational data. A concept closely related to XQuery is XPath, which is a W3C Recommendation. XPath is the path language for XML documents and is used to address data within an XML document. XQuery is an extension of XPath. At the time of writing, the latest versions of the two specifications are XPath 2.0 and XQuery 1.0; the former is a strict syntactic subset of the latter, and both have become W3C Recommendations.

An XPath expression navigates an XML document through a series of "path steps". Each step uses an "axis" to describe which document nodes (and the subtrees rooted at them) are included in that step's intermediate result forest. XPath defines 13 axes; the commonly used child and descendant-or-self axes are abbreviated as / and //.
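The difference between the two abbreviated axes can be seen in a small Python sketch. It uses the standard library's xml.etree.ElementTree, which supports only a limited subset of XPath, and the sample document is hypothetical and used purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
DOC = """
<library>
  <book><title>XQuery Basics</title>
    <chapter><title>Paths</title></chapter>
  </book>
  <book><title>Relational Storage</title></book>
</library>
"""

root = ET.fromstring(DOC)

# Child axis: each "/" step moves to direct children only,
# so chapter titles nested deeper are not reached.
book_titles = [t.text for t in root.findall("./book/title")]

# Descendant-or-self axis: "//" reaches matching nodes at any depth,
# in document order.
all_titles = [t.text for t in root.findall(".//title")]

print(book_titles)
print(all_titles)
```

The first path returns only the titles that are direct children of a book, while the second also returns the chapter title nested one level deeper.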

The FLWOR expression is one of XQuery's important extensions to XPath. It combines the for, let, where, order by, and return clauses. The for and let clauses bind a sequence to a variable in different ways; the sequence can be the node sequence returned by an XPath path expression. The where and order by clauses filter and sort these sequences. The return clause outputs a result sequence, which can be a path expression over the bound variables, and it also supports constructing XML fragments.
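To make the role of each clause concrete, the following Python sketch mirrors a simple FLWOR expression step by step. It does not run XQuery; the XQuery text appears only as a comment, and the sample data, the price threshold, and the flwor function name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, used only for illustration.
DOC = """
<library>
  <book><title>Query Algebra</title><price>45</price></book>
  <book><title>XPath Axes</title><price>20</price></book>
  <book><title>Cost Models</title><price>60</price></book>
</library>
"""

# The XQuery being mirrored (not executed here):
#   for $b in /library/book
#   let $p := xs:decimal($b/price)
#   where $p > 30
#   order by $p descending
#   return <expensive>{ $b/title/text() }</expensive>


def flwor(root):
    bound = []
    for b in root.findall("./book"):        # for: iterate over a node sequence
        p = float(b.findtext("price"))      # let: bind a derived value
        if p > 30:                          # where: filter the bound tuples
            bound.append((p, b))
    bound.sort(key=lambda t: t[0], reverse=True)  # order by $p descending
    results = []
    for _, b in bound:                      # return: construct an XML fragment
        elem = ET.Element("expensive")
        elem.text = b.findtext("title")
        results.append(ET.tostring(elem, encoding="unicode"))
    return results


result = flwor(ET.fromstring(DOC))
print(result)
```

Each Python statement corresponds to one clause, which is roughly how a FLWOR expression can be read: generate bindings, filter them, sort them, and build one result item per surviving binding.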

An overview of XML query optimization

Traditional database query optimization has mature theory and methods, and cost-based optimization is the most commonly used approach. Applying this idea to XQuery optimization over XML exposes several problems.

(1) There is no complete, standard query algebra. Relational databases, by contrast, have a very mature relational algebra that supports both query semantics and query optimization well, which is one reason they became the mainstream approach to data management.

(2) There is no accurate cost estimation. Because the structure and distribution of XML data differ from those of relational data, a dedicated cost model for XML queries still has to be developed.

(3) There is not enough statistical information. Sufficiently accurate statistics are the basis for effective query optimization, and their absence is an important cause of the gap between estimated and actual costs.

At present, XML query optimization mainly relies on the following techniques:

1. Expression rewriting

Typically, XML queries are expressed as trees, known as tree pattern queries (PTQ). Query minimization is the process of simplifying the query tree. In Dameng, PTQ optimization of XQuery is done in two steps: first, syntactic optimization that relies on no external information; then, semantic optimization based on external information about the XML document, such as a DTD or schema. Accordingly, optimization techniques can be classified as DTD-based or DTD-independent. The basic idea of expression rewriting is to transform an XQuery expression according to certain rules into a more efficient equivalent, which in effect simplifies the query pattern tree. Research suggests that DTD-independent optimization is very limited in what it can achieve and that introducing a DTD opens up more room for optimization, but current algorithms still have room to improve in both the thoroughness and the efficiency of the optimization.
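The two-step idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Dameng algorithm: the Step class, the minimize function, and the required mapping are hypothetical names, all steps are assumed to use the child axis, and only two rules are applied (drop duplicate predicate branches, and drop a bare child-existence test when the schema guarantees that child). The DTD fragment in the comment is likewise invented for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Step:
    """One node of a tree pattern: a label, predicate branches, and the next step."""
    label: str
    predicates: list[Step] = field(default_factory=list)
    child: Step | None = None


def to_xpath(step: Step) -> str:
    """Render the pattern as an abbreviated XPath string."""
    preds = "".join(f"[{to_xpath(p)}]" for p in step.predicates)
    tail = "/" + to_xpath(step.child) if step.child else ""
    return step.label + preds + tail


def minimize(step: Step, required: dict[str, set[str]] | None = None) -> Step:
    """Return a simplified copy of the pattern.

    required maps a parent label to the child labels that the schema
    guarantees to exist (hypothetical stand-in for DTD information).
    """
    required = required or {}
    kept, seen = [], set()
    for pred in step.predicates:
        pred = minimize(pred, required)
        key = to_xpath(pred)
        if key in seen:
            continue  # syntactic rule: an identical branch was already kept
        seen.add(key)
        bare_child_test = not pred.predicates and pred.child is None
        if bare_child_test and pred.label in required.get(step.label, set()):
            continue  # semantic rule: the schema already guarantees this child
        kept.append(pred)
    child = minimize(step.child, required) if step.child else None
    return Step(step.label, kept, child)


# Query pattern: book[title][title][price]/author
query = Step("book", [Step("title"), Step("title"), Step("price")], Step("author"))

# Step 1: syntactic optimization, no external information.
syntactic = to_xpath(minimize(query))

# Step 2: semantic optimization, assuming a DTD such as
#   <!ELEMENT book (title, author*, price?)>
# in which every book is guaranteed to have a title.
semantic = to_xpath(minimize(query, {"book": {"title"}}))

print(syntactic)
print(semantic)
```

The syntactic pass can only remove the repeated [title] branch, while the pass that knows the assumed DTD can also remove the remaining [title] test. This small gap is the kind of extra optimization space that schema information provides.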

2. Constructing a query algebra

Scholars in China and abroad have proposed many XML query algebras, such as TAX from the TIMBER project, the domestic OrientXA, and XAL, XOM, OPAL, and SAL. These algebras borrow, to varying degrees, ideas from relational algebra. At present, work on XML algebra focuses on expressing query semantics; algebra-based query optimization receives little or no attention, there is no uniform standard, and the various algebras have little in common.
