Storing XML in a relational database

Source: Internet
Author: User
Tags date define definition ibm db2 one table sybase table name unique id
xml| Data | Database Introduction

There are various ways to solve the problem of effectively and automatically transforming XML into and out of relational databases. Database vendors such as IBM, Microsoft, Oracle, and Sybase have developed assistive tools to transform XML into database tables. The various solutions are as follows.


The Oracle XML SQL Utility models an XML document element into a set of nested tables. A nested element is modeled by using an Oracle object data type. The "Sql-to-xml" transformation constructs an XML document by using the table referenced by the object data type and one to one association between the nested elements. "Xml-to-sql" may require improvements in the data model (from relational to object-relational) or reconstruct the original XML document.
IBM DB2 XML Extender allows you to store XML documents as BLOB-type objects or to decompose into a set of tables. The latter has to be transformed into XML collection, defined in XML 1.0 syntax.
Microsoft solves the problem by extending SQL-92 and intervening in OPENXML rowset.
Sybase Adaptive Server is involved in Resultsetxml Java classes as the basis for processing XML documents in two directions.


In this article, we will discuss these vendor's solutions in detail. Thereafter, we will try to answer the following questions:


Can we adjust and simplify the problem?
What is the right approach in a heterogeneous database environment?
I will use the following glossary as an example.


<!--Simple Type-->

<! ELEMENT CURRENCY1 (#PCDATA) >
<! Attlist CURRENCY1 e-dtype nmtoken #FIXED "string"
E-dsize nmtoken #FIXED "3" >

<! ELEMENT CURRENCY2 (#PCDATA) >
<! Attlist CURRENCY2 e-dtype nmtoken #FIXED "string"
E-dsize nmtoken #FIXED "3" >

<! ELEMENT AMOUNT (#PCDATA) >
<! Attlist AMOUNT e-dtype nmtoken #FIXED "decimal" >

<! ELEMENT settlement (#PCDATA) >
<! Attlist settlement E-dtype nmtoken #FIXED "Date" >

<! ELEMENT Bankcode (#PCDATA) >
<! Attlist bankcode e-dtype nmtoken #FIXED "string" >

<! ELEMENT Bankacct (#PCDATA) >
<! Attlist bankacct e-dtype nmtoken #FIXED "string" >

<!--derived type-->

<! ELEMENT account (Bankcode, BANKACCT) >

<! ELEMENT FXTrade (CURRENCY1, CURRENCY2, AMOUNT, settlement, account) >






Oracle xml-sql Utility (XSU)

SQL to XML mappings
Oracle converts an object reference chain from a database into a hierarchy of XML documents. In the object-relational database, the field account in table FXTrade is modeled as an object reference of type AccountType:

CREATE TABLE FXTrade
{
CURRENCY1 CHAR (3),
CURRENCY2 CHAR (3),
AMOUNT NUMERIC (18,2),
Settlement DATE,
Account AccountType//object reference
}

CREATE TYPE AccountType as OBJECT
{
Bankcode VARCHAR (100),
Bankacct VARCHAR (100)
}


Generate the corresponding XML document from the given object-relational model (using the "SELECT * from FXTrade") as follows

<?xml version= "1.0"?>
<ROWSET>
<row num= "1" >
<CURRENCY1>GBP</CURRENCY1>
<CURRENCY2>JPY</CURRENCY2>
<AMOUNT>10000</AMOUNT>
<SETTLEMENT>20010325</SETTLEMENT>
<ACCOUNT>
<BANKCODE>812</BANKCODE>
<BANKACCT>00365888</BANKACCT>
</ACCOUNT>
</ROW>
<!--additional rows ...-->
</ROWSET>


Extracting XML from a database
The following example is taken from Oracle's XSU documentation, with appropriate replacements for SQL statements and using Oracle's pure Java JDBC driver.

First, an instance of Oraclexmlquery is established, followed by a query that represents the result as the form of the XML document above. Similarly, you can extract the form of an XML document into the DOM, in which case you can invoke Qry.getxmldom () instead of getxmlstring ().

Import oracle.jdbc.driver.*;
Import Oracle.xml.sql.query.OracleXMLQuery;
Import java.lang.*;
Import java.sql.*;

Test XML document as a class of String
Class Testxmlsql {

public static void Main (string[] args)
{
try {
Establish a connection
Connection conn = getconnection ("Scott", "Tiger");

Create a query class
Oraclexmlquery qry = new Oraclexmlquery (conn,
"SELECT * from FXTrade");

Get XML string
String str = qry.getxmlstring ();

Print XML output
System.out.println ("The XML output is:\n" + str);

Always close query to release all resources
Qry.close ();
catch (SQLException e) {
System.out.println (E.tostring ());
}
}

Get a connection to a given user name and password
private static Connection getconnection (String username,
String password)
Throws SQLException
{
Registering the JDBC driver
Drivermanager.registerdriver (New
Oracle.jdbc.driver.OracleDriver ());

Establish a connection to the OCI8 driver
Connection conn =
Drivermanager.getconnection (
"Jdbc:oracle:thin: @dlsun489:1521:orcl",
Username, password);

Return conn;
}
}


Storing XML in the database
In the example, we use Oraclexmlsave to store our XML document into the object relational model; InsertXML method to do actual data inserting.

Import java.sql.*;
Import Oracle.xml.sql.dml.OracleXMLSave;
public class Testxmlinsert
{
public static void Main (String args[])
Throws SQLException
{
Connection conn = getconnection ("Scott", "Tiger");
Oraclexmlsave sav = new Oraclexmlsave (conn, Scott). FXTrade ");
Assume that's user passes in this document as 0-arg
Sav.insertxml (Args[0]);
Sav.close ();
}

...
}


If the XML is synchronized with the object-relational model in the database, then everything works, but what if it's different? In this case you have two options.

Adjust Object-Relational model-you can construct a modifiable object-relational view to accomplish multiple table modifications, or, alternatively, use XSLT to break down an XML document into a set of "flat" subdocuments. XSU does not allow property values to be stored; it suggests that you convert attributes to elements.

Summary of Oracle xSU
Modeling XML to SQL mapping with the following object-relational model construction rules: mapping each nested XML element to an object reference of the appropriate type. The mapping rules are implicitly embedded in the database model.

The Java API consists of class oraclexmlquery and Oraclexmlsave.




IBM DB2 XML Extender

SQL to XML mappings
IBM's XML Extender provides two methods of access and storage for using DB2 as an XML warehouse:


XML columns: Storing and retrieving the entire XML document as DB2 column data
XML collection: Decomposing an XML document into a set of relational tables, or synthesizing an XML document from a set of relational tables.


The DTD is stored in a DTD warehouse, called a DB2 table in Dtd_ref, and its schema name is "Db2xml". Each DTD in the Dtd_ref table has a unique ID. The mapping between a database table and the structure of an XML document is defined by means of a data access definition (DAD) file. DAD references a processed document DTD to provide a bridge between the XML document, its DTD, and the mapping rules on top of the database tables.

Here is an example of DAD.

<?xml version= "1.0"?>
<! DOCTYPE DAD SYSTEM "DAD.DTD" >
<DAD>
<dtdid>fxtrade. Dtd</dtdid>
<validation>YES</validation>
<Xcollection>
<prolog>?xml version= "1.0"?</prolog>
<doctype>! DOCTYPE FXTrade FXTrade. DTD </doctype>
<root_node>
<element_node name= "FXTrade" >
<RDB_node>
<table name= "FXTrade"/>
<table name= "Account" key= "ID"/>
<condition>
FXTrade. Account=account.id
</condition>
</RDB_node>
<element_node name= "CURRENCY1" >
<text_node>
<RDB_node>
<table name= "FXTrade"/>
<column name= "CURRENCY1" Type= "CHAR (3)"/>
</RDB_node>
</text_node>
</element_node>
<element_node name= "CURRENCY2" >
<text_node>
<RDB_node>
<table name= "FXTrade"/>
<column name= "CURRENCY2" Type= "CHAR (3)"/>
</RDB_node>
</text_node>
</element_node>
<element_node name= "AMOUNT" >
<text_node>
<RDB_node>
<table name= "FXTrade"/>
<column name= "AMOUNT" type= "DECIMAL (18,2)"/>
</RDB_node>
</text_node>
</element_node>
<element_node name= "Settlement" >
<text_node>
<RDB_node>
<table name= "FXTrade"/>
<column name= "settlement" type= "DATE"/>
</RDB_node>
</text_node>
</element_node>
<element_node name= "Account" >
<element_node name= "Bankcode" >
<text_node>
<RDB_node>
<table name= "Account"/>
<column name= "Bankcode" type= "VARCHAR"/>
</RDB_node>
</text_node>
</element_node>
<element_node name= "Bankacct" >
<text_node>
<RDB_node>
<table name= "Account"/>
<column name= "Bankacct" type= "VARCHAR"/>
</RDB_node>
</text_node>
</element_node>
</element_node> <!--end of account element-->
</element_node> <!--end of FXTrade element-->
</root_node>
</Xcollection>
</DAD>


DAD defines mappings between XML documents and relational database columns by using Element_node to Rdb_node associations. The Element_node FXTrade at the top level is defined as a connection between the table FXTrade and account, with the field ID in the Account table as the primary key. The child element CURRENCY1 is mapped to the field CURRENCY1 in the table FXTrade, and so on. Note that the account element does not contain any rdb_node-this is not required, because the connection is already defined in the preceding and FXTrade. The child elements of account, Banccode, and BANKACCT are defined separately in the corresponding columns in the Account table. Atomic XML elements are marked as Text_node in DAD. In the above example, all elements except FXTrade and account are atomic.

Extracting XML from a database
Process XML document synthesis and decomposition through stored procedures: stored Procedures Dxxgenxml () Extract XML documents from the database; stored procedures Dxxshredxml () store XML documents into the database.

The main input parameters for Dxxgenxml () are


DAD: Stored in the form of a DB2 CLOB data type;
Result Table name: The constructed XML document is forwarded to this table.


The other input parameters specify the maximum number of rows to return, and the <RDB_node> <condition> element Discard (override) option. The output parameters include the actual number of rows returned, the return code, and the return message.

A detailed example of a stored procedure that can be found within a C program in "IBM DB2 Universal Database XML Extender administration and programming, Version 7".

Storing XML in the database
Putting an XML document into a database is done through the stored procedure Dxxshredxml ().

The input parameters for Dxxshredxml () are


DAD: Stored in the form of a DB2 CLOB data type;
Enter an XML document: stored as a DB2 xmlclob data type.


Its output parameter is a return code and a return message.

Summarize
The xml-sql mapping rule is specified by means of a data access definition (DAD) file, which is an XML document. The DB2 XML Extender management facility includes a means of constructing DAD files for each persistent DTD.

Further enhancements will include the new Xml-sql conversion syntax, which will use the XML transformation language and is a subset of the XSLT for the consortium.




Microsoft SQL Server 2000

SQL to XML mappings
The Sql-to-xml and Xml-to-sql mapping rules for SQL Server use different syntax. The details of the mappings are discussed in the following extraction and storage paragraphs.

Extracting XML from a database
The mapping between database columns and XML elements or attributes is defined by the as alias in the SELECT:

< database columns > as [element name! nesting level! attribute name!]

as follows, the top layer of the document is given level 1. By default, the mapped column data is on the property value. You can use the indicator "element" To change this default setting.

The process of generating XML from a database is two steps.

Step 1. Create a as-alias that you want to output the atomic elements in the XML; the alias defines the parent/child connection between the elements. The following table shows an alias for our example document.

FXTrade/* level=1 *
CURRENCY1 [fxtrade!1! CURRENCY1]
CURRENCY2 [fxtrade!1! CURRENCY2]
AMOUNT [fxtrade!1! AMOUNT]
Settlement [fxtrade!1! Settlement]
Account/* level=2 *
Bankcode [account!2! Bankcode]
BANKACCT [account!2! BANKACCT]


Step 2. Define the output tree structure in SQL. Define each level of the tree through the SELECT statement, and then combine the various levels into the tree by means of the UNION all statement. The level-1 SELECT statement involves the atomic element name at all levels. Each SELECT statement involves a tree-level tag and its parent tag. A single record in the result set corresponds to the root, as defined in the first SELECT statement below.

SELECT
1 as Tag,
NULL as Parent,
NULL as [fxtrade!1! CURRENCY1],
NULL as [fxtrade!1! CURRENCY2],
NULL as [fxtrade!1! AMOUNT],
NULL as [fxtrade!1! Settlement],
NULL as [account!2! Bankcode],
NULL as [account!2! BANKACCT]
From
FXTrade
UNION All
SELECT
2,
1,
FXTrade. CURRENCY1,
FXTrade. CURRENCY2,
FXTrade. AMOUNT,
FXTrade. Settlement,
Account. Bankcode,
Account. Bankacct
From
FXTrade, Account
WHERE
FXTrade. account = Account.id
ORDER BY [account!2! Bankcode],
[account!2! BANKACCT]
For XML EXPLICIT, ELEMENTS


For XML constructs an XML document by parsing the markup and as-aliases in a grouped rowset. Keyword EXPLICIT Select the most flexible, user-defined pattern for constructing an XML document. Another mode AUTO constructs an XML document by applying a default rule. Keyword ELEMENTS model SQL columns at the element level; otherwise, the default is to model the SQL column at the attribute level.

Storing XML in the database
Using OPENXML to store an XML document, it is a new rowset function, similar to a table or view. You can use OPENXML to insert or update or SELECT into the destination table. The OPENXML simplified syntax is shown below:

OPENXML (<xml document handle, < path Pattern>, < logo >)
With (Mode | table)
The process of storing XML documents is three-step.


Using stored procedure sp_xml_preparedocument, you get an XML document handle by compiling the XML document into an internal DOM representation.
Constructs a pattern by associating the schema field with an atomic XML element. Define XML elements by path pattern (absolute base path) plus relative element path. An element-centric mapping is indicated by a flag value of 2. You can use an existing table to replace a pattern with a field name that is equivalent to an XML name.
Use stored procedures sp_xml_removedocument to remove compiled XML documents from memory.


These steps are shown in the following examples.

DECLARE @idoc int
DECLARE @doc varchar (1000)
SET @doc = '
<FXTRADE>
<CURRENCY1>GBP</CURRENCY1>
<CURRENCY2>JPY</CURRENCY2>
<AMOUNT>10000</AMOUNT>
<SETTLEMENT>20010325</SETTLEMENT>
<ACCOUNT>
<BANKCODE>812</BANKCODE>
<BANKACCT>00365888</BANKACCT>
</ACCOUNT>
</FXTRADE> '
--Establishes an internal DOM representation of an XML document.
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
--Executes a SELECT statement that uses the OPENXML rowset provider.
SELECT *
From OPENXML (@idoc, '/fxtrade/account ', 2)
With (
CURRENCY1 CHAR (3), '. /@CURRENCY1 ',
CURRENCY2 CHAR (3), '. /@CURRENCY2 ',
AMOUNT NUMERIC (18,2), '.. /@AMOUNT ',
Settlement DATETIME, '.. /@SETTLEMENT ',
Bankcode VARCHAR, ' @BANKCODE ',
Bankacct VARCHAR (MB), ' @BANKACCT ')
EXEC sp_xml_removedocument @idoc


Summarize
You do not use symmetric syntax for extraction and archiving of Microsoft SQL Server 2000,xml documents. Extraction expands a select-clause by using a for XML construct. Storage involves a rowset function OPENXML, analogous to a table or view. The extraction mapping rule is a parent/child contact between XML document elements that is based on (a) the markup that is involved in the specified tree level and (b) the field associated with the table. The store reconstructs the XML document into a flat pattern or table, and uses XPath notation to define a field-element association.




Sybase Adaptive Server

SQL to XML mappings
Sybase uses an XML document type ResultSet to describe the XML document metadata (element name, type, size, and so on) and the actual row data. The following excerpt from the hypothetical Fxtradeset.xml:

<?xml version= "1.0"?>
<! DOCTYPE ResultSet SYSTEM "RESULTSET.DTD" >
<ResultSet>
<ResultSetMetaData>
<columnmetadata
...
Getcolumnlabel= "CURRENCY1"
Getcolumnname= "CURRENCY1"
Getcolumntype= "12"
.../>
...
</ResultSetMetaData>
<ResultSetData>
<Row>
<column name= "CURRENCY1" >GBP</Column>
...
</Row>
</ResultSetData>
</ResultSet>


The ResultSet DTD does not seem to allow the definition of nested elements.

Extracting XML from a database
The Java class Resultsetxml has a constructor that accepts an SQL query as an argument, and then the Getxmlltext method extracts an XML document from the result set:

JCS.XML.RESULTSET.RESULTSETXML RSX = new JCS.XML.RESULTSET.RESULTSETXML
("SELECT * from FXTrade", <other parameters>);
Fileutil.string2file ("Fxtradeset.xml", Rsx.getxmltext ());


Storing XML in the database
The Resultsetxml Class Builder also accepts an XML document as an argument. The method Tosqlscript then generates a sequence of SQL statements to insert/update from the result set to the specified table.

String xmlstring = fileutil.file2string ("Fxtradeset.xml");
JCS.XML.RESULTSET.RESULTSETXML RSX = new JCS.XML.RESULTSET.RESULTSETXML
(xmlstring);
String SqlString = Rsx.tosqlscript ("FXTrade", <other parameters>)


Summarize
The extraction and storage of XML documents are inherently symmetric. The store does not seem to allow more than one table to be modified. Extracts the results of a SQL query into a document with a flat structure.




Manufacturer comparison

Vendor Mapping Rules single table/multiple table conversion means symmetric extraction/storage
implicit in Oracle; if the XML document and the object-relational model match by constructing the object-relational model multiple specified Java class symmetry
IBM data access definition file multiple specified stored procedure symmetry
Microsoft SQL Extensions; Rowset functions multiple tables are used for extraction; a single table is used to store asymmetric OPENXML by using SQL constructs for XML and rowset
Sybase result set DTD single table; query can contain multiple tables by using Java class symmetry


The common features among vendors are:


XML persistence is based on a special basis, that is, there is no general facility to store any XML document, and a special mapping is required if the XML document uses a new grammar;
Storage often requires data processing, such as reformatting values/dates according to the area used, and using XSLT for XML data collation.





A strategy that can be substituted

The tasks that XML documents store in the database can be divided into stages? First, storing XML in a normal database structure without applying any mapping rules, and secondly, providing a framework for subsequent processing? The advantage of this approach is that there is a common collection agent (acquisition agent) for virtually any XML document. The remainder of this article only proposes this workaround.

Under this strategy, the XML document is saved as a normal tree structure--Raw tree data--for further application processing. The framework for further data processing is a set of database objects (stored procedures), similar to the DOM API, used to manipulate data through traditional SQL processing.

The benefits of using the opposite frame approach to the fully automated approach are that database groups (population) are often controlled by the application logic distributed between "pure" application code, transaction processing, database layer (layer), and stored procedures, without mentioning that some tables may reside in a remote database.

In addition, in a heterogeneous database environment, there is a clear benefit of having a unified tool to accomplish the same tasks for different databases.

The author's solution, Objectcentric Solutions x-persistence Utility, implements the strategy discussed here.

The original tree structure is defined by the following table.

1) tree node definition

CREATE TABLE Nodedef (
NodeID udd_id not NULL,//Unique node ID
Dimid udd_id not NULL,//Tree class: XML
Name Udd_name not NULL,//node name
Value Udd_paramvalue NULL,//Node values
Value_type Udd_type NULL,//value type
Power Udd_power NULL,//node tree level
IsRoot udd_logical not NULL,//Top level node flag
Descriptor Udd_descriptor NULL,//DOM node type
Color Udd_color NULL//Informal data
)


2 parent-child relationship between nodes

CREATE TABLE Nodelinks (
ParentID udd_id not NULL,//parent node
childID udd_id not NULL//child node
)





Case

Storing XML documents requires calling the XML2SQL program in the database:

Xml2sql <xml document file name >

Extract the XML document from the database as a tree structure using the following stored procedure:


get_tree_roots < tree species >--Extract all document root IDs for a given forest
Get_tree < root id>--Extract all nodes of a given root (document)
Get_children < node, level >--Extracts all child nodes at a specific level for a given node


Implementation Details:


The current platform includes: Sybase,ms SQL Server.
Platform evaluated: ORACLE,DB2 and PostgreSQL.
The utility is built on top of the Xerces XML parser.
Database installation involves only adding tables, and navigating database tables into a tree-structured stored procedure.


X-persistence Utility is currently used as the core of xml-based data acquisition agents in Open Finance Laboratory, a customizable integration platform for financial applications such as securities management, liability management, and risk management. Open Finance Laboratory accepts FpML as a descriptive language for interest rate export. X-persistence Utility is currently used as a FPML/FRA data acquisition agent.



Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.