SQL Server 2000 and the XML for SQL Server 2000 Web Releases (SQLXML) provide three ways to store XML data: XML Bulk Load, Updategrams, and OpenXML. XML Bulk Load and Updategrams are client-side technologies that use annotated schemas to specify the mapping between XML document content and database tables. OpenXML is a server-side technology that lets you define a relational view over an XML document. With that relational view, you can use T-SQL code to query the data in the XML document and store the results in your SQL Server database.
Each of these three technologies is designed for a specific purpose. XML Bulk Load stores data from large XML files in SQL Server. Updategrams perform optimistic updates against SQL Server: an optimistic update takes no locks and instead checks, at update time, whether another user has changed the data since it was read. OpenXML provides a familiar relational way to access XML data.
OpenXML is the most flexible of the three technologies because it provides a programming model (T-SQL) that you can use to write business rules or perform computations on the XML data before storing it in the SQL Server database. However, because OpenXML is a server-based technology, using it frequently or on a large number of documents degrades SQL Server performance. If you use the Microsoft .NET Framework, you can use the ADO.NET DataSet to bypass these performance and scalability limits. The ADO.NET DataSet gives you a powerful way to store XML data in SQL Server, complete with a full programming model.
Mapping DataSets, data tables, and XML
You can use a DataSet to generate XML from SQL Server query results. By providing a relational data cache that can be used on client and middle-tier machines, the DataSet can load and maintain data from a variety of sources (including SQL Server, other relational databases, and XML).
When you load a DataSet from an XML document, the DataSet must map the data stored in the hierarchical XML representation to its own relational representation. For example, if your XML document contains a list of Order elements, each with nested LineItem child elements, the document is usually mapped to Orders and LineItems data tables in the relational representation. The purpose of this mapping is the same as that of OpenXML, which constructs a relational view over an XML document by using XPath queries. However, instead of relying on XPath, the DataSet has its own mapping method.
The DataSet uses an XML Schema Definition (XSD) schema to map data from the XML document to its relational data cache. The DataSet gives you two ways to specify a schema for mapping XML data. First, you can reference an XSD schema that defines the elements, attributes, and relationships used in the XML document. Alternatively, you can infer the schema directly from the structure of the document. In other words, the DataSet can create a schema for itself by examining the structure and content of the XML document.
When you reference an XSD schema, the DataSet uses the elements, attributes, and relationships between elements defined in that schema to construct the data tables, data columns, and data relations of the relational data cache, which you can then use to store the mapped XML data. The structure, or schema, of the relational data cache is generally called its shape. When the DataSet processes a schema, it applies a set of rules to create the tables that store the mapped XML data. These rules are similar to the default mapping rules that Updategrams and XML Bulk Load use when a mapping schema contains no annotations. The DataSet mapping rules are as follows:
· Complex elements (elements that contain other elements or attributes) are mapped to tables.
· Attributes and simple-valued child elements (elements that contain only data, with no child elements or attributes) are mapped to columns.
· Data types are mapped from XSD types to .NET types.
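As a rough illustration of these three rules, the following sketch loads a small, hypothetical schema (not the one used later in this article) from a string and prints the resulting tables, columns, and types. The class name MappingRulesDemo, the helper Load, and the schema itself are assumptions made for the example; ReadXmlSchema is the same DataSet method that Listing 1 uses.

```csharp
using System;
using System.Data;
using System.IO;

public class MappingRulesDemo
{
    // Hypothetical schema for illustration only. Order and LineItem are
    // complex elements, OrderID, Quantity, and UnitPrice are simple child
    // elements, and ProductID is an attribute.
    public const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='Order'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='OrderID' type='xs:int'/>
        <xs:element name='LineItem' maxOccurs='unbounded'>
          <xs:complexType>
            <xs:sequence>
              <xs:element name='Quantity' type='xs:int'/>
              <xs:element name='UnitPrice' type='xs:decimal'/>
            </xs:sequence>
            <xs:attribute name='ProductID' type='xs:int'/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Builds the relational shape from the schema; no data is loaded.
    public static DataSet Load()
    {
        DataSet ds = new DataSet("Demo");
        ds.ReadXmlSchema(new StringReader(Xsd));
        return ds;
    }

    public static void Main()
    {
        // Print every column as Table.Column: Type
        foreach (DataTable dt in Load().Tables)
            foreach (DataColumn dc in dt.Columns)
                Console.WriteLine("{0}.{1}: {2}", dt.TableName, dc.ColumnName, dc.DataType);
    }
}
```

Under these assumptions, the complex elements should surface as the Order and LineItem tables, while the attribute and the simple child elements become typed columns of those tables.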
Inference is a fast and convenient way to load XML documents into a DataSet. Tables, columns, and relationships are created automatically through introspection, the process by which the DataSet examines the structure and content of the XML document. Although inference significantly reduces your programming burden, it also makes your implementation unpredictable, because a small change to the XML document can cause the DataSet to create tables with a different shape. Such changes can break your application unexpectedly. Therefore, I recommend that you reference a schema in production applications and limit inference to prototyping.
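The fragility of inference can be seen in a small sketch. It assumes two hypothetical order documents that differ only in whether the Note element repeats, and it loads each with ReadXml and XmlReadMode.InferSchema. The class name InferenceDemo and the documents are invented for the example.

```csharp
using System;
using System.Data;
using System.IO;

public class InferenceDemo
{
    // Hypothetical documents: the second one merely repeats the Note element.
    public const string OneNote =
        "<Orders><Order><OrderID>1</OrderID><Note>rush</Note></Order></Orders>";
    public const string TwoNotes =
        "<Orders><Order><OrderID>1</OrderID><Note>rush</Note><Note>fragile</Note></Order></Orders>";

    // Lets the DataSet infer its shape from the document while loading it.
    public static DataSet Infer(string xml)
    {
        DataSet ds = new DataSet();
        ds.ReadXml(new StringReader(xml), XmlReadMode.InferSchema);
        return ds;
    }

    // Prints each inferred table with its column names.
    public static void Describe(string label, DataSet ds)
    {
        Console.WriteLine(label);
        foreach (DataTable dt in ds.Tables)
        {
            Console.Write("\t{0}:", dt.TableName);
            foreach (DataColumn dc in dt.Columns)
                Console.Write(" {0}", dc.ColumnName);
            Console.WriteLine();
        }
    }

    public static void Main()
    {
        Describe("One Note element", Infer(OneNote));
        Describe("Two Note elements", Infer(TwoNotes));
    }
}
```

In the first document Note is inferred as a column of the Order table; in the second, the repeated element is inferred as a table of its own. Code written against the first shape would no longer find a Note column.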
Now let's look at how to use a schema to create a client-side data cache that you can use to update the SQL Server database.
Mapping XML orders
Assume that you are writing an application that accepts customer orders. Each order arrives in XML format, and its XSD schema is defined in Figure 1. The schema defines three complex types: customer data, order data, and order line items. A top-level Customer element defines the root of the XML document. Containment defines the relationships between the elements: the Customer element contains an Order element, and the Order element contains a LineItem element. Figure 2 shows a sample XML document based on the schema defined in Figure 1.
Figure 1: XSD schema
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="urn:Sep2003Example" elementFormDefault="qualified"
    xmlns="urn:Sep2003Example"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="OrderType">
    <xs:sequence>
      <xs:element name="OrderID" type="xs:integer"/>
      <xs:element name="LineItem" type="LineItemType"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="LineItemType">
    <xs:sequence>
      <xs:element name="ProductID" type="xs:int"/>
      <xs:element name="Quantity" type="xs:int"/>
      <xs:element name="UnitPrice" type="xs:decimal"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="CustomerType">
    <xs:sequence>
      <xs:element name="CustomerID" type="xs:string"/>
      <xs:element name="Order" type="OrderType"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="Customer" type="CustomerType">
  </xs:element>
</xs:schema>
Figure 2: an XML document example
<?xml version="1.0"?>
<Customer xmlns="urn:Sep2003Example">
  <CustomerID>ALFKI</CustomerID>
  <PO>9572658</PO>
  <Address>
    <Street>One Main Street</Street>
    <City>Anywhere</City>
    <State>NJ</State>
    <Zip>08080</Zip>
  </Address>
  <Order>
    <OrderID>10966</OrderID>
    <LineItem>
      <ProductID>37</ProductID>
      <UnitPrice>26.50</UnitPrice>
      <Quantity>8</Quantity>
      <Description>Gravad lax</Description>
    </LineItem>
    <LineItem>
      <ProductID>56</ProductID>
      <UnitPrice>38.00</UnitPrice>
      <Quantity>12</Quantity>
      <Description>Gnocchi di nonna Alice</Description>
    </LineItem>
  </Order>
</Customer>
The C# code in Listing 1 uses the ReadXmlSchema method to load the schema in Figure 1 into a DataSet called orderDS. ReadXmlSchema creates three data tables that correspond to the Customer, Order, and LineItem elements defined in the schema. So that you can verify that the schema creates the expected tables in the relational data cache, the printDSShape method writes the name of each table to the console, followed by the list of columns and the data type of each column.
Listing 1: C# code for building a relational data cache
using System;
using System.Collections;
using System.Data;
using System.Data.SqlClient;
using System.Xml;

public class XMLMap
{
    public static void Main()
    {
        // Create a DataSet and read the schema
        DataSet orderDS = new DataSet("CustOrder");
        orderDS.ReadXmlSchema("CustOrderLitem.xsd");
        // Print the shape of the DataSet
        printDSShape(orderDS);
        // Read an order in XML format into the DataSet
        orderDS.ReadXml("Order.xml", System.Data.XmlReadMode.IgnoreSchema);
        // Print the data in the DataSet
        printDSData(orderDS);
        // Insert business rules and database update logic here
    }

    private static void printDSShape(DataSet ds)
    {
        foreach (DataTable dt in ds.Tables)
        {
            Console.WriteLine("{0}", dt.TableName);
            // Print the column name and type
            foreach (DataColumn dc in dt.Columns)
                Console.WriteLine("\t{0}\t{1}", dc.ColumnName, dc.DataType.ToString());
        }
    }

    private static void printDSData(DataSet ds)
    {
        foreach (DataTable dt in ds.Tables)
        {
            Console.WriteLine("\n{0}:", dt.TableName);
            // Print the column header
            foreach (DataColumn dc in dt.Columns)
                Console.Write("{0}\t", dc.ColumnName);
            Console.WriteLine("");
            // Print the data, one row per line
            foreach (DataRow dr in dt.Rows)
            {
                foreach (DataColumn dc in dt.Columns)
                    System.Console.Write("{0}\t", dr[dc]);
                System.Console.WriteLine("");
            }
        }
    }
}
Check the column names carefully. Although the Customer_Id and Order_Id columns are not specified in the schema, they still appear in the data tables. ReadXmlSchema adds these columns to the DataSet automatically. The DataSet uses them as foreign keys to model the relationship between the Customer element and its Order element, and between the Order element and its LineItem element. XML typically uses nesting in place of foreign keys, so the DataSet automatically generates its own primary and foreign keys between the data tables and stores them in these columns.
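One way to see these generated keys at work is to list the relations of the DataSet and follow them from parent rows to child rows. The sketch below assumes a simplified, namespace-free stand-in for the schema in Figure 1 and a trimmed copy of the order data, both embedded as strings so that the example needs no files. The class name RelationDemo and the helper Load are hypothetical.

```csharp
using System;
using System.Data;
using System.IO;

public class RelationDemo
{
    // Simplified stand-in for the Figure 1 schema (assumption: no namespace).
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='Customer'><xs:complexType><xs:sequence>
    <xs:element name='CustomerID' type='xs:string'/>
    <xs:element name='Order'><xs:complexType><xs:sequence>
      <xs:element name='OrderID' type='xs:integer'/>
      <xs:element name='LineItem' maxOccurs='unbounded'><xs:complexType><xs:sequence>
        <xs:element name='ProductID' type='xs:int'/>
        <xs:element name='Quantity' type='xs:int'/>
      </xs:sequence></xs:complexType></xs:element>
    </xs:sequence></xs:complexType></xs:element>
  </xs:sequence></xs:complexType></xs:element>
</xs:schema>";

    // Trimmed copy of the order data from Figure 2.
    const string Xml = @"<Customer><CustomerID>ALFKI</CustomerID>
  <Order><OrderID>10966</OrderID>
    <LineItem><ProductID>37</ProductID><Quantity>8</Quantity></LineItem>
    <LineItem><ProductID>56</ProductID><Quantity>12</Quantity></LineItem>
  </Order></Customer>";

    // Reads the schema first, then the data, as Listing 1 does.
    public static DataSet Load()
    {
        DataSet ds = new DataSet("CustOrder");
        ds.ReadXmlSchema(new StringReader(Xsd));
        ds.ReadXml(new StringReader(Xml), XmlReadMode.IgnoreSchema);
        return ds;
    }

    public static void Main()
    {
        DataSet ds = Load();
        // List the relations that were generated from the element nesting
        foreach (DataRelation rel in ds.Relations)
            Console.WriteLine("{0}: {1} -> {2}", rel.RelationName,
                rel.ParentTable.TableName, rel.ChildTable.TableName);
        // Follow each relation from an Order row to its LineItem rows
        DataTable orders = ds.Tables["Order"];
        foreach (DataRow order in orders.Rows)
            foreach (DataRelation rel in orders.ChildRelations)
                foreach (DataRow item in order.GetChildRows(rel))
                    Console.WriteLine("Order {0}: product {1}, quantity {2}",
                        order["OrderID"], item["ProductID"], item["Quantity"]);
    }
}
```

Navigating through GetChildRows keeps the application code independent of the generated key values, which exist only inside the DataSet.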
Also look closely at the data types in Figure 3: the DataSet has mapped each XML Schema data type to the corresponding .NET data type. When you load an XML document into the DataSet, the DataSet converts each value from XML to the corresponding .NET type.
Figure 3: generated data types and records
Customer
    CustomerID    System.String
    Customer_Id   System.Int32
Order
    OrderID       System.Int64
    Order_Id      System.Int32
    Customer_Id   System.Int32
LineItem
    ProductID     System.Int32
    Quantity      System.Int32
    UnitPrice     System.Decimal
    Order_Id      System.Int32

Customer:
CustomerID  Customer_Id
ALFKI       0

Order:
OrderID  Order_Id  Customer_Id
10966    0         0

LineItem:
ProductID  Quantity  UnitPrice  Order_Id
37         8         26.5       0
56         12        38         0
After the schema is loaded into the DataSet, all you need to do to complete the relational mapping is load the XML data into the DataSet. The ReadXml method in Listing 1 opens the file named Order.xml, shown in Figure 2, and reads its data into the data tables of the DataSet that has just read the schema. Your XML order can now be accessed through the DataSet.
To demonstrate how to access data in the DataSet, the printDSData method in Listing 1 iterates over the data tables. For each table, it displays the column names, followed by all rows in the table. Figure 3 shows the automatically generated values in the Customer_Id and Order_Id columns that the ReadXmlSchema method added to the DataSet.
Note that the three elements PO, Address, and Description in Order.xml are not mapped to the data tables. The data is ignored because the schema you provided to the DataSet does not contain these elements. Once the DataSet has established the shape of the relational data cache, it simply ignores any XML data that the schema does not describe. This simple feature lets your code keep working even if the XML order you receive from a customer contains unexpected extra data.
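This behavior can be checked in isolation with a short sketch. It assumes a hypothetical one-table schema that describes only CustomerID, then loads a document that also carries PO and Address elements using XmlReadMode.IgnoreSchema, the same mode as Listing 1. The class name IgnoreExtraDemo and both strings are invented for the example.

```csharp
using System;
using System.Data;
using System.IO;

public class IgnoreExtraDemo
{
    // Hypothetical schema that describes only the CustomerID element.
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='Customer'><xs:complexType><xs:sequence>
    <xs:element name='CustomerID' type='xs:string'/>
  </xs:sequence></xs:complexType></xs:element>
</xs:schema>";

    // The document carries PO and Address, which the schema does not mention.
    const string Xml = @"<Customer>
  <CustomerID>ALFKI</CustomerID>
  <PO>9572658</PO>
  <Address><City>Anywhere</City><State>NJ</State></Address>
</Customer>";

    // Fixes the shape from the schema, then loads the data without
    // letting the document change that shape.
    public static DataSet Load()
    {
        DataSet ds = new DataSet("CustOrder");
        ds.ReadXmlSchema(new StringReader(Xsd));
        ds.ReadXml(new StringReader(Xml), XmlReadMode.IgnoreSchema);
        return ds;
    }

    public static void Main()
    {
        DataSet ds = Load();
        Console.WriteLine("Tables: {0}", ds.Tables.Count);
        foreach (DataColumn dc in ds.Tables["Customer"].Columns)
            Console.WriteLine("Column: {0}", dc.ColumnName);
        Console.WriteLine("CustomerID: {0}", ds.Tables["Customer"].Rows[0]["CustomerID"]);
    }
}
```

The trade-off is that dropped data disappears silently, so an application that needs to detect unexpected elements would have to validate the document separately.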
Building an application that uses the data cache
Now you know how to use a DataSet to create a relational data cache for XML data. You can use this technique to implement an application that executes business logic and updates SQL Server. Business logic is relatively straightforward to write against the DataSet programming model. ADO.NET gives you several options for updating data in SQL Server, including using data adapters, writing your own queries, and executing stored procedures. Mapping XML data into a relational model is easy with a DataSet. The rest is up to you.