XML features in ADO.NET


There is no doubt that XML and some of its related technologies (including XPath, XSL Transformations, and XML Schema) are the foundation of ADO.NET. Compared with ADO, the interoperability of the ADO.NET object model is greatly improved, and XML is the key element behind that improvement. In ADO, XML was just a (non-default) I/O format for persisting the contents of a disconnected recordset. In ADO.NET, XML is involved far more deeply in how the objects are built and how they interact. The following points summarize the stronger interaction and integration of ADO.NET with XML:

• Object serialization and remoting

• Dual programming interface

• XML-driven batch update (SQL Server 2000 only)

In ADO.NET, you can save objects to an XML document and restore them from an XML document in several ways. Strictly speaking, this ability belongs only to the DataSet object, but it can be extended to other container objects with minimal code. Saving objects such as DataTable and DataView to XML can essentially be considered a special case of DataSet serialization.

In addition, the ADO.NET and XML classes provide a unified intermediate API that programmers can reach through a dual, synchronized programming interface. You can access and update data by using either the node-based, hierarchical approach of XML or the relational, column-based tabular approach of the DataSet. You can switch from the DataSet representation of the data to the XML DOM at any time, and vice versa. The data is synchronized, so any change you enter in one model is immediately reflected and visible in the other. In this article, I'll discuss ADO.NET-to-XML serialization and XML data access, which are the first two points in the list above. Next month, I'll focus on XML-driven batch updates, one of the coolest features you get from the XML extensions for SQL Server (SQLXML 2.0).

Datasets and XML
Like any other .NET object, the DataSet object is stored in memory in a binary format. Unlike other objects, however, a DataSet is always remoted and serialized in a special XML format called DiffGram. The DataSet is automatically rendered as a DiffGram when it crosses the boundary of the application domain or the physical boundary of the machine. On the target side, the DataSet is silently rebuilt as a binary object and can be used immediately. An application can use the same serialization functionality through a set of methods, among which one pair clearly stands out: ReadXml and WriteXml. The following table shows the methods that you can use to read and write a DataSet as XML.


GetXml
• Returns a string that is the XML representation of the data stored in the DataSet
• Does not include any schema information

GetXmlSchema
• Returns a string that is the XML schema information for the DataSet

ReadXml
• Populates a DataSet object with the specified XML data read from a stream or file

ReadXmlSchema
• Loads the specified XML schema information into the current DataSet object

WriteXml
• Writes the XML data (and optionally the schema) that represents the DataSet
• Can write to a stream or file

WriteXmlSchema
• Writes the XML schema information for the DataSet
• Can write to a stream or file

As shown in the table above, when working with a DataSet and XML, you can manage data and schema information as separate entities. You can extract the XML schema from the DataSet and use it as a string. You can also write it to a disk file or load it into an empty DataSet object. In addition to the methods listed in the previous table, the DataSet object also exposes two XML-related properties: Namespace and Prefix. Namespace determines the XML namespace that is used to qualify XML attributes and elements when they are read into the DataSet. The prefix that aliases the namespace is stored in the Prefix property.
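To make the separation of data and schema concrete, here is a minimal C# sketch. The class, method, table, and column names are hypothetical and exist only for illustration; the DataSet members used (GetXml, GetXmlSchema, ReadXmlSchema) are the ones listed in the table above.

```csharp
using System;
using System.Data;
using System.IO;

public static class SchemaAndDataExample
{
    // Builds a small, hypothetical DataSet used only for illustration.
    public static DataSet BuildSample()
    {
        DataSet ds = new DataSet("MyDataSet");
        DataTable customers = ds.Tables.Add("Customers");
        customers.Columns.Add("CustomerID", typeof(int));
        customers.Columns.Add("FName", typeof(string));
        customers.Rows.Add(1, "John");
        return ds;
    }

    // Copies only the schema of 'source' into a brand-new, empty DataSet.
    public static DataSet CloneSchemaThroughXml(DataSet source)
    {
        string schema = source.GetXmlSchema();          // schema only, as a string
        DataSet empty = new DataSet();
        empty.ReadXmlSchema(new StringReader(schema));  // tables and columns, no rows
        return empty;
    }

    // Prints the data and the schema as two separate XML strings.
    public static void Demo()
    {
        DataSet ds = BuildSample();
        Console.WriteLine(ds.GetXml());        // data only, no schema
        Console.WriteLine(ds.GetXmlSchema());  // schema only
    }
}
```

Calling SchemaAndDataExample.Demo() from a console application's Main method prints the two strings one after the other, which makes it easy to see that neither contains the other's information.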

Building datasets from XML
The ReadXml method populates a DataSet object from a variety of sources, including disk files, .NET streams, and XmlReader objects. The method can handle any type of XML file, but if the XML has a fairly irregular, non-tabular structure, problems may arise when it is rendered in rows and columns.

The ReadXml method has several overloads, all of which are very similar. They accept the XML source and optionally the XmlReadMode value as a parameter. For example:

public XmlReadMode ReadXml(string, XmlReadMode);

This method creates a relational schema for the dataset based on the specified read mode and whether the schema already exists in the dataset. The following code fragment shows the typical code used to load a dataset from XML.

StreamReader sr = new StreamReader(fileName);
DataSet ds = new DataSet();
ds.ReadXml(sr);   // Defaults to XmlReadMode.Auto
sr.Close();

When you load the contents of an XML source into a DataSet, ReadXml does not merge rows whose primary key information matches. To merge an existing DataSet with data loaded from XML, you must first load the XML into a new DataSet and then merge the two DataSet objects using the Merge method. During a merge, the rows that get overwritten are those with a matching primary key. An alternative way to merge an existing DataSet with content read from XML is through the DiffGram format (discussed in detail later).

The following table explains the read modes supported by ReadXml. You set them by using the XmlReadMode enumeration.
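The load-then-merge approach described above could be sketched as follows. This is only an illustration under stated assumptions: the MergeExample class, the Customers table, its columns, and the sample XML are all hypothetical, and the incoming XML is assumed to carry no schema of its own, so the second DataSet borrows its schema (including the primary key) from the target through Clone.

```csharp
using System;
using System.Data;
using System.IO;

public static class MergeExample
{
    // Hypothetical XML payload: updates customer 2 and adds customer 3.
    public const string IncomingXml =
        "<MyDataSet>" +
        "<Customers><CustomerID>2</CustomerID><FName>Joseph</FName></Customers>" +
        "<Customers><CustomerID>3</CustomerID><FName>Ann</FName></Customers>" +
        "</MyDataSet>";

    // Builds the existing DataSet, with a primary key on CustomerID.
    public static DataSet BuildCustomers()
    {
        DataSet ds = new DataSet("MyDataSet");
        DataTable customers = ds.Tables.Add("Customers");
        DataColumn id = customers.Columns.Add("CustomerID", typeof(int));
        customers.Columns.Add("FName", typeof(string));
        customers.PrimaryKey = new DataColumn[] { id };
        customers.Rows.Add(1, "John");
        customers.Rows.Add(2, "Joe");
        ds.AcceptChanges();
        return ds;
    }

    // Loads the XML into a second DataSet, then merges it into the target.
    public static void MergeFromXml(DataSet target, string xml)
    {
        DataSet incoming = target.Clone();   // same schema and keys, no rows
        incoming.ReadXml(new StringReader(xml), XmlReadMode.IgnoreSchema);
        target.Merge(incoming);              // rows are matched on the primary key
    }
}
```

After MergeFromXml runs, the row with CustomerID 2 carries the incoming values and a new row with CustomerID 3 is appended. Without the primary key, Merge would have no way to match rows and would simply append them.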


IgnoreSchema
Ignores any embedded schema and relies on the existing schema of the DataSet

ReadSchema
Reads any embedded schema and loads both the data and the schema

InferSchema
Ignores any embedded schema and infers the schema from the XML data

DiffGram
Reads a DiffGram and adds the data to the current schema

Fragment
Reads and adds XML fragments until the end of the stream is reached

The default read mode, XmlReadMode.Auto, is not listed in the table. When this mode is set, or when no read mode is set explicitly, the ReadXml method examines the XML source and chooses the most appropriate option.

If the XML source turns out to be a DiffGram, it is loaded as a DiffGram. If the source contains an embedded schema or a reference to an external schema, ReadXmlSchema is used to load the schema. Finally, if no schema information exists in the XML source, the ReadXml method infers the schema using the DataSet's InferXmlSchema method. The relational structure of a DataSet (that is, its schema) consists of tables, columns, constraints, and relations. Let's take a look at what happens when you set each of these modes.
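As a rough illustration of the automatic mode, the sketch below loads schema-less XML into an empty DataSet without specifying a read mode. The AutoReadModeExample class and the sample XML are hypothetical. ReadXml returns an XmlReadMode value that reports the mode it actually used, which is a convenient way to observe the decision described above.

```csharp
using System;
using System.Data;
using System.IO;

public static class AutoReadModeExample
{
    // Hypothetical XML with no embedded schema and no DiffGram wrapper.
    public const string PlainXml =
        "<MyDataSet>" +
        "<Customers><CustomerID>1</CustomerID><FName>John</FName></Customers>" +
        "<Customers><CustomerID>2</CustomerID><FName>Joe</FName></Customers>" +
        "</MyDataSet>";

    // No mode is passed, so XmlReadMode.Auto applies; the return value
    // reports the mode that ReadXml actually chose.
    public static XmlReadMode Load(DataSet ds, string xml)
    {
        return ds.ReadXml(new StringReader(xml));
    }

    // Prints the chosen mode and the tables that were created from the XML.
    public static void Demo()
    {
        DataSet ds = new DataSet();
        XmlReadMode used = Load(ds, PlainXml);
        Console.WriteLine("Mode used: " + used);
        foreach (DataTable table in ds.Tables)
            Console.WriteLine(table.TableName + ": " + table.Rows.Count + " rows");
    }
}
```

Because the DataSet starts out empty and the XML carries no schema, the tables and columns seen after the call are the ones ReadXml derived from the structure of the data.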

The XmlReadMode.IgnoreSchema option causes the method to ignore any embedded or referenced schema. The data is therefore loaded into the existing DataSet schema, and any data that does not fit it is discarded. If no schema exists in the DataSet, no data is loaded. Note that an empty DataSet has no schema information. Remember that if the XML source is in DiffGram format, the IgnoreSchema option has the same effect as XmlReadMode.DiffGram.

// With no schema in the DataSet, no data is loaded
DataSet ds = new DataSet();
StreamReader sr = new StreamReader(fileName);
ds.ReadXml(sr, XmlReadMode.IgnoreSchema);
sr.Close();

The XmlReadMode.ReadSchema option works only with embedded schemas and does not recognize external references. It can add new tables to the DataSet, but throws an exception if any of the tables defined in the embedded schema already exist in the DataSet. You cannot use the ReadSchema option to change the schema of an existing table. If the DataSet does not contain a schema (that is, the DataSet is empty) and there is no embedded schema, no data is read or loaded. ReadXml can read only embedded schemas that are defined using XML Schema definition language (XSD) or XML-Data Reduced (XDR). Document type definitions (DTD) are not supported.

If the XmlReadMode.InferSchema option is set, ReadXml infers the schema directly from the structure of the XML data and ignores any embedded schema that may exist. The data is loaded only after the schema has been inferred. An existing schema is extended by adding new tables or by adding new columns to existing tables, as appropriate. You can use the InferXmlSchema method of the DataSet to load the schema from a specified XML file into the DataSet. To some extent, you can control which XML elements are processed during the schema inference operation: the signature of InferXmlSchema lets you specify an array of namespaces whose elements are excluded from inference.

void InferXmlSchema(string fileName, string[] rgNamespace);

DiffGram is the XML format that ADO.NET uses to persist the state of a DataSet. Similar to the SQLXML updategram format, a DiffGram includes both the current state and the original state of the data rows. When you load a DiffGram using ReadXml, rows with matching primary keys are merged. You can use the XmlReadMode.DiffGram flag to explicitly instruct ReadXml to treat the source as a DiffGram. When using the DiffGram format, the destination DataSet must have the same schema as the DiffGram; otherwise, the merge operation fails and an exception is thrown.

If the XmlReadMode.Fragment option is set, the DataSet is loaded from an XML fragment. An XML fragment is a valid piece of XML that identifies elements, attributes, or documents. The fragment of an element is the markup text that fully qualifies the XML element (node, CDATA, processing instruction, comment). The fragment of an attribute is the attribute value, and the fragment of a document is its entire content set. When the XML data is a fragment, the root-level rules of well-formed XML documents are not applied. Fragments that match the existing schema are appended to the appropriate tables, and fragments that do not match the schema are discarded. ReadXml reads from the current position to the end of the stream. The XmlReadMode.Fragment option should not be used to populate an empty DataSet that has no schema.

Serializing a DataSet object to XML
The XML representation of a DataSet can be written to a file, a stream, an XmlWriter object, or a string using the WriteXml method. The XML representation may or may not include schema information. The actual behavior of the WriteXml method is controlled by the XmlWriteMode parameter that you can pass. The value from the XmlWriteMode enumeration determines the layout of the output. A DataSet representation includes tables, relations, and constraint definitions. Unless you choose the DiffGram format, only the current version of each row in the DataSet's tables is written. The following table summarizes the write options available through XmlWriteMode.


IgnoreSchema
Writes the DataSet content as XML data with no schema

WriteSchema
Writes the DataSet content with an embedded XSD schema

DiffGram
Writes the DataSet content as a DiffGram, including original and current values

XmlWriteMode.IgnoreSchema is the default option. The following code shows a typical way to serialize a dataset to XML.

// ds is the DataSet
StreamWriter sw = new StreamWriter(fileName);
ds.WriteXml(sw);   // Defaults to XmlWriteMode.IgnoreSchema
sw.Close();

Several factors affect the final structure of the XML document created from the DataSet object. These factors include:

• The overall XML format used: DiffGram or a plain hierarchical representation of the current content

• Whether schema information is present

• Nested relations

• How table columns are mapped to XML elements

The DiffGram format is a special XML format that I will explain further later on. It does not include schema information, but it preserves row state and row errors. As a result, it comes closer to representing a live instance of a DataSet.

Schema information, if present, is always written as embedded XSD. You cannot write it as XDR or DTD, or add a reference to an external file. The root node of the generated XML takes the name of the DataSet, or NewDataSet if no name has been specified. The following fragment is an example of the XML representation of a DataSet object consisting of two tables, Customers and Orders, related through the CustomerID field.

<MyDataSet>
<xs:schema .../>
<Customers>
<CustomerID>1</CustomerID>
<FName>John</FName>
<LName>Smith</LName>
</Customers>
<Customers>
<CustomerID>2</CustomerID>
<FName>Joe</FName>
<LName>Users</LName>
</Customers>
<Orders>
<CustomerID>1</CustomerID>
<OrderID>000A01</OrderID>
</Orders>
<Orders>
<CustomerID>1</CustomerID>
<OrderID>000B01</OrderID>
</Orders>
</MyDataSet>

It is hard to work out the relation between the two tables from the listing above. Some information about it is stored in the <xs:schema> tree, but beyond that nothing else helps you draw the conclusion. Put into words, the relation set on the CustomerID field reads: all the orders issued by a given customer. The XML tree above does not give an immediate representation of this information. To change the order of the nodes when a data relation exists in the DataSet, you can set the Nested property of the DataRelation object to true. As a result of this change, the XML output looks like this:

<MyDataSet>
<xs:schema .../>
<Customers>
<CustomerID>1</CustomerID>
<FName>John</FName>
<LName>Smith</LName>
<Orders>
<CustomerID>1</CustomerID>
<OrderID>000A01</OrderID>
</Orders>
<Orders>
<CustomerID>1</CustomerID>
<OrderID>000B01</OrderID>
</Orders>
</Customers>
<Customers>
<CustomerID>2</CustomerID>
<FName>Joe</FName>
<LName>Users</LName>
</Customers>
</MyDataSet>

As you can see, all orders are now grouped under the subtree of the corresponding customer.

By default, the columns of a table are rendered in XML as elements. However, this is only a setting that can be adjusted on a per-column basis. The DataColumn object has a property named ColumnMapping that determines how the column is rendered in XML. The ColumnMapping property accepts the values of the MappingType enumeration listed below.
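For reference, the effect of the Nested property can be reproduced with a short C# sketch along these lines. The NestedRelationExample class and the relation name CustomerOrders are hypothetical; the tables and values mirror the listings above.

```csharp
using System;
using System.Data;

public static class NestedRelationExample
{
    // Builds the Customers/Orders DataSet used in the listings above.
    public static DataSet BuildCustomersAndOrders()
    {
        DataSet ds = new DataSet("MyDataSet");

        DataTable customers = ds.Tables.Add("Customers");
        customers.Columns.Add("CustomerID", typeof(int));
        customers.Columns.Add("FName", typeof(string));
        customers.Columns.Add("LName", typeof(string));

        DataTable orders = ds.Tables.Add("Orders");
        orders.Columns.Add("CustomerID", typeof(int));
        orders.Columns.Add("OrderID", typeof(string));

        customers.Rows.Add(1, "John", "Smith");
        customers.Rows.Add(2, "Joe", "Users");
        orders.Rows.Add(1, "000A01");
        orders.Rows.Add(1, "000B01");

        // Relate the two tables on CustomerID (parent column first).
        ds.Relations.Add("CustomerOrders",
            customers.Columns["CustomerID"], orders.Columns["CustomerID"]);
        return ds;
    }

    // Serializes the DataSet with the relation either nested or flat.
    // Note that this changes the Nested property on the DataSet passed in.
    public static string ToXml(DataSet ds, bool nested)
    {
        ds.Relations["CustomerOrders"].Nested = nested;
        return ds.GetXml();
    }
}
```

Comparing ToXml(ds, false) with ToXml(ds, true) shows the Orders elements moving from the end of the document into the Customers element they belong to.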


Element
Maps to an XML element:

<CustomerID>value</CustomerID>

Attribute
Maps to an XML attribute:

<Customers CustomerID="value">

Hidden
Does not appear in the XML data unless you are using the DiffGram format

SimpleContent
Maps to simple text:

<Customers>value</Customers>

If the XML output format is DiffGram, the Hidden mapping type is ignored. In this case, however, the DiffGram representation of the column contains a special attribute that marks the column as originally hidden for XML serialization. The SimpleContent mapping type is not always available: it can be used only when no other column of the table is mapped as an element, which in practice usually means a table with a single column.
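A small C# sketch of per-column mapping follows. The ColumnMappingExample class and the InternalNote column are hypothetical, introduced only to show one attribute-mapped column and one hidden column next to a default element column.

```csharp
using System;
using System.Data;

public static class ColumnMappingExample
{
    // Returns the XML for one customer row where CustomerID is rendered as an
    // attribute, FName as an element (the default), and InternalNote is hidden.
    public static string BuildXml()
    {
        DataSet ds = new DataSet("MyDataSet");
        DataTable customers = ds.Tables.Add("Customers");

        DataColumn id = customers.Columns.Add("CustomerID", typeof(int));
        customers.Columns.Add("FName", typeof(string));
        DataColumn note = customers.Columns.Add("InternalNote", typeof(string));

        id.ColumnMapping = MappingType.Attribute;
        note.ColumnMapping = MappingType.Hidden;

        customers.Rows.Add(1, "John", "not serialized");
        return ds.GetXml();
    }
}
```

In the string returned by BuildXml, CustomerID appears as an attribute of the Customers element, FName as a child element, and InternalNote does not appear at all.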

DiffGram format
A DiffGram is simply an XML string, written according to a specific schema, that represents the contents of a DataSet. It is not a .NET type. The following code fragment shows how to serialize a DataSet object to a DiffGram.

StreamWriter sw = new StreamWriter(fileName);
ds.WriteXml(sw, XmlWriteMode.DiffGram);
sw.Close();

The resulting XML code is placed in the <diffgr:diffgram> node and contains up to three different data sections, as follows:

<diffgr:diffgram>
<MyDataSet>
:
</MyDataSet>

<diffgr:before>
:
</diffgr:before>

<diffgr:errors>
:
</diffgr:errors>
</diffgr:diffgram>

The first section of the DiffGram is mandatory and represents the current instance of the data. It is almost identical to the XML output you get from ordinary serialization. The main difference between the two is that the DiffGram format never includes schema information.

The data section includes the current values of the rows in the DataSet. The original rows, including deleted rows, are stored in the <diffgr:before> section. Only modified or deleted records are listed there. Newly added records are listed only in the data instance because they have no previous version to link to. The rows in the two sections are tracked with a unique ID. Together, these rows represent the delta between the original version and the current version of the DataSet.

Finally, the <diffgr:errors> section lists any messages associated with pending errors on rows. Here too, rows are tracked using the same unique ID discussed for changes. DiffGram nodes can be marked with special attributes that correlate elements across the different sections (data instance, changes, and errors).


diffgr:hasChanges
The row has been modified (see the related row in <diffgr:before>) or inserted.

diffgr:hasErrors
The row has an error (see the related row in <diffgr:errors>).

diffgr:id
Identifies the ID used to pair rows across sections: TableName + RowIdentifier.

diffgr:parentId
Identifies the ID of the parent row of the current row.

diffgr:Error
Contains the error text for the row in <diffgr:errors>.

msdata:rowOrder
Tracks the ordinal position of the row in the DataSet.

diffgr:hidden
Identifies the columns marked as hidden, written as msdata:hiddenColumn="..."
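To see these sections and attributes in real output, a sketch along the following lines can help. The DiffGramExample class and its sample data are hypothetical. The code accepts the initial rows, then modifies one row and adds another so that the DiffGram has something to record.

```csharp
using System;
using System.Data;
using System.IO;

public static class DiffGramExample
{
    // Returns the DiffGram of a DataSet holding one unchanged row,
    // one modified row, and one newly added row.
    public static string WriteDiffGram()
    {
        DataSet ds = new DataSet("MyDataSet");
        DataTable customers = ds.Tables.Add("Customers");
        customers.Columns.Add("CustomerID", typeof(int));
        customers.Columns.Add("FName", typeof(string));

        customers.Rows.Add(1, "John");
        customers.Rows.Add(2, "Joe");
        ds.AcceptChanges();                      // both rows are now Unchanged

        customers.Rows[1]["FName"] = "Joseph";   // this row becomes Modified
        customers.Rows.Add(3, "Ann");            // this row is Added

        StringWriter writer = new StringWriter();
        ds.WriteXml(writer, XmlWriteMode.DiffGram);
        return writer.ToString();
    }
}
```

In the returned string, the modified and added rows carry a diffgr:hasChanges attribute in the data instance, and the original values of the modified row are stored under <diffgr:before>. The added row has no entry there.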

The ADO.NET framework provides explicit XML support only for DataSet objects. However, it is not particularly difficult to convert a DataView or a DataTable to XML. In both cases, you must use a temporary DataSet as a container for the rowset to be saved as XML. The code necessary to save a DataTable as XML is simple.

void WriteDataTableToXml(string fileName, DataTable dt)
{
    // Duplicate the table and add it to a temporary DataSet
    DataSet dsTmp = new DataSet();
    DataTable dtTmp = dt.Copy();
    dsTmp.Tables.Add(dtTmp);

    // Save the temporary DataSet to XML
    StreamWriter sw = new StreamWriter(fileName);
    dsTmp.WriteXml(sw);
    sw.Close();
}

Duplicating the DataTable object is necessary for a simple reason: each ADO.NET object can be referenced by only one container object. The same instance of a DataTable object, for example, cannot belong to two different DataSet objects.

Unlike a DataTable object, a DataView is not a standard part of a DataSet, so in order to save it to XML you should first convert the DataView to a DataTable object. This can be done with the following code fragment:

void DataViewToDataTable(DataView dv)
{
    // Clone the structure of the table behind the view
    DataTable dtTemp = dv.Table.Clone();
    dtTemp.TableName = "Row";   // This is arbitrary!

    // Populate the table with rows in the view
    foreach (DataRowView drv in dv)
        dtTemp.ImportRow(drv.Row);

    // Giving a custom name to the DataSet can help
    // come up with a clearer layout, but is not mandatory
    DataSet dsTemp = new DataSet(dv.Table.TableName);

    // Add the new table to a temporary DataSet
    dsTemp.Tables.Add(dtTemp);
}

The first step is to clone the structure of the table behind the DataView object being processed. Next, you iterate through all the records in the view and add the corresponding rows to the temporary DataTable. The DataTable is then added to the temporary DataSet, which is then serialized. You can also give the DataSet the name of the table to obtain a custom layout for the overall XML output. For example:

<TableName>
<Row>
<Column1>...</Column1>
:
</Row>
<Row>
:
</Row>
<Row>
:
</Row>
</TableName>

XmlDataDocument class
The XML and ADO.NET frameworks provide a unified model for accessing data expressed both as XML and as relational data. The key XML class here is XmlDataDocument, and the key ADO.NET class is DataSet. Specifically, XmlDataDocument inherits from the base class XmlDocument and differs from it only in its ability to stay synchronized with a DataSet object. When synchronized, the DataSet and XmlDataDocument classes target the same rowset, and you can apply changes through either interface (nodes or relational tables) and have both classes see the changes immediately. Basically, DataSet and XmlDataDocument provide two sets of tools for the same data. As a result, you can apply XSLT transformations to relational data, query relational data through XPath expressions, and use SQL-like queries to select XML nodes.

You can bind a DataSet object and an XmlDataDocument object together in several ways. The first is to pass a non-empty DataSet object to the constructor of the XmlDataDocument class.

XmlDataDocument doc = new XmlDataDocument(dataset);

Like its base class, XmlDataDocument provides an XML DOM approach to working with XML data, so it is quite different from XML readers and writers. The following example shows another way to synchronize the two objects, which is to create a valid, non-empty DataSet object from a non-empty instance of the XML DOM.

XmlDataDocument doc = new XmlDataDocument();
doc.Load(fileName);
DataSet dataset = doc.DataSet;

You can use the DataSet property of XmlDataDocument to turn an XML document into a DataSet object. This property instantiates, fills, and returns the DataSet object. The DataSet becomes associated with the XmlDataDocument the first time you access the DataSet property. The methods GetElementFromRow and GetRowFromElement switch between the XML form and the relational view of the data. To view the XML data from a relational perspective, you must first specify the schema to use for the data mapping. This can be done by calling the ReadXmlSchema method on the same XML file. Alternatively, you can manually create the required tables and columns in the DataSet.

There is yet another way to synchronize XmlDataDocument and DataSet objects, which is to start with both of them empty and populate them afterward. For example:

DataSet dataset = new DataSet();
XmlDataDocument xmldoc = new XmlDataDocument(dataset);
xmldoc.Load("File.xml");

Keeping the two objects synchronized provides unprecedented flexibility. As mentioned earlier, you can move between records using two distinct types of navigation. In fact, you can use SQL-like queries against XML nodes, as well as XPath queries on relational rows.

Not all XML files can be successfully synchronized with a DataSet. To be synchronizable, an XML document must have a regular, tabular structure that can be mapped to a relational structure, in which each row has the same number of columns. When the XML document is rendered as a DataSet object, it loses any XML-specific information that it may contain and that has no relational counterpart. This information includes comments, declarations, and processing instructions.

Summary
In ADO.NET, XML is much more than a simple output format for serializing content. You can use XML to serialize the entire contents of a DataSet object, but you can also choose the actual XML schema and control the structure of the resulting XML document. You can control which contents of the DataSet are written, including tables and relations, decide whether schema information goes into the final document, and even use the DiffGram format.

The interaction and integration of ADO.NET with XML goes further still. In particular, in .NET you can provide and take advantage of two independent views of the same data that follow different logical data representations.

Dialog Box: Using GetChanges for batch update
I've found that the DataSet programming interface provides a method called GetChanges, which returns a smaller DataSet populated only with the updated rows of all the tables it contains. This made me think that using this smaller DataSet instead of the original one could improve performance. However, in a previous article, whose title and source I can't recall, you mentioned that this approach can lead to unhandled exceptions. So my question is: can you explain more clearly how the DataSet's GetChanges method should be used in batch updates?

The ADO.NET batch update is based on a loop that iterates through the rows of the specified table. The code checks the state of each row and determines which action to perform. The loop works on the DataSet and table that you supply as arguments to the adapter's Update method. Whether you call Update on the original DataSet or on the smaller DataSet returned by GetChanges, the results are roughly the same. Using the smaller DataSet yields only a minimal optimization: it merely shortens the loop.

During a batch update, rows are processed one at a time, moving from the middle tier to the data server. There is no snapshot of the data that is sent to the database in one shot or as a single block. If that were the case, using GetChanges would indeed produce much more optimized code.

The key parameter that determines how many significant operations are performed during a batch update is the number of modified rows. This parameter does not change, whether you use the original DataSet or the DataSet returned by GetChanges.

Conversely, if you run the batch update on a DataSet returned by GetChanges, you may run into serious problems when a conflict is detected. In this case, the rows processed before the failed row are committed normally, but not on the original DataSet! To keep the application consistent, you must also accept the changes for the committed rows on the original DataSet. You have to write this code entirely yourself. In summary, if you use the original DataSet, the batch update code is much simpler.
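The extra bookkeeping mentioned above might look roughly like the following C# sketch. It is only an illustration under several assumptions: the GetChangesExample class and the Customers table are hypothetical, no database or data adapter is involved, rows that failed to update are assumed to be flagged with row errors, and deleted rows are not handled.

```csharp
using System;
using System.Data;

public static class GetChangesExample
{
    // Builds a DataSet with three accepted rows, then edits one and adds one.
    public static DataSet BuildEdited()
    {
        DataSet ds = new DataSet("MyDataSet");
        DataTable customers = ds.Tables.Add("Customers");
        DataColumn id = customers.Columns.Add("CustomerID", typeof(int));
        customers.Columns.Add("FName", typeof(string));
        customers.PrimaryKey = new DataColumn[] { id };

        customers.Rows.Add(1, "John");
        customers.Rows.Add(2, "Joe");
        customers.Rows.Add(3, "Ann");
        ds.AcceptChanges();

        customers.Rows[1]["FName"] = "Joseph";   // Modified
        customers.Rows.Add(4, "Mary");           // Added
        return ds;
    }

    // After the smaller DataSet has been submitted, accepts the same changes
    // on the original DataSet for every row that was committed without errors.
    public static void AcceptCommitted(DataSet original, DataSet submitted)
    {
        DataTable originalTable = original.Tables["Customers"];
        foreach (DataRow row in submitted.Tables["Customers"].Rows)
        {
            if (row.HasErrors)
                continue;   // assumption: failed rows carry a row error

            DataRow match = originalTable.Rows.Find(row["CustomerID"]);
            if (match != null)
                match.AcceptChanges();
        }
    }
}
```

With the original DataSet, none of this is needed, because the rows that the update commits are the very rows the application keeps working with.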

Dino Esposito is Wintellect's ADO.NET expert and a trainer and consultant based in Rome, Italy. Dino is a contributing editor for MSDN Magazine and writes the Cutting Edge column. He also writes regularly for Developer Network Journal and MSDN News. Dino is the author of the upcoming Microsoft Press book Building Web Solutions with ASP.NET and ADO.NET and one of the founders of http://www.vb2themax.com/. If you want to contact Dino, you can send e-mail to dinoe@wintellect.com.
