XML Features in ADO.NET



XML and its related technologies (including XPath, XSL Transformations, and XML Schema) are without doubt at the foundation of ADO.NET. Compared with ADO, the ADO.NET object model is far more interoperable, and XML is the key element that makes this possible. In ADO, XML was only a (non-default) I/O format used to persist the contents of a disconnected recordset. In ADO.NET, XML is much more deeply involved in how the framework is built and how you interact with it. The stronger interaction and integration between ADO.NET and XML can be summarized in the following points:

• Object serialization and remoting

• Dual programming interface

• XML-driven batch update (SQL Server 2000 only)

In ADO.NET, you can save objects to XML documents and restore them from XML documents in several ways. Strictly speaking, this capability belongs only to the DataSet object, but it can be extended to other container objects with minimal code. Saving objects such as DataTable and DataView to XML is essentially a special case of DataSet serialization.

In addition, the ADO.NET and XML classes provide a unified intermediate API that is exposed to programmers through a synchronized dual programming interface. You can access and update data using either the node-based hierarchical approach of XML or the relational approach of column-based tabular DataSets. You can switch from the DataSet representation of the data to the XML DOM at any time, and vice versa. The data is synchronized, and any change you make in one model is immediately reflected and visible in the other. In this article, I will cover ADO.NET-to-XML serialization and XML data access, that is, the first two points in the list above. Next month, I will focus on XML-driven batch update, one of the coolest features you get from the SQL Server 2000 XML extensions (SQLXML 2.0).

DataSet and XML
Like any other .NET object, a DataSet object is stored in memory in a binary format. Unlike other objects, however, a DataSet is always remoted and serialized in a special XML format called DiffGram. When a DataSet crosses an application domain boundary or the physical boundary of a machine, it is automatically rendered as a DiffGram. At the destination, the DataSet is silently rebuilt as a binary object and is immediately usable. Applications can reach the same serialization facilities through a set of methods, two of which clearly stand out: ReadXml and WriteXml. The following table lists the methods you can use to read and write a DataSet as XML.

GetXml
• Returns a string that is the XML representation of the data stored in the DataSet.

• Does not include any schema information.

GetXmlSchema
• Returns a string that is the XML schema information of the DataSet.

ReadXml
• Fills the DataSet object with the XML data read from the specified stream or file.

ReadXmlSchema
• Loads the specified XML schema information into the current DataSet object.

WriteXml
• Writes the XML data (and, optionally, the schema) that represents the DataSet.

• Can write to a stream or a file.

WriteXmlSchema
• Writes the XML schema information of the DataSet.

• Can write to a stream or a file.

As the table above shows, when working with DataSets and XML you can manage data and schema information as distinct entities. You can retrieve the XML schema from the DataSet as a string, write it to a disk file, or load it into an empty DataSet object. The DataSet object also has two XML-related properties: Namespace and Prefix. Namespace determines the XML namespace used to scope XML attributes and elements when they are read into the DataSet. The prefix that serves as an alias for the namespace is stored in the Prefix property.
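As a rough illustration of handling schema and data separately, here is a minimal sketch. The table and column names are made up for the example; only GetXml, GetXmlSchema, and ReadXmlSchema come from the table above.

```csharp
using System;
using System.Data;
using System.IO;

class SchemaRoundTrip
{
    static void Main()
    {
        // Hypothetical sample data; names are illustrative only.
        DataSet source = new DataSet("MyDataSet");
        DataTable customers = source.Tables.Add("Customers");
        customers.Columns.Add("CustomerID", typeof(int));
        customers.Columns.Add("LName", typeof(string));
        customers.Rows.Add(1, "Smith");

        // Schema and data come back as two independent strings.
        string schema = source.GetXmlSchema();
        string data = source.GetXml();

        // Load only the schema into an empty DataSet: structure, but no rows.
        DataSet target = new DataSet();
        target.ReadXmlSchema(new StringReader(schema));

        Console.WriteLine("Columns copied: " + target.Tables["Customers"].Columns.Count);
        Console.WriteLine("Rows copied: " + target.Tables["Customers"].Rows.Count);
        Console.WriteLine(data);
    }
}
```

The target DataSet ends up with the table layout of the source but none of its rows, which is the "schema as a separate entity" idea in practice.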

Build a DataSet from XML
The ReadXml method fills a DataSet object by reading from a variety of sources, including disk files, .NET streams, and instances of XmlReader objects. The method can process any type of XML file. Of course, if the XML file has a fairly irregular, non-tabular structure, some problems may arise when it is rendered as rows and columns.

The ReadXml method has several overloads, all of which are similar. They take the XML source and an optional XmlReadMode value as arguments. For example:

public XmlReadMode ReadXml(string, XmlReadMode);

This method creates the relational schema for the DataSet based on the specified read mode and on whether a schema already exists in the DataSet. The following code snippet shows typical code for loading a DataSet from XML.

StreamReader sr = new StreamReader(fileName);
DataSet ds = new DataSet();
ds.ReadXml(sr); // defaults to XmlReadMode.Auto
sr.Close();

When the contents of the XML source are loaded into the DataSet, ReadXml does not merge rows whose primary key information matches. To merge an existing DataSet with one loaded from XML, you must first create a new DataSet and then merge the two using the Merge method. During the merge, the rows that get overwritten are those with matching primary keys. An alternative way to merge an existing DataSet with content read from XML is through the DiffGram format (discussed in detail later).
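The load-then-merge approach might look like the following sketch. The CreateCustomers helper, the sample XML, and all table and column names are assumptions made for illustration; the point is only that the XML is read into a second DataSet and then combined with the first through Merge.

```csharp
using System;
using System.Data;
using System.IO;

class MergeFromXml
{
    // Hypothetical helper: builds an empty DataSet with a keyed Customers table.
    static DataSet CreateCustomers()
    {
        DataSet ds = new DataSet("MyDataSet");
        DataTable t = ds.Tables.Add("Customers");
        DataColumn id = t.Columns.Add("CustomerID", typeof(int));
        t.Columns.Add("LName", typeof(string));
        t.PrimaryKey = new DataColumn[] { id };
        return ds;
    }

    static void Main()
    {
        DataSet existing = CreateCustomers();
        existing.Tables["Customers"].Rows.Add(1, "Smith");

        string xml =
            "<MyDataSet>" +
            "<Customers><CustomerID>1</CustomerID><LName>Smythe</LName></Customers>" +
            "<Customers><CustomerID>2</CustomerID><LName>Users</LName></Customers>" +
            "</MyDataSet>";

        // Step 1: load the XML into a separate DataSet with the same schema.
        DataSet loaded = CreateCustomers();
        loaded.ReadXml(new StringReader(xml), XmlReadMode.IgnoreSchema);

        // Step 2: merge; rows are matched on the primary key.
        existing.Merge(loaded);

        foreach (DataRow row in existing.Tables["Customers"].Rows)
            Console.WriteLine("{0}: {1}", row["CustomerID"], row["LName"]);
    }
}
```

Because both DataSets define CustomerID as the primary key, the incoming row for customer 1 replaces the existing one instead of producing a duplicate.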

The following table describes the read modes supported by ReadXml. You set them through the XmlReadMode enumeration.

IgnoreSchema
Ignores any embedded schema and relies on the DataSet's existing schema

ReadSchema
Reads any embedded schema and loads both the data and the schema

InferSchema
Ignores any embedded schema and infers the schema from the XML data

DiffGram
Reads a DiffGram and adds the data to the current schema

Fragment
Reads and adds XML fragments until the end of the stream is reached

The default read mode, XmlReadMode.Auto, is not listed in the table. When this mode is set, or when no read mode is set explicitly, the ReadXml method examines the XML source and chooses the most appropriate option.

If the XML source is a DiffGram, it is loaded as a DiffGram. If the source contains an embedded schema or a reference to an external schema, it is loaded using ReadXmlSchema. Finally, if the XML source contains no schema information, ReadXml infers the schema using the DataSet's InferXmlSchema method. The relational structure (schema) of a DataSet consists of tables, columns, constraints, and relations. Let's look at what happens when each mode is set.

The XmlReadMode.IgnoreSchema option causes the method to ignore any embedded or referenced schema. The data is therefore loaded into the DataSet's existing schema, and any data that does not fit is discarded. If the DataSet has no schema, no data is loaded. Note that an empty DataSet has no schema information. Also bear in mind that if the XML source is in DiffGram format, the IgnoreSchema option has the same effect as XmlReadMode.DiffGram.

// No schema in the DataSet, so no data will be loaded
DataSet ds = new DataSet();
StreamReader sr = new StreamReader(fileName);
ds.ReadXml(sr, XmlReadMode.IgnoreSchema);
sr.Close();

The XmlReadMode.ReadSchema option works only with embedded schemas and does not recognize external references. It can add new tables to the DataSet, but an exception is thrown if any table defined in the embedded schema already exists in the DataSet. You cannot use the ReadSchema option to change the schema of an existing table. If the DataSet contains no schema (that is, the DataSet is empty) and there is no embedded schema, no data is read or loaded. ReadXml can read only embedded schemas defined with XML Schema Definition language (XSD) or XML-Data Reduced (XDR). Document type definitions (DTDs) are not supported.

If the XmlReadMode.InferSchema option is set, ReadXml infers the schema directly from the structure of the XML data and ignores any embedded schema that may be present. The data is loaded only after the schema has been inferred. The existing schema can be extended by adding new tables as needed or by adding new columns to existing tables. You can use the DataSet's InferXmlSchema method to load the schema from a specified XML file into the DataSet. To some extent, you can control which XML elements are processed during schema inference: the signature of InferXmlSchema lets you specify an array of namespaces whose elements will be excluded from inference.

void InferXmlSchema(string fileName, string[] rgNamespace);
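A small sketch of namespace exclusion during inference follows. The XML and the "urn:example:hints" namespace are invented for the example, and it uses the TextReader overload of InferXmlSchema rather than the file name overload shown above so that it is self-contained.

```csharp
using System;
using System.Data;
using System.IO;

class InferSchemaSketch
{
    static void Main()
    {
        // Illustrative XML; the "urn:example:hints" namespace is made up for this sketch.
        string xml =
            "<MyDataSet xmlns:h='urn:example:hints'>" +
            "<Customers><CustomerID h:type='int'>1</CustomerID><LName>Smith</LName></Customers>" +
            "</MyDataSet>";

        DataSet ds = new DataSet();

        // Anything qualified with the listed namespace is left out of the inference.
        ds.InferXmlSchema(new StringReader(xml), new string[] { "urn:example:hints" });

        foreach (DataTable table in ds.Tables)
            foreach (DataColumn column in table.Columns)
                Console.WriteLine("{0}.{1}", table.TableName, column.ColumnName);

        // InferXmlSchema loads structure only; the rows are read in a second pass.
        ds.ReadXml(new StringReader(xml), XmlReadMode.IgnoreSchema);
        Console.WriteLine("Rows loaded: " + ds.Tables["Customers"].Rows.Count);
    }
}
```

Listing the inferred tables and columns is a quick way to check what the inference engine made of a given document before loading any data.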

DiffGram is the XML format that ADO.NET uses to persist the state of a DataSet. Like the SQLXML Updategram format, a DiffGram contains both the current and the original state of the data rows. When a DiffGram is loaded with ReadXml, rows with matching primary keys are merged. You can use the XmlReadMode.DiffGram flag to tell ReadXml explicitly that it is working on a DiffGram. When the DiffGram format is used, the target DataSet must have the same schema as the DiffGram; otherwise, the merge operation fails and an exception is thrown.

If the XmlReadMode.Fragment option is set, the DataSet is loaded from an XML fragment. An XML fragment is a valid piece of XML that identifies an element, an attribute, or a document. The fragment for an element is the markup text that fully qualifies the XML element (node, CDATA, processing instruction, or comment). The fragment for an attribute is the attribute value, and the fragment for a document is the entire content set. When the XML data is a fragment, the root-level rules for well-formed XML documents are not applied. Fragments that match the existing schema are appended to the appropriate tables; fragments that do not match are discarded. ReadXml reads from the current position to the end of the stream. The XmlReadMode.Fragment option should not be used to populate an empty, schema-less DataSet.

Serialize a DataSet object to XML
The XML representation of a DataSet can be written to a file, a stream, an XmlWriter object, or a string using the WriteXml method. The XML representation can include or omit schema information. The actual behavior of WriteXml is controlled by the XmlWriteMode parameter you can pass; the values of the XmlWriteMode enumeration determine the layout of the output. The DataSet representation includes table, relation, and constraint definitions. Unless you choose the DiffGram format, only the current version of the rows in the DataSet's tables is written. The following table lists the write options available through XmlWriteMode.

IgnoreSchema
Writes the DataSet contents as XML data without a schema

WriteSchema
Writes the DataSet contents with an embedded XSD schema

DiffGram
Writes the DataSet contents as a DiffGram, including both original and current values

XmlWriteMode.IgnoreSchema is the default option. The following code shows a typical way to serialize a DataSet to XML.

// ds is the DataSet
StreamWriter sw = new StreamWriter(fileName);
ds.WriteXml(sw); // defaults to XmlWriteMode.IgnoreSchema
sw.Close();

Several factors affect the final structure of the XML document created from the DataSet object. These factors include:

• The overall XML format used: DiffGram or a plain hierarchical representation of the current content

• The presence of schema information

• Nested relations

• How table columns are mapped to XML elements

The DiffGram format is a special XML format that I will describe later. It does not include schema information, but it preserves row state and row errors. It therefore provides a closer representation of a live DataSet instance.

If schema information is present in the DataSet being written, it is always written as embedded XSD. You cannot write it as XDR or DTD, or add a reference to an external file. The root node of the resulting XML file takes the name of the DataSet, or NewDataSet if no name has been specified. The following code snippet is an example of the XML representation of a DataSet object made up of two tables, Customers and Orders, related through the CustomerID field.

<MyDataSet>
  <xs:schema ... />
  <Customers>
    <CustomerID>1</CustomerID>
    <FName>John</FName>
    <LName>Smith</LName>
  </Customers>
  <Customers>
    <CustomerID>2</CustomerID>
    <FName>Joe</FName>
    <LName>Users</LName>
  </Customers>
  <Orders>
    <CustomerID>1</CustomerID>
    <OrderID>000a01</OrderID>
  </Orders>
  <Orders>
    <CustomerID>1</CustomerID>
    <OrderID>000b01</OrderID>
  </Orders>
</MyDataSet>

From the listing above it is hard to tell how the two tables are related. Some information about the relation is set in the <xs:schema> tree, but nothing else helps you draw that conclusion. Put into words, the relation set on the CustomerID field lets you retrieve all the orders placed by a given customer. The XML tree above gives no immediate representation of that information. When the DataSet contains data relations, you can change the order of the nodes by setting the Nested property of the DataRelation object to true. With that change, the resulting XML looks like this:

<MyDataSet>
  <xs:schema ... />
  <Customers>
    <CustomerID>1</CustomerID>
    <FName>John</FName>
    <LName>Smith</LName>
    <Orders>
      <CustomerID>1</CustomerID>
      <OrderID>000a01</OrderID>
    </Orders>
    <Orders>
      <CustomerID>1</CustomerID>
      <OrderID>000b01</OrderID>
    </Orders>
  </Customers>
  <Customers>
    <CustomerID>2</CustomerID>
    <FName>Joe</FName>
    <LName>Users</LName>
  </Customers>
</MyDataSet>

As you can see, all orders are now concentrated under the corresponding customer tree.
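For reference, a nested relation of this kind could be set up roughly as follows. The relation name "CustomerOrders" and the sample rows are assumptions made for the sketch.

```csharp
using System;
using System.Data;

class NestedRelationSketch
{
    static void Main()
    {
        DataSet ds = new DataSet("MyDataSet");

        DataTable customers = ds.Tables.Add("Customers");
        customers.Columns.Add("CustomerID", typeof(int));
        customers.Columns.Add("LName", typeof(string));

        DataTable orders = ds.Tables.Add("Orders");
        orders.Columns.Add("CustomerID", typeof(int));
        orders.Columns.Add("OrderID", typeof(string));

        // "CustomerOrders" is an arbitrary name chosen for this example.
        DataRelation relation = ds.Relations.Add(
            "CustomerOrders",
            customers.Columns["CustomerID"],
            orders.Columns["CustomerID"]);

        // With Nested set, child rows are written inside their parent element.
        relation.Nested = true;

        customers.Rows.Add(1, "Smith");
        customers.Rows.Add(2, "Users");
        orders.Rows.Add(1, "000a01");
        orders.Rows.Add(1, "000b01");

        Console.WriteLine(ds.GetXml());
    }
}
```

Setting Nested back to false and printing again makes it easy to compare the flat and nested layouts side by side.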

By default, the columns of a table are rendered in XML as node elements. This is only a setting, however, and it can be adjusted column by column. The DataColumn object has a property named ColumnMapping that determines how the column is rendered in XML. The ColumnMapping property accepts the values of the MappingType enumeration listed below.

Element
Maps to an XML node element:

<CustomerID>value</CustomerID>

Attribute
Maps to an XML node attribute:

<Customers CustomerID="value">

Hidden
Not rendered in the XML data unless the DiffGram format is used

SimpleContent
Maps to simple text:

<Customers>value</Customers>

If the XML output format is DiffGram, the Hidden mapping type is ignored. In that case, however, the DiffGram representation of the column contains a special attribute that marks the column as originally hidden for XML serialization. The SimpleContent mapping type is not always available; it can be used only when the table has a single column.
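The effect of the mapping types can be checked with a short sketch such as the one below. The InternalNote column and the sample values are hypothetical.

```csharp
using System;
using System.Data;

class ColumnMappingSketch
{
    static void Main()
    {
        DataSet ds = new DataSet("MyDataSet");
        DataTable customers = ds.Tables.Add("Customers");
        customers.Columns.Add("CustomerID", typeof(int));
        customers.Columns.Add("LName", typeof(string));
        customers.Columns.Add("InternalNote", typeof(string)); // hypothetical column
        customers.Rows.Add(1, "Smith", "not for export");

        // Render the key as an attribute and keep the note out of the plain XML output.
        customers.Columns["CustomerID"].ColumnMapping = MappingType.Attribute;
        customers.Columns["InternalNote"].ColumnMapping = MappingType.Hidden;

        // LName keeps the default mapping, MappingType.Element.
        Console.WriteLine(ds.GetXml());
    }
}
```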

DiffGram format
A DiffGram is simply an XML string written according to a specific schema that represents the contents of a DataSet. It is by no means a .NET type. The following code snippet shows how to serialize a DataSet object to a DiffGram.

StreamWriter sw = new StreamWriter(fileName);
ds.WriteXml(sw, XmlWriteMode.DiffGram);
sw.Close();

The resulting XML code is rooted in a <diffgr:diffgram> node and contains up to three distinct data sections, as shown below:

<diffgr:diffgram>
  <MyDataSet>
  :
  </MyDataSet>

  <diffgr:before>
  :
  </diffgr:before>

  <diffgr:errors>
  :
  </diffgr:errors>
</diffgr:diffgram>

The first section of the DiffGram is mandatory and represents the current instance of the data. It is nearly identical to the XML output you get from ordinary serialization. The main difference between the two is that the DiffGram format never includes schema information.

This data section contains the current values of the rows in the DataSet. The original rows, including deleted rows, are stored in the <diffgr:before> section. Only modified or deleted records are listed there. Newly added records are listed only in the data instance, because they have no previous version to link to. The rows in the two sections are tracked through a unique ID. Together, these rows represent the delta between the original and current versions of the DataSet.

Finally, the <diffgr:errors> section lists any messages related to pending errors on rows. Here too, rows are tracked through the same unique ID discussed for changes. DiffGram nodes can be marked with special attributes that relate elements across the different sections (data instance, changes, and errors).

diffgr:hasChanges
The row has been modified (see the related row in <diffgr:before>) or inserted.

diffgr:hasErrors
The row has an error (see the related row in <diffgr:errors>).

diffgr:id
Identifies the ID used to couple rows across sections: TableName + RowIdentifier.

diffgr:parentId
Identifies the ID of the parent row of the current row.

diffgr:error
Contains the error text for the row in <diffgr:errors>.

msdata:rowOrder
Tracks the ordinal position of the row in the DataSet.

diffgr:hidden
Indicates whether a column was marked as hidden: msdata:hiddenColumn=...
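To see these sections and attributes in practice, a sketch like the following writes a DiffGram for a DataSet that has one modified row, one row with an error, and one added row, then reads it back. All names and values are illustrative, and DataSet.Clone is used here only as a convenient way to obtain a target with the same schema.

```csharp
using System;
using System.Data;
using System.IO;

class DiffGramSketch
{
    static void Main()
    {
        DataSet ds = new DataSet("MyDataSet");
        DataTable customers = ds.Tables.Add("Customers");
        DataColumn id = customers.Columns.Add("CustomerID", typeof(int));
        customers.Columns.Add("LName", typeof(string));
        customers.PrimaryKey = new DataColumn[] { id };
        customers.Rows.Add(1, "Smith");
        customers.Rows.Add(2, "Users");
        ds.AcceptChanges();                          // both rows are now Unchanged

        customers.Rows[0]["LName"] = "Smythe";       // becomes Modified
        customers.Rows[1].RowError = "Sample error"; // pending row error
        customers.Rows.Add(3, "Brown");              // Added

        // Serialize current values, original values, and errors.
        StringWriter sw = new StringWriter();
        ds.WriteXml(sw, XmlWriteMode.DiffGram);
        string diffGram = sw.ToString();
        Console.WriteLine(diffGram);

        // Read the DiffGram back into a DataSet that has the same schema.
        DataSet restored = ds.Clone();               // structure only, no rows
        restored.ReadXml(new StringReader(diffGram), XmlReadMode.DiffGram);

        foreach (DataRow row in restored.Tables["Customers"].Rows)
            Console.WriteLine("{0}: {1}", row["CustomerID"], row.RowState);
    }
}
```

Comparing the printed XML with the attribute table above shows which rows carry diffgr:hasChanges and which appear in the before and errors sections.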

The ADO.NET Framework provides explicit XML support only for DataSet objects. However, converting a DataView or a DataTable to XML is not particularly difficult. In both cases, you must use a temporary DataSet as the container for the set of rows to be saved as XML. The code needed to save a DataTable as XML is straightforward.

void WriteDataTableToXml(string fileName, DataTable dt)
{
    // Duplicate the table and add it to a temporary DataSet
    DataSet dsTmp = new DataSet();
    DataTable dtTmp = dt.Copy();
    dsTmp.Tables.Add(dtTmp);

    // Save the temporary DataSet to XML
    StreamWriter sw = new StreamWriter(fileName);
    dsTmp.WriteXml(sw);
    sw.Close();
}

Each ADO.NET object can be referenced by only one container object. For this simple reason, copying the DataTable object is essential. You cannot have the same instance of, say, a DataTable object belong to two different DataSet objects.

Unlike the DataTable object, a DataView is not a constituent part of the DataSet. To save it to XML, you should therefore turn the DataView into a table object. This can be done with the following code snippet:

void DataViewToDataTable(DataView dv)
{
    // Clone the structure of the table behind the view
    DataTable dtTemp = dv.Table.Clone();
    dtTemp.TableName = "row"; // This is arbitrary!

    // Populate the table with rows in the view
    foreach (DataRowView drv in dv)
        dtTemp.ImportRow(drv.Row);

    // Giving a custom name to the DataSet can help
    // come up with a clearer layout but is not mandatory
    DataSet dsTemp = new DataSet(dv.Table.TableName);

    // Add the new table to a temporary DataSet
    dsTemp.Tables.Add(dtTemp);
}

The first step is to clone the structure of the table behind the DataView object being processed. Next, you walk through all the records in the view and add the corresponding rows to the temporary DataTable. You then add the DataTable to a temporary DataSet and serialize it. You can also experiment with the names given to the table and the DataSet to customize the layout of the XML output.
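Put together, the steps just described might be sketched as below. The DataViewToXml helper is a hypothetical variant of the snippet above that returns the XML string instead of discarding the temporary DataSet; the sample table and filter are also made up.

```csharp
using System;
using System.Data;

class DataViewToXmlSketch
{
    // Hypothetical helper: clone, import the view's rows, wrap in a DataSet, serialize.
    static string DataViewToXml(DataView dv)
    {
        DataTable dtTemp = dv.Table.Clone();
        dtTemp.TableName = "Row"; // arbitrary element name for each row

        foreach (DataRowView drv in dv)
            dtTemp.ImportRow(drv.Row);

        // The DataSet name becomes the root element of the output.
        DataSet dsTemp = new DataSet(dv.Table.TableName);
        dsTemp.Tables.Add(dtTemp);
        return dsTemp.GetXml();
    }

    static void Main()
    {
        DataTable customers = new DataTable("Customers");
        customers.Columns.Add("CustomerID", typeof(int));
        customers.Columns.Add("LName", typeof(string));
        customers.Rows.Add(1, "Smith");
        customers.Rows.Add(2, "Users");

        // Only rows that pass the view's filter end up in the XML.
        DataView dv = new DataView(customers);
        dv.RowFilter = "CustomerID > 1";

        Console.WriteLine(DataViewToXml(dv));
    }
}
```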

XmlDataDocument class
The XML and ADO.NET frameworks provide a unified model for accessing data both as XML and as relational data. The key XML class is XmlDataDocument, while DataSet is the key ADO.NET class. Specifically, XmlDataDocument inherits from the base class XmlDocument and differs from it in its ability to stay synchronized with a DataSet object. When synchronized, the DataSet and XmlDataDocument classes work on the same set of rows; you can apply changes through either interface (nodes or relational tables), and both classes see them immediately. Basically, DataSet and XmlDataDocument provide two approaches to the same data. As a result, you can apply XSLT transformations to relational data, query relational data with XPath expressions, and use SQL to select XML nodes.

You can bind a DataSet object to an XmlDataDocument object in several ways. The first is to pass a non-empty DataSet object to the constructor of the XmlDataDocument class.

XmlDataDocument doc = new XmlDataDocument(dataset);

Like its base class, XmlDataDocument provides an XML DOM approach to working with XML data, and is therefore quite different from XML readers and writers. The following example shows another way to synchronize the two objects: creating a valid, non-empty DataSet object from a non-empty instance of the XML DOM.

XmlDataDocument doc = new XmlDataDocument();
doc.Load(fileName);
DataSet dataset = doc.DataSet;

You can turn an XML document into a DataSet object through the DataSet property of XmlDataDocument. The property instantiates, populates, and returns a DataSet object. The DataSet is associated with the XmlDataDocument the first time you access the DataSet property. The GetElementFromRow and GetRowFromElement methods switch between the XML and relational views of the data. To view the XML data relationally, you must first specify the schema to use for data mapping. You can do this by calling the ReadXmlSchema method on the same XML file. Alternatively, you can manually create the required tables and columns in the DataSet.

There is one more way to synchronize XmlDataDocument and DataSet objects: populate them separately while both are empty. For example:

DataSet dataset = new DataSet();
XmlDataDocument xmlDoc = new XmlDataDocument(dataset);
xmlDoc.Load("file.xml");

Synchronizing the two objects provides unprecedented flexibility. As mentioned above, you can use two radically different types of navigation to move through the records. In fact, you can use SQL-like queries on XML nodes and XPath queries on relational rows.
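Both directions can be tried with a sketch along these lines. The sample data and the XPath expression are assumptions; note also that XmlDataDocument is marked obsolete in later versions of .NET, so a compiler warning is to be expected there.

```csharp
using System;
using System.Data;
using System.Xml;

class SyncSketch
{
    static void Main()
    {
        DataSet ds = new DataSet("MyDataSet");
        DataTable customers = ds.Tables.Add("Customers");
        customers.Columns.Add("CustomerID", typeof(int));
        customers.Columns.Add("LName", typeof(string));
        customers.Rows.Add(1, "Smith");
        customers.Rows.Add(2, "Users");

        // Bind the XML DOM view to the same data.
        XmlDataDocument doc = new XmlDataDocument(ds);

        // XPath query over relational rows, then back to the DataRow.
        XmlElement element = (XmlElement)doc.SelectSingleNode(
            "/MyDataSet/Customers[CustomerID='2']");
        DataRow row = doc.GetRowFromElement(element);
        Console.WriteLine(row["LName"]);

        // SQL-like filter over the table, then over to the XML element.
        DataRow[] matches = customers.Select("LName = 'Smith'");
        XmlElement smith = doc.GetElementFromRow(matches[0]);
        Console.WriteLine(smith.OuterXml);
    }
}
```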

Not all XML files can be successfully synchronized with a DataSet. For synchronization to work, the XML document must have a regular, table-like structure that can be mapped to a relational structure in which every row has the same number of columns. When rendered as DataSet objects, XML documents lose any XML-specific information they may have that has no relational counterpart. Such information includes comments, declarations, and processing instructions.

Summary
In ADO.NET, XML is much more than a simple output format for serializing content. You can use XML to serialize the entire contents of a DataSet object, but you can also choose the actual XML schema and control the structure of the resulting XML document. You can save the contents of a DataSet, including tables and relations, include schema information in the final document, and even use the DiffGram format.

ADO.NET's interaction and integration with XML goes further than that. In particular, in .NET you can simultaneously provide and work with two distinct but synchronized views of the same data, each following a different logical data representation.

Dialog Box: Using GetChanges for batch update
I noticed that the DataSet programming interface provides the GetChanges method, which returns a smaller DataSet populated only with the updated rows from all the contained tables. This made me think that using the smaller DataSet instead of the original one would improve performance. However, a previous article of yours (I can't remember its name or where it appeared) said that doing so could lead to some unexpected exceptions. So my question is: can you clarify how the DataSet's GetChanges method should be used in batch updates?

An ADO.NET batch update is a loop that walks through the rows of a specified table. The code checks each row's state and decides which operation to perform. The loop works on the DataSet and DataTable that you pass as arguments to the adapter's Update method. Whether you call Update on the original DataSet or on the smaller DataSet returned by GetChanges, the result is roughly the same. Using GetChanges yields only a minimal optimization: it merely shortens the loop.

During a batch update, rows are processed one at a time, going from the middle tier to the data server. No snapshot of the data is sent to the database in one shot or as a single block. If that were the case, using GetChanges would indeed result in much better optimized code.

The key parameter that determines how many significant operations are performed during a batch update is the number of modified rows. That number is the same whether you use the original DataSet or the one returned by GetChanges.

On the other hand, if you batch-update the DataSet returned by GetChanges, you can run into serious problems when a conflict is detected. In that case, the rows processed before the failing row are committed normally, but not in the original DataSet! To keep the application consistent, you must also accept the changes for the committed rows on the original DataSet, and that code is entirely up to you. All in all, the batch update code is much simpler if you work on the original DataSet.
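The bookkeeping problem can be reproduced without a database. In the sketch below, calling AcceptChanges on the subset stands in for a successful adapter update (an assumption made so the example runs on its own); everything else uses only the DataSet API.

```csharp
using System;
using System.Data;

class GetChangesSketch
{
    static void Main()
    {
        DataSet ds = new DataSet("MyDataSet");
        DataTable customers = ds.Tables.Add("Customers");
        DataColumn id = customers.Columns.Add("CustomerID", typeof(int));
        customers.Columns.Add("LName", typeof(string));
        customers.PrimaryKey = new DataColumn[] { id };
        customers.Rows.Add(1, "Smith");
        customers.Rows.Add(2, "Users");
        ds.AcceptChanges();

        customers.Rows[0]["LName"] = "Smythe";   // one pending change

        // GetChanges returns a separate DataSet holding copies of the changed rows
        // (or null when nothing has changed).
        DataSet changes = ds.GetChanges();
        Console.WriteLine("Rows to submit: " + changes.Tables["Customers"].Rows.Count);

        // Stand-in for a successful adapter update on the subset:
        // committed rows are accepted on the subset only.
        changes.AcceptChanges();

        // The original DataSet still believes the row is pending.
        Console.WriteLine("Original has changes: " + ds.HasChanges());

        // Reconciling the original is the caller's responsibility.
        ds.AcceptChanges();
        Console.WriteLine("After reconciling: " + ds.HasChanges());
    }
}
```

When the update runs directly on the original DataSet, the middle step disappears, which is the simplification the answer above recommends.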

Dino Esposito is an ADO.NET expert, trainer, and consultant for Wintellect, based in Rome, Italy. Dino is a contributing editor to MSDN Magazine and writes the Cutting Edge column. He also writes regularly for Developer Network Journal and MSDN News. Dino is the author of the upcoming Building Web Solutions with ASP.NET and ADO.NET from Microsoft Press, and one of the founders of http://www.vb2themax.com/. You can reach Dino at dinoe@wintellect.com.
