XML: the new world of report data

Tags: xpath

With the popularity of B/S systems and the deepening application of XML technology, more and more data is stored and exchanged wrapped in an XML "coat". Much of this data comes from databases, but after some processing it is more streamlined and closer to the application. If a reporting tool can consume XML data directly, it can avoid repeating database queries and report-data calculations, because the creator of the XML data has already done that work. XML is therefore a new world of report data.

Traditional reporting tools can only process relational databases and can hardly handle other data sources. As reporting tools evolved, some added the ability to process XML documents, but this requires programming: plug-ins must be written and configured. Such report programs have complex structures and many interfaces, and code must be written against various specifications. If there are many XML document formats, many plug-ins are needed, so the report development workload is considerable.

If instead a general-purpose method of processing XML documents and obtaining report data is implemented, then when customizing a report template you only need to tell the report engine how to obtain data from the XML document. For all, or at least most, XML documents no programming is required, which reduces the report development workload.

So how can we deal with XML documents with complex tree structures?

As you know, there are two ways to process an XML document: the DOM model and the streaming model. DOM processing is convenient to use, but it is slow and consumes memory; stream processing is fast and saves memory, but is inconvenient to use. On the .NET platform, the System.Xml.XmlDocument object processes XML documents in DOM mode, and XmlReader processes them in stream mode.
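To make the two models concrete, here is a minimal C# sketch on .NET (the file name data.xml is only a placeholder) that reads the same document first in DOM mode and then in stream mode:

    using System;
    using System.Xml;

    class XmlModes
    {
        static void Main()
        {
            // DOM mode: load the whole document into memory, then navigate it freely.
            var doc = new XmlDocument();
            doc.Load("data.xml");
            Console.WriteLine(doc.DocumentElement.Name);

            // Stream mode: read forward only, one node at a time, using little memory.
            using (var reader = XmlReader.Create("data.xml"))
            {
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.Element)
                        Console.WriteLine(reader.Name);
                }
            }
        }
    }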

The W3C designed XML to facilitate the storage and exchange of small data packets, without worrying much about data redundancy. So if a huge XML document appears in a real system, the use of XML there can in most cases be considered inappropriate. For this reason I think a reporting tool does not need to handle very large XML documents, and on that premise we can afford to process XML documents with the more convenient DOM model.

In .NET, when an XML document is loaded with XmlDocument, an XML object tree is formed with the XmlDocument as its root node, and the natural way to obtain data from it is XPath. In the XML tree structure, XPath takes some node as its starting node and uses a path description to move to another node, generally a lower-level one, for example a child or grandchild node, or one of its attributes.
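For example, on .NET a move from one node to a lower-level node via XPath looks roughly like this (the tiny inline document is only for illustration):

    using System;
    using System.Xml;

    class XPathDemo
    {
        static void Main()
        {
            var doc = new XmlDocument();
            doc.LoadXml("<rss><channel><title>Demo</title>" +
                        "<item><title>A</title></item>" +
                        "<item><title>B</title></item></channel></rss>");

            // Move from the document root down to the channel node with an XPath path.
            XmlNode channel = doc.SelectSingleNode("rss/channel");
            Console.WriteLine(channel.SelectSingleNode("title").InnerText);

            // Or select a whole set of lower-level nodes relative to the current node.
            foreach (XmlNode item in channel.SelectNodes("item"))
                Console.WriteLine(item.SelectSingleNode("title").InnerText);
        }
    }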

The traditional report data source model has only two layers. Even if it is extended to XML documents, it can process them only once: starting from the root node, a single XPath expression fetches a field value, and the XML is then discarded. But XML documents often need further, deeper processing, and for that the traditional two-layer report data source model is not enough.

To be able to work an XML document intensively, we need to break through the traditional two-layer data source structure and move to a multi-layer report data source model. In a multi-layer data source, each node is mapped to a node in the XML document, and its child nodes map to other XML nodes through XPath paths. Recursing in this way, the multi-layer data source can be mapped onto any node of the XML document. The multi-layer data source model is in fact a data source tree, so processing an XML document amounts to pinning the two trees together at certain nodes, with XPath as the nail. You therefore need to pay attention to the continuity of the XPath configuration of the data source nodes at each level: if the XPath of one data source node is set incorrectly, it is as if it were nailed to the wrong XML node, or simply to an empty location, and all of its child nodes are affected along with it.
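To make the picture of two trees nailed together concrete, here is a hypothetical sketch of such a multi-layer data source node (the class and member names are invented for illustration, not taken from any particular report tool):

    using System.Collections.Generic;
    using System.Xml;

    // Each data source node carries the XPath "nail" that pins it to an XML node;
    // its child data source nodes are resolved relative to that XML node, recursively.
    class DataSourceNode
    {
        public string XPath;                               // path relative to the parent's XML node
        public List<DataSourceNode> Children = new List<DataSourceNode>();

        public void Bind(XmlNode xmlContext)
        {
            // Pin this data source node to every XML node its XPath matches.
            foreach (XmlNode pinned in xmlContext.SelectNodes(XPath))
            {
                // ... read field values from 'pinned' for the report here ...

                // Then let each child data source node pin itself below this XML node.
                foreach (DataSourceNode child in Children)
                    child.Bind(pinned);
            }
        }
    }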

In practical applications, because XML documents are not generated specifically for reports, the report tool may have to leave the current XML document to obtain more report data. In that case it may need to jump from one XML document to another, or go back to the database and execute further SQL queries. This tests the flexibility of the report data source model.

We all know that an RSS document is an XML document. Here we use the RSS feed of the cnblogs site (the "blog garden") as an example to illustrate the process of reading report data from XML. First, investigate the structure of the RSS document. The RSS URL of the blog home page is http://www.cnblogs.com/rss.aspx. The root node is rss, which contains a channel child node holding the basic information of the feed, followed by several item nodes listing all the articles. Each item node carries the basic information of one article, and the content of its wfw:commentRss child node is the URL of the RSS document for that article's replies. Based on the RSS document structure loaded from this URL, you can define the following mapping between the report data source and the RSS document.

Here the RSS XML document has a three-layer structure, and deeper processing requires dynamically loading further XML documents, so the traditional two-layer structure is certainly not enough and a multi-layer report data source structure must be adopted. The process is fairly involved; the steps are as follows (a rough C# sketch follows the list):

    1. Load the XML document at http://www.cnblogs.com/rss.aspx as the main XML document, generate a System.Xml.XmlDocument object, and take this document object as the starting point of processing.
    2. Use the XPath "rss" to traverse all XML nodes matching the path. Obviously only one node matches, so the current position moves to the rss node.
    3. At the current node, use the XPath "channel/title" to obtain the site title, "channel/link" to obtain the site address, "channel/description" to obtain the site description, and "channel/pubDate" to obtain the document publication time.
    4. Use "channel" to traverse all XML nodes matching the path from the current node. Again only one node matches, so the current position moves to the channel node.
    5. Traverse all item child nodes under the current node and set each of them as the current node in turn.
    6. From the current node, use "title" to get the article title, "link" to get the article address, "author" to get the author, "pubDate" to get the publication time, "description" to get the article content, "slash:comments" to get the number of replies, and "wfw:commentRss" to get the URL of the reply RSS document.
    7. When processing the "wfw:commentRss" node, the program loads, according to a specific setting, the XML document that the node's data points to, that is, the reply RSS document of the current article. It then traverses all nodes of the loaded document that match "rss/channel/item" and sets each of them as the current node in turn.
    8. From the current node, use "author" to get the reply author, "pubDate" to get the reply time, and "description" to get the reply content.
    9. Because the description node in the reply RSS document holds HTML code, the HTML also needs to be parsed to extract the plain-text content.
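As a rough C# sketch of steps 1 through 8 (it assumes the feed uses the standard wfw namespace http://wellformedweb.org/CommentAPI/ and omits error handling and the HTML stripping of step 9):

    using System;
    using System.Xml;

    class RssReportDemo
    {
        static void Main()
        {
            // Step 1: load the main RSS document as the starting point of processing.
            var doc = new XmlDocument();
            doc.Load("http://www.cnblogs.com/rss.aspx");

            // XPath on .NET needs a namespace manager for the wfw: prefix.
            var ns = new XmlNamespaceManager(doc.NameTable);
            ns.AddNamespace("wfw", "http://wellformedweb.org/CommentAPI/");

            // Steps 2-3: move to the rss node and read the site information.
            XmlNode rss = doc.SelectSingleNode("rss");
            Console.WriteLine(rss.SelectSingleNode("channel/title").InnerText);

            // Steps 4-6: move to channel, then traverse its item nodes in turn.
            XmlNode channel = rss.SelectSingleNode("channel");
            foreach (XmlNode item in channel.SelectNodes("item"))
            {
                Console.WriteLine(item.SelectSingleNode("title").InnerText);

                // Step 7: jump to the reply RSS document pointed to by wfw:commentRss.
                XmlNode commentRss = item.SelectSingleNode("wfw:commentRss", ns);
                if (commentRss == null) continue;

                var replyDoc = new XmlDocument();
                replyDoc.Load(commentRss.InnerText);

                // Step 8: read each reply from the loaded reply document.
                foreach (XmlNode reply in replyDoc.SelectNodes("rss/channel/item"))
                    Console.WriteLine("  reply at " + reply.SelectSingleNode("pubDate").InnerText);
            }
        }
    }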

From the above steps we can see that each node in the data source structure is pinned to a node in the XML document, and at the node for the reply list the program performs a jump between XML documents, from the main XML document to the reply RSS document. Because tree structures are being processed, this is a recursive operation; much of the state information is saved automatically on the call stack and does not need to be saved by the program.

If the report program could connect directly to the cnblogs database, it could also jump from the author node in the XML document back to the database, query it directly, and obtain some of the author's registration information. In fact, every node in the data source tree can jump from XML to XML, from XML to the database, and from the database to XML, which greatly extends the flexibility of obtaining report data.
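A hypothetical sketch of such a jump from XML back to the database (the connection string, table, and column names are invented for illustration): the value read at the author node becomes a parameter of a SQL query.

    using System.Data.SqlClient;
    using System.Xml;

    class XmlToDatabaseJump
    {
        static void LoadAuthorInfo(XmlNode item, SqlConnection connection)
        {
            // The value read from the XML node drives the database query.
            string author = item.SelectSingleNode("author").InnerText;

            using (var cmd = new SqlCommand(
                "SELECT Email, RegisterDate FROM Users WHERE UserName = @name", connection))
            {
                cmd.Parameters.AddWithValue("@name", author);
                using (var reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        // ... feed the author's registration information into the report ...
                    }
                }
            }
        }
    }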

If an information system is a pure XML application, the report tool can hop around collecting data from numerous XML documents without querying the database at all, like a Shaolin monk fighting on plum-blossom posts without ever touching the ground; there is no need to care whether the soil below is Java or .NET. All database operations and business logic run in the background, and the report tool does not need to be concerned with them. As long as the underlying system is safe and reliable, the report module is safe and reliable, and no matter how the underlying system is modified, the report module does not need to change as long as the XML document format stays the same.

For a report data source so complex that it exceeds the customization capability of the report tool, you can write a program that provides an XML document to the report program. In the past, the system supplied complex report data to the report program directly through APIs; now it supplies that data through XML documents, in the form of "enumeration". The system structure is then safer and the boundaries clearer, reflecting the guiding idea of XML Web Services. This way of obtaining data is not limited to report data; it can also be applied in other fields.

The discussion above is limited to B/S systems, but it is easy to imagine a C/S system being adapted so that it, too, provides XML data documents to the report tool in some way.

From the above discussion we can see that XML is indeed a new world of report data, and my reporting tool has begun to reflect this idea. However, the idea may still be somewhat radical and immature, and I hope readers will offer their advice.

Xdesigner software Studio
