How to store XML effectively, manage it, and process it efficiently has long been a fundamental challenge holding back enterprise-level XML applications. DB2 V9 is the first release to address all three challenges effectively. IBM calls this milestone technology in the information management field "DB2 pureXML". This article discusses what enterprise applications can do with XML once DB2 V9 pureXML has removed these basic technical barriers.
DB2 V9 solves XML storage, management, and efficient processing problems
XML is a familiar term for most readers. It is self-describing (and therefore understandable), flexible, platform independent, and standardized. For these reasons, many IT experts and organizations are exploring how to apply XML, a technology with excellent genes, across enterprise domains.
The first prerequisite for wide enterprise use of XML is a set of technical specifications and standards. The W3C officially released the XML standard in February 1998 and has continued to refine it. The XML specifications and standards are now quite mature.
However, some basic problems still trouble IT staff who want to use XML widely in enterprise applications: how can XML be stored effectively? How can it be managed better? How can XML processing, including queries and partial updates, be made more efficient? Until these basic problems are solved, XML cannot be used in enterprise applications, or at least not satisfactorily. For example, XML has been widely used for data exchange over the past few years, giving IT systems a "common language". Yet people had not found a way to store and manage this "common language" efficiently, much as early humans, long after they had acquired speech, still lacked a way to record and manage it faithfully: writing.
Humans are resourceful, and IT people even more so. Before writing was invented, early humans recorded what people said to one another by simple, indirect means such as tying knots in cords. In the past few years, IT staff have likewise tried simple or indirect ways to store and manage XML, summarized as follows:
Traditional method 1: simply store XML in a file system. This method is very simple, but it offers essentially no management and poor efficiency. It is barely acceptable for a small number of XML files and intolerable for enterprise applications of any scale. I have seen a company's application put thousands of XML documents in one folder and then query, update, and delete information through a Java application. The efficiency, the maintenance complexity, and the permission management could all be described as "terrible".
Traditional method 2: store XML in large object (LOB) fields of a relational database (DBMS). In effect, this only moves the XML from the simple file system of "Traditional method 1" into a large field in the database; it does not substantially improve the manageability or operability of XML. Furthermore, managing large objects is not a strength of a DBMS. For example, in most DBMSs large objects cannot enter the database buffer pool the way structured data does, so efficiency drops sharply.
Traditional method 3: decompose XML into multiple tables of a relational database. This approach has three problems. First, it works only when the XML document is relatively simple; once the document is a little more complex, a single document may have to map to several relational tables, and if the XML format also changes frequently, maintaining the mapping becomes even harder. Second, decomposition destroys the integrity of the XML document itself, so it does not suit applications that must preserve the document intact. Third, decomposition consumes a lot of CPU and memory on the database server, which may leave the database system short of resources.
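To make the mapping burden of this method concrete, here is a minimal Python sketch that decomposes one small XML document into two relational tables. The order document, the table names (`orders`, `order_items`), and the use of an in-memory SQLite database are all assumptions made for illustration; none of this is DB2 tooling.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical order document; element and attribute names are invented.
ORDER_XML = """
<order id="1001" customer="ACME">
  <item sku="A-1" qty="2"/>
  <item sku="B-7" qty="5"/>
</order>
"""


def shred_order(conn, xml_text):
    """Split one XML order across a parent table and a child table."""
    root = ET.fromstring(xml_text)
    order_id = int(root.get("id"))
    conn.execute(
        "INSERT INTO orders (id, customer) VALUES (?, ?)",
        (order_id, root.get("customer")),
    )
    for item in root.findall("item"):
        conn.execute(
            "INSERT INTO order_items (order_id, sku, qty) VALUES (?, ?, ?)",
            (order_id, item.get("sku"), int(item.get("qty"))),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE order_items (order_id INTEGER, sku TEXT, qty INTEGER)")
shred_order(conn, ORDER_XML)

# Reading the order back already requires a join across both tables.
rows = conn.execute(
    "SELECT o.customer, i.sku, i.qty FROM orders o "
    "JOIN order_items i ON i.order_id = o.id ORDER BY i.sku"
).fetchall()
print(rows)
```

Even in this tiny case, one document becomes two tables plus mapping code, and any new element in the XML format would require changes to both the schema and `shred_order`.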
Traditional method 4: store XML in an XML-only database. A dedicated XML-only database greatly improves the manageability and operability of XML. However, putting such non-mainstream databases, not yet proven by the industry, into enterprise systems is a cause for concern. In addition, an XML-only database offers essentially no support for relational data, yet over the past 20 years relational data has penetrated every aspect of enterprise applications. XML-only databases are therefore also criticized for failing to protect existing investment.
The release of DB2 V9 opens up a new world for XML storage and management. DB2 V9 stores XML efficiently in its native form and lets applications access it through both SQL and XML (XQuery) interfaces. Furthermore, all relational operations and database tools can also be applied to XML, such as indexing, import and export, join queries, high-speed bulk loading, and system optimization. IBM calls this technology "pureXML". With pureXML, XML is no longer a second-class citizen in a relational database. DB2 V9 is a true dual-engine database that supports both relational data and XML.
Figure 1. "dual-engine" Processing Method in DB2 V9
So, once DB2 V9 pureXML has removed the basic technical barriers to XML applications, what can enterprise applications do with XML? I cannot list every use of pureXML in the enterprise, nor do I intend to. What follows summarizes the application fields I have seen so far, in the hope of inspiring readers to explore the value of DB2 pureXML for themselves.
Application field 1: Information exchange and sharing
Information exchange and sharing is the first enterprise domain in which XML was applied. Many industries have developed XML-based standards for data exchange and information sharing, for example ACORD (XML standard for the insurance industry), FIXML (XML-based financial information exchange protocol), FpML (XML for financial products), HL7 (XML standard for medical and health care), IXRetail (XML standard for the retail industry), XBRL (XML for business reporting and accounting), and NewsML (XML for news publishing). Information exchange and sharing did not begin with XML, of course. With XML, however, exchange and sharing between IT systems became more standardized, and also more understandable and flexible, because the systems now share a "common language". Take FIXML as an example: the older FIX standard is based on plain text and offers almost no readability or flexibility, whereas the newer FIXML standard gains both from its use of XML, as shown in Figure 2:
Figure 2. Comparison between FIX and fixml
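The contrast between the two formats can be sketched in a few lines of Python. The FIX message below uses real tag numbers (35 = message type, 55 = symbol, 54 = side, 38 = order quantity), with "|" standing in for the actual field delimiter. The XML message is only a simplified, FIXML-style fragment written for illustration; it has not been validated against the FIXML schema.

```python
import xml.etree.ElementTree as ET

# Classic FIX: numeric tag=value pairs. "|" replaces the real SOH delimiter.
FIX_MESSAGE = "35=D|55=IBM|54=1|38=100"

# Simplified FIXML-style message (illustrative, not schema-validated).
FIXML_MESSAGE = '<Order Side="1"><Instrmt Sym="IBM"/><OrdQty Qty="100"/></Order>'


def parse_fix(message):
    """Return a dict mapping FIX tag numbers (as strings) to values."""
    return dict(field.split("=", 1) for field in message.split("|"))


# With FIX, the reader must know that tag 55 means "symbol".
fix_fields = parse_fix(FIX_MESSAGE)
fix_symbol = fix_fields["55"]

# With the XML form, the names and nesting describe the data.
order = ET.fromstring(FIXML_MESSAGE)
fixml_symbol = order.find("Instrmt").get("Sym")

print(fix_symbol, fixml_symbol)
```

Both lookups return the same value, but only the XML version says what the value is without a separate tag dictionary.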
Now, with DB2 pureXML, the capabilities of these data exchange platforms are greatly enhanced, and the exchanged XML can be managed centrally in a DB2 database. Applications can easily send XML in flexible formats onto the "information highway" (the enterprise bus) or fetch XML from it, as shown in Figure 3:
Figure 3. XML Information Integration and exchange
Application field 2: As a new data model
Why do we need a new data description model such as XML? Years of practice have shown that the pure relational E-R model is too strict and too rigid in structure to adapt to information that is complex, flexible, hierarchical, or individually varied. The following are some examples that I have encountered.
Complex information
Examples include electronic medical records in the medical and health field, detailed descriptions of Chinese herbal medicine products in global trade (with up to thousands of attributes), and bank customer data. Take electronic medical records: the complete record of a hospitalized patient usually includes past medical history, general examination, specialist examination, course of disease, doctor's orders, surgical notice, preoperative summary, postoperative summary, discharge summary, and more. Storing such complex information in traditional relational tables is quite difficult. It usually requires complex associations among dozens or even hundreds of tables, and the resulting table design is very complex, hard to understand, and lacking in integrity. Described in XML, the same information usually needs only one or a few tables, with a simple design, a clear structure, and convenient maintenance. For this reason, more and more HIS (hospital information system) application developers are using DB2 pureXML to build electronic medical records and similar applications, as shown in Figure 4:
Figure 4. Using XML to represent information in electronic medical records
Figure 5. XML electronic medical records
Flexible and variable information
Examples include employee contact information, flexible form data, and supplier and customer information. Such information is very prone to structural change. An older system may have allowed only one phone number per employee; with the rapid spread of mobile phones, an employee may now have several contact numbers. Changing a structured table at that point is very costly, whereas it is easy if the contact information is stored as XML, as shown in Figure 6:
Figure 6. Store the contact information in XML
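The phone number example can be illustrated with a short Python sketch. The `employee` document, its element names, and the helper `add_phone` are hypothetical; the point is only that adding further numbers changes the document, not a schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical employee record that started with a single office number.
employee = ET.fromstring(
    '<employee id="42"><name>Li Wei</name>'
    '<phone type="office">555-0100</phone></employee>'
)


def add_phone(emp, kind, number):
    """Append another <phone> element; no structural change is needed."""
    phone = ET.SubElement(emp, "phone", {"type": kind})
    phone.text = number
    return phone


add_phone(employee, "mobile", "555-0101")
add_phone(employee, "home", "555-0102")

phones = {p.get("type"): p.text for p in employee.findall("phone")}
print(ET.tostring(employee, encoding="unicode"))
```

In a relational design, the same change would mean adding columns or introducing a separate phone table and migrating existing rows.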
Information with obvious hierarchical features
Examples include bill of materials information in the automobile industry (which describes a vehicle's parts and suppliers over several levels) and passenger ticket information at civil aviation service companies. Modeling such obviously hierarchical information with the E-R model inevitably produces several levels of tables, and queries often have to join many large tables, so efficiency is often very low.
Figure 7. Data Association using XML
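As a rough illustration of why hierarchical data fits XML naturally, the following Python sketch walks a small, invented bill of materials. The part names and supplier codes are hypothetical, and the standard library's limited path support stands in for a real XML query language.

```python
import xml.etree.ElementTree as ET

# Hypothetical bill of materials: parts nested inside parts.
BOM_XML = """
<part name="car">
  <part name="engine" supplier="S1">
    <part name="piston" supplier="S2"/>
    <part name="spark plug" supplier="S3"/>
  </part>
  <part name="chassis" supplier="S4">
    <part name="wheel" supplier="S5"/>
  </part>
</part>
"""


def walk(part, depth=0):
    """Yield (depth, name) for a part and all its sub-parts, depth first."""
    yield depth, part.get("name")
    for child in part.findall("part"):
        yield from walk(child, depth + 1)


bom = ET.fromstring(BOM_XML)
levels = list(walk(bom))
for depth, name in levels:
    print("  " * depth + name)

# One path expression reaches matching parts at any depth.
from_s2 = [p.get("name") for p in bom.iterfind(".//part[@supplier='S2']")]
```

The nesting in the document is the hierarchy itself, so a query does not have to reconstruct it level by level through joins.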
Sparse data resulting from individual differences
Why does sparse data appear in the relational model? I think the root cause is that the structure of a relational table is fixed: every individual (data row) must have the same number of fields, while individuals usually differ widely. Describing such information in XML avoids the problem.
Figure 8. Use XML to avoid redundant data
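A small Python sketch makes the difference visible. The product rows and column names below are invented for illustration: the relational view must carry every column for every row, while the XML view keeps only the fields each individual actually has.

```python
import xml.etree.ElementTree as ET

# Relational view: every row carries every column, used or not.
COLUMNS = ["name", "voltage", "screen_size", "fabric", "shelf_life"]
rows = [
    ("television", "220V", "42in", None, None),
    ("shirt", None, None, "cotton", None),
    ("milk", None, None, None, "7 days"),
]


def row_to_xml(row):
    """Build a <product> element holding only the non-null fields."""
    product = ET.Element("product")
    for column, value in zip(COLUMNS, row):
        if value is not None:
            ET.SubElement(product, column).text = value
    return product


# Count the empty cells the fixed table structure forces on us.
null_cells = sum(value is None for row in rows for value in row)

products = [row_to_xml(row) for row in rows]
element_counts = [len(p) for p in products]

print(null_cells, element_counts)
for p in products:
    print(ET.tostring(p, encoding="unicode"))
```

More than half of the cells in this toy table are empty, whereas each XML document contains only the elements that apply to that product.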
Application field 3: Document management and knowledge management
XML's fine-grained retrieval and powerful association capabilities can be leveraged for document management and knowledge management.
Fine-grained retrieval capability
A general full-text search engine can tell users only which document contains the information they want, not which chapter or section of that document. An enterprise document (a technical document, a set of rules and regulations, an official document, a report, and so on) often runs to dozens, hundreds, or even thousands of pages, which makes such results inconvenient. Hence the requirement for "fine-grained search": search results must identify not only the document but also the chapter and paragraph within it. Describing the document in detail with XML and using XQuery for fine-grained retrieval meets this requirement.
Figure 9. Querying XML information using XQuery
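The idea of fine-grained retrieval can be sketched in Python as a stand-in for XQuery. This is not XQuery and not DB2; the document structure (`document`, `chapter`, `section`, `para`) and the function `find_passages` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical enterprise document marked up down to paragraph level.
DOC_XML = """
<document title="Employee Handbook">
  <chapter title="Leave">
    <section title="Annual leave">
      <para>Staff accrue annual leave monthly.</para>
    </section>
    <section title="Sick leave">
      <para>A medical certificate is required after three days.</para>
    </section>
  </chapter>
  <chapter title="Expenses">
    <section title="Travel">
      <para>Submit receipts within thirty days.</para>
    </section>
  </chapter>
</document>
"""


def find_passages(doc, keyword):
    """Return (document, chapter, section) titles for each matching paragraph."""
    hits = []
    for chapter in doc.findall("chapter"):
        for section in chapter.findall("section"):
            for para in section.findall("para"):
                if keyword.lower() in (para.text or "").lower():
                    hits.append(
                        (doc.get("title"), chapter.get("title"), section.get("title"))
                    )
    return hits


handbook = ET.fromstring(DOC_XML)
hits = find_passages(handbook, "certificate")
print(hits)
```

Because the markup records where each paragraph sits, a hit can be reported with its chapter and section rather than just the name of the document.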
Powerful association capabilities
In a knowledge management system, the associations between pieces of information are themselves very important information. I have worked with a knowledge management system for large international sports events that emphasizes such associations. For example, venue information and competition event information are closely connected: different events place different requirements on venues. Both are also closely linked with ticket information, since VIP seating is defined differently for different events and venue layouts. Using XML to describe such highly interrelated information is a good choice, because linking is one of XML's strengths.
Application field 4: more flexible form applications
On October 14, 2003, the W3C announced the release of the XForms 1.0 standard, the cornerstone of a new generation of Web forms. Traditional HTML forms do not separate the "purpose" of a form from its "presentation", whereas XForms does. This separation gives a form more flexible presentations and lets it support multiple display devices. An XForms form consists of three parts: the model, the instance data, and the user interface.
Figure 10. Layered Information
These electronic forms are stored in the DB2 V9 database in their original XML form and integrated seamlessly into the overall business processes of enterprises and institutions.
Figure 11. Use the original XML Information
Application field 5: content push (RSS)
Although many people have dismissed Web 2.0 as "a crowd admiring the Emperor's New Clothes", it has arrived amid mixed reactions and is quietly changing our lives. Web 2.0 naturally brings to mind RSS, blogs, wikis, XML, Ajax, and other technologies. This article does not discuss Web 2.0 itself. In this section, I will talk about RSS, one of the most important Web 2.0 applications, and about how DB2 V9 can improve RSS information management and application development efficiency.
Really Simple Syndication (RSS) is a simple way to share content between websites (also called "content aggregation"). Each website (RSS provider) publishes an RSS feed, and an RSS aggregation platform (such as a Web 2.0 website or a desktop tool) then aggregates feeds selectively according to users' preferences.
For example, the IBM website provides RSS feeds for the technical support of its various products. For details, see the information management product family examples under Resources. Such feeds can be aggregated on various platforms. My favorite is the "Sina point reader": I have aggregated RSS feeds for DB2 technical support, domestic news, international news, local news, finance and real estate, and other sites into this reader, so every day I can easily read the information I care about.
Figure 12. RSS reader
To make unified subscription possible, RSS follows a unified standard. RSS is in fact an XML vocabulary that complies with the XML 1.0 standard. To make RSS easier to apply, the standard defines a set of elements and their representations.
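Because an RSS feed is ordinary XML, any XML parser can read it. The following Python sketch parses a small RSS 2.0 document using the standard `channel`, `title`, `link`, `description`, and `item` elements. The feed content and the example.com URLs are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed; titles and links are placeholders.
RSS_XML = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example product support</title>
    <link>http://example.com/support</link>
    <description>Hypothetical feed for illustration</description>
    <item>
      <title>Fix pack released</title>
      <link>http://example.com/support/1</link>
    </item>
    <item>
      <title>New technote</title>
      <link>http://example.com/support/2</link>
    </item>
  </channel>
</rss>
"""

# An RSS document has one <channel> holding feed metadata and the items.
channel = ET.fromstring(RSS_XML).find("channel")
feed_title = channel.findtext("title")
items = [(i.findtext("title"), i.findtext("link")) for i in channel.findall("item")]

print(feed_title)
for title, link in items:
    print("-", title, link)
```

An aggregator does essentially this for every subscribed feed, which is why a store that can keep and query the feeds as XML is a natural fit.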
With DB2 V9, the RSS provider can perform more efficient addition, deletion, modification, and retrieval operations on its published RSS feed. The RSS reader application can use DB2 V9 to centrally manage each RSS feed subscribed to by users.
Figure 13. Publish RSS as Web Services
You can even use DB2 V9 to directly publish these RSS feeds into a Web Service for easier application integration.
Application field 6: Making user interfaces more personalized
The elements that make up a graphical user interface (such as windows, menus, and submenus) have an inherent hierarchy and nesting, much like the relationship between elements and attributes in an XML document, so it is natural to describe a graphical user interface in XML. The most direct benefit of doing so is a more personalized user interface.
In this section, I will focus on using XML to customize the interface of software products and so make the user interface more personalized.
IT application developers in many industries in China currently face extremely fierce competition, and its direct result is that the profit margin of each project shrinks considerably. How can a developer improve its margin within the customer's project budget? A natural idea is to move from project development to product development. Once the work has been turned into a product and deployed successfully at several customers, the cost of each project falls considerably.
However, the specific needs of each customer vary widely. For example, the information system of general hospitals is very different from that of specialized hospitals. There are many differences between these requirements, such as data models and processes. However, the biggest difference is the user interface. Almost every customer has his/her own preferred interface style.
We are pleased to see that some strong domestic developers have found an effective way to combine productization with customer customization: they describe the user interface in XML and manage these XML documents efficiently with DB2 V9. As shown in Figure 14, after a user logs on, the XML-GUI personalized loading module fetches the user's interface customization information (XML) from DB2 V9 and then presents a personalized interface. In this way, the same product can look different in different enterprises, and users with different roles and levels within an enterprise see different interfaces. End users can also customize certain menus or styles themselves.
Figure 14. Personalized user interface loading process
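The loading process can be sketched roughly as follows. This is only an illustration: the dictionary `UI_STORE` stands in for the XML documents that the article keeps in DB2 V9, and the `window` and `menu` markup, the role names, and the functions `load_menus` and `build_ui` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Stand-in for per-role interface definitions stored as XML in the database.
UI_STORE = {
    "doctor": (
        '<window title="Ward"><menu label="Patients">'
        '<menu label="Admit"/><menu label="Discharge"/></menu></window>'
    ),
    "clerk": '<window title="Billing"><menu label="Invoices"/></window>',
}


def load_menus(element):
    """Turn nested <menu> elements into nested (label, children) tuples."""
    return [(m.get("label"), load_menus(m)) for m in element.findall("menu")]


def build_ui(role):
    """Fetch the XML for a role and build a simple interface description."""
    window = ET.fromstring(UI_STORE[role])
    return {"title": window.get("title"), "menus": load_menus(window)}


doctor_ui = build_ui("doctor")
clerk_ui = build_ui("clerk")
print(doctor_ui)
print(clerk_ui)
```

The same loading code serves every role; only the stored XML differs, which is what lets one product present different interfaces to different customers and users.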
Summary
DB2 V9 has made revolutionary breakthroughs in XML storage, management, and processing efficiency. This article lists what DB2 V9 can do for enterprise applications:
- Information exchange and sharing;
- As a new data model: complex information, flexible and changeable information, information with obvious hierarchical features, and sparse data brought about by individual differences;
- Document management and knowledge management (using XML fine-grained retrieval capabilities and powerful association capabilities);
- Build more flexible form applications;
- Web 2.0 applications (such as RSS);
- Make the user interface more personalized.