Extensible Markup Language (XML) is becoming the preferred format for many kinds of data, especially documents. Because it can tag individual fields, searching becomes simpler and more dynamic, turning files that an enterprise was ready to throw in the wastebasket into a treasure trove for data mining. XML separates content from presentation, so the same material can be reused many times: for news releases, white papers, brochures, demos, and Web pages. For businesses that need to integrate incompatible systems, XML can serve as a common transport that carries data in a neutral format. In addition, XML can handle many kinds of data, including text, images, and sound, and users can extend it to handle any particular type of data.
These properties make XML a common language for both online and offline data.
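The separation of content from presentation described above can be illustrated with a minimal sketch. The press-release document and its element names (release, title, date, body) are invented for illustration; the point is only that one tagged source can feed several output formats.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical press-release document; the element names are assumptions.
DOC = """
<release>
  <title>New Product Launch</title>
  <date>2001-05-01</date>
  <body>Our new product ships this month.</body>
</release>
"""

def as_text(root):
    """Render the content as a plain-text news release."""
    title = root.findtext("title").upper()
    return f"{title}\n{root.findtext('date')}\n\n{root.findtext('body')}"

def as_html(root):
    """Render the same content as a Web page fragment."""
    return (f"<h1>{escape(root.findtext('title'))}</h1>"
            f"<p>{escape(root.findtext('body'))}</p>")

root = ET.fromstring(DOC.strip())
print(as_text(root))
print(as_html(root))
```

Because the fields are tagged, each renderer picks out exactly the pieces it needs, and neither has to know anything about the other's layout.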
The question now is how to manage XML-tagged data. A promising approach is to use a database to store, retrieve, and manipulate XML, that is, to carry out searching, analysis, updating, and output in an environment that is easier to manage, more systematic, and more familiar, while still drawing on the XML markup.
There are two points of view here. Purists hold that only a database that stores XML in its original XML format can be called an XML database. Others believe that any database that can store and retrieve XML is an XML database, regardless of how the data is held internally. Setting the two camps aside for the moment, a database that does not store its data internally in XML format is called an "XML-enabled database," and one that does is called a "native XML database."
There are several reasons for using existing database types and products rather than storing XML in its original format. First, relational and object-oriented databases are well known, while native XML databases are new. Second, people are familiar with existing relational and object-oriented databases and understand their behavior and performance, so they are reluctant to switch to native XML databases whose performance, and especially scalability, has not yet been proven. Finally, choosing a relational or object-oriented database is the safer option for an enterprise; there is no need to take a risk on a new native XML database.
Fortunately, you don't have to take that risk. XML-enabled databases that work well with XML already exist, built on time-tested relational and object-oriented databases. When they receive XML, these databases decompose it into fields and store the fields in the usual way; when the XML is retrieved, the fields are reassembled into the original document.
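The decompose-and-reassemble cycle described above can be sketched as follows. This is a minimal illustration using Python's standard sqlite3 module as a stand-in for a relational database; the order document, the table, and the column names are assumptions, and real XML-enabled products use far more general mappings.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical order document; element and column names are assumptions.
DOC = "<order><id>42</id><customer>Acme</customer><total>19.95</total></order>"
FIELDS = ("id", "customer", "total")

def store(conn, xml_text):
    """Decompose the XML into fields and insert them as one ordinary row."""
    root = ET.fromstring(xml_text)
    values = [root.findtext(name) for name in FIELDS]
    conn.execute(
        "INSERT INTO orders (id, customer, total) VALUES (?, ?, ?)", values
    )

def retrieve(conn, order_id):
    """Read the row back and splice the fields into an XML document again."""
    row = conn.execute(
        "SELECT id, customer, total FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    root = ET.Element("order")
    for name, value in zip(FIELDS, row):
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id TEXT PRIMARY KEY, customer TEXT, total TEXT)"
)
store(conn, DOC)
print(retrieve(conn, "42"))
```

Once stored, the data is an ordinary row that any SQL tool can query; the XML form exists only at the boundaries, which is also why every store and retrieve pays a conversion cost.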
Content@XML, developed by Xyvision Enterprise Solutions of Reading, Massachusetts, is a content management system that can store XML files in any popular relational database. Its advantages are content-based collaboration and multi-channel content output. A technology publisher that chose Content@XML said that XML had cut two weeks of work down to a few minutes: "The system accepts XML material and gives you results in any format you want."
Lotus's Domino database can also process XML, and its XML toolkit even lets you create and process content much as you would in a native XML database.
When XML data is processed in a relational database, third-party middleware can handle the conversion. One such product is XML-DBMS, a tool based on Java Database Connectivity (JDBC) that transfers data between XML documents and databases.
However, storing XML in relational and object-oriented databases has also drawn criticism. For example, critics argue that one of XML's most compelling attributes is its hierarchical structure, yet a relational database maps XML onto relational tables, flattening that structure into rows and columns. In addition, with large or complex documents, converting the XML to and from the database's internal form takes considerable processing time, which slows the generation of Web pages.
Native XML databases have now begun to appear. Although people do not yet fully trust them, some are starting to win favor in practical applications. In addition, the mainstream database vendors may launch native XML database products of their own when the time is right.
The first, and probably the best-known, commercial native XML database is Tamino, developed by Software AG of Darmstadt, Germany. In addition to storing and accessing XML, Tamino offers a number of other features, including Open Database Connectivity (ODBC), Unicode support, HTTP communication, and the ability to process non-XML data. "Tamino is especially useful for organizations that need to consolidate information from a variety of platforms and formats and distribute it to business partners or customers," a Gartner report says.
According to the vendor's description, Tamino offers direct XML retrieval and specialized search capabilities. Its query language is said to be powerful and concise and able to reach any depth of a document, which by this account makes SQL pale in comparison.
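What "reaching any depth" means in practice can be shown with a short sketch. It does not use Tamino's query language; it uses the limited XPath subset in Python's standard xml.etree.ElementTree module as a stand-in, and the library document is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with <title> elements at several nesting levels.
DOC = """
<library>
  <shelf>
    <book><title>XML Basics</title>
      <chapter><section><title>Namespaces</title></section></chapter>
    </book>
  </shelf>
  <title>Main Catalogue</title>
</library>
"""

root = ET.fromstring(DOC.strip())

# ".//title" matches <title> elements at every depth below the root.
all_titles = [t.text for t in root.findall(".//title")]

# A path can also pin down part of the hierarchy: titles directly under a section.
section_titles = [t.text for t in root.findall(".//section/title")]

print(all_titles)
print(section_titles)
```

A single path expression covers every nesting level. In a relational mapping, each level of nesting would typically correspond to another table and another join.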
Other native XML databases include dbXML, eXcelon, and X-Hive/DB, developed respectively by dbXML Group LLC, eXcelon, and The Connection Factory of the Netherlands. Ironically, one of the main criticisms of native XML databases concerns performance. Critics predict that when the information being sought lies at the end of a large document, a native XML database, lacking any other mechanism, has to work its way through the whole document to reach it, whereas relational and object-oriented databases can split the data into small chunks and search those, which is naturally much faster.
These difficulties are not insurmountable, however, provided that documents are indexed when they are stored. Tamino has this indexing capability, which compensates for the weakness in searching large documents, and native XML storage eliminates unnecessary conversion operations. Tamino is currently available for Windows NT, Windows 2000, Solaris, and SCO Unix, with Linux and some mainframe versions planned.
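The idea of indexing a document at storage time can be sketched as follows. This is not how Tamino implements its indexes; it is a minimal word index, with an invented report document, showing why a lookup no longer needs to scan the document from the beginning.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def build_index(root):
    """Map each lower-cased word to the locations of elements containing it.

    A location is a tuple of child positions leading from the root to the
    element. Only each element's leading text (elem.text) is indexed here.
    """
    index = defaultdict(list)

    def walk(elem, location):
        for word in (elem.text or "").lower().split():
            index[word].append(location)
        for i, child in enumerate(elem):
            walk(child, location + (i,))

    walk(root, ())
    return index

def resolve(root, location):
    """Follow a stored location straight to its element, without searching."""
    elem = root
    for i in location:
        elem = elem[i]
    return elem

# Hypothetical document; the word of interest sits in the last element.
DOC = ("<report><intro>Summary of sales</intro>"
       "<part><para>Details</para><para>Final appendix note</para></part>"
       "</report>")

root = ET.fromstring(DOC)
index = build_index(root)          # built once, when the document is stored
hit = resolve(root, index["appendix"][0])
print(hit.tag, hit.text)
```

The cost of walking the whole document is paid once, at storage time. Each later search is a dictionary lookup followed by a short walk down the tree, however far into the document the match lies.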
Many mainstream database vendors are now integrating XML support into their products or providing tools for using XML with their databases. IBM provides the XML Extender for DB2, which lets users store XML documents in DB2 and adds new capabilities for working with them. Microsoft has added XML extensions to SQL Server 6.5 and 7.0, and future versions of SQL Server are reported to add XML output options for sending information to other systems. Oracle also has a powerful XML indexing engine.
Some experts expect that these database vendors will soon launch native XML databases of their own to meet the XML data processing needs of Web-based e-business.
All in all, the need for XML is growing. New applications include Internet search engines that use XML tags, e-commerce systems that must deliver results quickly, electronic data interchange with XML tags, data reuse, and content personalization. As these applications develop, the need for XML databases will grow rapidly as well.