From HTML to XML

Source: Internet
Author: User
XML (Extensible Markup Language) is one of the most talked-about network technologies today, hailed as the "second-generation Web language" and "the cornerstone of next-generation network applications". Since it was proposed, almost every major company in the industry has backed it, with enthusiasm no less than HTML received when it first appeared. So, while XML is not yet widespread, now is the time to learn it and secure your future job prospects.

Disadvantages of HTML
To talk about XML, we first have to talk about the "first-generation Web language": HTML. HTML is old enough to be retired (a slight exaggeration, but at the time of writing HTML 4.0 is to be followed not by a new HTML version but by XHTML, which is defined in XML). Poor us: we had only just become comfortable with HTML and are already being forced to drop it. Why? HTML has driven the growth of the WWW over these years, and its contribution is undeniable. Almost nothing on the Internet can be done without HTML. But HTML has a fatal flaw: it is suited only to communication between people and computers, not to communication between computers.

As everyone knows, HTML uses a large set of tags to define how a document's content is presented to us. In other words, HTML is a "display description" language: it only describes how a Web browser should lay out text, graphics, and so on within a page, and lacks the most important thing on the Internet, a description of what the information itself means. Text and graphics displayed through HTML are easy for people to understand, but it is very hard to get a computer to understand the meaning of the words inside those tags.

For example, suppose we design a program that automatically visits the major online stores and brings back the latest prices. The problem is that each store may mark up product names and prices in its own way: store A may wrap the price in one tag, store B in a different tag, and a third store may use a more complex table. So how is our program to know which tags hold the price information? Another example: in HTML, <b>Apple</b> tells the browser to display the word "Apple" in bold, but says nothing about what "Apple" stands for. Is it Apple Computer, or something else? As a result, HTML cannot reveal the meaning of the information in a document.
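To make the problem concrete, here is a minimal Python sketch of what a program actually "sees" when it reads HTML. The two store snippets and their markup are invented for illustration, and `TagTextCollector` and `collect` are hypothetical helper names; only the standard-library `html.parser` module is assumed.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Record each piece of text together with the tag that encloses it."""

    def __init__(self):
        super().__init__()
        self.stack = []   # tags that are currently open
        self.pairs = []   # (enclosing tag, text) tuples

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            enclosing = self.stack[-1] if self.stack else None
            self.pairs.append((enclosing, text))


def collect(html):
    """Return the (tag, text) pairs found in an HTML fragment."""
    parser = TagTextCollector()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Hypothetical snippets from two stores; the markup is made up.
store_a = "<b>Apple</b> <i>4.50</i>"
store_b = "<table><tr><td>Apple</td><td>4.50</td></tr></table>"

print(collect(store_a))  # [('b', 'Apple'), ('i', '4.50')]
print(collect(store_b))  # [('td', 'Apple'), ('td', '4.50')]
```

The tags the program recovers (b, i, td) describe only layout. Nothing in either result says which value is a product name and which is a price, so the program would need a separate hand-written rule for every store.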

Another problem with HTML is that its set of tags is fixed: users cannot add meaningful tags of their own. In addition, the major browsers do not all support the same tags, so to make a Web page display correctly in every browser we can only use the tags defined by the standard.

In today's networked world, with the growth of e-commerce and Web-based applications, a great deal of information needs to be processed quickly. In fact, most information on the Internet starts out in a well-structured database, stored in appropriate fields: an employee profile, for instance, has fields for name, gender, department, and so on. Given the value "John", the computer knows from where it is stored that it is an employee's name. However, once the data has been pulled out of the database and turned into HTML by CGI, ASP, JSP, PHP, or the like, the originally meaningful data becomes a jumble of HTML tags with no specific meaning. Users must parse the data with their own brains and then record and process it by hand, which is obviously slow. If the data could be handed to the other computer in the structure it had in the database, processing would certainly be faster. HTML clearly cannot do this. Moreover, because computers differ in hardware, operating system, and database software, it is difficult and tedious for one computer to understand another's database format. To let all these different computers exchange information, HTML is clearly not up to the task. How can this problem be solved?

Welcome to XML
XML can solve the problems above. The W3C (World Wide Web Consortium) describes XML as follows: "Extensible Markup Language, abbreviated XML, describes a class of data objects called XML documents and partially describes the behavior of computer programs which process them. XML is an application profile or restricted form of SGML, the Standard Generalized Markup Language. By construction, XML documents are conforming SGML documents." Like HTML, XML is a text-based markup language, and both derive from SGML (Standard Generalized Markup Language). SGML is an older markup language that was first used in the publishing industry. It is very complex, and applications built on it are very expensive, so it has been used only by a handful of large companies and government departments. XML is said to retain 80% of SGML's functionality while reducing its complexity by 20%, which makes XML applications cheaper to develop and brings XML into "ordinary people's homes".

The difference between XML and HTML is that XML lets us freely define markup that expresses the meaning of a document's content. For example, we can define a meaningful tag such as <filename>...</filename> (tag names may even be written in Chinese). In XML we only need to concern ourselves with the content of the document and leave its presentation to CSS (Cascading Style Sheets) and XSL (Extensible Stylesheet Language). If an XML file is used only to exchange information between computers, the XML file alone is enough. If the information in it is to be displayed in some form, for example in a browser, the file can reference a style sheet that defines how the browser should display it.

XML also has no fixed set of tags as HTML does. It is really a language for defining languages: users of XML can define an unlimited number of tags to describe any data element in a document, breaking through the limits of HTML's fixed tag set and organizing document content into a rich, complex, and complete information system.

XML has three main components: Schema, XSL (Extensible Stylesheet Language), and XLL (Extensible Link Language). A schema defines the logical structure of an XML document: the elements it contains, their attributes, and the relationships between elements and attributes. It helps an XML parser verify that the tags in a document are valid. XSL is the language used to specify how XML documents are presented, similar to CSS. XLL extends the simple links that already exist on today's Web.
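The separation of content from presentation described above can be sketched in Python. This is only an illustration under stated assumptions: the `<catalog>` document and its tag names are invented, and the two rendering functions (`to_html_list`, `to_plain_text`) are hypothetical stand-ins for what a CSS or XSL style sheet would do. It is not XSL itself.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical document: the tag names are our own, chosen to describe the data.
CATALOG_XML = """<catalog>
  <book>
    <title>Learning XML</title>
    <price currency="USD">29.95</price>
  </book>
  <book>
    <title>XML &amp; Databases</title>
    <price currency="USD">35.00</price>
  </book>
</catalog>"""


def to_html_list(xml_text):
    """One possible presentation: an HTML list with bold titles."""
    root = ET.fromstring(xml_text)
    items = []
    for book in root.findall("book"):
        title = escape(book.findtext("title"))
        price = book.find("price")
        items.append(
            f"<li><b>{title}</b>: {price.text} {price.get('currency')}</li>"
        )
    return "<ul>" + "".join(items) + "</ul>"


def to_plain_text(xml_text):
    """Another presentation of the same content: one line per book."""
    root = ET.fromstring(xml_text)
    lines = []
    for book in root.findall("book"):
        price = book.find("price")
        lines.append(
            f"{book.findtext('title')} - {price.text} {price.get('currency')}"
        )
    return "\n".join(lines)


print(to_html_list(CATALOG_XML))
print(to_plain_text(CATALOG_XML))
```

The XML file is written once and says nothing about fonts or layout. Each rendering function decides independently how to show it, which is the role the text assigns to a style sheet.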

Because XML is a language for defining languages, several markup languages have already been defined with it, for example: Chemical Markup Language (CML), which defines how chemical formulas are described and displayed on Web pages; Mathematical Markup Language (MathML), which displays complex mathematical formulas in the browser as part of a Web page; and Synchronized Multimedia Integration Language (SMIL), which describes how multimedia information is presented on the WWW.

An XML parser is a tool that checks an XML file for structural errors, separates out the markup, and reads the information it contains. Most parsers are written in Java, so any computer with a Java virtual machine can support XML (and almost all computers can run one). As a result, even heterogeneous systems need not worry about reading each other's data: everyone uses XML files as the medium for exchanging data, and as long as the other computer has a suitable XML parser installed, it can read the information correctly. Database products from several major vendors, such as Oracle 8i, Informix, and IBM DB2, are beginning to support XML: data in the database can easily be converted to XML, or the database may even work with XML directly. Some predict that the future of electronic documents belongs to XML.
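As a rough sketch of this exchange, the following Python code serializes one database row as XML, parses it back on the "receiving" side, and shows the parser rejecting a structurally broken document. It uses Python's standard-library parser rather than a Java one. The `<employee>` tag, the field values other than "John", and the function names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record):
    """Serialize one database row (a dict) as an <employee> element."""
    employee = ET.Element("employee")
    for field, value in record.items():
        ET.SubElement(employee, field).text = str(value)
    return ET.tostring(employee, encoding="unicode")


def xml_to_record(xml_text):
    """Parse the XML on the receiving side back into a dict."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def is_well_formed(xml_text):
    """Return True if the parser finds no structural errors."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical row; only the field names and "John" come from the text.
row = {"name": "John", "gender": "male", "department": "Sales"}

message = record_to_xml(row)
print(message)
# <employee><name>John</name><gender>male</gender><department>Sales</department></employee>

print(xml_to_record(message) == row)                       # True
print(is_well_formed("<employee><name>John</employee>"))   # False: <name> is never closed
```

The field names travel with the data, so the receiver does not need to know anything about the sender's database format, only how to parse XML.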

Learning XML is not difficult. The XML specification is simple (the entire standard prints out at only a few dozen pages), and XML's syntax resembles HTML's, with tags enclosed in angle brackets (< >). Even more conveniently, we can create XML tags in our own language, including Chinese. For example, we can create a tag such as <price>...</price>, whose content is the price of something. Imagine that all the major online stores used such XML tags to indicate the meaning of the text on their pages: we could then have software automatically collect whatever interests us. If we wanted information about books on XML, for instance, the software would automatically pick up the fields marked with <book>...</book> tags on each page. How convenient that would be!
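The price-collecting software imagined here might look like the following minimal Python sketch. It assumes, purely for illustration, that two stores publish XML pages whose overall structure differs but which both mark prices with a `<price>` tag. The store names, page contents, and the helpers `find_prices` and `cheapest` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical pages from two stores. The surrounding structure differs,
# but both mark the price with the same <price> tag.
STORE_PAGES = {
    "store_a": (
        "<page><item><name>Learning XML</name>"
        "<price>29.95</price></item></page>"
    ),
    "store_b": (
        "<shop><books><book><price>27.50</price>"
        "<title>Learning XML</title></book></books></shop>"
    ),
}


def find_prices(xml_text):
    """Return every <price> value in the page, wherever it is nested."""
    root = ET.fromstring(xml_text)
    return [float(element.text) for element in root.iter("price")]


def cheapest(pages):
    """Return (store, price) for the lowest price found across all pages."""
    offers = {store: min(find_prices(xml)) for store, xml in pages.items()}
    store = min(offers, key=offers.get)
    return store, offers[store]


print(cheapest(STORE_PAGES))  # ('store_b', 27.5)
```

One rule (look for `<price>`) works for every store, because the tag names the meaning of the data rather than its appearance.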

XML has four outstanding characteristics: it is an excellent data storage format, it is extensible, it is highly structured, and it is easy to transmit over a network. Because XML lets each application define its own markup, it can be used for information exchange in a wide variety of industries, providing solutions tailored to each one.
