XML burst onto the scene in February 1998, when it was approved by the World Wide Web Consortium (W3C), and was hailed as "the technology that suddenly emerges". It has been called the successor to HTML, and some see it as the future lingua franca of structured data exchange.
As XML emerges from the obscurity of its beginnings in the consortium, it is perhaps inevitable that the new data format will generate misconceptions as quickly as it attracts enthusiasts. In this column, I'll clear up some of the myths about XML before they harden into permanent misconceptions.
Myth 1: XML is a joint effort led by Microsoft
XML is a joint effort, but not Microsoft's. In fact, XML was developed by a group of markup language experts, organized by Sun Microsystems, to create a form of the venerable ISO standard SGML for the Web.
Microsoft is indeed a major contributor to the XML work, but so are other large companies (Sun, Hewlett-Packard, Netscape, Adobe, and Xerox), major SGML vendors and system integrators (ArborText, Inso, SoftQuad, Grif, Texcel, and Isogen), representatives of academic groups (NCSA and the Text Encoding Initiative), early adopters (DataChannel and Vignette), and one of the world's leading SGML experts, James Clark, who is the technical lead of the W3C SGML working group.
What is remarkable about XML is that all of these people and organizations put their personal and corporate agendas aside and collaborated to create an inherently open standard driven entirely by the needs of users. These requirements include:
• Extensibility, to define new tags as needed.
• Structure, to represent data of arbitrary complexity.
• Validation, to check data for structural correctness.
• Media independence, to publish content in a variety of formats.
• Vendor and platform neutrality, to process any conforming document with standard commercial software or even simple text tools.
While I have to admire Microsoft's familiarity with and marketing of XML concepts, XML does not belong to Microsoft. XML belongs to the world.
Myth 2: XML is an extension of HTML
Early popular accounts of XML have led many to believe that XML is simply a way to extend HTML by adding new tags. In fact, XML and HTML sit at entirely different levels of the markup hierarchy. HTML is a markup language: a set of standard delimiters that can be placed in a document to indicate the role of each part of the document. For example, everything between <H2> and </H2> in an HTML document is understood to be a second-level heading.
Markup language
Most people's experience of markup languages is limited to the Web. They are often surprised to learn that HTML is just one of many standard markup languages that particular industries have developed over the years. For example, the aviation industry has a markup language for aircraft maintenance manuals, known as ATA-2100; the semiconductor industry has a markup language for circuit data, called PCIS; and the computer industry has a markup language for software documentation, called DocBook.
Some of these markup languages have been in use longer than HTML, and many take a different approach to the problems they address. For example, consider this fragment of HTML:
<H2>Level Two title</H2>
<P>This is a text that is likely to belong to the title above.</P>
The DocBook equivalent might look like this:
<SECT2>
<TITLE> Level Two title </TITLE>
<PARA> This is a text that definitely belongs to the title above.
We know this because they are all contained within the same SECT2 element.
</PARA>
</SECT2>
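To make the containment point concrete, here is a minimal Python sketch that reads the DocBook fragment above and pairs each title with the paragraphs in the same section. It assumes the fragment is treated as well-formed XML with exactly the upper-case tag names shown; the function name paragraphs_by_title is invented for illustration.

```python
import xml.etree.ElementTree as ET

# The DocBook fragment from the text, treated here as well-formed XML.
DOCBOOK = """
<SECT2>
<TITLE> Level Two title </TITLE>
<PARA> This is a text that definitely belongs to the title above.
We know this because they are all contained within the same SECT2 element.
</PARA>
</SECT2>
"""


def paragraphs_by_title(xml_text):
    """Map each section title to the paragraphs inside the same SECT2."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    # iter() also yields the root itself, so a lone SECT2 document works.
    for section in root.iter("SECT2"):
        title = section.findtext("TITLE", default="").strip()
        # Collapse internal line breaks so each paragraph is one string.
        result[title] = [
            " ".join((para.text or "").split())
            for para in section.findall("PARA")
        ]
    return result


print(paragraphs_by_title(DOCBOOK))
```

No guessing is involved: the paragraph is found because it is a child of the same element as the title, which is exactly what the flat HTML version cannot express.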
For all their differences, these markup languages, HTML included, are alike in three ways:
• Each defines a set of standard tags together with standardized rules for using them; in other words, a standardized syntax.
• Each is designed to work best for a particular type of document or data.
• Each uses SGML, the 12-year-old international standard for text processing, to define its standard set of tags and its syntax.
All of these languages look alike because they all use the familiar angle brackets inherited from SGML's reference concrete syntax.
SGML Layer
From the description above, it should be clear that SGML itself belongs to a different conceptual layer from any individual markup language defined with it. The distinction between SGML and a particular markup language is often summed up by saying that SGML is a metalanguage rather than a language. This is a rather imprecise generalization: SGML is not as abstract as a true metalanguage such as Backus-Naur Form (BNF), which is used to define programming languages. But calling SGML a metalanguage does capture the key point: SGML is not a specific markup language; it is a language for defining markup languages.
The key to understanding XML is that it belongs to the SGML layer, not to the HTML layer. XML is a simplified form of SGML, not an extended form of HTML. XML differs from SGML in that its designers removed many of SGML's advanced features, which make it difficult to implement a full SGML parser in a Web browser.
But the basic idea is the same: XML is a technology that lets you create different markup languages for an unlimited number of different purposes. The key to XML, and the reason it is becoming so pervasive, is that all the different special-purpose languages defined with it can be parsed by a single standardized processor small enough to embed in every Web browser.
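As a rough illustration of one standardized processor handling many languages, the following Python sketch parses two unrelated vocabularies with the same general-purpose XML parser from the standard library. Both vocabularies (recipe and invoice) and the helper outline are invented for this example.

```python
import xml.etree.ElementTree as ET

# Two invented vocabularies; the parser knows neither tag set in advance.
RECIPE = "<recipe><name>Toast</name><step>Toast the bread.</step></recipe>"
INVOICE = (
    "<invoice><customer>Acme</customer>"
    "<total currency='USD'>12.50</total></invoice>"
)


def outline(xml_text):
    """Return (depth, tag) pairs for every element, in document order."""
    def walk(element, depth):
        yield depth, element.tag
        for child in element:
            yield from walk(child, depth + 1)

    return list(walk(ET.fromstring(xml_text), 0))


for document in (RECIPE, INVOICE):
    print(outline(document))
```

The same few lines recover the structure of both documents, because the parser depends only on XML syntax and not on any particular set of tags.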
People who do not understand this distinction often conclude that an XML-aware application will let them simply sprinkle new markup around their HTML documents. Trying to "extend" HTML in this way will only make the mess we are already in worse.
Myth 3: XML can drive your web browser by itself
Remember that HTML is a markup language with a relatively small set of standard tags, each associated with more or less standardized behavior. XML allows an unlimited set of possible tags, and no behavior is standardized for any of them. The behavior must come from somewhere else. In publishing, this is usually a stylesheet, but in other domains it can be something as flexible as JavaBeans or as specialized as an industry-standard protocol against which programmers write standardized applications.
Syntax is not semantics
XML proponents summarize this by saying that XML defines syntax, not semantics. Some theorists object that this simple formulation ignores the semantic relationship between XML syntactic objects (such as elements and attributes) and the data they form. The slogan "syntax, not semantics" is meant to make a simpler point, however: unlike HTML tags, XML tags have no predefined meaning. The meaning or behavior must be supplied procedurally by a program or script, declaratively by a stylesheet, or even by a plain-text description.
This confusion is evident when would-be XML users ask plaintively how XML will display in their Web browsers. The answer is that it doesn't display, at least not on its own.
To emulate what HTML now does in a browser, you must supply separately the two things that HTML bundles into one inflexible whole: the content of a document (expressed in XML) and its processing, which must be either defined procedurally (with a script) or declared (with a stylesheet).
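A small Python sketch may help show this separation of content from processing. The memo vocabulary, the RENDERERS table, and the render function are all hypothetical; the point is only that the tags mean nothing until a program attaches behavior to them.

```python
import xml.etree.ElementTree as ET

# Content only: nothing here says how <to> or <body> should be displayed.
CONTENT = "<memo><to>Staff</to><body>Meeting at noon.</body></memo>"

# Hypothetical behavior table: the meaning of each tag is supplied here,
# by the program, not by XML itself.
RENDERERS = {
    "to": lambda text: f"TO: {text.upper()}",
    "body": lambda text: f"  {text}",
}


def render(xml_text, renderers):
    """Apply a renderer to each child element; unknown tags yield nothing."""
    lines = []
    for element in ET.fromstring(xml_text):
        handler = renderers.get(element.tag)
        if handler is not None:
            lines.append(handler(element.text or ""))
    return lines


print(render(CONTENT, RENDERERS))
print(render(CONTENT, {}))  # no behavior supplied, so nothing is displayed
```

Swapping in a different table changes the presentation without touching the content, which is the role a stylesheet plays in publishing.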
Style sheets
The lack of a stylesheet language that is both powerful enough for XML and easy to use is currently holding back the general use of XML for Web documents. Cascading Style Sheets (CSS), the stylesheet language developed for HTML, can be used to apply styles to XML documents, but it cannot perform the transformations or build the structures (such as tables of contents) that XML-based publishing typically requires.
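To show the kind of transformation meant here, the following Python sketch builds a generated structure, a table of contents, from a document's chapter titles. It is not XSL or DSSSL, only an illustration of what a styling-only language cannot do; the book vocabulary and the with_contents function are invented for the example.

```python
import xml.etree.ElementTree as ET

# An invented book vocabulary used only for this illustration.
BOOK = """
<book>
  <chapter><title>Markup</title><para>Text of chapter one.</para></chapter>
  <chapter><title>Stylesheets</title><para>Text of chapter two.</para></chapter>
</book>
"""


def with_contents(xml_text):
    """Parse the document and insert a generated <contents> element first."""
    root = ET.fromstring(xml_text.strip())
    contents = ET.Element("contents")
    for number, chapter in enumerate(root.findall("chapter"), start=1):
        entry = ET.SubElement(contents, "entry", number=str(number))
        entry.text = chapter.findtext("title", default="")
    # The new structure did not exist in the source; it is derived from it.
    root.insert(0, contents)
    return root


book = with_contents(BOOK)
print([(entry.get("number"), entry.text) for entry in book.find("contents")])
```

Applying fonts and colors to existing elements is styling; creating the contents element from the chapter titles is a transformation, and that is the capability publishing applications need.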
The Document Style Semantics and Specification Language (DSSSL), the ISO stylesheet language designed for SGML, has the capabilities required for advanced publishing projects. But DSSSL (rhymes with "whistle") has a syntax based on the Scheme programming language, which many people find difficult to learn. It also lacks a rich declarative layer, which makes it almost impossible to ensure that independently developed stylesheet editors can interoperate.
This is where the Extensible Style Language (XSL) comes in. Part of the XML grand plan from the beginning, XSL is a new language that combines the power of DSSSL with the simplicity of XML syntax and the "style property" terminology established by Cascading Style Sheets. A W3C XSL Working Group, established in January 1998, is busy defining the language that will make XML-based Web publishing possible.
Although a final XSL specification is still almost a year away, a first XSL working draft has now been published on the W3C Web site at http://www.w3.org/TR/WD-xsl. This early specification deserves careful attention from anyone who plans to publish electronically as we enter the new century.
Myth 4: XML is just for data
Since we do not yet have a stylesheet language powerful enough to let XML demonstrate its superiority as a publishing medium, the first XML applications are built on what it can already do: transfer structured data.
A single, readable syntax
By serializing any kind of structured data, including relational data, in a form that can be processed and displayed with simple, ubiquitous, standardized tools, XML gives us a single, human-readable syntax. The wider implications of a standard, easily processed serial data format are hard to imagine, but they will clearly have a huge impact on e-commerce. And it seems clear that electronic commerce will eventually become synonymous with commerce in general.
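As a sketch of what serializing relational data can look like, the Python example below writes a few rows to XML and reads them back with a generic parser. The table name, column names, row values, and the row-per-element layout are all assumptions made for illustration; XML itself prescribes none of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows, as they might come back from a relational query.
COLUMNS = ("id", "customer", "total")
ROWS = [(1, "Acme", "12.50"), (2, "Globex", "99.00")]


def rows_to_xml(table_name, columns, rows):
    """Serialize rows as <table><row><column>value</column>...</row></table>."""
    table = ET.Element(table_name)
    for row in rows:
        row_element = ET.SubElement(table, "row")
        for column, value in zip(columns, row):
            ET.SubElement(row_element, column).text = str(value)
    return ET.tostring(table, encoding="unicode")


xml_text = rows_to_xml("orders", COLUMNS, ROWS)
print(xml_text)

# Any XML parser can read the data back without knowing where it came from.
parsed = [[cell.text for cell in row] for row in ET.fromstring(xml_text)]
print(parsed)
```

The serialized form is plain text that a person can read and that any conforming parser on any platform can process, which is the property the paragraph above relies on.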
XML is to data what Java is to programs: it makes data independent of platforms and vendors. This capability is driving a wave of XML middleware applications, which will begin to appear in early 1999. However, XML's ability to support data and metadata exchange should not distract us from its original design intent. The designers of XML had in mind not only a transport layer for data, but also a universal, media-independent publishing format that would support users at every level of technical skill and in every language.
Media-independent publishing
Media-independent publishing is actually a much harder problem than data exchange. In fact, the requirements of general-purpose publishing can be said to be a superset of the requirements of data exchange. The advent of XSL will make a general publishing solution possible, with consequences that few have yet recognized.
The key to understanding the revolutionary potential of XML is that it is only one part of a more ambitious picture. XML by itself can provide a standardized exchange format for databases and spreadsheets. This is good. But XML and XSL together can also replace existing word processing and desktop publishing formats. They can give us a single, fully internationalized format with almost unlimited print and online publishing capabilities, fully interoperable across all products and platforms. The implications extend far beyond data exchange and the Web.
What standardized publishing means to users
The combination of XML and XSL is likely to be more complex and harder to work with than today's HTML, so at first it will be used by experts who build large, specialized publishing applications by hand. These will be the applications that demand the highest degree of automation and media independence: newspapers, business directories, encyclopedias, catalogs, TV listings, and so on.
Only when common word processing and desktop publishing programs begin to store files as a combination of XML and XSL, rather than in proprietary formats, will standardized processing begin to spread beyond this group of specialists. This is not a technical issue but an economic one, since the large makers of publishing tools have historically relied on proprietary formats to lock in their users. Only when ordinary users come to understand the benefits of a standardized, open format, and begin to demand support for it, will vendors switch to it.
The benefits of a standardized data and presentation format are compelling. They include:
• Full interoperability of content and style across applications and platforms;
• Freedom for content creators from vendor control of production tools;
• Freedom for users to choose how they view content;
• Easy creation of powerful tools for processing large-scale content;
• A level playing field for independent software developers;
• Truly international publishing across all media.
I believe that users' awareness of these benefits will ultimately force vendors to support the standardized approach, just as users' demand for Internet access forced vendors to support the Web.
The result will be a new relationship between the producers and consumers of many kinds of desktop software, one that will prove to be of great benefit to us all. It will mean the end of a handful of big companies' grip on the market and, perhaps more importantly, the end of a handful of big powers' grip on the market. The result will be better products and better communication for the human race.