XML Introductory Refinement: Structure and Syntax (1)

Source: Internet
Author: User

Tools for creating XML files
An XML file, like an HTML file, is just a text file. So the most common tool for creating XML files is the same as for HTML: Notepad. Besides Notepad there are more convenient tools, such as XML Notepad, XML Pro, and CLIP! XML Editor. One feature of these tools is that they can check whether the XML file you are building conforms to the XML specification. However, these tools are currently available only in English and must be paid for. You can also use FrontPage, Dreamweaver, and similar tools, but they are not very convenient for this purpose. As XML becomes more popular, I believe that easier-to-use tools for creating XML files will appear in the near future.

An example of an XML file
Now let's use Notepad to create our XML file. First look at an XML file:

Example 1


<?xml version="1.0" encoding="gb2312"?>
<reference>
  <book>
    <name>XML Introductory Refinement</name>
    <author>John</author>
    <price currency_unit="RMB">20.00</price>
  </book>
  <book>
    <name>XML Grammar</name>
    <!--The book is about to be published-->
    <author>Dick</author>
    <price currency_unit="RMB">18.00</price>
  </book>
</reference>

This is a typical XML file, edited and saved with an .xml suffix. The file can be divided into two major parts: the prolog and the body. The first line of this file is the prolog. It is the XML declaration, which tells the XML parser how to work. The XML 1.0 specification recommends the declaration rather than strictly requiring it, but when it is present it must be at the very start of the file. In the declaration, version gives the version number of the XML standard that the file follows. encoding gives the character encoding used in the file. It can be omitted, in which case the file must be encoded in Unicode (UTF-8 or UTF-16), but omitting it is not recommended. Because this example uses the GB2312 encoding, the encoding declaration cannot be omitted here. The prolog can also contain other declarations, which we will introduce later.

The rest of the file is the body, where the content of the XML file is stored. The body starts with the <reference> tag and ends with the </reference> tag. This element is called the "root element" of the XML file. <book> is a "child element" of the root element, and each <book> in turn has the child elements <name>, <author>, and <price>. The currency unit is an "attribute" of the <price> element, and "RMB" is the attribute value.
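
The structure described above can be explored with a parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, not part of the original article. It assumes the element names reference, book, name, author, price and the attribute name currency_unit (spellings chosen here for illustration), and it leaves out the encoding declaration because the document is already held in a Python string.

```python
import xml.etree.ElementTree as ET

# Example 1 as a string. Element and attribute names are assumed spellings;
# the encoding declaration is omitted because this is already a Python str.
DOC = (
    '<?xml version="1.0"?>'
    "<reference>"
    "<book><name>XML Introductory Refinement</name><author>John</author>"
    '<price currency_unit="RMB">20.00</price></book>'
    "<book><name>XML Grammar</name><!--The book is about to be published-->"
    "<author>Dick</author>"
    '<price currency_unit="RMB">18.00</price></book>'
    "</reference>"
)

root = ET.fromstring(DOC)
print("root element:", root.tag)

# Each <book> is a child element of the root; <price> carries an attribute.
for book in root.findall("book"):
    price = book.find("price")
    print(
        book.findtext("name"),
        book.findtext("author"),
        price.text,
        price.get("currency_unit"),
    )
```

By default this parser skips the comment, so only elements, text, and attributes reach the application.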

<!--The book is about to be published--> is a comment, just as in HTML. In an XML file, a comment is placed between the "<!--" and "-->" markers.

As you can see, an XML file is fairly straightforward. Like HTML, an XML file is made up of a series of tags, but the tags in an XML file are ones we define ourselves. They have a clear meaning, so the tag itself explains the content it encloses.

Syntax for XML files
Now that we have a first impression of an XML file, let's look at its syntax in detail. Before discussing syntax, we need to understand an important concept: the XML parser.

1. XML Parser

The main job of the parser is to check the XML file for structural errors, strip the tags from the XML file, and pass the correct content on to the next application for processing. XML is a markup language for structuring document information, and the XML specification lays down detailed rules for how a document is to be marked up. A parser is software (for example, written in Java) built according to these rules. HTML works the same way: a browser must have an HTML parser so that it can "read" web pages made up of all kinds of HTML tags and display them to us. If the browser's HTML parser cannot read a tag, it returns an error message.

HTML tags are actually quite messy, and there are many nonstandard tags (some web pages display normally in IE but not in Netscape Navigator). For this reason, the designers of XML strictly defined its syntax and structure from the outset. The XML files we write must follow these rules; otherwise the XML parser will mercilessly show an error message.

There are two kinds of XML files: well-formed XML files and validating XML files.

If an XML file satisfies the relevant rules of the XML specification and does not use a DTD (Document Type Definition, described in detail later), the file is said to be well-formed. If an XML file is well-formed, uses a DTD correctly, and the syntax of the DTD itself is correct, the file is validating. Corresponding to the two kinds of XML files there are two kinds of XML parsers: the well-formed parser and the validating parser. A validating parser can also be used to parse well-formed XML files, and IE 5 includes a validating parser.

To check whether a file satisfies the well-formed conditions, we can open the first XML file we just edited in the IE 5 browser.
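
Opening the file in a browser is one way to run a parser over it; the same check can be done from a script. The sketch below is an illustration rather than the article's method. It uses Python's standard xml.etree.ElementTree module, and the helper name is_well_formed is hypothetical. Note that this parser is non-validating: it checks well-formedness only and does not check a document against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The parser refuses the whole document on the first structural error.
        return False


# A properly closed document versus one with a missing </book> end tag.
print(is_well_formed("<reference><book/></reference>"))
print(is_well_formed("<reference><book></reference>"))
```

The first call reports a well-formed document and the second does not, because the book element is never closed.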

You may ask why the browser displays the file exactly as it appears in the source. That is because, for an XML file, we care only about its content; its presentation is left to CSS or XSL. Since we have not defined a CSS or XSL file for this XML file, it is displayed in its original form. In fact, for electronic data interchange an XML file alone is enough. If we want to display it in a particular form, we have to write a CSS or XSL file (this will be discussed later).

2. Well-formed XML files

We know that an XML file must be well-formed in order to be parsed correctly by the parser and displayed in the browser. So what is a well-formed XML file? There are several rules we must follow when creating XML files.

First, the XML file should begin with a declaration stating that it is an XML file and which version of the XML specification it uses. When the declaration is present, nothing may come before it, neither another element nor a comment.

Second, an XML file has exactly one root element. In our first example, <reference>...</reference> is the root element of the XML file.

Third, the tags in an XML file must be properly closed: every start tag must have a corresponding end tag. For example, the <name> tag must have a corresponding </name> end tag, unlike HTML, where the end tag of some elements is optional. If an XML file contains a self-contained tag, similar to <img src=.....> in HTML, which has no end tag, XML calls it an "empty element", and it must be written as <empty-element-name/>. If the element has attributes, it is written as <empty-element-name attribute-name="attribute value"/>.

Fourth, tags must not cross. In an HTML file, you could get away with writing this:

<b><h>xxxxxxx</b></h>

Here the <b> and <h> tags overlap. In XML such interleaved markup is strictly forbidden; tags must be closed in the reverse of the order in which they were opened.

Fifth, attribute values must be enclosed in quotation marks. In the first example, "1.0", "gb2312", and "RMB" are all enclosed in quotation marks, and the quotation marks cannot be left out.

Sixth, tag names, instructions, and attribute names are case-sensitive. This differs from HTML, where tags such as <B> and <b> mean the same thing. In XML, tags such as <name>, <NAME>, and <Name> are all different.
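
The rules above can be tried out one at a time against a parser. The following sketch, which is an illustration and not part of the original article, feeds small fragments to Python's standard xml.etree.ElementTree module. The helper name parse_error and all element names in the fragments are made up for the demonstration; the exact wording of the error messages depends on the underlying parser, so none is shown here.

```python
import xml.etree.ElementTree as ET


def parse_error(text):
    """Return None if `text` is well-formed, else the parser's error message."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        return str(err)


# One fragment per rule; only "empty element" follows the rules.
CASES = {
    "declaration not first": '<!--note--><?xml version="1.0"?><a/>',
    "two root elements": "<a/><b/>",
    "missing end tag": "<book><name>XML</book>",
    "empty element": '<book><cover src="c.png"/></book>',
    "crossed tags": "<b><h>xxxxxxx</b></h>",
    "unquoted attribute value": "<price unit=RMB>20.00</price>",
    "case mismatch": "<name>XML</Name>",
}

for label, text in CASES.items():
    print(f"{label}: {parse_error(text) or 'well-formed'}")
```

Only the "empty element" fragment is accepted. Every other fragment breaks exactly one of the six rules, and the parser rejects the whole fragment rather than guessing at a repair the way an HTML browser would.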

We know that in HTML files, if we want the browser to display exactly what we type, we can put the text between <pre></pre> or <xmp></xmp> tags. This is essential when creating pages that teach HTML, because such pages have to display HTML source code. In XML, this is done with a CDATA section. The parser passes the information in a CDATA section to the application unaltered and does not interpret any markup inside it. A CDATA section starts with "<![CDATA[" and ends with "]]>". In the source of Example 2, for instance, everything except the "<![CDATA[" and "]]>" markers is handed to the downstream application intact by the parser, including leading and trailing spaces and line breaks inside the CDATA section (note that CDATA must be written in uppercase).

Example 2

<![CDATA[flying xml>>>>>,:-)]]>
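
To see what the application actually receives from a CDATA section, here is a small sketch using Python's standard xml.etree.ElementTree module. It is an illustration only: the wrapping note element and the <b> tags inside the section are assumptions added so the fragment is a complete document and the effect is visible.

```python
import xml.etree.ElementTree as ET

# The <b>...</b> inside the CDATA section is plain text, not markup.
doc = "<note><![CDATA[ <b>flying xml</b> >>>>>,:-) ]]></note>"
note = ET.fromstring(doc)

print(repr(note.text))  # text exactly as typed, spaces included
print(len(note))        # number of child elements of <note>
```

The text comes back unchanged, including the leading and trailing spaces, and note has no child elements because the parser did not treat <b> as a tag.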



XML does not handle whitespace the same way HTML does. The HTML standard stipulates that any run of whitespace, however long, is treated as a single space. XML instead stipulates that the parser must faithfully pass all whitespace outside the tags on to the downstream application. As a result, we sometimes have to give up the indentation habits we formed when writing HTML files, because the parser has to process the whitespace used for indentation as well. For example:

<author>John</author>

<author>
John
</author>

To the parser these two are different: in the second, the content between <author> and </author> includes two line breaks in addition to the characters "John".
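
The difference between the two forms can be checked directly. This is a minimal sketch using Python's standard xml.etree.ElementTree module, added for illustration; the variable names compact and indented are arbitrary.

```python
import xml.etree.ElementTree as ET

compact = ET.fromstring("<author>John</author>")
indented = ET.fromstring("<author>\nJohn\n</author>")

# repr() makes the line breaks visible.
print(repr(compact.text))
print(repr(indented.text))
print(compact.text == indented.text)
```

The second element's text carries a line break before and after "John", so the two values are not equal. An application that does not care about this whitespace has to remove it itself, for example with indented.text.strip().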
