Getting started with the basics of XML

Source: Internet
Author: User
Before discussing XML files, let's look at an example:


<person>
 Alan Turing
</person>


This is a well-formed XML file; <person> and </person> are the start tag and the end tag, respectively.

Start tag: begins with < and ends with >, with the tag name in between.

End tag: begins with </ and ends with >, with the tag name in between.

Note: The start tag and the end tag must use the same name, but XML does not prescribe which names you may use. This is different from HTML, where the set of tag names is fixed. You can use person to mark up a person and cat to mark up a cat.

The text Alan Turing between the tags is data. The space between Alan and Turing is also data, which means that whitespace inside an element's data is not ignored.
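To make the point about whitespace concrete, here is a minimal sketch that parses the person example. It assumes Python's standard xml.etree.ElementTree module, which the original article does not use; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The person example, written with explicit newlines and indentation.
doc = "<person>\n  Alan Turing\n</person>"

person = ET.fromstring(doc)

# The parser returns every character between the tags,
# including the newlines, the indentation and the inner space.
raw_text = person.text
print(repr(raw_text))

# Trimming the edges is an application-level choice, not something XML does.
trimmed = raw_text.strip()
print(trimmed)
```

Whether to strip the surrounding whitespace is left to the program that reads the data; the parser itself keeps it.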

Sometimes we need elements that do not contain any data. (An element is everything from the start tag to the end tag, including both tags; the example above is one element.) For instance:

<person></person>

This is an empty element, and there is a shorter notation for it:

<person/>

Note: XML is case-sensitive, which is different from HTML. <Person> and <PERSON> are different tags; if an element starts with <person>, you cannot use </PERSON> as its end tag.
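The two claims above (both empty-element notations are equivalent, and tag names are case-sensitive) can be checked with a short sketch. It assumes Python's standard xml.etree.ElementTree module; the helper is_well_formed is a hypothetical name introduced for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts `text`, False if it raises ParseError."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# The long and the short notation both produce an element with no content.
long_form = ET.fromstring("<person></person>")
short_form = ET.fromstring("<person/>")
same_element = (
    long_form.tag == short_form.tag
    and len(long_form) == len(short_form) == 0
    and long_form.text == short_form.text
)

# Tag names are case-sensitive, so the end tag must match the start tag exactly.
matching_case = is_well_formed("<Person>Alan Turing</Person>")
mismatched_case = is_well_formed("<person>Alan Turing</PERSON>")

print(same_element, matching_case, mismatched_case)
```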

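The example above is a single element. We now give a slightly more complex example, and then introduce the concept of an XML tree.

<person>
 <name>
  <first_name>Alan</first_name>
  <last_name>Turing</last_name>
 </name>
 <profession>computer scientist</profession>
 <profession>mathematician</profession>
 <profession>cryptographer</profession>
</person>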
The example above is still a person element, but unlike the previous one, it contains 4 child elements: 1 name element and 3 profession elements. We call person the parent element of name, and it is also the parent element of each profession. In the same way, name is the parent element of first_name and last_name.
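The parent and child relationships described above can be walked programmatically. The following is a sketch only, assuming Python's standard xml.etree.ElementTree module; the profession values other than computer scientist are taken from the attribute example later in the article, and outline is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

doc = """<person>
  <name>
    <first_name>Alan</first_name>
    <last_name>Turing</last_name>
  </name>
  <profession>computer scientist</profession>
  <profession>mathematician</profession>
  <profession>cryptographer</profession>
</person>"""

person = ET.fromstring(doc)

# Iterating over an element yields its direct children, in document order.
child_tags = [child.tag for child in person]

# name is itself a parent: its children are first_name and last_name.
name = person.find("name")
grandchild_tags = [child.tag for child in name]

professions = [p.text for p in person.findall("profession")]


def outline(element, depth=0):
    """Return the tag names of `element` and its descendants, indented by depth."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(person)))
```

The indented outline printed at the end is one way to see the tree shape: one root, with every other element hanging off exactly one parent.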

In the example above the tags are nested, which is allowed. Overlapping tags, however, are illegal, such as:

<strong><em>This common example from HTML</strong></em>

It should be:

<strong><em>This common example from HTML</em></strong>
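As a rough check of the nesting rule, the sketch below feeds both versions to a parser and records where it gives up. It assumes Python's standard xml.etree.ElementTree module; check is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

overlapping = "<strong><em>This common example from HTML</strong></em>"
nested = "<strong><em>This common example from HTML</em></strong>"


def check(text):
    """Return None if `text` is well-formed, else the (line, column) of the error."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return exc.position
    return None


overlap_error = check(overlapping)
nested_error = check(nested)

print("overlapping:", overlap_error)
print("nested:", nested_error)
```

A browser would typically tolerate the overlapping form in HTML; an XML parser is required to reject it.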

Given the parent-child relationships in the example above, and noting that every XML file must contain exactly one root element (that is, an element without a parent element), the document looks like a tree.

Now we give an example of an XML file in which markup and text are mixed:

<biography>
<name><first_name>Alan</first_name> <last_name>Turing</last_name>
</name> was one of the first people to truly deserve the name
<emphasize>computer scientist</emphasize>. Although his contributions
to the field are too numerous to list, his best-known are the
eponymous <emphasize>Turing Test</emphasize> and
<emphasize>Turing Machine</emphasize>.

<definition>The <term>Turing Test</term> is to this day the standard
test for determining whether a computer is truly intelligent. This
test has yet to be passed. </definition>

<definition>The <term>Turing Machine</term> is an abstract finite
state automaton with infinite memory that can be proven equivalent
to any other finite state automaton with arbitrarily large memory.
Thus what is true for a Turing machine is true for all equivalent
machines no matter how implemented.
</definition>

<name><last_name>Turing</last_name></name> was also an accomplished
<profession>cryptographer</profession>. His assistance
was crucial in helping the Allies decode the German Enigma
machine. He committed suicide on <date><month>June</month>
<day>7</day>, <year>1954</year></date> after being
convicted of homosexuality and forced to take female
hormone injections.
</biography>
I will not walk through the example above, but you should know that it is a well-formed XML file, which means that tags and text can be mixed. However, XML files in this form are cumbersome to process, so they are not recommended.
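One reason mixed content is awkward to process is that the text ends up scattered across several places in the parsed tree. The sketch below illustrates this on a shortened fragment, assuming Python's standard xml.etree.ElementTree module; the root name biography is an assumption made for the illustration.

```python
import xml.etree.ElementTree as ET

doc = (
    "<biography><name><first_name>Alan</first_name> "
    "<last_name>Turing</last_name></name> was also an accomplished "
    "<profession>cryptographer</profession>.</biography>"
)

bio = ET.fromstring(doc)
name = bio.find("name")
first_name = name.find("first_name")
profession = bio.find("profession")

# Text that follows an element's end tag is stored on that element as .tail,
# not on the parent, so the sentence is split over several attributes.
space_between_names = first_name.tail
after_name = name.tail
after_profession = profession.tail

# itertext() stitches the pieces back together in document order.
full_text = "".join(bio.itertext())
print(full_text)
```

With element-only content (as in the earlier person example) each value sits in exactly one element's text, which is why that style is simpler to handle.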

Next we'll talk about attributes. See the example:

<person born="1912-06-23" died="1954-06-07">
 Alan Turing
</person>

Here born and died are attributes: born is the attribute name and 1912-06-23 is the attribute value. Attribute values are enclosed in double quotes; single quotes can be used as well:

<person died='1954-06-07' born='1912-06-23'>
 Alan Turing
</person>

The advantage of single quotes is that the attribute value can then contain double quotes.
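A small sketch of how a parser sees these attributes, assuming Python's standard xml.etree.ElementTree module. The nickname attribute is hypothetical and exists only to show a double quote inside a single-quoted value.

```python
import xml.etree.ElementTree as ET

double_quoted = ET.fromstring(
    '<person born="1912-06-23" died="1954-06-07">Alan Turing</person>'
)
single_quoted = ET.fromstring(
    "<person died='1954-06-07' born='1912-06-23'>Alan Turing</person>"
)

# Attributes are exposed as a dictionary; neither the quote style
# nor the order in which they were written affects the result.
same_attributes = double_quoted.attrib == single_quoted.attrib
born = double_quoted.get("born")

# Single quotes around the value allow a literal double quote inside it.
with_inner_quotes = ET.fromstring("""<person nickname='Alan "Prof" Turing'/>""")
nickname = with_inner_quotes.get("nickname")

print(same_attributes, born, nickname)
```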

Here we run into a question:

<person>

<name first="Alan" last="Turing"/>

<profession value="computer scientist"/>

<profession value="mathematician"/>

<profession value="cryptographer"/>

</person>


In this example the person element has 4 child elements. Each child carries its data in attributes, and all 4 are empty elements. Compared with the earlier example, is it better to put a value in an attribute or directly between the tags? This is a much-debated question, and my opinion is that you should decide for yourself which one to use. But be aware that a single element cannot have several attributes with the same name.

<person born="1912-06-23" born="1954-06-07">
 Alan Turing
</person>

The above is not a well-formed XML file.

Next we look at special characters. Since < and > delimit tags, we generally cannot include < and > directly in the data; instead we write &lt; and &gt;, just as in HTML. (Strictly speaking, only < and & must always be escaped; > must be escaped only when it follows ]].) Likewise, for & we write &amp;, for the double quote " we write &quot;, and so on.
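In practice the escaping is usually left to a library. The following sketch assumes Python's standard xml.sax.saxutils.escape and xml.etree.ElementTree; the sample string and the element name code are made up for the illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = 'if a < b && b > c then say "ok"'

# escape() replaces &, < and > with their entity references.
escaped = escape(raw)

# For attribute values, the quote character in use must be escaped as well.
escaped_for_attribute = escape(raw, {'"': "&quot;"})

# Parsing the escaped forms gives back the original string.
from_content = ET.fromstring("<code>" + escaped + "</code>").text
from_attribute = ET.fromstring(
    '<code note="' + escaped_for_attribute + '"/>'
).get("note")

# The unescaped string is rejected by the parser.
try:
    ET.fromstring("<code>" + raw + "</code>")
    raw_accepted = True
except ET.ParseError:
    raw_accepted = False

print(escaped)
```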

Next, comments. The syntax is the same as in HTML:

<!-- on the left is the comment start delimiter, on the right is the end delimiter -->

But note that a comment cannot contain the two-hyphen sequence --, and any markup inside a comment is ignored. Also note that a comment cannot appear inside an element's tag.
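The three comment rules can be tried out with a short sketch, assuming Python's standard xml.etree.ElementTree module (whose default parser discards comments); is_well_formed is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts `text`, False if it raises ParseError."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Markup inside a comment is not parsed: person ends up with no child elements.
person = ET.fromstring(
    "<person><!-- <ignored>markup</ignored> -->Alan Turing</person>"
)
child_count = len(person)
text = person.text

# A double hyphen inside a comment is not allowed.
double_hyphen_ok = is_well_formed("<person><!-- a -- b -->Alan Turing</person>")

# A comment cannot be placed inside a tag.
comment_in_tag_ok = is_well_formed("<person <!-- note -->>Alan Turing</person>")

print(child_count, text, double_hyphen_ok, comment_in_tag_ok)
```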

Now we look at XML as a whole:

1. XML declaration

All XML documents may (and should!) begin with an XML declaration. Although the declaration uses a syntax similar to that of processing instructions, technically, according to the XML Recommendation, it is not one, because the declaration is a reserved part of XML.

<?xml version="1.0" encoding="ASCII" standalone="yes"?>
<person>
 Alan Turing
</person>


If you include the XML declaration, it must be at the very beginning of the document; no whitespace or comment may precede it. Strictly speaking, the declaration is not required in XML, but we will see later that it does allow some optimizations when working with the document.

These attributes are defined in the XML 1.0 specification:

version: cannot be omitted; the value must be "1.0". This attribute is there to allow support for future versions of XML.

encoding: optional; the value must be a valid character encoding name, such as "UTF-8", "UTF-16" or "ISO-8859-1" (i.e. the Latin-1 character encoding). All XML parsers are required to support at least UTF-8 and UTF-16. If this attribute is omitted, the encoding is assumed to be "UTF-8" or "UTF-16", depending on the format of the initial "<?xml" string.

standalone: optional; the value must be "yes" or "no". "yes" means that all required entity declarations are contained in the document, and "no" means that an external DTD is required. The DTD is described later.
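The placement rule and the effect of the encoding attribute can be observed with a short sketch. It assumes Python's standard xml.etree.ElementTree module; the sample documents and the name André are made up for the illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(data):
    """Return True if the parser accepts `data`, False if it raises ParseError."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


declared = (
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
    b"<person>Alan Turing</person>"
)
person_text = ET.fromstring(declared).text

# Whitespace before the declaration is an error.
leading_space_ok = is_well_formed(b" " + declared)

# The byte 0xE9 is the letter e-acute in ISO-8859-1 but is not valid UTF-8 here.
latin1_body = b"<name>Andr\xe9</name>"
with_encoding = ET.fromstring(
    b'<?xml version="1.0" encoding="ISO-8859-1"?>' + latin1_body
).text
without_declaration_ok = is_well_formed(latin1_body)

print(person_text, leading_space_ok, with_encoding, without_declaration_ok)
```

The last two lines show why the declaration matters for non-ASCII data: the same bytes parse with the declaration and fail without it, because the parser then falls back to UTF-8.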

Finally, here are the requirements for well-formed XML:

1. Every start tag must have a matching end tag

2. Tags can be nested but must not overlap

3. Every XML file has exactly one root element

4. An element cannot have two attributes with the same name

5. Comments cannot appear inside element tags

6. Unescaped < and & cannot appear in element content or attribute values
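The checklist above can be exercised with one deliberately broken document per rule. This is a sketch under assumptions: it uses Python's standard xml.etree.ElementTree module, and the sample documents and the helper is_well_formed are invented for the illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts `text`, False if it raises ParseError."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# One deliberately broken document per rule in the list above.
violations = {
    "1. missing end tag": "<person><name>Alan Turing</person>",
    "2. overlapping tags": "<a><b>text</a></b>",
    "3. two root elements": "<person/><person/>",
    "4. duplicate attribute": '<person born="1912" born="1954"/>',
    "5. comment inside a tag": "<person <!-- note -->/>",
    "6. unescaped ampersand": "<person>Turing & Church</person>",
}

results = {rule: is_well_formed(text) for rule, text in violations.items()}

# A document that respects all six rules, for comparison.
good = '<person born="1912"><name>Turing &amp; Church</name></person>'
good_ok = is_well_formed(good)

for rule, ok in results.items():
    print(rule, "->", "accepted" if ok else "rejected")
print("good document ->", "accepted" if good_ok else "rejected")
```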
