Summary of XML Learning (II): Introduction to XML

Source: Internet
Author: User
Tags: CDATA, coding standards, processing instruction, XML parser

Learning XML syntax

The purpose of learning XML syntax is to be able to write XML documents.

An XML file is made up of the following parts:

    • Document declaration
    • Elements
    • Attributes
    • Comments
    • CDATA sections and special characters
    • Processing instructions

1.1. XML syntax--document declaration

When you write an XML document, you should begin it with a document declaration, which identifies the file as XML and states the XML version.

The simplest declaration syntax: <?xml version="1.0"?>

For example:

<?xml version="1.0"?>
<softcompany>
    <company>microsoft</company>
    <company>google</company>
    <company>apple</company>
</softcompany>

The browser parsing results are as follows:

  

Use the encoding attribute to declare the character encoding of the document: <?xml version="1.0" encoding="GB2312"?>

When the XML file contains Chinese characters and is not saved as UTF-8 or UTF-16 (the encodings a parser assumes when nothing is declared), you must use the encoding attribute to declare the document's character encoding, for example encoding="GB2312", and save the file in that same encoding. Declaring encoding="UTF-8" explicitly is also good practice. Otherwise, a parsing error occurs when the browser parses the XML file.

For example:

<?xml version="1.0"?>
<softcompany>
    <company>microsoft</company>
    <company>google</company>
    <company>apple</company>
    <company>百度</company>
</softcompany>

This XML file does not use the encoding attribute to declare its character encoding, yet it contains the Chinese characters "百度" (Baidu). When IE parses the file, nothing tells it which encoding the file was saved in, so it cannot decode the Chinese characters and parsing fails. The error is shown in Figure 1:

  

Figure-1

For the XML document to be parsed correctly, use the encoding attribute to declare the document's character encoding.

For example:

<?xml version="1.0" encoding="GB2312"?>
<softcompany>
    <company>MicroSoft</company>
    <company>google</company>
    <company>apple</company>
    <company>百度</company>
</softcompany>

When IE now parses the XML file, the Chinese characters in it are parsed normally, as shown in Figure 2:

  

Figure-2
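As a rough sketch of the same behavior outside the browser, the snippet below uses Python and its standard-library xml.etree.ElementTree module (both are assumptions of this example, not tools used in the original article). It takes one document with no encoding declaration and encodes it two ways. A parser that finds no declaration assumes UTF-8, so only the UTF-8 bytes parse.

```python
import xml.etree.ElementTree as ET

# Document text with no encoding declaration (百度 is "Baidu" in Chinese).
DOC = '<?xml version="1.0"?>\n<softcompany><company>百度</company></softcompany>'


def parses(data: bytes) -> bool:
    """Return True if the bytes are accepted as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# No declaration: the parser assumes UTF-8, so UTF-8 bytes are accepted.
utf8_ok = parses(DOC.encode("utf-8"))
# The GB2312 bytes for the Chinese text are not valid UTF-8, so parsing fails.
gb2312_ok = parses(DOC.encode("gb2312"))

print(utf8_ok, gb2312_ok)
```

The helper name parses is hypothetical; it only wraps the parser call so that the two cases can be compared side by side.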

1.2. An issue often encountered in writing XML files

XML files generally use the international, universal encoding UTF-8, so the header of an XML file often contains this line:

<?xml version="1.0" encoding="UTF-8"?>

Suppose we write an XML file with a text editor such as Notepad or EditPlus. For example, we use EditPlus to write the following XML file (the file contains Chinese text, which appears here in translation):

<?xml version="1.0" encoding="UTF-8"?>
<CharacterEncoding>
    <China>
        <encoding>GB2312</encoding>
        <encoding>GBK</encoding>
    </China>
    <Japan>
        <encoding>JIS</encoding>
    </Japan>
</CharacterEncoding>

When we save the file, the editor saves it in "ANSI" encoding by default, as shown in Figure 3:

  

Figure-3

When we wrote the XML file, we declared encoding="UTF-8", but when we saved it, the editor stored the bytes in "ANSI" encoding. Because the declaration says UTF-8, the browser decodes the file as UTF-8 when parsing it, with the result shown in Figure 4:

   

Figure-4

As you can see, the browser fails to parse the file. Why? We clearly declared the document's character encoding as UTF-8, so why can't the Chinese text be parsed? To answer that, we need to look at what "ANSI" encoding actually means.

Different countries and regions have developed their own standards, producing encodings such as GB2312, BIG5, and JIS. These encodings, which use two bytes to represent each extended character such as a Chinese character, are collectively called ANSI encodings. On a Simplified Chinese system, ANSI encoding means GB2312; on a Japanese system, it means JIS. Different ANSI encodings are incompatible with one another, so when information is exchanged internationally, text in two of these languages cannot be stored in the same piece of ANSI-encoded text.

The following analysis shows why IE cannot parse the XML file (Figure 5):

  

Figure-5

It is important to remember that when you write an XML file with a text editor such as Notepad or EditPlus, you must save the file in the encoding named by the encoding attribute of the XML declaration, so that the browser can parse the XML file correctly.
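The rule above can be turned into a quick check before opening a file in the browser. The sketch below is in Python (an assumption of this example) and its helper names declared_encoding and saved_as_declared are hypothetical. It reads the encoding named in the declaration and tests whether the file's bytes decode cleanly with it. This is only a heuristic: it catches the ANSI-versus-UTF-8 mismatch described here, but some mismatched byte sequences can still decode without error, and byte order marks are not handled.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the file start.
DECL = re.compile(rb'''^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']''')


def declared_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the declaration, or UTF-8 if none is given."""
    match = DECL.match(data)
    return match.group(1).decode("ascii") if match else default


def saved_as_declared(data: bytes) -> bool:
    """Return True if the bytes decode without error in the declared encoding."""
    try:
        data.decode(declared_encoding(data))
        return True
    except (UnicodeDecodeError, LookupError):
        return False


text = '<?xml version="1.0" encoding="UTF-8"?>\n<encoding>中国</encoding>'

ok = saved_as_declared(text.encode("utf-8"))   # saved as declared
bad = saved_as_declared(text.encode("gbk"))    # saved as Simplified Chinese "ANSI"

print(declared_encoding(text.encode("utf-8")), ok, bad)
```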

To fix the problem above, save the CharacterEncoding.xml file again with UTF-8 encoding (Figure 6); the file can then be parsed normally.

  

Figure-6

The results of the browser parsing are as follows (Figure-7):

  

Figure-7

Some smarter IDEs automatically save an XML file in the encoding named by its encoding attribute. MyEclipse, for example, does this: when encoding="GB2312" is specified, the XML file is saved as GB2312 (Figure 8), and when encoding="UTF-8" is specified, it is saved as UTF-8 (Figure 9).

  

Figure-8

  

Figure-9

Use the standalone attribute to state whether the document is standalone, that is, whether it can be processed without relying on external declarations:

<?xml version="1.0" encoding="GB2312" standalone="yes"?>
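To see what a parser reads from the declaration, here is a small Python sketch (Python is an assumption of this example). It uses the XmlDeclHandler callback of the standard-library xml.parsers.expat module, which reports the version, encoding, and standalone flag. The helper name read_declaration is hypothetical, and the sample declares UTF-8 rather than GB2312 so that the snippet does not depend on multi-byte encoding support in the parser.

```python
import xml.parsers.expat


def read_declaration(data: bytes) -> dict:
    """Return the version, encoding and standalone flag of the XML declaration."""
    found = {}

    def on_declaration(version, encoding, standalone):
        # expat reports standalone as 1 ("yes"), 0 ("no") or -1 (omitted).
        found.update(version=version, encoding=encoding, standalone=standalone)

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_declaration
    parser.Parse(data, True)
    return found


info = read_declaration(
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><root/>'
)
print(info)
```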

1.3. XML syntax--elements

An XML element is a tag that appears in an XML file. An element consists of a start tag and an end tag, and can be written in the following forms:

    • With a tag body: <a>www.cnblogs.com/</a>
    • Without a tag body: <a></a>, which can be abbreviated to <a/>

A tag can contain any number of nested child tags. However, all tags must be properly nested; cross nesting is never allowed, for example:

    Incorrect: <a>welcome to <b>www.cnblogs.com/</a></b>

A well-formed XML document must have exactly one root tag, and every other tag is a descendant of that root tag.
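Both rules (proper nesting and a single root) are enforced by any conforming parser. The following Python sketch, using the standard-library xml.etree.ElementTree module as an assumed stand-in for the browser's parser, checks three small documents. The helper name well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


nested = well_formed("<a>welcome to <b>www.cnblogs.com/</b></a>")   # proper nesting
crossed = well_formed("<a>welcome to <b>www.cnblogs.com/</a></b>")  # cross nesting
two_roots = well_formed("<a/><b/>")                                 # two root tags

print(nested, crossed, two_roots)
```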

The XML parser treats all whitespace and line breaks that appear inside a tag as part of the tag's content. For example, the following two fragments have different meanings.

First fragment:

<url>http://www.cnblogs.com/</url>

Second fragment:

<url>
    http://www.cnblogs.com/
</url>

Because spaces and line breaks are treated as content in XML, the "good" habit of using line breaks and indentation to make a source file readable may have to be changed when writing XML files.
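The difference between the two fragments is easy to observe in code. In this Python sketch (Python and xml.etree.ElementTree are assumptions of this example), the two versions of the url element yield different text, and the values match only after the surrounding whitespace is stripped by the application.

```python
import xml.etree.ElementTree as ET

compact = ET.fromstring("<url>http://www.cnblogs.com/</url>")
spaced = ET.fromstring("<url>\n    http://www.cnblogs.com/\n</url>")

# The parser keeps the line breaks and indentation as part of the text.
same_raw = compact.text == spaced.text
# Stripping is the application's decision, not something the parser does.
same_stripped = compact.text == spaced.text.strip()

print(repr(spaced.text))
print(same_raw, same_stripped)
```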

An XML element name can contain letters, digits, and some other visible characters, but it must follow these rules:

    1. Names are case-sensitive; for example, <P> and <p> are two different tags.
    2. A name cannot start with a digit, a hyphen (-), or a period (.); it may start with a letter or an underscore (_).
    3. A name should not start with xml (or XML, Xml, and so on), because such names are reserved.
    4. A name cannot contain spaces.
    5. A name should not contain a colon (:), which is reserved for namespace prefixes.
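A few of these rules can be tried directly against a parser. The Python sketch below (Python and xml.etree.ElementTree are assumptions of this example; the helper name accepted is hypothetical) checks case sensitivity, a leading digit, and a space in a name. It does not cover the reserved xml prefix or the colon rule, which parsers generally do not reject on their own.

```python
import xml.etree.ElementTree as ET


def accepted(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


results = {
    "<p></p>": accepted("<p></p>"),  # matching case
    "<P></p>": accepted("<P></p>"),  # P and p are different names
    "<1p/>": accepted("<1p/>"),      # name starts with a digit
    "<my p/>": accepted("<my p/>"),  # space inside the intended name
}

for document, ok in results.items():
    print(document, ok)
```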

1.4. XML syntax--attributes

A tag can have multiple attributes, each with its own name and value, for example: <input name="text">. An attribute value must be enclosed in double quotation marks (") or single quotation marks ('), and attribute names follow the same naming rules as tag names.

Tip: in XML, the information carried by a tag attribute can also be expressed as a child element, for example:

<input>
    <name>text</name>
</input>
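The two styles carry the same information but are read differently by a program. A minimal Python sketch, assuming the standard-library xml.etree.ElementTree module (not something the original article uses):

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring('<input name="text"/>')
as_child = ET.fromstring("<input><name>text</name></input>")

# An attribute is read from the element itself.
from_attribute = as_attribute.get("name")
# A child element is looked up by tag name, and its text is read.
from_child = as_child.findtext("name")

print(from_attribute, from_child)
```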

1.5. XML syntax--comments

Comments in an XML file use the following format: <!-- comment -->

Note:

    • A comment cannot appear before the XML declaration.
    • Comments cannot be nested, for example:
<!-- large comment ...... <!-- local comment --> ...... -->
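Both restrictions can be confirmed with a parser. The following Python sketch (Python and xml.etree.ElementTree are assumptions of this example; the helper name well_formed is hypothetical) tries an ordinary comment, a comment placed before the XML declaration, and a nested comment.

```python
import xml.etree.ElementTree as ET


def well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


plain = well_formed("<root><!-- a comment --></root>")
before_declaration = well_formed("<!-- first --><?xml version='1.0'?><root/>")
nested = well_formed("<root><!-- outer <!-- inner --> outer --></root>")

print(plain, before_declaration, nested)
```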

1.6. XML syntax--CDATA sections

When writing an XML file, you may have content that the parsing engine should not parse as markup but should treat as raw text. In that case, put the content in a CDATA section. The XML parser does not process the content of a CDATA section; it passes it through unchanged.

Syntax: <![CDATA[content]]>

  For example:

<?xml version="1.0" encoding="UTF-8"?>
<soft>
    <![CDATA[
    <a classname="GACL.XDP">
        <a1>gacl</a1>
        <a2>xdp</a2>
    </a>
    ]]>
    <b>
        <b1>aloof Wolf</b1>
        <b2>Xu Da</b2>
    </b>
</soft>

After the IE browser's parsing engine parses the XML file, the result is as follows (Figure 10):

  

Figure-10
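The effect of a CDATA section can also be seen programmatically. In this Python sketch (Python and xml.etree.ElementTree are assumptions of this example, and the document is a shortened version of the one above), the markup inside the CDATA section comes back as plain text, while the b element outside it is parsed as a real child.

```python
import xml.etree.ElementTree as ET

doc = "<soft><![CDATA[<a1>gacl</a1>]]><b><b1>text</b1></b></soft>"
root = ET.fromstring(doc)

# The CDATA content is returned as text; it is not parsed into an <a1> element.
cdata_text = root.text
# Only the markup outside the CDATA section produces child elements.
child_tags = [child.tag for child in root]

print(cdata_text, child_tags)
```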

For individual special characters that you want displayed literally, you can also escape them instead of using a CDATA section.

  

Escape sequences for special characters:

    Special character    Escape sequence
    &                    &amp;
    <                    &lt;
    >                    &gt;
    "                    &quot;
    '                    &apos;

For example:

<?xml version="1.0" encoding="UTF-8"?>
<soft>
    <b>
        &lt;b1&gt;aloof Wolf&lt;/b1&gt;
        <b2>Xu Da</b2>
    </b>
</soft>

The parsed result is shown in Figure 11:

  

Figure-11
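When XML is produced by a program, the escaping is usually done by a library function rather than by hand. A short Python sketch, assuming the standard-library xml.sax.saxutils and xml.etree.ElementTree modules (not something the original article uses), escapes a piece of text and shows that the parser returns the original characters:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "<b1>aloof Wolf</b1>"

# escape() replaces &, < and > with their entity references.
escaped = escape(raw)
# After parsing, the entity references are turned back into the characters.
parsed_text = ET.fromstring(f"<b>{escaped}</b>").text

print(escaped)
print(parsed_text)
```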

1.7. XML syntax--processing instructions

A processing instruction, abbreviated PI, tells the parsing engine how to handle the content of the XML document. For example, an XML document can use the xml-stylesheet instruction to tell the XML parsing engine to apply a CSS file when displaying the document. Note that the CSS does not take effect when the tag names are in Chinese.

<?xml-stylesheet type="text/css" href="CSS file name.css"?>

For example:

<?xml version="1.0" encoding="UTF-8"?>
<!-- The xml-stylesheet instruction tells the XML parsing engine to apply the Country.css file when displaying this document -->
<?xml-stylesheet type="text/css" href="Country.css"?>
<country>
    <c1>China</c1>
    <c2>US</c2>
    <c3>Japan</c3>
    <c4>Korea</c4>
</country>

The Country.css style file code is as follows:

c1{
    font-size:200px;
    color:red;
}
c2{
    font-size:150px;
    color:green;
}
c3{
    font-size:100px;
    color:#ccc;
}
c4{
    font-size:130px;
    color:blue;
}

The results of parsing the XML file in the browser are as follows (Figure 12):

  

Figure-12

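A processing instruction must begin with "<?" and end with "?>". The XML declaration (<?xml version="1.0" encoding="UTF-8"?>) uses the same syntax and is often described as the most common kind of processing instruction, although strictly speaking the XML specification treats the declaration as a separate construct.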
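A program can read processing instructions much as a browser does. The sketch below is in Python and uses the standard-library xml.dom.minidom module (both assumptions of this example). It parses a shortened version of the country document and lists the target and data of each processing instruction found at the top level.

```python
from xml.dom import Node, minidom

DOC = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<?xml-stylesheet type="text/css" href="Country.css"?>'
    b"<country><c1>China</c1></country>"
)

document = minidom.parseString(DOC)

# Collect (target, data) for each processing instruction node in the document.
instructions = [
    (node.target, node.data)
    for node in document.childNodes
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]

print(instructions)
```

The target is the name right after "<?", and the data is the rest of the instruction, which the application (here, a browser applying a stylesheet) interprets on its own.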

This concludes the explanation of XML syntax.
