Web programming must-read: XML grammar analysis

Source: Internet
Author: User
Before parsing XML, it is necessary to understand the basic rules of XML syntax:

Lexical Features:
1 XML is case-sensitive. Element names in the opening and closing tags must match in case, as in <mytag>...</mytag>, and XML reserved strings must use the required case, as in <?xml ...?> and <!ENTITY ...>.

2 The XML reserved characters are <, > and &. They must be escaped rather than written literally in element names, element text, attribute names, and attribute values. < opens a tag, > closes a tag, and & introduces an escape. The common escapes are &lt; for <, &gt; for >, &amp; for &, &apos; for ', and &quot; for ".
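
As a small illustration of these escapes, the sketch below uses Python's standard xml.sax.saxutils module. The helper names escape_all and unescape_all and the sample string are assumptions made for this example; they do not come from the original text.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > on its own; the two quote characters
# have to be supplied as extra mappings.
QUOTES = {"'": "&apos;", '"': "&quot;"}
UNQUOTES = {"&apos;": "'", "&quot;": '"'}


def escape_all(text):
    """Replace the reserved characters and both quotes with their escapes."""
    return escape(text, QUOTES)


def unescape_all(text):
    """Reverse escape_all()."""
    return unescape(text, UNQUOTES)


if __name__ == "__main__":
    raw = 'if a < b & c > "d" then \'e\''
    escaped = escape_all(raw)
    print(escaped)
    print(unescape_all(escaped) == raw)
```

The quote escapes matter mostly inside attribute values, where the quote character that delimits the value cannot appear literally.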

3 An element name begins with an underscore or a letter and may contain letters, digits, periods, hyphens, underscores, colons, and extended characters from other languages. It may not contain whitespace (space, tab, newline, carriage return). An element name may carry a namespace prefix, for example <mytag> or <dt:mytag>. Element text may use any character other than the XML reserved characters, for example <mytag>My money is $2000</mytag>.

4 Attribute names follow the same rules as element names. An attribute value is enclosed in single or double quotes and may be any string that contains no XML reserved characters, for example <mytag myprop="proper value">. An attribute name with the xmlns prefix declares a namespace, for example <mytag xmlns:ns="http://www.myweb.com/myschema">.
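
To show how a namespace declaration and a prefixed name are seen by a parser, here is a minimal sketch using Python's standard xml.etree.ElementTree. The sample document, the ns:item child element, and the function name describe are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one ordinary attribute, one xmlns declaration,
# and one child element that uses the declared prefix.
DOC = (
    '<mytag xmlns:ns="http://www.myweb.com/myschema" myprop="proper value">'
    "<ns:item>1</ns:item>"
    "</mytag>"
)


def describe(xml_text):
    """Return (root tag, attribute dict, list of child tags)."""
    root = ET.fromstring(xml_text)
    return root.tag, dict(root.attrib), [child.tag for child in root]


if __name__ == "__main__":
    tag, attrs, children = describe(DOC)
    print(tag)
    print(attrs)     # the xmlns declaration is not listed as an attribute
    print(children)  # the prefix is replaced by {namespace-URI}
```

ElementTree resolves the prefix to its URI, which is why the child tag comes back in the {URI}name form rather than as ns:item.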

Syntactic features:
1 An XML document consists of an XML declaration, optional document type declarations, optional processing instructions, optional comments, and a data body with a single root element. CDATA sections may also be embedded in element content, for example:


<?xml ...?>              /* XML declaration */
<!DOCTYPE ...>           /* XML document type declaration */
<!-- ... -->             /* XML comment */
<?xml-stylesheet ...?>   /* XML processing instruction */
<root>                   /* root data element */
<child>
... <![CDATA[...]]>
</child>
</root>
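
The skeleton above can be inspected with a standard parser. The following is a minimal sketch using Python's xml.dom.minidom; the concrete document text, the KIND table, and the function names are assumptions chosen for this example.

```python
from xml.dom import minidom, Node

# Hypothetical document that fills in the skeleton.
DOC = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root SYSTEM "mydoc.dtd">
<!-- a comment -->
<?xml-stylesheet type="text/xsl" href="mystyle.xsl"?>
<root>
<child>text <![CDATA[a < b]]></child>
</root>
"""

# Readable labels for the DOM node types used here.
KIND = {
    Node.DOCUMENT_TYPE_NODE: "doctype",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "instruction",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.CDATA_SECTION_NODE: "cdata",
}


def top_level_kinds(xml_text):
    """List the kinds of nodes found directly under the document."""
    doc = minidom.parseString(xml_text)
    return [KIND[node.nodeType] for node in doc.childNodes]


def child_kinds(xml_text):
    """List the kinds of nodes inside the first <child> element."""
    doc = minidom.parseString(xml_text)
    child = doc.getElementsByTagName("child")[0]
    return [KIND[node.nodeType] for node in child.childNodes]


if __name__ == "__main__":
    print(top_level_kinds(DOC))
    print(child_kinds(DOC))
```

The XML declaration itself does not show up as a node; the parser consumes it to learn the version and encoding.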

2 The XML declaration is opened by <?xml and closed by ?>. It contains attributes such as version and encoding, for example: <?xml version="1.0" encoding="UTF-8"?>
3 The XML document type declaration is opened by <! followed by a reserved string and closed by >, for example: <!DOCTYPE mydoc SYSTEM "MYDOC.DTD">
4 An XML processing instruction is opened by <? followed by a reserved string and closed by ?>, for example: <?xml-stylesheet type="text/xsl" href="mystyle.xsl"?>
5 XML comments are opened by <!-- and closed by -->, for example: <!-- this is my XML document -->
6 An XML element is opened by <element name> and closed either by /> or by </element name>. The opening and closing tags of an element must match, for example <mytag /> or <mytag>...</mytag>. XML elements may be nested, and the nesting levels must also match, for example <mytag><subtag></subtag></mytag>.
7 A CDATA section is opened by <![CDATA[ and closed by ]]>. The text inside it bypasses the XML parsing rules, for example: <![CDATA[SELECT * FROM mytable where Thefield <= ']]>
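
The matching and CDATA rules in this list can be checked against a standard parser. Below is a minimal sketch with Python's xml.etree.ElementTree; the function names is_well_formed and text_of and the sample inputs are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the standard parser accepts the text."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def text_of(xml_text):
    """Return the text content of the root element."""
    return ET.fromstring(xml_text).text


if __name__ == "__main__":
    print(is_well_formed("<mytag><subtag></subtag></mytag>"))  # nested, matched
    print(is_well_formed("<mytag></myteg>"))                   # names differ
    print(is_well_formed("<q>a <= b</q>"))                     # raw < in text
    print(text_of("<q><![CDATA[a <= b & c]]></q>"))            # CDATA keeps it
```

Inside the CDATA section the characters < and & are passed through as plain text, which is why the last input is accepted while the one before it is not.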
Based on the lexical and syntactic features above, we can construct regular expressions for lexical analysis and a pushdown automaton for syntactic analysis.
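
The core of such a pushdown automaton is a stack of open element names. The sketch below is one simplified way to write it in Python, not the construction from the original article. It assumes the input has no comments, no CDATA sections, and no > inside attribute values; the pattern TAG and the function tags_balanced are hypothetical names.

```python
import re

# Simplified tag pattern: optional "/", a name, anything up to ">",
# and an optional "/" just before ">" for empty elements.
# Declarations such as <?xml ...?> and <!DOCTYPE ...> do not match
# because a name cannot start with "?" or "!".
TAG = re.compile(r"<(/?)([A-Za-z_][\w.:-]*)[^<>]*?(/?)>")


def tags_balanced(xml_text):
    """Check tag nesting with an explicit stack."""
    stack = []
    for closing, name, empty in TAG.findall(xml_text):
        if empty:
            # <name ... /> opens and closes at once: stack unchanged.
            continue
        if not closing:
            # <name ...>: push the open element.
            stack.append(name)
        elif not stack or stack.pop() != name:
            # </name> with no matching open element on top of the stack.
            return False
    # Every opened element must have been closed.
    return not stack


if __name__ == "__main__":
    print(tags_balanced("<mytag><subtag></subtag></mytag>"))
    print(tags_balanced("<mytag><subtag></mytag></subtag>"))
```

A regular expression alone cannot check arbitrary nesting depth, which is why the stack is needed for the syntactic part while regular expressions suffice for the lexical part.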
XML lexical regular expressions:
#define DIGIT [0, 1, ..., 9] /* digit characters */
#define LETTER [a, b, ..., z, A, B, ..., Z] /* alphabetic characters */
#define SIGNS [~, !, @, #, %, ^, &, *, (, ), ?, :, ;, ", ', ,, ., /, -, _, +, =, |, \] /* symbol characters */
#define ASCII2 [0x80, ..., 0xFF] /* extended ASCII characters */
#define SPACE [0x20, \t, \r, \n] /* space, tab, carriage return, line feed */
#define RESERVE [<, >, &] /* XML reserved characters */
1) The regular expression for an element name:
element_name -> (_ | letter | ascii2) (ε | _ | - | : | . | digit | letter | ascii2)*
2) The regular expression for element text:
element_text -> (ε | not reserve)*
3) The regular expression for an attribute name:
proper_name -> (_ | letter | ascii2) (ε | _ | - | : | . | digit | letter | ascii2)*
4) The regular expression for an attribute value:
proper_value -> (ε | not reserve)*
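
These rules translate fairly directly into a regular expression library. The following is a minimal sketch in Python that follows lexical rule 3 for names (letters, digits, periods, hyphens, underscores, colons, extended characters) and treats <, > and & as the reserved set. The constant and function names are assumptions, and ASCII2 is widened from 0x80..0xFF to all non-ASCII characters so that the patterns work on Unicode strings.

```python
import re

# Character classes written as fragments for use inside [...].
LETTER = "A-Za-z"
DIGIT = "0-9"
ASCII2 = "\u0080-\U0010FFFF"  # assumption: every non-ASCII character

# element_name and proper_name share one rule: a start character
# followed by any number of name characters.
NAME = re.compile(f"[_{LETTER}{ASCII2}][-_:.{DIGIT}{LETTER}{ASCII2}]*")

# element_text and proper_value: any run of non-reserved characters.
TEXT = re.compile(r"[^<>&]*")


def is_name(s):
    """True if s is a valid element or attribute name under these rules."""
    return NAME.fullmatch(s) is not None


def is_text(s):
    """True if s contains no reserved character."""
    return TEXT.fullmatch(s) is not None


if __name__ == "__main__":
    for candidate in ("mytag", "dt:mytag", "1tag", "my tag"):
        print(candidate, is_name(candidate))
    print(is_text("My money is $2000"), is_text("a < b"))
```

fullmatch is used instead of match so that a valid prefix followed by an illegal character, such as the space in "my tag", is rejected.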


