XML learning notes 5--XSD complex data types

Source: Internet
Author: User
Tags xml attribute

Complex data types are the counterpart of simple data types. An XML element can have either a simple or a complex data type, whereas an XML attribute can only have a simple data type. This note looks at the complex data types in XSD.

1. Define complex data types

(1) The <simpleType> element is used to define a simple data type. You can use the <complexType> element to define a complex data type. Its syntax is:

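  <complexType id=ID name=NCName abstract=true|false mixed=true|false
               block=(#all | list of (extension | restriction))
               final=(#all | list of (extension | restriction))
               any attributes>
    (annotation?, (simpleContent | complexContent |
      ((group | all | choice | sequence)?, (attribute | attributeGroup)*, anyAttribute?)))
  </complexType>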

The attributes of the <complexType> element are described as follows:

| Attribute | Description |
| --- | --- |
| id | Unique identifier of the <complexType> element itself |
| name | Name of the new data type defined by the <complexType> element |
| abstract | Whether the data type is abstract. An abstract data type cannot be used directly in XML documents. |
| mixed | Whether the type has mixed content, in which character data and child elements can appear together. This attribute cannot be used if the child element is <simpleContent>. If the child element is <complexContent>, it can be overridden by the mixed attribute of the <complexContent> element. |
| block | Prevents a type derived by the specified method from being used in place of the type being defined |
| final | Prevents new types from being derived from this type by the specified method |
| any attributes | Any other attributes from non-schema namespaces |

(2) Complex data types can be used only for elements, not for attributes. Data types can be further classified by the kind of element they apply to:

  • Simple data types: the element content is a simple type value, and the element cannot have attributes. They are defined with <simpleType>.
  • Complex data types with simple content: the element content is a simple type value, but the element has attributes. They are defined with <complexType> <simpleContent> ... </simpleContent> </complexType>. The syntax of the <simpleContent> element is:

      <simpleContent id=ID any attributes>
        (annotation?, (restriction | extension))
      </simpleContent>

  • Complex data types with complex content: the element can contain child elements, be empty, or contain mixed content, whether or not it has attributes. They are defined with <complexType> <complexContent> ... </complexContent> </complexType>. The syntax of the <complexContent> element is:

      <complexContent id=ID mixed=true|false any attributes>
        (annotation?, (restriction | extension))
      </complexContent>
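The three categories above can be sketched in one small schema. This is only an illustration: the names isbn_type, price_type, and book_type are hypothetical, and book_type uses the shorthand form that omits <complexContent> <restriction>.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Simple type: a text value only, no attributes -->
  <xs:simpleType name="isbn_type">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{13}"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Complex type with simple content: a decimal value plus an attribute -->
  <xs:complexType name="price_type">
    <xs:simpleContent>
      <xs:extension base="xs:decimal">
        <xs:attribute name="currency" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <!-- Complex type with complex content: child elements plus an attribute -->
  <xs:complexType name="book_type">
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="isbn" type="isbn_type"/>
      <xs:element name="price" type="price_type"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:ID"/>
  </xs:complexType>

  <xs:element name="book" type="book_type"/>
</xs:schema>
```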
 

(3) A complex data type defined directly under the root element <schema> is global, and its name attribute is required. A complex data type defined anywhere else is local and has no name.

(4) The final attribute specifies the derivation methods that cannot be used to derive new types from this type. Its value is #all or a list of extension and restriction. The default value is the finalDefault attribute value of the root element <schema>. This attribute is similar to the final attribute of <simpleType>, whose value is #all or any combination of restriction, list, and union.

(5) The block attribute specifies that the type being defined cannot be replaced by a type derived from it by the specified method. It accepts the same values as final. The default value is the blockDefault attribute value of the root element <schema>.
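A minimal sketch of final and block on a complex type follows. The type and element names (item_type, priced_item_type, item) are hypothetical and chosen only for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- final="restriction": new types may extend item_type but not restrict it.
       block="extension": in an XML document, a type derived by extension
       cannot stand in for item_type (for example through xsi:type). -->
  <xs:complexType name="item_type" final="restriction" block="extension">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Allowed: derivation by extension is not listed in final -->
  <xs:complexType name="priced_item_type">
    <xs:complexContent>
      <xs:extension base="item_type">
        <xs:sequence>
          <xs:element name="price" type="xs:decimal"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="item" type="item_type"/>
</xs:schema>
```

With this schema, deriving a type from item_type by restriction would be rejected by the schema processor, and an <item> element that declares xsi:type="priced_item_type" would fail validation.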

2. Define Elements

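(1) A complex data type is built from child elements and attributes, so the first question is how to define elements. In XSD, elements are defined with <element>. The syntax is as follows:

  <element id=ID name=NCName ref=QName type=QName substitutionGroup=QName
           default=string fixed=string form=qualified|unqualified
           maxOccurs=nonNegativeInteger|unbounded minOccurs=nonNegativeInteger
           nillable=true|false abstract=true|false
           block=(#all | list of (extension | restriction | substitution))
           final=(#all | list of (extension | restriction))
           any attributes>
    (annotation?, ((simpleType | complexType)?, (unique | key | keyref)*))
  </element>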

The attributes of the <element> element are as follows:

| Attribute | Description |
| --- | --- |
| id | Unique identifier of the <element> element |
| name | Name of the newly defined element. Required when the parent element is the root element <schema>. |
| ref | Reference to another element. The value can contain a namespace prefix. |
| type | Data type of the element, which can be a built-in data type, a simpleType, or a complexType |
| substitutionGroup | Name of the element that this element can replace. This element must have the same type as that element, or a type derived from it. |
| default | Default value. Can be used only when the element content is a simple type or text only. |
| fixed | Fixed value. Can be used only when the element content is a simple type or text only. default and fixed cannot be specified at the same time. |
| form | Whether the element must be qualified with a namespace (qualified or unqualified). The default value is the elementFormDefault attribute value of the <schema> element. |
| maxOccurs | Maximum number of times the element can appear in its parent element. It is a non-negative integer or unbounded. The default value is 1. |
| minOccurs | Minimum number of times the element can appear in its parent element. It must be less than or equal to maxOccurs. The default value is 1. |
| nillable | Whether an explicit nil value can be assigned to the element. The default value is false. If the value is true, the xsi:nil attribute of this element can be set to true in the XML document. |
| abstract | Whether the element is abstract. An abstract element cannot be used directly in an XML document. |
| block | Prevents the element from being replaced by an element whose type is derived by the specified method |
| final | Prevents elements whose types are derived by the specified method from being declared as substitutes for this element. The default value is the finalDefault attribute value of the <schema> element. |
| any attributes | Any other attributes from non-schema namespaces |

When the parent element is the root element <schema>, attributes such as ref, form, maxOccurs, and minOccurs cannot be used. Conversely, attributes such as substitutionGroup and final can be used only when the parent element is the root element <schema>.

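(2) You can use the <group> element to define a group of elements, and then reference that element group wherever other elements need it. Syntax:

  <group id=ID name=NCName ref=QName
         maxOccurs=nonNegativeInteger|unbounded minOccurs=nonNegativeInteger
         any attributes>
    (annotation?, (all | choice | sequence)?)
  </group>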

Let's look at an example:
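As a minimal sketch, the schema below defines one element group and references it from two complex types. All names (contact_group, author_type, publisher_type) are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- A named element group, defined once under <schema> -->
  <xs:group name="contact_group">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:choice>
        <xs:element name="email" type="xs:string"/>
        <xs:element name="phone" type="xs:string"/>
      </xs:choice>
    </xs:sequence>
  </xs:group>

  <!-- Both types reuse the group through ref -->
  <xs:complexType name="author_type">
    <xs:sequence>
      <xs:group ref="contact_group"/>
      <xs:element name="biography" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="publisher_type">
    <xs:sequence>
      <xs:group ref="contact_group"/>
      <xs:element name="address" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <xs:element name="author" type="author_type"/>
  <xs:element name="publisher" type="publisher_type"/>
</xs:schema>
```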

                          

Three order indicators define how child elements may appear:

  • all: child elements can appear in any order, and each child element can appear at most once. Inside <all>, minOccurs can only be 0 or 1, and maxOccurs can only be 1.
  • choice: child elements are mutually exclusive. Only one of them can appear.
  • sequence: child elements must appear in the specified order.
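The three indicators can be combined, as in the following sketch. The element names (person, name, email, and so on) are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <!-- sequence: name first, then email or phone, then an optional address -->
      <xs:sequence>
        <xs:element name="name">
          <xs:complexType>
            <!-- all: first and last in either order, middle is optional -->
            <xs:all>
              <xs:element name="first" type="xs:string"/>
              <xs:element name="middle" type="xs:string" minOccurs="0"/>
              <xs:element name="last" type="xs:string"/>
            </xs:all>
          </xs:complexType>
        </xs:element>
        <!-- choice: exactly one of email or phone -->
        <xs:choice>
          <xs:element name="email" type="xs:string"/>
          <xs:element name="phone" type="xs:string"/>
        </xs:choice>
        <xs:element name="address" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```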

(3) Element wildcard

When you cannot determine in advance which child elements an element will contain, you can use a wildcard. In XSD, the <any> element is the element wildcard: any element that the wildcard allows can appear at the position where <any> is placed. The syntax format is as follows:

  <any id=ID maxOccurs=nonNegativeInteger|unbounded minOccurs=nonNegativeInteger
       namespace=((##any | ##other) | list of (anyURI | ##targetNamespace | ##local))
       processContents=lax|skip|strict
       any attributes>
    (annotation?)
  </any>

Attributes of the <any> element:

| Attribute | Description | Value / value type | Default value |
| --- | --- | --- | --- |
| id | Uniquely identifies this element | ID | (none) |
| maxOccurs | Maximum number of times the element can appear | Non-negative integer or unbounded | 1 |
| minOccurs | Minimum number of times the element can appear | Non-negative integer, less than or equal to maxOccurs | 1 |
| namespace | The namespaces that an element replacing the wildcard may come from | ##any: elements from any namespace; ##other: elements from any namespace other than the target namespace; ##local: elements that are not in any namespace; ##targetNamespace: elements from the target namespace; a namespace URI: elements from that namespace; a whitespace-separated list of namespace URIs, ##targetNamespace, and ##local: elements from any namespace in the list | ##any |
| processContents | How the application or XML processor validates the replacing element | strict: the XML processor must obtain the schema for the namespace and validate all elements from that namespace; lax: the XML processor tries to obtain the schema for the namespace, validates the elements if it succeeds, and reports no error otherwise; skip: the XML processor does not try to obtain the schema and performs no validation | strict |
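A minimal sketch of an element wildcard, assuming a hypothetical person element that may be followed by arbitrary elements from other namespaces:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <!-- Any number of extra elements from other namespaces may follow.
             With lax, they are validated only if a schema for them is available. -->
        <xs:any namespace="##other" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```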

(4) Element substitution

XSD also provides a mechanism that allows one element to replace another. To define an element that can replace another element, add the substitutionGroup attribute to it; the value is the name of the element to be replaced. Note the following two points when using element substitution:

  • Both the replacing element and the replaced element must be declared as global elements.
  • The replacing element must have the same data type as the replaced element, or a type derived from the type of the replaced element.

In addition, the replaced element can limit substitution:

  • Its final attribute prevents elements whose types are derived by the specified method from being declared as its substitutes.
  • Its block attribute prevents it from being replaced in XML documents by elements whose types are derived by the specified method; block="substitution" prevents all substitution.
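Element substitution can be sketched as follows. The element names name, title, and book are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- The element to be replaced; it must be global -->
  <xs:element name="name" type="xs:string"/>

  <!-- The replacing element: global, same type, names the element it replaces -->
  <xs:element name="title" type="xs:string" substitutionGroup="name"/>

  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <!-- <title> may appear here in place of <name> -->
        <xs:element ref="name"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Under this schema, both <book><name>XML</name></book> and <book><title>XML</title></book> are valid. Adding block="substitution" to the declaration of name would make the second form invalid.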

3. Define attributes

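Attributes are defined in much the same way as elements, except that the <attribute> element is used. The syntax format is as follows:

  <attribute id=ID name=NCName ref=QName type=QName
             default=string fixed=string form=qualified|unqualified
             use=optional|prohibited|required
             any attributes>
    (annotation?, (simpleType?))
  </attribute>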

(1) The attributes of the <attribute> element are basically the same as those of the <element> element. One difference is that the default value of the form attribute is the value of the attributeFormDefault attribute of the root element <schema>. In addition, <attribute> has a use attribute that <element> does not have. It indicates how the attribute is used and can take the following values:

  • optional: the attribute is optional and can have any value of the specified data type. This is the default.
  • prohibited: the attribute cannot be used. (Why declare an attribute that cannot be used? This value is mainly used when deriving a new type, to delete an attribute of the original type.)
  • required: the attribute must appear. A default value cannot be specified for a required attribute.

(2) Attributes defined directly under the root element <schema> are called global attributes. Other parts of the schema can reference them through the ref attribute of the <attribute> element. You can also put <attribute> directly inside a <complexType> element to define a local attribute.
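A minimal sketch of a global attribute, a reference to it, and a local attribute. The names lang, priority, and note are hypothetical, and the schema has no target namespace to keep the example simple.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global attribute: declared under <schema>, reusable through ref -->
  <xs:attribute name="lang" type="xs:language"/>

  <xs:element name="note">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <!-- Reference to the global attribute; use is set on the reference -->
          <xs:attribute ref="lang" use="required"/>
          <!-- Local attribute with a default value, so it stays optional -->
          <xs:attribute name="priority" type="xs:string" default="normal"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A matching document would be <note lang="en">Call back</note>; the priority attribute then takes its default value, normal.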

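(3) Just as the <group> element defines an element group, the <attributeGroup> element defines an attribute group. The syntax format is as follows:

  <attributeGroup id=ID name=NCName ref=QName any attributes>
    (annotation?, ((attribute | attributeGroup)*, anyAttribute?))
  </attributeGroup>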

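(4) Attribute wildcard

Similar to the element wildcard, <anyAttribute> represents an attribute wildcard. The syntax format is as follows:

  <anyAttribute id=ID
                namespace=((##any | ##other) | list of (anyURI | ##targetNamespace | ##local))
                processContents=lax|skip|strict
                any attributes>
    (annotation?)
  </anyAttribute>

These attributes have the same meaning as those of the element wildcard <any>; <anyAttribute> has no minOccurs or maxOccurs.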
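Attribute groups and the attribute wildcard can be sketched together. The names audit_attributes and document are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- A named attribute group, defined once under <schema> -->
  <xs:attributeGroup name="audit_attributes">
    <xs:attribute name="created" type="xs:date"/>
    <xs:attribute name="modified" type="xs:date"/>
  </xs:attributeGroup>

  <xs:element name="document">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
      <!-- Reuse the group through ref -->
      <xs:attributeGroup ref="audit_attributes"/>
      <!-- Also accept any attribute from another namespace, without validating it -->
      <xs:anyAttribute namespace="##other" processContents="skip"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```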

4. Derive complex data types

Now that you know how to define elements and attributes, you can look further at how complex data types are defined. Defining a complex data type generally raises two questions. The first is the base type: which type is the new complex data type based on? The second is the derivation method: a complex data type can be derived by <restriction> or by <extension>.

(1) base type

  • anyType: similar to ANY in DTD, XSD has an anyType type. An element of this type has no restrictions: it can contain child elements, string content, and any attributes (a type restricted from anyType has to declare the attributes it keeps). anyType is the base type of all simple and complex types and is usually used to derive new types rather than to define elements directly.
  • Simple type
  • Complex type with simple content: the element content is a simple type value, but the element has attributes.
  • Empty element type: defines an element whose content is empty or an empty string, although the element can still accept attributes. There are two ways to define an empty element type:
    • String with a length of 0: if the element does not need attributes, define it directly with a string type restricted to a length of 0.
    • Restriction of anyType: when restricting anyType, define no child elements and only the attributes that are needed.
  • Type that contains child elements
  • Mixed content type
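The two ways of defining an empty element type can be sketched as follows. The names empty_string_type, separator, and image are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Method 1: a string restricted to a length of 0 (no attributes) -->
  <xs:simpleType name="empty_string_type">
    <xs:restriction base="xs:string">
      <xs:length value="0"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="separator" type="empty_string_type"/>

  <!-- Method 2: restrict anyType, defining no child elements, only attributes -->
  <xs:element name="image">
    <xs:complexType>
      <xs:complexContent>
        <xs:restriction base="xs:anyType">
          <xs:attribute name="src" type="xs:anyURI" use="required"/>
        </xs:restriction>
      </xs:complexContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Matching documents would be <separator/> and <image src="cover.png"/>.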

(2) Derivation

  • Restriction: <restriction>
  • Extension: <extension>

The following table summarizes the combinations of base type and derivation method:

| Base type | Derivation method | XSD elements used for the definition | Description |
| --- | --- | --- | --- |
| anyType | Restriction | <complexType> <complexContent> <restriction> | Because anyType can only be restricted and cannot be extended, you can omit the <complexContent> <restriction> elements and use <all>, <choice>, <sequence>, and other elements directly in <complexType>. |
| anyType | Extension | (none) | anyType has no restrictions, so it does not need to be extended. |
| Simple type | Restriction | <simpleType> | Restricting a simple type yields another simple type, so the <simpleType> element is used. |
| Simple type | Extension | <complexType> <simpleContent> <extension> | Add attributes or attribute groups to derive a complex data type. |
| Complex type with simple content | Restriction | <complexType> <simpleContent> <restriction> | Add further constraints to the element content; add further constraints to the attribute types of the element; delete certain attributes. |
| Complex type with simple content | Extension | <complexType> <simpleContent> <extension> | Add attributes. |
| Empty element type | Restriction | <complexType> <complexContent> <restriction> | Add further constraints to a specified attribute; delete an attribute. |
| Empty element type | Extension | <complexType> <complexContent> <extension> | Add attributes to the original type: the derived type is still an empty element type. Add child elements to the original type: the derived type becomes a type that contains child elements. Add mixed="true" to the original type: the derived type becomes a mixed content type. |
| Type that contains child elements | Restriction | <complexType> <complexContent> <restriction> | Add further constraints on the type of a specified attribute; add further constraints on the type of a specified child element; delete a specified attribute; delete a specified element. |
| Type that contains child elements | Extension | <complexType> <complexContent> <extension> | Add new child elements to the base type; add new attributes to the base type. |
| Mixed content type | Restriction | <complexType> <complexContent> <restriction> | Restricting a mixed content type works basically the same way as restricting a type that contains child elements. |
| Mixed content type | Extension | <complexType> <complexContent> <extension> | Extending a mixed content type works basically the same way as extending a type that contains child elements, but mixed="true" must be retained. |
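As a sketch of the two derivation methods applied to a type that contains child elements, the schema below extends and restricts a hypothetical person_type. All type, element, and attribute names are made up for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Base type: child elements plus two optional attributes -->
  <xs:complexType name="person_type">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="nickname" type="xs:string" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:string"/>
    <xs:attribute name="note" type="xs:string"/>
  </xs:complexType>

  <!-- Extension: appends a child element and adds an attribute -->
  <xs:complexType name="employee_type">
    <xs:complexContent>
      <xs:extension base="person_type">
        <xs:sequence>
          <xs:element name="department" type="xs:string"/>
        </xs:sequence>
        <xs:attribute name="grade" type="xs:int"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!-- Restriction: repeats the content model that is kept, drops the optional
       nickname, makes id required, and deletes the note attribute -->
  <xs:complexType name="strict_person_type">
    <xs:complexContent>
      <xs:restriction base="person_type">
        <xs:sequence>
          <xs:element name="name" type="xs:string"/>
        </xs:sequence>
        <xs:attribute name="id" type="xs:string" use="required"/>
        <xs:attribute name="note" use="prohibited"/>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```

Note the asymmetry: an extension lists only what it adds, while a restriction has to restate the parts of the content model that it keeps.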

Another use of derived types:

Assume that the element <book> is defined in XSD with the type book_type (which includes a name attribute), and that a data type extended_book_type is derived from book_type (adding a price attribute to book_type). In the actual XML document, you can then use the <book> element in either of two ways: with its declared type book_type, or with the derived type, by adding xsi:type="extended_book_type" to the element so that the price attribute is allowed.
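A minimal sketch of this scenario follows. The book_type and extended_book_type definitions follow the description above; the wrapper element <books> and the attribute values are assumptions added so that both uses fit in one document.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="book_type">
    <xs:attribute name="name" type="xs:string"/>
  </xs:complexType>

  <!-- Derived from book_type by extension: adds the price attribute -->
  <xs:complexType name="extended_book_type">
    <xs:complexContent>
      <xs:extension base="book_type">
        <xs:attribute name="price" type="xs:decimal"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!-- Hypothetical wrapper element holding any number of <book> elements -->
  <xs:element name="books">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="book" type="book_type" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

An XML document can then use <book> in both ways:

```xml
<books xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- Way 1: the declared type, book_type -->
  <book name="XML basics"/>
  <!-- Way 2: the derived type, selected with xsi:type -->
  <book xsi:type="extended_book_type" name="XSD in depth" price="39.90"/>
</books>
```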

 

5. Consistency Constraints

When defining an element, you can also specify three kinds of consistency constraints (called identity constraints in the XSD specification):

  • key constraint (<key>): equivalent to the primary key constraint in a database. The specified content must exist and be unique.
  • keyref constraint (<keyref>): equivalent to the foreign key constraint in a database. Its refer attribute must reference a key constraint or unique constraint, and the value of the specified content must match a value covered by that constraint.
  • unique constraint (<unique>): equivalent to the unique constraint in a database. The specified content must be unique, but it does not have to exist.

These three consistency constraints can only be defined inside an <element> element, and only at the end of the <element> element.

When defining a constraint in a database, you must specify not only the type of constraint but also the fields it applies to. Consistency constraints in XSD are similar: you also need to specify which part of the document the constraint applies to. For this, a constraint uses the following two child elements:

  • <selector>: requires an xpath attribute whose value is an XPath expression that determines a range of elements. In a constraint definition, <selector> must appear exactly once.
  • <field>: requires an xpath attribute whose value is an XPath expression. In a constraint definition, <field> must appear one or more times.

Together, these two elements mean the following: within the range selected by the XPath expression of the <selector> element, the content selected by the XPath expression of the <field> element must satisfy the consistency constraint. If there are multiple <field> elements, the combination of the values they select must satisfy the constraint, which is equivalent to a multi-column constraint in a database.

Let's take a look at the example:
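As a minimal sketch, the schema below declares all three constraints on a hypothetical library element. The element, attribute, and constraint names are made up for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="library">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="author" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="id" type="xs:string" use="required"/>
            <xs:attribute name="email" type="xs:string"/>
          </xs:complexType>
        </xs:element>
        <xs:element name="book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="title" type="xs:string" use="required"/>
            <xs:attribute name="author_id" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>

    <!-- Constraints come last inside <element> -->
    <!-- key: every author has an id, and no two authors share one -->
    <xs:key name="author_key">
      <xs:selector xpath="author"/>
      <xs:field xpath="@id"/>
    </xs:key>
    <!-- unique: email may be absent, but present values must differ -->
    <xs:unique name="author_email_unique">
      <xs:selector xpath="author"/>
      <xs:field xpath="@email"/>
    </xs:unique>
    <!-- keyref: each book's author_id must match an existing author id -->
    <xs:keyref name="book_author_ref" refer="author_key">
      <xs:selector xpath="book"/>
      <xs:field xpath="@author_id"/>
    </xs:keyref>
  </xs:element>
</xs:schema>
```

The XPath expressions are evaluated relative to the element that carries the constraint, here <library>.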

                                            

6. Define notations

Finally, let's look at the XSD counterpart of the notation declaration in DTD. In XSD, the <notation> element defines a notation, which identifies the format of external data in an XML document. The attributes that this element accepts include:

  • id: specifies the unique identifier of the notation.
  • name: specifies the name of the notation. It is a required attribute and must be unique throughout the XSD.
  • public: specifies the external format, or the corresponding processing program, of the data identified by the notation. Required attribute, equivalent to PUBLIC in a DTD <!NOTATION> declaration.
  • system: specifies the external format, or the corresponding processing program, of the data identified by the notation. Optional attribute, equivalent to SYSTEM in a DTD <!NOTATION> declaration.
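A minimal sketch of notation declarations and an attribute that refers to them. The notation names, the viewer.exe system identifier, and the image element are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Two notations: public is required, system is optional -->
  <xs:notation name="gif" public="image/gif" system="viewer.exe"/>
  <xs:notation name="jpeg" public="image/jpeg"/>

  <xs:element name="image">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:base64Binary">
          <!-- The attribute value must be the name of a declared notation -->
          <xs:attribute name="format">
            <xs:simpleType>
              <xs:restriction base="xs:NOTATION">
                <xs:enumeration value="gif"/>
                <xs:enumeration value="jpeg"/>
              </xs:restriction>
            </xs:simpleType>
          </xs:attribute>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The NOTATION type is not used directly; it is restricted to an enumeration of declared notation names, as shown here.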

 
