XSD provides built-in data types and also supports custom data types. All custom types, however, are derived from the built-in types through a small set of derivation rules. This article takes a look at the data types in XSD.
1. XSD data type diagram
Let's start with a diagram of the data types. It gives only a rough outline, which later sections refine:
From the XSD data type diagram above, we can see that there are two main categories:
(1) Simple types: they can be used for attributes or for elements. In addition to the built-in types, you can use <simpleType> to define custom simple types. There are three ways to do so: restriction <restriction>, list <list>, and union <union>.
(2) Complex types: they can be used only for elements and are defined with <complexType>. Based on their content, they are further divided into complex types with simple content and complex types with complex content, which use <simpleContent> and <complexContent> respectively to define their content. In addition, you can use restriction <restriction> and extension <extension> to derive new types.
That complex types can be used only for elements is probably the reason for distinguishing between simple and complex types.
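The two categories can be sketched as follows. This is only an illustration: the xs: prefix and the names currencyType, priceType, itemType, and item are assumptions, not taken from the diagram.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Simple type derived by restriction; usable for attributes and elements -->
  <xs:simpleType name="currencyType">
    <xs:restriction base="xs:string">
      <xs:length value="3"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Complex type with simple content: a text value plus an attribute -->
  <xs:complexType name="priceType">
    <xs:simpleContent>
      <xs:extension base="xs:decimal">
        <xs:attribute name="currency" type="currencyType"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <!-- Complex type with complex content: child elements
       (this short form stands for a complexContent restriction of xs:anyType) -->
  <xs:complexType name="itemType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="price" type="priceType"/>
    </xs:sequence>
  </xs:complexType>

  <xs:element name="item" type="itemType"/>
</xs:schema>
```

Here currencyType types an attribute, while priceType and itemType can only type elements.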
2. Built-in types
Built-in types are fundamental: every other type is derived from them. First, here is the inheritance diagram of the built-in types, taken from the official specification:
(1) String and name types
Type | Description
QName | A qualified XML name with an optional namespace prefix. The prefix can be omitted, and the value cannot start or end with a colon.
string | A character string, which may include line feeds, carriage returns, and tabs. All characters are kept unchanged.
normalizedString | Line feeds, carriage returns, and tabs in the string are replaced with spaces.
token | Line feeds, carriage returns, and tabs are replaced with spaces, leading and trailing spaces are removed, and consecutive spaces in the middle are compressed into one space.
language | A legal language code, such as en-GB, en-US, fr, or zh-CN.
Name | A valid XML name made up of letters, digits, underscores, hyphens, colons, and periods. It cannot start with a digit, hyphen, or period.
NCName | A valid XML name without a namespace prefix; it cannot contain colons.
ID | Same as in DTD. The value must be unique within the XML document. For DTD compatibility it should be used only for attributes, not for elements.
IDREF | Same as in DTD. It must reference an existing ID value. It should be used only for attributes, not for elements.
IDREFS | Same as in DTD. It must reference one or more existing ID values, separated by spaces. It should be used only for attributes, not for elements.
ENTITY | Same as in DTD: the name of an unparsed external entity. It should be used only for attributes, not for elements.
ENTITIES | Same as in DTD. One or more external entities, separated by spaces. It should be used only for attributes, not for elements.
NMTOKEN | Same as in DTD. A name token that can contain only letters, digits, underscores, hyphens, periods, and colons, and may start with any of them.
NMTOKENS | Same as in DTD. One or more NMTOKEN values, separated by spaces. It should be used only for attributes, not for elements.
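A short sketch of how some of these types might be used. The element and attribute names (note, raw, line, word, id, replyTo) and the xs: prefix are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <!-- xs:string: whitespace is kept exactly as written -->
        <xs:element name="raw" type="xs:string"/>
        <!-- xs:normalizedString: each line feed, carriage return, and tab becomes a space -->
        <xs:element name="line" type="xs:normalizedString"/>
        <!-- xs:token: as above, then trimmed, with inner runs of spaces collapsed -->
        <xs:element name="word" type="xs:token"/>
      </xs:sequence>
      <!-- xs:ID must be unique in the document; xs:IDREF must match some ID -->
      <xs:attribute name="id" type="xs:ID"/>
      <xs:attribute name="replyTo" type="xs:IDREF"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```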
(2) Numeric types
The float and double types have special values: -INF (negative infinity), INF (positive infinity), NaN (not a number), +0 (positive zero), and -0 (negative zero). Among them, positive zero is greater than negative zero, NaN is greater than all values (including INF), and INF is greater than all other floating-point numbers.
Type | Description
float | 32-bit single-precision floating-point number. Scientific notation can be used. When the integer part is 0, the decimal point cannot be omitted, and the suffix f/F cannot be used.
double | 64-bit double-precision floating-point number. Scientific notation can be used. When the integer part is 0, the decimal point cannot be omitted.
decimal | Exact decimal number. Scientific notation is not supported, and the special values -INF, INF, and NaN are not accepted.
integer | Integer of arbitrary size
nonPositiveInteger | Non-positive integer
negativeInteger | Negative integer
long | 64-bit signed integer
int | 32-bit signed integer
short | 16-bit signed integer
byte | 8-bit signed integer
nonNegativeInteger | Non-negative integer
positiveInteger | Positive integer
unsignedLong | 64-bit unsigned integer
unsignedInt | 32-bit unsigned integer
unsignedShort | 16-bit unsigned integer
unsignedByte | 8-bit unsigned integer
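To make the table concrete, here is a minimal sketch that declares a few numeric elements. The element names and the xs: prefix are assumptions; the comments list values each type accepts or rejects.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="reading">
    <xs:complexType>
      <xs:sequence>
        <!-- xs:float accepts values such as 1.5E3, -0.25, INF, -INF, NaN -->
        <xs:element name="temperature" type="xs:float"/>
        <!-- xs:decimal accepts 12.50 but rejects 1.25E1 and NaN -->
        <xs:element name="price" type="xs:decimal"/>
        <!-- xs:unsignedByte accepts 0 through 255 -->
        <xs:element name="channel" type="xs:unsignedByte"/>
        <!-- xs:positiveInteger accepts 1, 2, 3 and so on, with no upper bound -->
        <xs:element name="count" type="xs:positiveInteger"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```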
(3) Boolean type
Valid boolean values are true, false, 1 (true), and 0 (false).
(4) Date and time types
Type | Format | Description
date | YYYY-MM-DD | Date
time | hh:mm:ss.sss | Time. sss is the number of milliseconds.
dateTime | YYYY-MM-DDThh:mm:ss.sss | Date and time. The T in the middle is required and separates the date from the time.
gYear | YYYY | Year
gYearMonth | YYYY-MM | Year and month
gMonth | --MM | Month. The two leading hyphens are required.
gMonthDay | --MM-DD | Month and day. The two leading hyphens are required.
gDay | ---DD | Day of the month. The three leading hyphens are required.
duration | PnYnMnDTnHnMnS | A time interval. The leading P is required and stands for period; T separates the date part from the time part. The n before S can have a fractional part, and the others must be integers.
Note: The first eight types listed above can be followed by Z to indicate UTC. Y, M, D, h, m, and s stand for year, month, day, hour, minute, and second, and each is replaced with a valid integer. A year with fewer than 4 digits is padded with zeros on the left, and a leading minus sign denotes a year BC. Month, day, hour, minute, and second values with fewer than 2 digits are padded with zeros on the left. sss is the fractional part of the second, for example 1 to 3 digits for milliseconds.
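The formats above look like this in an instance document. This is a hypothetical fragment: the element names are made up, and each comment names the type the value would be validated against.

```xml
<!-- Hypothetical instance fragment; element names are illustrative only -->
<event>
  <day>2024-03-09</day>                    <!-- xs:date -->
  <start>08:05:00.250</start>              <!-- xs:time with milliseconds -->
  <created>2024-03-09T08:05:00Z</created>  <!-- xs:dateTime in UTC -->
  <founded>0987</founded>                  <!-- xs:gYear, padded to 4 digits -->
  <billing>2024-03</billing>               <!-- xs:gYearMonth -->
  <month>--03</month>                      <!-- xs:gMonth -->
  <birthday>--03-09</birthday>             <!-- xs:gMonthDay -->
  <payday>---09</payday>                   <!-- xs:gDay -->
  <length>P1Y2M3DT4H5M6.5S</length>        <!-- xs:duration -->
</event>
```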
(5) Binary data types
XSD has the following binary data types:
A. hexBinary: binary data stored in hexadecimal format. It can therefore contain only the characters 0-9, a-f, and A-F, and the number of characters must be even.
B. base64Binary: arbitrary binary data stored in Base64 encoding. It can therefore contain only the characters 0-9, a-z, A-Z, the plus sign +, the slash /, and = for padding. The number of characters must be a multiple of 4.
(6) anyURI type: a valid URI.
(7) NOTATION type: same as in DTD; it refers to a declared notation.
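As a small illustration, the following hypothetical instance fragment stores the same three bytes (the ASCII text "Hi!") in both encodings. The element names are assumptions.

```xml
<!-- Hypothetical instance fragment; both elements encode the bytes 0x48 0x69 0x21 -->
<payload>
  <hex>486921</hex>  <!-- xs:hexBinary: 6 characters, an even length -->
  <b64>SGkh</b64>    <!-- xs:base64Binary: 4 characters, a multiple of 4 -->
</payload>
```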
3. Customize simple data types
(1) The <simpleType> element is used to customize simple data types. It takes the optional attributes id, name, and final, and also accepts any attributes from non-schema namespaces (any-attributes).
(2) Global data types are defined directly under the <schema> element; local data types are defined inside other elements.
(3) The final attribute restricts the derivation of new types. Its default value is the finalDefault attribute value of the root element <schema>. The possible values are:
A. #all: prevents this type from being used to derive a new type in any form.
B. Any combination of restriction, list, and union: prevents the listed methods from being used to derive a new type.
C. "" (empty string): no restrictions.
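The points above can be sketched as follows. The type names sealedCode and scalarCode, the element name code, and the xs: prefix are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global type; final="#all" forbids any further derivation -->
  <xs:simpleType name="sealedCode" final="#all">
    <xs:restriction base="xs:token"/>
  </xs:simpleType>

  <!-- final="list union": may still be restricted, but not used in a list or union -->
  <xs:simpleType name="scalarCode" final="list union">
    <xs:restriction base="xs:token"/>
  </xs:simpleType>

  <!-- Local (anonymous) type defined inside an element; it has no name and no final -->
  <xs:element name="code">
    <xs:simpleType>
      <xs:restriction base="scalarCode">
        <xs:maxLength value="8"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```

Restricting scalarCode is allowed here because only list and union derivation are blocked; the same restriction with base="sealedCode" would be rejected.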
4. Customize data types by restriction
As described above, <simpleType> is used to customize simple types, and three methods are available: restriction <restriction>, list <list>, and union <union>. This section covers restriction <restriction> first.
(1) Syntax and attributes:
The optional id attribute uniquely identifies the <restriction> element. The base attribute gives the name of a built-in data type, or of a simpleType or complexType defined in this schema (or in another schema identified by a namespace). Alternatively, you can omit the base attribute and define the restricted base type inline with a <simpleType> child element.
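Both ways of naming the base type can be sketched like this. The type names percent and evenPercent and the xs: prefix are assumptions for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Form 1: name the base type with the base attribute -->
  <xs:simpleType name="percent">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Form 2: omit base and define the base type inline -->
  <xs:simpleType name="evenPercent">
    <xs:restriction>
      <xs:simpleType>
        <xs:restriction base="percent"/>
      </xs:simpleType>
      <!-- Only digit strings ending in an even digit are accepted -->
      <xs:pattern value="[0-9]*[02468]"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```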
(2) Constraints: in XSD, restriction works by adding constraints (facets) to the base type. Which constraints exist, and which base types can each one apply to?
Category | Available data types | Constraint | Description
Enumeration constraints | | enumeration | List of acceptable values
Precision constraints | decimal | fractionDigits | Maximum number of decimal places allowed
Precision constraints | decimal | totalDigits | Maximum number of digits allowed (excluding the decimal point)
Length constraints | string, QName, anyURI, binary data, and list types (number of items) | length | Exact length in characters, or number of items in the list
Length constraints | string, QName, anyURI, binary data, and list types (number of items) | maxLength | Maximum length in characters, or maximum number of items in the list
Length constraints | string, QName, anyURI, binary data, and list types (number of items) | minLength | Minimum length in characters, or minimum number of items in the list
Range constraints | Types whose values can be ordered, such as numbers and dates | maxExclusive | Upper limit of the allowed value; the value cannot equal the limit
Range constraints | Types whose values can be ordered, such as numbers and dates | minExclusive | Lower limit of the allowed value; the value cannot equal the limit
Range constraints | Types whose values can be ordered, such as numbers and dates | maxInclusive | Upper limit of the allowed value; the value can equal the limit
Range constraints | Types whose values can be ordered, such as numbers and dates | minInclusive | Lower limit of the allowed value; the value can equal the limit
Regular expression constraints | Various data types | pattern | Regular expression that the value must match
Whitespace processing constraints | | whiteSpace | How whitespace characters (line feeds, carriage returns, spaces, tabs) are handled. preserve: keep all whitespace as is. replace: replace every whitespace character with a space. collapse: replace first, then remove leading and trailing spaces and compress consecutive spaces in the middle into one.
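A few of these constraints in use, as a minimal sketch. The type names size, amount, sku, and dayIn2024 and the xs: prefix are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Enumeration: only the listed values are accepted -->
  <xs:simpleType name="size">
    <xs:restriction base="xs:token">
      <xs:enumeration value="small"/>
      <xs:enumeration value="medium"/>
      <xs:enumeration value="large"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Precision: at most 8 digits in total, at most 2 after the decimal point -->
  <xs:simpleType name="amount">
    <xs:restriction base="xs:decimal">
      <xs:totalDigits value="8"/>
      <xs:fractionDigits value="2"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Length plus pattern: 2 to 10 characters, uppercase letters and digits only -->
  <xs:simpleType name="sku">
    <xs:restriction base="xs:string">
      <xs:minLength value="2"/>
      <xs:maxLength value="10"/>
      <xs:pattern value="[A-Z0-9]+"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Range on a date type: any day in 2024 -->
  <xs:simpleType name="dayIn2024">
    <xs:restriction base="xs:date">
      <xs:minInclusive value="2024-01-01"/>
      <xs:maxExclusive value="2025-01-01"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```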
When a new type is derived from a type that already has a constraint, the derived type can specify the same constraint again as long as the new value stays within the range of the original one; the derived type's constraint then overrides the original. To prevent derived types from overriding an existing constraint, add the attribute fixed="true" to that constraint in the original type (fixed takes the values true and false).
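The overriding rule and the fixed attribute can be illustrated with a small sketch. The type names username and shortUsername and the xs: prefix are assumptions.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Base type: maxLength may be tightened later, minLength is locked -->
  <xs:simpleType name="username">
    <xs:restriction base="xs:string">
      <xs:minLength value="3" fixed="true"/>
      <xs:maxLength value="32"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Allowed: 16 lies within the base limit of 32 and overrides it -->
  <xs:simpleType name="shortUsername">
    <xs:restriction base="username">
      <xs:maxLength value="16"/>
      <!-- Adding minLength with value 5 here would be rejected, because the base facet is fixed -->
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```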
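5. Customize data types through list derivation
(1) Syntax of the <list> element: <list> appears as a child of <simpleType> and names or defines the type of the list items.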
(2) Ways to specify the type of the list items:
A. Use the itemType attribute.
B. Use a <simpleType> child element. In this case the itemType attribute cannot be specified.
(3) The built-in IDREFS, ENTITIES, and NMTOKENS types are in fact list types of IDREF, ENTITY, and NMTOKEN respectively.
(4) The items of a list value are separated by spaces.
(5) A list type accepts length constraints (which constrain the number of items), enumeration constraints, regular expression constraints, and whitespace processing constraints (whose value can only be collapse). Note that enumeration and regular expression constraints apply to the value of the entire list, not to an individual item.
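The list rules above can be sketched as follows. The type names intList, sizeList, and rgb and the xs: prefix are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Method A: item type named with itemType; a value looks like "3 14 15" -->
  <xs:simpleType name="intList">
    <xs:list itemType="xs:int"/>
  </xs:simpleType>

  <!-- Method B: item type defined inline, so itemType is omitted -->
  <xs:simpleType name="sizeList">
    <xs:list>
      <xs:simpleType>
        <xs:restriction base="xs:token">
          <xs:enumeration value="S"/>
          <xs:enumeration value="M"/>
          <xs:enumeration value="L"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:list>
  </xs:simpleType>

  <!-- A length constraint on a list counts items: exactly three integers -->
  <xs:simpleType name="rgb">
    <xs:restriction base="intList">
      <xs:length value="3"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

With these definitions, "255 128 0" would be a valid rgb value, while "255 128" would not.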
6. Customize data types through union derivation
(1) Syntax of the <union> element: <union> appears as a child of <simpleType> and names or defines its member types.
(2) Ways to specify the member types of the union:
A. Use the memberTypes attribute. Multiple types are separated by spaces.
B. Use one or more <simpleType> child elements.
(3) A union type accepts enumeration constraints and regular expression constraints. As with lists, the constraint applies to the value of the entire union type.
(4) List types and union types are both derived from existing types. In XSD you can use the <list> element to derive a list type from a union type, and you can use the <union> element to derive a new union type from one or more existing list types. Note, however, that a list type cannot be used as the item type of a new list type, and neither can a union type that contains a list type.
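A minimal sketch of the union rules. The type names intOrDate, quota, and quotaList and the xs: prefix are assumptions.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Method A: member types named with memberTypes, separated by spaces -->
  <xs:simpleType name="intOrDate">
    <xs:union memberTypes="xs:int xs:date"/>
  </xs:simpleType>

  <!-- Method B: member types defined inline; accepts 1 to 10 or the word "unlimited" -->
  <xs:simpleType name="quota">
    <xs:union>
      <xs:simpleType>
        <xs:restriction base="xs:int">
          <xs:minInclusive value="1"/>
          <xs:maxInclusive value="10"/>
        </xs:restriction>
      </xs:simpleType>
      <xs:simpleType>
        <xs:restriction base="xs:token">
          <xs:enumeration value="unlimited"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:union>
  </xs:simpleType>

  <!-- A list may take a union as its item type, provided the union has no list member -->
  <xs:simpleType name="quotaList">
    <xs:list itemType="quota"/>
  </xs:simpleType>
</xs:schema>
```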
7. Finally, let's look at an example that uses all three methods to customize simple types:
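One possible example, offered only as a sketch: the names score, result, resultList, and results and the xs: prefix are made up for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Restriction: a score from 0 to 100 -->
  <xs:simpleType name="score">
    <xs:restriction base="xs:int">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Union: either a score or the marker "absent" -->
  <xs:simpleType name="result">
    <xs:union memberTypes="score">
      <xs:simpleType>
        <xs:restriction base="xs:token">
          <xs:enumeration value="absent"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:union>
  </xs:simpleType>

  <!-- List: space-separated results, for example "87 absent 100" -->
  <xs:simpleType name="resultList">
    <xs:list itemType="result"/>
  </xs:simpleType>

  <xs:element name="results" type="resultList"/>
</xs:schema>
```

Under this schema, an element such as <results>87 absent 100</results> would validate, while <results>87 late</results> would not, because "late" matches neither member of the union.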