Contents
How to read the HTML DTD
1. DTD Comments
2. Parameter Entity Definitions
3. Element declarations
. Content Model Definitions
4. Attribute declarations
. DTD entities in attribute definitions
. Boolean attributes
How to read the HTML DTD
Each element and attribute declaration in this specification is accompanied by its document type definition fragment. We have chosen to include the DTD fragments in the specification rather than seek a more approachable, but longer and less precise means of describing an element's properties. The following tutorial should allow readers unfamiliar with SGML to read the DTD and understand the technical details of the HTML specification.
1. DTD Comments
In DTDs, comments may spread over one or more lines. In the DTD, comments are delimited by a pair of "--" marks, e.g.
<!ELEMENT PARAM - O EMPTY       -- named property value -->
Here, the comment "named property value" explains the use of the PARAM element type. Comments in the DTD are informative only.
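Because comments are informative only, a tool that reads the DTD can discard them. The following Python fragment is a minimal sketch of that idea, not part of the DTD. The helper name strip_dtd_comments is hypothetical, and the sketch assumes that "--" never occurs inside a quoted string in the declaration.

```python
import re

# A comment is the shortest run of text between two "--" marks.
COMMENT = re.compile(r"--.*?--", re.DOTALL)

def strip_dtd_comments(declaration: str) -> str:
    """Return the declaration with every "--"-delimited comment removed.

    Hypothetical helper for illustration; assumes "--" does not appear
    inside a quoted literal.
    """
    without_comments = COMMENT.sub("", declaration)
    # Collapse the whitespace that the removed comment leaves behind.
    collapsed = re.sub(r"\s+", " ", without_comments).strip()
    return collapsed.replace(" >", ">")

declaration = "<!ELEMENT PARAM - O EMPTY       -- named property value -->"
print(strip_dtd_comments(declaration))  # <!ELEMENT PARAM - O EMPTY>
```

Note that the single hyphens in "- O" are untouched, since only a pair of adjacent hyphens delimits a comment.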
2. Parameter Entity Definitions
The HTML DTD begins with a series of parameter entity definitions. A parameter entity definition defines a kind of macro that may be referenced and expanded elsewhere in the DTD. These macros may not appear in HTML documents, only in the DTD. Other types of macros, called character references, may be used in the text of an HTML document or within attribute values.
When the parameter entity is referred to by name in the DTD, it is expanded into a string.
A parameter entity definition begins with the keyword <!ENTITY % followed by the entity name, the quoted string the entity expands to, and finally a closing >. Instances of parameter entities in a DTD begin with "%", then the parameter entity name, and are terminated by an optional ";".
The following example defines the string that the "%fontstyle;" entity will expand to.
<!ENTITY % fontstyle "TT | I | B | BIG | SMALL">
The string the parameter entity expands to may contain other parameter entity names. These names are expanded recursively. In the following example, the "%inline;" parameter entity is defined to include the "%fontstyle;", "%phrase;", "%special;" and "%formctrl;" parameter entities.
<!ENTITY % inline "#PCDATA | %fontstyle; | %phrase; | %special; | %formctrl;">
You will encounter two DTD entities frequently in the HTML DTD: "%block;" and "%inline;". They are used when the content model includes block-level and inline elements, respectively (defined in the section on the global structure of an HTML document).
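Recursive expansion can be illustrated with a short Python sketch. Only the "fontstyle" string and the shape of "inline" are taken from the examples above; the expansions given for "phrase", "special" and "formctrl" are shortened placeholders assumed for illustration, not the definitions in the HTML DTD. The function name expand is likewise hypothetical, and the sketch does not guard against entities that refer to themselves.

```python
import re

# Abbreviated entity table. "phrase", "special" and "formctrl" are
# placeholder expansions assumed for this example only.
ENTITIES = {
    "fontstyle": "TT | I | B | BIG | SMALL",
    "phrase": "EM | STRONG",
    "special": "A | IMG",
    "formctrl": "INPUT | SELECT",
    "inline": "#PCDATA | %fontstyle; | %phrase; | %special; | %formctrl;",
}

# "%", then the entity name, then an optional ";".
REFERENCE = re.compile(r"%([A-Za-z][A-Za-z0-9.-]*);?")

def expand(text: str, entities: dict[str, str]) -> str:
    """Replace each parameter entity reference by its expansion, recursively."""
    def substitute(match: re.Match) -> str:
        # Expand the replacement string too, since it may contain references.
        return expand(entities[match.group(1)], entities)
    return REFERENCE.sub(substitute, text)

print(expand("%inline;", ENTITIES))
# #PCDATA | TT | I | B | BIG | SMALL | EM | STRONG | A | IMG | INPUT | SELECT
```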
3. Element declarations
The bulk of the HTML DTD consists of the declarations of element types and their attributes. The <!ELEMENT keyword begins a declaration and the > character ends it. Between these are specified:
(1). The element's name.
(2). Whether the element's tags are optional. Two hyphens that appear after the element name mean that the start and end tags are mandatory. One hyphen followed by the letter "O" indicates that the end tag can be omitted. A pair of letter "O"s indicates that both the start and end tags can be omitted.
(3). The element's content, if any. The allowed content for an element is called its content model. Element types that are designed to have no content are called empty elements. The content model for such element types is declared using the keyword "EMPTY".
In this example:
<!ELEMENT UL - - (LI)+>
. The element type being declared is UL.
. The two hyphens indicate that both the start tag <UL> and the end tag </UL> for this element type are required.
. The content model for this element type is declared to be "at least one LI element". Below, we explain how to specify content models.
This example illustrates the declaration of an empty element type:
<!ELEMENT IMG - O EMPTY>
. The element type being declared is IMG.
. The hyphen and the following "O" indicate that the end tag can be omitted, but together with the content model "EMPTY", this is strengthened to the rule that the end tag must be omitted.
. The "EMPTY" keyword means that instances of this type must not have content.
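The three parts of an element declaration can be pulled apart mechanically. The Python sketch below is illustrative only: the names ElementDecl and parse_element are hypothetical, and the pattern assumes the simple one-line form used in the two examples above, with comments already removed and no parameter entity in the name position.

```python
import re
from typing import NamedTuple

class ElementDecl(NamedTuple):
    name: str
    start_tag_optional: bool
    end_tag_optional: bool
    content_model: str

# <!ELEMENT name start-flag end-flag content-model>
# Each flag is "-" (tag required) or "O" (tag may be omitted).
ELEMENT = re.compile(
    r"<!ELEMENT\s+(\S+)\s+([-O])\s+([-O])\s+(.+?)\s*>",
    re.IGNORECASE | re.DOTALL,
)

def parse_element(declaration: str) -> ElementDecl:
    """Split a simple element declaration into its parts (sketch only)."""
    match = ELEMENT.fullmatch(declaration.strip())
    if match is None:
        raise ValueError(f"not a simple element declaration: {declaration!r}")
    name, start_flag, end_flag, model = match.groups()
    return ElementDecl(
        name.upper(),
        start_flag.upper() == "O",
        end_flag.upper() == "O",
        model,
    )

print(parse_element("<!ELEMENT UL - - (LI)+>"))
print(parse_element("<!ELEMENT IMG - O EMPTY>"))
```

For IMG the sketch reports only that the end tag may be omitted; the stronger rule that it must be omitted follows from the "EMPTY" content model and is not computed here.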
Content Model Definitions
The content model describes what may be contained by an instance of an element type. Content model definitions may include:
. The names of allowed or forbidden element types (e.g., the UL element contains instances of the LI element type, and the P element type may not contain other P elements).
. DTD entities (e.g., the LABEL element contains instances of the "%inline;" parameter entity).
. Document text (indicated by the SGML construct "#PCDATA"). Text may contain character references. Recall that these begin with & and end with a semicolon (e.g., "Herg&eacute;'s adventures of Tintin" contains the character entity reference for the "e acute" character).
The content model of an element is specified with the following syntax. Please note that the list below is a simplification of the full SGML syntax rules and does not address, e.g., precedences.
( ... )
Delimits a group.
A
A must occur, one time only.
A+
A must occur one or more times.
A?
A must occur zero or one time.
A*
A may occur zero or more times.
+(A)
A may occur.
-(A)
A must not occur.
A | B
Either A or B must occur, but not both.
A, B
Both A and B must occur, in that order.
A & B
Both A and B must occur, in any order.
Here are some examples from the HTML DTD:
<!ELEMENT UL - - (LI)+>
The UL element must contain one or more LI elements.
<!ELEMENT DL - - (DT|DD)+>
The DL element must contain one or more DT or DD elements in any order.
<!ELEMENT OPTION - O (#PCDATA)>
The OPTION element may only contain text and entities, such as &amp;. This is indicated by the SGML data type #PCDATA.
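For the simple operators, a content model behaves much like a regular expression over the sequence of child element names. The Python sketch below relies on that analogy; it is an approximation for illustration, not how an SGML parser works. The function names are hypothetical. The sketch handles groups, ",", "|", "+", "?" and "*", rejects "&", and ignores inclusions, exclusions, the "EMPTY" keyword, and parameter entity references (which would have to be expanded first).

```python
import re

NAME = re.compile(r"#?[A-Za-z][A-Za-z0-9]*")

def model_to_regex(model: str) -> re.Pattern:
    """Translate a simple content model into a regex over child names.

    Children are encoded as their names, each followed by one space,
    so the sequence LI, LI becomes the string "LI LI ".
    """
    if "&" in model:
        raise ValueError("the '&' connector is not handled in this sketch")
    pattern = re.sub(r"\s+", "", model)  # drop layout whitespace
    # Wrap every name in a non-capturing group that ends with the separator.
    pattern = NAME.sub(lambda m: "(?:" + re.escape(m.group(0)) + " )", pattern)
    # "A , B" means A followed by B, which in a regex is plain concatenation.
    pattern = pattern.replace(",", "")
    return re.compile(pattern)

def children_match(model: str, children: list[str]) -> bool:
    """Report whether the sequence of child names satisfies the model."""
    sequence = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(sequence) is not None

print(children_match("(LI)+", ["LI", "LI"]))            # True
print(children_match("(LI)+", []))                      # False
print(children_match("(DT|DD)+", ["DT", "DD", "DD"]))   # True
```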
A few HTML element types use an additional SGML feature to exclude elements from their content model. Excluded elements are preceded by a hyphen. Explicit exclusions override permitted elements.
In this example, the -(A) signifies that the element A cannot appear in another A element (i.e., anchors may not be nested).
<!ELEMENT A - - (%inline;)* -(A)>
Note that the A element type is part of the DTD parameter entity "%inline;", but is excluded explicitly because of -(A).
Similarly, the following element type declaration for FORM prohibits nested forms:
<!ELEMENT FORM - - (%block;|SCRIPT)+ -(FORM)>
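The effect of the two exclusions above can be sketched as a check against the list of currently open elements. The Python fragment below is illustrative: the table EXCLUSIONS and the function is_excluded are hypothetical names, and the sketch assumes that an exclusion applies at any nesting depth inside the excluding element, not only to its direct children.

```python
# Exclusions taken from the two declarations above: -(A) and -(FORM).
EXCLUSIONS = {
    "A": {"A"},
    "FORM": {"FORM"},
}

def is_excluded(element: str, open_elements: list[str]) -> bool:
    """Return True if any open ancestor excludes the given element.

    open_elements lists the ancestors from the outermost inwards.
    """
    return any(
        element in EXCLUSIONS.get(ancestor, ())
        for ancestor in open_elements
    )

print(is_excluded("A", ["BODY", "P", "A", "EM"]))  # True: nested anchor
print(is_excluded("A", ["BODY", "P"]))             # False
```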
4. Attribute declarations
The <!ATTLIST keyword begins the declaration of the attributes that an element may take. It is followed by the name of the element in question, a list of attribute definitions, and a closing >. Each attribute definition is a triplet that defines:
. The name of an attribute.
. The type of the attribute's value or an explicit set of possible values. Values defined explicitly by the DTD are case-insensitive. Please consult the section on basic HTML data types for more information about attribute value types.
. Whether the default value of the attribute is implicit (keyword "#IMPLIED"), in which case the default value must be supplied by the user agent (in some cases via inheritance from parent elements); always required (keyword "#REQUIRED"); or fixed to the given value (keyword "#FIXED"). Some attribute definitions explicitly specify a default value for the attribute.
In this example, the name attribute is defined for the MAP element. The attribute is optional for this element.
<!ATTLIST MAP
  name        CDATA     #IMPLIED
>
The type of values permitted for the attribute is given as CDATA, an SGML data type. CDATA is text that may contain character references.
For more information about "CDATA", "NAME", "ID", and other data types, please consult the sections on HTML data types.
The following examples illustrate several attribute definitions:
rowspan     NUMBER     1         -- number of rows spanned by cell --
http-equiv  NAME       #IMPLIED  -- HTTP response header name --
id          ID         #IMPLIED  -- document-wide unique id --
valign      (top|middle|bottom|baseline) #IMPLIED
The rowspan attribute requires values of type NUMBER. The default value is given explicitly as "1". The optional http-equiv attribute requires values of type NAME. The optional id attribute requires values of type ID. The optional valign attribute is constrained to take values from the set {top, middle, bottom, baseline}.
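The triplet structure of an attribute definition can be made explicit with a small Python sketch. The names AttributeDef and parse_attribute are hypothetical, and the pattern assumes one definition per line with any parameter entity references already expanded.

```python
import re
from typing import NamedTuple

class AttributeDef(NamedTuple):
    name: str
    value_type: str   # a type keyword, or a parenthesized set of values
    default: str      # "#IMPLIED", "#REQUIRED", "#FIXED value", or a literal

DEFINITION = re.compile(
    r"(\S+)\s+"              # attribute name
    r"(\([^)]*\)|\S+)\s+"    # enumerated value set, or a type keyword
    r"(#FIXED\s+\S+|\S+)"    # default declaration
    r"(?:\s+--.*--)?"        # optional trailing comment
)

def parse_attribute(line: str) -> AttributeDef:
    """Split one attribute definition into (name, type, default)."""
    match = DEFINITION.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a simple attribute definition: {line!r}")
    return AttributeDef(*match.groups())

print(parse_attribute("rowspan     NUMBER     1         -- number of rows spanned by cell --"))
print(parse_attribute("valign      (top|middle|bottom|baseline) #IMPLIED"))
```

An attribute is optional in the sense used above when its default field is "#IMPLIED".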
DTD entities in attribute definitions
Attribute definitions may also contain parameter entity references.
In this example, we see that the attribute definition list for the LINK element begins with the "%attrs;" parameter entity.
<!ELEMENT LINK - O EMPTY               -- a media-independent link -->
<!ATTLIST LINK
  %attrs;                              -- %coreattrs, %i18n, %events --
  charset     %Charset;      #IMPLIED  -- char encoding of linked resource --
  href        %URI;          #IMPLIED  -- URI for linked resource --
  hreflang    %LanguageCode; #IMPLIED  -- language code --
  type        %ContentType;  #IMPLIED  -- advisory content type --
  rel         %LinkTypes;    #IMPLIED  -- forward link types --
  rev         %LinkTypes;    #IMPLIED  -- reverse link types --
  media       %MediaDesc;    #IMPLIED  -- for rendering on these media --
>
Start tag: required, End tag: forbidden
The "%attrs;" parameter entity is defined as follows:
<!ENTITY % attrs "%coreattrs; %i18n; %events;">
The "%coreattrs;" parameter entity in the "%attrs;" definition expands as follows:
<!ENTITY % coreattrs
 "id          ID             #IMPLIED  -- document-wide unique id --
  class       CDATA          #IMPLIED  -- space-separated list of classes --
  style       %StyleSheet;   #IMPLIED  -- associated style info --
  title       %Text;         #IMPLIED  -- advisory title --"
>
The "%attrs;" parameter entity has been defined for convenience since these attributes are defined for most HTML element types.
Similarly, the DTD defines the "%URI;" parameter entity as expanding into the string "CDATA".
<!ENTITY % URI "CDATA"
    -- a Uniform Resource Identifier,
       see [URI]
    -->
As this example illustrates, the parameter entity "%URI;" provides readers of the DTD with more information as to the type of data expected for an attribute. Similar entities have been defined for "%Color;", "%Charset;", "%Length;", "%Pixels;", etc.
Boolean attributes
Some attributes play the role of Boolean variables (e.g., the selected attribute for the OPTION element). Their appearance in the start tag of an element implies that the value of the attribute is "true". Their absence implies a value of "false".
Boolean attributes may legally take a single value: the name of the attribute itself (e.g., selected="selected").
This example defines the selected attribute to be a Boolean attribute.
selected     (selected)  #IMPLIED  -- option is pre-selected --
The attribute is set to "true" by appearing in the element's start tag:
<OPTION selected="selected">
...contents...
</OPTION>
In HTML, Boolean attributes may appear in minimized form: the attribute's value appears alone in the element's start tag. Thus, selected may be set by writing:
<OPTION selected>
Instead of:
<OPTION selected="selected">
Authors should be aware that many user agents only recognize the minimized form of Boolean attributes and not the full form.
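A program that reads HTML can treat both spellings alike by testing for the presence of the attribute rather than inspecting its value. The Python sketch below uses the HTMLParser class from Python's standard library, which is unrelated to the DTD and is chosen here only for illustration; the class name SelectedCollector is hypothetical. That parser reports a minimized attribute with the value None and converts tag and attribute names to lower case.

```python
from html.parser import HTMLParser

class SelectedCollector(HTMLParser):
    """Record, for each OPTION start tag, whether "selected" is present."""

    def __init__(self) -> None:
        super().__init__()
        self.selected_flags: list[bool] = []

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            # attrs is a list of (name, value) pairs; presence alone
            # means "true", whatever the value is.
            self.selected_flags.append("selected" in dict(attrs))

parser = SelectedCollector()
parser.feed(
    "<OPTION selected>a</OPTION>"
    '<OPTION selected="selected">b</OPTION>'
    "<OPTION>c</OPTION>"
)
parser.close()
print(parser.selected_flags)  # [True, True, False]
```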