XML
Readers may find it odd that a book on ASP includes a chapter on this topic. In fact, the Extensible Markup Language (XML) is working its way ever deeper into our lives, which is a good thing. XML can cross all boundaries: if you want a truly independent, cross-platform data transfer format, XML may be the only choice.
This may sound discouraging, but in the computer industry XML is accepted by almost every developer, large or small. Standards have rarely been this unified; the last time it happened was when TCP/IP was adopted as the standard network protocol. XML is an international standard controlled by an industry standards group, it is widely supported around the world, and it is one of the few technologies with only one standard.
Readers may find this strange, since standards today change as often as the seasons, and even a single standard can be split into fragments by companies pursuing competitive advantage. XML is an exception, because it really does have broad support from many companies. It is remarkable that everyone seems to be trying to implement and follow this one standard. XML is even more surprising when you think back to the standards battles that have caused the industry so much trouble in the past.
Since XML is a universally implemented standard, it is also easy to use in ASP. If you use ASP to create a web site, you will probably use some form of database to store the data. XML is another format for storing data, and it is being used more and more widely, so it is worth mastering. Of course, XML can do more than this.
Although full support will not arrive all at once, XML support has already been introduced into IE and ADO. The difficulty is that IE and ADO have developed out of step, so they do not interoperate ideally, and this chapter therefore cannot present the universal data transfer technique that people want. At the time of writing, IE and ADO are not yet tightly integrated, but both are steadily improving. So although there is no specific information about new releases, ADO and IE will certainly be better integrated in the future.
Before giving XML an explicit definition, it is best to understand what a markup language is. There is an immediate problem, because the term "language" is a misnomer. XML is not a programming language in the way that VB or C++ is; it is just a set of rules that define how to mark up text or documents. So what does "markup" mean? Marking up a document is the process of identifying certain parts of the document as having a special meaning. This may be hard to grasp in the abstract, so consider the Hypertext Markup Language (HTML), where the "M" stands for markup.
HTML is a set of tags that specify the layout of a document. HTML contains a number of predefined tags, each with its own meaning, such as:
This is text that contains a small amount of markup. The text begins with the <BODY> tag, which in HTML marks the beginning of the document body, and the body ends with the </BODY> tag. Within the body of this document are headings, placed between <H1> and
You may notice that the description above does not use the word "format". This is deliberate, because markup and formatting are not the same thing. The <BODY> tag marks a region of the document and does not define any formatting. The <B> tag, however, marks a region of the document that is to be shown in bold. This is because the <B> tag in HTML is a tag that implicitly carries a specified format.
So keep in mind that a markup language is just a set of rules that define how to attach special meaning to particular parts of a document. That meaning may well be used for formatting, but formatting is not the only reason to use markup.
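To make the idea of markup concrete, the following minimal sketch uses Python's standard `html.parser` module to walk a small HTML fragment and report which tag encloses each piece of text. The fragment and the class name `TagReporter` are assumptions made for illustration; they are not the book's original listing.

```python
from html.parser import HTMLParser

# Hypothetical HTML fragment, assumed for illustration only.
SAMPLE = "<BODY><H1>A heading</H1>Some <B>bold</B> text</BODY>"

class TagReporter(HTMLParser):
    """Records each run of text together with the tag that encloses it."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of currently open tags
        self.marked = []      # (enclosing tag, text) pairs

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        if data.strip() and self.open_tags:
            self.marked.append((self.open_tags[-1], data.strip()))

reporter = TagReporter()
reporter.feed(SAMPLE)
reporter.close()
for tag, text in reporter.marked:
    print(f"{tag}: {text}")
```

Note that the HTML parser reports tag names in lower case, because HTML tag names are not case sensitive. The sketch only identifies which part of the text each tag marks; it applies no formatting at all.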
11.1.1 The difference between XML and HTML
Although both XML and HTML use tags, they are different. The main difference is that XML is specifically used to describe the structure of the text, not how to display it. XML does not have a fixed set of tags. For example:
Does the code above look exactly the same as the HTML code in the previous section? If it is an HTML document, it is the same. If you load it into a browser, the content appears as shown in Figure 11-1, rendered as a formatted document.
However, if the code above is an XML document, then the markup in it has no meaning of its own. The content is only a description:
• There is a tag named BODY, which has some text inside it.
• There is a tag named H1, which has some text inside it.
• There is a tag named B, which has some text inside it.
If the code above is loaded into IE as an XML document (with the file extension .xml), this can be seen very clearly. The result is shown in Figure 11-2.
IE interprets the XML document and displays it. Note that IE did not do any processing on this XML document; it simply displayed it. The browser knows how to interpret an HTML document and how to display it in the format defined by the tags. Similarly, the browser knows how to interpret an XML document, but because XML tags do not define a format, the document is not formatted and the tags are displayed as they are.
In fact, IE does apply a little formatting to make the XML easier to read. It indents the tags by level, so we see a structured set of tags, but IE does not interpret them.
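The same point can be sketched in Python with the standard `xml.etree.ElementTree` module: an XML parser reports only the names, nesting, and text of the tags, and attaches no formatting meaning to them. The document string and the helper `describe` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, assumed for illustration; the tag names echo HTML,
# but an XML parser attaches no formatting meaning to them.
DOC = "<BODY><H1>A heading</H1><B>Some bold text</B></BODY>"

root = ET.fromstring(DOC)

def describe(element, depth=0):
    """Return one line per element: its name and its text, indented by depth."""
    text = (element.text or "").strip()
    lines = [f"{'  ' * depth}{element.tag}: {text!r}"]
    for child in element:
        lines.extend(describe(child, depth + 1))
    return lines

report = describe(root)
print("\n".join(report))
```

The indented listing is comparable to what a browser shows for an unstyled XML file: structure only, with every tag treated alike.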
The XML documents seen so far consist of tags that mark up parts of a document. So how does XML relate to data? Let's look at another example. This example appeared in earlier chapters, and the reader will find that XML is very meaningful here.
Several different pairs of tags are used in the example above. At first, you might think that these tags must have meaning. They all have meaningful names, defining a list of authors, a single author, and some values related to that author. This content appeared several times in earlier chapters, and when it was viewed in a browser we could format it as a table. But since these are XML tags, they mean nothing to XML, as shown in Figure 11-3.
As you can see, IE does nothing with it here. So even though the tags are meaningful to us, they have no meaning to XML. In fact, this code could be written in the following form:
The browser simply displays the tags intact, as shown in Figure 11-4.
A tag can have any name you like. Of course, it is more intuitive to give tags meaningful names from the outset. XML is meant to be readable, so you typically use tag names that describe their content.
So far, you have seen that XML consists of a series of tags that describe the parts of a document. In the author example above, XML is used to describe data, with tag names that represent the data field names. This is the real significance of XML as a data interchange format. It is standard text, so it can easily be transferred from one machine to another. It is not a proprietary format, so anyone can read it, and if the tag names are meaningful, XML data is "self-describing".
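As a rough illustration of "self-describing" data, the sketch below parses two versions of the same author records with Python's `xml.etree.ElementTree`. All tag names, values, and the helper `to_records` are assumptions made for illustration, since the chapter's original listings are not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical author records; the tag names and values are assumptions.
NAMED = """
<Authors>
  <Author><FirstName>Ann</FirstName><LastName>Smith</LastName></Author>
  <Author><FirstName>Raj</FirstName><LastName>Patel</LastName></Author>
</Authors>
"""

# The same structure with arbitrary tag names: equally good XML, but the
# names no longer tell a human reader what the data is.
OPAQUE = """
<T1>
  <T2><T3>Ann</T3><T4>Smith</T4></T2>
  <T2><T3>Raj</T3><T4>Patel</T4></T2>
</T1>
"""

def to_records(xml_text):
    """Turn each child of the root into a dict of {tag name: text}."""
    root = ET.fromstring(xml_text)
    return [{field.tag: field.text for field in record} for record in root]

named = to_records(NAMED)
opaque = to_records(OPAQUE)
print(named)
print(opaque)
```

The parser handles both documents identically. Only the first produces field names that a person, or a program written by one, can use without extra documentation.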
11.1.2 Tags and elements
The name "tag" refers to individual HTML tags such as <B> or <H1>. An element is the whole that is formed by using these tags. For example:
An element therefore consists of a start tag and an end tag, which surround the text in between, and it can include other child elements. This is important because it relates to the concept of "well-formed XML", in which every start tag must have a corresponding end tag. This differs from HTML 4.0 and earlier versions, in which some tags have no end tag (for example <IMG> and <BR>) and the end tag of others, such as <P>, is optional.
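The matching-end-tag rule can be checked with a short Python sketch. It relies on the fact that `xml.etree.ElementTree.fromstring` raises `ParseError` for text that is not well-formed; the helper name `is_well_formed` and the two fragments are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Hypothetical fragments, assumed for illustration.
closed = "<P>First line<BR></BR>second line</P>"
unclosed = "<P>First line<BR>second line</P>"   # HTML-style <BR> with no end tag

print(is_well_formed(closed))
print(is_well_formed(unclosed))
```

The second fragment is the kind of markup that HTML 4.0 tolerates, yet an XML parser rejects it because the BR element is never closed.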
If you use XML to describe data, some fields may contain no data. In this case, the element is empty. There are two ways to define an empty element in XML. The first is to use a start tag and an end tag with no content:
Another aspect of well-formed XML is that XML tags are case sensitive, so the start tag and the end tag must match in case. This also means that the following line is invalid XML:
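Both points can be sketched briefly in Python. The example assumes a hypothetical field named `Phone` and uses the two standard XML spellings of an empty element: a start tag followed directly by an end tag, and the empty-element shorthand defined by the XML specification. It then shows a parser rejecting a start and end tag that differ only in case.

```python
import xml.etree.ElementTree as ET

# Hypothetical field name "Phone", assumed for illustration.
long_form = ET.fromstring("<Phone></Phone>")   # start tag plus end tag, no content
short_form = ET.fromstring("<Phone/>")         # empty-element shorthand

# Both forms yield an element with no text and no children.
both_empty = all(e.text is None and len(e) == 0 for e in (long_form, short_form))
print(both_empty)

# Tag names are case sensitive, so <Phone> is not closed by </phone>.
try:
    ET.fromstring("<Phone>555-0100</phone>")
    case_mismatch_rejected = False
except ET.ParseError:
    case_mismatch_rejected = True
print(case_mismatch_rejected)
```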
1. Root tag
Another term to know is the root tag. It is the outermost tag, and an XML document can have only one root. For example, let's look at the author example again:
The root tag here is <Authors>. Because there is only one root tag, the document above is legal. But the following code is wrong:
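The single-root rule can be sketched in Python as follows. The two documents are assumptions made for illustration: the first wraps both authors in one root element, and the second has two top-level elements, which the parser refuses.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents, assumed for illustration.
one_root = "<Authors><Author>Ann Smith</Author><Author>Raj Patel</Author></Authors>"
two_roots = "<Author>Ann Smith</Author><Author>Raj Patel</Author>"

root = ET.fromstring(one_root)
print(root.tag, len(root))   # root element name and number of child elements

try:
    ET.fromstring(two_roots)
    second_root_rejected = False
except ET.ParseError as err:
    second_root_rejected = True
    print("rejected:", err)
```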
11.1.3 Schemas and document type definitions
We started by stating that XML tags don't actually mean anything and that you can give a tag any name, but then how do you know which tags are allowed in a document? For this you use a document type definition (DTD) or a schema. Schemas and DTDs do almost the same job: they define which elements are allowed in a document, and they let a well-formed XML document be checked as a valid XML document. A valid document is one that is correctly marked up (that is, well-formed) and contains only the allowed elements and attributes.
The reason schemas exist alongside DTDs is that Microsoft feels DTDs are awkward in some respects. A DTD is a text file that defines the structure of an XML document, but the DTD itself is not XML and has completely different syntax rules. This is a bit odd, so we agree with Microsoft on this point. If you are working with XML documents, then the structure that defines those documents should also be XML, and that is what a schema does: a schema is the XML equivalent of a DTD.
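The difference between well-formed and valid can be sketched in Python. Python's standard library does not validate against DTDs or schemas, so this example uses a hand-written dictionary of allowed child elements as a stand-in for the idea. The dictionary, the element names, and the helper `find_violations` are all assumptions; none of this is DTD or schema syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: which child element names each element may contain.
# This dictionary is a stand-in for the idea only; it is not DTD or schema syntax.
ALLOWED_CHILDREN = {
    "Authors": {"Author"},
    "Author": {"FirstName", "LastName"},
    "FirstName": set(),
    "LastName": set(),
}

def find_violations(element):
    """Return a list of messages for elements that the rules do not allow."""
    if element.tag not in ALLOWED_CHILDREN:
        return [f"unknown element <{element.tag}>"]
    problems = []
    for child in element:
        if child.tag not in ALLOWED_CHILDREN[element.tag]:
            problems.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        else:
            problems.extend(find_violations(child))
    return problems

# Both documents are well-formed; only the first obeys the rules above.
good = ET.fromstring("<Authors><Author><FirstName>Ann</FirstName></Author></Authors>")
bad = ET.fromstring("<Authors><Author><Salary>1</Salary></Author></Authors>")

print(find_violations(good))
print(find_violations(bad))
```

A real DTD or schema can express much more than this, such as attributes, ordering, and how many times an element may occur. The principle is the same: the rules live outside the document, and a document is valid only if it is well-formed and also obeys them.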