[XML/C++] TinyXML Chinese Document


Description: This document is a translation of the TinyXML 2.5.2 documentation. It was translated by Hansen with the consent of the original author, Lee Thomason. Please point out any mistakes or omissions.
Copyright: The copyright of the original document belongs to its author, and the copyright of the translation belongs to Hansen. For more information, see the source.
Original article: http://www.grinninglizard.com/tinyxmldocs/index.html

TinyXML documentation

TinyXML is a simple, small C++ XML parser that can be easily integrated into other programs.

What can it do?

In brief, TinyXML parses an XML document and builds from it a Document Object Model (DOM) that can be read, modified, and saved.

XML stands for eXtensible Markup Language. It allows you to create your own document markup. HTML does a good job of marking up documents for browsers, whereas XML allows you to define any kind of document markup. For example, you can define a document that describes a "to do" list for an organizer application. XML is a structured and convenient format. All the ad hoc file formats created to store application data can be replaced by XML, with one parser for everything.

The most comprehensive and correct description can be found at http://www.w3.org/TR/2004/REC-xml-20040204/. However, it is very terse and difficult to read. An introduction to XML that I like better is at http://skew.org/xml/tutorial.

There are different ways to access and interact with XML data. TinyXML uses a Document Object Model (DOM), which means that the XML data is parsed into C++ objects that can be browsed and manipulated, and then written to disk or another output stream. You can also construct an XML document from scratch with C++ objects and write it to disk or another output stream.

TinyXML is designed to be easy and fast to learn. It consists of two header files and four .cpp files. Simply add them to your project. There is an example file, xmltest.cpp, to get you started.

TinyXML is released under the zlib license, so you can use it in open source or commercial software. The license details can be found at the top of each source code file.

TinyXML tries to be a flexible parser while still producing correct and compliant XML output. TinyXML should compile on any reasonably compliant C++ system. It does not rely on exceptions or run-time type information (RTTI), and it can be compiled with or without STL support. TinyXML fully supports the UTF-8 encoding and the first 64k character entities. (Translator's note: if this sentence is unclear, you may want to read up on Unicode encodings.)

What it doesn't do

TinyXML does not parse or use DTDs (Document Type Definitions) or XSLs (eXtensible Stylesheet Language). Other parsers (search for XML at www.sourceforge.org) are much more fully featured, but they are also much bigger, take longer to set up in your project, have a steeper learning curve, and often have a more restrictive license. If you are working with browsers or have more complete XML needs, TinyXML is not the parser for you.

The following DTD syntax is not parsed by TinyXML:

<!DOCTYPE Archiv [
<!ELEMENT Comment (#PCDATA)>
]>

This is because TinyXML sees it as a !DOCTYPE node with an illegally embedded !ELEMENT node. This may be addressed in the future.

Tutorial

For the impatient, there is a tutorial to get you going. It is a good way to get started, but it is worth your time to read this document completely.
TinyXML Tutorial

Code status

TinyXML is mature, tested code and is very robust. If you find a bug, please submit a bug report on the SourceForge site (www.sourceforge.net/projects/tinyxml). We will fix it as soon as possible.

There are some areas that could be improved. If you are interested in working on TinyXML, please check SourceForge.

Related Projects

TinyXML projects you may find useful! (Descriptions provided by the projects.)
TinyXPath (http://tinyxpath.sourceforge.net). TinyXPath is a small-footprint XPath syntax decoder, written in C++.
TinyXML++ (http://code.google.com/p/ticpp/). TinyXML++ is a completely new interface to TinyXML that uses many of the strengths of C++, such as templates, exceptions, and much better error handling.

Features

Using STL

TinyXML can be compiled to use or not use STL. When STL is used, TinyXML uses the std::string class and fully supports std::istream, std::ostream, operator<<, and operator>>. Many API methods have both 'const char*' and 'const std::string&' forms.

When STL support is compiled out, no STL files are included at all. All the string classes are implemented by TinyXML itself. API methods all use the 'const char*' form for input.

Use the compile-time define:

TIXML_USE_STL

to compile one version or the other. This can be passed on the compiler command line or set as the first line of tinyxml.h.

Note: If you compile the test code on Linux, setting the environment variable TINYXML_USE_STL=YES/NO controls STL compilation. In the Windows project file, STL and non-STL targets are provided. In your own project, it is probably easiest to add the line "#define TIXML_USE_STL" as the first line of tinyxml.h.

UTF-8

TinyXML supports UTF-8, so it can handle XML files in any language. TinyXML also supports a "legacy mode": the encoding used before UTF-8 support was added, which can probably best be described as "extended ASCII".

Normally, TinyXML will try to detect the correct encoding and use it. However, by setting the value of TIXML_DEFAULT_ENCODING in the header file, TinyXML can be forced to always use one encoding.

TinyXML assumes legacy mode until one of the following occurs:
If the non-standard but common "UTF-8 lead bytes" (0xef 0xbb 0xbf) begin the file or data stream, TinyXML reads it as UTF-8.
If a declaration containing encoding="UTF-8" is read, TinyXML reads the document as UTF-8.
If a declaration that does not specify an encoding is read, TinyXML reads the document as UTF-8.
If a declaration containing encoding="something else" is read, TinyXML reads the document in legacy mode. In legacy mode, TinyXML works as it did before. It is not clear exactly what that mode does, but old content should keep working.
Until one of the above criteria is met, TinyXML runs in legacy mode.
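
The detection rules above can be sketched as a small standalone function. This is only an illustration of the listed rules, not TinyXML's actual implementation: the names `Encoding` and `DetectEncoding` are hypothetical, the declaration parsing is deliberately crude, and comparing the encoding name case-insensitively is an assumption of this sketch.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical names for this sketch; they are not part of TinyXML.
enum class Encoding { Legacy, Utf8 };

// Applies the rules listed above to the beginning of an XML buffer.
Encoding DetectEncoding(const std::string& xml) {
    // Rule 1: the UTF-8 lead bytes 0xEF 0xBB 0xBF begin the data.
    if (xml.size() >= 3 &&
        static_cast<unsigned char>(xml[0]) == 0xEF &&
        static_cast<unsigned char>(xml[1]) == 0xBB &&
        static_cast<unsigned char>(xml[2]) == 0xBF) {
        return Encoding::Utf8;
    }

    // Locate the declaration; without one, no rule applies.
    const std::string::size_type start = xml.find("<?xml");
    if (start == std::string::npos) return Encoding::Legacy;
    const std::string::size_type end = xml.find("?>", start);
    if (end == std::string::npos) return Encoding::Legacy;
    const std::string decl = xml.substr(start, end - start);

    // Rule 3: a declaration without an encoding is read as UTF-8.
    const std::string::size_type enc = decl.find("encoding");
    if (enc == std::string::npos) return Encoding::Utf8;

    // Extract the quoted encoding value (crude, for illustration only).
    const std::string::size_type q1 = decl.find_first_of("\"'", enc);
    if (q1 == std::string::npos) return Encoding::Legacy;
    const std::string::size_type q2 = decl.find(decl[q1], q1 + 1);
    if (q2 == std::string::npos) return Encoding::Legacy;
    std::string value = decl.substr(q1 + 1, q2 - q1 - 1);
    for (char& c : value) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }

    // Rule 2: encoding="UTF-8" is UTF-8. Rule 4: anything else is legacy.
    return value == "utf-8" ? Encoding::Utf8 : Encoding::Legacy;
}
```

With this sketch, a buffer that starts with a declaration lacking an encoding attribute is classified as UTF-8, while a bare `<a/>` with no lead bytes and no declaration stays in legacy mode.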

What happens if the encoding is incorrectly set or detected? TinyXML will try to read and pass through text that appears to be improperly encoded, so you may get some strange results or mangled characters. In that case you may want to force TinyXML to the correct mode.

You can force TinyXML into legacy mode by using LoadFile( TIXML_ENCODING_LEGACY ) or LoadFile( filename, TIXML_ENCODING_LEGACY ). You can force it to use legacy mode all the time by setting TIXML_DEFAULT_ENCODING = TIXML_ENCODING_LEGACY. Likewise, you can force TIXML_ENCODING_UTF8 with the same technique.

For English users working with English XML, UTF-8 is the same as low-ASCII. You don't need to be aware of UTF-8 or change your code in any way. You can think of UTF-8 as a superset of ASCII.

UTF-8 is not a double-byte format, but it is a standard encoding of Unicode! TinyXML does not currently use or directly support wchar, TCHAR, or Microsoft's _UNICODE. The term "Unicode" is often used improperly to refer to UTF-16 (a wide-byte encoding of Unicode), which is a source of confusion.

For "high-ASCII" languages (pretty much everything other than English), TinyXML can handle all of them at the same time as long as the XML is encoded in UTF-8. That can be a little tricky. Older programs and operating systems tend to use a "default" or "traditional" code page. Many applications (and almost all modern ones) can output UTF-8, but older or stubborn (or just broken) systems still output text in the default code page.

For example, Japanese systems traditionally use SHIFT-JIS encoding, and text encoded as SHIFT-JIS cannot be read by TinyXML. However, a good text editor can import SHIFT-JIS text and save it as UTF-8.

The skew.org link does a good job of covering encoding conversion.

The test file "utf8test.xml" contains English, Spanish, Russian, and Simplified Chinese (hopefully they are all translated correctly). The file "utf8test.gif" is a screen capture of the XML file rendered in IE. Note that if your system does not have the correct fonts (Simplified Chinese or Russian), you will not see output that matches the GIF file even if the file is parsed correctly. Also note that, at least on my Windows machine, the console uses a Western code page, so Print() or printf() cannot display the file correctly. This is not a bug in TinyXML; it is an operating system issue. TinyXML does not lose or corrupt any data. The console simply does not render UTF-8.

Entities

TinyXML recognizes the predefined "character entities", meaning special characters. Namely:

&amp;    &
&lt;     <
&gt;     >
&quot;   "
&apos;   '

These are recognized when the XML document is read and translated to their UTF-8 equivalents. For example, the XML text:

Far &amp; Away

will have the value "Far & Away" when queried from the TiXmlText object, and will be written back to the XML stream or file as an ampersand entity ("&amp;"). Older versions of TinyXML "preserved" character entities, but newer versions translate them into characters.

Additionally, any character can be specified by its Unicode code point: "&#xA0;" and "&#160;" both refer to the non-breaking space character.
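
As a rough illustration of the translation described above, the following standalone sketch decodes the five predefined entities and numeric character references into UTF-8. The function names `AppendUtf8` and `DecodeEntities` are hypothetical and this is not TinyXML's code; unrecognized or malformed references are simply copied through unchanged, which is a choice made for this sketch.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Appends the Unicode code point `cp` to `out` as 1 to 4 UTF-8 bytes.
void AppendUtf8(unsigned long cp, std::string& out) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

struct NamedEntity { const char* text; char ch; };

// Decodes &amp; &lt; &gt; &quot; &apos; and &#NNN; / &#xHHH; references.
std::string DecodeEntities(const std::string& in) {
    static const NamedEntity kEntities[] = {
        {"&amp;", '&'}, {"&lt;", '<'}, {"&gt;", '>'},
        {"&quot;", '"'}, {"&apos;", '\''}};
    std::string out;
    std::string::size_type i = 0;
    while (i < in.size()) {
        const std::string::size_type semi =
            in[i] == '&' ? in.find(';', i) : std::string::npos;
        if (semi == std::string::npos) {  // ordinary character or stray '&'
            out += in[i++];
            continue;
        }
        const std::string ref = in.substr(i, semi - i + 1);
        bool matched = false;
        if (ref.size() > 3 && ref[1] == '#') {
            // Numeric reference: hexadecimal after "&#x", decimal after "&#".
            const bool hex = (ref[2] == 'x');
            const std::string digits =
                ref.substr(hex ? 3 : 2, ref.size() - (hex ? 4 : 3));
            char* endp = nullptr;
            const unsigned long cp =
                std::strtoul(digits.c_str(), &endp, hex ? 16 : 10);
            if (!digits.empty() && *endp == '\0' && cp <= 0x10FFFF) {
                AppendUtf8(cp, out);
                matched = true;
            }
        } else {
            for (const NamedEntity& e : kEntities) {
                if (ref == e.text) { out += e.ch; matched = true; break; }
            }
        }
        if (matched) i = semi + 1;
        else out += in[i++];  // not a known reference: keep the '&' literally
    }
    return out;
}
```

For example, `DecodeEntities("Far &amp; Away")` yields `Far & Away`, and both `&#xA0;` and `&#160;` decode to the same two UTF-8 bytes (0xC2 0xA0).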

Printing

TinyXML can print output in several different ways, each with its own strengths and limitations.
Print( FILE* ): output to a standard C stream, which includes all C files as well as stdout.
"Pretty prints", but you have no control over the printing options.
The output is streamed directly to the FILE object, so there is no memory overhead in the TinyXML code.
Used by Print() and SaveFile().

operator<<: output to a C++ stream.
Integrates with standard C++ iostreams.
Outputs in "network printing" mode without line breaks. This is good for network transmission and for moving XML between C++ objects, but hard for a human to read.

TiXmlPrinter: output to a std::string or memory buffer.
The API is less concise.
Future printing options will be added here.
Printing may change slightly in future versions as it is refined and expanded.

Streams

When compiled with TIXML_USE_STL, TinyXML supports C++ streams (operator<<, operator>>) as well as C (FILE*) streams. There are some differences between them that you should be aware of:

C style output:
Based on FILE*
Uses the Print() and SaveFile() methods

Generates formatted output with plenty of white space, intended to be as human-readable as possible. These methods are very fast and tolerant of ill-formed XML documents. For example, an XML document that contains two root elements and two declarations will still print.

C style input:
Based on FILE*
Uses the Parse() and LoadFile() methods

A fast, tolerant read. Use it whenever you do not need C++ streams.

C++ style output:
Based on std::ostream
operator<<

Generates condensed output, intended for network transmission rather than readability. It may be somewhat slower (or may not), depending on your system's implementation of the ostream class. It is not tolerant of ill-formed XML: a document should contain exactly one root element. Additional root-level elements will not be streamed out.

C++ style input:
Based on std::istream
operator>>

Reads XML from a stream, which makes it useful for network transmission. The tricky part is knowing when the XML document is complete, since there will almost certainly be other data in the stream. TinyXML assumes that the XML data is complete after it reads the root element. Put another way, documents that are ill-constructed with more than one root element will not be read correctly. Also note that operator>> is somewhat slower than Parse, due to both the STL implementation and limitations of TinyXML.

White space

There is no consensus on whether white space should be kept or condensed. For example, pretend that '_' is a space and look at "Hello____world". HTML and at least some XML parsers interpret this as "Hello_world": they condense white space. Some XML parsers do not, and leave it as "Hello____world" (remember that _ stands for a space). Others suggest that _Hello___world_ should become Hello___world.

This is an issue that has not been resolved to my satisfaction. TinyXML supports the first two approaches. Call TiXmlBase::SetCondenseWhiteSpace( bool ) to set the desired behavior. By default, white space is condensed.

If you want to change the default, you should call TiXmlBase::SetCondenseWhiteSpace( bool ) before parsing any XML data, and I do not recommend changing it afterwards.
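
To make the "condense" behavior concrete, here is a minimal standalone sketch. The function name `CondenseWhiteSpace` is hypothetical and this is not TinyXML's implementation; in particular, dropping leading and trailing white space is an assumption of the sketch.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Collapses each run of white space into a single space and drops
// leading and trailing white space.
std::string CondenseWhiteSpace(const std::string& in) {
    std::string out;
    bool pendingSpace = false;  // a run of white space is waiting to be emitted
    for (unsigned char c : in) {
        if (std::isspace(c)) {
            // Only remember the run if some text has already been written.
            pendingSpace = !out.empty();
        } else {
            if (pendingSpace) out += ' ';
            pendingSpace = false;
            out += static_cast<char>(c);
        }
    }
    return out;
}
```

With this sketch, `"Hello    world"` becomes `"Hello world"`. A parser that keeps white space would return the input unchanged.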

Handles

When browsing an XML document in a robust way, it is important to check for null returns from method calls. An error-safe implementation can generate a lot of code like this:

TiXmlElement* root = document.FirstChildElement( "document" );
if ( root )
{
    TiXmlElement* element = root->FirstChildElement( "element" );
    if ( element )
    {
        TiXmlElement* child = element->FirstChildElement( "child" );
        if ( child )
        {
            TiXmlElement* child2 = child->NextSiblingElement( "child" );
            if ( child2 )
            {
                // Finally do something useful.
            }
        }
    }
}

Handles clean this up. Using the TiXmlHandle class, the previous code reduces to:

TiXmlHandle docHandle( &document );
TiXmlElement* child2 = docHandle.FirstChild( "document" ).FirstChild( "element" ).Child( "child", 1 ).ToElement();
if ( child2 )
{
    // Do something useful
}

This is much easier to deal with. See TiXmlHandle for more information.
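
The idea behind a handle can be shown without the library: wrap a possibly-null pointer so that every lookup on a null handle returns another null handle, and check only once at the end. The sketch below is a toy model under that assumption. `Node` and `Handle` are hypothetical stand-ins, not TinyXML classes, and the real TiXmlHandle API is richer than this.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM node; not a TinyXML class.
struct Node {
    std::string name;
    std::vector<Node> children;

    // Returns the index-th child with the given name, or nullptr if absent.
    const Node* Child(const std::string& childName, int index = 0) const {
        for (const Node& c : children) {
            if (c.name == childName && index-- == 0) return &c;
        }
        return nullptr;
    }
};

// Wraps a possibly-null node pointer so that lookups can be chained safely.
class Handle {
public:
    explicit Handle(const Node* node) : node_(node) {}

    // A lookup on a null handle yields another null handle instead of crashing.
    Handle Child(const std::string& name, int index = 0) const {
        return Handle(node_ ? node_->Child(name, index) : nullptr);
    }

    // The single null check happens on the value returned here.
    const Node* ToNode() const { return node_; }

private:
    const Node* node_;
};
```

A chain such as `Handle(&doc).Child("document").Child("element").Child("child", 1).ToNode()` then returns either the second "child" node or a null pointer, with no intermediate checks.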

Row and column tracking

For some applications, it is important to be able to trace nodes and attributes back to their original location in the source file. In addition, knowing where a parsing error occurred in the source file can save a lot of time.

TinyXML can track the row and column origin of all nodes and attributes in a text file. The TiXmlBase::Row() and TiXmlBase::Column() methods return the origin of the node in the source text. The correct tab size can be configured with TiXmlDocument::SetTabSize().
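
To illustrate why the tab size matters for column tracking, here is a small standalone sketch that converts a byte offset into a row and column. The names `Position` and `Locate` are hypothetical and this is not TinyXML's code. The sketch assumes 1-based rows and columns and a default tab size of 4, and it counts each byte as one column, so multi-byte UTF-8 characters are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Row and column are both 1-based in this sketch.
struct Position {
    int row;
    int column;
};

// Returns the row and column of the byte at `offset` in `text`.
Position Locate(const std::string& text, std::size_t offset, int tabSize = 4) {
    Position pos{1, 1};
    for (std::size_t i = 0; i < offset && i < text.size(); ++i) {
        const char c = text[i];
        if (c == '\n') {
            ++pos.row;
            pos.column = 1;
        } else if (c == '\t' && tabSize > 0) {
            // A tab advances the column to the next tab stop.
            pos.column = ((pos.column - 1) / tabSize + 1) * tabSize + 1;
        } else {
            ++pos.column;
        }
    }
    return pos;
}
```

For the text `"<a>\n\t<b/>\n</a>"`, the `<b/>` element starts at offset 5. With a tab size of 4 it is reported at row 2, column 5, and with a tab size of 8 at row 2, column 9.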

Using and installing

To compile and run xmltest:

A Linux Makefile and a Windows Visual C++ .dsw file are provided. Simply compile and run. It will write the file demotest.xml to your disk and generate output on the screen. It also tests walking the DOM in different ways and prints the number of nodes found.

The Linux makefile is very generic and runs on many systems; it has been tested on mingw and MacOSX. You do not need to run 'make depend' because the dependencies are hard-coded in the file.

Windows project file for VC6
tinyxml: tinyxml library, non-STL
tinyxmlSTL: tinyxml library, STL
tinyXmlTest: test application, non-STL
tinyXmlTestSTL: test application, STL

Makefile

At the top of the makefile, you can set:

PROFILE, DEBUG, and TINYXML_USE_STL. Details are in the makefile.

In the tinyxml directory, type "make clean" and then "make". The executable file "xmltest" will be created.

To use in an application:

Add tinyxml.cpp, tinyxml.h, tinyxmlerror.cpp, tinyxmlparser.cpp, tinystr.cpp, and tinystr.h to your project or makefile. That's it. It should compile on any reasonably compliant C++ system. You do not need to enable exceptions or RTTI for TinyXML.

How TinyXML works

An example is probably the best way to explain it. Take:

<?xml version="1.0" standalone="no" ?>
<!-- Our to do list data -->
<ToDo>
<Item priority="1"> Go to the <bold>Toy store!</bold></Item>
<Item priority="2"> Do bills</Item>
</ToDo>

It is not much of a to do list, but it will do. To read this file (say "demo.xml"), you create a document and parse it in:

TiXmlDocument doc( "demo.xml" );
doc.LoadFile();

Now it is ready to go. Let's look at some of the lines and how they relate to the DOM.

<?xml version="1.0" standalone="no" ?>

The first line is a declaration. It is turned into a TiXmlDeclaration object, which becomes the first child of the document node.

This is the only directive or special tag that TinyXML parses. Generally, directive tags are stored in TiXmlUnknown so that they are not lost when the document is saved back to disk.

<!-- Our to do list data -->

This is a comment and will become a TiXmlComment object.

<ToDo>

The "ToDo" tag defines a TiXmlElement object. This one does not have any attributes, but it contains two other elements.

<Item priority="1">

Creates another TiXmlElement, which is a child of the "ToDo" element. This element has one attribute, with the name "priority" and the value "1".

Go to the

A TiXmlText. This is a leaf node and cannot contain other nodes. It is a child of the "Item" TiXmlElement.

<bold>

Another TiXmlElement, which is also a child of the "Item" element.

And so on.

Finally, let's look at the entire object tree:

TiXmlDocument "demo.xml"
    TiXmlDeclaration "version='1.0'" "standalone=no"
    TiXmlComment "Our to do list data"
    TiXmlElement "ToDo"
        TiXmlElement "Item" Attributes: priority = 1
            TiXmlText "Go to the "
            TiXmlElement "bold"
                TiXmlText "Toy store!"
        TiXmlElement "Item" Attributes: priority = 2
            TiXmlText "Do bills"

Documentation

The documentation is built with Doxygen, using the 'dox' configuration file.

License

TinyXML is released under the zlib license:

This software is provided "as-is", without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software.

Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions:

1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required.

2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.

3. This notice may not be removed or altered from any source distribution.

References

The World Wide Web Consortium is the definitive standards body for XML, and its web pages contain a large amount of information.

The definitive spec: http://www.w3.org/TR/2004/REC-xml-20040204/

I also recommend "XML Pocket Reference" by Robert Eckstein, published by O'Reilly. This book covers everything you need to get started.

Contributors, contacts, and a brief history

Thank you very much to everyone who sends suggestions, bug reports, ideas, and encouragement. It all helps and makes this project fun. Special thanks go to the contributors who keep the web pages lively.

So many people have sent in bugs and ideas that, rather than list them here, we try to give credit where it is due in the "changes.txt" file.

TinyXML was originally written by Lee Thomason (the "I" that often appears in this document). Lee reviews changes and releases new versions with the help of Yves Berquin, Andrew Ellerton, and the TinyXML community.

We appreciate your suggestions and would love to know if you use TinyXML. We hope you enjoy it and find it useful. Please send questions, comments, and bug reports to us, or contact us through the website:

www.sourceforge.net/projects/tinyxml

Lee Thomason, Yves Berquin, Andrew Ellerton
