Working with Word in C#: Introduction to the Aspose.Words Component and a DOM Overview

Source: Internet
Author: User

1. Basic Introduction

Aspose.Words is a commercial .NET class library that lets applications perform a wide range of document-processing tasks. It supports DOC, DOCX, RTF, HTML, OpenDocument, PDF, XPS, EPUB, and other formats. You can use Aspose.Words to generate, modify, convert, and print documents without using Microsoft Word. Using Aspose.Words in your project offers the following benefits.

1.1 Rich feature set

Its features fall into four main areas:

1) Format conversion. Aspose.Words provides high-quality file format conversion and can convert to DOC, OOXML, RTF, TXT, and other formats.

2) Document object model. A rich API gives programmatic access to all document elements and formatting, so you can create, modify, extract, copy, split, join, and replace document content.

3) Document rendering. On the server side you can convert a whole document or individual pages to PDF, XPS, or SWF, and you can render document pages to an image format or to a .NET Graphics object, with results that match what Microsoft Word produces.

4) Reporting. You can generate documents from objects, or fill in a template from a data source.
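As a minimal sketch of the format conversion and document-building features listed above, the following builds a small document in memory and saves it in a chosen format. It assumes the Aspose.Words NuGet package is referenced; the class and method names (`ConversionSketch`, `BuildAndConvert`) are hypothetical and used only for illustration.

```csharp
using System.IO;
using Aspose.Words;

public static class ConversionSketch
{
    // Hypothetical helper: builds a one-paragraph document in memory
    // and returns it serialized in the requested format.
    public static byte[] BuildAndConvert(string text, SaveFormat format)
    {
        var doc = new Document();               // new, empty document
        var builder = new DocumentBuilder(doc); // cursor-style helper for adding content
        builder.Writeln(text);

        using var stream = new MemoryStream();
        doc.Save(stream, format);               // the format decides the output type
        return stream.ToArray();
    }
}
```

Calling `ConversionSketch.BuildAndConvert("Hello", SaveFormat.Pdf)` should return the bytes of a PDF file. To convert an existing file, the same `Document` class can be constructed from a file path and saved under a name with a different extension.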

1.2 No Microsoft Word required

Aspose.Words works on machines that do not have Microsoft Office installed. All Aspose components are self-contained and do not require a license from Microsoft. In short, Aspose.Words is a strong choice in terms of security, stability, scalability, speed, price, and automation capabilities.

1.3 Platform independence

Aspose.Words runs on Windows, Linux, and Mac OS. You can use it to build 32-bit or 64-bit .NET applications, including ASP.NET, WCF, and WinForms applications, and you can use it as a COM component from ASP, Perl, PHP, and Python. You can also use Aspose.Words to build .NET applications on the Mono platform.

1.4 Performance and Scalability

Aspose.Words can run on both the server and the client. It is a standalone .NET assembly that can simply be copied and deployed with any .NET application. With Aspose.Words you can open documents, modify formatting and content, populate them with data, and save them, generating thousands of documents in a short period of time. Aspose.Words is thread-safe in the sense that different threads can process different documents at the same time.

1.5 Minimum learning curve

Although Aspose.Words has more than 150 public classes and enumeration types, its learning curve is gentle because the API is designed around the following goals:

1) Draw on the design of well-known APIs, such as that of Microsoft Word.

2) Follow the .NET Framework design guidelines.

3) Provide detailed, easy-to-use documentation for operations on document elements.

Developers who have previously automated Microsoft Word in their projects will find many familiar classes, methods, and properties in Aspose.Words.

2. Document Object Model Overview

2.1 DOM Introduction

The Aspose.Words Document Object Model (hereinafter, the DOM) is an in-memory representation of a Word document. Through the DOM you can programmatically read, manipulate, and modify the content and formatting of a Word document. Understanding the structure of the DOM and its corresponding types is the basis for flexible programming with Aspose.Words. The following is an example Word document and its structure:

When the above document is read into the Aspose.Words DOM, a tree of objects with the following structure is created:

Comparing the structure with the Word document shows roughly how the related objects are arranged in the DOM, and with these basic concepts we can work with Word documents in a systematic way. Document, Section, Paragraph, Table, Shape, Run, and the other ovals in the diagram are Aspose.Words objects arranged in a tree hierarchy, and the annotations in the diagram show that the objects in the tree also have multiple properties.

The DOM in Aspose.Words has the following characteristics:

1. All node classes ultimately inherit from the Node class, which is the base type of the Aspose.Words DOM.

2. Nodes that can contain (nest) other nodes, such as Section and Paragraph, inherit from the CompositeNode class, which in turn derives from the Node class.
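The two characteristics above can be checked in code. This is a small sketch, assuming the Aspose.Words NuGet package is referenced; `DomSketch` and `Describe` are hypothetical names, while `NodeType` and `IsComposite` are properties of the library's `Node` class.

```csharp
using Aspose.Words;

public static class DomSketch
{
    // Every DOM object is a Node, so one method can describe any of them.
    // IsComposite is true for nodes that derive from CompositeNode.
    public static string Describe(Node node)
    {
        string kind = node.IsComposite ? "composite" : "leaf";
        return $"{node.NodeType} ({kind})";
    }
}
```

For a document created with `DocumentBuilder.Write("Hello")`, `Describe` applied to the document, to its first paragraph, and to that paragraph's first run should report a composite Document, a composite Paragraph, and a leaf Run, respectively.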

2.2 Node type

When Aspose.Words reads a Word document into memory, each kind of document element is represented by a different type of object: every run of text, paragraph, table, and section, and even the document itself, is a node object. Aspose.Words defines a class for each document node type.

The following UML class diagram shows the relationships between the different node types in the DOM. The names of abstract classes are shown in italics. Note that the Aspose.Words DOM also includes non-node classes, such as Style, PageSetup, and Font, which are not shown in the diagram.

The major classes and their roles are listed below.

| Aspose.Words class | Category | Description |
|---|---|---|
| Document | Document | The root node of the document tree; provides access to the entire document |
| Section | Document | Corresponds to one section of a document |
| Body | Document | The main text container of a section |
| HeaderFooter | Document | A header or footer container of a section |
| GlossaryDocument | Document | The root entry of the glossary in a Word document |
| BuildingBlock | Document | A glossary document entry, such as a building block, AutoText, or AutoCorrect entry |
| Paragraph | Text | A paragraph of text; contains inline nodes |
| Run | Text | A run of text with consistent formatting |
| BookmarkStart | Text | The start marker of a bookmark |
| BookmarkEnd | Text | The end marker of a bookmark |
| FieldStart | Text | A special character that marks the beginning of a Word field |
| FieldSeparator | Text | The separator of a Word field |
| FieldEnd | Text | A special character that marks the end of a Word field |
| FormField | Text | A form field |
| SpecialChar | Text | A special character with no more specific type |
| Table | Tables | A table in a Word document |
| Row | Tables | A row of a Table object |
| Cell | Tables | A cell of a table row |
| Shape | Shapes | An image, shape, text box, or OLE object in a Word document |
| GroupShape | Shapes | A group of Shape objects |
| DrawingML | Shapes | A shape, image, or chart in a document |
| Footnote | Annotations | A footnote or endnote containing text in the document |
| Comment | Annotations | A comment containing text in the document |
| CommentRangeStart | Annotations | The start of the region a comment relates to |
| CommentRangeEnd | Annotations | The end of the region a comment relates to |
| SmartTag | Markup | A smart tag that surrounds one or more inline structures within a paragraph |
| CustomXmlMarkup | Markup | Custom XML markup around some structures in a document |
| StructuredDocumentTag | Markup | A structured document tag (content control) in a document |
| OfficeMath | Math | An Office Math object, such as a function, equation, or matrix |
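To see which of the node classes listed above actually occur in a given document, the nodes can be tallied by type. The following is a sketch only, assuming the Aspose.Words NuGet package is referenced; `NodeCountSketch` and `CountByType` are hypothetical names. It relies on `GetChildNodes(NodeType.Any, true)`, which returns all descendants of a composite node.

```csharp
using System.Collections.Generic;
using Aspose.Words;

public static class NodeCountSketch
{
    // Tallies every descendant node of the document by its NodeType.
    // The Document node itself is not included, only its descendants.
    public static Dictionary<NodeType, int> CountByType(Document doc)
    {
        var counts = new Dictionary<NodeType, int>();
        foreach (Node node in doc.GetChildNodes(NodeType.Any, true))
        {
            counts.TryGetValue(node.NodeType, out int seen); // 0 when the key is absent
            counts[node.NodeType] = seen + 1;
        }
        return counts;
    }
}
```

Passing a specific value such as `NodeType.Paragraph` instead of `NodeType.Any` restricts the result to nodes of that type.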

2.3 Composition Mode

The tree structure of an Aspose.Words document is very important. The following diagrams make the containment relationships between the nodes clear.

2.3.1 Document and section

Documents and sections:

From the diagram we can see that:

1. A Document has one or more Section nodes;

2. A Section has one Body and zero or more HeaderFooter nodes;

3. Body and HeaderFooter can contain multiple block-level nodes;

4. A Document can have one GlossaryDocument.

A Word document contains one or more sections. Each section defines its own page numbering, margins, orientation, and header and footer text. A section contains the main text as well as its headers and footers (first page, odd pages, even pages).
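The relationship between a document and its sections can be sketched as follows, assuming the Aspose.Words NuGet package is referenced. `SectionSketch`, `BuildTwoSections`, and `Summarize` are hypothetical names; the text written into the document is placeholder content.

```csharp
using System.Collections.Generic;
using Aspose.Words;

public static class SectionSketch
{
    // Builds a document with two sections; the second is landscape
    // and has its own primary header.
    public static Document BuildTwoSections()
    {
        var doc = new Document();
        var builder = new DocumentBuilder(doc);
        builder.Writeln("First section.");

        builder.InsertBreak(BreakType.SectionBreakNewPage);     // starts a new Section
        builder.PageSetup.Orientation = Orientation.Landscape;  // applies to the current section
        builder.Writeln("Second section.");

        builder.MoveToHeaderFooter(HeaderFooterType.HeaderPrimary);
        builder.Write("Header of the second section");
        return doc;
    }

    // Reports each section's orientation and how many HeaderFooter nodes it has.
    public static List<string> Summarize(Document doc)
    {
        var lines = new List<string>();
        foreach (Section section in doc.Sections)
        {
            lines.Add($"{section.PageSetup.Orientation}, " +
                      $"headers/footers: {section.HeadersFooters.Count}");
        }
        return lines;
    }
}
```

Each `Section` exposes its own `PageSetup`, `Body`, and `HeadersFooters`, which mirrors the containment rules listed above.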

2.3.2 Block-level Node

The diagram for block-level nodes is as follows:

From the diagram we can see that:

1. Block-level elements can appear in many places in a document, for example as child nodes of a Body, Footnote, Comment, or Cell;

2. The most important block-level nodes are Table and Paragraph;

3. A Table has zero or more rows;

4. CustomXmlMarkup and StructuredDocumentTag can contain other block-level nodes.

2.3.3 Inline-level Node

From the diagram we can see the following relationships:

1. Paragraph is the most common container of inline-level nodes;

2. A Paragraph can contain Run nodes with different formatting, as well as bookmarks and annotations;

3. A Paragraph can also contain shapes, images, drawing objects, and smart tags.

2.3.4 Tables, Rows, and Cells

A table can contain many rows, rows can contain cells, and cells can include block-level nodes.
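The Table, Row, Cell containment can be illustrated with a short sketch, assuming the Aspose.Words NuGet package is referenced. `TableSketch`, `BuildTable`, and `ReadCells` are hypothetical names chosen for this example.

```csharp
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Tables;

public static class TableSketch
{
    // Writes a table at the builder's current position, one Row per inner array.
    public static Table BuildTable(DocumentBuilder builder, string[][] data)
    {
        Table table = builder.StartTable();
        foreach (string[] rowData in data)
        {
            foreach (string text in rowData)
            {
                builder.InsertCell();   // new Cell in the current Row
                builder.Write(text);    // goes into a Paragraph inside the Cell
            }
            builder.EndRow();
        }
        builder.EndTable();
        return table;
    }

    // Walks Table -> Row -> Cell and collects each cell's plain text.
    public static List<string> ReadCells(Table table)
    {
        var result = new List<string>();
        foreach (Row row in table.Rows)
        {
            foreach (Cell cell in row.Cells)
            {
                result.Add(cell.ToString(SaveFormat.Text).Trim());
            }
        }
        return result;
    }
}
```

The text of a cell is read through its block-level children, which is why `ReadCells` exports the whole cell as text rather than reading a single string property.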

2.4 Design Patterns and navigation

Aspose.Words represents a document as a tree of nodes, so you can navigate from one node to another. Aspose.Words ships a sample project called DocumentExplorer, a demo that lets you browse this tree, as shown in the following:

The parent of any node can be reached through the ParentNode property of the Node class, so moving up the tree is straightforward. The Document Object Model is composed of a large number of objects, and they are related as follows:

1. The Node class is the base class of all node classes;

2. The CompositeNode class is the base class of all nodes that can contain other nodes;

3. The Node class has no child-management interface; the child-management methods appear only in CompositeNode;

4. Keeping the child-management methods out of the Node class makes the design cleaner and avoids many unnecessary type casts.
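The navigation described above can be sketched as a recursive walk down the tree and a climb back up through `ParentNode`. This assumes the Aspose.Words NuGet package is referenced; `TraversalSketch`, `Walk`, and `Ancestors` are hypothetical names.

```csharp
using System.Collections.Generic;
using Aspose.Words;

public static class TraversalSketch
{
    // Depth-first walk. Only CompositeNode exposes FirstChild, so the
    // type test decides whether there are children to descend into.
    public static void Walk(Node node, int depth, List<string> output)
    {
        output.Add(new string(' ', depth * 2) + node.NodeType);
        if (node is CompositeNode composite)
        {
            for (Node child = composite.FirstChild; child != null; child = child.NextSibling)
            {
                Walk(child, depth + 1, output);
            }
        }
    }

    // Follows ParentNode upward until the root (whose parent is null).
    public static List<NodeType> Ancestors(Node node)
    {
        var chain = new List<NodeType>();
        for (Node parent = node.ParentNode; parent != null; parent = parent.ParentNode)
        {
            chain.Add(parent.NodeType);
        }
        return chain;
    }
}
```

For a Run in a freshly built document, `Ancestors` should yield Paragraph, Body, Section, and Document, in that order, which matches the containment rules of section 2.3.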
