How to write a text editor in C#


How to write a text editor in C# (2005-8-24 edition)

Copyright 2005 Nanjing Trinidad Lone. Reprinting is unrestricted; please retain this copyright notice.

Summary

This article explores how to develop a formatted text editor in C# from the ground up. It discusses the design of the document object model, the handling of the graphical user interface, and the response to user operations, and explains some technical problems and their solutions.

Objective

I started programming at university six years ago and have been programming for a living for four years. I have read many programs and written many of my own. At school I thought of programming as a kind of art. In a sense, playing games such as Civilization, Pharaoh, or StarCraft is also a kind of art: you watch the buildings and population you painstakingly built grow from few to many and from simple to complex, and you feel a sense of accomplishment. Programming is the same. As a program grows from dozens of lines to tens of thousands, and from Hello World to something complex and powerful, it brings a great sense of achievement too.

Only after graduating and starting work did I gradually realize that software development is essentially toolmaking, whether the tool is for others or for yourself. With the right tool, many problems are readily solved. Programmers are therefore the same kind of people as masons, blacksmiths, and carpenters. There is nothing wrong with that: programmers are not superior to anyone, and what matters for anyone in society is doing the work conscientiously.

Problem

Enough digression. On to the question in the title: how to write a text editor in C#. I was fortunate enough to have developed a fairly complex text editor, so I have some experience to share. The text editor meant here is not a simple single-line or multi-line edit box like the standard Windows ones, but a text editor similar to Word.

At first glance an editor does not look difficult. But things that people find easy are often hard problems for a computer. For example, in recent years many websites require a so-called verification code at login in addition to the user name and password. The code is drawn crookedly next to the input box, as if a pupil had scribbled it on a dirty piece of paper. Its only purpose is to stop programs from simulating a login, because people can easily read the crooked text while computers find it very hard.

A text editor mainly has to deal with the following problems:

    • The definition of the file format: whether the document is saved as text or binary, and what information is stored in each unit of the document. The document format is very important.
    • Communication with the document storage system, that is, saving and loading documents. The storage system can be the operating system's file subsystem, a database, or the network. Once the file format is fixed, the various storage systems do not differ much.
    • Maintenance of the document objects after loading. Complex document processing calls for object-oriented design: analyze the document structure carefully, break the loaded document data down piece by piece, turn each smallest indivisible piece of document data into an object, and use an object tree to hold the hierarchy of the document content. This builds a document object tree. Editing a document is the task of maintaining this tree.
    • Document object layout. After the document is loaded, the whole document object tree is processed: the display size of each object is calculated, and the objects are arranged for display in the view area. This includes calculating paragraphs and document lines and then the Cartesian coordinates of each object in the view area.
    • Drawing the document, both on the computer screen and on the printer. Based on the calculated coordinates of the objects in the view area, the program applies coordinate transformations and draws the objects, such as text or pictures, on the graphics output object. In the .NET Framework both the screen and the printer are driven through GDI+, with no essential difference, so much of the drawing code can serve both. Drawing on the screen in particular needs optimization to minimize flicker.
    • Handling environment messages, meaning Windows messages that should change the document content, such as mouse and keyboard messages and messages related to the system clipboard. The program processes these messages and modifies the document object tree by inserting, deleting, or modifying document element objects. After the tree changes, the document must be laid out again: paragraphs and document lines are recalculated, object positions in the view area are recalculated, and the screen is refreshed as needed. The program also has to handle the user selecting document content.
    • Saving the document. The program generates data from the document object tree and saves it to the document storage system. This can be thought of as object serialization.
    • Openness of the application: providing secondary development capabilities and functionality similar to VBA.

A full-featured text editor has a very complex structure and touches a very wide range of problems; it cannot be done in fewer than tens of thousands of lines of code. This article cannot list and discuss all of these issues, so it picks out some key points.

Document Object Model

In actual development these problems need not be tackled in the order listed. I started by determining the structure of the document object tree, which introduces the concept of a document object model. We have met many document object models. The most common is the HTML document object model, which we use when JavaScript controls the content of an HTML page. There are also the XML document object model and the Word and Excel document object models that VBA operates on. With a document object model, every piece of content in a document is associated with an object in memory. When the application modifies the data of an object in memory, the corresponding document content is modified, and deleting an object in memory deletes the corresponding document content. For ideas on document object models, see http://www.w3.org.

Inheritance and method overriding are common in document object models. Look at the XML document object model defined in the System.Xml namespace of the .NET class library: the XML document (XmlDocument), the XML element (XmlElement), the attribute (XmlAttribute), and even the comment (XmlComment) and plain text data (XmlText) all inherit from the abstract class XmlNode. The advantage of this design is that traversing the XML document object tree is convenient. Every kind of object derives from XmlNode and overrides member methods as needed, so other code can treat all of them as XmlNode and rely on overriding and polymorphism to get type-specific behavior.
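The benefit of a common base class can be seen in a short traversal sketch. The class and method names `XmlWalker`, `Walk`, and `Describe` are made up for this illustration; only the System.Xml types come from the class library. The walker never checks for a concrete type, it only uses members declared on XmlNode.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XmlWalker
{
    // Visits a node and all of its descendants through the XmlNode base type only.
    public static void Walk(XmlNode node, int depth, List<string> output)
    {
        output.Add(new string(' ', depth * 2) + node.NodeType + ": " + node.Name);

        // Attributes is null for every node type except elements.
        if (node.Attributes != null)
        {
            foreach (XmlAttribute attr in node.Attributes)
                output.Add(new string(' ', depth * 2 + 2) + "Attribute: " + attr.Name);
        }

        foreach (XmlNode child in node.ChildNodes)
            Walk(child, depth + 1, output);
    }

    // Parses an XML string and returns one description line per node.
    public static List<string> Describe(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        List<string> output = new List<string>();
        Walk(doc, 0, output);
        return output;
    }
}
```

Calling `XmlWalker.Describe("<a x='1'><!--c-->text</a>")` yields one line each for the document, the element, its attribute, the comment, and the text node.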

Underlying objects

Following this approach, I defined an abstract class TextElement from which all document objects derive. This class defines the following virtual members:

    • Left, Top, Width, Height properties: the position and display size of the object
    • RealLeft, RealTop read-only properties: where the object appears in the view area
    • RefreshSize method: recalculates the display size of the object
    • RefreshView method: redraws the object
    • HandleMouseDown method: handles the mouse button press event
    • HandleMouseMove method: handles the mouse move event
    • HandleMouseUp method: handles the mouse button release event
    • FromXml method: loads the object's data from an XML node
    • ToXml method: saves all of the object's data to an XML node

Because document content is hierarchical, a container type TextContainer is also defined. It derives from TextElement and is extended to hold child objects. It defines the following virtual members:

    • MaxWidth property: the maximum width of the object's content. The display width of a document is the paper width minus the left and right margins; all document content is confined to this display width, and MaxWidth depends on it
    • ChildElements read-only property: returns the collection of all child objects; the return type is System.Collections.ArrayList
    • AppendChild method: takes a TextElement object and adds it to the collection of child objects
    • RemoveChild method: takes a TextElement object and deletes that document element from the collection of child objects
    • RemoveChildRange method: similar to RemoveChild, but deletes a batch of child objects
    • InsertBefore method: takes two TextElement objects; the first is the document element to add, and the second is the document element that marks the insertion point
    • InsertRangeBefore method: similar to InsertBefore, but inserts a batch of document element objects

Some container objects have a special child element: it is always the last element and cannot be deleted. A paragraph object, for example, is a container whose last element is the paragraph end mark, which cannot be deleted. Other container types may have a similar end object, so TextContainer takes this case into account and defines a set of virtual members to handle it:

    • AddLastElement virtual method: adds a paragraph end mark object as the container's last object; derived container types can override it to supply their own last object
    • IsLastElement function: takes a TextElement object and returns whether it is the last object. The program calls it before deleting a child element; if the element to delete is the last element, it must not be deleted
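A minimal sketch of the container described above follows. It keeps only the child-management members and the protected last element; layout, drawing, mouse handling, and XML members are left out. `ParagraphEnd`, the `Parent` field, and the `bool` return values are assumptions added for the illustration, not part of the design described in the text.

```csharp
using System;
using System.Collections;

public abstract class TextElement
{
    public int Left, Top, Width, Height;
    public TextContainer Parent;              // assumption: back reference to the owner
    public virtual void RefreshSize() { }
}

// Hypothetical end marker used as the undeletable last element.
public class ParagraphEnd : TextElement { }

public class TextChar : TextElement
{
    public char Char;
    public TextChar(char c) { Char = c; }
}

public class TextContainer : TextElement
{
    private readonly ArrayList childElements = new ArrayList();
    private TextElement lastElement;

    public ArrayList ChildElements { get { return childElements; } }

    // Derived containers can override this to supply their own end object.
    public virtual void AddLastElement()
    {
        lastElement = new ParagraphEnd();
        AppendChild(lastElement);
    }

    public virtual bool IsLastElement(TextElement element)
    {
        return element != null && ReferenceEquals(element, lastElement);
    }

    public virtual void AppendChild(TextElement element)
    {
        element.Parent = this;
        childElements.Add(element);
    }

    // Refuses to delete the protected last element.
    public virtual bool RemoveChild(TextElement element)
    {
        if (IsLastElement(element) || !childElements.Contains(element)) return false;
        childElements.Remove(element);
        element.Parent = null;
        return true;
    }

    // Inserts newElement directly in front of refElement.
    public virtual bool InsertBefore(TextElement newElement, TextElement refElement)
    {
        int index = childElements.IndexOf(refElement);
        if (index < 0) return false;
        newElement.Parent = this;
        childElements.Insert(index, newElement);
        return true;
    }
}
```

With this shape, typed characters are inserted with InsertBefore in front of the end marker, so the marker always stays last.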

TextContainer also overrides the RefreshSize method to recalculate the display size of all child elements, and defines a new virtual method RefreshLine that performs line breaking. To make line breaking easier, a document line object TextLine is also defined. Document line objects hold the line-breaking information of the document content; once line breaking is finished and the content does not change, the coordinates of the displayed content need not be recalculated. The members of the document line object are:

    • LineSpacing: the line spacing, that is, the distance between the bottom of this document line and the top of the next line
    • Elements: the collection of all document elements that belong to this document line, kept for programming convenience
    • FirstElement: the first element of this document line
    • LastElement: the last element of this document line
    • RealLeft, RealTop: the position of the upper-left corner of the document line in the document view area
    • Container: the container object that this document line belongs to
    • ContentWidth: the sum of the widths of all elements in this document line

To hold the line-breaking information, TextContainer also defines a read-only Lines property, which returns a System.Collections.ArrayList of all the text line objects that belong to the container. When a container object executes RefreshLine, line breaking proceeds in these steps:

  • Empty the text line collection Lines
  • Build the collection of all elements that take part in line breaking
  • Iterate over all the child elements in that collection
  • If the child element is a tab or horizontal line object, recalculate its width
  • If the child element is a container object, call its RefreshLine method
  • Add the element to the element list of the current line and accumulate the total width. If the total exceeds the container's display width (call this Case 1), or the current element must occupy a line by itself, cancel adding the element to the current line and end the current line
  • End the current line if the current element is a forced line break
  • If the current element cannot appear at the end of a line, or the next element cannot appear at the beginning of a line, do not add the current element to the current line (this also counts as Case 1). By writing convention, certain characters such as !),.:;?]} ¨ ˇˉ―‖ ' "... :,. 〃々〉 "" ")〗! "'),. :;? ] ' |}~¢ may not appear at the beginning of a line, while others such as ([{·] 〈《「『【〔〖(. [{£¥ may not appear at the end of a line, and a particular application may have other element types that need the same consideration. For this reason the base element type TextElement defines a method CanBeLineHead, which decides whether an element can appear at the beginning of a line, and a method CanBeLineEnd, which decides whether it can appear at the end of a line. Character elements and other element types can override these two methods to make the necessary decision. Be careful here: if the container's display width is small, these checks can cause an infinite loop, so an extra guard against infinite loops is needed (tracking down this bug back then cost me a great deal of pain)
  • When the current line ends, calculate the relative position of each document element within the line. If the line ended because of Case 1, the spacing between elements needs a correction: the total width of the elements in the line is not necessarily equal to the container's display width, so without a correction the right edge of the document is ragged, which looks bad. Calculate the difference between the total element width and the container's display width, and distribute that difference evenly between the document elements so that the right edge is reasonably neat. To store this correction, a WidthFix property is added to TextElement. You can observe that IE displays documents without this right-edge correction, while Word applies a similar correction
  • If the line ended because its last element forces a line break, there is no Case 1 right-edge correction, but element positions must still be corrected according to the document alignment. First find the paragraph object that governs the current text line and get its alignment setting (left, right, or center), then calculate the white space for the elements based on the alignment and set each element's WidthFix property
  • Also correct the top coordinate of each element in the document line. Elements in the same line do not necessarily have the same height, so iterate over all elements, take the greatest element height as the height of the document line, and calculate each element's top position within the line so that the bottom edges of all elements lie on the same horizontal line
  • Add the finished line object to the container's Lines collection, then create a new document line object as the current line, and loop in this way until all the content of the container has been processed
  • After all document line objects have been generated, calculate the coordinates of each document line in the view area from the container's coordinates in the view area and the line spacing settings of the lines. The coordinates of an element in the view area are then the coordinates of its document line plus the element's relative coordinates within the line
  • When the position of an element in a document line is modified, get the element's old bounding rectangle in the view area and compare it with the newly calculated bounding rectangle. If they differ, the element's displayed position has changed, so add both rectangles to the text editor's collection of rectangles to redraw. When line breaking of the document has finished, the text editor merges all the redraw rectangles, and the resulting rectangle is the area that needs redrawing. This optimizes display and reduces flicker: after the user changes the document, line breaking affects only part of the display area; for the other parts the positions are recalculated but the old and new positions do not differ, so they need no redraw
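The core of these steps can be sketched in a simplified form. The sketch covers only width accumulation, Case 1 overflow, forced line breaks, and the even distribution of the leftover width through WidthFix. Line head and line end rules, nested containers, alignment, vertical positioning, and redraw rectangles are omitted. `LayoutElement`, `LayoutLine`, and `LineBreaker` are hypothetical stand-ins for TextElement, TextLine, and RefreshLine, and widths are plain integers rather than measured text.

```csharp
using System;
using System.Collections.Generic;

public class LayoutElement
{
    public readonly string Text;
    public readonly int Width;
    public readonly bool ForcesLineBreak;   // paragraph end or soft return
    public int Left;                        // position relative to the line
    public int WidthFix;                    // extra space added after the element

    public LayoutElement(string text, int width, bool forcesLineBreak = false)
    {
        Text = text; Width = width; ForcesLineBreak = forcesLineBreak;
    }
}

public class LayoutLine
{
    public readonly List<LayoutElement> Elements = new List<LayoutElement>();
    public int ContentWidth;        // sum of element widths
    public bool EndedByOverflow;    // true when the line ended because of Case 1
}

public static class LineBreaker
{
    public static List<LayoutLine> Break(IList<LayoutElement> elements, int maxWidth)
    {
        List<LayoutLine> lines = new List<LayoutLine>();
        LayoutLine current = new LayoutLine();
        foreach (LayoutElement e in elements)
        {
            // Case 1: the element does not fit, so end the current line first.
            // The Count check stops an over-wide element from producing empty lines forever.
            if (current.Elements.Count > 0 && current.ContentWidth + e.Width > maxWidth)
            {
                current.EndedByOverflow = true;
                Finish(current, maxWidth, lines);
                current = new LayoutLine();
            }
            current.Elements.Add(e);
            current.ContentWidth += e.Width;
            if (e.ForcesLineBreak)
            {
                Finish(current, maxWidth, lines);
                current = new LayoutLine();
            }
        }
        if (current.Elements.Count > 0) Finish(current, maxWidth, lines);
        return lines;
    }

    // Assigns relative positions; overflow-ended lines get their slack spread over the gaps.
    private static void Finish(LayoutLine line, int maxWidth, List<LayoutLine> lines)
    {
        int gaps = line.Elements.Count - 1;
        int slack = maxWidth - line.ContentWidth;
        bool justify = line.EndedByOverflow && gaps > 0 && slack > 0;
        int left = 0;
        for (int i = 0; i < line.Elements.Count; i++)
        {
            LayoutElement e = line.Elements[i];
            e.WidthFix = 0;
            if (justify && i < gaps)
                e.WidthFix = slack / gaps + (i < slack % gaps ? 1 : 0);
            e.Left = left;
            left += e.Width + e.WidthFix;
        }
        lines.Add(line);
    }
}
```

For four elements of width 30 in a container of width 100, the first line holds three elements, the 10 units of slack are split over its two gaps, and the third element ends exactly at the right edge.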

There should be further ways to optimize line breaking, but my ability is limited and this is the only method I can propose. Experiments show that the speed is acceptable for small documents, but when the document holds tens of thousands of characters, line breaking is very slow. I hope an expert can offer a solution.

To represent the document as a whole, the document object TextDocument is defined. It is the largest object in the document object model. It is not derived from TextElement like the other document objects but is defined directly. It is used to manipulate the document as a whole and provides basic document operations such as delete, copy, and paste. It also provides a set of methods for implementing VBA-like functionality.

A document content management object is also defined. It belongs to the TextDocument object and manages all document elements. It defines an Elements property, a list holding all element objects in the document. It also defines the property SelectStart for the position of the insertion point and SelectLength for the length of the selection: 0 means no element is selected, a positive number means that many elements are selected after the insertion point, and a negative number means that many elements are selected before the insertion point. The object also defines a set of functions that move the insertion point, for example several elements to the right or up one line.

As you know, in a text box you can move the insertion point with the cursor keys, and holding SHIFT while using the cursor keys moves the insertion point and selects document content. You can also move the insertion point with a mouse click, and a click with SHIFT held moves the insertion point and selects content as well. To support this, the content object defines a property AutoClearSelection. When the property is set, moving the insertion point sets SelectLength to 0. When it is not set, moving the insertion point sets SelectLength so that the elements between the old and new insertion points are selected. The text editor sets AutoClearSelection according to whether the user is holding the SHIFT key.

When the user changes the insertion point or the selection, the text editor must redraw the user interface, and an optimization is needed here to redraw only the elements whose selection state has changed. It can be proved that when the selected elements are contiguous, however the selection and insertion point are modified, the selection state changes only for elements in two ranges. So it is enough to get the start position and length of those two ranges and redraw the elements in them.
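The selection bookkeeping can be sketched as follows. The class name `TextContent` and the members `MoveTo`, `AbsStart`, and `AbsEnd` are assumptions for illustration. The sketch also assumes that a move without AutoClearSelection keeps the far end of the existing selection fixed, so repeated SHIFT moves extend the selection from where it began.

```csharp
using System;

public class TextContent
{
    public int ElementCount;
    public int SelectStart;                  // index of the insertion point
    public int SelectLength;                 // 0: none, > 0: after the insertion point, < 0: before it
    public bool AutoClearSelection = true;   // false while SHIFT is held

    // Moves the insertion point and updates the selection.
    public void MoveTo(int index)
    {
        index = Math.Max(0, Math.Min(ElementCount, index));
        if (AutoClearSelection)
        {
            SelectLength = 0;
        }
        else
        {
            // The end of the selection opposite the insertion point stays where it was.
            int anchor = SelectStart + SelectLength;
            SelectLength = anchor - index;
        }
        SelectStart = index;
    }

    // Selected range as [AbsStart, AbsEnd), independent of the sign of SelectLength.
    public int AbsStart { get { return Math.Min(SelectStart, SelectStart + SelectLength); } }
    public int AbsEnd { get { return Math.Max(SelectStart, SelectStart + SelectLength); } }
}
```

Starting at position 5, a SHIFT move to 7 gives SelectStart 7 and SelectLength -2 (elements 5 and 6 selected), and a further SHIFT move to 3 gives SelectStart 3 and SelectLength 2 (elements 3 and 4 selected).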

Users can do many things with a document: move the insertion point, select elements, set the font, color, and size of characters, insert text and pictures, modify element settings, delete, cut, copy, paste, and so on. There are dozens of operations, and some of them are unavailable at certain times, so availability has to be checked. If TextDocument defined an interface function for each operation, the TextDocument class would hold too much code and become bloated, and every new operation would require changing TextDocument. So the concept of an action is introduced. An action is a type that implements one document operation. Actions share a unified interface and use the basic operations provided by TextDocument or other objects to implement more complex operations. The action base class EditorAction is an abstract class whose main interface is:

    • HotKey field: the hotkey code for the action; set when the action object is initialized
    • KeyCode field: the keyboard key code when the action is triggered
    • ShiftKey field: the SHIFT key state when the action is triggered
    • ControlKey field: the Control key state when the action is triggered
    • AltKey field: the ALT key state when the action is triggered
    • MouseX, MouseY fields: the coordinates of the mouse cursor in the view area when the action is triggered
    • MouseButton field: the mouse button state when the action is triggered
    • Param1, Param2, Param3 fields: action parameters whose meaning is determined by the specific action
    • TestHotKey function: tests the keyboard hotkey; the text editor calls it to decide whether the action is triggered
    • ActionName read-only property: the action name
    • IsEnable: whether the action is currently available
    • Execute: executes the action
    • OwnerDocument: the document object that the action operates on

All concrete action objects derive from EditorAction. If an action has a hotkey, it sets the HotKey field during initialization. A derived class first overrides ActionName to give itself a name, then overrides Execute to implement its own processing; it can also override IsEnable or TestHotKey as needed.

TextDocument has a property Actions, a list of action objects that is filled when TextDocument is initialized. When the text editor has the input focus and the user presses a key, the program iterates over all the actions in Actions and tests their hotkeys; if a hotkey matches, the action is executed. Other parts of the application can also use each action's IsEnable property to set the availability of the corresponding editing buttons and menu items.

For example, define the copy action object EditorCopyAction, derived from EditorAction. Override ActionName to return "copy". Override IsEnable to return true if the document has a selection and false otherwise. Override Execute to call the function in TextDocument that implements copying. During initialization set the hotkey to System.Windows.Forms.Keys.Control | System.Windows.Forms.Keys.C, which defines Ctrl+C as the action's hotkey.

This action model also makes the program easy to extend: other applications can add custom action objects to the action list, and the text editor applies them automatically. An application can also change the hotkey settings of actions to personalize the user's operations.
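A reduced sketch of the action model is given below. To stay independent of Windows Forms it uses a hypothetical `KeyCodes` class whose two constants mirror the numeric values of Keys.C and Keys.Control. `SelectedText`, `Clipboard`, `Copy`, and `HandleKey` are simplified stand-ins assumed for the example, and most EditorAction fields are omitted.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for System.Windows.Forms.Keys (same numeric values for these two members).
public static class KeyCodes
{
    public const int C = 0x43;
    public const int Control = 0x20000;
}

public abstract class EditorAction
{
    public int HotKey;                      // 0 means no hotkey
    public TextDocument OwnerDocument;

    public abstract string ActionName { get; }
    public virtual bool IsEnable { get { return true; } }
    public virtual bool TestHotKey(int keyData) { return HotKey != 0 && keyData == HotKey; }
    public abstract void Execute();
}

public class TextDocument
{
    public readonly List<EditorAction> Actions = new List<EditorAction>();
    public string SelectedText = "";        // simplified selection
    public string Clipboard = "";           // simplified clipboard

    public void Copy() { Clipboard = SelectedText; }

    // Called on a key press: runs the first enabled action whose hotkey matches.
    public bool HandleKey(int keyData)
    {
        foreach (EditorAction action in Actions)
        {
            if (action.TestHotKey(keyData) && action.IsEnable)
            {
                action.Execute();
                return true;
            }
        }
        return false;
    }
}

public class EditorCopyAction : EditorAction
{
    public EditorCopyAction() { HotKey = KeyCodes.Control | KeyCodes.C; }

    public override string ActionName { get { return "copy"; } }
    public override bool IsEnable { get { return OwnerDocument.SelectedText.Length > 0; } }
    public override void Execute() { OwnerDocument.Copy(); }
}
```

Adding a new operation then means adding one class and one entry in Actions; TextDocument itself does not change.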

Derived objects

With the base objects defined, we can start deriving objects. First define the character object type TextChar. Character data is the most important part of a document's content, and every character in the document is a character object. The character object overrides the RefreshSize method to calculate the text size with the MeasureString method of the current drawing object (a System.Drawing.Graphics object). Note that by default the width this method returns includes some extra padding; pass System.Drawing.StringFormat.GenericTypographic to get the actual size. There is also one special character, the tab. Its width is not fixed and has to be calculated during layout.

The character object (TextChar) also overrides the RefreshView method, which is simple: it converts coordinates based on the Left and Top values to find the drawing position and then calls System.Drawing.Graphics.DrawString. The character object also defines members of its own: the Char property returns the character the object represents, Font is the font used to draw it, and ForeColor is the color of the drawn text.

The tab character is special because its width is variable and depends on its position in the document view. To handle this, TextCharTab is derived from TextChar. It adds a RefreshTabWidth method that computes the character width from the object's left position in the view area. Here I assume that one tab step equals the width of four underscore characters and that the right edge of a tab must lie at a whole multiple of the tab step, so the tab width can be calculated from the tab's position with a modulo operation.
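The modulo calculation can be written out as a small helper. `TabLayout.TabWidth` is a hypothetical name, and the sketch assumes a non-negative left coordinate and a tab step already measured in the same units.

```csharp
using System;

public static class TabLayout
{
    // Returns the width that puts the tab's right edge on the next multiple of tabStep.
    // A tab that already starts on a tab stop advances a full step.
    public static int TabWidth(int left, int tabStep)
    {
        if (tabStep <= 0) throw new ArgumentOutOfRangeException("tabStep");
        if (left < 0) throw new ArgumentOutOfRangeException("left");
        return tabStep - (left % tabStep);
    }
}
```

With a tab step of 40, a tab starting at 10 is 30 wide and a tab starting at 40 is 40 wide; in both cases the right edge lands on a multiple of 40.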

To represent a paragraph, define the paragraph object TextParagraph. It is not a container object; it holds information such as paragraph alignment, much like the paragraph mark (hard return) in Word.

Also define the line-end object TextLineEnd, which models Word's line break (soft return).

Next, a picture object can be defined. Observing how Word handles documents, you can see that pictures and OLE objects inserted into a Word document behave very similarly. So, with the extensibility of the text editor in mind, first derive the abstract class TextObject from the TextElement base class. This abstract class represents an embedded object in the document; what the object actually is depends on the derived class.

Derive TextImage from TextObject to represent a picture. It overrides RefreshView to draw the picture on the drawing output object. It also overrides the FromXml and ToXml methods to exchange data with XML nodes; the binary picture data can be saved in an XML node in Base64 format.

Other types can be derived from TextObject as the application requires, for example an object that reads a database directly and draws charts, so that an object in the document dynamically shows the latest data in the system.

Objects in Word (including pictures) can be resized. When you click a picture, eight small dots appear at the four corners and at the midpoints of the four sides of the picture. I call these dots control points. Dragging any of them with the mouse resizes the object dynamically. These eight control points turn up in many kinds of programs; in the VS.NET form designer, for example, the current control is surrounded by eight control points. There is a standard set of techniques for implementing them.

Control points come in two types, internal and external. We number the eight points from 0 to 7. A different cursor style must be set when the mouse cursor moves over each of the eight control points.

    [Figure: control point layout. Internal control points lie inside the border of the main rectangle; external control points lie outside it. In both cases the points are numbered clockwise from the top-left corner.]

        0 ───── 1 ───── 2
        │               │
        7               3
        │               │
        6 ───── 5 ───── 4

    Cursor styles over the control points:
        0 and 4: northwest-southeast (SizeNWSE)
        1 and 5: north-south (SizeNS)
        2 and 6: northeast-southwest (SizeNESW)
        3 and 7: west-east (SizeWE)

As the illustration shows, the positions of all control points can be computed from the main rectangle, the type of control point (internal or external), and the width of a control point. You can write a routine that takes three parameters: the Rectangle structure of the main rectangular area, a flag saying whether the control points are internal (if not, they are external), and the width of a control point. The routine calculates the positions of all control points and returns an array of eight Rectangle values, giving the position and size of control points 0 to 7.

When a TextObject is displayed it should know its position in the view area. When it responds to a mouse move message, it can compare the mouse cursor position with the eight control point rectangles; if the cursor is inside one of them, it notifies the text editor to change the cursor style.
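One possible shape for that routine and the hit test is sketched below. The names `ControlPoints`, `GetRects`, `HitTest`, and `CursorNames` are assumptions for this example; only System.Drawing.Rectangle comes from the class library. The cursor names are given as strings that match the SizeNWSE, SizeNS, SizeNESW, and SizeWE members of System.Windows.Forms.Cursors, so the sketch does not depend on Windows Forms.

```csharp
using System;
using System.Drawing;

public static class ControlPoints
{
    // Cursor style for control points 0..7, numbered clockwise from the top-left corner.
    public static readonly string[] CursorNames =
    {
        "SizeNWSE", "SizeNS", "SizeNESW", "SizeWE",
        "SizeNWSE", "SizeNS", "SizeNESW", "SizeWE"
    };

    // Returns the eight control point rectangles for the main rectangle.
    // inner = true places them inside the border, inner = false outside it.
    public static Rectangle[] GetRects(Rectangle main, bool inner, int size)
    {
        int x0 = inner ? main.Left : main.Left - size;       // left column
        int x1 = main.Left + (main.Width - size) / 2;         // middle column
        int x2 = inner ? main.Right - size : main.Right;      // right column
        int y0 = inner ? main.Top : main.Top - size;          // top row
        int y1 = main.Top + (main.Height - size) / 2;         // middle row
        int y2 = inner ? main.Bottom - size : main.Bottom;    // bottom row

        return new Rectangle[]
        {
            new Rectangle(x0, y0, size, size),   // 0 top-left
            new Rectangle(x1, y0, size, size),   // 1 top-middle
            new Rectangle(x2, y0, size, size),   // 2 top-right
            new Rectangle(x2, y1, size, size),   // 3 right-middle
            new Rectangle(x2, y2, size, size),   // 4 bottom-right
            new Rectangle(x1, y2, size, size),   // 5 bottom-middle
            new Rectangle(x0, y2, size, size),   // 6 bottom-left
            new Rectangle(x0, y1, size, size)    // 7 left-middle
        };
    }

    // Returns the index of the control point under (x, y), or -1 if there is none.
    public static int HitTest(Rectangle[] rects, int x, int y)
    {
        for (int i = 0; i < rects.Length; i++)
        {
            if (rects[i].Contains(x, y)) return i;
        }
        return -1;
    }
}
```

In a mouse move handler, the index returned by HitTest selects the entry in CursorNames, and -1 means the default cursor should be restored.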

Control points are usually drawn as small rectangles, in one of two styles: filled with a dark color (blue or black) with a white border, or filled with white with a dark border. Look at the VS.NET form designer, which lets you select several controls at once: the control whose control points are filled with blue and bordered in white is the current control, while the control points of the other selected controls are filled with white and bordered in blue. A text editor has no such multiple selection, so it can simply use internal control points filled with black with a white border.

Dragging a control point with the mouse should resize the object dynamically. I used to implement this as follows:

    • In the mouse button press handler (HandleMouseDown), if the mouse cursor is over a control point, set a flag variable marking that the button is pressed, record the mouse cursor position, and leave the handler
    • In the mouse move handler (HandleMouseMove), if the button-pressed flag is set, the difference between the current and the recorded mouse cursor position is the distance the cursor has moved. Its horizontal and vertical components are the changes in the width and height of the object. The library function System.Windows.Forms.ControlPaint.DrawReversibleFrame can draw a dotted frame on the interface; calling it as the mouse moves gives the so-called "rubber band" operation
    • In the mouse button release handler (HandleMouseUp), calculate the difference between the current mouse cursor position and the position recorded when the button was pressed. This is the distance the mouse moved over the whole drag, and the program resizes the object based on it

After some practice I found this approach cumbersome. It takes a lot of code, the code is scattered across three event handlers, it needs extra global variables, and it is hard to turn into a general routine that can be called from anywhere. After some analysis I abandoned it. In a typical program the user cannot do anything else while dragging with the mouse (such as typing at the same time), and the user interface does not need redrawing during a rubber band operation. So while the mouse is being dragged, the application can handle only mouse move and mouse release messages and nothing else. To keep the programming simple, even redrawing of the interface is left unhandled. A general routine can then handle the whole drag and implement the rubber band operation. It works as follows:

    • The routine is called from the mouse button press handler (HandleMouseDown)
    • On entry, the routine records the current position of the mouse cursor and then enters an infinite loop
    • The loop first calls the Win32 API function WaitMessage to wait for a Windows message, and exits the loop if no message is available
    • It calls the Win32 API function PeekMessage to look at the current Windows message
    • If the current message is a mouse button release message, it exits the loop
    • If the current message is a mouse move message, it gets the current mouse cursor position and draws the rubber band rectangle based on the starting cursor position
    • It calls the Win32 API function GetMessage to "eat" the current Windows message and then goes on to the next iteration
    • After leaving the loop, the routine calculates the difference between the current mouse cursor position and the position before the drag began, that is, the distance the cursor moved over the whole drag, and returns it to the calling function (HandleMouseDown)
    • The calling function receives the distance the mouse cursor moved and carries out further processing based on it, here resizing the object
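The loop described in these steps might look roughly like the following sketch. It is Windows-only and rests on several assumptions: the name `RubberBand.Track`, the `onMove` callback that does the actual rubber band drawing (for example with ControlPaint.DrawReversibleFrame), and watching only the left button release (WM_LBUTTONUP). It ignores mouse capture, and because it swallows every other message, a production version would probably need to filter or dispatch some of them, such as paint messages.

```csharp
using System;
using System.Runtime.InteropServices;

public static class RubberBand
{
    const uint WM_MOUSEMOVE = 0x0200;
    const uint WM_LBUTTONUP = 0x0202;
    const uint PM_NOREMOVE = 0x0000;

    [StructLayout(LayoutKind.Sequential)]
    struct MSG
    {
        public IntPtr hwnd;
        public uint message;
        public IntPtr wParam;
        public IntPtr lParam;
        public uint time;
        public int ptX;
        public int ptY;
    }

    [DllImport("user32.dll")]
    static extern bool WaitMessage();

    [DllImport("user32.dll")]
    static extern bool PeekMessage(out MSG msg, IntPtr hWnd, uint filterMin, uint filterMax, uint removeMsg);

    [DllImport("user32.dll")]
    static extern int GetMessage(out MSG msg, IntPtr hWnd, uint filterMin, uint filterMax);

    // Mouse messages pack the client coordinates into lParam as two signed 16-bit values.
    public static void DecodePoint(IntPtr lParam, out int x, out int y)
    {
        int v = unchecked((int)lParam.ToInt64());
        x = unchecked((short)(v & 0xFFFF));
        y = unchecked((short)((v >> 16) & 0xFFFF));
    }

    // Runs the modal drag loop, starting from the cursor position at button press.
    // onMove is called for every mouse move; dx and dy receive the total movement.
    public static void Track(int startX, int startY, Action<int, int> onMove, out int dx, out int dy)
    {
        int x = startX, y = startY;
        MSG msg;
        while (WaitMessage())
        {
            if (!PeekMessage(out msg, IntPtr.Zero, 0, 0, PM_NOREMOVE)) continue;
            if (msg.message == WM_LBUTTONUP)
            {
                DecodePoint(msg.lParam, out x, out y);
                break;   // the release message stays in the queue for normal handling
            }
            if (msg.message == WM_MOUSEMOVE)
            {
                DecodePoint(msg.lParam, out x, out y);
                if (onMove != null) onMove(x, y);
            }
            GetMessage(out msg, IntPtr.Zero, 0, 0);   // "eat" the message
        }
        dx = x - startX;
        dy = y - startY;
    }
}
```

HandleMouseDown would call Track with the press position and then resize the object by the returned dx and dy.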

A side note: the .NET Framework works well with Win32 API programming. The Handle property of System.Windows.Forms.Control is the window handle, which can be passed as an argument to Win32 API functions. The CreateParams property is in effect the parameters to CreateWindowEx; override it to set the control's styles at creation time. WndProc is the control's default procedure for handling all Windows messages, and you can override it to handle low-level Windows messages yourself. The static functions AddMessageFilter and RemoveMessageFilter of System.Windows.Forms.Application make it easy to add or remove "hook" procedures for the whole application. In C#, System.Runtime.InteropServices.DllImport is used to declare API functions imported from DLL files.


