Word's support for Chinese word segmentation is quite good. The most obvious sign is that while you edit a .doc document, Word continuously checks for spelling errors (in both English and Chinese words). You may also have noticed that when you double-click a character in a document, the whole word containing that character is selected automatically. For example, if you double-click the character 中 in the Chinese sentence for "I am Chinese", the whole word 中国人 ("Chinese person") is selected (double-clicking 国 or 人 has the same effect). All of this indicates that Word supports Chinese word segmentation, and in my experience with Office 2003 the segmentation quality is quite good.
Calling Word's segmentation from an application is not straightforward, so I decided to run a simple experiment and write a small program. Naturally, I thought of macros and VBA in Word. I first clicked "Record Macro" and then pressed Ctrl +. Some VB code was generated in Word's macro editor, and it used a Selection object, which evidently represents the selected region. Exploring further, I found that Selection.Words was what I wanted: the collection of words obtained by segmenting the selected region. Selection also has many other properties and methods; for example, Selection.Sentences is the collection of all sentences in the selection. The next step was to port the scattered VBA code from Word to VB 6.0. The applet's basic function: enter a piece of text in text box A, click the button, and the segmentation result appears in text box B, with all words separated by spaces. The VB code is as follows:
Option Explicit

Dim wdapp As Word.Application
Dim doc As Word.Document

Private Sub Command1_Click()
    'On Error Resume Next
    Dim segwords As String
    Me.Command1.Caption = "executing ..."
    Me.Command1.Enabled = False
    segwords = ""
    ' Type the input text into the Word document and select all of it
    wdapp.Selection.HomeKey Unit:=wdStory
    wdapp.Selection.TypeText Me.Text1.Text
    wdapp.Selection.WholeStory
    ' Selection.Words holds the segmented words; join them with spaces
    Dim I As Long   ' Long avoids overflow on texts with more than 32767 words
    For I = 1 To wdapp.Selection.Words.Count
        segwords = segwords & wdapp.Selection.Words(I).Text & " "
        DoEvents
    Next I
    ' Clear the document so the next run starts empty
    wdapp.Selection.Delete Unit:=wdCharacter, Count:=1
    Me.Command1.Caption = "Start test"
    Me.Command1.Enabled = True
    Me.Text2.Text = segwords
End Sub

Private Sub Form_Load()
    Set wdapp = New Word.Application
    Set doc = wdapp.Documents.Add
    wdapp.ActiveDocument.SaveAs "c://~Ftemp.doc"
End Sub

Private Sub Form_Terminate()
    doc.Close
    Set doc = Nothing
    wdapp.Quit SaveChanges:=False
    Set wdapp = Nothing
End Sub
The program produces the following result:
The segmentation quality seems decent, but when I tested with a long article it was very slow, not even in the same order of magnitude as ICTCLAS. At first I thought this was because VB code is interpreted. Rewriting the program in Delphi improved the speed, but it was still unsatisfactory. The Delphi code:

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, ComCtrls, StdCtrls, Buttons, ComObj;

type
  TForm1 = class(TForm)
    labels: TStatusBar;
    GroupBox1: TGroupBox;
    GroupBox2: TGroupBox;
    BitBtn1: TBitBtn;
    Memo1: TMemo;
    Memo2: TMemo;
    procedure FormCreate(Sender: TObject);
    procedure FormDestroy(Sender: TObject);
    procedure BitBtn1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;
  wdapp: Variant;
  doc: Variant;

implementation

{$R *.dfm}

procedure TForm1.FormCreate(Sender: TObject);
begin
  // Start a hidden Word instance and create the working document
  wdapp := CreateOleObject('Word.Application');
  wdapp.Visible := False;
  doc := wdapp.Documents.Add();
  wdapp.ActiveDocument.SaveAs('C://~Ftemp.doc');
end;

procedure TForm1.FormDestroy(Sender: TObject);
begin
  doc.Close;
  wdapp.Quit(SaveChanges := False);
end;

procedure TForm1.BitBtn1Click(Sender: TObject);
var
  segwords: TStringList;
  wdStory, wdCharacter: OleVariant;
  I: Integer;
begin
  wdStory := 6;       // value of Word's wdStory constant
  wdCharacter := 1;   // value of Word's wdCharacter constant
  segwords := TStringList.Create;
  Self.BitBtn1.Caption := 'executing';
  Self.BitBtn1.Enabled := False;
  // Type the input text into the Word document and select all of it
  wdapp.Selection.HomeKey(wdStory);
  wdapp.Selection.TypeText(Self.Memo1.Lines.Text);
  wdapp.Selection.WholeStory;
  // Collect the segmented words, one per list entry
  for I := 1 to wdapp.Selection.Words.Count do
  begin
    segwords.Append(wdapp.Selection.Words.Item(I));
  end;
  // Clear the document so the next run starts empty
  wdapp.Selection.Delete(wdCharacter, 1);
  Self.BitBtn1.Caption := 'start test';
  Self.BitBtn1.Enabled := True;
  Self.Memo2.Text := segwords.Text;
  segwords.Free;      // release the list; the original never freed it
end;

end.
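One possible contributor to the slowness, besides the cost of driving Word through COM, is the indexed access Words(I): Word may have to walk the collection from the start on every lookup, so the loop can slow down sharply on long texts. The following is a minimal, untested sketch of an alternative loop in VB6 that enumerates the collection once with For Each and turns off screen updating. SegmentText is a hypothetical helper name, not part of the original program; the sketch assumes the module-level wdapp and doc variables from the VB code above and a project reference to the Word object library.

```vb
' Hypothetical helper (not in the original program): returns the words of
' src separated by spaces, using the module-level wdapp and doc objects.
Private Function SegmentText(ByVal src As String) As String
    Dim w As Word.Range
    Dim result As String

    wdapp.ScreenUpdating = False      ' avoid repaint work while Word runs
    doc.Content.Text = src            ' replace the document body with the input

    ' Enumerate the collection once instead of calling Words(I) repeatedly
    For Each w In doc.Content.Words
        result = result & w.Text & " "
    Next w

    doc.Content.Delete                ' leave the document empty for the next call
    wdapp.ScreenUpdating = True
    SegmentText = result
End Function
```

In the click handler, the loop could then be replaced by Me.Text2.Text = SegmentText(Me.Text1.Text). Whether this narrows the gap to a dedicated segmenter such as ICTCLAS would have to be measured; the sketch only removes one likely source of overhead.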