Stroke Input Method Search Algorithm Example (Lua Implementation)

Source: Internet
Author: User
Document directory
  • Bishun Database
  • Construct a subtree and search
  • Package download

My colleagues recently studied the Google Pinyin input method in order to implement a Pinyin input method of their own. From what I learned, the core of it is building and searching a trie (dictionary tree), so today I want to implement a stroke input method the same way. The general idea is:

  1. Find a stroke-order database covering all Chinese characters, or at least the Level 1 and Level 2 character sets.
  2. Use Lua to read the database and build a trie.
    1. Each node stores one stroke.
    2. Each node has a set of child nodes.
    3. Each node holds a set of Chinese characters: the characters whose complete stroke order is the path from the root to this node.
  3. Locate a node from the strokes the user has entered, then traverse its subtree in stroke order.
    1. Traversing the subtree yields every character that starts with those strokes, but they cannot all be displayed at once. An iterator is therefore needed that returns one candidate per call. Such an iterator is complicated to implement in C, but trivial in Lua: wrap the subtree traversal in a coroutine and call yield(character) for each character found.
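
The steps above can be sketched in a few lines of plain Lua. This is only a minimal illustration with a handful of hand-entered characters instead of the database; the names new_node, insert, find and candidates are chosen for this sketch and are not taken from any library.

```lua
-- Minimal trie keyed by stroke codes 1..5 (1 = heng, 2 = shu, 3 = pie, 4 = na, 5 = zhe).
-- All names here are illustrative.
local STROKE_MAX = 5

local function new_node()
  return { children = {}, hanzis = {} }
end

-- stroke_order is a digit string such as "12"; hanzi is the character to store.
local function insert(root, stroke_order, hanzi)
  local node = root
  for i = 1, #stroke_order do
    local stroke = tonumber(stroke_order:sub(i, i))
    node.children[stroke] = node.children[stroke] or new_node()
    node = node.children[stroke]
  end
  table.insert(node.hanzis, hanzi)
end

-- Follow the prefix from the root; returns nil when no character starts with it.
local function find(root, prefix)
  local node = root
  for i = 1, #prefix do
    node = node.children[tonumber(prefix:sub(i, i))]
    if not node then return nil end
  end
  return node
end

-- Returns a function that gives one character per call and nil once exhausted.
local function candidates(node)
  local function walk(n)
    for _, hanzi in ipairs(n.hanzis) do coroutine.yield(hanzi) end
    for stroke = 1, STROKE_MAX do
      if n.children[stroke] then walk(n.children[stroke]) end
    end
  end
  local co = coroutine.create(function() walk(node) end)
  return function()
    local ok, hanzi = coroutine.resume(co)
    if ok then return hanzi end
    return nil -- the coroutine has already finished
  end
end

-- Hand-entered sample data: stroke order, character.
local root = new_node()
local samples = {
  {"1", "一"}, {"11", "二"}, {"111", "三"}, {"112", "干"},
  {"12", "十"}, {"121", "土"}, {"121", "工"},
}
for _, entry in ipairs(samples) do insert(root, entry[1], entry[2]) end

local next_hanzi = candidates(find(root, "12"))
local found = {}
for hanzi in next_hanzi do found[#found + 1] = hanzi end
print(table.concat(found, " ")) --> 十 土 工
```

Because the traversal lives inside a coroutine, the caller can stop after any number of candidates without the traversal having to remember its position explicitly.
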
Bishun Database

Download it from CSDN: http://download.csdn.net/detail/yyjlan/3766691

The download is in MDB (Microsoft Access) format, which I don't like and which Lua cannot read directly. Because LuaSQL supports ODBC, you can register the MDB file as an ODBC data source, read it through ODBC, and convert it to SQLite3 format for later use. The conversion code is as follows:

    require "luasql.odbc"
    require "luasql.sqlite3"

    odbc_env = luasql.odbc()
    -- Add the Access file as a user DSN under Control Panel > Administrative Tools > Data Sources.
    -- The DSN name is hzbs.
    odbc_conn = odbc_env:connect("hzbs")
    odbc_cur = odbc_conn:execute("select * from hzbs;")

    sqlite_env = luasql.sqlite3()
    sqlite_conn = sqlite_env:connect("hzbs.sqlite3.db")
    sqlite_conn:execute("create table hzbs (id integer primary key, hanzi text, stroke_number integer, stroke_order text, unicode text, gbk text);")
    sqlite_conn:setautocommit(false) -- start transaction

    record = {}
    while odbc_cur:fetch(record, "n") do
        local id = record[1]
        local hanzi = record[2]
        local stroke_number = record[3]
        local stroke_order = record[4]
        local unicode = record[5]
        local gbk = record[6]
        sqlite_conn:execute("insert into hzbs (id, hanzi, stroke_number, stroke_order, unicode, gbk) values ("
            .. id .. ",'" .. hanzi .. "'," .. stroke_number .. ",'" .. stroke_order
            .. "','" .. unicode .. "','" .. gbk .. "');")
    end
    sqlite_conn:commit() -- commit the transaction

    sqlite_conn:close()
    odbc_cur:close()
    odbc_conn:close()
    odbc_env:close()
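
The INSERT statement in the conversion code is assembled by string concatenation, which would produce invalid SQL if a field ever contained a single quote. A minimal sketch of a safer way to build the same statement follows, assuming two hypothetical helpers, sql_quote and build_insert, that are not part of LuaSQL; the sample record holds placeholder values, not real database rows.

```lua
-- Hypothetical helper (not part of LuaSQL): wraps a value in single quotes and
-- doubles any embedded single quote, as SQL string literals require.
local function sql_quote(value)
  return "'" .. (tostring(value):gsub("'", "''")) .. "'"
end

-- Hypothetical helper: builds the INSERT for one row fetched in numeric mode,
-- using the column order id, hanzi, stroke_number, stroke_order, unicode, gbk.
local function build_insert(record)
  return string.format(
    "insert into hzbs (id, hanzi, stroke_number, stroke_order, unicode, gbk) " ..
    "values (%d, %s, %d, %s, %s, %s);",
    tonumber(record[1]), sql_quote(record[2]), tonumber(record[3]),
    sql_quote(record[4]), sql_quote(record[5]), sql_quote(record[6]))
end

-- Placeholder row for illustration only.
local sample = {1, "十", 2, "12", "5341", "0000"}
print(build_insert(sample))
print(sql_quote("it's"))
```

In the conversion loop, the result of build_insert(record) would be passed to sqlite_conn:execute in place of the concatenated string.
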

Construct a subtree and search

Let's go straight to the code. It is a bit messy, but it is good enough for a quick demo. To run it, first install Lua for Windows.

    require "luasql.sqlite3"
    require "wx"

    function _T(s)
        return s
    end

    -- enum stroke_t {
    local stroke_root = 0 -- for trie root, not a valid stroke
    local stroke_heng = 1
    local stroke_shu = 2
    local stroke_pie = 3
    local stroke_na = 4
    local stroke_zhe = 5
    local stroke_max = 5
    -- button labels for heng, shu, pie, na, zhe
    local stroke_text = {_T"一", _T"丨", _T"丿", _T"丶", _T"乛"}
    -- }

    function new_node(stroke)
        return {
            stroke = stroke, -- see stroke definition
            subnodes = {},   -- next strokes
            hanzis = {}      -- two or more hanzi could have the same stroke order
        }
    end

    function new_trie()
        return new_node(stroke_root)
    end

    -- insert hanzi and create the trie
    function insert_hanzi(node, stroke_order, hanzi)
        local stroke, not_found_index
        -- follow the existing path as far as it goes
        for i = 1, #stroke_order do
            stroke = tonumber(stroke_order:sub(i, i))
            if node.subnodes[stroke] then
                node = node.subnodes[stroke]
            else
                not_found_index = i
                break
            end
        end
        -- create the missing nodes for the remaining strokes
        if not_found_index then
            for i = not_found_index, #stroke_order do
                stroke = tonumber(stroke_order:sub(i, i))
                node.subnodes[stroke] = new_node(stroke)
                node = node.subnodes[stroke]
            end
        end
        table.insert(node.hanzis, hanzi)
    end

    -- check whether the node for the stroke sequence in the strokes array exists;
    -- if so, return that node
    function find_node(root, strokes)
        local node = root
        if #strokes < 1 then
            return nil
        end
        for i, stroke in ipairs(strokes) do
            if node.subnodes[stroke] then
                node = node.subnodes[stroke]
            else
                return nil
            end
        end
        return node
    end

    function db_to_trie(db_name)
        local env = luasql.sqlite3()
        local conn = env:connect(db_name)
        local cur = conn:execute("select hanzi, stroke_order from hzbs;")
        local trie = new_trie()
        local record = {}
        while cur:fetch(record, "a") do
            insert_hanzi(trie, record.stroke_order, record.hanzi)
        end
        cur:close()
        conn:close()
        env:close()
        return trie
    end

    function get_hanzi_enumerator(root)
        local traverse
        traverse = function(node)
            for i = 1, #node.hanzis do
                coroutine.yield(node.hanzis[i])
            end
            for stroke = 1, stroke_max do
                if node.subnodes[stroke] then
                    traverse(node.subnodes[stroke])
                end
            end
        end
        local co = coroutine.create(function() traverse(root) end)
        return (function()
            local ret, hanzi = coroutine.resume(co)
            if not ret then -- already stopped
                return nil
            elseif hanzi == nil then -- the last call, no yield and no return value
                return nil
            else
                return hanzi
            end
        end)
    end

    -------- GUI --------
    local new_id = (function()
        local id = wx.wxID_HIGHEST
        return (function()
            id = id + 1
            return id
        end)
    end)()

    dialog = wx.wxDialog(wx.NULL, wx.wxID_ANY, _T"Lua stroke input method demonstration", wx.wxDefaultPosition, wx.wxDefaultSize)
    panel = wx.wxPanel(dialog, wx.wxID_ANY)
    local main_sizer = wx.wxBoxSizer(wx.wxVERTICAL)

    -- stroke buttons
    local stroke_label = wx.wxStaticText(panel, new_id(), _T"Available strokes")
    local heng_button = wx.wxButton(panel, stroke_heng, stroke_text[stroke_heng])
    local shu_button = wx.wxButton(panel, stroke_shu, stroke_text[stroke_shu])
    local pie_button = wx.wxButton(panel, stroke_pie, stroke_text[stroke_pie])
    local na_button = wx.wxButton(panel, stroke_na, stroke_text[stroke_na])
    local zhe_button = wx.wxButton(panel, stroke_zhe, stroke_text[stroke_zhe])
    local button_sizer = wx.wxBoxSizer(wx.wxHORIZONTAL)
    button_sizer:Add(stroke_label, 0, wx.wxALIGN_LEFT + wx.wxALL, 5)
    button_sizer:Add(heng_button, 0, wx.wxALIGN_LEFT + wx.wxEXPAND + wx.wxALL, 5)
    button_sizer:Add(shu_button, 0, wx.wxALIGN_LEFT + wx.wxEXPAND + wx.wxALL, 5)
    button_sizer:Add(pie_button, 0, wx.wxALIGN_LEFT + wx.wxEXPAND + wx.wxALL, 5)
    button_sizer:Add(na_button, 0, wx.wxALIGN_LEFT + wx.wxEXPAND + wx.wxALL, 5)
    button_sizer:Add(zhe_button, 0, wx.wxALIGN_LEFT + wx.wxEXPAND + wx.wxALL, 5)
    main_sizer:Add(button_sizer, 0, wx.wxALIGN_LEFT + wx.wxEXPAND + wx.wxALL, 5)

    -- list of entered strokes
    local input_label = wx.wxStaticText(panel, new_id(), _T"Input strokes")
    local input_textctrl = wx.wxTextCtrl(panel, new_id(), "", wx.wxDefaultPosition, wx.wxDefaultSize, wx.wxTE_READONLY)
    local input_backspace_button = wx.wxButton(panel, new_id(), _T"Backspace")
    local input_clear_button = wx.wxButton(panel, new_id(), _T"Clear")
    local input_sizer = wx.wxBoxSizer(wx.wxHORIZONTAL)
    input_sizer:Add(input_label, 0, wx.wxALIGN_LEFT + wx.wxALL, 5)
    input_sizer:Add(input_textctrl, 1, wx.wxALIGN_LEFT + wx.wxEXPAND + wx.wxALL, 5)
    input_sizer:Add(input_backspace_button, 0, wx.wxALL, 5)
    input_sizer:Add(input_clear_button, 0, wx.wxALL, 5)
    main_sizer:Add(input_sizer, 1, wx.wxALIGN_LEFT + wx.wxEXPAND + wx.wxALL, 5)

    -- candidate characters
    local candidate_label = wx.wxStaticText(panel, new_id(), _T"Candidate characters")
    local candidate_sizer = wx.wxBoxSizer(wx.wxHORIZONTAL)
    candidate_sizer:Add(candidate_label, 0, wx.wxALIGN_LEFT + wx.wxALL, 5)
    local candidate_number = 5
    function create_candidate_btn(num)
        local textctrls = {}
        for i = 1, num do
            textctrls[i] = wx.wxButton(panel, new_id(), "")
            candidate_sizer:Add(textctrls[i], 1, wx.wxALIGN_LEFT + wx.wxALL + wx.wxEXPAND, 5)
        end
        textctrls.start_id = textctrls[1]:GetId()
        textctrls.end_id = textctrls.start_id + candidate_number - 1
        return textctrls
    end
    local candidate_textctrls = create_candidate_btn(candidate_number)
    main_sizer:Add(candidate_sizer, 1, wx.wxALIGN_LEFT + wx.wxALL + wx.wxEXPAND, 5)

    -- selected (output) characters
    local output_textctrl = wx.wxTextCtrl(panel, new_id(), "", wx.wxDefaultPosition, wx.wxSize(0, 100), wx.wxTE_MULTILINE)
    local output_sizer = wx.wxBoxSizer(wx.wxHORIZONTAL)
    output_sizer:Add(output_textctrl, 1, wx.wxALIGN_LEFT + wx.wxEXPAND + wx.wxALL, 5)
    main_sizer:Add(output_sizer, 0, wx.wxALIGN_LEFT + wx.wxEXPAND + wx.wxALL, 0)

    main_sizer:SetSizeHints(dialog)
    dialog:SetSizer(main_sizer)

    -- must be added; otherwise the program does not exit when the dialog is closed
    dialog:Connect(wx.wxEVT_CLOSE_WINDOW, function(event)
        dialog:Destroy()
        event:Skip()
    end)

    -- load the database
    local trie = db_to_trie("hzbs.sqlite3.db")
    -- array of entered strokes
    input_strokes = {}
    get_next_candidate = nil

    function update_candidate()
        if get_next_candidate == nil then
            for _, textctrl in ipairs(candidate_textctrls) do
                textctrl:SetLabel("")
            end
        else
            for _, textctrl in ipairs(candidate_textctrls) do
                local hanzi = get_next_candidate()
                if hanzi then
                    textctrl:SetLabel(hanzi)
                else
                    textctrl:SetLabel("")
                end
            end
        end
    end

    function update_input()
        local text = {}
        for _, stroke in ipairs(input_strokes) do
            table.insert(text, stroke_text[stroke])
        end
        input_textctrl:SetValue(table.concat(text, ""))
    end

    function insert_stroke(stroke)
        table.insert(input_strokes, stroke)
        local node = find_node(trie, input_strokes)
        if node == nil then
            table.remove(input_strokes) -- delete invalid input
            -- beep
        else
            get_next_candidate = get_hanzi_enumerator(node)
            update_input()
            update_candidate()
        end
    end

    function remove_stroke()
        table.remove(input_strokes)
        local node = find_node(trie, input_strokes)
        if node == nil then
            get_next_candidate = nil
        else
            get_next_candidate = get_hanzi_enumerator(node)
        end
        update_input()
        update_candidate()
    end

    function clear_stroke()
        input_strokes = {}
        get_next_candidate = nil
        update_input()
        update_candidate()
    end

    dialog:Connect(wx.wxID_ANY, wx.wxEVT_COMMAND_BUTTON_CLICKED, function(event)
        local id = event:GetId()
        if id <= stroke_max then
            insert_stroke(id)
        elseif id >= candidate_textctrls.start_id and id <= candidate_textctrls.end_id then
            output_textctrl:AppendText(candidate_textctrls[id - candidate_textctrls.start_id + 1]:GetLabel())
            clear_stroke()
        elseif id == input_backspace_button:GetId() then
            remove_stroke()
        elseif id == input_clear_button:GetId() then
            clear_stroke()
        end
    end)

    dialog:Centre()
    dialog:Show(true)
    wx.wxGetApp():MainLoop()
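
The demo fills its five candidate buttons once per stroke, so only the first five matches are ever shown. Because get_next_candidate keeps its position between calls, showing a further page only needs more calls to the same function. The following is a rough sketch of that idea; take and list_enumerator are hypothetical helpers, and list_enumerator merely stands in for get_hanzi_enumerator so that the example runs without the database.

```lua
-- Hypothetical helper: pull up to n values from an iterator function.
local function take(next_value, n)
  local page = {}
  for _ = 1, n do
    local value = next_value()
    if value == nil then break end
    page[#page + 1] = value
  end
  return page
end

-- Stand-in for get_hanzi_enumerator: returns the items of a list one per call,
-- then nil on every later call.
local function list_enumerator(items)
  local co = coroutine.create(function()
    for _, item in ipairs(items) do coroutine.yield(item) end
  end)
  return function()
    local ok, value = coroutine.resume(co)
    if ok then return value end
    return nil
  end
end

local next_candidate = list_enumerator({"一", "二", "三", "干", "十", "土", "工"})
local first_page = take(next_candidate, 5)  -- what the five buttons show first
local second_page = take(next_candidate, 5) -- what a "next page" action could show
print(table.concat(first_page, " "))
print(table.concat(second_page, " "))
```

A "next page" button in the dialog could simply call update_candidate() again, since that function already draws its values from get_next_candidate.
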

Package download

The database file and the source code can be downloaded from my resources. (Uploading to CSDN was too much trouble, so I have moved to cnblogs, uploaded the files there, and linked to them.)
