ibus database high-frequency word error correction script

Source: Internet
Author: User
After using ibus for a long time, I often find that a preferred or commonly used word has suddenly dropped down the candidate list, sometimes as far as the second page. The word has not been pushed out by other words through normal use; rather, the ibus user database appears to have become inconsistent.

I don't know whether this is a bug in ibus itself or a problem with the SQLite database that ibus uses. When a user types a pinyin string, ibus reads the user input frequency of the matching words from the user database and uses it to determine each word's position in the candidate list. The first time the user selects a word, a record for that word is added to the user database, so that the word is moved forward the next time the user types it. In theory, a word can have at most one record in the user database (a polyphonic word, which has more than one pronunciation, counts as several words). In practice, however, for reasons I do not understand, a frequently used word is sometimes added to the database again as if it were being entered for the first time. On the next input the word is then ranked as a low-frequency word, which makes it inconvenient to select.

This Python script finds such entries, deletes the records that are added later, and restores the entry frequency.

Script download: http://code.google.com/p/ptcoding/source/browse/trunk/ibus_fix
(Ibux_db_fix.py is in the svn directory; the other two files are test scripts.)

Program functions:

1. Automatically back up the user lexicon
2. Find phrases that have two records in the user database but are not polyphonic words
3. Delete the record that was added later
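
The backup step can be sketched as a small helper. This is only an illustration: the function name `backup_db` is hypothetical, and it uses `shutil.copy2` from the standard library rather than the external `cp` command that the script itself calls.

```python
import shutil
import time


def backup_db(db_path):
    """Copy db_path to a timestamped file next to it and return the new path."""
    # Same suffix pattern as the script: _YYYY-MM-DD-HH_MM_SS
    stamp = time.strftime("_%Y-%m-%d-%H_%M_%S", time.localtime())
    backup_path = db_path + stamp
    shutil.copy2(db_path, backup_path)  # copy2 also keeps file timestamps
    return backup_path
```

Calling `backup_db` with the path of `user.db` before touching the database leaves a copy that can simply be renamed back if the cleanup removes something it should not have.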

SQL statement for finding the suspect records:

SELECT * FROM py_phrase
WHERE phrase IN
    (SELECT phrase
     FROM py_phrase
     GROUP BY phrase
     HAVING count(*) = 2)
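
To see what this query returns, it can be run against a throwaway in-memory database. The sketch below assumes a simplified `py_phrase` table with a single `pinyin` column; the real ibus table has a different layout, and the sample rows are made up. It shows that the query also returns polyphonic words, which is why a second check on the pronunciation is needed.

```python
import sqlite3

QUERY = """
SELECT * FROM py_phrase
WHERE phrase IN
    (SELECT phrase
     FROM py_phrase
     GROUP BY phrase
     HAVING count(*) = 2)
"""

con = sqlite3.connect(":memory:")
# Toy schema (an assumption): one pinyin column instead of the real layout.
con.execute("CREATE TABLE py_phrase (pinyin TEXT, phrase TEXT, user_freq INTEGER)")
con.executemany(
    "INSERT INTO py_phrase VALUES (?, ?, ?)",
    [
        ("ni hao", "你好", 120),   # original record
        ("ni hao", "你好", 1),     # spurious record added later
        ("yin hang", "银行", 40),  # appears once, so it is not selected
        ("chang", "长", 30),       # polyphonic word: two legitimate records
        ("zhang", "长", 12),
    ],
)

rows = con.execute(QUERY).fetchall()

# Group the selected rows by phrase, then keep only the phrases whose two
# records have the same pinyin, i.e. real duplicates rather than polyphones.
by_phrase = {}
for pinyin, phrase, user_freq in rows:
    by_phrase.setdefault(phrase, []).append((pinyin, user_freq))

true_duplicates = [p for p, recs in by_phrase.items() if recs[0][0] == recs[1][0]]
print(true_duplicates)  # ['你好']
```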

Remaining defects:

1. If the same phrase has three or more records, the program will not detect it. (This should be very rare. You can modify the SQL statement in the script to check for it.)
2. If a word is itself polyphonic and one of its pronunciations has the problem described above, the program cannot detect it (the probability seems quite low).
3. If the two records have the same user input frequency, both records will be deleted, because the deletion matches on phrase and frequency (not ideal, but the impact is small).
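
One possible way around these three defects is to identify records by their SQLite `rowid` and keep exactly one record per pronunciation and phrase. The following is a sketch of that idea, not what the script does. It assumes the same simplified toy table with a single `pinyin` column (the real ibus layout differs), assumes the table has the implicit `rowid` column, and the function name `remove_duplicates` is hypothetical.

```python
import sqlite3


def remove_duplicates(con):
    """Keep the highest-frequency record per (pinyin, phrase); return the delete count."""
    rows = con.execute(
        "SELECT rowid, pinyin, phrase FROM py_phrase "
        "ORDER BY user_freq DESC, rowid ASC"
    ).fetchall()
    seen = set()
    doomed = []
    for rowid, pinyin, phrase in rows:
        key = (pinyin, phrase)
        if key in seen:
            doomed.append((rowid,))  # a better-ranked record was already kept
        else:
            seen.add(key)
    con.executemany("DELETE FROM py_phrase WHERE rowid = ?", doomed)
    con.commit()
    return len(doomed)


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE py_phrase (pinyin TEXT, phrase TEXT, user_freq INTEGER)")
con.executemany(
    "INSERT INTO py_phrase VALUES (?, ?, ?)",
    [
        ("ni hao", "你好", 120),
        ("ni hao", "你好", 1),
        ("ni hao", "你好", 1),     # three records for one phrase (defect 1)
        ("chang", "长", 30),
        ("zhang", "长", 12),       # polyphonic word, both records are valid
        ("zhang", "长", 1),        # duplicate of one pronunciation (defect 2)
        ("yin hang", "银行", 40),
        ("yin hang", "银行", 40),  # equal frequencies (defect 3)
    ],
)

deleted = remove_duplicates(con)
remaining = con.execute(
    "SELECT pinyin, phrase, user_freq FROM py_phrase ORDER BY rowid"
).fetchall()
print(deleted)  # 4
```

Deleting by `rowid` rather than by phrase and frequency means two records with the same frequency no longer disappear together; on a tie the record with the lower `rowid` is the one kept.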

Python source code:



#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
import sys
import time
import sqlite3

DB = os.getenv("HOME") + "/.ibus/pinyin/user.db"

if not os.path.exists(DB):
    print("ibus user database not found ... PT sends congratulations ...")
    sys.exit(1)

# ------ Backup database file --------
nowtime = time.strftime("_%Y-%m-%d-%H_%M_%S", time.localtime())
DB_BK = DB + nowtime
execute = 'cp -v "%s" "%s"' % (DB, DB_BK)
os.system(execute)
print("ibus user database backed up to %s" % DB_BK)


# ------ Connect to Database ---------
con = sqlite3.connect(DB)
c = con.cursor()
c.execute("""SELECT * FROM py_phrase
             WHERE phrase IN
                 (SELECT phrase
                  FROM py_phrase
                  GROUP BY phrase
                  HAVING count(*) = 2)""")

rows = c.fetchall()
badphrase = []

# ------ Determine bad phrases -------
# Rows are processed in pairs, which assumes that the two records of a
# phrase are returned next to each other.
for i in range(0, len(rows), 2):
    flag = True
    phrase = rows[i:i + 2]
    # If any of columns 1-4 differ, the pair is treated as two distinct
    # entries (for example a polyphonic word), not as a duplicate.
    for j in range(1, 5):
        if phrase[0][j] != phrase[1][j]:
            flag = False
    if flag:
        badphrase.append(phrase[1])


if not len(badphrase):
    print("No error entries found ... PT sends congratulations ... http://apt-blog.net")
else:
    print("Found the following %d error entries:" % len(badphrase))
    for row in badphrase:
        print("**[%s]**" % row[-3])
    print("\nOptimizing and cleaning up ...")

# ------ Clean work to Database
try:
    for row in badphrase:
        # Bound parameters avoid quoting problems in the phrase text.
        sql = "DELETE FROM py_phrase WHERE phrase = ? AND user_freq = ?"
        c.execute(sql, (row[-3], row[-1]))

    con.commit()
    print("Cleaned up ... PT sends congratulations ... http://apt-blog.net")
except sqlite3.OperationalError:
    print("The cleanup could not be completed. Please exit ibus ...")

con.close()