Use SQLite for website search

Source: Internet
Author: User
Tags keyword list

  • Download the demo project-389 Kb
  • Database file-1.38 Kb
  • Download source code-111 Kb

Introduction

If your website or program needs a text search engine, unless you write one, otherwise, you will hand over the data you want to search to a network search engine and then spend money to index your website, but now you can use my program to index your text/HTML/ASP files. It will save keywords to the database for future search.

The program uses SQLite as the database. It is an open source database that can be used for free. For more information, see this website.

I use the. NET package of finsar to use SQLite. This package implements an ADO. NET driver for SQLite and is easy to use. Its DLL can be downloaded from SourceForge.

using Finisar.SQLite;
Database Design

You can see the E/R link diagram of the database:

This figure illustrates the relationship between m <-> n. Based on this relationship, we convert itWord_PageTable (index table ).

Word_Page
Wid Pid QTY
Create a database (class DBWSE)

I think a program should be able to create its own database, so it is a complete program, that is, its executable file does not need to carry an empty database. We useCREATESQL statement to create the required table.

Collapse
private void CreateDataBase(string path)
{
try
{
FileStream fi = File.Create(path);
fi.Close();
DBFile = path;
OpenDataBase();
string strSQL = "Create Table words (" +
"WID INTEGER PRIMARY KEY ," +
"Word NVarChar(50)" +
")";
SQLiteCommand sqd = new SQLiteCommand(strSQL, sqconn);
sqd.ExecuteNonQuery();
sqd.CommandText ="Create Table pages(" +
"PID INTEGER PRIMARY KEY ," +
"path NVarChar(100) NOT NULL," +
"nofw INTEGER NOT NULL," +
"Date_add NVARChar(10) NOT NULL" +
")";
sqd.ExecuteNonQuery();
sqd.CommandText ="Create Table word_page(" +
"WID INTEGER FORIGEN KEY REFERENCES words (wid)," +
"PID INTEGER FORIGEN KEY REFERENCES pages (pid)," +
"QTY INTEGER NOT NULL," +
"PRIMARY KEY (WID,PID)" +
")";
sqd.ExecuteNonQuery();
}
catch (Exception ex)
{
throw ex;
}
}

The above Code creates an empty database file.

Add data to the Word, Page, and Word_Page tables

Now we need to store the data in the database.INSERTCommand.

public int AddPage(string path, int nofw)<br>{<br>    int i = PID(path);<br>    if (i &lt; 0)<br>    {<br>        // the page must be add <br>        string strSQL = "INSERT INTO pages (path,nofw,Date_add)" +<br>            "VALUES ('" + path + "'," + nofw + ",'" + <br>            DateTime.Now.ToShortDateString() + "')";<br>        SQLiteCommand sqc = new SQLiteCommand(strSQL, sqconn);<br>        sqc.CommandText = strSQL;<br>        sqc.ExecuteNonQuery();<br>        //return PID<br>        i = PID(path);<br>    }<br>    return i;<br>}

The "PID function" is used to return the ID (page ID) of the new page. Next we store the page data in the database.

private int createWord(string word)<br>{<br>    string strsql = "INSERT INTO words (word) VALUES ('" + word + "')";<br>    SQLiteCommand sqc = new SQLiteCommand(strsql, sqconn);<br>    sqc.ExecuteNonQuery();<br>    return WID(word);<br>}

This function adds a keyword to the "Word" table.

public void AddWord(int page, string word, int QTY)<br>{<br>    int i = WID(word);<br>    if (i &lt; 0)<br>        i = createWord(word);<br>    string strsql = "INSERT INTO word_page (WID,PID,QTY) VALUES " +<br>        "(" + i.ToString() + "," + page.ToString() + <br>        "," + QTY.ToString() + ")";<br>    SQLiteCommand sqc = new SQLiteCommand(strsql, sqconn);<br>    sqc.ExecuteNonQuery();<br>}

According to the current design, the keywords cannot be repeated. If a word is found during query, the ID corresponding to the word is returned.(WID)Otherwise-1 is returned. Of course, we must first store Words in the Words table. In fact, this function does not store more redundant data when storing page indexes.

Search databases

Now we can search for a word in the database.

public DataTable SearchWord(string word)<br>{<br>    string strSQL = "SELECT pages.path,pages.nofw,pages.Date_add,word_page.QTY " +<br>        "FROM words INNER JOIN word_page ON words.wid=word_page.wid" +<br>        " INNER JOIN pages ON word_page.pid=pages.pid" +<br>        " WHERE words.word='"+word+"'";<br>    SQLiteDataAdapter sqd = new SQLiteDataAdapter(strSQL, sqconn);<br>    DataTable dt = new DataTable();<br>    sqd.Fill(dt);<br>    return dt;<br>}

Function returnDataTableObjects are used in programs.

Create an index for data stored in the database

Now you have saved all the data in the database. To use the data, we need to create an index for these pages!

It was not difficult at the beginning! We can use some delimiters to split an article into words. for example: space, ')', '(', '[', '\', etc., but this is not very accurate.

You must remove unimportant or repeated words before creating an index.

Split Article Section
string[] split = words.Split (new Char[] { ' ', ',', '.', ':',';','{','}','[',<br>     ']','(',')','\'','"','\\','&lt;','&gt;','=','+','-','/','*','#','&amp;','%','^',<br>     '`','~','0','1','2','3','4','5','6','7','8','9' });
Create a keyword list (class ScanFile) fold
private void StartMonitoring(string p)
{
//first step must read file
StreamReader stR = File.OpenText(p);
string words = stR.ReadToEnd();
stR.Close();
string[] split = words.Split(new Char[] { ' ', ',', '.', ':',';','{','}','[',
']','(',')','\'','"','\\','<','>','=','+','-','/','*','#','&','%','^',
'`','~','0','1','2','3','4','5','6','7','8','9' });
max = split.Length;
int index;
int k = 0;
list1.Clear();
for (int i = 0; i < split.Length; i++)
{
WordInfo word = new WordInfo();
word.Word = split[i].Trim();
if (word.Word.Length > 2)
{
SearchableWord = word.Word;
index = list1.FindIndex(HaveWord);
if (index < 0)//not found
{
word++;
list1.Add(word);
k++;
}
else
{
// increment count of word
list1[index]++;
k++;
}
}
// Progress( this is a event )
OnProgress(System.EventArgs.Empty);
}
total = k;
}
private static bool HaveWord(WordInfo str)
{
if (str.Word.ToUpper() == SearchableWord.ToUpper())
return true;
else
return false;
}

In this function, I usedWordInfoAndList Class, and creates a new event for the retrieved file"OnProgress".

SCF.Progress += new EventHandler(SCF_Progress);

To findList)To define a delegate:

private static bool HaveWord(WordInfo str)

However, we cannot find the word with this parameter. To solve this problem, I added a private variable for string comparison:

private static string SearchableWord;
Class RepluseTrivial)

To exclude words that have no value, you must have a list of these words. I want to save these words to the database so that SQL statements can be used conveniently. therefore, I created a garbage vocabulary and used insert/delete to add and delete these words.

When you createRepluseTrivialObject, you need to pass it a word list and then callRepulseFunction to remove valuable words.

public void Repluse()<br>{<br>    List<wordinfo> temp = new List<wordinfo>();<br>    for (int i = 0; i &lt; list1.Count; i++)<br>        if (!IsTrivial(list1[i].Word))<br>            temp.Add(list1[i]);<br>    list1 = temp;<br>}</wordinfo></wordinfo>

Function callIsTrivialFunction to check whether it is a Spam word.

How to use my code

Now you have a list of spam words and a database for storing these words. Use the algorithm mentioned above to call the "search function" to search for them. of course, this function can be further developed to hide indexes like Google and Microsoft engines. (efficiency cannot be compared ).

We can use a progress bar to display the progress:

public Form1()<br>{<br>    InitializeComponent();<br>    SCF = new ScanFile();<br>    SCF.Progress += new EventHandler(SCF_Progress);<br>}<br>void SCF_Progress(object sender, EventArgs e)<br>{<br>    progressBar1.Maximum = SCF.MaxProgress;<br>    progressBar1.Value++;<br>}

Create a list to remove spam words:

private void button2_Click(object sender, EventArgs e)<br>{<br>    progressBar1.Value = 0;<br>    SCF.Scan(textBox1.Text);<br>    label2.Text = "Total Word:"+SCF.Total.ToString();<br>    List<wordinfo> list1=new List<wordinfo>(); <br>    list1 = SCF.WordList;<br>    string str = <br>      Path.GetDirectoryName(Application.ExecutablePath)+ <br>      "\\Trivial.db3";<br>    RepluseTrivial rt = new RepluseTrivial(str);<br>    list1 = rt.Repluse(list1);<br>    listView1.Items.Clear();<br>    for (int i = 0; i &lt; list1.Count; i++)<br>    {<br>        ListViewItem li = new ListViewItem(new string[] { i.ToString(), <br>                          list1[i].Count.ToString(), list1[i].Word });<br>        listView1.Items.Add(li);<br>    }<br>}</wordinfo></wordinfo>

Save the list to the database:

private void button3_Click(object sender, EventArgs e)<br>{<br>    string str = Path.GetDirectoryName(Application.ExecutablePath);<br>    DBWSE db = new DBWSE(str+""\\WSEDB.db3");<br>    int i=db.AddPage(textBox1.Text,SCF.Total);<br>    progressBar1.Maximum = SCF.WordList.Count;<br>    progressBar1.Value = 0;<br>    for (int j=0;j &lt; SCF.WordList.Count;j++)<br>    {<br>        db.AddWord(i, SCF.WordList[j].Word, SCF.WordList[j].Count);<br>        progressBar1.Value++;<br>    }<br>}

You can easily display the query results inDataGridViewControl:

private void button7_Click(object sender, EventArgs e)<br>{<br>    string str = Path.GetDirectoryName(Application.ExecutablePath);<br>    DBWSE db = new DBWSE(str + "\\WSEDB.db3");<br>    dataGridView2.DataSource = db.SearchWord(textBox3.Text);<br>}


From: http://www.sqlitechina.org/html/1/20071221/147.html

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.