Using Coreseek for Chinese full-text search indexing


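Coreseek in our application (touch-screen HTML5 version)

For installation details, see: http://www.coreseek.cn/products-install/install_on_bsd_linux/

This walkthrough uses CentOS 6.2 as the example (the installation steps below are taken from the official site):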

1. Install the dependency packages

yum install make gcc g++ gcc-c++ libtool autoconf automake imake mysql-devel libxml2-devel expat-devel
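Before moving on to the build, it can help to confirm that the toolchain actually landed on the PATH. The following is a minimal sketch, not part of the official instructions: the helper name `missing_tools` is made up for illustration, and the list of tools checked is an assumption based on the commands the later steps call.

```shell
#!/bin/sh
# Hypothetical helper: print each given command that is NOT on PATH,
# one name per line. Prints nothing when every command is found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools the build steps in this article rely on (assumed list).
missing=$(missing_tools make gcc g++ libtool autoconf automake)
if [ -n "$missing" ]; then
    echo "Missing build tools:" $missing
else
    echo "All build tools found."
fi
```

If anything is reported missing, rerun the yum command above for the corresponding package before running ./bootstrap or buildconf.sh.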

2. Install coreseek
$ wget http://www.coreseek.cn/uploads/csft/3.2/coreseek-3.2.14.tar.gz
$ # or: http://www.coreseek.cn/uploads/csft/4.0/coreseek-4.0.1-beta.tar.gz
$ # or: http://www.coreseek.cn/uploads/csft/4.0/coreseek-4.1-beta.tar.gz
$ tar xzvf coreseek-3.2.14.tar.gz  # or coreseek-4.0.1-beta.tar.gz / coreseek-4.1-beta.tar.gz
$ cd coreseek-3.2.14  # or coreseek-4.0.1-beta / coreseek-4.1-beta

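## Prerequisite: the OS base development libraries and the MySQL dependency libraries must already be installed, to support the mysql and xml data sources
## Install mmseg
$ cd mmseg-3.2.14
$ ./bootstrap  # warnings in the output can be ignored; errors must be resolved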
$ ./configure --prefix=/usr/local/mmseg3
$ make && make install
$ cd ..

## Install coreseek
$ cd csft-3.2.14  # or: cd csft-4.0.1 / cd csft-4.1
$ sh buildconf.sh  # warnings in the output can be ignored; errors must be resolved
$ ./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql  ## if configure reports a mysql problem, see the MySQL data source installation notes
$ make && make install
$ cd ..

## Test mmseg word segmentation and coreseek search (the character set must be set to zh_CN.UTF-8 beforehand so that Chinese displays correctly)
$ cd testpack
$ cat var/test/test.xml  # Chinese should display correctly at this point
$ /usr/local/mmseg3/bin/mmseg -d /usr/local/mmseg3/etc var/test/test.xml
$ /usr/local/coreseek/bin/indexer -c etc/csft.conf --all
$ /usr/local/coreseek/bin/search -c etc/csft.conf 網路搜尋
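The test above depends on the shell locale being zh_CN.UTF-8. As a rough sketch of how that could be checked before running it (the function name `is_zh_utf8` is hypothetical, and accepting the `zh_CN.utf8` spelling is an assumption about how some systems report the same locale):

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the argument names the
# zh_CN UTF-8 locale (both common spellings are accepted).
is_zh_utf8() {
    case "$1" in
        zh_CN.UTF-8|zh_CN.utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# Effective character-type locale: LC_ALL overrides LC_CTYPE,
# which overrides LANG.
current_locale=${LC_ALL:-${LC_CTYPE:-${LANG:-}}}
if is_zh_utf8 "$current_locale"; then
    echo "Locale OK: $current_locale"
else
    echo "Locale is '$current_locale'; consider: export LANG=zh_CN.UTF-8"
fi
```

If the check fails, exporting LANG (or LC_ALL) as zh_CN.UTF-8 in the current shell before running cat, mmseg, indexer and search is usually enough, provided that locale is installed on the system.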

3. Configuration

[zdh@gy03 coreseek]$ cat etc/movie.conf
#
# Minimal Sphinx configuration sample (clean, simple, functional)
#
source movie
{
    type = mysql

    sql_host = dbIP
    sql_user = dbuser
    sql_pass = dbpassword
    sql_db = dbname
    sql_port = 3306 # optional, default is 3306

    sql_query_pre = SET NAMES utf8
    sql_query = \
        SELECT index_id, movie_id, movie_name, movie_name_alias, movie_name_pinyin, starin, starin_pinyin, director, director_pinyin, movie_type, show_time, region \
        FROM index_movie

    sql_attr_uint = movie_id
    #sql_attr_timestamp = date_added

    sql_query_info = SELECT movie_id, movie_name, starin, director, movie_type, show_time, region FROM index_movie WHERE index_id=$id
}


index movie
{
    source = movie
    path = /usr/local/coreseek/var/data/movie
    charset_type = zh_cn.utf-8
    #charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F
    charset_dictpath = /usr/local/coreseek/etc/
    morphology = none
    docinfo = extern
    mlock = 0
    min_stemming_len = 1
    ngram_len = 0
    min_word_len = 1
    html_strip = 0
    #ngram_chars = U+3000..U+2FA1F
}

indexer
{
    mem_limit = 64M
}

searchd
{
    listen = 9312
    #listen = 9306:mysql41
    log = /usr/local/coreseek/var/log/searchd.log
    query_log = /usr/local/coreseek/var/log/query.log
    read_timeout = 5
    max_children = 30
    pid_file = /usr/local/coreseek/var/log/searchd.pid
    max_matches = 1000
    seamless_rotate = 1
    preopen_indexes = 1
    unlink_old = 1
    #workers = threads # for RT to work
    #binlog_path = /usr/local/coreseek/var/data
}
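Several paths in this config (the index path, the log files, the pid file) have to point at directories that already exist. The sketch below shows one way to pull a value out of the config so it can be checked from a script. The helper `conf_value` is hypothetical, it only handles simple single-line `key = value` settings, and it returns the first match in the file, so it is not a real parser for the config format.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first "KEY = value" line
# in a Sphinx/Coreseek config file, with any trailing "# comment" and
# trailing whitespace removed. Commented-out settings are skipped.
conf_value() {
    key=$1
    file=$2
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" |
        sed -e 's/[[:space:]]*#.*$//' -e 's/[[:space:]]*$//' |
        head -n 1
}

# Example use: warn if the directory that will hold the index is missing.
# The default config path is the one used in this article.
conf=${1:-etc/movie.conf}
if [ -f "$conf" ]; then
    index_path=$(conf_value path "$conf")
    index_dir=$(dirname "$index_path")
    [ -d "$index_dir" ] || echo "Missing index directory: $index_dir"
fi
```

With the config shown above, `conf_value path etc/movie.conf` would print /usr/local/coreseek/var/data/movie, so the directory checked is /usr/local/coreseek/var/data.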


4. Starting and stopping

[zdh@gy03 coreseek]$ searchd -c etc/movie.conf  # start

[zdh@gy03 coreseek]$ searchd -c etc/movie.conf --stop  # stop

[zdh@gy03 coreseek]$ indexer -c /usr/local/coreseek/etc/movie.conf --all --rotate

# If you build the index before starting searchd, drop --rotate
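The choice between indexing with and without --rotate can be scripted by looking at the searchd pid file. This is only a sketch under some assumptions: `indexer_args` is a made-up helper, the paths are the ones from the config in this article, and the liveness check via `kill -0` only works when the script runs as the same user as searchd (or as root).

```shell
#!/bin/sh
# Hypothetical helper: print the indexer arguments for a full rebuild.
# --rotate is added only when the pid file names a live process.
indexer_args() {
    conf=$1
    pid_file=$2
    args="-c $conf --all"
    if [ -s "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null; then
        args="$args --rotate"
    fi
    echo "$args"
}

# Show the command that would be run; paths follow this article's config.
echo indexer $(indexer_args /usr/local/coreseek/etc/movie.conf \
    /usr/local/coreseek/var/log/searchd.pid)
```

The script only prints the command line. To actually rebuild, the printed arguments would be passed to /usr/local/coreseek/bin/indexer.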


5. Appendix: structure of the index_movie table:

CREATE TABLE `index_movie` (
`index_id` int(11) NOT NULL AUTO_INCREMENT,
`movie_id` int(11) DEFAULT NULL,
`movie_name` varchar(32) DEFAULT NULL,
`movie_name_alias` varchar(255) DEFAULT NULL,
`movie_name_pinyin` varchar(255) DEFAULT NULL,
`starin` varchar(128) DEFAULT NULL,
`starin_pinyin` varchar(256) DEFAULT NULL,
`director` varchar(64) DEFAULT NULL,
`director_pinyin` varchar(64) DEFAULT NULL,
`movie_type` varchar(64) DEFAULT NULL,
`show_time` varchar(10) DEFAULT NULL,
`region` varchar(32) DEFAULT NULL,
`movie_desc` varchar(1024) DEFAULT NULL,
`weight` int(11) DEFAULT '0',
`dp_class_type` varchar(128) DEFAULT NULL,
`dp_district_type` varchar(128) DEFAULT NULL,
`dp_age_type` varchar(64) DEFAULT NULL,
`show_time_format` varchar(64) DEFAULT NULL,
`resource_flag` int(10) DEFAULT '0' COMMENT '1 = yes, 0 = no',
PRIMARY KEY (`index_id`),
KEY `index_movie_id` (`movie_id`)
) ENGINE=InnoDB AUTO_INCREMENT=196606 DEFAULT CHARSET=utf8;




This article comes from the "linuxblind開放空間" blog; please contact the author before reposting!
