Sphider search engine program with spider, v1.3.4, Dingtingson Chinese version. This is the official version, free and open source, localized into Chinese from the latest official English release. No core files have been changed.
Sphider is a search engine program that comes with its own spider.
Sphider is a lightweight web spider and search engine written in PHP that uses MySQL to store its data. You can use it to add search functionality to your site. Sphider is small, easy to install and modify, and is already in use on thousands of sites.
Official homepage: http://www.sphider.eu/
Today I needed to build a full-text search engine for several websites, so I looked at a few open-source PHP projects. I tried Sphinx first, but it works on top of the database and amounts to a search extension for the database. Sphider is good, but it cannot segment Chinese text; it basically splits words only on spaces and punctuation. Lucene's word segmentation is only available for Java and .NET, with no PHP version, so I had to modify Sphider's word segmentation myself. Fortunately I found SCWS, a good Chinese word segmentation system; its functions just need to be added into Sphider.
First deploy Sphider and SCWS according to their installation documents. SCWS-1.1.6 is used here, and its PHP extension needs to be installed. Note that on Linux you must fix the permissions on the dictionary file; otherwise the text will be split into individual Chinese characters. The Sphider used here is the Dingtingson Chinese version.
Once both are deployed correctly, modify Sphider. Find the spider file in the admin folder, and first add the code that initializes the word segmenter at the beginning of the file.
Note that GBK is used here. If your web pages are encoded in UTF-8, change the charset setting and the locations of the dictionary and rules files accordingly.
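The initialization step might look like the sketch below. It relies on the SCWS PHP extension's scws_new(), set_charset(), set_dict() and set_rule(). The helper name init_segmenter is hypothetical, and the file paths are assumptions based on a default SCWS install prefix of /usr/local/scws; adjust them to your system.

```php
<?php
// Sketch only: initialize the SCWS segmenter near the top of the spider file.
// init_segmenter() is a hypothetical helper, not part of Sphider or SCWS.

function init_segmenter(string $charset, string $dict, string $rules)
{
    // The SCWS PHP extension must be loaded for scws_new() to exist.
    if (!function_exists('scws_new')) {
        return null;
    }
    // An unreadable dictionary is the permission problem mentioned earlier.
    if (!is_readable($dict) || !is_readable($rules)) {
        return null;
    }
    $cws = scws_new();
    $cws->set_charset($charset);   // 'gbk' or 'utf8'
    $cws->set_dict($dict);         // dictionary file (.xdb)
    $cws->set_rule($rules);        // rules file (.ini)
    return $cws;
}

// GBK setup, as used in this article (paths are assumptions).
$cws = init_segmenter(
    'gbk',
    '/usr/local/scws/etc/dict.xdb',
    '/usr/local/scws/etc/rules.ini'
);

// For UTF-8 pages the charset and both files change together, for example:
// $cws = init_segmenter('utf8', '/usr/local/scws/etc/dict.utf8.xdb',
//                       '/usr/local/scws/etc/rules.utf8.ini');
```

Returning null when the extension or the files are unavailable makes the failure visible up front, so the spider does not quietly fall back to splitting the text into single characters.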
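In the index_url function, replace the original English word segmentation. At the line $wordarray = unique_array(explode(" ", $data['content'])); add the following:
$cws->send_text($data['content']);
// $limit is the maximum number of top words to return and $xattr the attribute filter; set both before this call
$list = $cws->get_tops($limit, $xattr);
settype($list, 'array');
$wordarray = array();
$i = 0;
// copy each segmented word and its count into Sphider's word array
foreach ($list as $tmp) {
    $wordarray[$i][1] = $tmp['word'];
    $wordarray[$i][2] = $tmp['times'];
    $i++;
}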
Then delete the two statements
$wordarray = unique_array(explode(" ", $data['content']));
and
$wordarray = calc_weights($wordarray, $title, $host, $path, $data['keywords']);
because Sphider's original English word segmentation is no longer needed here. You can add your own filtering and optimization of $wordarray at this point; what I wrote here is very simple.
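As one possible way to filter $wordarray, the sketch below converts get_tops() rows into Sphider's layout (index 1 for the word, index 2 for its count) and drops very short or rare words. The function name tops_to_wordarray and the parameters $minLength, $minTimes and $charset are hypothetical, not Sphider or SCWS settings. It assumes the mbstring extension is available and that each row carries 'word' and 'times' keys.

```php
<?php
// Sketch: build Sphider's $wordarray from SCWS get_tops() rows, with simple filtering.
// All names below are illustrative assumptions.

function tops_to_wordarray(
    array $tops,
    int $minLength = 2,
    int $minTimes = 1,
    string $charset = 'UTF-8'
): array {
    $wordarray = [];
    foreach ($tops as $row) {
        $word  = (string) ($row['word'] ?? '');
        $times = (int) ($row['times'] ?? 0);
        // mb_strlen counts characters rather than bytes, which matters for Chinese text.
        if (mb_strlen($word, $charset) < $minLength || $times < $minTimes) {
            continue;
        }
        // Sphider reads the word from index 1 and its count from index 2.
        $wordarray[] = [1 => $word, 2 => $times];
    }
    return $wordarray;
}

// Example rows shaped like get_tops() output (sample data for illustration only).
$tops = [
    ['word' => '搜索', 'times' => 3],
    ['word' => '的',   'times' => 5],
];
$wordarray = tops_to_wordarray($tops);
// The single-character word is dropped; only the two-character word remains.
```

Keeping the conversion in one function makes it easy to tighten the thresholds later without touching the rest of index_url.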
After the modification is complete, the crawler segments Chinese text normally and the results are good. If you see garbled text, check whether the page encoding and the dictionary encoding match (UTF-8 or GB2312).
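If the page and the dictionary use different encodings, one option is to convert the page text before sending it to the segmenter. The sketch below uses PHP's mb_convert_encoding() and assumes the mbstring extension is installed; the function name to_segmenter_charset is hypothetical.

```php
<?php
// Sketch: convert page text to the charset the segmenter was configured with.
// to_segmenter_charset() is an illustrative helper, not part of Sphider or SCWS.

function to_segmenter_charset(string $text, string $pageCharset, string $segmenterCharset): string
{
    // Nothing to do when both sides already agree.
    if (strcasecmp($pageCharset, $segmenterCharset) === 0) {
        return $text;
    }
    // Arguments are: the string, the target encoding, the source encoding.
    return mb_convert_encoding($text, $segmenterCharset, $pageCharset);
}

// The two characters of "中文" as GBK bytes, converted to UTF-8.
$gbk  = "\xD6\xD0\xCE\xC4";
$utf8 = to_segmenter_charset($gbk, 'GBK', 'UTF-8');
```

In index_url, the converted string would take the place of $data['content'] in the send_text() call.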