Sphider Dingtingson English perfect Chinese version with Spider search Engine program v1.3.4

Source: Internet
Author: User

Sphider Dingtingson English perfect Chinese version with Spider search Engine program v1.3.4 is a Chinese localization of the latest official release. It is free and open source, and no core files have been changed.

Sphider is a complete search engine program with a built-in spider.

Sphider is a lightweight web spider and search engine written in PHP that uses MySQL to store its data. You can use it to add search functionality to your site. Sphider is very small, easy to install and modify, and is already in use on thousands of sites.

Official Homepage http://www.sphider.eu/


Today I needed to build a full-text search engine for several websites, so I looked at a few open-source PHP projects. I first tried Sphinx, but it is database-based, essentially a search extension for the database. Sphider is good, but it cannot segment Chinese text: it basically splits words only on spaces and punctuation. Lucene's word segmentation is only available from Java and .NET and there is no PHP version, so I had to modify Sphider's word segmentation myself. Fortunately I found SCWS, a good Chinese word segmentation system; all that is needed is to add its functions into Sphider.

First deploy Sphider and SCWS according to their installation documents. SCWS 1.1.6 is used here, and its PHP extension must be deployed as well. Note that on Linux you need to adjust the permissions on the dictionary file; otherwise the text will be split into individual Chinese characters. The Sphider used here is the Dingtingson perfect Chinese version with Spider search Engine.

Once the deployment is working, modify Sphider. Find the spider file in the admin folder and, at the beginning of the file, add the code that initializes the word segmenter.

Note that GBK is used here. If your web pages are encoded in UTF-8, change the locations of the dictionary and the rules file here.
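The initialization snippet is not shown here. A minimal sketch of what it might look like, assuming the SCWS PHP extension is loaded: the helper name init_segmenter and both file paths are illustrative assumptions, not taken from the article, and should be adjusted to your own SCWS installation.

```php
<?php
// Sketch only: init_segmenter() and the paths below are assumptions.
function init_segmenter(string $charset, string $dictPath, string $rulePath)
{
    // Without the SCWS extension there is nothing to initialize.
    if (!function_exists('scws_new')) {
        return null;
    }
    $cws = scws_new();
    $cws->set_charset($charset);  // 'gbk' here; 'utf8' for UTF-8 pages
    $cws->set_dict($dictPath);    // dictionary must match the charset
    $cws->set_rule($rulePath);    // rules file must match the charset
    $cws->set_ignore(true);       // drop punctuation from the results
    return $cws;
}

// Assumed default install locations; replace with your own.
$cws = init_segmenter(
    'gbk',
    '/usr/local/scws/etc/dict.xdb',
    '/usr/local/scws/etc/rules.ini'
);
```

For UTF-8 pages, the same call would take 'utf8' together with the UTF-8 builds of the dictionary and rules file.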

In the index_url function, replace the original English word segmentation. At the line $wordarray = unique_array(explode(" ", $data['content'])); add:

    $cws->send_text($data['content']);
    // $limit is a placeholder: the numeric limit is garbled in the source.
    $list = $cws->get_tops($limit, $xattr);
    settype($list, 'array');
    $wordarray = array();
    $i = 0;
    // Copy each segmented word and its count into Sphider's word array.
    foreach ($list as $tmp) {
        $wordarray[$i][1] = $tmp['word'];
        $wordarray[$i][2] = $tmp['times'];
        $i++;
    }

Then delete the two statements

    $wordarray = unique_array(explode(" ", $data['content']));

and

    $wordarray = calc_weights($wordarray, $title, $host, $path, $data['keywords']);

because Sphider's original English word segmentation is no longer needed here. This is also the place to apply your own filtering and optimization to $wordarray; what I wrote here is very simple.
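As one possible way to filter $wordarray, the following sketch drops blank entries and rare words and optionally caps the list size. The helper name build_wordarray and its two thresholds are hypothetical; the input rows only assume the 'word' and 'times' keys used above.

```php
<?php
// Hypothetical helper: turn get_tops()-style rows into Sphider's
// $wordarray shape (index 1 = word, index 2 = count), with filtering.
function build_wordarray(array $tops, int $minTimes = 1, int $maxWords = 0): array
{
    $wordarray = array();
    foreach ($tops as $row) {
        $word  = isset($row['word']) ? trim($row['word']) : '';
        $times = isset($row['times']) ? (int) $row['times'] : 0;
        // Skip blank words and words seen fewer than $minTimes times.
        if ($word === '' || $times < $minTimes) {
            continue;
        }
        $wordarray[] = array(1 => $word, 2 => $times);
        // $maxWords = 0 means no cap.
        if ($maxWords > 0 && count($wordarray) >= $maxWords) {
            break;
        }
    }
    return $wordarray;
}

// Made-up sample rows in the shape returned by get_tops().
$sample = array(
    array('word' => '搜索', 'times' => 3),
    array('word' => ' ',    'times' => 5),
    array('word' => '引擎', 'times' => 1),
);
$filtered = build_wordarray($sample, 2);
```

With a minimum count of 2, only the first sample row survives; the blank entry and the word seen once are dropped.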

After the modification is complete, the crawler segments Chinese text correctly and the results are good. If the output is garbled, check whether the page and the dictionary are encoded as UTF-8 or GB2312.
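One way to avoid a mismatch is to normalize page content to the dictionary's encoding before segmenting. A minimal sketch, assuming a UTF-8 dictionary and PHP's iconv extension: the helper name to_utf8 is hypothetical, and the validity check is only a heuristic, since some GBK byte sequences also happen to be valid UTF-8.

```php
<?php
// Hypothetical helper: return $text as UTF-8, converting from a fallback
// charset (GBK by default) when the bytes are not valid UTF-8.
function to_utf8(string $text, string $fallbackCharset = 'GBK'): string
{
    // An empty pattern with the /u modifier matches only valid UTF-8.
    if (preg_match('//u', $text) === 1) {
        return $text;
    }
    // //IGNORE drops bytes that cannot be converted.
    $converted = iconv($fallbackCharset, 'UTF-8//IGNORE', $text);
    return $converted === false ? $text : $converted;
}

$gbkBytes = iconv('UTF-8', 'GBK', '中文');  // simulate a GBK-encoded page
$utf8Text = to_utf8($gbkBytes);
```

If the dictionary is GBK instead, the same idea applies in the opposite direction.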
