Using PHP to invoke the Lucene package for Full-text Search

Source: Internet
Author: User
Because of requirements at work, I needed to use PHP to implement full-text search across a large number of sites.
The most popular search engine library for full-text search at present is Lucene.
It is a subproject of Apache Jakarta and provides a simple, practical API.
With this API you can perform full-text retrieval over any basic text data (including databases).
Because PHP itself supports invoking external Java classes, first write a class in Java.
This class implements two methods by calling the Lucene API:
public String createIndex(String indexdir_path, String datadir_path)
public String searchword(String ss, String index_path)
createIndex builds the index. It takes two parameters, indexdir_path (the directory where the index files are stored) and datadir_path (the directory of the files to be indexed), and returns a string listing the indexed files.
searchword searches the index for the keyword passed in as ss; index_path is the directory of the index files. It returns all the files that were found.
The source code is very simple; for reference, see TxtFileIndexer.java.
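TxtFileIndexer.java itself is not reproduced on this page. To make the two-method contract described above concrete, here is a minimal stand-in sketch. It is not the author's class and it does not use Lucene: it swaps in a naive plain-Java word index so that it runs without extra jars. The class name SimpleFileIndexer, the index file name index.txt, and the one-line-per-file index layout are all assumptions made for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for TxtFileIndexer: same two-method shape, but a
// naive plain-Java word index instead of Lucene.
class SimpleFileIndexer {
    private static final String INDEX_FILE = "index.txt"; // assumed name

    // Writes one line per data file ("path<TAB>word word ...") into the
    // index directory and returns the indexed file names, one per line.
    public String createIndex(String indexdir_path, String datadir_path) {
        File[] files = new File(datadir_path).listFiles();
        if (files == null) {
            return "Not a readable directory: " + datadir_path;
        }
        List<String> indexLines = new ArrayList<>();
        StringBuilder indexed = new StringBuilder();
        try {
            for (File f : files) {
                if (!f.isFile()) {
                    continue; // this sketch does not recurse into subdirectories
                }
                String text = new String(Files.readAllBytes(f.toPath()), StandardCharsets.UTF_8);
                // Lower-case and split on non-word characters (ASCII words only).
                String words = String.join(" ", text.toLowerCase(Locale.ROOT).split("\\W+")).trim();
                indexLines.add(f.getPath() + "\t" + words);
                indexed.append(f.getName()).append('\n');
            }
            File indexDir = new File(indexdir_path);
            indexDir.mkdirs();
            Files.write(new File(indexDir, INDEX_FILE).toPath(), indexLines, StandardCharsets.UTF_8);
        } catch (IOException e) {
            return "Indexing failed: " + e.getMessage();
        }
        return indexed.toString();
    }

    // Returns the paths of all indexed files containing the keyword as a
    // whole word, one per line (empty string when nothing matches).
    public String searchword(String ss, String index_path) {
        String needle = ss.toLowerCase(Locale.ROOT).trim();
        if (needle.isEmpty()) {
            return "";
        }
        StringBuilder hits = new StringBuilder();
        try {
            File indexFile = new File(index_path, INDEX_FILE);
            for (String line : Files.readAllLines(indexFile.toPath(), StandardCharsets.UTF_8)) {
                int tab = line.indexOf('\t');
                if (tab < 0) {
                    continue;
                }
                String words = " " + line.substring(tab + 1) + " ";
                if (words.contains(" " + needle + " ")) {
                    hits.append(line, 0, tab).append('\n');
                }
            }
        } catch (IOException e) {
            return "Search failed: " + e.getMessage();
        }
        return hits.toString();
    }

    // Usage: java SimpleFileIndexer <indexDir> <dataDir> <keyword>
    public static void main(String[] args) {
        if (args.length != 3) {
            System.out.println("usage: SimpleFileIndexer <indexDir> <dataDir> <keyword>");
            return;
        }
        SimpleFileIndexer indexer = new SimpleFileIndexer();
        System.out.print(indexer.createIndex(args[0], args[1]));
        System.out.print(indexer.searchword(args[2], args[0]));
    }
}
```

Both methods return plain strings and report failures as strings rather than throwing, which keeps the calling side trivial: the caller only has to print the result. A real Lucene-backed class would replace the bodies of the two methods while keeping the same signatures.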
The PHP program then calls these two methods to drive Lucene and so achieve full-text search.
The PHP side works as follows.
First, create an instance of the TxtFileIndexer class we wrote:
$TF = new Java('testlucene.TxtFileIndexer');
Then call its methods just as you would call methods on an ordinary PHP object. Create the index first:
$data_path = "F:/test/php_lucene/htdocs/data/manual"; // directory of the content to be indexed
$index_path = "F:/test/php_lucene/htdocs/data/search"; // directory where the generated index files are stored
$s = $TF->createIndex($index_path, $data_path); // call the Java class's method
print $s; // print the returned result
Then run a search:
$index_path = "F:/test/php_lucene/htdocs/data/search"; // directory where the generated index files are stored
$s = $TF->searchword("keyword for search", $index_path);
print $s;
Also pay attention to the Java class path, which you can set in PHP:
java_require("f:/test/php_lucene/htdocs/lib/"); // an example; my class and Lucene are both placed in this directory
That is all there is to it. Quite simple, isn't it?
PHP source code: test.php
Next, the environment configuration.
First, a Java SDK is required. I used version 1.4.2; other versions should also work.
PHP5 (PHP4 should also work).
I could not get the Java extension bundled with PHP5 to work, and the older way of invoking Java was inefficient and slow, so I used the PHP/Java Bridge project instead.
1. Download JavaBridge
URL: http://sourceforge.net/projects/php-java-bridge/
The current version is
php-java-bridge_3.0.8_j2ee.zip
After unpacking the archive, take
JavaBridge\WEB-INF\cgi\java-x86-windows.dll
JavaBridge\WEB-INF\lib\JavaBridge.jar
copy them to the C:\php\ext directory, and rename
java-x86-windows.dll to php_java.dll
2. Modify php.ini (example)
extension=php_java.dll
[java]
java.class.path = "C:\php\ext\JavaBridge.jar;F:\test\php_lucene\htdocs"
java.java_home = "C:\j2sdk1.4.2_10"
java.library.path = "C:\php\ext;F:\test\php_lucene\htdocs"
3. Restart Apache.
4. Find some files to index
In test.php, you can change the paths of the index files and the data files.
Line 37 of TxtFileIndexer.java restricts indexing to files with an HTML suffix; change it if necessary.
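The suffix restriction mentioned in step 4 could be relaxed along the following lines. This is only a sketch of one way to filter by file name; the SuffixFilter class and the particular suffix list are assumptions for illustration, not the code at line 37 of the original file.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: accepts only files whose names end in one of the
// allowed suffixes. Suffixes must be given in lower case.
class SuffixFilter implements FilenameFilter {
    private final List<String> suffixes;

    SuffixFilter(String... suffixes) {
        this.suffixes = Arrays.asList(suffixes);
    }

    @Override
    public boolean accept(File dir, String name) {
        // Compare case-insensitively so that "INDEX.HTML" is accepted too.
        String lower = name.toLowerCase(Locale.ROOT);
        for (String suffix : suffixes) {
            if (lower.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }

    // Lists the files in a directory (default: current directory) that
    // would be picked up for indexing with the assumed suffix list.
    public static void main(String[] args) {
        File dataDir = new File(args.length > 0 ? args[0] : ".");
        String[] names = dataDir.list(new SuffixFilter(".html", ".htm", ".txt"));
        if (names != null) {
            for (String name : names) {
                System.out.println(name);
            }
        }
    }
}
```

Keeping the allowed suffixes in one place means that adding another file type is a one-word change rather than an edit to the indexing loop.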
Given the current situation (JavaBridge supports Linux and FreeBSD), the whole setup can also run in a
Linux or FreeBSD/Apache2/PHP4/Lucene/JavaBridge
environment.
