Free open-source full-text indexing and retrieval platform (Firtex) and Chinese word segmentation system (ICTCLAS)

Source: Internet
Author: User

ICTCLAS introduction:

Publicly evaluated by domestic and international authorities and recognized by some 50 thousand customers: ICTCLAS won first place in the evaluation organized by China's 973 Expert Group and placed first in the first evaluation organized by SIGHAN.

Optimal overall performance: ICTCLAS 3.0 achieves a word segmentation accuracy of 98.45%. Its single-machine segmentation speed is quoted in KB/s and its API size in KB, but both figures are missing here. The compressed dictionary data totals less than 3 MB.

Comprehensive support for application development in various environments: ICTCLAS is written entirely in C/C++, runs on Linux, FreeBSD, and Windows, and supports C/C++, C#, Delphi, Java, and other mainstream development languages.

Official Website: http://ictclas.org/index.html

Supplement: For small projects, you can use ICTCLAS to build a database-backed full-text search.
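
One way to picture the database-backed search mentioned in the supplement above is an inverted index stored in ordinary tables. The following is only a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module as the database, and the `segment` function is a hypothetical placeholder that splits on whitespace. In a real project that function is where the call into ICTCLAS would go; the ICTCLAS API itself is not shown here. The table and function names are illustrative.

```python
import sqlite3


def segment(text):
    """Hypothetical placeholder segmenter: lowercases and splits on whitespace.

    Assumption: a real project would call a word segmenter such as ICTCLAS here.
    """
    return text.lower().split()


def build_index(conn, docs):
    """Store documents and one posting row per distinct (term, document) pair."""
    conn.execute("CREATE TABLE IF NOT EXISTS docs (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS postings (term TEXT, doc_id INTEGER)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_term ON postings (term)")
    for doc_id, body in docs.items():
        conn.execute("INSERT INTO docs (id, body) VALUES (?, ?)", (doc_id, body))
        for term in set(segment(body)):
            conn.execute(
                "INSERT INTO postings (term, doc_id) VALUES (?, ?)", (term, doc_id)
            )
    conn.commit()


def search(conn, query):
    """Return the IDs of documents that contain every term in the query."""
    terms = sorted(set(segment(query)))
    if not terms:
        return []
    placeholders = ",".join("?" for _ in terms)
    sql = (
        "SELECT doc_id FROM postings WHERE term IN (%s) "
        "GROUP BY doc_id HAVING COUNT(DISTINCT term) = ? "
        "ORDER BY doc_id" % placeholders
    )
    rows = conn.execute(sql, terms + [len(terms)]).fetchall()
    return [row[0] for row in rows]
```

A caller would open a connection (for example `sqlite3.connect(":memory:")`), pass a dictionary of document IDs to text into `build_index`, and then call `search` with a query string. Keeping the segmenter behind a single function means the placeholder can be swapped for a real segmenter without touching the index code.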

Firtex introduction:

Firtex supports plain text, web pages, PDF, Microsoft Office, and other file formats. It supports Chinese (GB2312 and GBK) and English, and its flexible architecture allows other languages and encodings to be added. The search syntax is rich and supports multi-field retrieval, date range retrieval, and custom sorting of search results. The system can also be extended without limit through COM plug-ins.

Firtex is designed to process large-scale data with high performance. On a single ordinary PC (Pentium 4 2.8 GHz with 2 GB of RAM), it indexes plain text at over 200 MB per minute. When searching nearly 100 GB of web pages, it returns results within a few milliseconds while using only a dozen or so MB of memory.

Firtex is developed in C++ and released under the GPL (General Public License) open-source license. This means that you can use Firtex freely under the terms of the GPL and can also take part in its development. If you need a different license, contact the developers.

Official Website: http://www.firtex.org/index.html

 

Supplement: Firtex does not provide wrapped interfaces for C#, Java, and other languages. It can be used in web projects in the following ways:

Method 1: Wrap Firtex in a standalone search server that communicates with the website over a socket to provide full-text search;

Method 2: Embed Firtex in memcached and use the memcached service as the daemon process that hosts all the search modules. The advantage is that you do not have to write the loading and load-handling code yourself. (From a group discussion)
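
Method 1 above can be sketched as a small line-based TCP service. This is an illustration under stated assumptions, not Firtex's own interface: the `INDEX` dictionary and `run_query` function are hypothetical stand-ins for the wrapped search engine, and the protocol (one query per line in, one JSON list of document IDs per line out) is invented for the example. It uses only Python's standard library.

```python
import json
import socket
import socketserver
import threading

# Hypothetical stand-in for the wrapped engine: term -> document IDs.
INDEX = {"firtex": [1, 3], "search": [1, 2]}


def run_query(query):
    """Return the sorted IDs of documents that contain every query term."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(INDEX.get(terms[0], []))
    for term in terms[1:]:
        result &= set(INDEX.get(term, []))
    return sorted(result)


class SearchHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Assumed protocol: one query per line, one JSON reply per line.
        for raw in self.rfile:
            query = raw.decode("utf-8").strip()
            reply = json.dumps(run_query(query)) + "\n"
            self.wfile.write(reply.encode("utf-8"))


class SearchServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True


def start_server(host="127.0.0.1", port=0):
    """Start the server on a background thread; port 0 picks a free port."""
    server = SearchServer((host, port), SearchHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query_server(address, query):
    """Client side, as the website would use it: send one query, read one reply."""
    with socket.create_connection(address) as sock:
        sock.sendall((query + "\n").encode("utf-8"))
        with sock.makefile("r", encoding="utf-8") as reader:
            return json.loads(reader.readline())
```

The website only needs the equivalent of `query_server` in its own language, which is the point of this design: the engine stays in one process, and any language that can open a socket can use it.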

 
