PHP Chinese word segmenter: phpsplit

Source: Internet
Author: User
Phpsplit is a Chinese word segmentation library written in PHP.

A PHP word segmenter built on a Unicode-encoded dictionary

    • Works only with PHP 5 and requires the iconv functions

    • Segmentation uses the RMM (reverse maximum matching) algorithm. The dictionary must be specially compiled; the class provides a makedict() method for this

    • Simple workflow: SetSource, Startanalysis, GetResult

    • The main dictionary is stored in a special encoded format, so lookups do not require loading the whole dictionary into memory
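To make the reverse maximum matching idea concrete, here is a minimal sketch. It is not phpsplit's actual implementation: the function name `rmmSegment`, the `$maxLen` parameter and the in-memory `$dict` array are assumptions made for illustration, and it uses the mbstring functions rather than the library's compiled dictionary.

```php
<?php
// Reverse maximum matching (RMM): scan from the END of the text and, at each
// position, take the longest dictionary word that ends there.
// $dict is a hypothetical in-memory set (word => true), not phpsplit's format.
function rmmSegment($text, array $dict, $maxLen = 4)
{
    $words = array();
    $end = mb_strlen($text, 'UTF-8');

    while ($end > 0) {
        // Try the longest candidate first, then shrink it from the left.
        $len = min($maxLen, $end);
        while ($len > 1) {
            $candidate = mb_substr($text, $end - $len, $len, 'UTF-8');
            if (isset($dict[$candidate])) {
                break;
            }
            $len--;
        }
        // If nothing matched, $len is 1 and a single character is emitted.
        array_unshift($words, mb_substr($text, $end - $len, $len, 'UTF-8'));
        $end -= $len;
    }

    return $words;
}

// Toy dictionary, for illustration only.
$dict = array('研究' => true, '研究生' => true, '生命' => true, '起源' => true);
print_r(rmmSegment('研究生命的起源', $dict));
```

With this toy dictionary the text is split into 研究, 生命, 的, 起源. A forward scan with the same dictionary would take 研究生 first and leave 命 stranded, which is one reason reverse matching is often preferred for Chinese.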

Usage

    • Make sure you are running PHP 5.3 or later

    • Install the dependencies with Composer

    composer install

Then load the autoloader and segment a string:

    require __DIR__ . '/vendor/autoload.php';

    $split = new Split();
    var_dump($split->simple("Hello Phpsplit"));

Output as given in the source:

    array(3) {
      [0]=>
      string(0) ""
      [1]=>
      string(6) "Hello"
      [2]=>
      string(8) "Phpsplit"
    }

Part-of-speech suffixes in the segmentation result

noun n, time word t, place word s, direction word f, numeral m, quantifier q, distinguishing word b, pronoun r, verb v, adjective a, status word z, adverb d, preposition p, conjunction c, auxiliary u, modal particle y, interjection e, onomatopoeia o, idiom i, fixed expression l, abbreviation j, prefix h, suffix k, morpheme g, non-morpheme character x, punctuation w

In addition, the following three categories of tags are used:

    • Proper noun tags: person name nr, place name ns, organization name nt, other proper noun nz

    • Morpheme sub-class tags: noun morpheme ng, verb morpheme vg, adjective morpheme ag, time morpheme tg, adverb morpheme dg, etc.

    • Cross-category tags: verbal noun vn (verb with noun characteristics), adjectival noun an (adjective with noun characteristics), adverbial verb vd (verb with adverb characteristics), adverbial adjective ad (adjective with adverb characteristics)

In total there are about 40 tags.
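As a rough illustration of how these suffixes might be consumed, the sketch below splits a tagged token into word and tag and looks up a readable name. The "word/tag" token layout, the `parseTagged` helper and the partial `$tagNames` table are all assumptions for this example; check the library's real output format before relying on it.

```php
<?php
// Hypothetical lookup table covering a few of the suffixes listed above.
$tagNames = array(
    'n'  => 'noun',
    'v'  => 'verb',
    'r'  => 'pronoun',
    'p'  => 'preposition',
    'vn' => 'verb with noun characteristics',
    'an' => 'adjective with noun characteristics',
);

// Split a "word/tag" token at its LAST slash. The "word/tag" layout is an
// assumption for illustration, not a documented phpsplit format.
function parseTagged($token, array $tagNames)
{
    $pos = strrpos($token, '/');
    if ($pos === false) {
        // No suffix present: return the token unchanged.
        return array('word' => $token, 'tag' => null, 'name' => null);
    }
    $tag = substr($token, $pos + 1);
    return array(
        'word' => substr($token, 0, $pos),
        'tag'  => $tag,
        'name' => isset($tagNames[$tag]) ? $tagNames[$tag] : 'unknown',
    );
}

print_r(parseTagged('研究/vn', $tagNames));
```

Splitting at the last slash keeps words that themselves contain a slash intact, and unknown suffixes fall back to 'unknown' rather than raising an error.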

Project home: http://www.open-open.com/lib/view/home/1448200861473
