SCWS, short for Simple Chinese Words Segmentation, is a simple Chinese word segmentation system. It is a mechanical Chinese word segmentation engine based on a word-frequency dictionary, and it can split a whole passage of Chinese text into words. Words are the basic units of the Chinese language. To get started, first download the SCWS source code.
Install SCWS as follows. (SCWS can be used as a standalone command-line tool, as a dynamic library called from C/C++ programs, or as a PHP extension called from PHP code.)
bzip2 -d SCWS_1.X.X.tar.bz2
tar xvf SCWS_1.X.X.tar
cd SCWS_1.X.X
./configure --prefix=SCWS_HOME
make
make install
PS: Once the steps above have completed, SCWS is installed. You can use it from the command line or from C/C++ programs.
Use SCWS from the command line
cd SCWS_HOME/bin
./scws -i ../etc/test.txt -o ../etc/out.txt -r ../etc/rules.utf8.ini -d ../etc/dict.utf8.xdb -c utf8
PS: The character encoding must be consistent. SCWS needs a dictionary file and a rules file, and the encoding of both must match the encoding of the file being processed.
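An encoding mismatch is easy to miss, so it can help to check the input file before segmenting it. The following is a minimal sketch, assuming the iconv utility is installed; is_utf8 is a hypothetical helper name and is not part of SCWS.

```shell
#!/bin/sh
# is_utf8 FILE: succeed only if FILE decodes cleanly as UTF-8.
# Hypothetical helper, not part of SCWS. It relies on iconv exiting with a
# non-zero status when it meets a byte sequence that is invalid in the
# source encoding. Note that a pure ASCII file also counts as valid UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Example use (commented out): only segment a file that is valid UTF-8.
# if is_utf8 ../etc/test.txt; then
#     ./scws -i ../etc/test.txt -o ../etc/out.txt \
#            -r ../etc/rules.utf8.ini -d ../etc/dict.utf8.xdb -c utf8
# else
#     echo "test.txt is not UTF-8; convert it first, e.g. with iconv -f GBK -t UTF-8" >&2
# fi
```

If the check fails for a file you know to be GBK, converting it with iconv -f GBK -t UTF-8 before running scws keeps it consistent with the utf8 dictionary and rules files.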
If you want to use SCWS from C/C++, perform the following steps. They are important, especially if you plan to install the PHP extension:
cp -r SCWS_HOME/include/scws /usr/include/scws
# Create two symbolic links (on 64-bit machines, put them in /usr/lib64)
ln -s SCWS_HOME/lib/libscws.so.1.1.0 /usr/lib/libscws.so
ln -s SCWS_HOME/lib/libscws.so.1.1.0 /usr/lib/libscws.so.1
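The choice between /usr/lib and /usr/lib64 can be scripted. The sketch below is only an illustration: pick_libdir is a hypothetical helper, the architecture names are common uname -m values, and some distributions keep 64-bit libraries in a different directory, so check your own system before linking.

```shell
#!/bin/sh
# pick_libdir ARCH: print the directory in which to create the libscws links.
# Hypothetical helper, not part of SCWS. The mapping below is an assumption
# that matches the /usr/lib64 convention mentioned above; verify it locally.
pick_libdir() {
    case "$1" in
        x86_64|aarch64|ppc64le|s390x) echo /usr/lib64 ;;
        *) echo /usr/lib ;;
    esac
}

libdir=$(pick_libdir "$(uname -m)")
echo "Link target directory: $libdir"

# Then, as root (SCWS_HOME is the install prefix placeholder used above):
# ln -s SCWS_HOME/lib/libscws.so.1.1.0 "$libdir/libscws.so"
# ln -s SCWS_HOME/lib/libscws.so.1.1.0 "$libdir/libscws.so.1"
```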
Install PHP Extension
cd SCWS_1.x.x/phpext
phpize
./configure --with-php-config=PHP_HOME/bin/php-config
make
make install
# Copy the SCWS_HOME/phpext/modules/scws.so file generated above to your PHP extension directory, then edit php.ini and add the following options:
[scws]
extension = scws.so
scws.default.charset = utf8
scws.default.fpath = SCWS_HOME/etc
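Before running the test script, a quick way to see whether PHP picked up the new settings is to look at the module list printed by php -m. The sketch below assumes the php command-line binary reads the php.ini you edited; module_listed is a hypothetical helper, not part of PHP or SCWS.

```shell
#!/bin/sh
# module_listed NAME: read a module list on stdin (one name per line, as
# printed by `php -m`) and succeed if NAME appears as a whole line,
# ignoring case. Hypothetical helper, not part of PHP or SCWS.
module_listed() {
    grep -qixF -- "$1"
}

# Example use after editing php.ini (commented out):
# php -m | module_listed scws && echo "scws extension is loaded"
# php -i | grep -i 'scws'    # shows the scws settings PHP picked up, if any
```

If the module is not listed, the usual causes are that scws.so is not in the extension directory or that a different php.ini is being loaded; php --ini shows which configuration files are in use.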
Verify PHP extension installation
cd SCWS_1.x.x/phpext
php scws_test.php
# Output:
Test [1]... PASS!
Test [2]... PASS!
Test [3]... PASS!
Test [4]... PASS!
Test [5]... PASS!
Test [6]... PASS!
Test [7]... PASS!
Test [8]... PASS!
Test [9]... PASS!
Test [10]... PASS!
Test [11]... PASS!
Test [12]... PASS!
Test [13]... PASS!
Test [14]... PASS!
Test [15]... PASS!
//-------------------------------------
// TEST result report
// SCWS (Module version: 1.0.0, Library version: 1.2.0)-by hightman
//-------------------------------------
// Total test: 15
// Passed Num: 15 (100.00%)
// Failed Num: 0 (0.00%)
//-------------------------------------
OK, the PHP extension has been installed successfully.
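When this check runs as part of an automated setup, the report can be tested by script rather than read by eye. The sketch below is only an illustration: all_tests_passed is a hypothetical helper, and the line format it matches is taken from the sample report above.

```shell
#!/bin/sh
# all_tests_passed: read the output of scws_test.php on stdin and succeed
# only if the report contains the line fragment "Failed Num: 0 (".
# Hypothetical helper; the format is assumed from the sample report above.
all_tests_passed() {
    grep -qF 'Failed Num: 0 ('
}

# Example use (commented out):
# php scws_test.php | all_tests_passed && echo "SCWS extension OK"
```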