1. Introduction
Building on years of research, the Institute of Computing Technology of the Chinese Academy of Sciences spent one year developing ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System), a Chinese lexical analysis system based on a multi-layer hidden Markov model. It provides Chinese word segmentation, part-of-speech (POS) tagging, and unknown-word recognition. Its segmentation accuracy reaches 97.58% (according to the most recent evaluation by the 973 expert group). Unknown-word recognition based on role tagging achieves a recall above 90%, and recall for Chinese personal names is close to 98%. Segmentation together with POS tagging runs at 31.5 KB/s. ICTCLAS and 14 other freely released results of the Institute of Computing Technology have been widely covered by Chinese and foreign media, and many free Chinese word segmentation modules in China borrow from the ICTCLAS code to some degree.
2. Download
Download the package and unzip it.
The directory structure after decompression:
Notes:
User.lic: the user authorization file.
3. Create a new Eclipse project.
Open the API folder and copy the ICTCLAS folder inside it into the project's src directory, then copy all other folders and files into the project root directory. The project structure:
4. Testing
You can create a new test class yourself, or you can use an existing test class.
For example, the sample folder contains TestMain.java. Copy it into the Eclipse project and run it to see the results.
Note that ICTCLAS automatically generates an ICTCLAS.log file. Check this file, because it contains a lot of useful information.
5. Various issues.
A. UnsatisfiedLinkError on ICTCLAS_Init:
    Exception in thread "main" java.lang.UnsatisfiedLinkError: ICTCLAS50.ICTCLAS_Init([B)Z
        at ICTCLAS50.ICTCLAS_Init(Native Method)
        at TestMain.testICTCLAS_ParagraphProcess(TestMain.java:)
        at TestMain.main(TestMain.java:)
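Solution:
In this case, the class ICTCLAS50.java from the ICTCLAS50_Windows_32_JNI\API\ICTCLAS\I3S\AC directory must be placed in the package ICTCLAS.I3S.AC. JNI resolves a native method by a symbol name that includes the fully qualified class name, so a copy of the class in any other package (including the default package, as in the trace above) cannot be linked to the library.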
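To see why the package matters, the following minimal sketch derives the symbol name that the JVM looks up for a native method, using the standard JNI naming rules for plain ASCII names (`_` becomes `_1`, `.` becomes `_`). The class `JniSymbolName` and its methods are hypothetical helpers written for illustration; they are not part of ICTCLAS, and the claim that the DLL exports the packaged name is an assumption inferred from the fix described above.

```java
// Sketch: derive the JNI symbol name looked up for a native method.
// Handles only simple ASCII names without overloads (an assumption).
final class JniSymbolName {

    // Escape one name: '_' becomes "_1", package separator '.' becomes '_'.
    static String mangle(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c == '_') {
                sb.append("_1");
            } else if (c == '.') {
                sb.append('_');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // fullyQualifiedClass may be a bare class name (default package).
    static String symbolFor(String fullyQualifiedClass, String method) {
        return "Java_" + mangle(fullyQualifiedClass) + "_" + mangle(method);
    }

    public static void main(String[] args) {
        // Class in the expected package.
        System.out.println(symbolFor("ICTCLAS.I3S.AC.ICTCLAS50", "ICTCLAS_Init"));
        // Class copied into the default package: a different symbol.
        System.out.println(symbolFor("ICTCLAS50", "ICTCLAS_Init"));
    }
}
```

The two printed names differ, which is consistent with the error: a library built for the packaged class would not contain the symbol that the default-package copy asks for.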
B. ICTCLAS cannot find its files after the Data folder and the other files are moved into a configure folder:
    Exception in thread "main" java.lang.UnsatisfiedLinkError: no ICTCLAS50 in java.library.path
        at java.lang.ClassLoader.loadLibrary(Unknown Source)
        at java.lang.Runtime.loadLibrary0(Unknown Source)
        at java.lang.System.loadLibrary(Unknown Source)
        at ICTCLAS.I3S.AC.ICTCLAS50.<clinit>(ICTCLAS50.java:26)
        at TestMain.testICTCLAS_ParagraphProcess(TestMain.java:32)
        at TestMain.main(TestMain.java:15)
That is, the library file, the Data folder, and the user authorization file User.lic can no longer be found and loaded.
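Before changing any code, it can help to confirm where the JVM is actually looking. The sketch below prints `java.library.path` and reports whether the library file is present in any of its directories. `LibraryPathCheck` and `findOnPath` are hypothetical names used only for this illustration; `System.mapLibraryName` and the `java.library.path` property are standard Java.

```java
import java.io.File;

// Sketch: check whether a native library file is on java.library.path.
final class LibraryPathCheck {

    // Returns the first directory in pathList that contains fileName, or null.
    static File findOnPath(String pathList, String fileName) {
        for (String dir : pathList.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // skip empty entries such as a trailing separator
            }
            if (new File(dir, fileName).isFile()) {
                return new File(dir);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String libPath = System.getProperty("java.library.path", "");
        // Maps "ICTCLAS50" to the platform file name, e.g. ICTCLAS50.dll on Windows.
        String fileName = System.mapLibraryName("ICTCLAS50");
        File hit = findOnPath(libPath, fileName);
        System.out.println("java.library.path = " + libPath);
        System.out.println(hit == null
                ? fileName + " not found on java.library.path"
                : fileName + " found in " + hit.getAbsolutePath());
    }
}
```

If the file is reported as missing, either one of the listed directories needs to contain it or the library has to be loaded by absolute path.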
Solution:
One approach is to modify the ICTCLAS50 class and the test class so that they point to the new location of the library file.
    // In the ICTCLAS50 class (requires: import java.io.File;)
    static
    {
        // Absolute path of the DLL inside the configure folder of the project
        String path = new File("").getAbsolutePath() + "\\configure\\ICTCLAS50.dll";
        // System.loadLibrary("ICTCLAS50");  // original call, searches java.library.path only
        System.load(path);
    }
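A slightly more defensive variant of the same idea is sketched below: it builds the absolute path with `File` instead of hard-coded `\\` separators and checks that the file exists before calling `System.load`, so a wrong location produces a readable message. `NativeLoader`, `resolve`, and `load` are hypothetical names for illustration; the configure folder and DLL name are taken from the snippet above.

```java
import java.io.File;

// Sketch: load a native library from <baseDir>/configure by absolute path.
final class NativeLoader {

    // Builds the absolute path <baseDir>/configure/<libFileName>.
    static File resolve(File baseDir, String libFileName) {
        // new File("") is resolved against the working directory here.
        File base = baseDir.getAbsoluteFile();
        return new File(new File(base, "configure"), libFileName);
    }

    // Loads the library, failing early with a clear message if it is missing.
    static void load(File baseDir, String libFileName) {
        File lib = resolve(baseDir, libFileName);
        if (!lib.isFile()) {
            throw new IllegalStateException("Native library not found: " + lib);
        }
        System.load(lib.getPath()); // System.load requires an absolute path
    }

    public static void main(String[] args) {
        // Print where the library is expected, relative to the working directory.
        System.out.println(resolve(new File(""), "ICTCLAS50.dll"));
    }
}
```

In the static initializer, the body could then be reduced to a single call such as `NativeLoader.load(new File(""), "ICTCLAS50.dll");`.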
Then change the value of argu in the testICTCLAS_ParagraphProcess() method of the TestMain class, to tell ICTCLAS that its files have moved to a different directory.
Part of the code is as follows:
    ICTCLAS50 testICTCLAS50 = new ICTCLAS50();
    // String argu = ".";  // original value: the project root directory
    String argu = new File("").getAbsolutePath() + "\\configure";
    // Initialize
    if (testICTCLAS50.ICTCLAS_Init(argu.getBytes("GB2312")) == false)
    {
        System.out.println("Init fail!");
        return;
    }
The testICTCLAS_FileProcess() method needs the same change.
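The initialization call receives the directory as GB2312-encoded bytes rather than as a String, presumably because the native library expects that encoding. The sketch below isolates this encoding step. `InitArgument` and `encode` are hypothetical names for illustration, and the fallback to the platform default charset is an assumption added here, not something the ICTCLAS sample does.

```java
import java.nio.charset.Charset;

// Sketch: encode the configuration directory path for the native init call.
final class InitArgument {

    // Uses the named charset if the running JVM supports it,
    // otherwise falls back to the platform default (an assumption).
    static byte[] encode(String path, String charsetName) {
        Charset cs = Charset.isSupported(charsetName)
                ? Charset.forName(charsetName)
                : Charset.defaultCharset();
        // getBytes(Charset) does not throw a checked exception,
        // unlike getBytes(String).
        return path.getBytes(cs);
    }

    public static void main(String[] args) {
        byte[] arg = encode("C:\\project\\configure", "GB2312");
        System.out.println(arg.length + " bytes");
    }
}
```

For a path made only of ASCII characters the bytes are the same in GB2312 and in most default charsets. The encoding starts to matter when the project directory contains Chinese characters.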
The modified project directory:
How to use ICTCLAS5.0_JNI, the Chinese Academy of Sciences word segmenter