Original: http://blog.csdn.net/k21325/article/details/53052855
Abstract: To solve the problem of Chinese search, I first used SCWS, an open-source word segmenter with a PHP version, but when it processed person and place names it truncated person names incorrectly. I then switched to NLPIR, whose segmentation accuracy is better than that of SCWS. This article describes how to build the Java version under Windows and generate an executable jar file.
NLPIR download page:
http://ictclas.nlpir.org/downloads
GitHub address:
https://github.com/NLPIR-team/NLPIR
The two versions differ, so this article explains how to build a project from each of them with Eclipse.
First, the official NLPIR release
After downloading, open the bin directory. NLPIR_WinDemo.exe in that directory is the NLPIR demo program; run it to get a feel for what NLPIR can do.
The project source code is in the sample directory, which contains examples in C, C++, Hadoop, Java, Python, and other languages.
Create a new project in Eclipse by importing the Java project directory JnaTest_NLPIR:
(1) In Eclipse, choose File -> Import.
(2) Select the path where JnaTest_NLPIR is located and click Finish.
(3) Check the imported project in Eclipse.
(4) The NlpirTest.java file contains the main function, and the following statement loads the library file that NLPIR requires.
The CLibrary interface is declared inside NlpirTest.java:
    // Load the NLPIR native library through JNA (path without file extension).
    CLibrary Instance = (CLibrary) Native.loadLibrary("H:\\workspace\\ictclas\\1\\ictclas2015\\lib\\win64\\nlpir", CLibrary.class);
Native.loadLibrary must be given the location of the library file. The download ships libraries for several platforms; this project needs the win64 library, which is in the lib\win64 folder.
(5) Set the path of the folder that holds the segmenter's Data folder
    // Parent folder of the Data folder.
    String argu = "H:\\workspace\\ictclas\\1\\ictclas2015";
    String system_charset = "UTF-8";
    // Encoding code that matches UTF-8.
    int charset_type = 1;
    int init_flag = CLibrary.Instance.NLPIR_Init(argu, charset_type, "0");
H:\\workspace\\ictclas\\1\\ictclas2015 is the parent folder of the Data folder, not the Data folder itself.
Once this step is complete, you can debug the code. For the API, see the manual.
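Because the path passed to NLPIR_Init must be the parent of the Data folder, it can help to check that layout before initializing. The following is a minimal sketch using only the Java standard library; the class name DataDirCheck and the method hasDataFolder are hypothetical and are not part of the NLPIR sample code.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the NLPIR samples.
final class DataDirCheck {

    // Returns true if 'root' directly contains a folder named "Data".
    static boolean hasDataFolder(String root) {
        Path data = Paths.get(root, "Data");
        return Files.isDirectory(data);
    }

    public static void main(String[] args) {
        // Use the first argument as the candidate root, or the current directory.
        String argu = args.length > 0 ? args[0] : ".";
        System.out.println(argu + " contains a Data folder: " + hasDataFolder(argu));
    }
}
```

If the check returns false, the path most likely points at the Data folder itself or at an unrelated directory, and initialization would be expected to fail.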
Second, the code downloaded from GitHub
The repository contains the NLPIR SDK directory, and each subdirectory in it is a component provided by NLPIR. The NLPIR-ICTCLAS directory contains the code for the word segmentation component.
Import the Ictclas_java project into Eclipse.
The project does not include a main function; you can add one to the NlpirTest.java file.
    public class NlpirTest {
        // Entry point added by hand; the GitHub project does not ship one.
        public static void main(String[] args) throws Exception {
            NlpirTest t = new NlpirTest();
            t.testParticiple();
        }

        public void testParticiple() throws IOException {
            .....
        }
        .......
    }
Unlike the version from the official website, this code determines the system type automatically when loading the library and looks for the library file under the project's current directory. The win32, win64, linux32, and linux64 folders in the project's current directory contain the library files.
It also automatically loads the Data folder under the project's current directory as the segmenter's data directory. With these directories in place, you can start debugging.
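The automatic platform detection described above can be illustrated with a short sketch. This is not the code from the GitHub project; it only shows one plausible way to map the JVM's os.name and os.arch properties to the four folder names. The class NativeFolder and the method folderFor are hypothetical names.

```java
import java.util.Locale;

// Hypothetical helper illustrating platform-based folder selection.
final class NativeFolder {

    // Maps an OS name and architecture string to win32, win64, linux32 or linux64.
    static String folderFor(String osName, String osArch) {
        String os = osName.toLowerCase(Locale.ROOT);
        // Assumption: any architecture string containing "64" is treated as 64-bit.
        boolean is64 = osArch.contains("64");
        if (os.startsWith("windows")) {
            return is64 ? "win64" : "win32";
        }
        if (os.startsWith("linux")) {
            return is64 ? "linux64" : "linux32";
        }
        throw new IllegalArgumentException("Unsupported platform: " + osName + " / " + osArch);
    }

    public static void main(String[] args) {
        // os.arch describes the running JVM, which is what matters for native libraries.
        String folder = folderFor(System.getProperty("os.name"), System.getProperty("os.arch"));
        System.out.println("Library folder: " + folder);
    }
}
```

Note that os.arch reflects the bitness of the JVM rather than of the operating system, so a 32-bit JVM on 64-bit Windows would select win32, which is the library such a JVM can actually load.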
Third, the keyword extraction component KeyExtract on GitHub
The component ships with a Java version of the sample code; import that project with Eclipse.
As before, add a main function, this time in the KeyExtractor.java file. The first parameter of KeyExtract_GetKeyWords is the text from which keywords are to be extracted, and the second parameter is the maximum number of keywords to return.
    public static void main(String[] args) {
        // args[0] is the text to analyze; Max is the maximum number of keywords.
        String keywordsStr = CLibraryKeyExtractor.instance.KeyExtract_GetKeyWords(args[0], Max, true);
        System.out.println(keywordsStr);
        // Release the extractor's resources before exiting.
        CLibraryKeyExtractor.instance.KeyExtract_Exit();
    }
The project's current folder contains a Data directory, which holds the dictionary data that word segmentation and keyword extraction need. The license must be placed in this folder. If you are not sure which .user file is required, it is recommended, for testing, to put all of the files into the Data folder of the current project directory.
Once these settings are complete, pass the program arguments in Eclipse through the menu item Run -> Run Configurations.
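Since the main function reads the text from args[0], running it without arguments fails with an ArrayIndexOutOfBoundsException. The sketch below shows one way to validate the arguments before calling the extractor. The class KeywordArgs, the optional second argument, and the default of 10 keywords are all assumptions made for illustration; none of them come from the NLPIR sample.

```java
// Hypothetical argument holder for a keyword-extraction main function.
final class KeywordArgs {
    // Assumed default; the original sample does not define one.
    static final int DEFAULT_MAX = 10;

    final String text;
    final int maxKeywords;

    private KeywordArgs(String text, int maxKeywords) {
        this.text = text;
        this.maxKeywords = maxKeywords;
    }

    // Parses "<text> [maxKeywords]" and rejects missing or invalid values.
    static KeywordArgs parse(String[] args) {
        if (args.length == 0 || args[0].trim().isEmpty()) {
            throw new IllegalArgumentException("usage: <text> [maxKeywords]");
        }
        int max = DEFAULT_MAX;
        if (args.length > 1) {
            // NumberFormatException is a subclass of IllegalArgumentException.
            max = Integer.parseInt(args[1]);
            if (max <= 0) {
                throw new IllegalArgumentException("maxKeywords must be positive");
            }
        }
        return new KeywordArgs(args[0], max);
    }

    public static void main(String[] args) {
        KeywordArgs parsed = parse(args);
        System.out.println("text=" + parsed.text + ", maxKeywords=" + parsed.maxKeywords);
    }
}
```

The parsed values could then be handed to KeyExtract_GetKeyWords as its first and second arguments.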
Fourth, exporting the jar
In the Eclipse project tree, right-click the project and choose Export.
Select Runnable JAR file to generate the jar file.
You can then run the jar from cmd with java -jar, passing the text as an argument.