Calling the Chinese Academy of Sciences NLPIR (ICTCLAS2015) from Python: A Detailed Guide, by Chao Liu (lch614730@163.com)


Nanjing University School of Computer Science and Engineering

RUIXIA_NUSTM Research Group, Chao Liu

-------------------------

This guide covers:

1. NLPIR version and download

2. Code issues

3. UserDict usage problems

4. Other issues

-------------------------

Friendly tip: SWIG problems are not covered here, so you will need to resolve them yourself. Start by downloading SWIG. SWIG binds DLL or .so files written in C or C++ to multiple languages, including Python. On Windows, extract the installation package to a directory and add that directory to the PATH environment variable (you can also invoke swig by its full path). To check the setup, open a command-line window and type swig. If the message "Must specify an input file. Use -help for available options." appears, everything is working.
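The PATH check described above can also be scripted. The following is a minimal sketch using only the Python standard library; the helper names `find_tool` and `swig_version_text` are made up for this illustration, and it assumes Python 3.7 or later.

```python
import shutil
import subprocess


def find_tool(name="swig"):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


def swig_version_text(path):
    """Run `<path> -version` and return whatever the tool prints."""
    result = subprocess.run([path, "-version"], capture_output=True, text=True)
    return (result.stdout + result.stderr).strip()


if __name__ == "__main__":
    swig_path = find_tool("swig")
    if swig_path is None:
        print("swig was not found on PATH; add its directory to PATH first")
    else:
        print("found swig at", swig_path)
        print(swig_version_text(swig_path))
```

If `find_tool` returns `None`, the directory containing swig has not been added to PATH (or the command-line window was opened before PATH was changed).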

1. NLPIR (ICTCLAS2015) version and download: http://ictclas.nlpir.org/downloads. The directory of the downloaded package is as follows:

The "bundle" folder (a folder I created myself; only the contents of this folder are needed) should contain the following:

Where: the bin folder (create it yourself) holds the files unzipped from the zip archive in the ImportUserDict folder. The Data folder is a copy of the Data folder in the parent directory. nlpir, __init__.py, and nlpir.py are the files under the Sample\pythonsample folder.
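Before wiring the bundle into a project, it can help to confirm that nothing was left out. The sketch below checks only the entries named in the text above (bin, Data, __init__.py, nlpir.py); the function name `missing_bundle_entries` is hypothetical, and a real bundle may need further files.

```python
import os

# Entries the text above lists for the bundle folder.
EXPECTED_ENTRIES = ("bin", "Data", "__init__.py", "nlpir.py")


def missing_bundle_entries(bundle_dir, expected=EXPECTED_ENTRIES):
    """Return the expected names that do not exist inside bundle_dir."""
    return [
        name
        for name in expected
        if not os.path.exists(os.path.join(bundle_dir, name))
    ]
```

An empty list means every listed entry is present; otherwise the returned names show what still has to be copied.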

2. Code issues

A. Add all the contents of the "bundle" folder to your own project, and modify the path to the DLL in the nlpir.py code (circled in red in the screenshot). Whether to use the 32-bit or 64-bit library is determined by your python.exe build, not by your operating system.
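Because the choice depends on the interpreter and not on the operating system, the bitness can be read from Python itself. This is a minimal sketch: `struct.calcsize("P")` gives the pointer size of the running interpreter, while the library file names `NLPIR32.dll` and `NLPIR64.dll` are assumptions for illustration and should be checked against the files in your own download.

```python
import struct


def python_bitness():
    """Return 32 or 64 depending on the running interpreter's pointer size."""
    return struct.calcsize("P") * 8


def choose_library(bitness, lib32="NLPIR32.dll", lib64="NLPIR64.dll"):
    """Pick a library file name for the given bitness.

    The default file names are placeholders, not confirmed by the article.
    """
    if bitness == 64:
        return lib64
    if bitness == 32:
        return lib32
    raise ValueError("unexpected bitness: %r" % (bitness,))


if __name__ == "__main__":
    bits = python_bitness()
    print("python.exe is %d-bit, so load %s" % (bits, choose_library(bits)))
```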

B. To make word segmentation, part-of-speech tagging, and the choice of separator easier to program against, the Seg function in the nlpir.py code can be rewritten as:

------------------------------------------------------------------------------------------

'''
Chao Liu (njust nustm ruixia)
'''
def nlpir_seg_pos(paragraph, flag=True, echo='/'):
    # NLPIR word segmentation.
    # paragraph: input string; flag: whether to tag part of speech;
    # echo: separator placed between a word and its part-of-speech tag.
    para_seg_pos = ''
    atoms = segment(paragraph)  # segment() is defined in nlpir.py
    for a in atoms:
        if len(a.sPOS) < 1:
            continue
        i = paragraph[a.start:a.start + a.length]  # .decode('utf-8')  # .encode('ascii')
        # yield (i, a.sPOS)
        if not flag:
            para_seg_pos = para_seg_pos + (str(i) + ' ')
        else:
            para_seg_pos = para_seg_pos + (str(i) + echo + a.sPOS + ' ')

    return para_seg_pos.rstrip()

-----------------------------------------------------------------------------------------
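The function above returns one string of space-separated tokens, each written as word, separator, tag. A hypothetical helper such as `parse_seg_pos` below can split that string back into pairs; it is a sketch that assumes words contain no spaces, and the sample tags in the usage note are illustrative, not real NLPIR output.

```python
def parse_seg_pos(text, echo="/"):
    """Split 'word/tag word/tag ...' back into (word, tag) pairs.

    rpartition splits on the LAST separator, so a token whose word is the
    separator itself (for example '//w') is still parsed correctly.
    """
    pairs = []
    for token in text.split():
        word, sep, tag = token.rpartition(echo)
        if not sep:
            # No separator found: the text was produced with flag=False.
            word, tag = token, ""
        pairs.append((word, tag))
    return pairs
```

For example, `parse_seg_pos("我/r 爱/v 北京/ns")` yields three pairs, one per token, and text produced with `flag=False` yields pairs whose tag is an empty string.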

3. UserDict (user dictionary) import. The downloaded zip package contains an ImportUserDict folder with a Readme.txt file that describes the procedure:

The attached tool imports a user dictionary offline. The steps are as follows:
1. Create a bin directory at the same level as the segmenter's Data folder, and create a second-level directory ICTCLAS2014 beneath it;
2. Unzip the attachment and place its contents under ICTCLAS2014;
3. Edit Userdic.txt under bin/ICTCLAS2014; this file holds the user dictionary entries and their annotations;
4. Run the batch file under bin/ICTCLAS2014. This imports the user dictionary into Field.pdat and Field.pos in the Data directory.
The import takes longer as the number of entries grows; 53,000 entries may take about 2 hours.
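Step 1 of the Readme can be sketched in Python as follows. The function name `prepare_import_dir` is hypothetical; the sketch only creates the bin/ICTCLAS2014 directory next to an existing Data folder and leaves unzipping, editing the dictionary file, and running the batch file to be done by hand.

```python
import os


def prepare_import_dir(root):
    """Create <root>/bin/ICTCLAS2014 beside <root>/Data and return its path."""
    data_dir = os.path.join(root, "Data")
    if not os.path.isdir(data_dir):
        raise FileNotFoundError("expected a Data folder in " + root)
    target = os.path.join(root, "bin", "ICTCLAS2014")
    # exist_ok=True makes the call safe to repeat.
    os.makedirs(target, exist_ok=True)
    return target
```

Calling it twice on the same root is harmless, and it raises an error early if the Data folder is not where the Readme expects it.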

4. Other issues

If Python code in one folder needs to call a Python program in another folder, add an __init__.py file to every level of that directory path; the file does not need to contain any code.

Then, at the start of your script, add the directory to the Python system path and import the module, for example:

import sys

sys.path.append('libsvm-3.20/python')  # either an absolute or a relative path works

from svmutil import *
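The two points in this section can be tried together in a throwaway directory. The sketch below builds a two-level package with an empty __init__.py at each level, appends the parent directory to `sys.path`, and imports a module from it; every directory, module, and function name in it is invented for the demonstration.

```python
import importlib
import os
import sys
import tempfile


def demo_package_import():
    """Build a tiny nested package on disk, import it, and call into it."""
    root = tempfile.mkdtemp()
    top_dir = os.path.join(root, "demo_pkg_nlpir_guide")
    sub_dir = os.path.join(top_dir, "tools")
    os.makedirs(sub_dir)

    # An empty __init__.py at every directory level of the package.
    for directory in (top_dir, sub_dir):
        open(os.path.join(directory, "__init__.py"), "w").close()

    with open(os.path.join(sub_dir, "hello.py"), "w") as handle:
        handle.write("def greet():\n    return 'hello from tools'\n")

    # Make the parent directory importable, as in the example above.
    sys.path.append(root)
    importlib.invalidate_caches()

    module = importlib.import_module("demo_pkg_nlpir_guide.tools.hello")
    return module.greet()


if __name__ == "__main__":
    print(demo_package_import())
```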
