Installing pymmseg, a Chinese word segmenter for Python

Source: Internet
Author: User

I recently wrote a crawler project in Python and came to appreciate how powerful the language is. Along the way I tried Python for text processing and needed Chinese word segmentation, so I am recording how to install and use pymmseg here as a memo.

The pymmseg project is hosted at https://code.google.com/p/pymmseg-cpp/downloads/list

Download the source package and compile it yourself; this avoids incompatibility problems. I chose pymmseg-cpp-src-1.0.2.tar.gz. The installation process on Windows and Linux is as follows:

pymmseg installation on 64-bit Windows 7:

After installing VS2008, I used its compiler. In the Visual Studio entry of the Start Menu, open "Visual Studio Tools" / "Visual Studio 2008 x64 Win64 Command Prompt". When this command prompt starts, the environment variables for the compiler and linker are configured automatically, so commands such as cl and link can be used directly. (On a 64-bit machine, build with the 64-bit compiler; a build from the 32-bit compiler may fail when the DLL is loaded on a 64-bit system. Likewise, on a 32-bit machine, use the 32-bit command prompt.)

3. In the command prompt window opened above, change to the extracted directory (pymmseg-cpp in my case), then enter its mmseg-cpp subdirectory and run:
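When choosing between the 32-bit and 64-bit command prompt, it can help to confirm the bitness of the Python interpreter itself, since a compiled library has to match the process that loads it. The following is a minimal sketch using only the standard library; the function name `interpreter_bits` is illustrative and not part of pymmseg.

```python
import struct
import sys

def interpreter_bits():
    """Return the pointer width of the running interpreter in bits (32 or 64)."""
    # A C pointer ("P") is 4 bytes on 32-bit builds and 8 bytes on 64-bit builds.
    return struct.calcsize("P") * 8

print("Python %s is a %d-bit build" % (sys.version.split()[0], interpreter_bits()))
```

If this reports 64, the x64 command prompt is the one to use; if it reports 32, use the 32-bit prompt.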

 python build.py

mmseg is then compiled in the command prompt window (screenshot of the compilation output omitted).

4. Copy the entire pymmseg-cpp directory into $PYTHON_HOME/Lib/site-packages and rename it pymmseg.

5. Test availability:

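  # -*- coding: utf-8 -*-
  from pymmseg import mmseg

  mmseg.dict_load_defaults()   # load the default dictionaries
  text = '...'                 # UTF-8 Chinese sample text (the original string is not preserved)
  algor = mmseg.Algorithm(text)
  for tok in algor:
      print('%s [%d..%d]' % (tok.text, tok.start, tok.end))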

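The execution result is as follows (only the start and end offsets of each token survive here; the token text is missing):

  [0..6] [6..12] [13..19] [19..25] [25..31] [31..34] [34..40] [40..44] [45..49] [49..55] [55..58] [58..64] [64..67] [67..69] [70..76] [76..82] [82..86] [86..92] [92..98] [98..104] [104..106]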

pymmseg is now fully working on 64-bit Windows 7.
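The start and end values printed by the test appear to be byte offsets into the UTF-8 encoded input rather than character positions; most spans are multiples of 3, the usual UTF-8 size of a Chinese character. Assuming that interpretation, a span can be turned back into text by slicing the encoded bytes. The sample string, the spans, and the helper name `slice_token` below are illustrative and do not come from the original test.

```python
# -*- coding: utf-8 -*-

def slice_token(data, start, end):
    """Decode the token covering bytes [start, end) of UTF-8 encoded data."""
    return data[start:end].decode("utf-8")

# Illustrative sample: two 2-character words separated by an ASCII space.
text = u"中文 分词"
data = text.encode("utf-8")   # 6 bytes + 1 byte + 6 bytes = 13 bytes

# Hypothetical spans in the same (start, end) shape as the test output.
spans = [(0, 6), (7, 13)]
tokens = [slice_token(data, s, e) for s, e in spans]
print(tokens)
```

Under this reading, a gap between one token's end and the next token's start (such as 12 to 13 in the output) would correspond to a skipped single-byte character such as a space.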

pymmseg installation on 64-bit CentOS 6.4:

1. Make sure that gcc and g++ are installed. If the gcc and g++ commands are not available, install them by running:

 yum -y install gcc-c++

2. Download the same source package as on Windows and extract it. Change to the extracted directory, enter the mmseg-cpp subdirectory, and run:

python build.py

The whole build runs in the terminal (screenshot of the output omitted).

Copy the compiled pymmseg-cpp directory into Python's site-packages directory and rename it pymmseg. I compiled and installed Python 2.7.5 myself, so the library directory is /usr/local/lib/python2.7/.

Test whether the installation succeeded, for example by rerunning the test from the Windows section.
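The copy-and-rename step can also be scripted so that it works the same way on Windows and Linux. The sketch below asks the running interpreter for its site-packages directory instead of hard-coding a path; the function name `install_package` and its parameters are hypothetical helpers for illustration, not part of pymmseg.

```python
import os
import shutil
import sysconfig

def install_package(src_dir, name="pymmseg", site_packages=None):
    """Copy src_dir into site-packages under the given name; return the new path."""
    if site_packages is None:
        # "platlib" is the site-packages directory for packages with compiled code.
        site_packages = sysconfig.get_paths()["platlib"]
    dest = os.path.join(site_packages, name)
    if os.path.exists(dest):
        # Refuse to overwrite an existing installation.
        raise OSError("destination already exists: %s" % dest)
    shutil.copytree(src_dir, dest)
    return dest
```

For example, `install_package("/path/to/pymmseg-cpp")` would copy the built directory to a `pymmseg` folder under site-packages. Writing there may need administrator or root privileges.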
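A quick first check, before running a full segmentation test, is whether the package can be imported at all. This is a minimal sketch; the helper name `can_import` is illustrative. It also catches OSError on the assumption that a shared library that fails to load (for example because of a 32-bit/64-bit mismatch) may surface that way rather than as ImportError.

```python
import importlib

def can_import(module_name):
    """Return True if module_name can be imported by this interpreter."""
    try:
        importlib.import_module(module_name)
        return True
    except (ImportError, OSError):
        # ImportError: package not found on sys.path.
        # OSError: assumed here to cover failures to load a compiled library.
        return False

print("pymmseg importable: %s" % can_import("pymmseg"))
```

If this prints False, check that the directory was copied into the site-packages of the same interpreter that runs the script.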

pymmseg is now installed on both Windows and Linux and is ready for use.
