Chinese word segmentation with the R language

Source: Internet
Author: User

This article shows two ways to segment Chinese text in R: the Rwordseg package and the jiebaR package.

Environment configuration for the R language (Windows environment variables):

R_PATH:

C:\Program Files\R\R-3.1.2

PATH:

%R_PATH%

First: Chinese word segmentation with the Rwordseg package

(1) Configure the environment variables for Java:

JAVA_HOME:

C:\Program Files\Java\jdk1.8.0_31

PATH:

%JAVA_HOME%\bin;%JAVA_HOME%\jre\bin

CLASSPATH:

%JAVA_HOME%\lib\dt.jar;%JAVA_HOME%\lib\tools.jar
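Before moving on, it can help to confirm from inside R that the variables above are actually visible to the R process. The following is a minimal sketch using only base R; the helper name check_home is hypothetical and not part of any package.

```r
# Hypothetical helper: report whether an environment variable is set and
# whether the directory it names exists.
check_home <- function(var) {
  value <- Sys.getenv(var, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    return(sprintf("%s is not set", var))
  }
  # file.info()$isdir is NA when the path does not exist.
  if (!isTRUE(file.info(value)$isdir)) {
    return(sprintf("%s is set to '%s', but that directory does not exist", var, value))
  }
  sprintf("%s looks fine: %s", var, value)
}

print(check_home("JAVA_HOME"))
print(check_home("R_PATH"))

# Sys.which() returns "" when the executable is not found on PATH.
java_exe <- Sys.which("java")
if (nzchar(java_exe)) {
  cat("java found at", java_exe, "\n")
} else {
  cat("java is not on PATH\n")
}
```

If a variable was changed while R was running, restart R so that the new value is picked up.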


(2) Download the Rwordseg package to the local hard drive. The current version of the package is available at https://r-forge.r-project.org/R/?group_id=1054

1> install.packages("rJava")
2> Add the following paths to the PATH environment variable:

  • %JAVA_HOME%\jre\bin
  • %JAVA_HOME%\jre\bin\server
  • %R_PATH%\library\rJava\jri

3> install.packages("<folder containing the downloaded Rwordseg package>/Rwordseg_0.2-1.zip", repos = NULL, type = "source")

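Rwordseg depends on rJava, so it is worth checking that rJava can start a Java virtual machine before loading Rwordseg. The sketch below assumes rJava is installed; the function name java_version_via_rjava is hypothetical. It uses rJava's .jinit() and .jcall() to ask the JVM for its version and returns NA when rJava or the JVM is unavailable.

```r
# Hypothetical helper: return the Java version seen by rJava, or NA if
# rJava cannot be loaded or the JVM cannot be started.
java_version_via_rjava <- function() {
  if (!requireNamespace("rJava", quietly = TRUE)) {
    return(NA_character_)
  }
  tryCatch({
    rJava::.jinit()  # start the JVM inside the R process
    rJava::.jcall("java/lang/System", "S", "getProperty", "java.version")
  }, error = function(e) NA_character_)
}

java_version <- java_version_via_rjava()
if (is.na(java_version)) {
  cat("rJava could not start a JVM; recheck JAVA_HOME and PATH\n")
} else {
  cat("rJava is using Java", java_version, "\n")
}
```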
(3) Enter the following commands:

> library("rJava")
> library("Rwordseg")
> words = "Sanitation workers are dismissed for warming in the cold wind"  # Chinese sentence in the original; shown in English translation
> segment.options(isNameRecognition = TRUE)  # enable personal-name recognition
> segmentCN(words)

Result:

[1] "sanitation" "work" "because" "in" "Cold Wind" "in" "Warm Fire" "heating" "was" "dismissed"

Change the input to words = "My name is R language"

Result: [1] "I" "Name" "is" "R language"

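Second: Chinese word segmentation with the jiebaR package

(1) Enter the following commands:

> install.packages("jiebaR")  # install the jiebaR package
> library("jiebaRD")          # load the jiebaRD package
> library("jiebaR")
> words = "Sanitation workers are dismissed for warming in the cold wind"  # Chinese sentence in the original; shown in English translation
> test = worker()             # create a segmentation worker with default settings
> test <= words               # jiebaR overloads <= to run segmentation; same as segment(words, test)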

(2) Output:

[1] "sanitation workers" "because in" "Cold Wind" "in" "Warm Fire" "heating" "was" "dismissed"

Change the input to words = "My name is R language"

Result: [1] "I" "Name" "is" "R" "Language"
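The two packages split the first sentence differently. As a rough illustration, the sketch below compares the two token lists with base R only. The vectors are copied from the results shown in this article (English renderings of the tokens), so they are assumptions standing in for real package output; in practice they would come from segmentCN(words) and test <= words.

```r
# Tokens copied verbatim from the two results above. They are the English
# renderings shown in this article, not live package output.
rwordseg_tokens <- c("sanitation", "work", "because", "in", "Cold Wind",
                     "in", "Warm Fire", "heating", "was", "dismissed")
jiebar_tokens <- c("sanitation workers", "because in", "Cold Wind", "in",
                   "Warm Fire", "heating", "was", "dismissed")

# Number of tokens each segmenter produced.
cat("Rwordseg:", length(rwordseg_tokens), "tokens\n")
cat("jiebaR:  ", length(jiebar_tokens), "tokens\n")

# Tokens that appear in only one of the two results.
only_rwordseg <- setdiff(rwordseg_tokens, jiebar_tokens)
only_jiebar <- setdiff(jiebar_tokens, rwordseg_tokens)
print(only_rwordseg)
print(only_jiebar)
```

On these lists, Rwordseg produces the finer split (10 tokens against 8), and the tokens unique to each side show where the two segmenters placed word boundaries differently.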
