Testing word segmentation results with the bakeoff2005 scoring script on Windows

Word segmenters are usually evaluated with the bakeoff2005 scoring script, but that script is written to run on Linux. How can it be used on Windows? The steps below assume you already have the icwb2-data archive and have extracted it.

First, install a Perl environment:

https://dwimperl.googlecode.com/files/dwimperl-5.14.2.1-v7-32bit.exe

Next, install the diff tool:

http://superb-dca3.dl.sourceforge.net/project/gnuwin32/diffutils/2.8.7-1/diffutils-2.8.7-1-bin.zip

Unzip the diff tool into the E:\diffutils directory and add the E:\diffutils\bin directory to the system's PATH environment variable.
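Before editing the script, it may help to confirm that both tools can be found from the command line. The following is a minimal sketch, assuming Python 3 is available (the article itself does not require it). The function name find_tools is made up for this illustration.

```python
import shutil


def find_tools(names=("perl", "diff")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Print one line per tool so a missing PATH entry is easy to spot.
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If diff is reported as not found, the E:\diffutils\bin entry is probably missing from PATH, or the command prompt was opened before the variable was changed.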

Next, modify the icwb2-data/scripts/score script.

Change the line that sets $diff to:

$diff = "E:/diffutils/bin/diff";

Change lines 52 and 53 to the following (make sure the d:/tmp directory exists):

$tmp1 = "d:/tmp/comp01$$";

$tmp2 = "d:/tmp/comp02$$";
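The three edits above can also be applied programmatically. The sketch below assumes Python 3 and assumes that each of $diff, $tmp1 and $tmp2 is assigned at the start of its own line in the script. The helper patch_score_script is hypothetical and only rewrites text; reading and writing the file is left to the caller.

```python
import re


def patch_score_script(text, diff_path="E:/diffutils/bin/diff", tmp_dir="d:/tmp"):
    """Return the script text with the $diff, $tmp1 and $tmp2 assignments replaced."""
    replacements = {
        "diff": f'$diff = "{diff_path}";',
        "tmp1": f'$tmp1 = "{tmp_dir}/comp01$$";',
        "tmp2": f'$tmp2 = "{tmp_dir}/comp02$$";',
    }
    for var, new_line in replacements.items():
        # Match a whole line that starts with an assignment to the variable.
        pattern = re.compile(rf'^[ \t]*\${var}\s*=.*$', re.MULTILINE)
        # A function replacement avoids backslash processing in the new text.
        text = pattern.sub(lambda match, line=new_line: line, text)
    return text


# Possible usage (paths are the ones used in this article):
#   with open("E:/icwb2-data/scripts/score", encoding="utf-8") as f:
#       patched = patch_score_script(f.read())
#   with open("E:/icwb2-data/scripts/score", "w", encoding="utf-8") as f:
#       f.write(patched)
```

Every line that assigns to one of these variables is replaced, so it is worth keeping a backup copy of the script and checking the result by eye.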

Next, you can execute the test command:

Open a command prompt in the E:\icwb2-data directory and run the following command (all on one line):

E:\icwb2-data>perl scripts/score gold/pku_training_words.utf8 gold/pku_test_gold.utf8 gold/pku_test_gold.utf8 > pku_maxent.score
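The command passes three files to the script in order: the training word list, the gold-standard segmentation, and the segmentation to be scored. The same call can be sketched from Python 3 (an assumption, not something the article uses) when many result files need scoring. The names build_score_command and run_score are invented for this example.

```python
import subprocess


def build_score_command(dictionary, gold, segmented, script="scripts/score"):
    """Build the argument list: perl <script> <word list> <gold file> <file to score>."""
    return ["perl", script, dictionary, gold, segmented]


def run_score(dictionary, gold, segmented, out_path, cwd="."):
    """Run the score script from directory cwd and save its standard output to out_path."""
    # out_path is resolved against Python's own working directory, not cwd.
    with open(out_path, "wb") as out:
        return subprocess.run(
            build_score_command(dictionary, gold, segmented),
            stdout=out,
            cwd=cwd,
            check=True,  # raise CalledProcessError if perl exits with an error
        )


# Possible usage, mirroring the command in this article:
#   run_score("gold/pku_training_words.utf8", "gold/pku_test_gold.utf8",
#             "gold/pku_test_gold.utf8", "E:/icwb2-data/pku_maxent.score",
#             cwd="E:/icwb2-data")
```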

The command takes a while to finish.

After the command completes, the pku_maxent.score file is generated in the E:\icwb2-data directory. The end of the file looks like this:

INSERTIONS: 0
DELETIONS: 0
SUBSTITUTIONS: 0
NCHANGE: 0
NTRUTH: 27
NTEST: 27
TRUE WORDS RECALL: 1.000
TEST WORDS PRECISION: 1.000
=== SUMMARY:
=== TOTAL INSERTIONS: 0
=== TOTAL DELETIONS: 0
=== TOTAL SUBSTITUTIONS: 0
=== TOTAL NCHANGE: 0
=== TOTAL TRUE WORD COUNT: 104372
=== TOTAL TEST WORD COUNT: 104372
=== TOTAL TRUE WORDS RECALL: 1.000
=== TOTAL TEST WORDS PRECISION: 1.000
=== F MEASURE: 1.000
=== OOV Rate: 0.058
=== OOV Recall Rate: 1.000
=== IV Recall Rate: 1.000
### gold/pku_test_gold.utf8 0 0 0 0 104372 104372 1.000 1.000 1.000 0.058 1.000 1.000
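The final line beginning with ### repeats the summary values in a single row, which is convenient for comparing several runs. The sketch below (Python 3, an assumption) splits such a line on whitespace. The field names are my own labels, inferred from the order of the summary lines above, and the parser assumes the file path contains no spaces.

```python
FIELDS = (
    "insertions", "deletions", "substitutions", "nchange",
    "true_word_count", "test_word_count",
    "recall", "precision", "f_measure",
    "oov_rate", "oov_recall", "iv_recall",
)


def parse_summary_line(line):
    """Parse a '### <file> <12 values>' summary line into a dict."""
    parts = line.split()
    if len(parts) != len(FIELDS) + 2 or parts[0] != "###":
        raise ValueError(f"not a summary line: {line!r}")
    # The first six values are counts, the remaining six are rates.
    values = [int(v) for v in parts[2:8]] + [float(v) for v in parts[8:]]
    result = {"file": parts[1]}
    result.update(zip(FIELDS, values))
    return result
```

For the run shown here, parse_summary_line would report a true_word_count of 104372 and an oov_rate of 0.058.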

Because the gold test file was also passed as the segmentation result in this run, precision, recall, and the other scores are all 100%.
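To see why identical files give a perfect score, here is a minimal sketch of how recall, precision and F-measure can be computed for one sentence. It is written in Python 3 (an assumption) and uses a common span-matching formulation, not the diff-based counting the score script performs, so it illustrates the metrics and does not reimplement the script. The function names are hypothetical.

```python
def to_spans(words):
    """Convert a list of words into a set of (start, end) character offsets."""
    spans, start = set(), 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def score_segmentation(gold_words, test_words):
    """Return (recall, precision, f_measure) for one segmented sentence."""
    gold, test = to_spans(gold_words), to_spans(test_words)
    correct = len(gold & test)  # words whose boundaries match exactly
    recall = correct / len(gold) if gold else 0.0
    precision = correct / len(test) if test else 0.0
    total = recall + precision
    f_measure = 2 * recall * precision / total if total else 0.0
    return recall, precision, f_measure
```

With the gold words 我 / 喜欢 / 北京 and the test words 我 / 喜 / 欢 / 北京, two of the three gold words are recovered, so recall is 2/3, precision is 2/4, and the F-measure is 4/7 (about 0.571). When the two lists are identical, all three values are 1.0, which matches the output above.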


This article is from the "little progress every Day" blog. Please keep this source link when reposting: http://sbp810050504.blog.51cto.com/2799422/1600586
