Use Word Segmentation for Word Frequency Statistics

Source: Internet
Author: User

The org.apdplat.word.WordFrequencyStatistics class in the word segmentation library provides word frequency statistics.

The command-line script is called as follows:

Write the text whose word frequencies you want to count to the file text.txt, then run:

    chmod +x wfs.sh && ./wfs.sh -textFile=text.txt -statisticsResultFile=statistics-result.txt

When the program finishes, open the file statistics-result.txt to view the word frequency statistics.

To call it from a program:

    import java.io.File;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.Arrays;
    import org.apdplat.word.WordFrequencyStatistics;
    import org.apdplat.word.segmentation.SegmentationAlgorithm;

    // Configure the word frequency statistics
    WordFrequencyStatistics wordFrequencyStatistics = new WordFrequencyStatistics();
    wordFrequencyStatistics.setRemoveStopWord(false);
    wordFrequencyStatistics.setResultPath("word-frequency-statistics.txt");
    wordFrequencyStatistics.setSegmentationAlgorithm(SegmentationAlgorithm.MaxNgramScore);
    // Start word segmentation (first sentence)
    wordFrequencyStatistics.seg("...");
    // Output the word frequency statistics
    wordFrequencyStatistics.dump();
    // Prepare the input file
    Files.write(Paths.get("text-to-seg.txt"), Arrays.asList("word segmentation is a distributed Chinese word segmentation component implemented in Java. It provides a variety of dictionary-based word segmentation algorithms and uses the Ngram model to eliminate ambiguity."));
    // Clear the previous statistics
    wordFrequencyStatistics.reset();
    // Segment the file
    wordFrequencyStatistics.seg(new File("text-to-seg.txt"), new File("text-seg-result.txt"));
    // Output the word frequency statistics to a file
    wordFrequencyStatistics.dump("file-seg-statistics-result.txt");
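The calls above follow a simple lifecycle: counts build up with each segmentation call until they are reset. The sketch below mimics only that lifecycle for tokens that have already been segmented. TokenCounter and its methods are hypothetical names chosen for illustration; this is not the library's implementation and it performs no segmentation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT the word library's WordFrequencyStatistics:
// it only mimics the accumulate / reset lifecycle for already-segmented tokens.
class TokenCounter {
    // LinkedHashMap keeps first-seen order, which makes the output predictable.
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    // Add one occurrence for every token; counts accumulate across calls.
    void add(List<String> tokens) {
        for (String token : tokens) {
            counts.merge(token, 1, Integer::sum);
        }
    }

    // Discard everything counted so far.
    void reset() {
        counts.clear();
    }

    // Copy of the current counts, so callers cannot modify internal state.
    Map<String, Integer> counts() {
        return new LinkedHashMap<>(counts);
    }

    public static void main(String[] args) {
        TokenCounter counter = new TokenCounter();
        counter.add(List.of("tomorrow", "rain", "tomorrow"));
        System.out.println(counter.counts()); // {tomorrow=2, rain=1}

        counter.reset();
        counter.add(List.of("word", "segmentation"));
        System.out.println(counter.counts()); // {word=1, segmentation=1}
    }
}
```

Without the reset step, the second batch of tokens would be added on top of the first, which is presumably why the original example clears the statistics before segmenting the file.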

Word frequency statistics for the first sentence:

1. rain 2
2. tomorrow 2
3. molecular 2
4. course 1
5. lecture 1
6. combined with 1
7. atom 1
8. go to 1
9. […] 1
10. about 1
11. and 1
12. but also 1
13. […] 1
14. […] 1
15. […] 1

Word frequency statistics for the second sentence:

1. […] 2
2. […] 2
3. based on 1
4. word 1
5. component 1
6. dictionary 1
7. Ngram 1
8. a variety of 1
9. […] 1
10. and 1
11. […] 1
12. […] 1
13. Chinese word segmentation 1
14. algorithm 1
15. […] 1
16. distributed 1
17. […] 1
18. […] 1
19. model 1
20. lai 1
21. a 1
22. Java 1
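A frequency listing of this kind pairs each word with its count, ordered from most to least frequent. As a rough sketch of how such a ranking can be produced from a list of tokens, the hypothetical FrequencyRanking helper below counts tokens and prints numbered lines. It is not part of the word library, and its tie-breaking rule (first-seen order) is an assumption made for the example; the library's ordering among equal counts may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only, not part of the word library.
class FrequencyRanking {
    // Count tokens, then return "rank. token count" lines, highest count first.
    static List<String> rank(List<String> tokens) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : tokens) {
            counts.merge(token, 1, Integer::sum);
        }

        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // The sort is stable, so tokens with equal counts keep first-seen order
        // (an assumption of this sketch).
        entries.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));

        List<String> lines = new ArrayList<>();
        for (int i = 0; i < entries.size(); i++) {
            Map.Entry<String, Integer> e = entries.get(i);
            lines.add((i + 1) + ". " + e.getKey() + " " + e.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Prints:
        // 1. rain 2
        // 2. tomorrow 2
        // 3. course 1
        rank(List.of("course", "rain", "tomorrow", "rain", "tomorrow"))
                .forEach(System.out::println);
    }
}
```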


