Programmer's 9th Challenge Topic: High-Frequency Phrase Extraction

Source: Internet
Author: User

Faced with a vast ocean of information, it is sometimes hard to find the resources you want. Finding frequently used words in a large body of text is a common problem in information retrieval and data compression.
This challenge invites you to find the M most frequent phrases (each consisting of N words) in a fairly large English text. All entries are run against the same text file. The text contains only English words, spaces, and line breaks. Entries are compared on efficiency, and the most efficient program wins. The text has recently been released.

Program input: M, N, text file path (M does not exceed 20, N does not exceed 8)
Program output: a list of the high-frequency phrases and their counts
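
To make the task concrete, here is a minimal sketch of how the N-word phrases of a single line can be enumerated with a sliding window. The class name `PhraseWindows` and the method `phrasesOf` are illustrative only, and splitting on whitespace is an assumption; the problem statement does not say whether a phrase may span a line break.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PhraseWindows {
    // Returns every run of n consecutive words in the line, joined by single spaces.
    static List<String> phrasesOf(String line, int n) {
        List<String> out = new ArrayList<>();
        String trimmed = line.trim();
        if (trimmed.isEmpty() || n <= 0) {
            return out;
        }
        String[] words = trimmed.split("\\s+");
        // A line of W words yields W - n + 1 phrases (none if W < n).
        for (int i = 0; i + n <= words.length; i++) {
            out.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String p : phrasesOf("following are the results from", 3)) {
            System.out.println(p);
        }
    }
}
```

For the five-word sample line and N = 3, this prints three phrases: "following are the", "are the results", and "the results from".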
  
Contest rules: Submit an executable program that meets the requirements above. Any language may be used.
We will test every contestant's entry fairly in a unified environment
and compare them to find the program that uses the least time overall.

Source program

import java.io.*;
import java.util.*;

// One entry of the top-M table: a phrase and how often it has been seen.
class TT
{
    public String phrase;
    public int count;
}

public class SearchPhrase {

    // Occurrence count of every phrase seen so far. The phrase itself is the key,
    // because distinct phrases can share the same hashCode().
    private static Map<String, Integer> phrase = new HashMap<String, Integer>();
    // The M most frequent phrases so far, sorted by count in descending order.
    static TT[] max_phrase;

    // Splits a line into words on spaces, skipping empty tokens.
    private static List<String> separateString(String s)
    {
        List<String> v = new ArrayList<String>();
        StringBuilder temp = new StringBuilder();
        for (int i = 0; i < s.length(); i++)
        {
            if (s.charAt(i) != ' ')
            {
                temp.append(s.charAt(i));
            }
            else
            {
                if (temp.length() > 0)
                    v.add(temp.toString());
                temp.setLength(0);
            }
        }
        if (temp.length() > 0)
            v.add(temp.toString());
        return v;
    }

    // The entry at pos now has the given count. If that exceeds its predecessor,
    // exchange it with the first entry of the lower-count run just before it,
    // which keeps the table sorted.
    private static void swap(int pos, int count, String phrase)
    {
        int i;
        if (max_phrase[pos - 1].count < count)
        {
            // Walk back to the start of the run of equal counts ending at pos - 1.
            for (i = pos - 1; i > 0; i--)
            {
                if (max_phrase[i - 1].count > max_phrase[i].count)
                    break;
            }
            max_phrase[pos].count = max_phrase[i].count;
            max_phrase[pos].phrase = max_phrase[i].phrase;
            max_phrase[i].count = count;
            max_phrase[i].phrase = phrase;
        }
    }

    // Updates the top-M table after a phrase's count has risen to count.
    private static void adjust_max(int count, String phrase)
    {
        int last = max_phrase.length - 1;
        // Not frequent enough to enter the table.
        if (count <= max_phrase[last].count) return;
        for (int i = last; i >= 0; i--)
        {
            if (max_phrase[i].phrase.equals(phrase))
            {
                // Already in the table: update the count and restore the order.
                max_phrase[i].count = count;
                if (i > 0)
                {
                    swap(i, count, phrase);
                }
                return;
            }
        }
        // Not in the table yet: overwrite the last (least frequent) entry,
        // then move it ahead of any lower-count entries.
        max_phrase[last].count = count;
        max_phrase[last].phrase = phrase;
        if (last > 0)
        {
            swap(last, count, phrase);
        }
    }

    // Counts every n-word phrase in one line of words and updates the table.
    private static void JS(List<String> v, int n)
    {
        for (int i = 0; i < v.size() - n + 1; i++)
        {
            StringBuilder sb = new StringBuilder();
            for (int j = i; j < i + n; j++)
            {
                sb.append(v.get(j)).append(' ');
            }
            String s = sb.toString();
            int count = 1;
            if (phrase.containsKey(s))
            {
                count = phrase.get(s) + 1;
            }
            phrase.put(s, count);
            adjust_max(count, s);
        }
    }

    // Arguments: M, N, text file path.
    public static void main(String[] args) {
        try
        {
            long t;
            int m, n;
            String path;
            m = Integer.parseInt(args[0]);
            n = Integer.parseInt(args[1]);
            path = args[2];
            max_phrase = new TT[m];
            for (int i = 0; i < m; i++)
            {
                max_phrase[i] = new TT();
                max_phrase[i].count = 0;
                max_phrase[i].phrase = "";
            }
            t = new java.util.Date().getTime();
            BufferedReader br = new BufferedReader(new FileReader(path));
            String s;
            // Phrases are counted line by line and never span a line break.
            while ((s = br.readLine()) != null)
            {
                JS(separateString(s), n);
            }
            br.close();
            for (int i = 0; i < m; i++)
            {
                System.out.println(max_phrase[i].phrase);
                System.out.println(max_phrase[i].count);
                System.out.println();
            }
            // Elapsed time in milliseconds.
            t = new java.util.Date().getTime() - t;
            System.out.print(t);
            System.out.println(" ms");
        }
        catch (Exception e)
        {
            System.out.println(e.getMessage());
        }
    }

}
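
A counter of this kind stores its phrase counts in a map, so the choice of key matters. Java's `String.hashCode()` is a 32-bit value, which means distinct strings can share it, and a map keyed by the hash code alone would add their counts together. The short sketch below shows a well-known colliding pair; the class name `HashKeyPitfall` and both method names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

public class HashKeyPitfall {
    // Number of distinct keys when each phrase is keyed by its hash code.
    static int distinctByHash(String... phrases) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (String p : phrases) {
            counts.merge(p.hashCode(), 1, Integer::sum);
        }
        return counts.size();
    }

    // Number of distinct keys when each phrase is keyed by the string itself.
    static int distinctByString(String... phrases) {
        Map<String, Integer> counts = new HashMap<>();
        for (String p : phrases) {
            counts.merge(p, 1, Integer::sum);
        }
        return counts.size();
    }

    public static void main(String[] args) {
        // "Aa" and "BB" both hash to 2112 (65*31 + 97 and 66*31 + 66).
        System.out.println("Aa".hashCode() + " " + "BB".hashCode());
        System.out.println(distinctByHash("Aa", "BB"));   // the two phrases collapse into one key
        System.out.println(distinctByString("Aa", "BB")); // the two phrases stay separate
    }
}
```

Keying by the string costs more memory per entry, but the counts stay exact.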

Test result 1: M = 20, N = 8

Under games played won drawn lost goals
71

Tabulated under games played won drawn lost goals
70

Games played won drawn lost goals for against
70

May Xinhua following are the results from
69

Played won drawn lost goals for against and
59

Won drawn lost goals for against and points
59

Jan Xinhua following are the results from
48

Chinas economic efficiency indicators of the sector
39

The industrial statistics include all stateowned enterprises and
39

Industrial statistics include all stateowned enterprises and
39

Statistics include all stateowned enterprises and the nonstateowned
39

Include all stateowned enterprises and the nonstateowned ones
39

All stateowned enterprises and the nonstateowned ones
39

Stateowned enterprises and the nonstateowned ones with annual
39

Enterprises and the nonstateowned ones with annual sales
39

And the nonstateowned ones with annual sales income
39

Xinhua Chinas economic efficiency indicators of the sector
39

The nonstateowned ones with annual sales income over
39

Nonstateowned ones with annual sales income over million
39

Up percent over the same period last year
35

13594 ms

Test result 2: M = 10, N = 5

Xinhua following are the results
295

May Xinhua following are
209

Following are the results from
183

Are the results from
176

April Xinhua following are
141

Jan Xinhua following are
122

Billion yuan billion US dollars
120

Won drawn lost goals
88

Played won drawn lost goals
88

Dec Xinhua following are
87

12437 ms

 

The source code above uses the simplest approach. If you have a better, more efficient method, please follow up!
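
As one possible direction for a follow-up, the sketch below counts all phrases first and then selects the M largest with a size-bounded min-heap, instead of keeping a sorted table up to date on every word. This is an assumption about what might help, not a measured improvement over the timings above; the names `TopPhrases` and `topM` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

public class TopPhrases {
    // Returns the m entries with the highest counts, most frequent first.
    static List<Map.Entry<String, Integer>> topM(Map<String, Integer> counts, int m) {
        Comparator<Map.Entry<String, Integer>> byCount =
                (a, b) -> Integer.compare(a.getValue(), b.getValue());
        // Min-heap: the least frequent of the current candidates sits on top.
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(byCount);
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            heap.offer(e);
            if (heap.size() > m) {
                heap.poll(); // drop the smallest so at most m entries remain
            }
        }
        List<Map.Entry<String, Integer>> result = new ArrayList<>(heap);
        result.sort(byCount.reversed());
        return result;
    }

    public static void main(String[] args) {
        // Made-up counts, used only to exercise the method.
        Map<String, Integer> counts = new HashMap<>();
        counts.put("following are the", 5);
        counts.put("are the results", 3);
        counts.put("the results from", 2);
        for (Map.Entry<String, Integer> e : topM(counts, 2)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

With P distinct phrases, the selection step costs O(P log M) and runs once after the whole file has been read. Whether that is faster in practice depends on the text and would need to be measured.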
