Word Frequency Statistics:
Function 1: Small file input. The file path is typed on the keyboard in the console.
Enter the file name in the console: use a Scanner to read the console input.
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.util.Scanner;

System.out.println("Please enter the file path to be counted");
Scanner sc = new Scanner(System.in);
String road = sc.nextLine();
FileInputStream fis = new FileInputStream(road); // open the file at the given path
InputStreamReader isr = new InputStreamReader(fis); // byte stream to character stream
BufferedReader infile = new BufferedReader(isr); // buffered reader
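The next step works on a string called file, but the post does not show how the reader's contents end up in it. A minimal sketch of one way to do that, assuming the lines are simply joined with spaces; the class name FileText and the method readAll are hypothetical, not from the original code:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

public class FileText {
    // Hypothetical helper: reads every line from the source and joins the
    // lines with a single space so words on adjacent lines stay separated.
    public static String readAll(Reader source) throws IOException {
        StringBuilder text = new StringBuilder();
        // try-with-resources closes the reader even if readLine() fails
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                text.append(line).append(' ');
            }
        }
        return text.toString();
    }
}
```

With the reader from the snippet above, this could be called as String file = FileText.readAll(isr);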
Extract the words from the text read from the TXT file: use regular expressions to convert every non-letter character into a space.
String[] words;
file = file.toLowerCase();
// Regular expression: replace non-letters (digits, symbols, etc.) with spaces
file = file.replaceAll("[^a-zA-Z]", " ");
file = file.replaceAll("\\s+", " ");
words = file.split("\\s+");
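One detail of this approach: if the text starts with a non-letter, split leaves an empty string as the first element. The sketch below wraps the same steps in a helper that trims first; the class name Tokenizer and the method tokenize are hypothetical names for illustration:

```java
import java.util.Arrays;
import java.util.Locale;

public class Tokenizer {
    // Hypothetical helper: lower-cases the text, turns every non-letter into
    // a space, trims the ends, and splits on runs of whitespace.
    public static String[] tokenize(String text) {
        String cleaned = text.toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z]", " ")
                .trim();
        if (cleaned.isEmpty()) {
            return new String[0]; // avoid a single empty token for blank input
        }
        return cleaned.split("\\s+");
    }

    public static void main(String[] args) {
        // prints [hello, world, hello]
        System.out.println(Arrays.toString(tokenize("Hello, world! Hello?")));
    }
}
```

Because apostrophes are also replaced by spaces, a word such as "don't" is split into "don" and "t" under this rule.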
Store the words and their counts as key-value pairs in a HashMap.
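The post does not show the counting code itself. A minimal sketch of how the words array could be tallied into a HashMap, assuming Java 8 or later for Map.merge; the class name WordCounter and the method countWords are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

public class WordCounter {
    // Hypothetical helper: maps each word to the number of times it occurs.
    public static Map<String, Integer> countWords(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // skip empty tokens that split() may produce
            }
            // Insert 1 for a new word, otherwise add 1 to the stored count
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords(new String[] {"the", "cat", "the"});
        System.out.println(counts.get("the")); // prints 2
    }
}
```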
Sort the words by frequency (the value of each key-value pair) in descending order. Pass a Comparator to Collections.sort and override its compare method to get the descending order.
List<Map.Entry<String, Integer>> list = new ArrayList<Map.Entry<String, Integer>>(map.entrySet());
Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
    @Override
    public int compare(Map.Entry<String, Integer> arg0, Map.Entry<String, Integer> arg1) {
        // Compare arg1 against arg0 so that larger counts come first
        return arg1.getValue().compareTo(arg0.getValue());
    }
});
Output the sorted key-value pairs. Use Map.Entry from the java.util package to traverse the sorted list built from the HashMap.
for (Map.Entry<String, Integer> mapping : list) {
    System.out.println(mapping.getKey() + "," + mapping.getValue());
}
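With Java 8 or later, the same descending sort can be written more compactly with a lambda. The sketch below also breaks ties alphabetically so the output order is stable; that tie-break is an assumption added here, not part of the original code, and the names FrequencySorter and sortByCount are hypothetical:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FrequencySorter {
    // Hypothetical helper: returns the entries ordered by count, highest
    // first; words with equal counts are ordered alphabetically.
    public static List<Map.Entry<String, Integer>> sortByCount(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> list = new ArrayList<>(counts.entrySet());
        list.sort((a, b) -> {
            int byCount = b.getValue().compareTo(a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return list;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("a", 1);
        counts.put("b", 3);
        counts.put("c", 3);
        for (Map.Entry<String, Integer> entry : sortByCount(counts)) {
            System.out.println(entry.getKey() + "," + entry.getValue());
        }
    }
}
```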
Run result:
Function 2. Support passing the file name of an English work as a command-line argument.
>wf gone_with_the_wand
total 1234567 words
Function 3. Support passing the name of a directory that stores English work files as a command-line argument, and count the files in batch.
>dir folder
Gone_with_the_wand
Runbinson
Janelove
>wf folder
Gone_with_the_wand
total 1234567 words
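The post gives no code for the batch mode. A rough sketch of how a directory could be processed, assuming that "total" means the number of word tokens in each file, that files are UTF-8, and that only files directly inside the directory are counted; the names BatchCounter and countFolder are hypothetical:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class BatchCounter {
    // Hypothetical helper: maps each regular file in dir to its word-token
    // count. Assumption: "total" in the sample output is the token count.
    public static Map<String, Integer> countFolder(Path dir) throws IOException {
        Map<String, Integer> totals = new TreeMap<>(); // sorted by file name
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip subdirectories and special files
                }
                String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                String cleaned = text.toLowerCase(Locale.ROOT)
                        .replaceAll("[^a-z]", " ")
                        .trim();
                int total = cleaned.isEmpty() ? 0 : cleaned.split("\\s+").length;
                totals.put(file.getFileName().toString(), total);
            }
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Map.Entry<String, Integer> entry : countFolder(dir).entrySet()) {
            System.out.println(entry.getKey());
            System.out.println("total " + entry.getValue() + " words");
        }
    }
}
```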
Function 4. Read a single English article from the console.
In the console you can enter either the name of the article file or the article content itself.
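The post does not say how the program tells a file name from article content. One possible rule, shown here only as an assumption, is to treat the input as a file name when it names an existing regular file and as content otherwise; the names InputResolver and resolveText are hypothetical:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class InputResolver {
    // Hypothetical helper: returns the file's text if input names an existing
    // regular file (assumed UTF-8), otherwise returns input unchanged.
    public static String resolveText(String input) throws IOException {
        try {
            Path path = Paths.get(input.trim());
            if (Files.isRegularFile(path)) {
                return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
            }
        } catch (InvalidPathException e) {
            // The input is not a legal path on this system, so it is content
        }
        return input;
    }
}
```

A limitation of this rule is that article content which happens to match an existing file name is read as a file.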
Week2 Word frequency Statistics First update