Project requirements:
1. Design a small word-frequency program that counts how often each word occurs in a given English article.
2. Punctuation in the article is not counted.
3. Output the results sorted by count, from largest to smallest.
Design:
1. Because the functionality is fairly simple, the program is written directly in C.
2. The program uses a user-defined structure to hold each word together with the number of times it occurs.
3. Each word is read as a string, and every character in it is checked for punctuation.
4. Once counting is complete, the entries are sorted by count using bubble sort.
5. Loop over the sorted entries and print the full result.
Some core code:
Structure definition:
typedef struct addup { char word[64]; int count; } R;  /* the array size is missing in the original; 64 is an assumed value */
Read text:
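char temp[64];               /* buffer size is missing in the original; 64 is assumed */
R fin[10000] = {{"\0", 0}};  /* word table, zero-initialised */
int n = 0, i, q;
FILE *fp = fopen("F:/1.txt", "r");
if (fp == NULL) return 1;
/* Test the return value of fscanf instead of feof(), so the last word is not processed twice. */
while (fscanf(fp, "%63s", temp) == 1) {
    q = strlen(temp);        /* length used by the punctuation check */
    for (i = 0; i < n; ++i)
        if (strcmp(fin[i].word, temp) == 0) { fin[i].count++; break; }
    if (i == n && n < 10000) { strcpy(fin[n].word, temp); fin[n].count = 1; n++; }
}
fclose(fp);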
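The lookup-or-append step inside the reading loop can be isolated into a small function, which makes it easier to check on its own. This is only a sketch: the name add_word, the capacity parameter, and the MAX_WORD size of 64 are assumptions and do not appear in the original program.

```c
#include <string.h>

#define MAX_WORD 64   /* assumed; the original array size is not shown */

typedef struct addup {
    char word[MAX_WORD];
    int count;
} R;

/* Count one word: increment its entry if it is already in the table,
   otherwise append a new entry with count 1.
   Returns the new number of distinct words, or -1 if the table is full. */
int add_word(R *table, int n, int capacity, const char *w)
{
    int i;

    if (w[0] == '\0')
        return n;                          /* ignore empty tokens */

    for (i = 0; i < n; ++i) {
        if (strcmp(table[i].word, w) == 0) {
            table[i].count++;
            return n;
        }
    }

    if (n >= capacity)
        return -1;

    /* Words longer than MAX_WORD - 1 characters are truncated. */
    strncpy(table[n].word, w, MAX_WORD - 1);
    table[n].word[MAX_WORD - 1] = '\0';
    table[n].count = 1;
    return n + 1;
}
```

Inside the reading loop this would be called as n = add_word(fin, n, 10000, temp); after punctuation has been removed from temp.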
Punctuation Determination:
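/* q is strlen(temp). Replacing a punctuation mark with '\0' cuts the word at that position. */
for (i = 0; i < q; i++) {
    if (temp[i] == ',' || temp[i] == '.' || temp[i] == '?' || temp[i] == '!' || temp[i] == '"')
        temp[i] = '\0';
}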
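Because writing '\0' at the first punctuation mark also discards everything after it (a word with a leading quotation mark becomes empty), one possible alternative is to compact the string and drop only the punctuation characters. The sketch below swaps in the standard ispunct test in place of the explicit list of five characters; the function name strip_punct is an assumption, not part of the original program.

```c
#include <ctype.h>
#include <stddef.h>

/* Remove every punctuation character from s in place, keeping the
   remaining characters in their original order.
   Returns the new length of the string. */
size_t strip_punct(char *s)
{
    size_t r, w = 0;

    for (r = 0; s[r] != '\0'; ++r) {
        /* The cast avoids undefined behaviour for negative char values. */
        if (!ispunct((unsigned char)s[r]))
            s[w++] = s[r];
    }
    s[w] = '\0';
    return w;
}
```

Note that ispunct also matches apostrophes and hyphens, so "don't" would become "dont"; if that is not wanted, the explicit character list from the original check can be kept in the condition.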
Bubble Sort:
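R ls;  /* temporary used for swapping */
/* The inner bound is n-1-i so that fin[j+1] never reads past the last entry. */
for (i = 0; i < n - 1; i++)
    for (j = 0; j < n - 1 - i; j++) {
        if (fin[j].count < fin[j+1].count) { ls = fin[j+1]; fin[j+1] = fin[j]; fin[j] = ls; }
    }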
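As a self-contained illustration of the descending bubble sort, the sketch below wraps the loops in a function and adds an early exit when a pass makes no swaps. The function name sort_by_count, the early exit, and the MAX_WORD size are assumptions added for this example.

```c
#define MAX_WORD 64   /* assumed; the original array size is not shown */

typedef struct addup {
    char word[MAX_WORD];
    int count;
} R;

/* Bubble sort: order the first n entries by count, largest first. */
void sort_by_count(R *fin, int n)
{
    int i, j, swapped;
    R tmp;

    for (i = 0; i < n - 1; ++i) {
        swapped = 0;
        /* After pass i, the last i+1 entries are already in place. */
        for (j = 0; j < n - 1 - i; ++j) {
            if (fin[j].count < fin[j + 1].count) {
                tmp = fin[j];
                fin[j] = fin[j + 1];
                fin[j + 1] = tmp;
                swapped = 1;
            }
        }
        if (!swapped)
            break;              /* no swaps: the table is already sorted */
    }
}
```

Bubble sort takes on the order of n*n comparisons in the worst case, which is acceptable for a table of a few hundred distinct words; for much larger tables the standard qsort function would be the usual choice.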
Output Result:
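freopen("F:/2.txt", "w", stdout);  /* redirect standard output to the result file */
/* The count is printed directly; the original loop that incremented s up to fin[i].count produced the same value. */
for (i = 0; i < n; i++)
    printf("%s:%d times\n", fin[i].word, fin[i].count);
fclose(stdout);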
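An alternative to redirecting stdout with freopen is to write to an explicit FILE pointer, which leaves standard output usable for other messages. The following is a minimal sketch under that assumption; the function name print_results and the MAX_WORD size are hypothetical, and the output format follows the printf calls in the original code.

```c
#include <stdio.h>

#define MAX_WORD 64   /* assumed; the original array size is not shown */

typedef struct addup {
    char word[MAX_WORD];
    int count;
} R;

/* Write one "word:N times" line per entry to the given stream.
   Returns 0 on success, -1 if a write fails. */
int print_results(FILE *out, const R *fin, int n)
{
    int i;

    for (i = 0; i < n; ++i) {
        if (fprintf(out, "%s:%d times\n", fin[i].word, fin[i].count) < 0)
            return -1;
    }
    return 0;
}
```

The caller would open the result file with fopen("F:/2.txt", "w"), check the returned pointer for NULL, pass it to print_results, and close it with fclose afterwards.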
Test Case:
Because any single word is unlikely to repeat often in a short text, a small article may not produce meaningful test results, so Martin Luther King's "I Have a Dream" speech was chosen as the test input.
Word count: 1666
Test results:
[Image: C language word frequency statistics design]