Word frequency statistics implemented in C

Source: Internet
Author: User

Requirements:

1. Design a word frequency program that counts how often each word occurs in a given English article.

2. Punctuation in the article is not counted.

3. Output the results sorted by frequency, from highest to lowest.

Design:

1. I changed majors and know neither C++ nor Java, so I could only write this in C, which I have only just learned; even so it was quite laborious.

2. Define a structure with two members, the word and its frequency, to count word frequencies (the memory is allocated dynamically, so large texts can be handled).

3. Use the fopen function to open the specified document.

4. Use the fgetc function to read one character at a time, and handle it differently depending on whether it is a letter.

5. Use quick sort to sort the statistical results.

6. Loop over the sorted results and print all of them.
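Steps 3 and 4 above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's code: the helper name `next_word` and its buffer-capacity parameter are hypothetical, and `isalpha` from `<ctype.h>` stands in for explicit ASCII range checks.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the next run of letters from fp into buf.
   Returns 1 if a word was read, 0 at end of file.
   Letters beyond cap - 1 characters are dropped so buf never overflows. */
int next_word(FILE *fp, char *buf, size_t cap)
{
    int ch;             /* int, not char, so EOF can be told apart */
    size_t len = 0;

    assert(fp != NULL && buf != NULL && cap > 0);

    /* Skip everything that is not a letter (spaces, digits, punctuation). */
    while ((ch = fgetc(fp)) != EOF && !isalpha(ch))
        ;

    /* Collect letters until the first non-letter or end of file. */
    while (ch != EOF && isalpha(ch)) {
        if (len < cap - 1)
            buf[len++] = (char)ch;
        ch = fgetc(fp);
    }
    buf[len] = '\0';
    return len > 0;
}
```

Calling `next_word` in a loop until it returns 0 yields the words one by one, with punctuation already filtered out. Note that with this letter-only rule an apostrophe splits a word, so "It's" is read as "It" and "s".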

Part of the code:

Structure definition:

struct fre_word{    int  num;     char a[

Allocate Initial Memory:

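// BLOCK is the number of structs per allocation block; its numeric value is not shown in this excerpt
struct fre_word *w;
w = (struct fre_word *)malloc(BLOCK * p * sizeof(struct fre_word));   // allocate the initial memory for the structs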
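The allocation shown above does not check whether `malloc` succeeded. A minimal sketch of a growable table with checked reallocation might look like this. The names `word_table`, `table_init`, `table_push` and `table_free`, the capacity-doubling policy, and the 64-byte word buffer (`MAX_WORD`) are all assumptions made for illustration; none of them come from the original program.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORD 64              /* assumed maximum word length + 1 */

struct fre_word {
    int  num;                    /* number of occurrences */
    char a[MAX_WORD];            /* the word itself */
};

/* Hypothetical container: a dynamic array of fre_word records. */
struct word_table {
    struct fre_word *items;
    size_t len;                  /* records in use */
    size_t cap;                  /* records allocated */
};

void table_init(struct word_table *t)
{
    assert(t != NULL);
    t->items = NULL;
    t->len = 0;
    t->cap = 0;
}

/* Append a new word with count 1.
   Returns 0 on success, -1 if memory could not be allocated. */
int table_push(struct word_table *t, const char *word)
{
    assert(t != NULL && word != NULL);
    if (t->len == t->cap) {
        size_t new_cap = t->cap ? t->cap * 2 : 16;
        /* Keep the old pointer until realloc is known to have succeeded. */
        struct fre_word *p = realloc(t->items, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        t->items = p;
        t->cap = new_cap;
    }
    t->items[t->len].num = 1;
    strncpy(t->items[t->len].a, word, MAX_WORD - 1);
    t->items[t->len].a[MAX_WORD - 1] = '\0';   /* strncpy may not terminate */
    t->len++;
    return 0;
}

void table_free(struct word_table *t)
{
    assert(t != NULL);
    free(t->items);
    table_init(t);
}
```

Assigning the result of `realloc` to a temporary pointer first means the existing table is not lost if the reallocation fails.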

Read text:

printf("Enter the name of the file to read: ");
scanf("%s", filename);                        // name of the file whose word frequencies are to be counted
if ((fp = fopen(filename, "r")) == NULL) {
    printf("cannot open file\n");
    exit(1);
}

Word Matching:

// BLOCK is the number of structs per allocation block; its numeric value is not shown in this excerpt
/**************** Set word occurrences to 1 ****************************/
for (i = 0; i < BLOCK; i++) {
    (w + i)->num = 1;
}
/**************** Word matching ****************************************/
i = 0;
while (!feof(fp)) {                            // the file has not been read to the end
    ch = fgetc(fp);
    (w + i)->a[j] = '\0';
    if (ch >= 65 && ch <= 90 || ch >= 97 && ch <= 122) {   // ch is an ASCII letter: store it
        (w + i)->a[j] = ch;
        j++;
        flag = 0;                              // flag is used to detect consecutive punctuation or spaces
    }
    else if (!(ch >= 65 && ch <= 90 || ch >= 97 && ch <= 122) && flag == 0) {   // ch is not a letter and the previous character was a letter
        i++;
        j = 0;
        flag = 1;
        for (m = 0; m < i - 1; m++) {          // match words; if the word already exists, increment its num
            if (stricmp((w + m)->a, (w + i - 1)->a) == 0) {
                (w + m)->num++;
                i--;                           // reuse the slot of the duplicate word
                break;                         // stop at the first match
            }
        }
    }
    /**************** Dynamically allocating memory ************************/
    if (i == (p * BLOCK)) {                    // use i to determine whether the current memory is full
        p++;
        w = (struct fre_word *)realloc(w, BLOCK * p * sizeof(struct fre_word));
        for (n = i; n < BLOCK * p; n++)        // assign the initial value to the newly allocated structs
            (w + n)->num = 1;
    }
}
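The `stricmp` function used for matching is not part of standard C, so that call may not compile on every platform. The sketch below shows one portable way to do the same case-insensitive lookup. The helper names `ci_equal` and `count_word`, the fixed-capacity table, and the 64-byte word buffer are assumptions made for this example only.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORD 64   /* assumed maximum word length + 1 */

struct fre_word {
    int  num;             /* number of occurrences */
    char a[MAX_WORD];     /* the word itself */
};

/* Hypothetical helper: returns 1 if s and t are equal ignoring letter case. */
int ci_equal(const char *s, const char *t)
{
    while (*s && *t) {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*t))
            return 0;
        s++;
        t++;
    }
    return *s == *t;      /* both strings must end at the same position */
}

/* Hypothetical helper: record one occurrence of word in w[0 .. *count-1].
   A known word has its num incremented; a new word is appended with
   num = 1 if there is room. Returns the index used, or -1 if the table
   is full. */
int count_word(struct fre_word *w, size_t *count, size_t cap, const char *word)
{
    size_t m;

    assert(w != NULL && count != NULL && word != NULL);
    for (m = 0; m < *count; m++) {
        if (ci_equal(w[m].a, word)) {
            w[m].num++;
            return (int)m;            /* stop at the first match */
        }
    }
    if (*count == cap)
        return -1;
    w[m].num = 1;                     /* here m == *count */
    strncpy(w[m].a, word, MAX_WORD - 1);
    w[m].a[MAX_WORD - 1] = '\0';
    *count = m + 1;
    return (int)m;
}
```

The linear search costs time proportional to the number of distinct words for every word read, which matches the approach in the article; a hash table would be the usual choice if speed on very large texts mattered.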

Quick sort:

void quick(struct fre_word *f, int i, int j)
{
    int m, n, k;
    struct fre_word temp;
    m = i;
    n = j;
    k = f[(i + j) / 2].num;                    // the selected pivot value
    do {
        while (f[m].num > k && m < j) m++;     // from left to right, find an element not larger than k
        while (f[n].num < k && n > i) n--;     // from right to left, find an element not smaller than k
        if (m <= n) {                          // if both are found and the condition holds, swap them
            temp = f[m];                       // swap whole structs: count and word together
            f[m] = f[n];
            f[n] = temp;
            m++;
            n--;
        }
    } while (m <= n);
    if (m < j) quick(f, m, j);                 // sort both parts recursively
    if (n > i) quick(f, i, n);
}
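As an alternative to a hand-written quick sort, the standard library function `qsort` can produce the same descending order. The following is a minimal sketch: the names `by_count_desc` and `sort_by_count`, the 64-byte word buffer, and the alphabetical tie-break for words with equal counts are assumptions, not part of the original program.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORD 64   /* assumed maximum word length + 1 */

struct fre_word {
    int  num;             /* number of occurrences */
    char a[MAX_WORD];     /* the word itself */
};

/* Comparator for qsort: larger counts first; equal counts are ordered
   alphabetically so the output is deterministic. */
static int by_count_desc(const void *x, const void *y)
{
    const struct fre_word *p = x;
    const struct fre_word *q = y;

    if (p->num != q->num)
        return (p->num < q->num) - (p->num > q->num);   /* -1 when p is more frequent */
    return strcmp(p->a, q->a);
}

/* Hypothetical wrapper: sort count records in w by descending frequency. */
void sort_by_count(struct fre_word *w, size_t count)
{
    assert(w != NULL || count == 0);
    if (count > 1)
        qsort(w, count, sizeof *w, by_count_desc);
}
```

The comparator avoids computing `q->num - p->num` directly, since that subtraction could overflow for extreme values.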

Result output:

for (n = 0; n < i; n++) {
    printf("Word appearing in the document: ");
    printf("%-18s", (w + n)->a);
    printf(" number of occurrences: ");
    printf("%d\n", (w + n)->num);
}

Test Case:

After reading earlier students' blogs and the teacher's comments, I used a long text for testing: a presidential inaugural speech.

Some test results:
