Application Introduction:
Counting how often each word appears in an English article is a very common requirement. This article implements it in Python.
Approach:
1. Put every word of the English article into a list and record the length of the list.
2. Traverse the list, count the occurrences of each word, and store the results in a dictionary.
3. Using the list length obtained in step 1, compute the frequency of each word and store the results in a frequency dictionary.
4. Sort the dictionary by the value of each key-value pair and output the result. A slice can be used to output a specific number of the most or least frequent words, because sorted() returns a list of tuples, each holding a word and its frequency.
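As a small illustration of the sorting and slicing described in the last step, the sketch below uses a made-up frequency dictionary. The words and values are hypothetical and were not taken from any article.

```python
# Hypothetical frequencies, invented purely for illustration.
freq = {'the': 0.5, 'magic': 0.125, 'skin': 0.125, 'of': 0.25}

# sorted() over dict.items() returns a list of (word, frequency) tuples,
# ordered here by ascending frequency (the second element of each tuple).
pairs = sorted(freq.items(), key=lambda item: item[1])

most_frequent = pairs[-2:]   # the two highest-frequency words
least_frequent = pairs[:2]   # the two lowest-frequency words

print(most_frequent)
print(least_frequent)
```

Because the list is sorted in ascending order, the most frequent words sit at the end, so a negative slice picks them out.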
Code implementation:
# The text file is up to you.
with open('The_magic_skin_honore_de_balzac.txt') as fin:
    lines = fin.readlines()


def words_list():
    '''Transform the article into a word list.'''
    # Keep only letters, digits and spaces; everything else is dropped.
    chardigit = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789 '
    all_lines = ''
    for line in lines:
        one_line = ''
        for ch in line:
            if ch in chardigit:
                one_line = one_line + ch
        # The newline was filtered out above, so add a space to keep the
        # last word of this line apart from the first word of the next.
        all_lines = all_lines + one_line + ' '
    return all_lines.split()


def total_num(s):
    '''Calculate the total number of words; s is the word list.'''
    return len(s)


def word_dic(t):
    '''Calculate the occurrence count of every word; t is the word list.'''
    fre_dic = dict()
    for word in t:
        fre_dic[word] = fre_dic.get(word, 0) + 1
    return fre_dic


def word_fre(w):
    '''Calculate the frequency of every word.

    w is the dictionary of occurrence counts. It is modified in place,
    and the global variable total is used as the divisor.
    '''
    for key in w:
        w[key] = w[key] / total
    return w


def word_sort(v):
    '''Sort by frequency, ascending; v is the frequency dictionary.'''
    sort_dic = sorted(v.items(), key=lambda e: e[1])
    return sort_dic


# Entry point: output the ten words with the largest frequency.
words = words_list()
total = total_num(words)
print(word_sort(word_fre(word_dic(words)))[-10:])
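For comparison, the same steps can be written more compactly with the standard library. This is an alternative sketch, not the approach used above: it swaps in collections.Counter for the hand-written counting loop and re.findall for the character filter. The function name top_frequencies and the sample sentence are hypothetical.

```python
import re
from collections import Counter


def top_frequencies(text, n=10):
    """Return the n most frequent words in text as (word, frequency) pairs."""
    # Treat every run of ASCII letters and digits as one word.
    words = re.findall(r'[A-Za-z0-9]+', text)
    total = len(words)
    if total == 0:
        return []
    counts = Counter(words)
    # most_common(n) yields (word, count) pairs, highest count first.
    return [(word, count / total) for word, count in counts.most_common(n)]


sample = "the cat saw the dog and the bird"
print(top_frequencies(sample, 2))
```

Two differences from the code above are worth noting. The result is ordered from most to least frequent, and punctuation splits a word rather than being deleted from it, so "don't" becomes "don" and "t" here but "dont" in the version above.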