First, program analysis
1. Read the file into a buffer
def process_file(dst):  # Read the file into a buffer
    try:  # Open the file
        f = open(dst, 'r')
    except IOError as s:
        print(s)
        return None
    try:  # Read the file into the buffer
        bvffer = f.read()
    except Exception:
        print("Read File error!")
        f.close()
        return None
    f.close()
    return bvffer
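The open, read and close steps can also be written with a with-statement, which closes the file even when reading fails. The following is only a sketch, not the code used in this exercise: the name read_text and the utf-8 default encoding are assumptions made for illustration.

```python
def read_text(path, encoding="utf-8"):
    """Return the whole file as one string, or None if it cannot be read.

    read_text and the utf-8 default are illustrative assumptions,
    not part of the original program.
    """
    try:
        # The with-statement closes the file even if read() raises.
        with open(path, "r", encoding=encoding) as f:
            return f.read()
    except (OSError, UnicodeDecodeError) as err:
        # OSError covers a missing file, a directory, or missing permissions.
        print("Cannot read {}: {}".format(path, err))
        return None
```

Like process_file, it returns None on failure, so the caller can check the result before counting words.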
2. Split the buffer string into words and store the word frequencies in a dictionary
def process_buffer(bvffer):
    if bvffer:
        word_freq = {}
        # Count the frequency of each word in bvffer and store it in word_freq
        bvffer = bvffer.lower()  # lower() returns a new string, so reassign
        # Punctuation to strip (the leading "!@" is assumed)
        char = "!@#$%^&*()_-+=<>?/,.:;{}[]|\\'\""
        for ch in char:
            bvffer = bvffer.replace(ch, " ")
        words = bvffer.strip().split()
        for word in words:
            word_freq[word] = word_freq.get(word, 0) + 1
        return word_freq
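The same counting can be sketched with the standard library's re and collections.Counter. This is an alternative, not the code above: the name count_words is hypothetical, and it assumes a word is a run of ASCII letters or digits, which is a different rule from replacing punctuation character by character (for example, "don't" becomes "don" and "t").

```python
import re
from collections import Counter


def count_words(text):
    """Count words in text, ignoring case.

    Assumption for this sketch: a word is a run of ASCII letters or digits.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    # Counter is a dict subclass, so it can be used wherever word_freq is.
    return Counter(words)
```

Because Counter behaves like a dictionary, the result can be passed to code that expects the word_freq dictionary.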
3. Sort the dictionary by word frequency and output the top ten key-value pairs
def output_result(word_freq):
    if word_freq:
        sorted_word_freq = sorted(word_freq.items(),
                                  key=lambda v: v[1], reverse=True)
        for item in sorted_word_freq[:10]:  # Output the top 10 words
            print(item)
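Sorting the whole dictionary is more work than needed when only ten entries are printed. A possible variation, sketched here under the hypothetical name top_n, uses heapq.nlargest from the standard library to pick the n most frequent entries without sorting everything.

```python
import heapq


def top_n(word_freq, n=10):
    """Return the n (word, count) pairs with the highest counts.

    top_n is a hypothetical helper, not part of the original program.
    """
    # nlargest keeps only n candidates at a time instead of sorting all items.
    return heapq.nlargest(n, word_freq.items(), key=lambda item: item[1])
```

For a small text the difference is negligible; it may only matter when the dictionary holds a very large number of distinct words.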
4. The main program: output the top ten results
if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('dst')
    args = parser.parse_args()
    dst = args.dst
    bvffer = process_file(dst)
    word_freq = process_buffer(bvffer)
    output_result(word_freq)
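The argument parsing can be kept in its own function so that it can be tried without a real command line. The sketch below assumes a helper named build_parser and an extra "-n/--top" option; both are illustrative additions, the original program only takes the positional "dst" argument.

```python
import argparse


def build_parser():
    """Build the command-line parser (hypothetical helper for illustration)."""
    parser = argparse.ArgumentParser(
        description="Print the most frequent words in a text file.")
    parser.add_argument("dst", help="path of the text file to analyze")
    # "-n/--top" is an assumed extra option, not in the original program.
    parser.add_argument("-n", "--top", type=int, default=10,
                        help="how many words to print (default: 10)")
    return parser


# Passing a list to parse_args() avoids reading the real sys.argv.
args = build_parser().parse_args(["sample.txt"])
print(args.dst, args.top)
```

With no list argument, parse_args() reads the command line as in the main program above.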
Second, code style description
1. Indent with 4 spaces
2. Use blank lines to separate functions and classes, and large chunks of code within functions
3. Use spaces around operators and after commas, but do not add spaces directly inside brackets
4. Wrap long lines so that they do not exceed 79 characters
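As a small illustration of the four rules above, the following sketch uses two made-up functions, mean and describe, that have nothing to do with the word counter itself.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence."""
    total = sum(values)          # spaces around "=", none inside "()"
    return total / len(values)


# Two blank lines separate top-level functions.
def describe(name, values, precision=2):
    """Return a one-line summary of values."""
    average = round(mean(values), precision)   # space after each comma

    # A long expression is wrapped inside brackets to stay within 79 columns.
    return "{}: mean={}, min={}, max={}".format(
        name, average, min(values), max(values)
    )
```

Calling describe("x", [1, 2, 3]) returns the string "x: mean=2.0, min=1, max=3".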
Third, program run results
Fourth, performance analysis and improvement
1. Performance analysis
1.1 Visualizing the time consumed by each module
- Install Graphviz: "pip install graphviz". For background, refer to the article on using cProfile to analyze Python program performance.
- Download gprof2dot, the Python script that converts profiling output to dot format, from its official download page. Unzip it and copy "gprof2dot.py" to the directory of the file being analyzed, or to a directory listed in your system PATH environment variable.
- Perform the conversion steps
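Before any conversion, the program has to be run under cProfile so that there is timing data to convert. A minimal sketch of that profiling step follows; the helper name profile_call and the optional stats_path parameter are assumptions for illustration, and the exact commands used in this exercise are not reproduced here.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, stats_path=None, **kwargs):
    """Run func under cProfile and return (result, text report).

    profile_call and stats_path are hypothetical names for this sketch.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    if stats_path is not None:
        # Save the raw statistics in pstats format for later conversion.
        profiler.dump_stats(stats_path)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the ten entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()
```

For example, profile_call(process_buffer, bvffer) would return the word dictionary together with a text report. A file written through stats_path is in the pstats format, which gprof2dot accepts as input.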
The conversion diagram is as follows: