This article walks through a Python data-processing example. It builds on the previous data-processing example in the Python-from-scratch series (I). In addition to the student's scores, each data file now also holds the student's name and date of birth, so the task becomes: for each student, identified by name, output the three best scores and the date of birth.
Data preparation: create four text files, each containing a single comma-separated line (name, date of birth, then the times):
james2.txt: James Lee,2002-3-14,2-34,3:21,2.34,2.45,3.01,2-22
julie2.txt: Julie Jones,2002-8-17,2.59,2.11,3-10,2-3.21,3:10,3-21
mikey2.txt: Mikey McManus,2002-2-24,2:22,3.01,3.02,3.02,2.49
sarah2.txt: Sarah Sweeney,2002-6-17,2:58,2.58,2-25,2-55,2:54,2.18
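The four files can also be generated from Python rather than typed by hand. The following is a minimal sketch, assuming each file holds one comma-separated line with no spaces around the commas. The names SAMPLE_DATA and write_sample_files, and the use of a temporary directory, are illustrative choices and not part of the original example.

```python
import os
import tempfile

# Sample records: file name -> one comma-separated line
# (name, date of birth, then the recorded times).
SAMPLE_DATA = {
    'james2.txt': 'James Lee,2002-3-14,2-34,3:21,2.34,2.45,3.01,2-22',
    'julie2.txt': 'Julie Jones,2002-8-17,2.59,2.11,3-10,2-3.21,3:10,3-21',
    'mikey2.txt': 'Mikey McManus,2002-2-24,2:22,3.01,3.02,3.02,2.49',
    'sarah2.txt': 'Sarah Sweeney,2002-6-17,2:58,2.58,2-25,2-55,2:54,2.18',
}


def write_sample_files(directory):
    """Write each sample record to its own file and return the file paths."""
    paths = []
    for filename, line in SAMPLE_DATA.items():
        path = os.path.join(directory, filename)
        with open(path, 'w') as f:
            f.write(line + '\n')
        paths.append(path)
    return paths


if __name__ == '__main__':
    # A temporary directory is used here; pass your own working directory instead.
    target = tempfile.mkdtemp()
    for path in write_sample_files(target):
        print(path)
```

Calling os.chdir on the directory passed to write_sample_files lets the listings in this article open the files by their bare names.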
Building on the code from the previous part, a few modifications are enough to implement the new requirement.
The code is as follows:
import os
print(os.getcwd())
os.chdir(r'C:\Python33\HeadFirstPython\hfpy_code\chapter6')  # change the working directory to the one that holds the data files
# Define the function get_filedata to read one athlete's record from a file
def get_filedata(filename):
    try:
        with open(filename) as f:  # the with statement opens the file and closes it automatically
            data = f.readline()  # read the first line of the file
        data_list = data.strip().split(',')  # strip surrounding whitespace and split on commas
        return {
            "name": data_list.pop(0),
            "date_of_birth": data_list.pop(0),
            "times": str(sorted(set([modify_time_format(s) for s in data_list]))[0:3])
        }  # return the name, date of birth, and three fastest times in a dictionary
    except IOError as ioerr:
        print('File error: ' + str(ioerr))  # handle the exception and print the error
        return None
# Define the function modify_time_format to convert every time in the files to the form "minutes.seconds"
def modify_time_format(time_string):
    if "-" in time_string:
        splitter = "-"
    elif ":" in time_string:
        splitter = ":"
    else:
        splitter = "."
    (mins, secs) = time_string.split(splitter)  # split on the separator and store the two parts in mins and secs
    return mins + '.' + secs
# Define the function get_prev_three to return the three fastest distinct times in a file
# (carried over from the previous part and not called below; it now skips the name and date-of-birth fields)
def get_prev_three(filename):
    with open(filename) as f:
        fields = f.readline().strip().split(',')
    new_list = [modify_time_format(each_t) for each_t in fields[2:]]  # list comprehension: normalize every time
    delete_repetition = set(new_list)  # use set to remove duplicates from the new list
    in_order = sorted(delete_repetition)  # sorted returns a new, ordered list of the distinct times
    return in_order[0:3]
# Output James's three fastest distinct times and date of birth
james = get_filedata('james2.txt')
print(james["name"] + "'s fastest times are: " + james["times"])
print(james["name"] + "'s birthday is: " + james["date_of_birth"])
# Output Julie's three fastest distinct times and date of birth
julie = get_filedata('julie2.txt')
print(julie["name"] + "'s fastest times are: " + julie["times"])
print(julie["name"] + "'s birthday is: " + julie["date_of_birth"])
# Output Mikey's three fastest distinct times and date of birth
mikey = get_filedata('mikey2.txt')
print(mikey["name"] + "'s fastest times are: " + mikey["times"])
print(mikey["name"] + "'s birthday is: " + mikey["date_of_birth"])
# Output Sarah's three fastest distinct times and date of birth
sarah = get_filedata('sarah2.txt')
print(sarah["name"] + "'s fastest times are: " + sarah["times"])
print(sarah["name"] + "'s birthday is: " + sarah["date_of_birth"])
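To make the normalize, deduplicate, and sort steps concrete, here is a small self-contained sketch that applies the same logic to James's record held in a string instead of a file. The helper name fastest_three is hypothetical and introduced only for illustration. Note that the times are compared as strings, which gives the expected order for values shaped like 2.34 (one-digit minutes, two-digit seconds); other shapes would need numeric conversion.

```python
def modify_time_format(time_string):
    """Normalize 'm-ss', 'm:ss' or 'm.ss' to the form 'm.ss'."""
    if '-' in time_string:
        splitter = '-'
    elif ':' in time_string:
        splitter = ':'
    else:
        splitter = '.'
    mins, secs = time_string.split(splitter)
    return mins + '.' + secs


def fastest_three(times):
    """Return the three smallest distinct times (hypothetical helper)."""
    normalized = [modify_time_format(t) for t in times]  # unify the notation
    return sorted(set(normalized))[0:3]  # drop duplicates, sort, keep three


record = 'James Lee,2002-3-14,2-34,3:21,2.34,2.45,3.01,2-22'
fields = record.strip().split(',')
name, dob, times = fields[0], fields[1], fields[2:]

print(name + "'s fastest times are: " + str(fastest_three(times)))
# James Lee's fastest times are: ['2.22', '2.34', '2.45']
print(name + "'s birthday is: " + dob)
# James Lee's birthday is: 2002-3-14
```

Here '2-34' and '2.34' normalize to the same value, so the set keeps only one of them before sorting.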
The same functionality can be implemented with a class: define AthleteList so that it inherits from the built-in list, and provide the processing as a method of the class:
The code is as follows:
import os
print(os.getcwd())
os.chdir(r'C:\Python33\HeadFirstPython\hfpy_code\chapter6')  # change the working directory to the one that holds the data files
# Define the class AthleteList, which inherits from Python's built-in list
class AthleteList(list):
    def __init__(self, name, dob=None, times=[]):
        list.__init__([])
        self.name = name
        self.dob = dob
        self.extend(times)  # the times become the items of the list itself
    def get_prev_three(self):
        return sorted(set([modify_time_format(t) for t in self]))[0:3]
def get_filedata(filename):
    try:
        with open(filename) as f:  # the with statement opens the file and closes it automatically
            data = f.readline()  # read the first line of the file
        data_list = data.strip().split(',')  # strip surrounding whitespace and split on commas
        return (
            AthleteList(data_list.pop(0), data_list.pop(0), data_list)
        )  # return an AthleteList built from the name, date of birth, and times
    except IOError as ioerr:
        print('File error: ' + str(ioerr))  # handle the exception and print the error
        return None
def modify_time_format(time_string):
    if "-" in time_string:
        splitter = "-"
    elif ":" in time_string:
        splitter = ":"
    else:
        splitter = "."
    (mins, secs) = time_string.split(splitter)  # split on the separator and store the two parts in mins and secs
    return mins + '.' + secs
james = get_filedata('james2.txt')
print(james.name + "'s fastest times are: " + str(james.get_prev_three()))
julie = get_filedata('julie2.txt')
print(julie.name + "'s fastest times are: " + str(julie.get_prev_three()))
mikey = get_filedata('mikey2.txt')
print(mikey.name + "'s fastest times are: " + str(mikey.get_prev_three()))
sarah = get_filedata('sarah2.txt')
print(sarah.name + "'s fastest times are: " + str(sarah.get_prev_three()))
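Because AthleteList inherits from list, an instance can be used wherever a list can, while also carrying the name and dob attributes. The sketch below illustrates this with a few of James's times built directly in memory. It is a variation for illustration, not the listing above: it assumes times=None in place of the mutable default and uses super().__init__() to initialize the list part.

```python
def modify_time_format(time_string):
    """Normalize 'm-ss', 'm:ss' or 'm.ss' to the form 'm.ss'."""
    if '-' in time_string:
        splitter = '-'
    elif ':' in time_string:
        splitter = ':'
    else:
        splitter = '.'
    mins, secs = time_string.split(splitter)
    return mins + '.' + secs


class AthleteList(list):
    """A list of times that also remembers the athlete's name and birth date."""

    def __init__(self, name, dob=None, times=None):
        super().__init__()           # start as an empty list
        self.name = name
        self.dob = dob
        self.extend(times or [])     # copy the given times into the list itself

    def get_prev_three(self):
        # Normalize, remove duplicates, sort, and keep the first three.
        return sorted(set(modify_time_format(t) for t in self))[0:3]


james = AthleteList('James Lee', '2002-3-14', ['2-34', '3:21', '2.34'])
james.append('2-22')                 # ordinary list methods are inherited
print(len(james))                    # 4
print(james.get_prev_three())        # ['2.22', '2.34', '3.21']
print(james.name + ' was born on ' + james.dob)
```

Keeping the raw times in the list and normalizing only inside get_prev_three means new times can be appended in any of the three notations.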