Scene: Coach Kelly has 4 players (James, Sarah, Julie, and Mikey), each running 600 meters. The coach times each run and records the results in computer files, 4 in total: james.txt, sarah.txt, julie.txt, and mikey.txt. Each file holds the times of one player.
Expectations: The coach needs a quick way to find the 3 fastest times for each player.
1. Read each player's data from its file into its own list, and display the lists on the screen;
data.strip().split(','): This is a "method chain", which is read from left to right: strip() is applied to data first, then split(',') is applied to the result.
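A minimal sketch of the method chain in action. The inline string stands in for a line read from one of the files; its value is an assumption made for illustration, not the real file contents.

```python
# Hypothetical line of text, standing in for data read from a player's file.
data = "2-34,3:21,2.34\n"

# Read left to right: strip() removes the trailing newline first,
# then split(',') breaks the cleaned string into a list of strings.
times = data.strip().split(',')
print(times)
```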
2. Sort the data
In-place sorting: the data is arranged in the specified order and the sorted data replaces the original, so the original order is lost. Use the list's sort() method; it sorts in ascending order by default, and descending order requires the parameter reverse=True.
Copy sorting: the data is arranged in the specified order and a sorted copy is returned; the original data keeps its order, and only the copy is sorted. Use the sorted() BIF; it sorts in ascending order by default, and descending order requires the parameter reverse=True.
>>> data=[6,3,1,2,4,5]
>>> data
[6, 3, 1, 2, 4, 5]
>>> data.sort()  # sort in place
>>> data
[1, 2, 3, 4, 5, 6]
>>> # the in-place sort changed the order of the original data
>>>
>>> data=[6,3,1,2,4,5]
>>> data
[6, 3, 1, 2, 4, 5]
>>> data2=sorted(data)  # copy sort
>>> data2
[1, 2, 3, 4, 5, 6]
>>> data
[6, 3, 1, 2, 4, 5]
>>> # the copy sort left the order of the original data unchanged
>>>
>>> data=[6,3,1,2,4,5]
>>> data
[6, 3, 1, 2, 4, 5]
>>> data.sorted()
Traceback (most recent call last):
  File "<pyshell#16>", line 1, in <module>
    data.sorted()
AttributeError: 'list' object has no attribute 'sorted'
>>>
>>> data=[6,3,1,2,4,5]
>>> data
[6, 3, 1, 2, 4, 5]
>>> data.sort(reverse=True)  # sort in place, descending
>>> data
[6, 5, 4, 3, 2, 1]
>>>
>>> data=[6,3,1,2,4,5]
>>> data
[6, 3, 1, 2, 4, 5]
>>> data2=sorted(data,reverse=True)  # copy sort, descending
>>> data2
[6, 5, 4, 3, 2, 1]
>>>
Use copy sorting to sort the data for the coach's 4 players:
Function chain: allows a series of functions to be applied to the data. Unlike a method chain, a function chain is read from right to left.
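A small sketch of a function chain applied to one player's list. The sample times are hypothetical and only stand in for the data read from a file.

```python
# Hypothetical sample times (not the real file contents).
sarah = ['2:58', '2.58', '2-55', '2.18']

# Read right to left: sorted() builds an ordered copy first,
# then print() displays that copy.
print(sorted(sarah))

# The copy sort leaves the original list untouched.
ordered = sorted(sarah)
print(sarah)
```

Because the strings use mixed separators, the ordered copy does not reflect the real running times.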
The sorted results show 2-55 placed before 2.18. The data format is not uniform, and the mixed minute/second separators confuse the sort.
Python compares strings character by character, and in ascending order a dash sorts before a dot, which sorts before a colon;
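The ordering of the three separators can be checked directly. This sketch only uses the built-in ord() function and plain string comparison.

```python
# Each separator's code point decides where it lands in an ascending sort.
separators = ['-', '.', ':']
code_points = [ord(ch) for ch in separators]
print(code_points)

# Strings are compared character by character, so after the shared
# leading '2' the separator settles the comparison.
print('2-55' < '2.18' < '2:01')
```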
Need to fix the data!!
3. Data correction
Define a function sanitize() that takes a string from a player's list as input, replaces any dash or colon it finds with a dot, and returns the cleaned string.
If the string already contains a dot, no substitution is needed;
This converts the existing data to a clean version.
def sanitize(time_string):
    if '-' in time_string:
        splitter = '-'
    elif ':' in time_string:
        splitter = ':'
    else:
        return(time_string)
    (mins, secs) = time_string.split(splitter)
    return(mins + '.' + secs)

with open('james.txt') as jaf:
    data = jaf.readline()
james = data.strip().split(',')
with open('julie.txt') as juf:
    data = juf.readline()
julie = data.strip().split(',')
with open('mikey.txt') as mif:
    data = mif.readline()
mikey = data.strip().split(',')
with open('sarah.txt') as saf:
    data = saf.readline()
sarah = data.strip().split(',')

clean_james = []
clean_julie = []
clean_mikey = []
clean_sarah = []

for each_t in james:
    clean_james.append(sanitize(each_t))
for each_t in julie:
    clean_julie.append(sanitize(each_t))
for each_t in mikey:
    clean_mikey.append(sanitize(each_t))
for each_t in sarah:
    clean_sarah.append(sanitize(each_t))

print(sorted(clean_james))
print(sorted(clean_julie))
print(sorted(clean_mikey))
print(sorted(clean_sarah))
The cleanup worked! The format is now consistent, and the data sorts in the right order.
But the code repeats itself: it creates 4 lists to hold the data read from the files, then 4 more lists to hold the cleaned data, with iteration all over the place.
There is a better way: transforming lists. Python provides a useful tool, the list comprehension, that reduces the code needed to convert one list into another.
4. List comprehension
Transforming a list involves 4 things:
1) Create a new list to store the converted data;
2) Iterate over the data in the original list;
3) Perform the conversion at each iteration;
4) Append the converted data to the new list;
clean_james = []  # 1. create
for each_t in james:  # 2. iterate
    clean_james.append(sanitize(each_t))  # 3. convert, 4. append

clean_james = [sanitize(each_t) for each_t in james]  # list comprehension: one line of code does the create, iterate, convert, and append steps
Using list comprehensions to process the coach's 4 lists of times:
As expected, the output is exactly the same as before.
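A self-contained sketch of this step, shown for two of the four lists to keep it short. The sample lists are hypothetical stand-ins for the data read from the files; sanitize() follows the definition given earlier.

```python
def sanitize(time_string):
    """Return the time with a '-' or ':' separator replaced by '.'."""
    if '-' in time_string:
        splitter = '-'
    elif ':' in time_string:
        splitter = ':'
    else:
        return time_string
    mins, secs = time_string.split(splitter)
    return mins + '.' + secs

# Hypothetical sample data (not the real file contents).
james = ['2-34', '3:21', '2.34']
sarah = ['2:58', '2-55', '2.18']

# Each comprehension creates, iterates, converts, and appends in one line;
# sorted() then returns an ordered copy of the cleaned list.
sorted_james = sorted([sanitize(t) for t in james])
sorted_sarah = sorted([sanitize(t) for t in sarah])
print(sorted_james)
print(sorted_sarah)
```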
Note: You may be tempted to use the function chain sorted(sanitize(t)) inside the list comprehension, but don't! sanitize(t) returns a single data item, while the sorted() BIF expects a list to sort, not a single data item.
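A short sketch of what goes wrong. When sorted() is handed a single string, it does not raise an error; it treats the string as a sequence of characters and sorts those, which is not what we want. The sample values are hypothetical.

```python
# A single cleaned time, as sanitize() would return it.
single_time = '2.18'

# sorted() on one string sorts its individual characters.
chars = sorted(single_time)
print(chars)

# What we actually want: sort the whole list of cleaned times.
times = ['3.21', '2.18', '2.55']
ordered = sorted(times)
print(ordered)
```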
5. Removing duplicates by iteration
Task: Produce the 3 fastest times for each player
Specify individual list items using index notation: james[0], james[1], james[2]
List slice: james[0:3]
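A brief sketch comparing the two notations on a hypothetical, already sorted list:

```python
# Hypothetical sorted, cleaned times for one player.
james = ['2.01', '2.22', '2.34', '2.45', '3.01']

# Index notation picks out one item at a time.
by_index = [james[0], james[1], james[2]]

# A slice picks items from index 0 up to, but not including, index 3.
by_slice = james[0:3]

print(by_index)
print(by_slice)
```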
Use iteration to remove duplicates and get each player's 3 fastest times.
def sanitize(time_string):
    if '-' in time_string:
        splitter = '-'
    elif ':' in time_string:
        splitter = ':'
    else:
        return(time_string)
    (mins, secs) = time_string.split(splitter)
    return(mins + '.' + secs)

with open('james.txt') as jaf:
    data = jaf.readline()
james = data.strip().split(',')
with open('julie.txt') as juf:
    data = juf.readline()
julie = data.strip().split(',')
with open('mikey.txt') as mif:
    data = mif.readline()
mikey = data.strip().split(',')
with open('sarah.txt') as saf:
    data = saf.readline()
sarah = data.strip().split(',')

# replace the original unordered, inconsistent data with the sorted, sanitized data
james = sorted([sanitize(t) for t in james])
julie = sorted([sanitize(t) for t in julie])
mikey = sorted([sanitize(t) for t in mikey])
sarah = sorted([sanitize(t) for t in sarah])

unique_james = []  # create a new list that stores unique data
for each_t in james:  # iterate over the existing data
    if each_t not in unique_james:  # if this item is not yet in the new list
        unique_james.append(each_t)  # append this unique item to the new list
print(unique_james[0:3])  # use a list slice to get the first 3 items and print them

unique_julie = []
for each_t in julie:
    if each_t not in unique_julie:
        unique_julie.append(each_t)
print(unique_julie[0:3])

unique_mikey = []
for each_t in mikey:
    if each_t not in unique_mikey:
        unique_mikey.append(each_t)
print(unique_mikey[0:3])

unique_sarah = []
for each_t in sarah:
    if each_t not in unique_sarah:
        unique_sarah.append(each_t)
print(unique_sarah[0:3])
>>>
===== RESTART: d:\workspace\headfirstpython\chapter5\page162\page162.py =====
['2.01', '2.22', '2.34']
['2.11', '2.23', '2.59']
['2.22', '2.38', '2.49']
['2.18', '2.25', '2.39']
>>>
It worked!!
However, the code that removes duplicates from the list is itself duplicated. You can extract the duplicated code into a small function.
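One possible shape for such a function is sketched below. The name unique_items and the sample list are hypothetical, chosen for illustration.

```python
def unique_items(items):
    """Return a new list with duplicates removed, keeping the first-seen order."""
    unique = []
    for each_item in items:
        if each_item not in unique:
            unique.append(each_item)
    return unique

# Hypothetical sorted, cleaned times containing duplicates.
james = ['2.01', '2.01', '2.22', '2.34', '2.34', '2.45']

# The same call can be reused for each of the four players.
top_three = unique_items(james)[0:3]
print(top_three)
```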
6. Quickly removing duplicates with sets
1) Removing duplicates with a set
The most prominent feature of a Python set is that its data items are unordered and duplicates are not allowed. If you try to add a data item that is already in the set, Python ignores it.
distances = set()  # create a new empty set and assign it to a variable
distances = {10.6, 11, 8, 10.6, 'both', 7}  # the duplicate 10.6 in the supplied values is ignored
distances = set(james)  # any duplicates in james are ignored
Factory function: used to create a new data item of a particular type. For example, set() is a factory function. In the real world, factories produce products, hence the name.
Optimized code:
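A sketch of how the set-based version might look for one player, assuming the same sanitize() function as before. The sample list is hypothetical and stands in for the data read from a file.

```python
def sanitize(time_string):
    """Return the time with a '-' or ':' separator replaced by '.'."""
    if '-' in time_string:
        splitter = '-'
    elif ':' in time_string:
        splitter = ':'
    else:
        return time_string
    mins, secs = time_string.split(splitter)
    return mins + '.' + secs

# Hypothetical sample data (not the real file contents).
james = ['2-34', '3:21', '2.34', '2:01', '2:01', '2-22']

# Clean each time, let set() drop the duplicates, sort what is left,
# then slice off the first three items.
top_three = sorted(set([sanitize(t) for t in james]))[0:3]
print(top_three)
```

Since a set is unordered, sorted() has to be applied after set(), not before it.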
That's great!!!
ch5-processing data, extraction-collation-derivation