Sometimes I ask myself how I could not have known a simpler way to do "such" things in Python 3. When I went looking for answers, over time I found code that was more concise, more effective, and less buggy. Overall (not just in this article), the total number of "those" things was more than I had imagined, but these were the first non-obvious features I ran into, and they led me to seek out more effective, simple, and maintainable code.
Dictionary
keys() and items() in a dictionary
You can do a lot of interesting things with the keys and items of a dictionary, because the objects they return behave like sets:
    aa = {'mike': 'male', 'kathy': 'female', 'steve': 'male', 'hillary': 'female'}
    bb = {'mike': 'male', 'ben': 'male', 'hillary': 'female'}

    aa.keys() & bb.keys()    # {'mike', 'hillary'}  # these are set-like
    aa.keys() - bb.keys()    # {'kathy', 'steve'}
    # If you want to get the common key-value pairs in the two dictionaries
    aa.items() & bb.items()  # {('mike', 'male'), ('hillary', 'female')}
Very concise!
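As a further illustration of the same set-like behavior, here is a minimal sketch that compares two versions of a settings dictionary. The names `old` and `new` and all of their keys are invented for this example.

```python
# Hypothetical data, invented for illustration only
old = {'host': 'localhost', 'port': 8080, 'debug': True}
new = {'host': 'localhost', 'port': 9090, 'verbose': False}

added = new.keys() - old.keys()              # keys only in new
removed = old.keys() - new.keys()            # keys only in old
unchanged = dict(old.items() & new.items())  # identical key-value pairs
# keys present in both dictionaries but with different values
changed = {k for k in old.keys() & new.keys() if old[k] != new[k]}

print(added, removed, unchanged, changed)
```

Note that the set operations on items() only work when every value is hashable; with list values, for example, they raise a TypeError.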
Checking whether a key exists in a dictionary
How many times have you written the following code?
    dictionary = {}
    for k, v in ls:
        if k not in dictionary:
            dictionary[k] = []
        dictionary[k].append(v)
This code is actually not that bad, but why do you always need to use an if statement?
    from collections import defaultdict

    dictionary = defaultdict(list)  # missing keys default to an empty list
    for k, v in ls:
        dictionary[k].append(v)
This is clearer: the redundant if statement that obscured the intent is gone.
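To make the comparison concrete, here is a small runnable sketch of the grouping pattern. The `(department, employee)` pairs in `ls` are made-up sample data, and the `setdefault` variant is shown only as a point of comparison for when a plain dict is required.

```python
from collections import defaultdict

# Hypothetical input: (department, employee) pairs
ls = [('dev', 'mike'), ('ops', 'kathy'), ('dev', 'steve')]

grouped = defaultdict(list)
for k, v in ls:
    grouped[k].append(v)

# A plain dict gives the same result with setdefault, at the cost of
# building a new empty list on every call
plain = {}
for k, v in ls:
    plain.setdefault(k, []).append(v)

print(dict(grouped))
```

One thing to keep in mind: merely reading a missing key from a defaultdict inserts it, so use `in` or `.get()` when you only want to look.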
Update a dictionary with another dictionary
    from itertools import chain

    a = {'x': 1, 'y': 2, 'z': 3}
    b = {'y': 5, 's': 10, 'x': 3, 'z': 6}

    # Update a with b
    c = dict(chain(a.items(), b.items()))
    c  # {'x': 3, 'y': 5, 'z': 6, 's': 10}
It looks good, but it's not concise enough. See if we can do better:
    c = a.copy()
    c.update(b)
Clearer and more readable!
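On newer interpreters there is an even shorter spelling. The sketch below assumes Python 3.5 or later, where dictionary unpacking inside a dict literal is available; on Python 3.9 and later, `a | b` gives the same result.

```python
a = {'x': 1, 'y': 2, 'z': 3}
b = {'y': 5, 's': 10, 'x': 3, 'z': 6}

# Dictionary unpacking (Python 3.5+): later entries win on duplicate keys
c = {**a, **b}
print(c)

# a is left untouched, just as with copy() followed by update()
print(a)
```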
Get maximum value from a dictionary
If you want to get the maximum value in a dictionary, you might do it directly like this:
    aa = {k: sum(range(k)) for k in range(10)}
    aa  # {0: 0, 1: 0, 2: 1, 3: 3, 4: 6, 5: 10, 6: 15, 7: 21, 8: 28, 9: 36}
    max(aa.values())  # 36
This works, but if you also need the key, you then have to look it up from the value. Instead, we can use zip to flatten the dictionary into (value, key) pairs and take the maximum of those:
    max(zip(aa.values(), aa.keys()))  # (36, 9) => (value, key) pair
Similarly, if you want to traverse a dictionary from the largest to the smallest, you can do this:
    sorted(zip(aa.values(), aa.keys()), reverse=True)
    # [(36, 9), (28, 8), (21, 7), (15, 6), (10, 5), (6, 4), (3, 3), (1, 2), (0, 1), (0, 0)]
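An alternative to the zip approach, sketched here as a comparison rather than as a replacement, is to pass a key function to max and sorted. This keeps the pairs in (key, value) order and never compares the keys themselves.

```python
aa = {k: sum(range(k)) for k in range(10)}

# Key whose value is the largest
best_key = max(aa, key=aa.get)

# (key, value) pairs ordered from the largest value to the smallest
ordered = sorted(aa.items(), key=lambda kv: kv[1], reverse=True)

print(best_key, aa[best_key])
print(ordered[:3])
```

The difference matters when values tie: the zip version breaks ties by comparing keys, while the key-function version leaves tied entries in their original order.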
Unpacking any number of items from a list
We can use the magic of * to capture any number of items from a list:
    def compute_average_salary(person_salary):
        person, *salary = person_salary
        return person, (sum(salary) / float(len(salary)))

    person, average_salary = compute_average_salary(["mike", 40000, 50000, 60000])
    person          # 'mike'
    average_salary  # 50000.0
That's not so interesting on its own, but what if I told you it can also be done like this:
    def compute_average_salary(person_salary_age):
        person, *salary, age = person_salary_age
        return person, (sum(salary) / float(len(salary))), age

    person, average_salary, age = compute_average_salary(["mike", 40000, 50000, 60000, 42])
    age  # 42
It looks very concise!
When you have a dictionary with string keys and list values, instead of traversing the dictionary and then processing the values one by one, you can use a flatter representation (a list of lists), like this:
    # Instead of doing this
    for k, v in dictionary.items():
        process(v)

    # we separate the head from the rest and process the values
    # as a list, similar to the above. head plays the role of the key
    for head, *rest in ls:
        process(rest)

    # If that is not very clear, consider the following example
    aa = {k: list(range(k)) for k in range(5)}  # range is lazy, so wrap it in list()
    aa  # {0: [], 1: [0], 2: [0, 1], 3: [0, 1, 2], 4: [0, 1, 2, 3]}
    for k, v in aa.items():
        print(sum(v))
    # 0
    # 0
    # 1
    # 3
    # 6

    # Instead
    aa = [[ii] + list(range(jj)) for ii, jj in enumerate(range(5))]
    for head, *rest in aa:
        print(sum(rest))
    # 0
    # 0
    # 1
    # 3
    # 6
You can unpack a list into head, *rest, tail and so on.
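A short sketch of the head, *rest, tail form follows. The `record` list and its layout (a name, some scores, then a year) are invented purely for illustration.

```python
# Hypothetical record: a name, any number of scores, then a year
record = ['mike', 72, 85, 91, 2015]

name, *scores, year = record

# The starred name always receives a list, even when it captures nothing
first, *nothing = ['only']

print(name, scores, year, nothing)
```

Only one starred name is allowed per unpacking target, and the other names must still find enough items, otherwise a ValueError is raised.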
Counter from collections
collections is one of my favorite libraries in Python. If the built-in data structures are not enough and you need something more, this is the first place to look.
Part of my daily routine is counting large numbers of words that are not very important. Some might say you can use the words as dictionary keys and their counts as values, and before I came across Counter in collections I might have agreed with you (yes, this long introduction is all because of Counter).
Suppose you read the Wikipedia article on the Python language, convert it into a string, and split it into a list of words (tokenized, in order):
    import re

    word_list = list(map(lambda k: k.lower().strip(),
                         re.split(r'[;,:(\.\s)]\s*', python_string)))
    word_list[:10]
    # ['python', 'is', 'a', 'widely', 'used', 'general-purpose', 'high-level', 'programming', 'language', '[17][18][19]']
It all looks good so far, but if you want to count the words in this list:
    from collections import defaultdict  # again, collections!

    dictionary = defaultdict(int)
    for word in word_list:
        dictionary[word] += 1
That's not bad, but with Counter you save time and can spend it on something more meaningful.
    from collections import Counter

    counter = Counter(word_list)
    # Getting the most common words
    counter.most_common(10)
    # [('the', 164), ('and', 161), ('a', 138), ('python', 138), ('of', 131), ('is', 102), ...]
    list(counter.keys())[:10]  # just like a dictionary (a view cannot be sliced directly)
    # ['', 'limited', 'all', 'code', 'managed', 'multi-paradigm', 'exponentiation', 'fromosing', 'dynamic']
Very concise, but let's look more closely at the methods that Counter provides:
    dir(counter)
    # (this listing comes from an older interpreter; names such as
    # has_key, iteritems and viewkeys exist only in Python 2)
    ['__add__', '__and__', '__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__', '__dict__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__missing__', '__module__', '__ne__', '__new__', '__or__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__weakref__', 'clear', 'copy', 'elements', 'fromkeys', 'get', 'has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'most_common', 'pop', 'popitem', 'setdefault', 'subtract', 'update', 'values', 'viewitems', 'viewkeys', 'viewvalues']
Do you see the __add__ and __sub__ methods? Yes, Counter supports addition and subtraction. So if you have a lot of text whose words you want to count, you don't need Hadoop: you can build one Counter per chunk (the map step) and add them up (the reduce step). You then have a MapReduce built on Counter, and you'll probably thank me later.
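The map and reduce steps described above can be sketched as follows. The three strings in `documents` are placeholder data standing in for a real corpus, and splitting on whitespace is a simplifying assumption.

```python
from collections import Counter
from functools import reduce
from operator import add

# Hypothetical documents standing in for a larger corpus
documents = [
    'python is a language',
    'python is dynamic',
    'a language is a tool',
]

# "Map": build one Counter per document
counts = [Counter(doc.split()) for doc in documents]

# "Reduce": add the Counters together, starting from an empty one
total = reduce(add, counts, Counter())

print(total.most_common(2))
```

Since each Counter is independent, the map step could in principle be farmed out to separate processes and only the final addition done in one place.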
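Flattening nested lists
collections also has a _chain function, which can be used to flatten nested lists. It is a private alias of itertools.chain, so it cannot be imported under the name chain:
    import collections

    ls = [[kk] + list(range(kk)) for kk in range(5)]
    # _chain is private; itertools.chain is the public equivalent
    flattened_list = list(collections._chain(*ls))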
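Because names with a leading underscore are implementation details, a safer sketch of the same flattening uses the public `itertools.chain` directly. `chain.from_iterable` is used here so the outer list does not have to be unpacked into arguments.

```python
from itertools import chain

ls = [[kk] + list(range(kk)) for kk in range(5)]

# Flattens exactly one level of nesting
flattened_list = list(chain.from_iterable(ls))
print(flattened_list)
```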
Opening two files at once
If you're processing a file (say, line by line) and writing the processed lines to another file, you might be tempted to write something like this:
    with open(input_file_path) as inputfile:
        with open(output_file_path, 'w') as outputfile:
            for line in inputfile:
                outputfile.write(line)
Instead, you can open multiple files in the same with statement, like this:
    with open(input_file_path) as inputfile, open(output_file_path, 'w') as outputfile:
        for line in inputfile:
            outputfile.write(line)
This is even more concise!
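For a version that can be run as-is, the sketch below creates its own input in a temporary directory. The file names and the upper-casing step are assumptions added only so there is something visible to process.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names, created only for this demonstration
    input_file_path = os.path.join(tmp, 'input.txt')
    output_file_path = os.path.join(tmp, 'output.txt')

    with open(input_file_path, 'w') as f:
        f.write('first line\nsecond line\n')

    # Both files are closed when the block exits, even if an error occurs
    with open(input_file_path) as inputfile, open(output_file_path, 'w') as outputfile:
        for line in inputfile:
            outputfile.write(line.upper())

    with open(output_file_path) as f:
        result = f.read()

print(result)
```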
Finding Monday from a date
If you have a date that you want to normalize (say, to the Monday before or after it), you can do it like this:
    import datetime

    previous_monday = some_date - datetime.timedelta(days=some_date.weekday())
    # Similarly, you could map to the next Monday as well
    next_monday = some_date + datetime.timedelta(days=-some_date.weekday(), weeks=1)
This is how it is implemented.
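Here is the same calculation with a concrete value plugged in. The date 2024-05-15 is an arbitrary choice for illustration; it falls on a Wednesday, so weekday() returns 2.

```python
import datetime

some_date = datetime.date(2024, 5, 15)  # an arbitrary Wednesday

# weekday() is 0 for Monday, so subtracting it lands on Monday
previous_monday = some_date - datetime.timedelta(days=some_date.weekday())
next_monday = some_date + datetime.timedelta(days=-some_date.weekday(), weeks=1)

print(previous_monday, next_monday)
```

If some_date is itself a Monday, previous_monday is that same day and next_monday is exactly one week later.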
Working with HTML
If you crawl a site, whether out of interest or for profit, you will be facing HTML tags all the time. To parse a wide variety of HTML tags, you can use html.parser:
    from html.parser import HTMLParser

    class HTMLStrip(HTMLParser):

        def __init__(self):
            super().__init__()  # sets up the parser state and calls reset()
            self.ls = []

        def handle_data(self, d):
            # called for each run of text between tags
            self.ls.append(d)

        def get_data(self):
            return ''.join(self.ls)

        @staticmethod
        def strip(snippet):
            html_strip = HTMLStrip()
            html_strip.feed(snippet)
            clean_text = html_strip.get_data()
            return clean_text

    snippet = HTMLStrip.strip(html_snippet)
If you just want to escape HTML:
    import html

    escaped_snippet = html.escape(html_snippet)
    # Back to HTML snippets (unescape is new in Python 3.4)
    html_snippet = html.unescape(escaped_snippet)
    # and so forth ...
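A quick round trip shows what the two calls do. The input string is a made-up snippet chosen to contain the characters that escaping affects.

```python
import html

html_snippet = '<b>Tom & Jerry</b>'  # hypothetical input

# <, > and & are replaced by HTML entities
escaped_snippet = html.escape(html_snippet)
print(escaped_snippet)

# unescape converts the entities back to the original characters
restored = html.unescape(escaped_snippet)
print(restored)
```

By default html.escape also escapes single and double quotes; pass quote=False if the text will not end up inside an attribute value.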