Sometimes I ask myself how I did not know the simpler way to do a given thing in Python 3. When I go looking for answers, I usually find code that is more concise, more efficient, and less bug-prone. Overall (not just in this article), the number of such things is larger than I expected. This is the first batch of less obvious features that I came across while looking for more effective, simpler, and more maintainable code.
Dictionaries
keys() and items() of a dictionary
You can do a lot of interesting things with the keys and items of a dictionary, because the views they return behave like sets:
aa = {'Mike': 'Male', 'Kathy': 'Female', 'Steve': 'Male', 'Hillary': 'Female'}
bb = {'Mike': 'Male', 'Ben': 'Male', 'Hillary': 'Female'}
aa.keys() & bb.keys()  # {'Mike', 'Hillary'}  # these are set-like
aa.keys() - bb.keys()  # {'Kathy', 'Steve'}
# If you want to get the common key-value pairs in the two dictionaries
aa.items() & bb.items()  # {('Mike', 'Male'), ('Hillary', 'Female')}
It's that simple!
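Because key views support the usual set operators, comparing two dictionaries takes only a few lines. The sketch below uses two made-up dictionaries, `old` and `new` (hypothetical names and data, not from the example above), and reports which keys were added, removed, or changed.

```python
# Hypothetical example data; the names old/new are illustrative only.
old = {"host": "localhost", "port": 8080, "debug": True}
new = {"host": "localhost", "port": 9090, "timeout": 30}

added = new.keys() - old.keys()      # keys only in new
removed = old.keys() - new.keys()    # keys only in old
common = old.keys() & new.keys()     # keys present in both
changed = {k for k in common if old[k] != new[k]}

print(sorted(added))    # ['timeout']
print(sorted(removed))  # ['debug']
print(sorted(changed))  # ['port']
```

Note that items() views support the same operators only when every value is hashable; with list or dict values the set operations raise a TypeError.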
Checking whether a key exists in a dictionary
How many times have you written the following code?
dictionary = {}
for k, v in ls:
    if k not in dictionary:
        dictionary[k] = []
    dictionary[k].append(v)
This code is not that bad, but why should you need the if statement every time?
from collections import defaultdict
dictionary = defaultdict(list)  # missing keys default to an empty list
for k, v in ls:
    dictionary[k].append(v)
This is clearer, and the redundant if statement is gone.
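To make the grouping pattern concrete, here is a small runnable sketch. The input list `ls` below is invented for illustration. It also shows dict.setdefault, which gives the same result with a plain dictionary.

```python
from collections import defaultdict

# Hypothetical input: a list of (key, value) pairs
ls = [("fruit", "apple"), ("veg", "carrot"), ("fruit", "pear")]

grouped = defaultdict(list)
for k, v in ls:
    grouped[k].append(v)  # a missing key is created as [] on first access

print(dict(grouped))  # {'fruit': ['apple', 'pear'], 'veg': ['carrot']}

# The same grouping with a plain dict and setdefault
plain = {}
for k, v in ls:
    plain.setdefault(k, []).append(v)

print(plain == grouped)  # True
```

One thing to keep in mind: merely reading a missing key from a defaultdict inserts it, so use `in` or `.get()` when you only want to test for presence.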
Updating a dictionary with another dictionary
from itertools import chain
a = {'x': 1, 'y': 2, 'z': 3}
b = {'y': 5, 's': 10, 'x': 3, 'z': 6}
# Update a with b
c = dict(chain(a.items(), b.items()))
c  # {'x': 3, 'y': 5, 'z': 6, 's': 10}
It looks good, but it's not concise enough. Let's see if we can do better:
c = a.copy()
c.update(b)
Clearer and more readable!
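Another option, not mentioned above, is dictionary unpacking, available since Python 3.5. The sketch below builds the merged dictionary in one expression and leaves both inputs untouched. On Python 3.9 and later, `a | b` does the same thing.

```python
a = {'x': 1, 'y': 2, 'z': 3}
b = {'y': 5, 's': 10, 'x': 3, 'z': 6}

# Later entries win, so values from b override those from a
c = {**a, **b}

print(c)  # {'x': 3, 'y': 5, 'z': 6, 's': 10}
print(a)  # {'x': 1, 'y': 2, 'z': 3}
```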
Get the maximum value from a dictionary
If you want to get the maximum value in a dictionary, it might be as straightforward as this:
aa = {k: sum(range(k)) for k in range(10)}
aa  # {0: 0, 1: 0, 2: 1, 3: 3, 4: 6, 5: 10, 6: 15, 7: 21, 8: 28, 9: 36}
max(aa.values())  # 36
This works, but if you also need the key, you then have to look it up from the value. Instead, we can use zip to flatten the representation and get back a (value, key) pair:
max(zip(aa.values(), aa.keys()))
# (36, 9) => value, key pair
Similarly, if you want to traverse a dictionary from the largest to the smallest, you can do so:
sorted(zip(aa.values(), aa.keys()), reverse=True)
# [(36, 9), (28, 8), (21, 7), (15, 6), (10, 5), (6, 4), (3, 3), (1, 2), (0, 1), (0, 0)]
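A common alternative to the zip approach is the `key` argument of max and sorted. This is a sketch of that alternative, not the method used above. It returns keys (or key-value pairs) directly, so nothing has to be swapped around.

```python
aa = {k: sum(range(k)) for k in range(10)}

# Key whose value is largest
best_key = max(aa, key=aa.get)
print(best_key, aa[best_key])  # 9 36

# (key, value) pairs ordered from largest to smallest value
by_value = sorted(aa.items(), key=lambda kv: kv[1], reverse=True)
print(by_value[:3])  # [(9, 36), (8, 28), (7, 21)]
```

The two approaches differ on ties: the zip version falls back to comparing keys, while the `key` version keeps the dictionary's own order among equal values.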
Unpacking any number of items from a list
We can use the magic of * to capture any number of items in a list:
def compute_average_salary(person_salary):
    person, *salary = person_salary
    return person, (sum(salary) / float(len(salary)))
person, average_salary = compute_average_salary(["Mike", 40000, 50000, 60000])
person  # 'Mike'
average_salary  # 50000.0
That's not so interesting on its own, but what if I told you it could also look like this:
def compute_average_salary(person_salary_age):
    person, *salary, age = person_salary_age
    return person, (sum(salary) / float(len(salary))), age
person, average_salary, age = compute_average_salary(["Mike", 40000, 50000, 60000, 42])
age  # 42
It looks very neat!
When you have a dictionary with string keys and list values, instead of iterating over the dictionary and processing each value in turn, you can use a flatter representation (a list of lists), as follows:
# Instead of doing this
for k, v in dictionary.items():
    process(v)
# we separate head and the rest, and process the values
# as a list, similar to the above; head plays the role of the key
for head, *rest in ls:
    process(rest)
# If that is not very clear, consider the following example
aa = {k: list(range(k)) for k in range(5)}  # range returns a lazy range object, hence list()
aa  # {0: [], 1: [0], 2: [0, 1], 3: [0, 1, 2], 4: [0, 1, 2, 3]}
for k, v in aa.items():
    print(sum(v))
# 0
# 0
# 1
# 3
# 6
# Instead
aa = [[ii] + list(range(jj)) for ii, jj in enumerate(range(5))]
for head, *rest in aa:
    print(sum(rest))
# 0
# 0
# 1
# 3
# 6
You can unpack a list into head, *rest, tail and so on.
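As a quick illustration of the head, *rest, tail form, here is a minimal sketch with a made-up record. The starred name always receives a list, which may be empty.

```python
# Hypothetical record: name, salaries..., age
record = ["Mike", 40000, 50000, 60000, 42]

head, *rest, tail = record
print(head)  # Mike
print(rest)  # [40000, 50000, 60000]
print(tail)  # 42

# With only two items, the starred part is an empty list
first, *middle, last = [1, 2]
print(middle)  # []
```

Only one starred name is allowed per unpacking target, and the sequence must still have enough items for the unstarred names.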
Counter in collections
collections is one of my favorite libraries in Python. If you need a data structure beyond the built-in ones, this is the first place to look.
Part of my day job is counting lots of words (not that important). One could say that you can use the words as dictionary keys and their counts as values, and I would have agreed before I came across Counter in collections (yes, all of this introduction is because of Counter).
Let's say you read the Wikipedia article on the Python language, turn it into a string, and tokenize it into a list (keeping the order):
import re
word_list = list(map(lambda k: k.lower().strip(), re.split(r'[;,:(.\s)]\s*', python_string)))
word_list[:10]  # ['python', 'is', 'a', 'widely', 'used', 'general-purpose', 'high-level', 'programming', 'language', '[17][18][19]']
So far it looks good, but if you want to count the words in this list:
from collections import defaultdict  # again, collections!
dictionary = defaultdict(int)
for word in word_list:
    dictionary[word] += 1
That's not bad, but with Counter you can save your time for more meaningful things.
from collections import Counter
counter = Counter(word_list)
# Getting the 10 most common words
counter.most_common(10)
# [('the', 164), ('and', 161), ('a', 138), ('python', 138), ('is', 131), ('are', 102), ('to', ...), ('in', ...)]
list(counter.keys())[:10]  # just like a dictionary; in Python 3 the keys view must become a list before slicing
# ['', 'limited', 'all', 'code', 'managed', 'multi-paradigm', 'exponentiation', 'fromosing', 'dynamic']
Very concise. But look at the methods available on the counter:
dir(counter)
['__add__', '__and__', '__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__', '__dict__',
'__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__',
'__init__', '__iter__', '__le__', '__len__', '__lt__', '__missing__', '__module__', '__ne__', '__new__',
'__or__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__',
'__str__', '__sub__', '__subclasshook__', '__weakref__', 'clear', 'copy', 'elements', 'fromkeys', 'get',
'has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'most_common', 'pop', 'popitem', 'setdefault',
'subtract', 'update', 'values', 'viewitems', 'viewkeys', 'viewvalues']
Did you see the __add__ and __sub__ methods? Yes, Counter supports addition and subtraction. So if you have a lot of text whose words you want to count, you don't need Hadoop: you can build one Counter per piece of text (the map step) and add them up (the reduce step). Then you have MapReduce built on Counter, and you'll probably thank me later.
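The map-and-reduce idea can be sketched in a few lines. The text chunks below are invented for illustration, and splitting on whitespace is a simplifying assumption; a real tokenizer would do more.

```python
from collections import Counter
from functools import reduce
from operator import add

# Hypothetical chunks of text, e.g. one per file or per worker
chunks = ["the cat sat", "the dog sat", "the end"]

# "Map": build one Counter per chunk
partial_counts = [Counter(chunk.split()) for chunk in chunks]

# "Reduce": add the partial counts together
total = reduce(add, partial_counts, Counter())

print(total["the"])          # 3
print(total["sat"])          # 2
print(total.most_common(1))  # [('the', 3)]
```

Adding Counters with `+` keeps only positive counts, which is what you want for word counting. Use the update() method instead if zero or negative counts must survive.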
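Flattening nested lists
collections also carries _chain, a private alias for itertools.chain, which can be used to flatten nested lists. Because the leading underscore marks it as an implementation detail, it is safer to import chain from itertools directly:
from itertools import chain
ls = [[kk] + list(range(kk)) for kk in range(5)]
flattened_list = list(chain(*ls))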
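A close variant is chain.from_iterable from itertools, which takes the outer list itself rather than unpacked arguments. The sketch below reuses the same `ls` construction and shows the flattened result.

```python
from itertools import chain

ls = [[kk] + list(range(kk)) for kk in range(5)]
# ls == [[0], [1, 0], [2, 0, 1], [3, 0, 1, 2], [4, 0, 1, 2, 3]]

flattened = list(chain.from_iterable(ls))
print(flattened)  # [0, 1, 0, 2, 0, 1, 3, 0, 1, 2, 4, 0, 1, 2, 3]
```

Both forms flatten exactly one level of nesting. from_iterable also works when the outer iterable is lazy or very long, because it does not need to unpack it into arguments first.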
Open two files at the same time
If you are processing a file (line by line, say) and want to write the processed lines to another file, you may be tempted to write something like this:
with open(input_file_path) as inputfile:
    with open(output_file_path, 'w') as outputfile:
        for line in inputfile:
            outputfile.write(process(line))
Instead, you can open multiple files in the same with statement, as follows:
with open(input_file_path) as inputfile, open(output_file_path, 'w') as outputfile:
    for line in inputfile:
        outputfile.write(process(line))
This is more concise!
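Here is a self-contained version you can run as-is. The `process` function and the file names are placeholders made up for this sketch, and a temporary directory is used so nothing is left behind.

```python
import os
import tempfile

def process(line):
    # Hypothetical stand-in for real per-line processing
    return line.upper()

with tempfile.TemporaryDirectory() as tmp:
    input_file_path = os.path.join(tmp, "in.txt")
    output_file_path = os.path.join(tmp, "out.txt")

    # Create a small input file to work on
    with open(input_file_path, "w") as f:
        f.write("first\nsecond\n")

    # Both files are opened, and later closed, by a single with statement
    with open(input_file_path) as inputfile, open(output_file_path, "w") as outputfile:
        for line in inputfile:
            outputfile.write(process(line))

    with open(output_file_path) as f:
        result = f.read()

print(repr(result))  # 'FIRST\nSECOND\n'
```

If the second open() fails, the first file is still closed properly, just as with the nested form.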
Finding Monday from a date
If you have a date that you want to normalize (for example, to the previous or the next Monday), you might do the following:
import datetime
previous_monday = some_date - datetime.timedelta(days=some_date.weekday())
# Similarly, you could map to the next Monday as well
next_monday = some_date + datetime.timedelta(days=-some_date.weekday(), weeks=1)
That is all it takes.
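To check the arithmetic, here is the same calculation on a concrete date. The date itself is an arbitrary choice for illustration. weekday() returns 0 for Monday through 6 for Sunday, so subtracting that many days lands on the Monday of the same week.

```python
import datetime

some_date = datetime.date(2024, 5, 15)  # a Wednesday, so weekday() == 2

previous_monday = some_date - datetime.timedelta(days=some_date.weekday())
next_monday = some_date + datetime.timedelta(days=-some_date.weekday(), weeks=1)

print(previous_monday)  # 2024-05-13
print(next_monday)      # 2024-05-20
```

When some_date is itself a Monday, previous_monday is that same day, and next_monday is exactly one week later.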
Working with HTML
If you scrape websites for fun or for profit, you probably face HTML tags all the time. To parse and strip the various HTML tags, you can use html.parser:
from html.parser import HTMLParser
class HTMLStrip(HTMLParser):
    def __init__(self):
        super().__init__()  # initializes the parser state (and calls reset())
        self.ls = []
    def handle_data(self, d):
        # called for each run of text between tags
        self.ls.append(d)
    def get_data(self):
        return ''.join(self.ls)
    @staticmethod
    def strip(snippet):
        html_strip = HTMLStrip()
        html_strip.feed(snippet)
        clean_text = html_strip.get_data()
        return clean_text
snippet = HTMLStrip.strip(html_snippet)
If you just want to escape the HTML:
import html
escaped_snippet = html.escape(html_snippet)
# Back to the HTML snippet (html.unescape is new in Python 3.4)
html_snippet = html.unescape(escaped_snippet)
# and so forth ...
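A short round trip shows what escaping actually does. The snippet below is a made-up example. By default html.escape also escapes quotation marks, which makes the result safe to place inside an attribute value.

```python
import html

# Hypothetical HTML snippet
html_snippet = '<a href="/about">Tom & Jerry</a>'

escaped = html.escape(html_snippet)
print(escaped)  # &lt;a href=&quot;/about&quot;&gt;Tom &amp; Jerry&lt;/a&gt;

restored = html.unescape(escaped)
print(restored == html_snippet)  # True
```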