Python dict and list lookup speed, and merging ranges

Source: Internet
Author: User

Lookup speed of Python dictionaries and lists

Recently, while processing genome data, I needed to load a large file (2.7 GB) into a dictionary and then match records from a second, processed file against the dictionary's keys: for each row read from the processed file, check whether it is among the keys. The following two snippets differ enormously in efficiency:

First snippet:

if pos in fre_dist.keys():
    newvalue = fre_dist[pos]

Second snippet:

if pos in fre_dist:
    newvalue = fre_dist[pos]

When processing 30,000 records, the second snippet is much faster than the first.

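The reason: in the first snippet, fre_dist.keys() builds a list (in Python 2), and a membership test on a list is a slow linear scan. In the second snippet, fre_dist is the dictionary itself, and a membership test on a dictionary is a hash lookup, which is fast. In Python 3, keys() returns a view object whose membership test is also hash-based, so the gap largely disappears there, but pos in fre_dist remains the idiomatic form.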
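The difference can be reproduced with a small timing sketch. The data below is a hypothetical stand-in for the real genome file (30,000 integer keys), and key_list plays the role of the list that Python 2's fre_dist.keys() would build:

```python
import timeit

# Hypothetical stand-in data: 30,000 integer positions mapped to a value.
fre_dist = {pos: pos * 2 for pos in range(30000)}

# A real list of the keys, which is what Python 2's fre_dist.keys() returns.
key_list = list(fre_dist)

# A key near the end of the list is the worst case for a linear scan.
probe = 29999

# Time 200 membership tests against the list and against the dict.
list_time = timeit.timeit(lambda: probe in key_list, number=200)
dict_time = timeit.timeit(lambda: probe in fre_dist, number=200)

print("list scan  : %.6f s" % list_time)
print("dict lookup: %.6f s" % dict_time)
```

The absolute numbers depend on the machine, but the list scan should be slower by several orders of magnitude.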

A lesson learned the hard way.

When it comes to iterating over a dict, I think most people reach for the for key in dictobj form, and it does work in most cases. But it is not completely safe; see the following example:

The code is as follows:
# Initialize a dict
>>> d = {'a': 1, 'b': 0, 'c': 1, 'd': 0}
# The intent is to iterate over the dict and delete every element whose value is 0
>>> for k in d:
...     if d[k] == 0:
...         del d[k]
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration
# An exception is raised, and only one of the two zero-valued elements was deleted
>>> d
{'a': 1, 'c': 1, 'd': 0}

>>> d = {'a': 1, 'b': 0, 'c': 1, 'd': 0}
# d.keys() is a list of the keys (Python 2)
>>> d.keys()
['a', 'c', 'b', 'd']
# Iterating this way is fine, because the loop actually walks the fixed list returned by d.keys()
>>> for k in d.keys():
...     if d[k] == 0:
...         del d[k]
...
>>> d
{'a': 1, 'c': 1}
# The result is correct as well
>>>
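The session above is from Python 2. As a minimal sketch of the same idea for Python 3, where d.keys() is a live view rather than a list, the keys can be copied with list(d) before the loop, or a new dict can be built instead of deleting in place. The variable names original and filtered are illustrative only:

```python
d = {'a': 1, 'b': 0, 'c': 1, 'd': 0}

# list(d) copies the keys up front, so deleting from d inside the loop
# no longer changes the sequence being iterated.
for k in list(d):
    if d[k] == 0:
        del d[k]

print(d)

# Alternative: leave the original untouched and build a filtered copy.
original = {'a': 1, 'b': 0, 'c': 1, 'd': 0}
filtered = {k: v for k, v in original.items() if v != 0}
print(filtered)
```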

In fact, this example is a simplified version; I ran into the problem in a multi-threaded program. So my advice is: when iterating over a dict, make a habit of using for k in d.keys(). (In Python 3, where keys() returns a live view rather than a list, the equivalent is for k in list(d).)
However, is this absolutely safe in a multi-threaded program? Not necessarily: once two threads have both called d.keys(), if both then try to delete the same key, the first deletion succeeds and the second raises KeyError. That case has to be guarded by other means.
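The article does not say which "other means" the author had in mind. One possible sketch, offered as an assumption rather than the author's solution, combines a shared lock with pop(key, None), which does not raise KeyError when the key is already gone. The names lock and remove_zeros are hypothetical:

```python
import threading

d = {'a': 1, 'b': 0, 'c': 1, 'd': 0}
lock = threading.Lock()  # hypothetical lock shared by all worker threads


def remove_zeros(shared):
    """Delete every zero-valued entry, tolerating concurrent deleters."""
    # Take a snapshot of the keys while holding the lock.
    with lock:
        keys = list(shared)
    for k in keys:
        with lock:
            # get() returns None if another thread already removed the key,
            # and pop(k, None) never raises KeyError.
            if shared.get(k) == 0:
                shared.pop(k, None)


threads = [threading.Thread(target=remove_zeros, args=(d,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(d)
```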


Another article: performance comparison of two ways of iterating over a dict

On the nagging question of whether dict iteration performs differently with and without parentheses

The code is as follows:
for (d, x) in dict.items():
    print("key:" + d + ", value:" + str(x))

for d, x in dict.items():
    print("key:" + d + ", value:" + str(x))

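According to the measurements, when the dict has fewer than about 200 entries the version with parentheses is slightly faster, while above 200 entries the version without parentheses takes less time. The parentheses are purely syntactic, though, so a difference this small may simply be timing noise.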
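One way to check whether the parentheses matter to the interpreter at all is to compare the syntax trees of the two loop forms. This is a small sketch using the standard ast module; the loop body is replaced with pass and the name items is a placeholder:

```python
import ast

with_parens = "for (d, x) in items:\n    pass\n"
without_parens = "for d, x in items:\n    pass\n"

# ast.dump() omits line and column information by default, so the two
# dumps are equal only if the parser built the same tree for both forms.
same_tree = ast.dump(ast.parse(with_parens)) == ast.dump(ast.parse(without_parens))
print(same_tree)
```

If the trees are identical, the compiler receives the same input for both loops, and any timing gap has to come from the measurement rather than from the syntax.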

A dictionary is written with curly braces ({}). Its items come in pairs, each key corresponding to one value; a key and its value are separated by a colon (:), and different items are separated by commas (,).

Python Shell:

>>> n = {'username': 'zz', 'password': 123}
>>> n.keys()
dict_keys(['username', 'password'])
>>> n.values()
dict_values(['zz', 123])
>>> n.items()
dict_items([('username', 'zz'), ('password', 123)])
>>> for (k, v) in n.items():
...     print("This's key: %r" % k)
...     print("This's value: %r" % v)
...
This's key: 'username'
This's value: 'zz'
This's key: 'password'
This's value: 123


zip(): takes the elements at the same position from each sequence in turn and combines them into tuples

>>> n = [1, 2, 3]
>>> m = ['a', 'b', 'c']
>>> a = zip(m, n)
>>> for i in a:
...     print(i)
...
('a', 1)
('b', 2)
('c', 3)

>>> n = [1, 2, 3]
>>> m = ['a', 'b', 'c']
>>> a = zip(m, n)
>>> for (x, y) in a:
...     print(x, y)
...
a 1
b 2
c 3
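Since zip() yields pairs, it connects naturally to the dictionaries discussed earlier. The sketch below builds a dict from two parallel lists and shows that zip() stops at the shortest input; the names keys, values and pairs are illustrative:

```python
keys = ['username', 'password']
values = ['zz', 123]

# dict() accepts an iterable of (key, value) pairs, which zip() provides.
n = dict(zip(keys, values))
print(n)

# zip() stops at the shortest input, so the extra element 4 is dropped.
pairs = list(zip(['a', 'b', 'c'], [1, 2, 3, 4]))
print(pairs)
```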

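Merging ranges:

# range(48, 58) + range(65, 91) only works in Python 2, where range() returns a list;
# wrapping each range in list() works in both Python 2 and Python 3.
for i in list(range(48, 58)) + list(range(65, 91)):
    c8 = chr(i)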
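An alternative that avoids building intermediate lists is itertools.chain from the standard library, which walks one iterable after another. A minimal sketch, with chars as an illustrative name:

```python
from itertools import chain

# Code points 48-57 are the digits '0'-'9'; 65-90 are the letters 'A'-'Z'.
chars = [chr(i) for i in chain(range(48, 58), range(65, 91))]
print("".join(chars))
```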
