Python reads large files more and more slowly (judging key in map, never use in keys ())

Source: Internet
Author: User
Tags python list

Background:

Today Le Leche writes code, reads a four hundred or five hundred trillion file, and then does a bunch of processing. The results were processed for a day that had not yet come out. What is the problem?

Solve:

1. Lele has printed the time at different points in time, the direct print times (). Discovering a rule, the execution speed is getting slower.

2. Why is it getting slower?

1) Possible cause 1,GC problem, there is an article written inside, Python list append will be more and more slow, the solution is to prohibit GC:

Using Gc.disable () and gc.enable ()

2) Change the above, still not, and then see an article in the write, it may be because of git caused, because git will keep syncing, append, and then delete the. Git folder, the result is still not working.

3) Continue to query, send the next and where there may be a problem. Dict in Dict.key () to determine if key is in dict, the efficiency of this is very low. See an article comparing efficiency:
① using in Dict.keys () efficiency:

② using Has_key () efficiency: found that has_key () efficiency is relatively stable. So modify, solve the problem. something:At first, it is true to use Has_key (), the results of the upload code at the time, the company code check too, the hint can not use this function, can only be changed to in Dict.key () This way, why companies do not let this pass? After some Baidu, found the reason: In the Python3, directly the Has_key () function to delete, so prohibit the use. What if it's forbidden? Originally python in is very intelligent, can automatically determine whether the key in the dictionary exists. Therefore the most formal practice is not has_key (), not in Dict.keys (), but in Dict.

Appendix:

In, in Dict.keys (), Has_key () methods Combat comparison:

>>> a = {' name ': ' Tom ', ' age ': Ten, ' Tel ':110}>>> a{' age ': Ten, ' tel ': ' ' name ': ' Tom '}>>> print ' Age ' in atrue>>> print ' age ' in A.keys () true>>>>>> print A.has_key ("Age") True

Resources:

Https://www.douban.com/group/topic/44472300/

Http://www.it1352.com/225441.html

52160942

Python reads large files more and more slowly (judging key in map, never use in keys ())

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.