Suggestions for avoiding common problems in code written by Python novices

Source: Internet
Author: User
Tags exception handling in python

This article collects the unidiomatic, and occasionally subtly buggy, patterns I see in code written by novice Python developers. Its purpose is to help novice developers get past the stage of writing ugly Python code. To suit the target audience, it makes some simplifications (for example, ignoring generators and the powerful iteration tools in itertools when discussing iterators).

Novice developers always have some reason for using an anti-pattern, and I have tried to give those reasons where possible. But these patterns often make code less readable, more prone to bugs, and inconsistent with Python's code style. If you are looking for more introductory material, I highly recommend the Python Tutorial or Dive Into Python.

Iteration

Use of range

New Python programmers like to use range for simple iteration, running over the length of a sequence to get each of its elements:

for i in range(len(alist)):
  print alist[i]

Keep in mind that range is not meant for simple iteration over a sequence. Although a for loop over range seems natural to those used to numerically defined for loops, it is bug-prone when iterating over a sequence and less clear than iterating over the sequence directly:

for item in alist:
  print item

Misusing range easily leads to off-by-one errors, usually because novice programmers forget that the object range produces includes range's first argument but not its second, much like substring in Java and many other functions of this kind. Novices who worry about running past the end of the sequence then introduce bugs:

# The wrong way to iterate over a whole sequence
alist = ['her', 'name', 'is', 'rio']
for i in range(0, len(alist) - 1): # Off by one!
  print i, alist[i]
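To make the off-by-one error concrete, here is a minimal sketch that collects the items each version of the loop would reach. The variable names are made up for illustration and do not come from the original example.

```python
alist = ['her', 'name', 'is', 'rio']

# range(start, stop) includes start but excludes stop,
# so subtracting 1 from the length skips the last element.
buggy_indexes = list(range(0, len(alist) - 1))
correct_indexes = list(range(len(alist)))

# Items each loop would actually reach.
buggy_items = [alist[i] for i in buggy_indexes]
correct_items = [alist[i] for i in correct_indexes]
```

Here buggy_items never contains 'rio', while correct_items matches alist exactly.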

Common reasons for misusing range:
1. You need the index inside the loop. This is not a good reason; use enumerate instead:

for index, value in enumerate(alist):
  print index, value

2. You need to iterate over two sequences at once, using the same index to get a value from each. In this case, use zip:

for word, number in zip(words, numbers):
  print word, number

3. You need to iterate over only part of a sequence. In this case, iterate over a slice of the sequence, and add a comment to make the intent clear:

for word in words[1:]: # Exclude the first element
  print word
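The three alternatives above can be tried side by side. This is only a sketch: the numbers list is a hypothetical second sequence, and the results are gathered into lists so they are easy to inspect.

```python
words = ['her', 'name', 'is', 'rio']
numbers = [1, 2, 3, 4]  # hypothetical second sequence

# 1. enumerate yields (index, value) pairs.
indexed = list(enumerate(words))

# 2. zip pairs up elements; it stops at the shorter input.
paired = list(zip(words, numbers))

# 3. A slice selects part of the sequence; [1:] skips the first element.
tail = [word for word in words[1:]]
```

None of the three needs range or an explicit index variable.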

One exception: when you iterate over a large sequence, the slice operation adds noticeable overhead, because slicing a list copies it. With only 10 elements in the sequence this is no problem, but the cost matters with 10 million elements, or when slicing inside a performance-sensitive inner loop. In that case, you might consider using xrange instead of range [1].

Apart from iterating over a sequence, an important use of range is when you really want to generate a sequence of numbers rather than indexes:

# Print foo(x) for 0 <= x < 5
for x in range(5):
  print foo(x)

Use list comprehensions correctly

If you have a loop like this:

# An ugly, slow way to build a list
words = ['her', 'name', 'is', 'rio']
alist = []
for word in words:
  alist.append(foo(word))

You can rewrite it with a list comprehension:

words = ['her', 'name', 'is', 'rio']
alist = [foo(word) for word in words]

Why do this? For one thing, you avoid errors from failing to initialize the list correctly; for another, the code looks clean and tidy. Those with a functional programming background may find the map function more familiar, but it seems less Pythonic to me.
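As a rough comparison of the three styles just mentioned, the sketch below builds the same list with a loop, a list comprehension, and map. The body of foo is a hypothetical stand-in, since the article never defines it.

```python
def foo(word):
    # Hypothetical stand-in for the foo used above: upper-case a word.
    return word.upper()

words = ['her', 'name', 'is', 'rio']

# Explicit loop: the list must be initialized before appending.
with_loop = []
for word in words:
    with_loop.append(foo(word))

# List comprehension: creation and filling in one expression.
with_comprehension = [foo(word) for word in words]

# list() is needed around map in Python 3, where map returns an iterator.
with_map = list(map(foo, words))
```

All three produce the same list; the comprehension simply has the fewest moving parts.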

Some other common reasons for not using list comprehensions:

1. You need nested loops. In this case you can nest entire list comprehensions, or use multiple loops in one list comprehension, spread over several lines:

words = ['her', 'name', 'is', 'rio']
letters = []
for word in words:
  for letter in word:
    letters.append(letter)

Using a list comprehension:

words = ['her', 'name', 'is', 'rio']
letters = [letter for word in words
                  for letter in word]

Note: in a list comprehension with multiple loops, the loops appear in the same order as they would if you were not using a list comprehension.
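The ordering rule in the note can be checked directly. In this small sketch (with a shortened word list chosen for illustration), the nested loops and the comprehension list their for clauses in the same order and produce the same result.

```python
words = ['her', 'rio']

letters_loop = []
for word in words:          # outer loop comes first
    for letter in word:     # inner loop comes second
        letters_loop.append(letter)

# Same order of for clauses, read left to right.
letters_comp = [letter for word in words for letter in word]
```

Swapping the two for clauses in the comprehension would fail, because word would be used before it is bound.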

2. You need a conditional test inside the loop. Just add the condition to the list comprehension:

words = ['her', 'name', 'is', 'rio', '1', '2', '3']
alpha_words = [word for word in words if word.isalpha()]

One legitimate reason not to use a list comprehension is that you cannot handle exceptions inside one. If some elements in the iteration may raise an exception, you need to move the exception handling into a function called from the list comprehension, or simply not use a list comprehension.
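One way to move the exception handling into a function might look like the sketch below. The helper to_int_or_none and the sample data are hypothetical, chosen only to show the shape of the approach.

```python
def to_int_or_none(text):
    """Hypothetical helper: convert text to int, or return None on failure."""
    try:
        return int(text)
    except ValueError:
        return None

raw_values = ['1', '2', 'rio', '3']

# The comprehension stays simple; the try/except lives in the helper.
converted = [to_int_or_none(value) for value in raw_values]

# Drop the failures; 'is not None' would keep a legitimate 0.
valid = [number for number in converted if number is not None]
```

If the handling needs more context than a helper can return, a plain for loop with try/except is probably clearer.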

Performance pitfalls

Checking membership in linear time

Syntactically, checking whether a list or a set/dict contains an element looks the same, but under the hood it is quite different. If you need to check repeatedly whether a data structure contains an element, use a set instead of a list. (If you want to associate a value with the element being checked, use a dict; it also gives constant-time checks.)

# Assume we start with a list
lyrics_list = ['her', 'name', 'is', 'rio']

# Avoid this
words = make_wordlist() # Pretend this returns many words to test
for word in words:
  if word in lyrics_list: # Linear-time check
    print word, "is in the lyrics"

# Do this instead
lyrics_set = set(lyrics_list) # Linear-time set construction
words = make_wordlist() # Pretend this returns many words to test
for word in words:
  if word in lyrics_set: # Constant-time check
    print word, "is in the lyrics"
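The dict variant mentioned earlier can be sketched the same way. Here the associated value is a word count, which is an assumption made for illustration, and a short literal list stands in for make_wordlist().

```python
lyrics_list = ['her', 'name', 'is', 'rio', 'her']

# Build the lookup structures once, outside the loop (linear time).
lyrics_set = set(lyrics_list)
lyrics_counts = {}
for word in lyrics_list:
    lyrics_counts[word] = lyrics_counts.get(word, 0) + 1

words = ['rio', 'dances', 'her']  # stand-in for make_wordlist()

# Membership tests against the set and the dict keys are hash lookups.
found = [word for word in words if word in lyrics_set]
counts = [(word, lyrics_counts[word]) for word in words if word in lyrics_counts]
```

Both structures are built once and then queried many times, which is where the saving comes from.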

[Translator's note: set elements and dict keys in Python are hashed, so looking them up takes O(1) time on average.]

Keep in mind that creating a set is a one-time cost: construction takes linear time even though each membership check takes constant time. So if you check membership in a loop, it is usually worth creating the set first, because you only build it once.

Leaky variables

Loops

Generally speaking, a variable's scope in Python is wider than you would expect from other languages. For example, the following Java code will not compile:

// Get the index of the lowest-indexed item in the array
// that is > maxValue
for (int i = 0; i < y.length; i++) {
  if (y[i] > maxValue) {
    break;
  }
}
// i is illegal here: there is no i in this scope
processArray(y, i);

In Python, however, the equivalent code always runs and usually gives the expected result:

for idx, value in enumerate(y):
  if value > max_value:
    break

processlist(y, idx)

This code works correctly unless y is empty, in which case the loop body never executes and the call to processlist raises a NameError because idx is undefined. If you use the Pylint code checker, it will warn you about using the possibly undefined variable idx.
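The failure on an empty sequence is easy to reproduce. In this sketch the loop is wrapped in a hypothetical function, index_of_first_above, so the error can be caught and inspected; that wrapper is not part of the original example.

```python
def index_of_first_above(y, max_value):
    """Hypothetical wrapper around the loop above; returns idx after the loop."""
    for idx, value in enumerate(y):
        if value > max_value:
            break
    return idx  # never assigned if y is empty

def fails_on_empty():
    try:
        index_of_first_above([], 10)
    except NameError:
        # Inside a function the error is UnboundLocalError, a NameError subclass.
        return True
    return False

normal_case = index_of_first_above([1, 20, 30], 10)
empty_case_fails = fails_on_empty()
```

With a non-empty list the loop variable survives the loop, which is exactly the leak described above.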

The solution is straightforward: set idx to some special value before the loop, so you know what to look for if the loop never executes. This is called the sentinel pattern. So what value makes a good sentinel? In the C era or earlier, when int ruled the programming world, the usual way for a function to signal an expected error was to return -1. For example, when returning the index of an item in a list:

def find_item(item, alist):
  # None is arguably more Pythonic than -1
  result = -1
  for idx, other_item in enumerate(alist):
    if other_item == item:
      result = idx
      break

  return result

Generally, None is a good sentinel value in Python, even though Python's standard types do not use it consistently (e.g. str.find [2]).
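For comparison, a version of find_item that uses None as the sentinel might look like this. It is a sketch of the idea rather than a replacement for the article's code, and the 'duran' lookup is just an arbitrary missing word.

```python
def find_item(item, alist):
    """Return the index of item in alist, or None if it is absent."""
    result = None
    for idx, other_item in enumerate(alist):
        if other_item == item:
            result = idx
            break
    return result

words = ['her', 'name', 'is', 'rio']
position = find_item('her', words)
missing = find_item('duran', words)
```

Because a valid index can be 0, which is falsy, callers should compare the result with None explicitly rather than test its truth value.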

The outer scope

Novice Python programmers often like to put everything in the so-called outer scope: the part of a Python file that is not contained in a code block such as a function or class. The outer scope is equivalent to the global namespace; for this discussion, assume that the contents of the global scope are accessible anywhere within a single Python file.

The outer scope is very useful for defining, at the top of a file, constants that the whole module needs to access. It is advisable to give any variable in the outer scope a distinctive name, for example an IN_ALL_CAPS constant name. That makes bugs like the following less likely:

import sys

# See the bug in the function declaration?
def print_file(filenam):
  """Print every line of a file."""
  with open(filename) as input_file:
    for line in input_file:
      print line.strip()

if __name__ == "__main__":
  filename = sys.argv[1]
  print_file(filename)

If you look closely, you will see that print_file is defined with a parameter named filenam, but the function body refers to filename. Yet this program still runs fine. Why? When print_file does not find a local variable named filename, it looks next in the global scope. Because the code that calls print_file is in the outer scope (even though it is indented), the filename assigned there is visible inside print_file.

So how do you avoid such a mistake? First, do not set any values in the outer scope other than IN_ALL_CAPS global constants [3]. Argument parsing is best delegated to a main function, so that the function's internal variables do not live in the outer scope.
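A minimal sketch of that advice follows, assuming a hypothetical main function and a helper named read_stripped_lines that keeps the same parameter typo. With filename local to main, the typo surfaces as a NameError instead of silently working.

```python
def read_stripped_lines(filenam):  # same typo as in the example above
    """Return every line of a file with surrounding whitespace removed."""
    with open(filename) as input_file:  # NameError: no global 'filename' exists
        return [line.strip() for line in input_file]

def main(argv):
    filename = argv[1]  # local to main, invisible to other functions
    return read_stripped_lines(filename)

# In a script, call main(sys.argv) under the usual
# if __name__ == "__main__": guard.
```

The bug is still there, but now the first run exposes it, and fixing the parameter name makes the function work.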

This also brings up the global keyword. If you only read the value of a global variable, you do not need the global keyword. You only need it when you want to change which object a global variable name refers to. There is further discussion of the global keyword on Stack Overflow.
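The difference between reading and rebinding a global can be sketched as follows. All names here (call_count, GREETING, and the three functions) are hypothetical and exist only to illustrate the rule.

```python
call_count = 0      # module-level name that one function rebinds
GREETING = "hello"  # module-level constant that is only read

def read_only():
    # Reading a global needs no declaration.
    return GREETING

def increment():
    # Rebinding the global name requires the global keyword.
    global call_count
    call_count += 1

def shadow():
    # Without global, assignment creates a new local and leaves the global alone.
    GREETING = "goodbye"
    return GREETING

increment()
increment()
```

After the two calls, call_count is 2, while GREETING is still "hello" even if shadow() is called.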

Code style

Hail to PEP 8

PEP 8 is the common style guide for Python code. Keep it in mind and follow it as much as possible, although some people have good reasons to disagree with some of its finer points, such as indentation width or the use of blank lines. If you don't follow PEP 8, you should have a better reason than "I just don't like that style". The guidelines below are all taken from PEP 8 and seem to be the ones programmers most often need reminding of.

Test for empty

If you want to check whether a container (for example a list, dict, or set) is empty, simply test it directly instead of checking something like len(x) > 0:

numbers = [-1, -2, -3]
# This will be empty
positive_numbers = [num for num in numbers if num > 0]
if positive_numbers:
  # Do something awesome

If you want to store whether positive_numbers is non-empty somewhere else, save bool(positive_numbers) as the result; bool is what an if statement uses to decide the truth of its condition.
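A short sketch of that equivalence, using the same numbers as above; the extra variable names are made up for illustration.

```python
numbers = [-1, -2, -3]
positive_numbers = [num for num in numbers if num > 0]

# Empty containers are falsy, so bool() gives the same answer as len(x) > 0.
has_positives = bool(positive_numbers)
same_as_len_check = (len(positive_numbers) > 0)

# The same holds for other empty built-in containers and the empty string.
empty_values_are_falsy = not any([bool([]), bool({}), bool(set()), bool('')])
```

Storing has_positives keeps the yes/no answer without keeping the list itself.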

Test for None

As mentioned above, None makes a good sentinel value. So how do you check for it?

If you explicitly want to test for None, and not just for anything that evaluates to false (such as an empty container or 0), use:

if x is not None:
  # Do something with x

If you use None as a sentinel, this is also the pattern Python style expects, for example when you need to distinguish between None and 0.

If you are just testing whether a variable holds a useful value, a plain if test is usually sufficient:

if x:
  # Do something with x

For example, if x is expected to be a container, but could be None because of another function's return value, this test handles both cases at once. Be careful, though: if you later change what is passed as x so that True or 0.0 become useful values, the program may not behave the way you want.
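The difference between the two tests shows up as soon as a falsy but meaningful value appears. The helper describe below is hypothetical; it simply records what each test would decide for a given value.

```python
def describe(x):
    """Hypothetical helper contrasting the two tests."""
    truthy = bool(x)           # what 'if x:' checks
    present = x is not None    # what 'if x is not None:' checks
    return truthy, present

results = {
    'None': describe(None),
    'empty list': describe([]),
    'zero': describe(0.0),
    'non-empty list': describe([1]),
}
```

For None and a non-empty list the two tests agree; for an empty list and 0.0 only the explicit None test treats the value as present.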

Translator's notes:

[1] In Python 2.x, range returns a list and xrange returns a lazily evaluated xrange object. Python 3.x removes xrange; range itself returns a lazy range object, and you use the list factory function to build a list explicitly;
[2] string.find(str) returns the index at which str starts in string, and returns -1 if str is not found;
[3] That is, do not assign values in the outer scope to names that functions also use as local variable names. Otherwise, a mistake in a local variable name inside a function can silently pick up the outer-scope variable of the same name.
