A beginner's experience with Python 2.7 programming
If you have never used Python, I strongly recommend reading an introduction to Python first, because you need to know the basic syntax and types.
Package Management
One of the best parts of the Python world is the large number of third-party packages, and managing them is easy. By convention, the packages a project requires are listed in a requirements.txt file, one package per line, usually with a version number. Here is an example from this blog, which uses Pelican:
pelican==3.3
Markdown
pelican-extended-sitemap==1.0.0
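To make the file format concrete, here is a minimal sketch that splits each line into a package name and an optional pinned version. The helper name parse_requirements is hypothetical, and it only understands bare names and name==version lines; pip's own parser handles far more.

```python
# Illustrative only: pip has its own, far more complete requirements parser.
REQUIREMENTS = """\
pelican==3.3
Markdown
pelican-extended-sitemap==1.0.0
"""


def parse_requirements(text):
    """Return a list of (name, version) pairs; version is None if unpinned."""
    parsed = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, version = line.partition("==")
        parsed.append((name.strip(), version.strip() if sep else None))
    return parsed


requirements = parse_requirements(REQUIREMENTS)
```

In practice you rarely parse this file yourself; pip install -r requirements.txt installs everything it lists.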
One drawback of Python packages is that they are installed globally by default. We will use a tool called virtualenv that gives each project its own isolated environment. We will also install pip, a more capable package manager that works well with virtualenv.
First, install pip. Most Python installations ship with easy_install (Python's default package management tool), so we run easy_install pip to install pip. This should be the last time you use easy_install. If easy_install is missing, on Linux you can get it from the python-setuptools package.
If you are using Python 3.3 or later, the standard library already includes the venv module, which provides similar functionality to virtualenv, so you may not need to install virtualenv separately.
Next, install virtualenv and virtualenvwrapper. virtualenv lets you create an isolated environment for each project, which is especially useful when different projects use different versions of the same package. virtualenvwrapper provides some handy scripts that make this easier.
sudo pip install virtualenvwrapper
virtualenvwrapper lists virtualenv as a dependency, so installing it also installs virtualenv automatically.
Open a new shell and run mkvirtualenv test to create and activate a virtualenv named test. If you open another shell, you will not be in this virtualenv; activate it with workon test. When you are done working, run deactivate to leave it.
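If you are ever unsure whether a virtualenv is active, you can ask the interpreter itself. The following is a best-effort sketch; in_virtualenv is a hypothetical helper name, and it relies on the attributes that virtualenv and venv set on the sys module.

```python
import sys


def in_virtualenv():
    """Best-effort check for an active virtual environment."""
    # Classic virtualenv sets sys.real_prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases set sys.base_prefix instead;
    # it differs from sys.prefix only inside an environment.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


active = in_virtualenv()
```

Run it once after workon test and once after deactivate to see the value change.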
IPython
IPython is a replacement for the standard Python interactive shell. It supports auto-completion, quick access to documentation, and many other features that really should be in the standard shell.
Inside a virtual environment, you can install it with pip install ipython and start it by running ipython on the command line.
Another nice feature is the Notebook, which requires additional components. Once they are installed, you can run ipython notebook to get a nice web UI in which you can create notebooks. This is very popular in scientific computing.
Testing
I recommend nose or py.test. I use nose in most cases. The two are broadly similar, so I will go into some detail about nose only.
Here is an example of testing with nose. Every function whose name starts with test_, in a file whose name starts with test_, will be called:
def test_equality():
    assert True == False
As expected, when we run nose, our test fails.
(test)jhaddad@jons-mac-pro ~VIRTUAL_ENV/src$ nosetests
F
======================================================================
FAIL: test_nose_example.test_equality
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/jhaddad/.virtualenvs/test/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/Users/jhaddad/.virtualenvs/test/src/test_nose_example.py", line 3, in test_equality
    assert True == False
AssertionError

----------------------------------------------------------------------
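To make the naming rule concrete, here is a minimal sketch of prefix-based test collection. It is not how nose is implemented; run_prefixed_tests and build_sample_namespace are hypothetical names, and the dictionary stands in for the contents of a file such as test_nose_example.py.

```python
def run_prefixed_tests(namespace, prefix="test_"):
    """Call every function whose name starts with prefix; collect results."""
    passed, failed = [], []
    for name in sorted(namespace):
        func = namespace[name]
        if not (name.startswith(prefix) and callable(func)):
            continue  # not a test: wrong name or not callable
        try:
            func()
        except AssertionError:
            failed.append(name)
        else:
            passed.append(name)
    return passed, failed


def build_sample_namespace():
    """Stand-in for the functions defined in a test file."""
    def test_equality():
        assert True == False  # fails on purpose

    def test_addition():
        assert 1 + 1 == 2

    def helper():
        raise RuntimeError("never called: name lacks the test_ prefix")

    return {
        "test_equality": test_equality,
        "test_addition": test_addition,
        "helper": helper,
    }


passed, failed = run_prefixed_tests(build_sample_namespace())
```

A real runner such as nose also walks directories, imports modules, and formats tracebacks; the sketch only shows why the function name matters.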
nose.tools also provides some convenient helper functions:
from nose.tools import assert_true

def test_equality():
    assert_true(False)
If you prefer a JUnit-style approach, that works too:
from nose.tools import assert_true
from unittest import TestCase

class ExampleTest(TestCase):

    def setUp(self):  # setUp & tearDown are both available
        self.blah = False

    def test_blah(self):
        self.assertTrue(self.blah)
Run the tests:
(test)jhaddad@jons-mac-pro ~VIRTUAL_ENV/src$ nosetests
F
======================================================================
FAIL: test_blah (test_nose_example.ExampleTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/jhaddad/.virtualenvs/test/src/test_nose_example.py", line 11, in test_blah
    self.assertTrue(self.blah)
AssertionError: False is not true

----------------------------------------------------------------------
Ran 1 test in 0.003s

FAILED (failures=1)
The excellent mock library is included in Python 3 (as unittest.mock, since Python 3.3); if you are using Python 2, you can install it from PyPI. The following test makes a remote call that takes 10 seconds. The example is obviously contrived. We will use mock to return sample data instead of making the real call.
import mock

from mock import patch
from time import sleep

class Sweetness(object):
    def slow_remote_call(self):
        sleep(10)
        return "some_data"  # lets pretend we get this back from our remote api call

def test_long_call():
    s = Sweetness()
    result = s.slow_remote_call()
    assert result == "some_data"
Of course, our test takes a long time.
(test)jhaddad@jons-mac-pro ~VIRTUAL_ENV/src$ nosetests test_mock.py
Ran 1 test in 10.001s

OK
That is far too slow! So we ask ourselves: what are we testing? Do we need to test whether the remote call works, or what we do with the data once we have it? In most cases it is the latter. Let's get rid of this silly remote call:
import mock

from mock import patch
from time import sleep

class Sweetness(object):
    def slow_remote_call(self):
        sleep(10)
        return "some_data"  # lets pretend we get this back from our remote api call

def test_long_call():
    s = Sweetness()
    with patch.object(s, "slow_remote_call", return_value="some_data"):
        result = s.slow_remote_call()
    assert result == "some_data"
Okay, let's try again:
(test)jhaddad@jons-mac-pro ~VIRTUAL_ENV/src$ nosetests test_mock.py
.
----------------------------------------------------------------------
Ran 1 test in 0.001s

OK
Much better. Keep in mind that this example is greatly simplified. Personally, I only mock out calls to remote systems, not calls to my database.
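A slightly more realistic sketch tests the code that consumes the data rather than the mocked call itself. The shout method is hypothetical, added only to give the test something to exercise, and the import fallback assumes either Python 3 or the mock package from PyPI.

```python
try:
    from unittest import mock  # Python 3.3+
except ImportError:
    import mock  # Python 2: pip install mock


class Sweetness(object):
    def slow_remote_call(self):
        # Stand-in for the slow call; it must never run in this sketch.
        raise RuntimeError("the real remote call was not mocked")

    def shout(self):
        # Hypothetical method: the logic we actually want to test.
        return self.slow_remote_call().upper()


def run_patched():
    s = Sweetness()
    with mock.patch.object(s, "slow_remote_call",
                           return_value="some_data") as fake:
        result = s.shout()
        # The mock records its calls, so we can check how it was used.
        fake.assert_called_once_with()
    return result


result = run_patched()
```

Because the patch is applied with a with block, the original method is restored as soon as the block exits.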
nose-progressive is a nice module that improves nose's output so that errors are shown as they occur rather than held until the end. This is helpful if your tests take a while to run.
Run pip install nose-progressive and add --with-progressive to your nosetests command.
Debugging
ipdb is an excellent tool; I have used it to track down some very strange bugs. Install it with pip install ipdb, then put import ipdb; ipdb.set_trace() in your code. When execution reaches that line you get a nice interactive prompt, where you can step through the program one line at a time and inspect variables.
Python has a built-in trace module that has helped me figure out what is going on. Here is a trivial Python program:
a = 1
b = 2
a = b
Here is the trace output for this program:
(test)jhaddad@jons-mac-pro ~VIRTUAL_ENV/src$ python -m trace --trace tracing.py 1 ?
 --- modulename: tracing, funcname: <module>
tracing.py(1): a = 1
tracing.py(2): b = 2
tracing.py(3): a = b
 --- modulename: trace, funcname: _unsettrace
trace.py(80):         sys.settrace(None)
This feature is useful when you want to understand the internals of other programs. If you have used strace before, trace works in a similar way.
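The trace module can also be driven from code instead of the command line. The following is a small sketch under the assumption that you want line counts rather than printed output; swap_demo is a hypothetical function wrapping the same three statements.

```python
import trace


def swap_demo():
    a = 1
    b = 2
    a = b
    return a


# count=True records how often each line runs; trace=False keeps stdout quiet.
tracer = trace.Trace(count=True, trace=False)
result = tracer.runfunc(swap_demo)

# counts maps (filename, line number) to the number of times that line ran.
counts = tracer.results().counts
```

Passing trace=True instead prints each line as it executes, much like python -m trace --trace.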
In some cases, I use pycallgraph to track down performance issues. It can generate a graph showing how much time each function takes and how many times it is called.
Finally, objgraph is very useful for finding memory leaks. Here is a good article on how to use it to find memory leaks.
Gevent
Gevent is a great library that wraps greenlets to give Python asynchronous calls. My favorite feature is Pool, which abstracts away the asynchronous plumbing and gives us something simple to use: an asynchronous map() function:
from gevent import monkey
monkey.patch_all()

from time import sleep, time

def fetch_url(url):
    print "Fetching %s" % url
    sleep(10)
    print "Done fetching %s" % url

from gevent.pool import Pool

urls = ["http://test.com", "http://bacon.com", "http://eggs.com"]

p = Pool(10)

start = time()
p.map(fetch_url, urls)
print time() - start
It is important to note the gevent monkey patch at the top of the code. Without it, this will not run correctly. If we asked plain Python to call fetch_url three times in a row, we would expect it to take 30 seconds. With gevent:
(test)jhaddad@jons-mac-pro ~VIRTUAL_ENV/src$ python g.py
Fetching http://test.com
Fetching http://bacon.com
Fetching http://eggs.com
Done fetching http://test.com
Done fetching http://bacon.com
Done fetching http://eggs.com
10.001791954
This is very useful if you make many database calls or fetch data from remote URLs. I am not a big fan of callbacks, so this abstraction works very well for me.
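If gevent is not installed, the same pool-and-map shape can be tried with the standard library. Note that this is not gevent: it is a rough analogue using threads rather than greenlets, via concurrent.futures (built into Python 3; on Python 2.7 it is available as the futures backport on PyPI). The sleep is shortened to 0.3 seconds so the sketch runs quickly.

```python
from concurrent.futures import ThreadPoolExecutor
from time import sleep, time


def fetch_url(url):
    sleep(0.3)  # stand-in for a slow network request
    return "Done fetching %s" % url


urls = ["http://test.com", "http://bacon.com", "http://eggs.com"]

start = time()
with ThreadPoolExecutor(max_workers=10) as pool:
    # map() runs the calls concurrently and yields results in input order.
    results = list(pool.map(fetch_url, urls))
elapsed = time() - start
```

The three calls overlap, so the total time is close to one sleep rather than three, which is the same effect the gevent Pool gives without needing monkey patching.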
Conclusion
Well, if you have read this far, you have probably learned something new. These tools have had a significant impact on my work over the past year. It took a lot of time to find them, so I hope this article reduces the effort others need to make good use of this language.