When I started learning Python, there were some things I wish I had known early on. It took me a long time to learn them, so I have compiled them into this article. The target audience is an experienced programmer who has just begun learning Python and wants to skip the months of research needed to find the Python equivalents of the tools they already use. The sections on package management and standard tools are also helpful for beginners.
My experience is based primarily on Python 2.7, but most of the tools are valid for any version.
If you have never used Python, I highly recommend reading an introduction to the language first, because you need to know the basic syntax and types.
Package Management
One of the best things about the Python world is the large number of third-party packages. Managing these packages is also very easy. By convention, the packages a project requires are listed in a requirements.txt file. Each package occupies a single line and usually includes a version number. Here is an example from this blog, which uses Pelican:
pelican==3.3
markdown
pelican-extended-sitemap==1.0.0
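To make the file format concrete, here is a minimal Python 3 sketch that splits requirement lines into a name and an optional pinned version. The helper name parse_requirements is hypothetical, and the sketch only handles the plain "name" and "name==version" forms used in the example, not the full syntax that pip accepts.

```python
def parse_requirements(text):
    """Return (name, version) pairs; version is None when the line is unpinned."""
    pairs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, version = line.partition("==")
        pairs.append((name.strip(), version.strip() if sep else None))
    return pairs


example = """\
pelican==3.3
markdown
pelican-extended-sitemap==1.0.0
"""

requirements = parse_requirements(example)
for name, version in requirements:
    print(name, version or "(any version)")
```

In practice you never parse the file yourself; pip install -r requirements.txt reads it directly.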
One flaw with Python packages is that they are installed globally by default. We are going to use a tool called virtualenv, which lets us keep a separate environment for each of our projects. We also have to install a more advanced package management tool called pip, which works well with virtualenv.
First, we need to install pip. Most Python installers ship with easy_install (Python's default package management tool), so we use easy_install pip to install pip. This should be the last time you use easy_install. If you do not have easy_install, on Linux systems it seems to be available from the python-setuptools package.
If you are using Python 3.3 or later, the standard library already includes the venv module, which creates the same kind of isolated environment as virtualenv, so basic environments need no extra install.
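As a rough illustration of the standard-library route on Python 3.3 or later, the sketch below creates a throwaway environment with the venv module. The directory name demo-env is an arbitrary choice for this example, and with_pip=False is used only to keep the sketch fast and offline.

```python
import os
import tempfile
import venv

# Create a throwaway environment inside a fresh temporary directory.
# with_pip=False skips bootstrapping pip; pass True for a usable project env.
env_dir = os.path.join(tempfile.mkdtemp(), "demo-env")
venv.create(env_dir, with_pip=False)

# Every environment gets a pyvenv.cfg file that records the base interpreter.
config_path = os.path.join(env_dir, "pyvenv.cfg")
print(os.path.isfile(config_path))
```

From a shell, the equivalent is python -m venv followed by the target directory.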
Next, you want to install virtualenv and virtualenvwrapper. virtualenv lets you create an independent environment for each project. This is especially useful when different projects use different versions of the same package. virtualenvwrapper provides some handy scripts that make things easier.
sudo pip install virtualenvwrapper
virtualenvwrapper lists virtualenv as a dependency, so virtualenv is installed automatically along with it.
Open a new shell and enter mkvirtualenv test. This creates a virtualenv named test and activates it. If you open another shell, you are not in this virtualenv; you can enter it with workon test. When your work is done, use deactivate to leave it.
IPython
IPython is a replacement for the standard Python interactive shell. It supports auto-completion, quick access to documentation, and many other features that the standard shell should have had all along.
When you are in a virtual environment, you can simply use pip install ipython to install it, and run ipython on the command line to start it.
Another good feature is the "notebook", which requires additional components. After they are installed, you can run ipython notebook and get a nice web UI where you can create notebooks. This is very popular in the field of scientific computing.
Testing
I recommend nose or py.test. I use nose most of the time. They are basically similar. I'll explain some of the details of nose.
Here is a ridiculously contrived example of a test written for nose. All functions whose names begin with test_, in a file whose name begins with test_, will be called:
def test_equality():
    assert True == False
As expected, our test does not pass when we run nose.
(test) ~virtual_env/src$ nosetests
F
======================================================================
FAIL: test_nose_example.test_equality
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/jhaddad/.virtualenvs/test/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/Users/jhaddad/.virtualenvs/test/src/test_nose_example.py", line 3, in test_equality
    assert True == False
AssertionError
----------------------------------------------------------------------
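The naming convention can be sketched in a few lines of Python 3. This is only an illustration of the idea of collecting callables by the test_ prefix, not how nose is actually implemented; example_module and run_tests are hypothetical names, and the test functions are nested inside a function purely so that the sketch is self-contained.

```python
def example_module():
    """Stand-in for the contents of a test_*.py file."""
    def test_addition():
        assert 1 + 1 == 2

    def test_equality():
        assert True == False  # fails on purpose, like the example above

    def helper():
        raise RuntimeError("not a test, so the runner never calls it")

    return locals()


def run_tests(namespace):
    """Call every callable whose name starts with test_; return outcomes by name."""
    outcomes = {}
    for name, obj in sorted(namespace.items()):
        if name.startswith("test_") and callable(obj):
            try:
                obj()
                outcomes[name] = "ok"
            except AssertionError:
                outcomes[name] = "FAIL"
    return outcomes


outcomes = run_tests(example_module())
print(outcomes)
```

Note that helper is skipped entirely because its name lacks the prefix, which is why support code can live in the same file as the tests.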
There are also some convenient helper functions in nose.tools:
from nose.tools import assert_true

def test_equality():
    assert_true(False)
If you want to use a more JUnit-like approach, that is also possible:
from nose.tools import assert_true
from unittest import TestCase

class ExampleTest(TestCase):

    def setUp(self):  # setUp & tearDown are both available
        self.blah = False

    def test_blah(self):
        self.assertTrue(self.blah)
Running the tests:
(test) ~virtual_env/src$ nosetests
F
======================================================================
FAIL: test_blah (test_nose_example.ExampleTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/jhaddad/.virtualenvs/test/src/test_nose_example.py", line 11, in test_blah
    self.assertTrue(self.blah)
AssertionError: False is not true
----------------------------------------------------------------------
Ran 1 test in 0.003s

FAILED (failures=1)
The excellent mock library is included in Python 3 (as unittest.mock, since version 3.3), but if you are using Python 2, you can get it from PyPI. The test below makes a remote call that takes 10 seconds. The example is obviously contrived. We will use a mock to return sample data instead of actually making the call.
import mock

from mock import patch
from time import sleep

class Sweetness(object):
    def slow_remote_call(self):
        sleep(10)
        return "some_data"  # let's pretend we get this back from our remote API call

def test_long_call():
    s = Sweetness()
    result = s.slow_remote_call()
    assert result == "some_data"
Of course, our test takes a long time.
(test) ~virtual_env/src$ nosetests test_mock.py

Ran 1 test in 10.001s

OK
That is too slow! So we should ask ourselves: what are we testing? Do we need to test whether the remote call works, or do we want to test what we do once we get the data back? In most cases it is the latter. Let's get rid of this silly remote call:
import mock

from mock import patch
from time import sleep

class Sweetness(object):
    def slow_remote_call(self):
        sleep(10)
        return "some_data"  # let's pretend we get this back from our remote API call

def test_long_call():
    s = Sweetness()
    with patch.object(s, "slow_remote_call", return_value="some_data"):
        result = s.slow_remote_call()
    assert result == "some_data"
OK, let's try it again:
(test) ~virtual_env/src$ nosetests test_mock.py
.
----------------------------------------------------------------------
Ran 1 test in 0.001s

OK
Much better. Remember, this example is a ridiculous simplification. Personally, I only mock out calls to remote systems, not my database calls.
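For readers on Python 3, roughly the same test can be written against unittest.mock from the standard library. This is a sketch under that assumption: the function is named check_long_call (a hypothetical name) rather than test_long_call so it can be run as a plain script, and the timing check at the end is added only to show that the slow method is never actually executed.

```python
from time import sleep, time
from unittest.mock import patch


class Sweetness(object):
    def slow_remote_call(self):
        sleep(10)  # stands in for a slow network round trip
        return "some_data"


def check_long_call():
    s = Sweetness()
    # Replace the method on this one instance for the duration of the block.
    with patch.object(s, "slow_remote_call", return_value="some_data") as fake:
        result = s.slow_remote_call()
    assert result == "some_data"
    fake.assert_called_once_with()  # the mock records how it was called
    return result


start = time()
result = check_long_call()
elapsed = time() - start
print(result, elapsed < 1.0)
```

Once the with block exits, the original method is restored, so other tests that use the same object are unaffected.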
nose-progressive is a good module that improves nose's output so that errors are displayed as they occur, rather than being left until the end. If your tests take a while to run, this is a big help. Install it with
pip install nose-progressive
and add --with-progressive to your nosetests command.
Debugging
ipdb is an excellent tool; I have used it to track down a lot of strange bugs. Install it with pip install ipdb, then put import ipdb; ipdb.set_trace() in your code, and you will get a nice interactive prompt when your program reaches that line. From there you can step through the program one line at a time and inspect variables.
Python has a good tracing module built in (trace) that helps me figure out what is going on. Here's a trivial Python program that does no real work:
a = 1
b = 2
a = b
Here is the trace output for this program:
(test) ~virtual_env/src$ python -m trace --trace tracing.py
 --- modulename: tracing, funcname: <module>
tracing.py(1): a = 1
tracing.py(2): b = 2
tracing.py(3): a = b
 --- modulename: trace, funcname: _unsettrace
trace.py(+): sys.settrace(None)
This is useful when you want to figure out the internals of other programs. If you've used strace before, trace works in a very similar way.
On some occasions, I have used pycallgraph to track down performance issues. It can create a graph of the time spent in functions and the number of calls to them.
Finally, objgraph is useful for finding memory leaks. Here's a good article on how to use it to find memory leaks.
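The trace module can also be driven from code instead of the command line. The sketch below, written for Python 3, wraps the same three assignments in a function (swap_demo is a hypothetical name) and runs it under a tracer; the exact lines printed depend on how and where the script is run, so no output is shown here.

```python
import trace


def swap_demo():
    a = 1
    b = 2
    a = b
    return a


# count=False turns off coverage counting; trace=True prints lines as they run.
tracer = trace.Trace(count=False, trace=True)

# runfunc() runs the callable under the tracer and hands back its return value.
result = tracer.runfunc(swap_demo)
print("returned", result)
```

This form is handy when you only want to trace one suspicious function rather than a whole program.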
Gevent
gevent is a nice library that wraps greenlets, making asynchronous calls possible in Python. Yes, it is very good. My favorite feature is Pool, which abstracts away the asynchronous part and gives us something simple to use: an asynchronous map() function:
from gevent import monkey
monkey.patch_all()

from time import sleep, time

def fetch_url(url):
    print "fetching %s" % url
    sleep(10)
    print "done fetching %s" % url

from gevent.pool import Pool

urls = ["http://test.com", "http://bacon.com", "http://eggs.com"]

p = Pool(10)

start = time()
p.map(fetch_url, urls)
print time() - start
It is important to note the gevent monkey patch at the top of the code; without it, the example does not work correctly, because sleep would block instead of yielding to the other greenlets. If we had Python call fetch_url 3 times in a row, we would normally expect it to take 30 seconds. Using gevent:
(test) ~virtual_env/src$ python g.py
fetching http://test.com
fetching http://bacon.com
fetching http://eggs.com
done fetching http://test.com
done fetching http://bacon.com
done fetching http://eggs.com
10.001791954
This is useful if you make a lot of database calls or fetch from remote URLs. I don't really like callbacks, so this abstraction works well for me.
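If gevent is not installed, the shape of the pooled map() call can still be tried out with the standard library. The Python 3 sketch below swaps in a thread pool from concurrent.futures, so it uses operating-system threads rather than greenlets and is not gevent's API; the delay is shortened to half a second so it finishes quickly.

```python
from concurrent.futures import ThreadPoolExecutor
from time import sleep, time

DELAY = 0.5  # seconds; shortened from the 10 s used in the gevent example


def fetch_url(url):
    sleep(DELAY)  # pretend to wait on the network
    return "done fetching %s" % url


urls = ["http://test.com", "http://bacon.com", "http://eggs.com"]

start = time()
with ThreadPoolExecutor(max_workers=10) as pool:
    # map() returns results in input order even though the calls overlap.
    results = list(pool.map(fetch_url, urls))
elapsed = time() - start

print(results)
print("took less than three sequential delays:", elapsed < 3 * DELAY)
```

As with the gevent version, the three waits overlap, so the total time is close to one delay rather than the sum of all three.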
Conclusion
Well, if you have made it this far, you have probably learned something new. These tools have had a major impact on me over the past year. It took a lot of time to find them, so hopefully this article will reduce the effort other people need to make good use of the language.