Seven development libraries that Python developers should know

Last Update:2013-12-17 Source: Internet

Author: User


Over my years of Python programming and of exploring GitHub, I have found some very good Python libraries that greatly simplify development. This article recommends them to you.

Note that I have excluded libraries such as SQLAlchemy and Flask, because they are good enough that they need no further mention.

Here we go:

1. PyQuery (with lxml)

Installation Method: pip install pyquery

Beautiful Soup is the most frequently recommended way to parse HTML in Python, and it does the job well. It offers a Pythonic API, and documentation is easy to find online. However, when you need to parse a large number of documents in a short time, you run into performance problems: it is simple to use, but it is really slow.

Here is a performance comparison chart from 2008:

The chart shows that lxml performs very well, but it has little documentation and is clumsy to use. So do you choose a library that is easy to use but slow, or one that is fast but awkward?

Who says you have to choose? What we need is an XML/HTML parsing library that is both convenient and fast.

PyQuery meets both demands: ease of use and parsing speed.

Let's look at the following lines of code:

  from pyquery import PyQuery

  page = PyQuery(some_html)

  last_red_anchor = page('#container > a.red:last')

It's as easy as jQuery, but it's Python.

However, there is one shortcoming: when you iterate over a selection, you get raw elements and must wrap each one in PyQuery again:

  for paragraph in page('#container > p'):
      paragraph = PyQuery(paragraph)
      text = paragraph.text()

2. dateutil

Installation Method: pip install python-dateutil

Handling dates is painful; dateutil makes it much easier:

  >>> from dateutil.parser import parse

  >>> parse('Mon, 11 Jul 2011 10:01:56 +0200 (CEST)')
  datetime.datetime(2011, 7, 11, 10, 1, 56, tzinfo=tzlocal())

  # fuzzy ignores unknown tokens

  >>> s = """Today is 25 of September of 2003, exactly
  ...        at 10:49:41 with timezone -03:00."""
  >>> parse(s, fuzzy=True)
  datetime.datetime(2003, 9, 25, 10, 49, 41,
                    tzinfo=tzoffset(None, -10800))
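
For contrast, here is a minimal sketch of parsing the first timestamp with only the standard library. It assumes the input follows one fixed layout; the names FORMAT and parse_strict are made up for this illustration. Unlike dateutil, strptime needs an explicit format string and rejects the trailing "(CEST)" annotation.

```python
from datetime import datetime, timedelta

# Assumed fixed layout: weekday, day, month, year, time, numeric UTC offset.
FORMAT = '%a, %d %b %Y %H:%M:%S %z'


def parse_strict(text):
    """Parse text with the single hard-coded FORMAT (hypothetical helper)."""
    return datetime.strptime(text, FORMAT)


# Works only when the string matches FORMAT exactly.
stamp = parse_strict('Mon, 11 Jul 2011 10:01:56 +0200')
print(stamp.isoformat())

# The same timestamp with a trailing time zone name is rejected.
try:
    parse_strict('Mon, 11 Jul 2011 10:01:56 +0200 (CEST)')
    rejected = False
except ValueError as error:
    rejected = True
    print('strptime rejected the input:', error)
```

Every additional input layout would need its own format string, which is the chore that dateutil's parse takes over.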

3. fuzzywuzzy

Installation Method: pip install fuzzywuzzy

Fuzzywuzzy lets you perform fuzzy comparisons between strings. This is useful when you need to process human-generated data. The following code uses the Levenshtein distance to match a user's input against a list of possible options.

  from Levenshtein import distance

  countries = ['Canada', 'Antarctica', 'Togo', ...]

  def choose_least_distant(element, choices):
      """Return the one element of choices that is most similar to element."""
      return min(choices, key=lambda s: distance(element, s))

  user_input = 'canaderp'
  choose_least_distant(user_input, countries)  # 'Canada'

This is good, but it can be better:

  from fuzzywuzzy import process

  process.extractOne("canaderp", countries)  # ("Canada", 97)

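To make the Levenshtein distance used above concrete, here is a minimal pure-Python sketch. It is an illustration of the metric, not the implementation used by the Levenshtein package or by fuzzywuzzy; the function names levenshtein and closest are made up for this example, and lowercasing both strings before comparing is an added assumption.

```python
def levenshtein(a, b):
    """Count the single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def closest(element, choices):
    """Return the choice with the smallest case-insensitive distance."""
    return min(choices, key=lambda s: levenshtein(element.lower(), s.lower()))


countries = ['Canada', 'Antarctica', 'Togo']
best = closest('canaderp', countries)
print(best)
```

The sketch keeps only two rows of the distance table at a time, so memory use grows with the length of one string rather than with the product of both lengths.
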
4. watchdog

Installation Method: pip install watchdog

Watchdog is a Python API and shell utility used to monitor file system events.
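
To illustrate what a file system event is, here is a minimal sketch that uses only the standard library: it takes two snapshots of a directory and compares them. This polling approach is an assumption chosen for illustration and does not use watchdog's API; the function names snapshot and diff_snapshots are hypothetical.

```python
import os
import tempfile


def snapshot(directory):
    """Map each file name in directory to its modification time (ns)."""
    return {
        entry.name: entry.stat().st_mtime_ns
        for entry in os.scandir(directory)
        if entry.is_file()
    }


def diff_snapshots(before, after):
    """Return (created, deleted, modified) file names between snapshots."""
    created = sorted(after.keys() - before.keys())
    deleted = sorted(before.keys() - after.keys())
    modified = sorted(
        name for name in before.keys() & after.keys()
        if before[name] != after[name]
    )
    return created, deleted, modified


# Demonstration in a throwaway directory: creating a file shows up
# as a "created" event when the two snapshots are compared.
with tempfile.TemporaryDirectory() as tmp:
    before = snapshot(tmp)
    with open(os.path.join(tmp, 'notes.txt'), 'w') as handle:
        handle.write('hello')
    after = snapshot(tmp)
    events = diff_snapshots(before, after)

print(events)
```

A monitoring library spares you from writing and scheduling this kind of comparison yourself.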

5. sh

Installation Method: pip install sh

sh lets you call any program as if it were a function:

  from sh import git, ls, wc

  # check out the master branch
  git.checkout("master")

  # print the contents of this directory
  print(ls("-l"))

  # get the length of the longest line in this file
  longest_line = wc(__file__, "-L")
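
To make the "program as a function" idea concrete, here is a minimal standard-library sketch. The Program class is hypothetical and is not how sh is implemented; it only wraps subprocess.run so that an executable can be called with positional arguments and returns its standard output.

```python
import subprocess
import sys


class Program:
    """Wrap an executable so it can be called like a function
    (hypothetical illustration, not part of sh)."""

    def __init__(self, executable):
        self.executable = str(executable)

    def __call__(self, *args):
        # Run the program, raise CalledProcessError on a non-zero exit
        # status, and hand back whatever it wrote to standard output.
        result = subprocess.run(
            [self.executable] + [str(arg) for arg in args],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout


# The current Python interpreter is used as the example program so the
# sketch does not depend on which tools are installed.
python = Program(sys.executable)
output = python('-c', 'print(6 * 7)')
print(output)
```

Each call builds an argument list and starts a child process, so the function-call syntax is a convenience layer over ordinary process execution.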

6. pattern

Installation Method: pip install pattern

Pattern is a Python Web data mining module. It can be used for data mining, natural language processing, machine learning, and network analysis.

7. path.py

Installation Method: pip install path.py

When I started learning Python, os.path was my least favorite part of the stdlib. Even something as simple as building a list of the files in a directory was awkward:

  import os

  some_dir = '/some_dir'
  files = []

  for f in os.listdir(some_dir):
      files.append(os.path.join(some_dir, f))

Note that listdir lives in os rather than in os.path.

With path.py, working with file paths becomes simple:

  from path import path

  some_dir = path('/some_dir')

  files = some_dir.files()

Other usage:

  >>> path('/').owner
  'root'

  >>> path('a/b/c').splitall()
  [path(''), 'a', 'b', 'c']

  # overriding __div__
  >>> path('a') / 'b' / 'c'
  path('a/b/c')

  >>> path('ab/c').relpathto('ab/d/f')
  path('../d/f')
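
The / syntax shown above relies on operator overloading. The comment mentions __div__, which is the Python 2 hook; in Python 3 the corresponding method is __truediv__. Here is a minimal sketch of the idea, using a hypothetical SimplePath class that is not path.py's implementation:

```python
import posixpath


class SimplePath(str):
    """A string subclass whose / operator joins path components
    (hypothetical illustration)."""

    def __truediv__(self, other):
        # posixpath.join always uses forward slashes, so the result
        # is the same on every platform.
        return SimplePath(posixpath.join(self, other))

    def __repr__(self):
        return 'SimplePath(%s)' % str.__repr__(self)


joined = SimplePath('a') / 'b' / 'c'
print(repr(joined))
```

Because the class subclasses str, the result can still be passed to any function that expects an ordinary path string.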

Much better, isn't it?

This article is an English version of an article originally published in Chinese on aliyun.com.
