Eleven Python libraries you may not know
There are so many Python packages that no one person could possibly keep track of them all. PyPI alone lists more than 47,000 packages!
Recently, with so many data scientists switching to Python, I couldn't help but think that while they are getting the huge benefits of pandas, scikit-learn, and numpy, they are missing out on some older but equally helpful Python libraries.
In this article, I will introduce some lesser-known libraries. Even if you are already a Python expert, take a look. One or two of them may be new to you!
1)Delorean
Delorean is a really cool date/time library. It is one of the most natural date/time munging libraries I have used in Python, a bit like moment in JavaScript. The docs are good too: besides being technically helpful, they are full of Back to the Future references.
from delorean import Delorean
EST = "US/Eastern"
d = Delorean(timezone=EST)
2)Prettytable
You probably haven't heard of prettytable yet, because it is hosted on GoogleCode, which is basically the coding equivalent of Siberia.
Despite being exiled to Siberia, prettytable is still a powerful way to build attractive output for the terminal or the browser. So if you are creating a new plug-in for the IPython Notebook, consider prettytable for your HTML __repr__.
from prettytable import PrettyTable
table = PrettyTable(["animal", "ferocity"])
table.add_row(["wolverine", 100])
table.add_row(["grizzly", 87])
table.add_row(["Rabbit of Caerbannog", 110])
table.add_row(["cat", -1])
table.add_row(["platypus", 23])
table.add_row(["dolphin", 63])
table.add_row(["albatross", 44])
# sort by the "ferocity" column, highest value first
table.sortby = "ferocity"
table.reversesort = True
print(table)
+----------------------+----------+
|        animal        | ferocity |
+----------------------+----------+
| Rabbit of Caerbannog |   110    |
|      wolverine       |   100    |
|       grizzly        |    87    |
|       dolphin        |    63    |
|      albatross       |    44    |
|       platypus       |    23    |
|         cat          |    -1    |
+----------------------+----------+
3)Snowballstemmer
I first installed snowballstemmer because I thought the name was cool. But it is actually a very neat little library. Snowballstemmer works in 15 different languages and also comes with a Porter stemmer to boot.
from snowballstemmer import EnglishStemmer, SpanishStemmer
EnglishStemmer().stemWord("Gregory")
# Gregori
SpanishStemmer().stemWord("amarillo")
# amarill
4)Wget
Remember every time you wrote a web crawler for some specific purpose? It turns out somebody already built it, and it is called wget. Recursively download a website? Grab every image from every page? Sidestep cookie traces? wget is all you need.
Movie Mark Zuckerberg even praises it himself:
"First up is Kirkland. They keep everything open and allow indexes in their Apache configuration, so a little wget magic is enough to download the entire Kirkland facebook. Kid stuff!"
The Python version has almost every feature you need and is very easy to use.
import wget
wget.download("http://www.cnn.com/")
# 100% [.............................................................................] 280385/280385
Note that another option for Linux and OS X users is from sh import wget. However, the Python wget module does have better argument handling.
5)PyMC
I'm not sure how PyMC gets left out of the mix so often. Scikit-learn seems to be everyone's favorite (as it should be, it is really great), but in my opinion PyMC does not get the attention it deserves.
# PyMC 2.x API
from pymc.examples import disaster_model
from pymc import MCMC
M = MCMC(disaster_model)
M.sample(iter=10000, burn=1000, thin=10)
[-----------------100%-----------------] 10000 of 10000 complete in 1.4 sec
PyMC is a library for Bayesian analysis. It is featured heavily in "Bayesian Methods for Hackers" by Cam Davidson-Pilon and has made appearances on many popular data science/Python blogs, but it has never gained a following like scikit-learn's.
6)Sh
Sh lets you import shell commands into Python as functions. It is particularly useful for things that are simple in bash but that you may have forgotten how to do in Python, such as recursively searching for files.
from sh import find
find("/tmp")
/tmp/foo
/tmp/foo/file1.json
/tmp/foo/file2.json
/tmp/foo/file3.json
/tmp/foo/bar/file3.json
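For comparison, here is roughly what the same recursive search looks like in plain Python. This is only a sketch using the standard library's pathlib; the find function below is a hypothetical helper written for this illustration and has nothing to do with the sh package itself.

```python
from pathlib import Path


def find(root):
    """Return root plus every file and directory beneath it, like `find root`.

    Hypothetical helper for illustration; not part of the sh library.
    """
    root = Path(root)
    # rglob("*") walks the tree recursively; sorting keeps the order stable
    return [str(root)] + sorted(str(p) for p in root.rglob("*"))


if __name__ == "__main__":
    # List the current directory tree, one path per line
    for path in find("."):
        print(path)
```

It works, but the sh one-liner is clearly easier to remember.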
7)Fuzzywuzzy
Fuzzywuzzy is one of the simplest libraries I have ever used (if you have a few minutes, you can read through the source code). It is a fuzzy string matching library built by the fine people at SeatGeek.
Fuzzywuzzy implements string comparison ratios, token ratios, and many other matching metrics. It is especially useful for creating feature vectors or matching up records in different databases.
from fuzzywuzzy import fuzz
fuzz.ratio("Hit me with your best shot", "Hit me with your pet shark")
# 85
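To give a feel for what a string ratio and a token ratio measure, here is a rough sketch built on the standard library's difflib. It is an approximation for illustration only, not fuzzywuzzy's actual implementation, and its scores may differ from the library's. The names simple_ratio and token_sort_ratio are hypothetical helpers defined here.

```python
from difflib import SequenceMatcher


def simple_ratio(a, b):
    """Similarity of two strings as an integer from 0 to 100."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def token_sort_ratio(a, b):
    """Compare two strings after lowercasing and sorting their words,
    so that word order and case no longer matter."""
    def normalize(s):
        return " ".join(sorted(s.lower().split()))
    return simple_ratio(normalize(a), normalize(b))


if __name__ == "__main__":
    # Same words in a different order: the plain ratio is penalized,
    # while the token-based ratio treats them as identical.
    print(simple_ratio("New York Mets", "mets new york"))
    print(token_sort_ratio("New York Mets", "mets new york"))
```

The token-based variant is the one to reach for when matching records whose fields were typed in a different word order.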
8)Progressbar
You know those scripts where you print "still going ..." inside that giant mess of a for loop you call a __main__? Why not step up your game and use progressbar instead?
True to its name, progressbar does exactly what you think: it makes progress bars. Although this is not a data-science-specific activity, it does put a nice touch on those extra-long-running scripts.
Unfortunately, as another GoogleCode outcast, it does not get much attention (the docs use two spaces for indentation ... 2!). I hope you will show this hardworking, capable library some mercy.
from progressbar import ProgressBar
import time
# start() must be called before the first update()
pbar = ProgressBar(maxval=10).start()
for i in range(1, 11):
    pbar.update(i)
    time.sleep(1)
pbar.finish()
# 60% |#########################################################|
9)Colorama
Now that your logs have a nice progress bar, why not make them colorful too! Color can also alert you when a serious error occurs.
Colorama is super easy to use. Just drop it into your script and add it to any text you want to color.
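A minimal sketch of what that might look like is below. Fore, Style, and init are part of colorama; the colorize helper and the log messages are made up for this illustration, and the except branch is an assumption added so the sketch still runs where colorama is not installed (it falls back to the same raw ANSI escape codes).

```python
try:
    from colorama import Fore, Style, init
    init()  # makes ANSI color codes work on Windows consoles too
    RED, YELLOW, RESET = Fore.RED, Fore.YELLOW, Style.RESET_ALL
except ImportError:
    # Assumption for this sketch: fall back to the raw ANSI escape codes
    RED, YELLOW, RESET = "\033[31m", "\033[33m", "\033[0m"


def colorize(message, color):
    """Wrap a message in a color code and reset the style afterwards."""
    return f"{color}{message}{RESET}"


if __name__ == "__main__":
    print(colorize("WARNING: disk is almost full", YELLOW))
    print(colorize("ERROR: could not reach the database", RED))
```

Resetting the style after each message keeps the color from leaking into whatever is printed next.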
10)Uuid
In my mind, there are only a few tools you really need in programming: hashing, key/value stores, and universally unique ids (UUIDs). uuid is Python's built-in UUID library. It implements versions 1, 3, 4, and 5 of the UUID standard, which is very handy for anything that needs to guarantee uniqueness.
That might sound silly, but how often do you need to create records for a marketing campaign or an email drop and make sure everyone gets their own promo code or id number?
If you are worried about running out of ids, don't be! A version 4 UUID has 122 random bits, which gives roughly 5.3 x 10^36 possible values.
import uuid
print(uuid.uuid4())
# e7bafa3d-274e-4b0a-b9cc-d898957b4b61
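For the promo code scenario, a name-based UUID (version 5) can be handy because the same input always produces the same id. The sketch below is one possible approach, not a recommendation from the uuid documentation; the namespace name promo.example.com and the promo_code helper are made up for illustration.

```python
import uuid

# Hypothetical namespace for this illustration; any fixed UUID would do
PROMO_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "promo.example.com")


def promo_code(email):
    """Return a stable version 5 UUID string for an email address.

    The address is trimmed and lowercased first, so the same person
    always gets the same code.
    """
    return str(uuid.uuid5(PROMO_NAMESPACE, email.strip().lower()))


if __name__ == "__main__":
    # Both lines print the same code
    print(promo_code("Alice@example.com"))
    print(promo_code("alice@example.com "))
```

Use uuid4 when the id only needs to be unique, and uuid5 when you also want to be able to regenerate it from the original data.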
11)Bashplotlib
Finally, a shameless plug: bashplotlib is one of my own creations. It lets you draw histograms and scatter plots from standard input. So although you probably won't replace ggplot or matplotlib with it as your everyday plotting library, the novelty value is high. At the very least, you can use it to spice up your logs.
$ pip install bashplotlib
$ scatter --file data/texas.txt --pch x
I hope these Python libraries will be helpful for your development!
http://www.codeceo.com/article/11-python-libs-you-not-know.html