Python Libraries You Might Not Know


by Greg | January 20, 2015

There are tons of Python packages out there. So many that no one man or woman could possibly catch them all. PyPI alone has over 47,000 packages listed!

Recently, with so many data scientists making the switch to Python, I couldn't help but think that while they're getting some of the great benefits of pandas, scikit-learn, and NumPy, they're missing out on some older yet equally helpful Python libraries.

In this post, I'm going to highlight some lesser-known libraries. Even experienced Pythonistas should take a look; there might be one or two in there you've never seen!

1) delorean

Delorean is a really cool date/time library. Apart from having a sweet name, it's one of the more natural-feeling date/time munging libraries I've used in Python. It's sort of like moment in JavaScript, except I laugh every time I import it. The docs are also good and, in addition to being technically helpful, they make countless Back to the Future references.

from delorean import Delorean
EST = "US/Eastern"
# the current moment, expressed in the US/Eastern timezone
d = Delorean(timezone=EST)

2) prettytable

There's a chance you haven't heard of prettytable because it's listed on Google Code, which is basically the coding equivalent of Siberia.

Despite being exiled to a cold, snowy and desolate place, prettytable is great for constructing output that looks good in the terminal or in the browser. So if you're working on a new plug-in for the IPython Notebook, check out prettytable for your HTML __repr__.

from prettytable import PrettyTable
table = PrettyTable(["animal", "ferocity"])
table.add_row(["wolverine", 100])
table.add_row(["grizzly", 87])
table.add_row(["rabbit of caerbannog", 110])
table.add_row(["cat", -1])
table.add_row(["platypus", 23])
table.add_row(["dolphin", 63])
table.add_row(["albatross", 44])
# sort on the ferocity column, highest first
table.sortby = "ferocity"
table.reversesort = True
print(table)
+----------------------+----------+
|        animal        | ferocity |
+----------------------+----------+
| rabbit of caerbannog |   110    |
|      wolverine       |   100    |
|       grizzly        |    87    |
|       dolphin        |    63    |
|      albatross       |    44    |
|       platypus       |    23    |
|         cat          |    -1    |
+----------------------+----------+

3) snowballstemmer

OK, so the first time I installed snowballstemmer, it was because I thought the name was cool. But it's actually a pretty slick little library. snowballstemmer will stem words in several different languages and also comes with a Porter stemmer to boot.

from snowballstemmer import EnglishStemmer, SpanishStemmer
EnglishStemmer().stemWord("Gregory")
# Gregori
SpanishStemmer().stemWord("amarillo")
# amarill

4) wget

Remember every time you wrote that web crawler for some specific purpose? Turns out somebody built it... and it's called wget. Recursively download a website? Grab every image from a page? Sidestep cookie traces? Done, done, and done.

Movie Mark Zuckerberg even says it himself

First up is Kirkland. They keep everything open and allow indexes in their Apache configuration, so a little wget magic is enough to download the entire Kirkland facebook. Kid stuff!

The Python version comes with just about every feature you could ask for and is easy to use.

import wget
wget.download("http://www.cnn.com/")
# 100% [............................................................................] 280385 / 280385

Note that another option for Linux and OS X users would be to use: from sh import wget. However, the Python wget module does have better argument handling.

5) PyMC

I'm not sure how PyMC gets left out of the mix so often. scikit-learn seems to be everyone's darling (as it should be, it's fantastic), but in my opinion, not enough love is given to PyMC.

from pymc.examples import disaster_model
from pymc import MCMC
M = MCMC(disaster_model)
# draw 10000 samples, discard the first 1000, keep every 10th
M.sample(iter=10000, burn=1000, thin=10)
[-----------------100%-----------------] 10000 of 10000 complete in 1.4 sec

If you don't already know it, PyMC is a library for doing Bayesian analysis. It's featured heavily in Cam Davidson-Pilon's Bayesian Methods for Hackers and has made cameos on a lot of popular data science/Python blogs, but has never received a cult following akin to scikit-learn's.

6) sh

I can't risk you leaving this page and not knowing about sh. sh lets you import shell commands into Python as functions. It's super useful for doing things that are easy in bash but that you can't remember how to do in Python (e.g. recursively searching for files).

from sh import find
find("/tmp")
/tmp/foo
/tmp/foo/file1.json
/tmp/foo/file2.
/tmp/foo/file3.
/tmp/foo/bar/file3.json

7) fuzzywuzzy

Ranking in the top ten of simplest libraries I've ever used (if you have 2-3 minutes, you can read through the source), fuzzywuzzy is a fuzzy string matching library built by the fine people at SeatGeek.

fuzzywuzzy implements things like string comparison ratios, token ratios, and plenty of other matching metrics. It's great for creating feature vectors or matching up records in different databases.

from fuzzywuzzy import fuzz
fuzz.ratio("Hit me with your best shot", "Hit me with your pet shark")
# 85
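To get a feel for what a comparison ratio measures, and how it helps match up records, here is a rough stand-in built on the standard library's difflib. This is not fuzzywuzzy's own code: simple_ratio, best_match, and the sample records are made up for illustration, and the scores may not agree exactly with fuzz.ratio.

```python
from difflib import SequenceMatcher


def simple_ratio(a, b):
    """Hypothetical helper: 0-100 similarity score for two strings."""
    # SequenceMatcher.ratio() is 2 * matched characters / total characters
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def best_match(query, choices):
    """Hypothetical helper: the choice most similar to query."""
    return max(choices, key=lambda choice: simple_ratio(query, choice))


# made-up records standing in for rows from two different databases
records = ["Seat Geek Inc.", "StubHub", "Ticketmaster"]
print(best_match("seatgeek", records))
```

The same scores can be used as columns in a feature vector, one per pair of fields being compared.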

8) progressbar

You know those scripts you have where you do a print "still going..." in that giant mess of a for loop you call your __main__? Yeah, well, instead of doing that, why don't you step up your game and start using progressbar?

progressbar does pretty much exactly what you think it does... makes progress bars. And while this isn't exactly a data science specific activity, it does put a nice touch on those extra-long-running scripts.

Alas, as another Google Code outcast, it's not getting much love (the docs have 2 spaces for indents... 2!!!). Do what's right and give it a good ole pip install.

from progressbar import ProgressBar
import time
# start() must be called before the first update()
pbar = ProgressBar(maxval=10).start()
for i in range(1, 11):
    pbar.update(i)
    time.sleep(1)
pbar.finish()
# 60% |########################################################                                      |

9) colorama

So while you're giving your logs nice progress bars, why not also make them colorful! It can actually be helpful for reminding yourself when things are going horribly wrong.

colorama is super easy to use. Just pop it into your scripts and add it to any text you want to print in color.
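A minimal sketch of what that can look like, assuming colorama is installed. The colorize helper and the log messages are made up for illustration; Fore, Style, and init are colorama's own names.

```python
from colorama import Fore, Style, init

# on Windows this makes the ANSI color codes work in the console;
# elsewhere it is harmless
init()


def colorize(text, color=Fore.RED):
    """Hypothetical helper: wrap text in a color code and reset afterwards."""
    return color + text + Style.RESET_ALL


print(colorize("things are going horribly wrong", Fore.RED))
print(colorize("everything is fine", Fore.GREEN))
```

Resetting the style after each message keeps the color from bleeding into whatever gets printed next.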

10) uuid

I'm of the mind that there are really only a few tools one needs in programming: hashing, key/value stores, and universally unique IDs. uuid is the built-in Python UUID library. It implements versions 1, 3, 4, and 5 of the UUID standards and is really handy for doing things like... err... ensuring uniqueness.

That might sound silly, but how many times have you had records for a marketing campaign or an e-mail drop and wanted to make sure everyone gets their own promo code or ID number?

And if you're worried about running out of IDs, then fear not! The number of UUIDs you can generate is astronomically large: a version 4 UUID has 122 random bits, which works out to roughly 5.3 x 10^36 possible values.

import uuid
print(uuid.uuid4())
# e7bafa3d-274e-4b0a-b9cc-d898957b4b61
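For the promo code scenario, one possible approach is a version 5 UUID, which is derived from a name rather than from randomness, so the same e-mail address always maps to the same ID. This is only a sketch: promo_code_for, the campaign name, and the e-mail addresses are made up for illustration.

```python
import uuid


def promo_code_for(email, campaign="spring-sale"):
    """Hypothetical helper: a repeatable ID for one e-mail in one campaign."""
    # give each campaign its own namespace, then hash the address into it
    namespace = uuid.uuid5(uuid.NAMESPACE_DNS, campaign)
    return uuid.uuid5(namespace, email.strip().lower())


code = promo_code_for("marty@example.com")
print(code.version)
# the same address, however it was typed, gets the same code
print(code == promo_code_for(" Marty@Example.com "))
# a different address gets a different code
print(code == promo_code_for("doc@example.com"))
```

Use uuid4 when the ID only has to be unique, and uuid5 when you also need to be able to regenerate it from the record later.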

Well, if you were a uuid, you probably would be.

11) bashplotlib

Shameless self-promotion: bashplotlib is one of my creations. It lets you plot histograms and scatterplots using stdin. So while you might not find it replacing ggplot or matplotlib as your everyday plotting library, the novelty value is quite high. At the very least, use it as a way to spruce up your logs a bit.

$ pip install bashplotlib
$ scatter --file data/texas.txt --pch x
