Comprehensive Learning Path – Data Science in Python

Source: Internet
Author: User

http://blog.csdn.net/pipisorry/article/details/44245575

A good article on how to learn Python and use it for data science, data analysis, and machine learning.


Journey from a Python noob (novice) to a Kaggler on Python

So, you want to become a data scientist, or maybe you already are one and want to expand your tool repository. You have landed at the right place. The aim of this page is to provide a comprehensive learning path to people new to Python for data analysis. This path provides a comprehensive overview of the steps you need to learn to use Python for data analysis. If you already have some background, or don't need all the components, feel free to adapt your own path and let us know how you made changes in the path.

Step 0: Warming up

Before starting your journey, the first question to answer is:

Why use Python?

Or

How would Python be useful?

Watch the first minutes of this talk from Jeremy, founder of DataRobot, at PyCon Ukraine to get an idea of how useful Python could be.

Step 1: Setting up your machine

Now that you have made up your mind, it is time to set up your machine. The easiest way to proceed is to just download Anaconda from Continuum.io. It comes packaged with most of the things you will ever need. The major downside of taking this route is that you will need to wait for Continuum to update their packages, even when there might be an update available to the underlying libraries. If you are a starter, that should hardly matter.

If you face any challenges in installing, you can find more detailed instructions for various OS here.

Step 2: Learn the basics of the Python language

You should start by understanding the basics of the language, libraries and data structures. The Python track from Codecademy is one of the best places to start your journey. By the end of this course, you should be comfortable writing small scripts in Python, and also understand classes and objects.

Specifically learn: lists, tuples, dictionaries, list comprehensions, dictionary comprehensions.
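As a rough illustration of the structures listed above, here is a minimal sketch. The names and scores are hypothetical sample data invented for this example; they do not come from any of the courses mentioned.

```python
# Hypothetical sample data, invented purely for illustration.
scores = [("alice", 82), ("bob", 67), ("carol", 91)]  # a list of tuples

# List comprehension: names of everyone who scored at least 80.
high_scorers = [name for name, score in scores if score >= 80]

# Dictionary comprehension: map each name to its score.
score_by_name = {name: score for name, score in scores}

# Tuples unpack positionally; dictionaries give direct lookup by key.
first_name, first_score = scores[0]

print(high_scorers)
print(score_by_name["bob"])
print(first_name, first_score)
```

The comprehension forms are usually preferred over an explicit loop with append when the result is a straightforward filter or transformation of an existing sequence.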

Assignment: Solve the Python tutorial questions on HackerRank. These should get your brain thinking about Python scripting.

Alternate resources: If interactive coding isn't your style of learning, you can also look at the Google Class for Python. It is a 2-day class series and also covers some of the parts discussed later.

Step 3: Learn regular expressions in Python

You will need to use them a lot for data cleansing, especially if you are working on text data. The best way to learn regular expressions is to go through the Google class and keep this cheat sheet handy.
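To give a feel for regular expressions in data cleansing, here is a small sketch using the standard library `re` module. The input string, the helper function names, and the deliberately simplified email pattern are all assumptions made for illustration; real-world text usually needs more careful patterns.

```python
import re

# Hypothetical messy input, invented for illustration.
raw = "  Contact:   JOHN.DOE@Example.com ;  phone: (555) 123-4567  "


def normalize_whitespace(text):
    """Collapse runs of whitespace into one space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def extract_emails(text):
    """Find email-like tokens (simplified pattern) and lowercase them."""
    return [m.lower() for m in re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)]


def digits_only(text):
    """Remove every character that is not a digit."""
    return re.sub(r"\D", "", text)


clean = normalize_whitespace(raw)
emails = extract_emails(raw)
phone = digits_only("(555) 123-4567")

print(clean)
print(emails)
print(phone)
```

`re.sub` handles substitution-style cleaning, while `re.findall` is convenient for pulling every match out of a block of text.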

Assignment: Do the baby names exercise.

If you still need more practice, follow this tutorial for text cleaning. It will challenge you on various steps involved in data wrangling.

Step 4: Learn scientific libraries in Python – NumPy, SciPy, matplotlib and pandas

This is where the fun begins! Here is a brief introduction to various libraries. Let's start practicing some common operations.

  • Practice the NumPy tutorial thoroughly, especially NumPy arrays. This will form a good foundation for things to come.
  • Next, look at the SciPy tutorials. Go through the introduction and the basics, and do the remaining ones based on your needs.
  • If you guessed matplotlib tutorials next, you are wrong! They are too comprehensive for our needs here. Instead, look at this IPython notebook up to the line where animations begin.
  • Finally, let us look at pandas. pandas provides DataFrame functionality (like R) for Python. This is also where you should spend a good amount of time practicing. pandas will become the most effective tool for all mid-size data analysis. Start with a short introduction, 10 Minutes to pandas. Then move on to a more detailed tutorial on pandas.

You can also look at Exploratory Data Analysis with Pandas and Data Munging with Pandas.
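The kind of common operations worth practicing with NumPy arrays and pandas DataFrames might look like the sketch below. It assumes NumPy and pandas are installed (both ship with Anaconda); the heights, cities, and sales figures are hypothetical values invented for this example.

```python
import numpy as np
import pandas as pd

# NumPy: arithmetic applies to the whole array at once (vectorization).
heights_cm = np.array([150.0, 160.0, 170.0, 180.0])  # hypothetical data
heights_m = heights_cm / 100
mean_height = heights_cm.mean()

# pandas: a DataFrame is a labelled table of columns.
df = pd.DataFrame({
    "city": ["Pune", "Pune", "Delhi", "Delhi"],  # hypothetical data
    "sales": [10, 20, 30, 50],
})

# Boolean indexing keeps only the rows where the condition holds.
big_sales = df[df["sales"] > 15]

# Group rows by city and total the sales within each group.
total_by_city = df.groupby("city")["sales"].sum()

print(heights_m)
print(mean_height)
print(big_sales)
print(total_by_city)
```

Boolean indexing and `groupby` cover a large share of everyday exploratory analysis, so they are a reasonable place to start practicing.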

Additional resources:

    • If you need a book on pandas and NumPy, "Python for Data Analysis" by Wes McKinney.
    • There are a lot of tutorials as part of the pandas documentation. You can have a look at them here.

Assignment: Solve this assignment from the CS109 course from Harvard.

Step 5: Effective data visualization

Go through this lecture from CS109. You can ignore the initial 2 minutes, but what follows after that is awesome! Follow this lecture up with this assignment.

Step 6: Learn scikit-learn and machine learning

Now, we come to the meat of this entire process. scikit-learn is the most useful library in Python for machine learning. Here is a brief overview of the library. Go through the lectures from the CS109 course from Harvard. You will go through an overview of machine learning, supervised learning algorithms like regressions, decision trees, and ensemble modeling, and unsupervised learning algorithms like clustering. Follow individual lectures with the assignments from those lectures.
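For a first taste of the workflow, the sketch below fits one supervised model (a decision tree) and one unsupervised model (k-means clustering). It assumes a reasonably recent scikit-learn is installed. The choice of the bundled iris dataset, the 25% test split, and the three clusters are assumptions made for this example, not something taken from the CS109 lectures.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Small dataset bundled with scikit-learn: 150 flowers, 4 features each.
X, y = load_iris(return_X_y=True)

# Supervised learning: hold out a test set, fit on the rest, then score.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)  # fraction of correct test predictions

# Unsupervised learning: group the samples without using the labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)

print("test accuracy:", accuracy)
print("first ten cluster labels:", cluster_labels[:10])
```

Most scikit-learn estimators follow the same pattern of constructing the object, calling `fit`, and then calling `predict` or `score`, which makes it easy to swap one algorithm for another.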

Additional resources:

    • If there is one book you must read, it is Programming Collective Intelligence – a classic, but still one of the best books on the subject.
    • Additionally, you can also follow one of the best courses on machine learning, the course from Yaser Abu-Mostafa. If you need more explanation of the techniques, you can opt for the machine learning course from Andrew Ng and follow the exercises in Python.
    • Tutorials on scikit-learn

Assignment: Try out this challenge on Kaggle

Step 7: Practice, practice and practice

Congratulations, you made it!

You now have all the technical skills you need. It is a matter of practice, and what better place to practice than competing with fellow data scientists on Kaggle. Go, dive into one of the live competitions currently running on Kaggle and give all that you have learnt a try!

Step 8: Deep learning

Now that you have learnt most of the machine learning techniques, it is time to give deep learning a shot. There is a good chance that you already know what deep learning is, but if you still need a brief intro, here it is.

I am myself new to deep learning, so please take these suggestions with a pinch of salt. The most comprehensive resource is deeplearning.net. You will find everything here – lectures, datasets, challenges, tutorials. You can also give the course from Geoff Hinton a try in a bid to understand the basics of neural networks.

P.S. In case you need to use Big Data libraries, give Pydoop and PyMongo a try. They are not included here as the Big Data learning path is an entire topic in itself.

From: http://blog.csdn.net/pipisorry/article/details/44245575

ref: http://www.analyticsvidhya.com/learning-paths-data-science-business-analytics-business-intelligence-big-data/learning-path-data-science-python/
