If you have decided to use Python as your programming language, your next question will probably be: "What Python libraries are available for data analysis?"
NumPy
NumPy is the foundation for scientific computing in Python, and most of the higher-level tools are built on top of it. Here are some of the features it provides:
1. The N-dimensional array (ndarray), a fast and memory-efficient multi-dimensional array that supports vectorized mathematical operations.
2. Standard mathematical functions that operate on an entire array at once, so you usually do not need to write loops.
3. Easy data exchange with external libraries written in low-level languages such as C or C++, which can also return data as NumPy arrays.
NumPy does not provide advanced data analysis capabilities, but an understanding of NumPy arrays and array-oriented computing will help you use tools like pandas more effectively.
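As a rough illustration of the array features listed above, here is a minimal sketch. The array contents and variable names are made up for the example.

```python
import numpy as np

# A 2-D array with 3 rows and 4 columns, holding the values 0..11.
data = np.arange(12, dtype=np.float64).reshape(3, 4)

# Vectorized arithmetic is applied to every element; no explicit loop is needed.
scaled = data * 2.0 + 1.0

# A reduction along an axis: the mean of each column.
col_means = data.mean(axis=0)

# The same scaling written with plain Python loops, for comparison only.
looped = [[x * 2.0 + 1.0 for x in row] for row in data.tolist()]

print(scaled)
print(col_means)
```

The vectorized line and the nested list comprehension compute the same values, but the vectorized form is shorter and typically much faster on large arrays.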
SciPy
The SciPy library depends on NumPy, which provides convenient and fast N-dimensional array operations. SciPy is built to work with NumPy arrays and provides many user-friendly and efficient numerical routines, such as those for numerical integration and optimization. It includes modules for optimization, linear algebra, integration, and other common tasks in data science.
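To make the integration and optimization routines mentioned above more concrete, here is a small sketch. The function f is a hypothetical example chosen because its minimum is easy to check by hand.

```python
import numpy as np
from scipy import integrate, optimize

# Numerical integration: the integral of sin(x) from 0 to pi (exact value: 2).
area, abs_err = integrate.quad(np.sin, 0.0, np.pi)

# Scalar optimization: minimize the hypothetical function f(x) = (x - 3)^2 + 1,
# whose minimum is at x = 3 with value 1.
def f(x):
    return (x - 3.0) ** 2 + 1.0

result = optimize.minimize_scalar(f)

print(area, abs_err)
print(result.x, result.fun)
```

Both routines accept ordinary Python callables and work with NumPy functions such as np.sin directly.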
Pandas
Pandas contains high-level data structures and tools that make data analysis quick and easy. It is built on top of NumPy, which makes it easy to use in NumPy-centric applications.
1. Data structures with labeled axes that support automatic or explicit data alignment. This prevents common errors caused by misaligned data and makes it possible to work with data from different sources that is indexed differently.
2. Easier handling of missing data.
3. Merges and other relational operations found in popular databases, such as SQL-based ones.
Pandas is the best tool for data cleaning and preparation (data munging).
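The three points above (alignment, missing data, and database-style merges) can be sketched as follows. The labels, column names, and values are invented for illustration.

```python
import pandas as pd

# Two Series whose index labels only partly overlap.
s1 = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
s2 = pd.Series([10.0, 20.0, 30.0], index=["b", "c", "d"])

# Arithmetic aligns on labels; a label present in only one Series yields NaN.
total = s1 + s2

# Missing data can be filled with a default value or dropped.
filled = total.fillna(0.0)
dropped = total.dropna()

# A SQL-style inner join on a shared key column.
left = pd.DataFrame({"key": ["x", "y", "z"], "left_val": [1, 2, 3]})
right = pd.DataFrame({"key": ["y", "z", "w"], "right_val": [20, 30, 40]})
joined = pd.merge(left, right, on="key", how="inner")

print(total)
print(joined)
```

Only the labels "b" and "c" appear in both Series, so only those two entries of total hold numbers. The inner join likewise keeps only the keys "y" and "z".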
Matplotlib
Matplotlib is a plotting library for Python. It makes it easy to create line charts, pie charts, histograms, and other professional-quality figures. With Matplotlib, you can customize every aspect of a chart. When used in IPython, Matplotlib offers interactive features such as zooming and panning. It supports different GUI backends on all major operating systems and can export graphics to common vector and raster formats such as PDF, SVG, JPG, PNG, BMP, and GIF.
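Here is a minimal sketch of a line chart and a histogram exported to both a raster and a vector format. It assumes the non-interactive Agg backend so that it runs without a display, and it writes to in-memory buffers. Passing a file name to savefig would work the same way.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no GUI required
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 200)
rng = np.random.default_rng(0)
samples = rng.normal(size=1000)  # made-up data for the histogram

fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

# Line chart with customized labels and a legend.
ax_line.plot(x, np.sin(x), label="sin(x)")
ax_line.set_xlabel("x")
ax_line.set_ylabel("value")
ax_line.legend()

# Histogram of the random samples.
ax_hist.hist(samples, bins=30)
ax_hist.set_title("Histogram")

fig.tight_layout()

# Export the same figure as PNG (raster) and SVG (vector).
png_buf = io.BytesIO()
fig.savefig(png_buf, format="png")
svg_buf = io.BytesIO()
fig.savefig(svg_buf, format="svg")
plt.close(fig)
```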
Scikit-learn
Scikit-learn is a Python module for machine learning. It builds on SciPy and provides a set of common machine learning algorithms through a unified interface. Scikit-learn helps you quickly apply popular algorithms to your data sets.
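The unified interface can be illustrated with a short sketch in which two different classifiers are trained and scored through the same fit and score calls. The choice of the bundled iris data set and of these two models is an assumption made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small bundled data set and hold out 25% of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Two unrelated algorithms that expose the same estimator interface.
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                  # train on the training split
    scores[name] = model.score(X_test, y_test)   # accuracy on the test split
    print(name, scores[name])
```

Because every estimator follows the same pattern, swapping in another algorithm usually means changing only the line that constructs the model.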
Finally, a beginner-friendly data analysis book worth recommending is "Using Python for Data Analysis".
For beginners who want to learn data analysis, these five Python libraries are practically tailor-made.