Python Big Data and Machine Learning: A First Look at NumPy

Source: Internet
Author: User
Tags: jupyter

This article is the sixth in a series on Python for big data and machine learning. It introduces NumPy, a library you need to know before going further with either topic.

This series covers the following topics:

  • Using Python for big data and machine learning

  • Using Spark for big data analysis

  • Implementing machine learning algorithms

  • Processing numeric data with the NumPy library

  • Analyzing data with the pandas library

  • Plotting in Python with the Matplotlib library

  • Statistical plotting with the Seaborn library

  • Dynamic visualization with the Plotly library

  • Handling machine learning tasks with scikit-learn

  • K-means clustering

  • Logistic regression

  • Linear regression

  • Random forests and decision trees

  • Natural language processing and spam filtering

  • Neural networks

  • Support vector machines

In addition, the author will adapt the series and add other useful content in response to reader comments, for example related questions.

What is NumPy

NumPy is a core numerical computing library for Python, and most of the Python big data ecosystem is built on it. Because its core routines are implemented in C, it is very fast. If you want to learn big data with Python, NumPy is the library to learn first.

Installing NumPy

If you installed Anaconda by following the previous article, NumPy is already installed. If you want to install it separately, read on.

To install with conda:

conda install numpy

To install with pip:

pip install numpy
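To confirm that the installation worked, a quick check along these lines can be run in Python. It is only a sketch and assumes nothing beyond NumPy being importable; the variable name version is chosen here for illustration.

```python
import numpy as np

# If the import succeeds, NumPy is installed; the version string shows which release.
version = np.__version__
print(version)
```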

NumPy Array

This series of articles mainly works with NumPy arrays.

NumPy arrays come in two basic forms: vectors and matrices.

Vectors are one-dimensional, while matrices are two-dimensional.

Open Jupyter and enter the following:

import numpy as np

my_list = [1, 2, 3]  # example values

arr = np.array(my_list)

arr

Running this gives:

array([1, 2, 3])

This is the general form of a vector.

Continue to enter the following:

my_mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

np.array(my_mat)

Running this gives:

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

This is a two-dimensional array, that is, a matrix.
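The difference between a vector and a matrix can be checked directly through the ndim and shape attributes. The following is a minimal sketch; the variable names vector and matrix are chosen here for illustration.

```python
import numpy as np

vector = np.array([1, 2, 3])                           # built from a flat list
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])   # built from a nested list

print(vector.ndim, vector.shape)  # 1 (3,)
print(matrix.ndim, matrix.shape)  # 2 (3, 3)
```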

NumPy has its own version of Python's range function, called arange. Like range, it excludes the end value.

np.arange(0, 10)

The result is:

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

You can also specify a step:

np.arange(0, 10, 2)

The result is:

array([0, 2, 4, 6, 8])
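As a rough comparison with the built-in range, the sketch below shows that arange produces the same integer sequence and, unlike range, also accepts a fractional step. The names evens and halves are illustrative only.

```python
import numpy as np

evens = np.arange(0, 10, 2)
print(evens.tolist())           # [0, 2, 4, 6, 8]
print(list(range(0, 10, 2)))    # [0, 2, 4, 6, 8]

# Unlike range, arange also accepts a float step.
halves = np.arange(0, 2, 0.5)
print(halves.tolist())          # [0.0, 0.5, 1.0, 1.5]
```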

Generate a vector whose elements are all 0:

np.zeros(3)

The result is:

array([0., 0., 0.])

Generate a matrix whose elements are all 0:

np.zeros((5, 5))

The result is:

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

A vector and a matrix whose elements are all 1 are generated the same way:

np.ones(4)

np.ones((2, 3))

The results are:

array([1., 1., 1., 1.])

array([[1., 1., 1.],
       [1., 1., 1.]])
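The trailing dots in these outputs indicate floating-point values, which is the default for zeros and ones. The sketch below checks this and shows the dtype parameter, which is not used in the article but is a standard argument of both functions.

```python
import numpy as np

zeros_vec = np.zeros(3)                 # an int shape gives a 1-D array
zeros_mat = np.zeros((5, 5))            # a tuple shape gives a 2-D array
ones_int = np.ones((2, 3), dtype=int)   # dtype overrides the float default

print(zeros_vec.dtype)    # float64
print(zeros_mat.shape)    # (5, 5)
print(ones_int.tolist())  # [[1, 1, 1], [1, 1, 1]]
```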

np.linspace(0, 5, 20)

The first parameter is the start point, the second is the end point, and the third is the number of evenly spaced values to generate. Both endpoints are included.

The result is an array of 20 values running from 0 to 5 in steps of 5/19 (about 0.263).
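The endpoint behavior is the main difference from arange, so it is worth checking. A minimal sketch, with points and step as illustrative names:

```python
import numpy as np

points = np.linspace(0, 5, 20)

print(len(points))             # 20
print(points[0], points[-1])   # 0.0 5.0

# 20 points leave 19 gaps, so the spacing is 5 / 19.
step = points[1] - points[0]
print(step)
```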

np.eye(4) generates a 4x4 identity matrix, with 1 on the main diagonal and 0 everywhere else.

The result is:

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
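The identity matrix is useful because multiplying by it leaves another matrix unchanged. The sketch below illustrates this with a made-up 4x4 matrix m and the @ matrix-multiplication operator, neither of which appears in the article.

```python
import numpy as np

identity = np.eye(4)
m = np.arange(16).reshape(4, 4)   # hypothetical example matrix

product = identity @ m            # matrix multiplication
print(np.array_equal(product, m))  # True
```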

np.random.rand(5) generates a vector of 5 random numbers drawn uniformly from the interval [0, 1).

np.random.rand(5, 5) generates a 5x5 matrix of such random numbers.

np.random.randn(2) generates 2 random numbers drawn from the standard normal distribution.

np.random.randn(2, 2) generates a 2x2 matrix of numbers drawn from the standard normal distribution.

Because the values are random, the output differs on every run.
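Since the output changes on every run, it is easier to check properties of the samples than the values themselves. The sketch below fixes a seed with np.random.seed purely so the run is repeatable (an assumption, not something the article does) and uses a sample size of 10000 chosen for illustration.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed only to make the run repeatable

uniform = np.random.rand(5, 5)     # uniform samples in [0, 1)
normal = np.random.randn(10000)    # standard normal samples

print(uniform.shape)                            # (5, 5)
print(uniform.min() >= 0, uniform.max() < 1)    # True True
print(normal.mean(), normal.std())              # close to 0 and 1
```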

Tips:

In a Jupyter input cell, press Tab to bring up the autocomplete menu, and press Shift+Tab to display the usage notes for a function.

np.random.randint(1, 100) generates one random integer from 1 up to, but not including, 100.

np.random.randint(1, 100, 10) generates 10 random integers from 1 up to, but not including, 100.

As before, the output differs on every run.
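The exclusive upper bound can be verified without knowing the actual values. A minimal sketch, with single and batch as illustrative names:

```python
import numpy as np

single = np.random.randint(1, 100)       # one integer in [1, 100)
batch = np.random.randint(1, 100, 10)    # array of 10 such integers

print(batch.shape)                            # (10,)
print(batch.min() >= 1, batch.max() <= 99)    # True True
```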

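Some of the methods and attributes that arrays support:

The reshape method changes the dimensions of an array. It returns the reshaped array and leaves the original unchanged, and the new shape must hold the same number of elements (here 5 x 5 = 25). For example:

arr = np.arange(25)

arr.reshape(5, 5)

The result is:

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])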
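Two details of reshape are easy to miss: the original array keeps its shape, and a shape with the wrong number of elements raises an error. The sketch below shows both; the name grid and the failing shape (4, 6) are chosen here for illustration.

```python
import numpy as np

arr = np.arange(25)
grid = arr.reshape(5, 5)

print(arr.shape, grid.shape)   # (25,) (5, 5)
print(grid[1, 0])              # 5

try:
    arr.reshape(4, 6)          # 24 slots cannot hold 25 elements
except ValueError as err:
    print("reshape failed:", err)
```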

max method: returns the maximum value

min method: returns the minimum value

argmax method: returns the index of the maximum value

argmin method: returns the index of the minimum value

ranarr = np.random.randint(1, 100, 10)

ranarr.max()

ranarr.min()

ranarr.argmax()

ranarr.argmin()

The results depend on the random values in ranarr, so they differ on every run.
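With fixed data the four methods are easier to follow. The sketch below uses a small made-up array, values, instead of random numbers.

```python
import numpy as np

values = np.array([42, 7, 99, 13, 58])   # hypothetical fixed data

print(values.max())      # 99
print(values.min())      # 7
print(values.argmax())   # 2
print(values.argmin())   # 1
```

Indexing the array with the result of argmax gives back the maximum itself: values[values.argmax()] equals values.max().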

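The shape attribute returns the dimensions of the array. It is an attribute rather than a function, so it is written without parentheses.

The dtype attribute returns the data type of the array's elements.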
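Both attributes can be read as follows. This is a sketch only: the exact integer dtype of arange depends on the platform, so the example checks that it is some integer type rather than naming one, and the floats array is made up for illustration.

```python
import numpy as np

arr = np.arange(25)
grid = arr.reshape(5, 5)

print(arr.shape)    # (25,)
print(grid.shape)   # (5, 5)

# The exact integer width is platform dependent, so test the kind instead.
print(np.issubdtype(arr.dtype, np.integer))   # True

floats = np.array([1.5, 2.5])
print(floats.dtype)   # float64
```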

Simplifying calls:

from numpy.random import randint

Now randint can be used directly:

randint(2, 10)

This returns a random integer from 2 up to, but not including, 10.
