Python common libraries: getting started with NumPy and Sklearn

Source: Internet
Author: User

NumPy and Scikit-learn are common third-party libraries for Python. NumPy stores and manipulates large arrays and matrices and, to a large extent, makes up for Python's lack of computational efficiency; it is largely thanks to NumPy that Python has become a strong tool for numerical computing. Sklearn (scikit-learn) is Python's well-known machine learning library. It wraps a large number of machine learning algorithms, ships with a number of open datasets, and has thorough documentation, which has made it one of the most popular tools for learning and practicing machine learning.

1. NumPy Library

First, import the NumPy library:

    import numpy as np

1.1 numpy.array and list

    a = [1, 2, 3, 4]            # Python built-in array structure (list)
    b = np.array([1, 2, 3, 4])  # NumPy array structure

Python already has a built-in array structure (list), so why use the NumPy array structure? To answer this, look at how the built-in list behaves. The elements of a list need not share a type: they can be strings, integers, or even class instances. This flexibility is convenient, but it comes at the cost of a lot of memory. The reason is that a list does not store the values themselves; it stores the address (a pointer) of each element, and each element is a separate object. Every element therefore costs one pointer plus one object, which is more than a native array that stores the raw values directly. So when memory consumption matters, consider replacing the list with np.array: it saves a lot of space, and the NumPy array is also an excellent container for fast numerical calculation.
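The memory argument above can be checked roughly with a short sketch. The names `py_list` and `np_arr` and the element count are illustrative assumptions, and `sys.getsizeof` gives only an approximate figure for the list, since CPython may share some small integer objects between containers.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list holds one pointer per element, and each element is a separate int object.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# A NumPy array stores the raw values contiguously: 8 bytes per int64 element.
arr_bytes = np_arr.nbytes

print("list (approx.):", list_bytes, "bytes")
print("numpy array data:", arr_bytes, "bytes")
```

On a typical 64-bit CPython build the list figure comes out several times larger than the array figure.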

1.2 Common NumPy operations

Create an array

    np.array([1, 2, 3])          # create a one-dimensional array
    np.asarray([1, 2, 3])
    np.array([[1, 2], [3, 4]])   # create a multidimensional array
    np.zeros((3, 2))             # 3 rows, 2 columns, all zeros
    np.ones((3, 2))              # all ones
    np.full((3, 2), 5)           # 3 rows, 2 columns, all filled with 5

The difference between np.array and np.asarray:

    def asarray(a, dtype=None, order=None):
        return array(a, dtype, copy=False, order=order)

As you can see, the main difference is that array copies its input by default, creating a new object that occupies new memory, while asarray does not copy when the input is already an ndarray of a compatible type. In that sense array resembles a deep copy and asarray resembles a shallow copy.
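A small sketch of this difference, assuming the input is already an ndarray (the variable names are illustrative):

```python
import numpy as np

a = np.ones((2, 2))

b = np.array(a)    # copies the data into a new array
c = np.asarray(a)  # input is already an ndarray, so no copy is made

a[0, 0] = 99       # modify the original in place

print(b[0, 0])     # the copy is unaffected
print(c[0, 0])     # the asarray result sees the change
print(c is a)      # asarray returned the very same object
```

When the input is a plain Python list, both functions have to build a new array, so the distinction only matters for inputs that are already arrays.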

Numerical calculation

Basic Computing

    arr1 = np.array([[1, 2, 3], [4, 5, 6]])
    arr2 = np.array([[6, 5], [4, 3], [2, 1]])

    # view array dimensions
    print(arr1.shape)  # (2, 3)

    # slices
    np.array([1, 2, 3, 4, 5, 6])[:3]  # array([1, 2, 3])
    arr1[0:2, 0:2]  # two-dimensional slicing

    # multiplication
    np.array([1, 2, 3]) * np.array([2, 3, 4])  # multiply corresponding elements: array([2, 6, 12])
    arr1.dot(arr2)  # matrix multiplication

    # matrix summation
    np.sum(arr1)          # sum of all elements
    np.sum(arr1, axis=0)  # column sums: array([5, 7, 9])
    np.sum(arr1, axis=1)  # row sums: array([6, 15])

    # maximum and minimum
    np.max(arr1, axis=0)  # column maxima; use axis=1 for row maxima
    np.min(arr1, axis=0)  # column minima; use axis=1 for row minima
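To make the contrast between element-wise multiplication and matrix multiplication concrete, here is a minimal sketch reusing arr1 and arr2 from the listing above. The result names are illustrative, and the `@` operator is shown as an equivalent spelling of `dot` for 2-D arrays.

```python
import numpy as np

arr1 = np.array([[1, 2, 3], [4, 5, 6]])    # shape (2, 3)
arr2 = np.array([[6, 5], [4, 3], [2, 1]])  # shape (3, 2)

# Element-wise: shapes must match (or broadcast); each entry is multiplied by its counterpart.
elementwise = arr1 * arr1

# Matrix product: (2, 3) times (3, 2) gives a (2, 2) result.
product = arr1.dot(arr2)
same_product = arr1 @ arr2

# axis=0 collapses the rows (one result per column); axis=1 collapses the columns.
col_sums = arr1.sum(axis=0)
row_sums = arr1.sum(axis=1)

print(elementwise)
print(product)
print(col_sums, row_sums)
```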

Advanced calculation

    arr = np.array([[1, 2], [3, 4], [5, 6]])

    # Boolean array indexing
    print(arr > 2)
    """
    [[False False]
     [ True  True]
     [ True  True]]
    """
    print(arr[arr > 2])  # [3 4 5 6]

    # modify shape
    arr.reshape(2, 3)
    """
    array([[1, 2, 3],
           [4, 5, 6]])
    """
    arr.flatten()  # flatten: array([1, 2, 3, 4, 5, 6])
    arr.T          # transpose
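Boolean indexing can also be used for assignment, not only selection. The following sketch assumes the same arr as above; `mask`, `selected`, and `zeroed` are illustrative names.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4], [5, 6]])

mask = arr > 2        # Boolean array with the same shape as arr
selected = arr[mask]  # 1-D array of the elements where mask is True

zeroed = arr.copy()   # work on a copy so arr itself is left unchanged
zeroed[mask] = 0      # assign to every position where mask is True

print(mask.shape)
print(selected)
print(zeroed)
```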

2. Sklearn Library

Sklearn is an important machine learning library for Python. It wraps a large number of machine learning algorithms for tasks such as classification, regression, dimensionality reduction, and clustering, and is organized into three groups of modules: supervised learning, unsupervised learning, and data transformation. Its documentation is comprehensive, which makes it easy to get started, and it ships with built-in datasets, which saves the time otherwise spent collecting and cleaning data. As a result it is used in a wide range of applications. The following is a brief introduction to the commonly used modules in Sklearn.

Supervised learning

    sklearn.neighbors              # nearest neighbors
    sklearn.svm                    # support vector machines
    sklearn.kernel_ridge           # kernel ridge regression
    sklearn.discriminant_analysis  # discriminant analysis
    sklearn.linear_model           # generalized linear models
    sklearn.ensemble               # ensemble learning
    sklearn.tree                   # decision trees
    sklearn.naive_bayes            # naive Bayes
    sklearn.cross_decomposition    # cross decomposition
    sklearn.gaussian_process       # Gaussian processes
    sklearn.neural_network         # neural networks
    sklearn.calibration            # probability calibration
    sklearn.isotonic               # isotonic regression
    sklearn.feature_selection      # feature selection
    sklearn.multiclass             # multiclass and multilabel algorithms

Each of these modules contains multiple algorithms that can be imported directly when needed, for example:

    from sklearn.linear_model import LogisticRegression
    lr_model = LogisticRegression()

Unsupervised learning

    sklearn.decomposition   # matrix factorization
    sklearn.cluster         # clustering
    sklearn.manifold        # manifold learning
    sklearn.mixture         # Gaussian mixture models
    sklearn.neural_network  # unsupervised neural networks
    sklearn.covariance      # covariance estimation

Data transformation

    sklearn.feature_extraction    # feature extraction
    sklearn.feature_selection     # feature selection
    sklearn.preprocessing         # preprocessing
    sklearn.random_projection     # random projection
    sklearn.kernel_approximation  # kernel approximation

Data set
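As one illustration of the built-in datasets mentioned earlier, the sketch below loads the Iris dataset through `sklearn.datasets.load_iris`. The choice of Iris is an assumption made for the example; the article does not say which datasets it had in mind.

```python
from sklearn.datasets import load_iris

iris = load_iris()

features = iris.data                    # feature matrix, one row per sample
labels = iris.target                    # integer class label for each sample
class_names = list(iris.target_names)   # human-readable class names

print(features.shape)
print(labels.shape)
print(class_names)
```

Other loaders in `sklearn.datasets` follow the same pattern of returning the data together with its targets.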

  

  

In addition, Sklearn has a unified API: different machine learning algorithms are usually used through exactly the same interface. The general workflow is:

Step 1. Load and preprocess the data.

Step 2. Define the classifier, for example: lr_model = LogisticRegression()

Step 3. Train the model on the training set: lr_model.fit(x, y)

Step 4. Use the trained model for prediction: y_pred = lr_model.predict(x_test)

Step 5. Evaluate the model's performance: lr_model.score(x_test, y_test)
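The five steps above can be sketched end to end as follows. The Iris dataset, the 70/30 split, the standardization step, and the `max_iter` and `random_state` values are all assumptions chosen for illustration; they are not taken from the original article.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Step 1: load and preprocess the data (Iris is an illustrative choice).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
scaler = StandardScaler().fit(X_train)  # fit the scaler on training data only
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

# Step 2: define the classifier.
lr_model = LogisticRegression(max_iter=1000)

# Step 3: train the model on the training set.
lr_model.fit(X_train, y_train)

# Step 4: predict with the trained model.
y_pred = lr_model.predict(X_test)

# Step 5: evaluate the model (score returns mean accuracy for classifiers).
accuracy = lr_model.score(X_test, y_test)
print(accuracy)
```

Because the interface is uniform, swapping in a different estimator, for instance one from `sklearn.svm` or `sklearn.tree`, should only require changing Step 2.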
