NumPy and scikit-learn (sklearn) are widely used third-party libraries for Python. NumPy stores and manipulates large arrays and matrices, and it largely makes up for Python's slow numerical loops; it is a main reason Python is a strong tool for numerical computing. Sklearn is Python's best-known machine learning library. It wraps a large number of machine learning algorithms, ships with a number of open datasets, and has thorough documentation, which makes it one of the most popular tools for learning and practicing machine learning.
1. NumPy Library
First, import the NumPy library:
import numpy as np
1.1 numpy.array and list
a = [1, 2, 3, 4]            # Python built-in array structure (list)
b = np.array([1, 2, 3, 4])  # NumPy array structure
Python already has a built-in array structure (list), so why use the NumPy array structure? To answer this, look at how a list behaves. The elements of a list do not have to share one data type: an element can be a string, an integer, or even a class instance. This flexibility is convenient, but it costs a lot of memory. Each element of a list needs a pointer plus the object it points to; the list itself only stores the addresses of its data, so it uses more memory than a native array that stores the values directly. When memory consumption matters, replace the list with np.array, which saves a lot of space. The NumPy array is also an excellent container for fast numerical calculation.
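The memory argument above can be checked with a rough measurement. This is only a sketch: it assumes a 64-bit CPython build, the names py_list and np_arr are illustrative, and sys.getsizeof gives an estimate rather than an exact accounting (CPython shares some small integer objects, for example).

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list stores one pointer per element; every int object it points to
# takes additional memory on top of that.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)

# A NumPy array stores the raw 8-byte values contiguously.
array_bytes = np_arr.nbytes

print("list :", list_bytes, "bytes (estimate)")
print("array:", array_bytes, "bytes")
```

The exact list figure varies by Python version, but it should come out several times larger than the array's data buffer.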
1.2 Common NumPy operations
Create an array
# Create a one-dimensional array
np.asarray([1, 2, 3, 4])
np.array([1, 2, 3, 4])
# Create a multidimensional array
np.zeros((3, 2))    # 3 rows, 2 columns, all zeros
np.ones((3, 2))     # all ones
np.full((3, 2), 5)  # 3 rows, 2 columns, all filled with 5
The difference between np.array and np.asarray:
def asarray(a, dtype=None, order=None):
    return array(a, dtype, copy=False, order=order)
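As you can see, the main difference is that np.array copies its input by default and occupies new memory, while np.asarray does not copy when the input is already an ndarray with the requested dtype and order. In that case np.array behaves like a deep copy, and np.asarray simply returns the original array. When the input is a list, both functions have to create a new array.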
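A small check makes the copy behavior visible. The variable names are illustrative; the sketch assumes the input is already an ndarray, which is the case where the two functions differ.

```python
import numpy as np

a = np.ones(3)
b = np.array(a)    # copies by default: b has its own memory
c = np.asarray(a)  # a is already an ndarray, so the same object comes back

a[0] = 5           # modify the original in place

print(b[0])    # the copy is unaffected
print(c[0])    # c sees the change because it is a
print(c is a)
```

If a function only reads its argument, np.asarray avoids an unnecessary copy; if it modifies the data, np.array protects the caller's array.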
Numerical calculation
Basic Computing
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([[6, 5], [4, 3], [2, 1]])
# View array dimensions
print(arr1.shape)  # (2, 3)
# Slices
np.array([1, 2, 3, 4, 5, 6])[:3]  # array([1, 2, 3])
arr1[0:2, 0:2]  # two-dimensional slicing
# Multiplication
np.array([1, 2, 3]) * np.array([2, 3, 4])  # multiply corresponding elements: array([2, 6, 12])
arr1.dot(arr2)  # matrix multiplication
# Matrix summation
np.sum(arr1)  # sum of all elements
np.sum(arr1, axis=0)  # column sums: array([5, 7, 9])
np.sum(arr1, axis=1)  # row sums: array([6, 15])
# Maximum and minimum
np.max(arr1, axis=0)  # use axis=1 for per-row maxima
np.min(arr1, axis=0)  # use axis=1 for per-row minima
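The two kinds of multiplication and the meaning of axis are easy to mix up, so here is a short self-contained sketch. The names a and b are illustrative; the @ operator is the standard Python matrix-multiplication operator and is equivalent to dot for 2-D arrays.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])    # shape (2, 3)
b = np.array([[6, 5], [4, 3], [2, 1]])  # shape (3, 2)

elementwise = a * a  # element-wise: shapes must match (or broadcast)
product = a.dot(b)   # matrix product: (2, 3) x (3, 2) gives (2, 2)
same = a @ b         # operator form of the same matrix product

col_max = a.max(axis=0)  # axis=0 collapses the rows: one value per column
row_max = a.max(axis=1)  # axis=1 collapses the columns: one value per row

print(elementwise)
print(product)
print(col_max, row_max)
```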
Advanced calculation
arr = np.array([[1, 2], [3, 4], [5, 6]])
# Boolean array indexing
print(arr > 2)
"""
[[False False]
 [ True  True]
 [ True  True]]
"""
print(arr[arr > 2])  # [3 4 5 6]
# Modify shape
arr.reshape(2, 3)
"""
array([[1, 2, 3],
       [4, 5, 6]])
"""
arr.flatten()  # flatten: array([1, 2, 3, 4, 5, 6])
arr.T  # transpose
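Boolean masks can also be used for assignment, and reshape can infer one dimension. The sketch below illustrates both; the names mask, clipped, reshaped, and flat are illustrative only.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4], [5, 6]])

mask = arr > 2          # Boolean array with the same shape as arr
clipped = arr.copy()
clipped[mask] = 0       # assign to every position where the mask is True

reshaped = arr.reshape(2, -1)  # -1 lets NumPy infer the missing dimension

flat = arr.flatten()    # flatten always returns a copy
flat[0] = 99            # so changing it leaves arr untouched

print(clipped)
print(reshaped.shape)
print(arr[0, 0])
```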
2. Sklearn Library
Sklearn is an important machine learning library for Python. It wraps a large number of machine learning algorithms for classification, regression, dimensionality reduction, and clustering, and it is organized into three groups of modules: supervised learning, unsupervised learning, and data transformation. Its documentation is comprehensive, which makes it easy to get started, and its built-in datasets save the time otherwise spent collecting and cleaning data. As a result, it is used in a wide range of applications. The following is a brief introduction to the common modules in sklearn.
Supervised learning
sklearn.neighbors  # nearest neighbors
sklearn.svm  # support vector machines
sklearn.kernel_ridge  # kernel ridge regression
sklearn.discriminant_analysis  # discriminant analysis
sklearn.linear_model  # generalized linear models
sklearn.ensemble  # ensemble learning
sklearn.tree  # decision trees
sklearn.naive_bayes  # naive Bayes
sklearn.cross_decomposition  # cross decomposition
sklearn.gaussian_process  # Gaussian processes
sklearn.neural_network  # neural networks
sklearn.calibration  # probability calibration
sklearn.isotonic  # isotonic regression
sklearn.feature_selection  # feature selection
sklearn.multiclass  # multiclass and multilabel algorithms
Each of these modules contains multiple algorithms that can be imported directly when needed, for example:
from sklearn.linear_model import LogisticRegression
lr_model = LogisticRegression()
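Estimators from different supervised modules are imported and used the same way. The sketch below trains three of them on a tiny made-up dataset; the data and the parameter choices (n_neighbors=1, random_state=0) are assumptions for illustration, not recommendations.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Toy data (made up): one feature, two classes.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

models = [
    LogisticRegression(),
    KNeighborsClassifier(n_neighbors=1),
    DecisionTreeClassifier(random_state=0),
]

predictions = {}
for model in models:
    model.fit(X, y)  # every estimator exposes fit
    predictions[type(model).__name__] = model.predict([[0.5], [2.5]]).tolist()

print(predictions)
```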
Unsupervised learning
sklearn.decomposition  # matrix factorization
sklearn.cluster  # clustering
sklearn.manifold  # manifold learning
sklearn.mixture  # Gaussian mixture models
sklearn.neural_network  # unsupervised neural networks
sklearn.covariance  # covariance estimation
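As a minimal sketch of two of these modules, the example below clusters six made-up points with KMeans and reduces them to one dimension with PCA. The data and the parameter values are assumptions chosen so that the result is easy to check by eye.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Two well-separated groups of points (made-up data).
X = np.array([
    [0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
    [5.0, 5.1], [5.2, 5.0], [5.1, 5.2],
])

# Clustering: no labels are given, the model groups the points itself.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Dimensionality reduction: project the 2-D points onto 1 component.
pca = PCA(n_components=1)
X_1d = pca.fit_transform(X)

print(labels)
print(X_1d.shape)
```

The cluster numbering itself (which group is 0 and which is 1) is arbitrary; only the grouping is meaningful.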
Data transformation
sklearn.feature_extraction  # feature extraction
sklearn.feature_selection  # feature selection
sklearn.preprocessing  # preprocessing
sklearn.random_projection  # random projection
sklearn.kernel_approximation  # kernel approximation
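Transformers follow a fit/transform pattern. The sketch below uses StandardScaler from sklearn.preprocessing on made-up data: the scaler learns the mean and standard deviation from the training data only and then applies the same scaling to new data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data: two features on very different scales.
X_train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
X_test = np.array([[2.0, 400.0]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std, then scale
X_test_scaled = scaler.transform(X_test)        # reuse the training statistics

print(X_train_scaled.mean(axis=0))  # approximately 0 per column
print(X_train_scaled.std(axis=0))   # approximately 1 per column
print(X_test_scaled)
```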
Data set
In addition, sklearn has a unified API: different machine learning algorithms can usually be used through exactly the same interface. The general process is:
Step 1. Load and preprocess the data
Step 2. Define the classifier, for example: lr_model = LogisticRegression()
Step 3. Train the model on the training set: lr_model.fit(x, y)
Step 4. Use the trained model for prediction: y_pred = lr_model.predict(x_test)
Step 5. Evaluate the model's performance: lr_model.score(x_test, y_test)
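The five steps can be put together in one short script. This is a sketch under stated assumptions: the iris dataset, the 70/30 split, random_state=0, and max_iter=1000 are illustrative choices that do not come from the text above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Step 1: load the data and split it into training and test sets.
X, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Step 2: define the classifier.
lr_model = LogisticRegression(max_iter=1000)

# Step 3: train the model on the training set.
lr_model.fit(x_train, y_train)

# Step 4: predict on unseen data.
y_pred = lr_model.predict(x_test)

# Step 5: evaluate; for classifiers, score returns the mean accuracy.
accuracy = lr_model.score(x_test, y_test)
print("accuracy:", accuracy)
```

Swapping in another classifier only changes the import and the line in Step 2; the remaining steps stay the same.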