Python scientific computing package NumPy: detailed usage examples
This article describes how to use NumPy, the Python scientific computing package, through worked examples.
1. Data Structure
NumPy manages data with ndarray, an N-dimensional array structure similar to the matrices of MATLAB. It is more powerful than the Python list and the array class of the standard library, and makes data easier to process.
1.1 array generation
Every NumPy array has a single data type. If you do not specify one, it is inferred from the data; for integers the default is the platform integer (int32 in the sessions shown here). You can set the data type explicitly with the dtype parameter. Commonly used types are int32, bool, float32, uint32, and complex, which represent integer, Boolean, floating-point, unsigned integer, and complex numbers respectively.
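As a small illustration of the dtype parameter, the sketch below shows the type being inferred when dtype is omitted and being set explicitly. It uses the conventional `import numpy as np` alias, which is an assumption and not how the original sessions import NumPy; the variable names are made up for the example.

```python
import numpy as np

# dtype is inferred from the data when it is not given
ints = np.array([1, 2, 3])       # platform default integer (int32 or int64)
floats = np.array([1.0, 2.5])    # float64

# dtype can also be set explicitly
small = np.array([1, 2, 3], dtype=np.float32)
flags = np.array([0, 1, 2], dtype=bool)      # non-zero values become True
cplx = np.array([1, 2], dtype=complex)

print(ints.dtype, floats.dtype, small.dtype, flags.dtype, cplx.dtype)
```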
There are several common ways to generate an array. The sessions below appear to assume that the NumPy names were imported beforehand, for example with from numpy import *.
Generate from a list (convert back to a list with tolist):
In[3]: a = array([1, 2, 3])
In[4]: a
Out[4]: array([1, 2, 3])
In[5]: a.tolist()
Out[5]: [1, 2, 3]
Specify the start point, end point, and step (or number of points) to generate an arithmetic or geometric sequence:
In[7]: a = arange(1, 10, 2)
In[8]: a
Out[8]: array([1, 3, 5, 7, 9])
In[13]: a = linspace(0, 10, 5)
In[14]: a
Out[14]: array([ 0. , 2.5, 5. , 7.5, 10. ])
In[148]: a = logspace(0, 3, 10)  # 0 means the start point is 10^0, 3 means the end point is 10^3; the base is set with the base parameter
In[149]: a
Out[148]: array([ 1., 2.15443469, 4.64158883, 10., 21.5443469, 46.41588834, 100., 215.443469, 464.15888336, 1000.])
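The three sequence generators can be compared side by side. This is a minimal sketch using the `np` alias (an assumption); the `base=2` call is added only to show what the base parameter changes.

```python
import numpy as np

# arange: start, stop (excluded), step -> arithmetic sequence
odds = np.arange(1, 10, 2)

# linspace: start, stop (included), number of points
evenly = np.linspace(0, 10, 5)

# logspace: exponents of the start and end points; base defaults to 10
powers_of_ten = np.logspace(0, 3, 4)
powers_of_two = np.logspace(0, 3, 4, base=2)

print(odds, evenly, powers_of_ten, powers_of_two, sep="\n")
```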
Generate from an iterator:
In[17]: iter = (i for i in range(5))
In[18]: a = fromiter(iter, dtype=int32)
In[19]: a
Out[19]: array([0, 1, 2, 3, 4])
Generate from a function:
In[156]: def f(i, j):
...     return abs(i-j)
...
In[157]: fromfunction(f, (4, 4))
Out[156]: array([[ 0., 1., 2., 3.],
       [ 1., 0., 1., 2.],
       [ 2., 1., 0., 1.],
       [ 3., 2., 1., 0.]])
You can also use zeros, ones, empty, and other functions to quickly create arrays.
A matrix can be represented as a two-dimensional array:
In[24]: b = array([arange(5), arange(1, 6), arange(2, 7)])
In[25]: b
Out[25]: array([[0, 1, 2, 3, 4],
       [1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6]])
The same method can be used to expand to higher dimensions.
In addition, we can generate an array with a custom record format (called a structured array) to hold rows of data such as those in a worksheet or database table:
In[61]: t = dtype([('name', str, 40), ('number', int32), ('score', float32)])
In[62]: t
Out[62]: dtype([('name', '<U40'), ('number', '<i4'), ('score', '<f4')])
In[63]: students = array([('Tom', 10, 80), ('Jenny', 11, 90.5), ('Mike', 9, 98.5)], dtype=t)
In[64]: students
Out[64]: array([('Tom', 10, 80.0), ('Jenny', 11, 90.5), ('Mike', 9, 98.5)],
      dtype=[('name', '<U40'), ('number', '<i4'), ('score', '<f4')])
In[65]: students[1]
Out[65]: ('Jenny', 11, 90.5)
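Beyond indexing whole records, a structured array can be addressed by field name. The sketch below rebuilds the same student records with the `np` alias (an assumption) and writes the string field as "U40", which is the form the session prints for it; the names `student_type`, `mean_score`, and `top` are hypothetical.

```python
import numpy as np

# Same record layout as above; "U40" is a Unicode string of up to 40 characters
student_type = np.dtype([("name", "U40"), ("number", np.int32), ("score", np.float32)])
students = np.array(
    [("Tom", 10, 80), ("Jenny", 11, 90.5), ("Mike", 9, 98.5)],
    dtype=student_type,
)

# A field name selects a whole column
names = students["name"]
mean_score = students["score"].mean()

# Fields can be combined with Boolean indexing
top = students[students["score"] > 85]["name"]
print(names, mean_score, top)
```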
Later we will see that pandas provides a more refined way to process records.
1.2 array index
Simple subscript index:
In[30]: a[2]
Out[30]: 2
In[31]: b[2, 1]
Out[31]: 3
As in Python, indices start from 0. Negative indices are also accepted:
In[32]: a[-1]
Out[32]: 4
In[33]: b[-1, -2]
Out[33]: 5
Index multiple values at once by using an integer array as the subscript:
In[162]: arange(11, 20)[array([2, 4, 8])]
Out[161]: array([13, 15, 19])
You can also index with an array of Boolean values:
In[40]: idx = array([True, False, False, True, True])
In[41]: a[idx]
Out[41]: array([0, 3, 4])
Boolean indexing enables more advanced forms of indexing, such as conditional indexing:
In[42]: b[b>3]
Out[42]: array([4, 4, 5, 4, 5, 6])
All the elements of b that are greater than 3 are returned as a one-dimensional array. This works because b>3 returns a Boolean array with the same shape as b, in which each position holds the result of comparing the corresponding element of b with 3:
In[43]: b>3
Out[43]: array([[False, False, False, False, True],
       [False, False, False, True, True],
       [False, False, True, True, True]], dtype=bool)
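The same mechanism extends to combined conditions and to assignment. The following is a sketch with the `np` alias (an assumption); `middle` and `clipped` are illustrative names that do not appear in the original session.

```python
import numpy as np

b = np.array([np.arange(5), np.arange(1, 6), np.arange(2, 7)])

mask = b > 3          # Boolean array with the same shape as b
selected = b[mask]    # 1-D array of the matching elements

# Conditions are combined with & and |, each wrapped in parentheses
middle = b[(b > 1) & (b < 4)]

# A mask can also be used on the left-hand side of an assignment
clipped = b.copy()
clipped[clipped > 3] = 3
print(selected, middle, clipped, sep="\n")
```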
1.3 array slices
ndarray supports several forms of slicing, which can be driven either by position (subscript) or by value. To keep positions and values apart in the examples, we create a new array whose values differ from their indices:
a = arange(11, 20)
In[54]: a
Out[54]: array([11, 12, 13, 14, 15, 16, 17, 18, 19])
Slice by subscript:
In[55]: a[1:4]
Out[55]: array([12, 13, 14])
In[56]: a[1:8:2]
Out[56]: array([12, 14, 16, 18])
In[57]: a[1::2]
Out[57]: array([12, 14, 16, 18])
In[58]: a[:8:]
Out[58]: array([11, 12, 13, 14, 15, 16, 17, 18])
The three values in the square brackets are the start point, end point, and step. For a positive step they default to 0, the length of the array, and 1 respectively (an end point of -1 would drop the last element). Note that the end point is not included. Setting the step to -1 flips the array:
In[60]: a[::-1]
Out[60]: array([19, 18, 17, 16, 15, 14, 13, 12, 11])
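The slice defaults can be checked directly. This is a minimal sketch with the `np` alias (an assumption); the variable names are illustrative only.

```python
import numpy as np

a = np.arange(11, 20)

# For a positive step, omitted values mean start=0, stop=len(a), step=1
same = np.array_equal(a[:], a[0:len(a):1])

# The end point is excluded, so a[1:4] has three elements
first_three = a[1:4]

# A stop of -1 counts from the end and therefore drops the last element
without_last = a[:-1]

# With a negative step the defaults are reversed, which flips the array
flipped = a[::-1]
print(same, first_three, without_last, flipped, sep="\n")
```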
ndarray also supports slicing of multi-dimensional arrays. Such an array can be created by modifying the shape attribute of a one-dimensional array or by calling its reshape method:
In[68]: a = arange(0, 24).reshape(2, 3, 4)
In[69]: a
Out[69]: array([[[ 0, 1, 2, 3],
        [ 4, 5, 6, 7],
        [ 8, 9, 10, 11]],
       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
Indexing a multi-dimensional array is not much different from indexing a one-dimensional one. Use : to select everything along an axis:
In[70]: a[:, 0, 0]
Out[70]: array([ 0, 12])
In[71]: a[0, :, 0]
Out[71]: array([0, 4, 8])
In[72]: a[0, 0, :]
Out[72]: array([0, 1, 2, 3])
In[73]: a[0, 0:2, 0:3]
Out[73]: array([[0, 1, 2],
       [4, 5, 6]])
Several consecutive colons can also be replaced with ...:
In[74]: a[...,3]
Out[74]: array([[ 3, 7, 11],
       [15, 19, 23]])
Finally, a slice object can be used to represent a slice. It is built from the same three values as the 1:10:2 notation:
In[169]: idx = slice(None, None, 2)
In[171]: a[idx,idx,idx]
Out[170]: array([[[ 0, 2],
        [ 8, 10]]])
This is equivalent to a[::2, ::2, ::2].
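Both equivalences can be verified with a short check. This is a sketch using the `np` alias (an assumption); `every_other` is a hypothetical name for the slice object.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# ... stands for as many full slices as are needed
last_column = a[..., 3]
same_column = a[:, :, 3]

# slice(start, stop, step) builds the object that start:stop:step creates
every_other = slice(None, None, 2)
strided = a[every_other, every_other, every_other]

print(np.array_equal(last_column, same_column))
print(np.array_equal(strided, a[::2, ::2, ::2]))
```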
1.4 array transformation
You can flatten the preceding 3D array:
In[75]: a.flatten()
Out[75]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23])
Transpose:
In[77]: b.transpose()
Out[77]: array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4],
       [3, 4, 5],
       [4, 5, 6]])
Modify the shape attribute to change the dimensions:
In[79]: a.shape = 4, 6
In[80]: a
Out[80]: array([[ 0, 1, 2, 3, 4, 5],
       [ 6, 7, 8, 9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
1.5 array combination
First, create an array b of the same shape as a:
In[83]: b = 2*a
Arrays can be combined in several ways, for example horizontally:
In[88]: hstack((a, b))
Out[88]: array([[ 0, 1, 2, 3, 4, 5, 0, 2, 4, 6, 8, 10],
       [ 6, 7, 8, 9, 10, 11, 12, 14, 16, 18, 20, 22],
       [12, 13, 14, 15, 16, 17, 24, 26, 28, 30, 32, 34],
       [18, 19, 20, 21, 22, 23, 36, 38, 40, 42, 44, 46]])
Vertical combination:
In[89]: vstack((a, b))
Out[89]: array([[ 0, 1, 2, 3, 4, 5],
       [ 6, 7, 8, 9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [ 0, 2, 4, 6, 8, 10],
       [12, 14, 16, 18, 20, 22],
       [24, 26, 28, 30, 32, 34],
       [36, 38, 40, 42, 44, 46]])
The concatenate function implements both of these by means of its axis parameter: the default, axis=0, gives the vertical combination, and axis=1 gives the horizontal one.
You can also combine along the depth axis:
In[91]: dstack((a, b))
Out[91]: array([[[ 0, 0], [ 1, 2], [ 2, 4], [ 3, 6], [ 4, 8], [ 5, 10]],
       [[ 6, 12], [ 7, 14], [ 8, 16], [ 9, 18], [10, 20], [11, 22]],
       [[12, 24], [13, 26], [14, 28], [15, 30], [16, 32], [17, 34]],
       [[18, 36], [19, 38], [20, 40], [21, 42], [22, 44], [23, 46]]])
This is like stacking the points of two-dimensional planes along the axis perpendicular to them.
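The relationship between concatenate and the stacking helpers can be sketched as follows, using the `np` alias (an assumption). The use of `np.stack` with `axis=2` as a counterpart of dstack for two-dimensional inputs is an addition for illustration and does not come from the original session.

```python
import numpy as np

a = np.arange(24).reshape(4, 6)
b = 2 * a

# axis=0 (the default) stacks rows, like vstack
vertical = np.concatenate((a, b))
# axis=1 stacks columns, like hstack
horizontal = np.concatenate((a, b), axis=1)
# For 2-D inputs, stacking along a new third axis matches dstack
depth = np.stack((a, b), axis=2)

print(vertical.shape, horizontal.shape, depth.shape)
```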
1.6 split an array
Horizontal split:
In[94]: hsplit(a, 3)
Out[94]: [array([[ 0, 1], [ 6, 7], [12, 13], [18, 19]]),
 array([[ 2, 3], [ 8, 9], [14, 15], [20, 21]]),
 array([[ 4, 5], [10, 11], [16, 17], [22, 23]])]
Vertical split:
In[97]: vsplit(a, 2)
Out[96]: [array([[ 0, 1, 2, 3, 4, 5], [ 6, 7, 8, 9, 10, 11]]),
 array([[12, 13, 14, 15, 16, 17], [18, 19, 20, 21, 22, 23]])]
The split function achieves both effects; its axis parameter selects the direction.
Similarly, the dsplit function splits along the depth axis.
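A short sketch of split with the axis parameter, together with dsplit, using the `np` alias (an assumption). The three-dimensional `cube` array is introduced only because dsplit needs a third axis to cut along.

```python
import numpy as np

a = np.arange(24).reshape(4, 6)

# axis=1 cuts across columns, like hsplit
left, middle, right = np.split(a, 3, axis=1)
# axis=0 (the default) cuts across rows, like vsplit
top, bottom = np.split(a, 2)

# dsplit needs at least three dimensions; it cuts along the third axis
cube = np.arange(24).reshape(2, 3, 4)
front, back = np.dsplit(cube, 2)

print(left.shape, top.shape, front.shape)
```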
In addition, several ndarray attributes report information about an array:
In[125]: a.ndim  # number of dimensions
Out[124]: 2
In[126]: a.size  # total number of elements
Out[125]: 24
In[127]: a.itemsize  # bytes occupied by each element in memory
Out[126]: 4
In[128]: a.shape  # size along each dimension
Out[127]: (4, 6)
In[130]: a.T  # transpose, equivalent to the transpose function
Out[129]: array([[ 0, 6, 12, 18],
       [ 1, 7, 13, 19],
       [ 2, 8, 14, 20],
       [ 3, 9, 15, 21],
       [ 4, 10, 16, 22],
       [ 5, 11, 17, 23]], dtype=int32)
The flat attribute of a multi-dimensional array provides a flat iterator, a flatiter object, which lets us iterate over a high-dimensional array as if it were one-dimensional:
In[134]: for item in array([1, 2, 3, 4]).reshape(2, 2).flat:
...     print(item)
...
1
2
3
4
A flatiter object can also be indexed to read several elements at once, and assigned to in order to modify values:
In[140]: af = a.flat
In[141]: af[:]
Out[140]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23], dtype=int32)
In[143]: af[3] = 15
In[144]: af[:]
Out[143]: array([ 0, 1, 2, 15, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23], dtype=int32)
1.7 matrix generation
As mentioned above, a two-dimensional array can be used to represent a matrix. NumPy also provides a dedicated data structure for matrices, matrix, which is constructed with the mat function:
In[8]: m = mat('1 2 3; 4 5 6; 7 8 9')
In[9]: m
Out[9]: matrix([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
Two-dimensional arrays and matrices are easily converted into each other:
In[11]: array(m)
Out[11]: array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
In[12]: mat(_)
Out[12]: matrix([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
The matrix type is more convenient for matrix work and offers additional methods and attributes, such as:
Inverse (this particular m is singular, its determinant being 0, so the huge values below are rounding artifacts rather than a meaningful inverse):
In[17]: m.I
Out[17]: matrix([[ -4.50359963e+15, 9.00719925e+15, -4.50359963e+15],
        [ 9.00719925e+15, -1.80143985e+16, 9.00719925e+15],
        [ -4.50359963e+15, 9.00719925e+15, -4.50359963e+15]])
Block matrix:
In[25]: I = eye(3)
In[26]: bmat('m I; I m')
Out[26]: matrix([[ 1., 2., 3., 1., 0., 0.],
        [ 4., 5., 6., 0., 1., 0.],
        [ 7., 8., 9., 0., 0., 1.],
        [ 1., 0., 0., 1., 2., 3.],
        [ 0., 1., 0., 4., 5., 6.],
        [ 0., 0., 1., 7., 8., 9.]])
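For comparison, the same block layout can be sketched with plain arrays. The use of `np.block` and `np.linalg.matrix_rank` here is an illustrative substitution and is not part of the original session; the rank check shows that the rows of m are linearly dependent.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
I = np.eye(3)

# np.block assembles a larger array from blocks, much like bmat('m I; I m')
blocks = np.block([[m, I], [I, m]])

# A rank below 3 means the 3x3 matrix has no true inverse
rank = np.linalg.matrix_rank(m)
print(blocks.shape, rank)
```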
2. Data Processing
2.1 condition judgment and search
You can use the where function to obtain indexes that meet the conditions for later processing:
In[219]: a = arange(24).reshape(4, 6)
In[220]: where(a>8)
Out[219]: (array([1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3], dtype=int32),
 array([3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5], dtype=int32))
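The index arrays returned by where can be used for the later processing step, as in this sketch with the `np` alias (an assumption). The three-argument form of where is shown as an extra and does not appear in the original session.

```python
import numpy as np

a = np.arange(24).reshape(4, 6)

# where(condition) returns one index array per dimension
rows, cols = np.where(a > 8)

# The index arrays can be fed straight back into the array
values = a[rows, cols]

# With three arguments, where chooses element by element between two sources
capped = np.where(a > 8, 8, a)
print(values, capped, sep="\n")
```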
The compress method selects the values that satisfy a condition from a one-dimensional array:
In[28]: a[0, :].compress(a[0, :] > 2)
Out[28]: array([3, 4, 5])
2.2 CSV file read/write
The CSV (comma-separated values) format is a convenient way to save arrays or matrices. Unlike files written with Python's pickle module, CSV files can be opened and inspected in any text editor at any time. Saving and reading them is straightforward:
In[190]: b
Out[189]: array([[ 0, 2, 4, 6, 8, 10],
       [12, 14, 16, 18, 20, 22],
       [24, 26, 28, 30, 32, 34],
       [36, 38, 40, 42, 44, 46]])
In[191]: savetxt("b.txt", b, delimiter=",")
In[192]: b1, b2 = loadtxt("b.txt", delimiter=",", usecols=(3, 4), unpack=True)
In[193]: b1, b2
Out[192]: (array([ 6., 18., 30., 42.]), array([ 8., 20., 32., 44.]))
The delimiter parameter is optional when saving; it sets the character that separates the array elements, and the same value must be given when the file is read back. To read only part of the data, use the usecols parameter to select columns. If unpack is set to True, the selected columns are returned as separate arrays.
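A self-contained round trip might look like the sketch below. It writes to a temporary directory instead of the working directory, which is a choice made for this example only; the `np` alias and the names `folder`, `path`, and `whole` are assumptions.

```python
import os
import tempfile

import numpy as np

b = 2 * np.arange(24).reshape(4, 6)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "b.txt")
    np.savetxt(path, b, delimiter=",")

    # Read back only columns 3 and 4, one array per column
    b1, b2 = np.loadtxt(path, delimiter=",", usecols=(3, 4), unpack=True)
    # Without usecols the whole table is returned as floats
    whole = np.loadtxt(path, delimiter=",")

print(b1, b2, whole.shape)
```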
You can specify the converters parameter to convert a string field (such as a date) while reading:
In[252]: import datetime
... def datestr2num(s):
...     # s arrives as bytes here; newer NumPy versions may pass str instead
...     return datetime.datetime.strptime(str(s, encoding="utf-8"), "%Y-%m-%d").date().weekday()
... weeks, numbers = loadtxt("b.txt", converters={0: datestr2num}, unpack=True)
In[253]: weeks
Out[252]: array([ 2., 4.])
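The converter can be tried without a file on disk. In the sketch below the two data rows are hypothetical, an in-memory buffer stands in for the file, and the converter accepts both bytes and str because the type passed to it may depend on the NumPy version; the `np` alias is also an assumption.

```python
import datetime
import io

import numpy as np

def datestr2num(s):
    # Depending on the NumPy version, the converter may receive bytes or str
    if isinstance(s, bytes):
        s = s.decode("utf-8")
    return datetime.datetime.strptime(s, "%Y-%m-%d").date().weekday()

# Hypothetical two-column data: a date and a number per row
text = io.StringIO("2024-01-03,10\n2024-01-05,20\n")
weeks, numbers = np.loadtxt(
    text, delimiter=",", converters={0: datestr2num}, unpack=True
)
print(weeks, numbers)
```

Here weekday() counts from Monday as 0, so a Wednesday becomes 2 and a Friday becomes 4.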
2.3 Universal functions
The frompyfunc function turns a function that acts on a single value into a function that acts on an array:
In[49]: def f(i):
...     return 2*i
...
In[50]: ff = frompyfunc(f, 1, 1)
In[52]: ff(a)
Out[52]: array([[0, 2, 4, 6, 8, 10],
       [12, 14, 16, 18, 20, 22],
       [24, 26, 28, 30, 32, 34],
       [36, 38, 40, 42, 44, 46]], dtype=object)
The last two arguments of frompyfunc give the number of input and output arguments respectively.
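Because the result has dtype object, a cast is often useful afterwards. This is a minimal sketch with the `np` alias (an assumption); the names `double` and `double_all` and the `astype` step are illustrative additions.

```python
import numpy as np

def double(i):
    return 2 * i

# One input argument, one output value
double_all = np.frompyfunc(double, 1, 1)

a = np.arange(6)
result = double_all(a)             # dtype is object, as in the session above
as_ints = result.astype(a.dtype)   # cast back to a numeric dtype if needed

print(result.dtype, as_ints.dtype)
```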
In addition, NumPy provides built-in universal functions such as add, subtract, multiply, and divide for addition, subtraction, multiplication, and division. Universal functions have four methods: reduce, accumulate, reduceat, and outer. Take the add function as an example:
In[64]: add.reduce(a[0, :])
Out[64]: 15
In[65]: add.accumulate(a[0,:])
Out[65]: array([ 0, 1, 3, 6, 10, 15], dtype=int32)
In[69]: add.reduceat(a[0, :], [0, 5, 2, 4])
Out[69]: array([10, 5, 5, 9], dtype=int32)
In[70]: add.outer(a[0, :], a[1, :])
Out[70]: array([[ 6, 7, 8, 9, 10, 11],
       [ 7, 8, 9, 10, 11, 12],
       [ 8, 9, 10, 11, 12, 13],
       [ 9, 10, 11, 12, 13, 14],
       [10, 11, 12, 13, 14, 15],
       [11, 12, 13, 14, 15, 16]])
As the output shows, reduce applies the universal function to the elements one after another to obtain a single final result. accumulate does the same but keeps and returns the intermediate results. reduceat computes the sum of the elements between each index and the next one in the list; if an index is not smaller than the next one, the single element at that index is returned instead. outer applies the function to every pair of elements taken from the two arrays.
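The four methods can be traced on a small input. This is a sketch with the `np` alias (an assumption); the comments spell out how each reduceat index pair is handled.

```python
import numpy as np

row = np.arange(6)                       # [0, 1, 2, 3, 4, 5]

total = np.add.reduce(row)               # 0+1+2+3+4+5
running = np.add.accumulate(row)         # partial sums
# Index pairs (0,5), (5,2), (2,4), (4,end): a pair whose start is not
# smaller than its end yields just the element at the start index
pieces = np.add.reduceat(row, [0, 5, 2, 4])
table = np.add.outer(row[:3], row[:3])   # sums of all pairs

print(total, running, pieces, table, sep="\n")
```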
3. Scientific Computing
3.1 statistical analysis
3.1.1 basic statistical analysis
The average function computes a weighted average, and mean computes the arithmetic mean:
In[204]: a = array([1, 2])
In[205]: average(a, weights=[1,2])
Out[204]: 1.6666666666666667
The basic statistical functions are as follows:
Median: median
Variance: var
Standard deviation: std
Difference: diff
Maximum and minimum: max, min, argmax, argmin (the last two return the index of the maximum and of the minimum value)
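The functions in the list above can be tried on a small sample. The data values below are made up for the example, and the `np` alias is an assumption.

```python
import numpy as np

data = np.array([3, 1, 4, 1, 5, 9, 2, 6])

print(np.mean(data))      # arithmetic mean
print(np.median(data))    # middle value of the sorted data
print(np.var(data))       # population variance (ddof=0 by default)
print(np.std(data))       # square root of the variance
print(np.diff(data))      # differences between neighbouring elements
print(data.max(), data.argmax())   # largest value and its index
print(data.min(), data.argmin())   # smallest value and index of its first occurrence
```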
3.1.2 Random Process Analysis
3.2 Linear Algebra
First, generate a matrix whose elements are random numbers between 0 and 1:
In[47]: a = mat(fromiter((random.random() for i in range(9)), dtype = float32).reshape(3, 3))
In[48]: a
Out[48]: matrix([[ 0.45035544, 0.53587919, 0.57240343],
        [ 0.54386997, 0.16267321, 0.97020519],
        [ 0.6454953 , 0.38505632, 0.94705021]], dtype=float32)
Next we can perform various linear algebra operations on it, such as:
Inverse:
In[49]: a.I
Out[49]: matrix([[-10.71426678, -14.01229095, 20.83065987],
        [ 5.42686558, 2.7832334 , -6.13131571],
        [ 5.09620285, 8.41894722, -10.64905548]], dtype=float32)
Solving linear equations (the dot product is used to verify the result):
In[59]: b = fromiter((random.random() for i in range(3)), dtype = float32)
In[60]: b
Out[60]: array([ 0.56506187, 0.99419129, 0.70462942], dtype=float32)
In[61]: linalg.solve(a, b)
Out[61]: array([-5.3072257 , 1.51327574, 3.74607611], dtype=float32)
In[63]: dot(a, _)
Out[63]: matrix([[ 0.56506193, 0.99419105, 0.70462948]], dtype=float32)
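The solve-and-verify pattern can be reproduced with fixed numbers. The sketch below replaces the random matrix with a small hand-picked system so that the result is reproducible; the system itself and the `np` alias are assumptions made for the example.

```python
import numpy as np

# A fixed system instead of random numbers: 3x + y = 9, x + 2y = 8
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(A, b)

# Multiplying back should reproduce b up to rounding error
check = np.allclose(np.dot(A, x), b)
print(x, check)
```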
Compute the eigenvalues and eigenvectors:
In[64]: linalg.eig(a)
Out[64]: (array([ 1.78036737, -0.08517434, -0.13511421], dtype=float32),
 matrix([[-0.5075314 , -0.82206506, 0.77804375],
        [-0.56222379, 0.4528676 , -0.57155234],
        [-0.65292901, 0.34513769, -0.26072171]], dtype=float32))
Determinant:
In[81]: linalg.det(a)
Out[81]: 0.020488938
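The results of eig and det can be cross-checked against their definitions. This sketch uses a small symmetric matrix chosen for the example (an assumption, as is the `np` alias) rather than the random matrix above.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])

values, vectors = np.linalg.eig(A)

# Each column of `vectors` is an eigenvector v with A v = lambda v
for k in range(len(values)):
    v = vectors[:, k]
    print(np.allclose(np.dot(A, v), values[k] * v))

# The determinant equals the product of the eigenvalues
det = np.linalg.det(A)
print(np.isclose(det, np.prod(values)))
```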