Using Python for data analysis: NumPy basics, arrays and vectorized computation
- ndarray, a fast, space-efficient multidimensional array with vectorized arithmetic and sophisticated broadcasting capabilities
- Standard mathematical functions for fast operations on entire arrays of data without writing for loops
- Tools for reading and writing array data on disk, and tools for working with memory-mapped files
- Linear algebra, random number generation, and Fourier transform functions
- Tools for integrating code written in languages such as C and C++
ndarray: a multidimensional array object
1. Creating an ndarray
# One-dimensional
In [5]: data = [1,2,3]
In [6]: import numpy as np
In [7]: arr1 = np.array(data)
In [8]: arr1
Out[8]: array([1, 2, 3])
# Two-dimensional
In [11]: data2 = [[1,2,3],[4,5,6]]
In [12]: arr2 = np.array(data2)
In [13]: arr2
Out[13]:
array([[1, 2, 3],
       [4, 5, 6]])
# Inspect the array's shape and data type
In [15]: arr2.shape
Out[15]: (2, 3)
In [16]: arr2.dtype
Out[16]: dtype('int32')
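Array creation functions
- array(): builds an ndarray from a sequence such as a list
- arange(): similar to Python's built-in range(), but returns an ndarray (range() returns a list in Python 2 and a range object in Python 3)
- ones, zeros: create an array of all 1s or all 0s; the shape is passed in as a tuple, such as np.ones((2,3))
- ones_like, zeros_like: create an array of all 1s or all 0s with the same shape as the array passed in
- empty, empty_like: allocate memory for a new array without initializing its values, so the contents are arbitrary
- eye, identity: create a square identity matrix (1s on the diagonal, 0s elsewhere)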
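The creation functions listed above can be tried side by side. This is only a minimal sketch; the variable names (r, template, scratch and so on) are made up for illustration.

```python
import numpy as np

# arange works like range() but returns an ndarray
r = np.arange(5)

# ones/zeros take the shape as a tuple
ones = np.ones((2, 3))
zeros = np.zeros((2, 3))

# the *_like variants reuse the shape and dtype of an existing array
template = np.array([[1, 2, 3], [4, 5, 6]])
zeros_like = np.zeros_like(template)

# empty only allocates memory; assign values before reading them
scratch = np.empty((2, 3))
scratch[:] = 7.0

# eye and identity both build a square identity matrix
ident = np.eye(3)

print(r, ones.shape, zeros_like.dtype, ident, sep="\n")
```

Note that ones, zeros and empty default to floating-point values, while the *_like variants inherit the dtype of the array they are given.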
2. Operations between arrays and scalars
In [36]: arr2
Out[36]:
array([[1, 2, 3],
       [4, 5, 6]])
In [37]: arr3
Out[37]:
array([[11, 12, 13],
       [14, 15, 16]])
# Addition
In [38]: arr2+arr3
Out[38]:
array([[12, 14, 16],
       [18, 20, 22]])
# Multiplication
In [39]: arr2*arr3
Out[39]:
array([[11, 24, 39],
       [56, 75, 96]])
# Subtraction
In [40]: arr3-arr2
Out[40]:
array([[10, 10, 10],
       [10, 10, 10]])
# Division
In [41]: arr3/arr2
Out[41]:
array([[11.        ,  6.        ,  4.33333333],
       [ 3.5       ,  3.        ,  2.66666667]])
# Square
In [42]: arr2**2
Out[42]:
array([[ 1,  4,  9],
       [16, 25, 36]], dtype=int32)
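The session above combines two arrays of the same shape. Arithmetic with a scalar works the same way: the scalar is applied to every element. A small sketch with illustrative variable names:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

doubled = arr * 2      # every element multiplied by 2
shifted = arr + 10     # every element increased by 10
inverse = 1 / arr      # true division always produces floats

print(doubled, shifted, inverse, sep="\n")
```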
3. Indexing and slicing
Index:
arr2d[0,0] or arr2d[0][0]
arr3d[0,0,0] or arr3d[0][0][0]
Slices: use the : notation
arr2d[:2,:2]
arr3d[:2,:2]
First, note how slicing an array differs from slicing a list
A slice of an array is a view on the original array, so changes to the slice show up in the source; a slice of a list is a copy of the data
If you need a copy of a slice instead of a view on the source array, copy it explicitly with arr[5:8].copy()
# Slicing a list
>>> l1 = list(range(10))
>>> l1
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> l2 = l1[5:8]
>>> l2
[5, 6, 7]
>>> l2[0]=15
>>> l2
[15, 6, 7]
>>> l1
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# Slicing an array
In [50]: arr = np.arange(10)
In [51]: arr
Out[51]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [52]: arr_slice = arr[5:8]
In [53]: arr_slice
Out[53]: array([5, 6, 7])
In [54]: arr_slice[0]=15
In [55]: arr_slice
Out[55]: array([15,  6,  7])
In [56]: arr
Out[56]: array([ 0,  1,  2,  3,  4, 15,  6,  7,  8,  9])
# Slicing a two-dimensional array
In [95]: arr2d
Out[95]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
In [96]: arr2d[:2]
Out[96]:
array([[1, 2, 3],
       [4, 5, 6]])
Multiple slices can be passed in at once, one per axis
In [97]: arr2d[:2,:1]
Out[97]:
array([[1],
       [4]])
In [98]: arr2d[:2,:2]
Out[98]:
array([[1, 2],
       [4, 5]])
# Three-dimensional
In [83]: arr3d
Out[83]: [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
In [84]: arr3d[1]
Out[84]: [[7, 8, 9], [10, 11, 12]]
In [85]: arr3d[1][1]
Out[85]: [10, 11, 12]
In [86]: arr3d[1][1][1]
Out[86]: 11
In [87]: arr3d[1][1][2]
Out[87]: 12
Boolean indexing
# [True,False,True] selects rows 0 and 2
In [121]: arr2d[[True,False,True]]
Out[121]:
array([[1, 2, 3],
       [7, 8, 9]])
In [122]: arr2d[[True,False,True],2]
Out[122]: array([3, 9])
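In practice a Boolean index is usually produced by a comparison rather than typed out by hand. The sketch below reuses the arr2d values from the session above; the names mask, rows and clipped are chosen only for illustration.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Comparing the first column with 4 yields [True, False, True]
mask = arr2d[:, 0] != 4
rows = arr2d[mask]            # rows 0 and 2

# A Boolean mask can also select the elements to assign to
clipped = arr2d.copy()
clipped[clipped > 5] = 0      # zero out every element greater than 5

print(rows)
print(clipped)
```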
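Fancy indexing
# Same as the Boolean index above: selects rows 0 and 2
In [132]: arr2d[[0,2]]
Out[132]:
array([[1, 2, 3],
       [7, 8, 9]])
# Note the following about fancy indexing
Fancy indexing, unlike slicing, always copies the data into a new array. Also, passing several index arrays selects elements pairwise, here (0,0) and (2,2), rather than a rectangular block; to get the block, index the rows first and then the columns
In [136]: arr2d[[0,2],[0,2]]
Out[136]: array([1, 9])
In [137]: arr2d[[0,2]][:,[0,2]]
Out[137]:
array([[1, 3],
       [7, 9]])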
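Both points can be checked directly. The sketch below shows that writing to a fancy-indexed result leaves the source untouched, and uses np.ix_ as an alternative way to select the rectangular block; np.ix_ does not appear in the original notes and is brought in here only as an illustration.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

picked = arr2d[[0, 2]]        # fancy indexing returns a copy
picked[0, 0] = 100            # so this does not modify arr2d

# np.ix_ builds an index for the rows-by-columns block in one step
block = arr2d[np.ix_([0, 2], [0, 2])]

print(arr2d[0, 0])            # still 1
print(block)
```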
Array transposition and axis swapping
Transposing is a special form of reshaping that returns a view on the source data without copying anything.
In [142]: arr2d.T
Out[142]:
array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])
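That the transpose is a view can be confirmed with np.shares_memory. For arrays with more than two dimensions, swapaxes and transpose with an explicit axis order do the same kind of rearranging. A sketch, with an arr3d built here only for the example:

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
t = arr2d.T
shares = np.shares_memory(t, arr2d)   # True: no data was copied

arr3d = np.arange(24).reshape(2, 3, 4)
swapped = arr3d.swapaxes(1, 2)        # exchange axes 1 and 2 -> shape (2, 4, 3)
moved = arr3d.transpose(2, 0, 1)      # explicit axis order  -> shape (4, 2, 3)

print(shares, swapped.shape, moved.shape)
```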
4. Element-wise array functions
Unary functions, which operate on the elements of a single array
- abs: computes the absolute value of each element
- sqrt: computes the square root of each element
- square: computes the square of each element
- exp: computes e raised to the power of each element
- log/log10/log2/log1p: natural, base-10 and base-2 logarithms; log1p is log(1+x)
- sign: computes the sign of each element (1, 0 or -1)
- ceil: computes the smallest integer greater than or equal to each element
- floor: computes the largest integer less than or equal to each element
- rint: rounds each element to the nearest integer
- modf: returns the fractional and integral parts of each element as two separate arrays
- isnan: tests whether each element is NaN (not a number)
- isfinite/isinf: test whether each element is finite or infinite
- cos/sin/tan: trigonometric functions
- arccos/arccosh/arcsin: inverse trigonometric and inverse hyperbolic functions
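A few of the unary functions above, applied to a small made-up array. This is a sketch only; the variable names are illustrative.

```python
import numpy as np

x = np.array([-2.5, -0.5, 0.0, 1.5, 4.0])

signs = np.sign(x)                   # -1, 0 or 1 per element
up, down = np.ceil(x), np.floor(x)   # round toward +inf / -inf
frac, whole = np.modf(x)             # two arrays; frac + whole gives x back

# rint rounds to the nearest integer; exact halves go to the even neighbour
halves = np.rint(np.array([0.5, 1.5, 2.5]))

y = np.array([1.0, np.nan, np.inf])
nan_mask = np.isnan(y)               # True only for the NaN entry
finite_mask = np.isfinite(y)         # False for both NaN and inf

print(signs, up, down, frac, whole, halves, nan_mask, finite_mask, sep="\n")
```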
Binary functions, which operate on the elements of two arrays
- add: adds the corresponding elements of the two arrays
- subtract: subtracts the elements of the second array from those of the first
- multiply: multiplies the corresponding elements of the two arrays
- divide, floor_divide: division, and floor division, which discards the remainder
- power(A, B): raises each element of A to the power given by the corresponding element of B
- mod: computes the remainder of division
- copysign: copies the sign of each element in the second array onto the corresponding value in the first array
- >, >=, <, <=, ==, != (greater, greater_equal, less, less_equal, equal, not_equal): compare the values of corresponding elements
- logical_and/logical_or/logical_xor: element-wise logical AND, OR and XOR
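The binary functions can be illustrated in the same way. The sketch below uses two small integer arrays invented for the example; the negative values are there to show how floor_divide, mod and copysign treat signs.

```python
import numpy as np

a = np.array([7, -7, 9])
b = np.array([2, 2, -4])

total = np.add(a, b)                 # same as a + b
quotient = np.floor_divide(a, b)     # rounds toward negative infinity
remainder = np.mod(a, b)             # result takes the sign of the divisor
powers = np.power(a, 2)              # each element of a squared
signed = np.copysign(a, b)           # magnitude of a, sign of b (float result)
bigger = np.greater(a, b)            # same as a > b
both_pos = np.logical_and(a > 0, b > 0)

print(total, quotient, remainder, powers, signed, bigger, both_pos, sep="\n")
```

floor_divide and mod are consistent with each other: quotient * b + remainder reproduces a.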
5. Data processing using arrays
Vectorization makes these operations concise
Conditional (ternary) logic
In [6]: xarr = np.array([1.1,1.2,1.3,1.4,1.5])
In [7]: yarr = np.array([2.1,2.2,2.3,2.4,2.5])
In [8]: cond = np.array([True,False,True,True,False])
# zip yields (x, y, c) in the order the arrays are passed, so unpack in that order
In [9]: result = [x if c else y for x, y, c in zip(xarr,yarr,cond)]
In [10]: result
Out[10]: [1.1, 2.2, 1.3, 1.4, 2.5]
# np.where
np.where is the vectorized form of x if c else y; it is typically used to generate a new array based on the values of another array
In [11]: result2 = np.where(cond,xarr,yarr)
In [12]: result2
Out[12]: array([1.1, 2.2, 1.3, 1.4, 2.5])
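The second and third arguments of np.where do not have to be arrays; either can be a scalar. A small sketch, using an array invented for the example:

```python
import numpy as np

arr = np.array([[-1.5, 0.3], [2.0, -0.7]])

# Replace positives with 1 and everything else with -1
signs = np.where(arr > 0, 1, -1)

# Replace only the negative values with 0 and keep the rest
clipped = np.where(arr < 0, 0.0, arr)

print(signs)
print(clipped)
```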
Mathematical and statistical methods
These can be called either as instance methods, arr2d.sum(), or as top-level NumPy functions, np.sum(arr2d)
- sum: computes the sum of all the elements
- mean: computes the mean of all the elements
- std/var: compute the standard deviation and the variance
- min/max: minimum and maximum values
- argmin/argmax: indices of the minimum and maximum values
- cumsum: returns an array of the cumulative sums of the elements
- cumprod: returns an array of the cumulative products of the elements
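Most of these methods also accept an axis argument, which restricts the computation to columns (axis=0) or rows (axis=1). A sketch using the same 3 by 3 values as arr2d in the earlier sessions:

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

total = arr2d.sum()                    # over all elements
col_sums = arr2d.sum(axis=0)           # one sum per column
row_means = arr2d.mean(axis=1)         # one mean per row
largest_at = arr2d.argmax()            # index into the flattened array
running = arr2d.cumsum()               # cumulative sum, flattened
running_rows = arr2d.cumprod(axis=1)   # cumulative product along each row

print(total, col_sums, row_means, largest_at, running, running_rows, sep="\n")
```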
Methods for Boolean arrays
# True is counted as 1
In [24]: (arr2d<4).sum()
Out[24]: 3
In [25]: cond
Out[25]: array([ True, False,  True,  True, False])
In [26]: cond.any()
Out[26]: True
In [27]: cond.all()
Out[27]: False
Sorting
- np.sort(): returns a sorted copy of the array
- arr2d.sort(): sorts the source array in place
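The difference between the two is easy to see on a small array. A minimal sketch with made-up values:

```python
import numpy as np

arr = np.array([3, 1, 2])
sorted_copy = np.sort(arr)     # new array; arr keeps its original order
before = arr.copy()

arr.sort()                     # sorts arr itself and returns None

table = np.array([[3, 1, 2], [9, 7, 8]])
table.sort(axis=1)             # sort within each row

print(before, sorted_copy, arr, table, sep="\n")
```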
5. File input and output with arrays
Saving arrays to disk in binary format
- np.save()
- np.load()
Reading and writing text files
- np.loadtxt()
- np.savetxt()
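A round trip through both formats might look like the following sketch. It uses np.save and np.load for the binary .npy format and np.savetxt and np.loadtxt for text; the file names and the temporary directory are assumptions made for the example.

```python
import os
import tempfile

import numpy as np

arr = np.arange(6).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    # Binary format: shape and dtype are stored along with the data
    bin_path = os.path.join(tmp, "arr.npy")
    np.save(bin_path, arr)
    from_binary = np.load(bin_path)

    # Text format: one row per line, values separated by the delimiter
    txt_path = os.path.join(tmp, "arr.txt")
    np.savetxt(txt_path, arr, delimiter=",")
    from_text = np.loadtxt(txt_path, delimiter=",")

print(from_binary.dtype, from_text.dtype)
```

The binary round trip preserves the integer dtype, while loadtxt returns floats unless a dtype is passed.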
6. Linear algebra (functions not found at the top level of NumPy are in numpy.linalg)
- Note: the transpose is arr.T
- np.dot(arr1, arr2): computes the matrix product of two arrays
- np.diag: returns the diagonal elements of a square matrix as a one-dimensional array, or converts a one-dimensional array into a square matrix with those values on the diagonal
- trace(): computes the sum of the diagonal elements
- det: computes the determinant of a square matrix
- eig: computes the eigenvalues and eigenvectors of a square matrix
- inv: computes the inverse of a square matrix
- pinv: computes the pseudo-inverse of a matrix
- qr: computes the QR decomposition
- svd: computes the singular value decomposition
- solve: solves the linear system Ax=b for x
- lstsq: computes the least-squares solution to Ax=b
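Several of these functions can be exercised on a small system. The matrices below are invented for illustration, and the sketch is not meant as a complete tour of numpy.linalg.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = np.linalg.solve(A, b)        # solves A @ x = b
A_inv = np.linalg.inv(A)         # A @ A_inv is the identity
det = np.linalg.det(A)           # 2*3 - 1*1
q, r = np.linalg.qr(A)           # A = q @ r

# Least squares: fit y = c0 + c1 * t through three points
M = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])
coef, residuals, rank, sv = np.linalg.lstsq(M, y, rcond=None)

print(x, det, coef, sep="\n")
```

For solving Ax=b, calling solve directly is generally preferable to computing inv(A) and multiplying by b.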
7. Random number generation: numpy.random supplements Python's built-in random module
- seed: seeds the random number generator
- permutation: returns a random permutation of a sequence, or a randomly permuted range
- shuffle: randomly permutes a sequence in place
- rand: draws samples from a uniform distribution
- randint: draws random integers from a given low-to-high range
- randn: draws samples from a normal distribution with mean 0 and standard deviation 1
- binomial: draws samples from a binomial distribution
- normal: draws samples from a normal (Gaussian) distribution
- beta: draws samples from a beta distribution
- chisquare: draws samples from a chi-square distribution
- gamma: draws samples from a gamma distribution
- uniform: draws samples from a uniform [0, 1) distribution
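The sketch below calls several of the functions listed above. The seed value, sizes and distribution parameters are arbitrary choices for the example. Fixing the seed makes the draws repeatable, and the actual numbers depend on the NumPy version, so none are shown.

```python
import numpy as np

np.random.seed(42)                        # fix the seed so runs are repeatable
first = np.random.rand(3)                 # uniform samples in [0, 1)

np.random.seed(42)
second = np.random.rand(3)                # same seed, same samples

dice = np.random.randint(1, 7, size=10)   # integers 1..6 (upper bound excluded)
noise = np.random.randn(2, 3)             # standard normal, shape (2, 3)
heights = np.random.normal(loc=170.0, scale=10.0, size=5)
heads = np.random.binomial(n=10, p=0.5, size=4)   # successes out of 10 trials
order = np.random.permutation(5)          # shuffled copy of 0..4

deck = np.arange(5)
np.random.shuffle(deck)                   # shuffles deck in place

print(dice, order, deck, sep="\n")
```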