Data Analysis Learning Notes (III): NumPy built-in functions (universal functions, mathematical and statistical methods, sets)

Source: Internet
Author: User
Tags: abs, arithmetic, diff, natural logarithm, square root
Universal Functions

A universal function (ufunc) is a function that performs element-wise operations on the data in an ndarray.

# example arrays
import numpy as np

a = np.array([-1, 2.1, 0.2, 2.6, 9.1])  # [-1.   2.1  0.2  2.6  9.1]
b = np.arange(1, len(a) + 1)            # [1 2 3 4 5]
Unary functions

| Function | Description | Example | Result |
| --- | --- | --- | --- |
| abs, fabs | Computes the absolute value of integers, floating-point numbers, and complex numbers; for non-complex values the faster fabs can be used | np.abs(a) | [1. 2.1 0.2 2.6 9.1] |
| sqrt | Computes the square root of each element, equivalent to arr**0.5 | np.sqrt(b) | |
| square | Computes the square of each element, equivalent to arr**2 | np.square(b) | [1 4 9 16 25] |
| exp | Computes the exponential e^x of each element | | |
| log, log10, log2, log1p | Natural logarithm (base e), base-10 logarithm, base-2 logarithm, and log(1+x) | | |
| sign | Computes the sign of each element: 1 (positive), 0 (zero), -1 (negative) | np.sign(a) | [-1. 1. 1. 1. 1.] |
| ceil | Rounds up to the nearest integer | np.ceil(a) | [-1. 3. 1. 3. 10.] |
| floor | Rounds down to the nearest integer | np.floor(a) | [-1. 2. 0. 2. 9.] |
| rint | Rounds to the nearest integer, preserving the dtype | np.rint(a) | [-1. 2. 0. 3. 9.] |
| modf | Returns the fractional and integer parts of each element as two separate arrays | np.modf(a) | (array([-0., 0.1, 0.2, 0.6, 0.1]), array([-1., 2., 0., 2., 9.])) |
| nonzero | Returns the indices of all nonzero elements, as one index array per dimension (row indices and column indices for a 2-D array) | np.nonzero(a) | (array([0, 1, 2, 3, 4]),) |
| clip | Limits the elements to a given range | np.clip(a, 0, 5), equivalent to a.clip(0, 5) | [0. 2.1 0.2 2.6 5.] |
| isnan | Returns a boolean array that is True where the element is NaN | | |
| isfinite, isinf | Return a boolean array that is True where the element is finite (isfinite) or infinite (isinf) | | |
| cos, cosh, sin, sinh, tan, tanh | Ordinary and hyperbolic trigonometric functions | | |
| arccos, arccosh, arcsin, arcsinh, arctan, arctanh | Inverse trigonometric functions | | |
| logical_not | Computes the truth value of not x for each element; for a boolean array this is equivalent to ~arr | | |
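
A few of the unary functions above can be tried out as follows. This is only a small sketch on the example array a; the variable names (frac, whole, mask, and so on) are chosen here for illustration.

```python
import numpy as np

a = np.array([-1, 2.1, 0.2, 2.6, 9.1])

# modf splits each value into (fractional part, integer part)
frac, whole = np.modf(a)

# rint rounds to the nearest integer but keeps the float dtype
rounded = np.rint(a)

# clip limits values to the closed interval [0, 5]
clipped = np.clip(a, 0, 5)

# logical_not on a boolean array matches the ~ operator
mask = a > 1
negated = np.logical_not(mask)

print(whole)    # [-1.  2.  0.  2.  9.]
print(rounded)  # [-1.  2.  0.  3.  9.]
print(negated)  # [ True False  True False False]
```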
Binary functions

| Function | Description | Example | Result |
| --- | --- | --- | --- |
| add | Element-wise addition | np.add(a, b), equivalent to a + b | [0. 4.1 3.2 6.6 14.1] |
| subtract | Subtracts the second array from the first | np.subtract(a, b), equivalent to a - b | [-2. 0.1 -2.8 -1.4 4.1] |
| multiply | Element-wise multiplication | np.multiply(a, b), equivalent to a * b | [-1. 4.2 0.6 10.4 45.5] |
| divide, floor_divide | Division, or division followed by rounding down | np.divide(a, b), equivalent to a / b; np.floor_divide(a, b), equivalent to np.floor(a / b) | |
| power | Raises each element of a to the power of the matching element of b, like pow(a, b) | np.power(a, b) | |
| maximum, fmax | Element-wise maximum; fmax ignores NaN | np.maximum(a, b), np.fmax(a, b) | |
| minimum, fmin | Element-wise minimum; fmin ignores NaN | | |
| mod | Element-wise remainder (modulo) | np.mod(a, b) | |
| copysign | Copies the sign of each value in the second array onto the value in the first array | np.copysign(b, a) | [-1. 2. 3. 4. 5.] |
| greater, greater_equal, less, less_equal, equal, not_equal | Element-wise comparison, equivalent to >, >=, <, <=, ==, != | | |
| logical_and, logical_or, logical_xor | Element-wise logical operations; for boolean arrays equivalent to the infix operators &, \|, ^ | | |
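
The differences between some of these binary functions are easier to see on concrete data. The sketch below reuses a and b; the arrays x and y with NaN values are made up here purely to show how maximum and fmax differ.

```python
import numpy as np

a = np.array([-1, 2.1, 0.2, 2.6, 9.1])
b = np.arange(1, len(a) + 1)

# for these inputs floor_divide(a, b) gives the same values as floor(a / b)
q1 = np.floor_divide(a, b)
q2 = np.floor(a / b)

# maximum propagates NaN, fmax ignores it when the other value is a number
x = np.array([1.0, np.nan, 3.0])
y = np.array([2.0, 2.0, np.nan])
with_nan = np.maximum(x, y)    # NaN at positions 1 and 2
without_nan = np.fmax(x, y)    # [2. 2. 3.]

# comparison ufuncs return boolean arrays, same as the operators
same = np.array_equal(np.greater(a, b), a > b)
```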
Mathematical and statistical methods

| Function | Description | Example | Result |
| --- | --- | --- | --- |
| sum | Sums the elements of the whole array or along an axis | np.sum(a) or a.sum() | 13.0 |
| mean | Arithmetic mean; the mean of a zero-length array is NaN | np.mean(a) or a.mean() | 2.6 |
| average | Weighted average; when all weights are equal it is the same as the arithmetic mean | np.average(a) | 2.6 |
| median | Median: the middle value of the sorted data, or the average of the two middle values when the length is even | np.median(a) | 2.1 |
| std, var | Standard deviation and variance; the degrees of freedom are adjustable through ddof (the default denominator is n) | a.std(), a.var() | 3.4991427521608776, 12.244 |
| min, max | Minimum and maximum value | | |
| argmin, argmax | Indices of the minimum and maximum elements | a.argmin(), a.argmax() | 0, 4 |
| diff | np.diff(a, n=1, axis=-1): difference between each element and the previous one; n is the number of times the differencing is applied, and axis selects the direction for multidimensional arrays | np.diff(a) | [3.1 -1.9 2.4 6.5] |
| cumsum | Cumulative sum of the elements (returns an array) | a.cumsum() | [-1. 1.1 1.3 3.9 13.] |
| cumprod | Cumulative product of the elements (returns an array) | a.cumprod() | [-1. -2.1 -0.42 -1.092 -9.9372] |
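
As a rough illustration of the adjustable degrees of freedom and of the n parameter of diff, the following sketch compares a few results on the example array a. The variable names are chosen here for illustration.

```python
import numpy as np

a = np.array([-1, 2.1, 0.2, 2.6, 9.1])

# population variance (divide by n) versus sample variance (divide by n - 1)
var_pop = a.var()            # ddof=0 is the default
var_sample = a.var(ddof=1)

# diff with n=2 applies the differencing step twice
d1 = np.diff(a)
d2 = np.diff(a, n=2)

# the last element of cumsum equals the plain sum
total = a.cumsum()[-1]
```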

Note:
The examples above use a one-dimensional array. The methods are called the same way on a two-dimensional array, but the axis parameter selects the direction: axis=1 works along each row (horizontally) and axis=0 works down each column (vertically).

arr = np.arange(24).reshape(4, 6)
# [[ 0  1  2  3  4  5]
#  [ 6  7  8  9 10 11]
#  [12 13 14 15 16 17]
#  [18 19 20 21 22 23]]

# sum
arr.sum()           # 276, sum of all elements
arr.sum(axis=0)     # [36 40 44 48 52 56], sum of each column
# arithmetic mean
arr.mean()          # 11.5, mean of all elements
arr.mean(axis=1)    # [ 2.5  8.5 14.5 20.5], mean of each row
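
One way to keep the two directions apart is to look at the shape of the result: the axis that is passed is the one that disappears. The sketch below shows this on the same 4x6 array; keepdims is a standard NumPy parameter, and the centering step is added here only as an illustration.

```python
import numpy as np

arr = np.arange(24).reshape(4, 6)

col_sums = arr.sum(axis=0)   # collapses the 4 rows: one value per column
row_sums = arr.sum(axis=1)   # collapses the 6 columns: one value per row

print(col_sums.shape)  # (6,)
print(row_sums.shape)  # (4,)
print(row_sums)        # [ 15  51  87 123]

# keepdims=True keeps the reduced axis with length 1, which helps broadcasting
row_means = arr.mean(axis=1, keepdims=True)
centered = arr - row_means   # each row now has mean 0
```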

About the weighted average: the average function

arr = np.arange(10).reshape(2, 5)
# [[0 1 2 3 4]
#  [5 6 7 8 9]]
arr.mean()                # 4.5, arithmetic mean
np.average(arr)           # 4.5, without weights this is the arithmetic mean
np.average(arr, axis=1)   # [2. 7.], along the given axis
np.average(arr, weights=np.arange(arr.size).reshape(2, 5))  # 6.333333333333333, with weights passed in
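
The weighted result can be checked by hand: a weighted average is sum(w * x) / sum(w). A minimal sketch, using the same array and weights as in the example:

```python
import numpy as np

arr = np.arange(10).reshape(2, 5)
weights = np.arange(arr.size).reshape(2, 5)

# weighted average = sum(w * x) / sum(w)
manual = (arr * weights).sum() / weights.sum()
builtin = np.average(arr, weights=weights)

print(manual)   # 6.333333333333333
```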
Sets
# example arrays
s0 = np.array([1, 2, 3, 2, 1, 4, 5, 2])    # [1 2 3 2 1 4 5 2]
s1 = np.arange(0, 30, 2)  # [ 0  2  4  6  8 10 12 14 16 18 20 22 24 26 28]
s2 = np.arange(0, 30, 3)  # [ 0  3  6  9 12 15 18 21 24 27]

| Function | Description | Example | Result |
| --- | --- | --- | --- |
| unique(x) | Finds the unique elements of x and returns them sorted | np.unique(s0) | [1 2 3 4 5] |
| intersect1d(x, y) | Intersection, returned sorted | np.intersect1d(s1, s2) | [0 6 12 18 24] |
| union1d(x, y) | Union, returned sorted | np.union1d(s1, s2) | [0 2 3 4 6 8 9 10 12 14 15 16 18 20 21 22 24 26 27 28] |
| setdiff1d(x, y) | Set difference: the elements that are in x but not in y | np.setdiff1d(s1, s2) | [2 4 8 10 14 16 20 22 26 28] |
| setxor1d(x, y) | Symmetric difference: the elements that are in exactly one of x and y | np.setxor1d(s1, s2) | [2 3 4 8 9 10 14 15 16 20 21 22 26 27 28] |
| in1d(x, y) | Returns a boolean array telling whether each element of x is contained in y | np.in1d(s2, s1) | [True False True False True False True False True False] |
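
Two related calls are sketched below as a supplement: np.unique with return_counts=True, and np.isin, which gives the same membership mask as in1d for one-dimensional input and is the function recommended by newer NumPy releases. The variable names are chosen here for illustration.

```python
import numpy as np

s0 = np.array([1, 2, 3, 2, 1, 4, 5, 2])
s1 = np.arange(0, 30, 2)
s2 = np.arange(0, 30, 3)

# return_counts=True also reports how often each unique value occurs
values, counts = np.unique(s0, return_counts=True)
print(values)  # [1 2 3 4 5]
print(counts)  # [2 3 1 1 1]

# membership mask: which elements of s2 also appear in s1
mask = np.isin(s2, s1)
common = s2[mask]
```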

Note: the two arrays may differ in both number of elements and shape, because the set functions flatten their inputs.
s1 and s2 in the example above are both one-dimensional but have different lengths. To verify that the set operations do not depend on shape, we change the shapes of s1 and s2.

s1 = s1.reshape(3, 5)
# [[ 0  2  4  6  8]
#  [10 12 14 16 18]
#  [20 22 24 26 28]]
s2 = s2.reshape(2, 5)
# [[ 0  3  6  9 12]
#  [15 18 21 24 27]]
np.intersect1d(s1, s2)   # [ 0  6 12 18 24]
np.union1d(s1, s2)       # [ 0  2  3  4  6  8  9 10 12 14 15 16 18 20 21 22 24 26 27 28]
np.setdiff1d(s2, s1)     # [ 3  9 15 21 27]
Additional functions (where, sort, any, all)

where

The where function is a vectorized ternary operator, where(condition, x, y).
For each element it does work similar to the following:

if condition:
    x
else:
    y

Example 1: given two arrays xarr and yarr, select values from them according to cond

xarr = np.array(np.arange(1.1, 1.6, 0.1))
yarr = np.array(np.arange(2.1, 2.6, 0.1))
cond = np.array([True, False, True, True, False])

In plain Python:

result = [x if c else y for x, y, c in zip(xarr, yarr, cond)]
output:
[1.1, 2.2, 1.3000000000000003, 1.4000000000000004, 2.5000000000000004]

This is inconvenient for large or multidimensional arrays, and the printed values expose floating-point noise (for example 1.3000000000000003).

Using the where function in NumPy:

result = np.where(cond, xarr, yarr)
output:
[1.1 2.2 1.3 1.4 2.5]
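
The two approaches can be compared directly. In the sketch below the arrays are written out literally (rather than built with np.arange) so the values are exact; the labels line is an extra illustration showing that x and y may also be scalars.

```python
import numpy as np

xarr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
yarr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = np.array([True, False, True, True, False])

# element-wise selection done in a Python loop
looped = np.array([x if c else y for x, y, c in zip(xarr, yarr, cond)])

# the same selection, vectorized
vectorized = np.where(cond, xarr, yarr)

# x and y may also be scalars; they are broadcast against cond
labels = np.where(cond, "x", "y")
print(labels.tolist())  # ['x', 'y', 'x', 'x', 'y']
```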

Example 2: set the elements of arr that are less than 0 to 0 and keep the rest unchanged

arr = np.random.randn(4, 4)
output:
[[ 0.40336609 -1.42094364 -1.1257582   0.2787659 ]
 [-0.64618146 -0.56508989  0.20527747  1.8542685 ]
 [-0.39792887  0.94738928 -0.68713023  0.60328758]
 [-0.94495984 -1.47217366  0.03280616 -0.13120201]]
arr = np.where(arr > 0, arr, 0)
output:
[[0.40336609 0.         0.         0.2787659 ]
 [0.         0.         0.20527747 1.8542685 ]
 [0.         0.94738928 0.         0.60328758]
 [0.         0.         0.03280616 0.        ]]
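
For this particular task there are a few equivalent spellings. The sketch below compares them on a seeded random array (the seed and the use of np.random.default_rng are choices made here so that the run is repeatable, not part of the original example).

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded only so the run is repeatable
arr = rng.standard_normal((4, 4))

by_where = np.where(arr > 0, arr, 0)
by_maximum = np.maximum(arr, 0)     # element-wise max against 0
by_clip = arr.clip(min=0)           # clip with only a lower bound

in_place = arr.copy()
in_place[in_place < 0] = 0          # boolean-mask assignment modifies the copy
```

np.where builds a new array, while the boolean-mask assignment changes an existing one, so the choice mostly depends on whether the original data is still needed.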

Example 3: nested conditions

cond1 = np.array([True, False, True, True, False])
cond2 = np.array([True, True, True, False, False])
result = []

In plain Python:

for i in range(len(cond1)):
    if cond1[i] and cond2[i]:
        result.append(0)
    elif cond1[i]:
        result.append(1)
    elif cond2[i]:
        result.append(2)
    else:
        result.append(3)
print(result)           # [0, 2, 0, 1, 3]

Using the where function in NumPy:

result = np.where(cond1 & cond2, 0,
             np.where(cond1, 1,
                  np.where(cond2, 2, 3)))
list(result)     # [0, 2, 0, 1, 3]
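
When the nesting gets deeper, np.select may read more easily. This is offered as an alternative sketch, not as the method used in these notes: np.select checks the conditions in order and takes the choice belonging to the first one that is True.

```python
import numpy as np

cond1 = np.array([True, False, True, True, False])
cond2 = np.array([True, True, True, False, False])

nested = np.where(cond1 & cond2, 0,
                  np.where(cond1, 1,
                           np.where(cond2, 2, 3)))

# conditions are tested in list order; default covers "none matched"
flat = np.select([cond1 & cond2, cond1, cond2], [0, 1, 2], default=3)

print(flat.tolist())  # [0, 2, 0, 1, 3]
```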

Note: where can also be called with only the condition, in which case it returns the indices of the elements for which the condition is True

arr = np.random.randn(10)   # 10 values assumed here; the indices depend on the random draw
np.where(arr > 0)      # (array([1, 2, 3, 6, 9]),)

For a multidimensional array, the result is a tuple of index arrays, one per dimension

cond1 = np.array([True, False, True, True, False])
cond2 = np.array([True, True, True, False, False])
arr = np.array([cond1, cond2])
np.where(arr)
# (array([0, 0, 0, 1, 1, 1]), array([0, 2, 3, 0, 1, 2]))
# that is, the positions [(0,0), (0,2), (0,3), (1,0), (1,1), (1,2)]
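
The index tuple can be turned into explicit coordinates, or used directly for indexing. A small sketch on the same array; np.argwhere is a standard NumPy function that returns the coordinates as rows of an (n, 2) array.

```python
import numpy as np

cond1 = np.array([True, False, True, True, False])
cond2 = np.array([True, True, True, False, False])
arr = np.array([cond1, cond2])

rows, cols = np.where(arr)

# pair the two index arrays into (row, column) coordinates
coords = [(int(r), int(c)) for r, c in zip(rows, cols)]
print(coords)  # [(0, 0), (0, 2), (0, 3), (1, 0), (1, 1), (1, 2)]

# np.argwhere returns the same coordinates as an (n, 2) array
pairs = np.argwhere(arr)

# the index arrays can be used directly for indexing
picked = arr[rows, cols]     # every selected element is True
```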
sort
# multidimensional array: the direction can be specified
arr = np.random.randn(20).reshape(4, 5)
# [[-0.94603557 -0.18393318  0.11450866  0.40325255  0.45881851]
#  [ 1.17704035 -0.41401001  0.75339636 -0.43745415  2.7929479 ]
#  [-0.28784153 -1.48745643 -0.07142102 -0.5482369  -0.22610164]
#  [ 1.35561729 -1.08766432  0.83278514 -1.32299757  0.04410116]]
np.sort(arr, axis=0)     # sort each column vertically (the default sorts each row horizontally)
# [[-0.94603557 -1.48745643 -0.07142102 -1.32299757 -0.22610164]
#  [-0.28784153 -1.08766432  0.11450866 -0.5482369   0.04410116]
#  [ 1.17704035 -0.41401001  0.75339636 -0.43745415  0.45881851]
#  [ 1.35561729 -0.18393318  0.83278514  0.40325255  2.7929479 ]]

# one-dimensional array
arr = np.array([2, 6, 4, 2, 1, 4])
arr.sort()      # sorts the original array in place; np.sort() instead returns a new sorted array and leaves the original unchanged
print(arr)      # [1 2 2 4 4 6]
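
The difference between the two ways of sorting can be seen side by side. The sketch below also shows np.argsort, which is not covered in these notes but is closely related: it returns the indices that would sort the array.

```python
import numpy as np

arr = np.array([2, 6, 4, 2, 1, 4])

copy_sorted = np.sort(arr)               # returns a new array, arr is unchanged
order = np.argsort(arr, kind="stable")   # indices that would sort arr

print(arr)          # [2 6 4 2 1 4]
print(copy_sorted)  # [1 2 2 4 4 6]
print(order)        # [4 0 3 2 5 1]

arr.sort()          # sorts in place and returns None
print(arr)          # [1 2 2 4 4 6]
```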

Example: find the 25th percentile (first quartile) of a set of data.

# generate a set of data
arr = np.random.randn(20).reshape(4, 5)
# 1. first convert it to a one-dimensional array
arr = arr.flatten()
# 2. sort it
arr.sort()
# 3. take the element at the 25% position
value = arr[int(0.25 * len(arr))]
print(arr)
# [-1.8819284  -1.84223613 -1.55037549 -1.19713841 -0.91661269 -0.69222229
#  -0.6796624  -0.65882803 -0.55325753 -0.34502426 -0.1197655   0.36925446
#   0.5343373   0.62780224  0.74335279  0.82012463  1.00546263  1.08559715
#   1.29212188  1.47629451]
print(value)    # -0.69222229
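
NumPy also provides np.percentile for this. Its result can differ slightly from the index-based pick, because by default it interpolates linearly between neighbouring values. A sketch on seeded random data (the seed and the generator are choices made here for repeatability):

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only for a repeatable run
arr = rng.standard_normal(20)

# index-based pick: sort, then take the element at the 25% position
ordered = np.sort(arr)
by_index = ordered[int(0.25 * len(ordered))]

# np.percentile interpolates linearly between neighbouring values by default
by_percentile = np.percentile(arr, 25)

print(by_index, by_percentile)
```

With 20 values the interpolated position is 0.25 * 19 = 4.75, so the result lies between ordered[4] and ordered[5], while the index-based pick is exactly ordered[5].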

all, any

all: returns True only if every element is True, otherwise False
any: returns True if at least one element is True, otherwise False

arr = np.array([True, False, True, True, False])
arr.all()    # False
arr.any()    # True
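
Like the statistical methods, all and any accept an axis argument. A short sketch on a small two-dimensional array made up here for illustration:

```python
import numpy as np

flags = np.array([[True, False, True],
                  [True, True,  True]])

print(flags.all())         # False: not every element is True
print(flags.any())         # True: at least one element is True
print(flags.all(axis=1))   # [False  True]: checked row by row
print(flags.any(axis=0))   # [ True  True  True]: checked column by column

# non-boolean arrays work too: any nonzero value counts as True
has_nonzero = np.array([0, 0, 3]).any()
```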

Statistical methods for Boolean arrays

arr = np.random.randn(100)   # array size assumed here for illustration
(arr > 0).sum()              # number of positive values
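
This works because True counts as 1 and False as 0 in arithmetic. The same idea gives the share of positive values through mean. A sketch on seeded random data (the seed, generator, and array size are choices made here for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only for a repeatable run
arr = rng.standard_normal(100)

positive = arr > 0
count = positive.sum()                    # True counts as 1, False as 0
fraction = positive.mean()                # share of positive values
same_count = np.count_nonzero(positive)   # equivalent count
```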
