NumPy: using arrays for data processing

Source: Internet
Author: User
Tags: python, list

Expressing conditional logic as array operations

numpy.where() is the vectorized form of the ternary expression x if condition else y.

>>> import numpy as np
>>> xarr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
>>> yarr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
>>> condi = np.array([True, False, True, True, False])

Given the three arrays above, suppose we want a new array that takes the value from xarr wherever condi is True and the value from yarr otherwise. With an ordinary list comprehension this looks as follows:

>>> result = [(x if c else y) for x, y, c in zip(xarr, yarr, condi)]

This approach has two drawbacks. First, it is slow on large amounts of data, because the loop runs element by element in pure Python. Second, it cannot be applied to a multidimensional array.

Using np.where is much simpler:

>>> result = np.where(condi, xarr, yarr)
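To check that the two approaches agree, here is a minimal sketch that reuses the three arrays from above. The 2-D arrays at the end (x2d, y2d, mask2d) are made-up data, added only to show that the same call works on multidimensional input.

```python
import numpy as np

xarr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
yarr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
condi = np.array([True, False, True, True, False])

# Pure-Python version: picks one element at a time.
by_loop = [(x if c else y) for x, y, c in zip(xarr, yarr, condi)]

# Vectorized version: a single call, no Python-level loop.
by_where = np.where(condi, xarr, yarr)
print(by_where)  # [1.1 2.2 1.3 1.4 2.5]

# The same call works unchanged on 2-D input (hypothetical sample data).
x2d = np.zeros((2, 3))
y2d = np.ones((2, 3))
mask2d = np.array([[True, False, True],
                   [False, False, True]])
picked2d = np.where(mask2d, x2d, y2d)
```

The result has the same shape as the condition array, so no reshaping is needed afterwards.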

The second and third arguments of numpy.where do not have to be arrays; they can also be scalars.

Suppose we want to build a new array from condi that holds 1 wherever condi is True and 0 otherwise:

>>> res = np.where(condi, 1, 0)
>>> res
array([1, 0, 1, 1, 0])

In a multidimensional array, replace every positive number with "+" and every negative number with "-":

>>> arr = np.random.randn(4, 4)
>>> arr
array([[-0.33641281, -0.56924078,  0.25727917, -0.35087934],
       [-0.00734107, -0.47985579, -1.35289703, -1.31366566],
       [-0.71342875, -0.21957414, -1.25596815,  0.0859283 ],
       [-0.93246019, -0.61227975, -0.87573005,  1.4124276 ]])
>>> np.where(arr > 0, "+", "-")
array([['-', '-', '+', '-'],
       ['-', '-', '-', '-'],
       ['-', '-', '-', '+'],
       ['-', '-', '-', '+']], dtype='<U1')
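A scalar and an array can also be mixed in the same call. The sketch below uses a small hand-written array (hypothetical data, not the random one above) so that the results are reproducible.

```python
import numpy as np

arr = np.array([[-1.5, 0.3],
                [2.0, -0.7]])

# Both choices are scalars: 1 for positive entries, -1 otherwise.
signs = np.where(arr > 0, 1, -1)

# One scalar and one array: replace positive entries with 2.0,
# keep the original value everywhere else.
capped = np.where(arr > 0, 2.0, arr)

print(signs.tolist())   # [[-1, 1], [1, -1]]
print(capped.tolist())  # [[-1.5, 2.0], [2.0, -0.7]]
```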

where can also express logic with several conditions by nesting calls:

>>> np.where(cond1 & cond2, 0, np.where(cond1, 1, np.where(cond2, 2, 3)))

# equivalent to
li = []
for x, y in zip(cond1, cond2):
    if x and y:
        li.append(0)
    elif x:
        li.append(1)
    elif y:
        li.append(2)
    else:
        li.append(3)

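The snippet above does not define cond1 and cond2, so here is a runnable sketch with two made-up Boolean arrays that cover all four combinations. It also shows np.select, which is not mentioned in the original text but is a flatter way to write the same thing when the nesting gets deep.

```python
import numpy as np

# Hypothetical conditions covering every combination of True/False.
cond1 = np.array([True, True, False, False])
cond2 = np.array([True, False, True, False])

# Nested np.where, as in the text.
nested = np.where(cond1 & cond2, 0,
                  np.where(cond1, 1,
                           np.where(cond2, 2, 3)))

# The equivalent explicit loop.
looped = []
for x, y in zip(cond1, cond2):
    if x and y:
        looped.append(0)
    elif x:
        looped.append(1)
    elif y:
        looped.append(2)
    else:
        looped.append(3)

# np.select uses the first condition that is True for each element.
selected = np.select([cond1 & cond2, cond1, cond2], [0, 1, 2], default=3)

print(nested.tolist())  # [0, 1, 2, 3]
```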
Mathematical and statistical methods

sum, mean, and std can be called either as array methods or as top-level NumPy functions.

>>> arr = np.arange(15).reshape(3, 5)
>>> arr
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
>>> # called as array methods
>>> arr.sum()
105
>>> arr.mean()
7.0
>>> # called as a top-level function
>>> np.mean(arr)
7.0
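As a quick sanity check, the sketch below calls each statistic both ways on the same 3x5 array and keeps the results side by side. The variable names are only for illustration.

```python
import numpy as np

arr = np.arange(15).reshape(3, 5)

# Method form and function form compute the same statistic.
total_m, total_f = arr.sum(), np.sum(arr)
mean_m, mean_f = arr.mean(), np.mean(arr)
std_m, std_f = arr.std(), np.std(arr)

print(total_m, mean_m)  # 105 7.0
```

Which form to use is mostly a matter of taste; the function form also accepts plain Python lists.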

Functions such as mean and sum accept an axis argument that computes the statistic along that axis; the result is an array with one dimension fewer.

>>> arr = np.arange(60).reshape(3, 4, 5)
>>> arr
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]],

       [[20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34],
        [35, 36, 37, 38, 39]],

       [[40, 41, 42, 43, 44],
        [45, 46, 47, 48, 49],
        [50, 51, 52, 53, 54],
        [55, 56, 57, 58, 59]]])
>>> arr.sum(axis=1)  # axis is an index into arr.shape; if shape is unfamiliar, see the NumPy basics post
array([[ 30,  34,  38,  42,  46],
       [110, 114, 118, 122, 126],
       [190, 194, 198, 202, 206]])

sum(axis=1) adds the elements up along the specified dimension, so that dimension disappears from the result.
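One way to see which dimension disappears is to print the result shape for every axis. This is a small sketch on the same 3x4x5 array; the list name shapes is only for illustration.

```python
import numpy as np

arr = np.arange(60).reshape(3, 4, 5)

# Summing along an axis removes that entry from the shape.
shapes = []
for axis in range(arr.ndim):
    reduced = arr.sum(axis=axis)
    shapes.append(reduced.shape)
    print(axis, reduced.shape)
# 0 (4, 5)
# 1 (3, 5)
# 2 (3, 4)

# First row of the axis=1 result: 0+5+10+15, 1+6+11+16, ...
first_row = arr.sum(axis=1)[0]
```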

Other methods, such as cumsum and cumprod, do not aggregate; instead they produce an array of the intermediate results:

>>> arr = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
>>> arr
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> arr.cumsum()
array([ 0,  1,  3,  6, 10, 15, 21, 28, 36], dtype=int32)
>>> arr.cumsum(0)
array([[ 0,  1,  2],
       [ 3,  5,  7],
       [ 9, 12, 15]], dtype=int32)
>>> arr.cumsum(1)
array([[ 0,  1,  3],
       [ 3,  7, 12],
       [ 6, 13, 21]], dtype=int32)
>>> arr.cumprod(1)
array([[  0,   0,   0],
       [  3,  12,  60],
       [  6,  42, 336]], dtype=int32)

They can also be used as top-level functions:

>>> np.cumsum(arr)
array([ 0,  1,  3,  6, 10, 15, 21, 28, 36], dtype=int32)
>>> np.cumsum(arr, axis=0)
array([[ 0,  1,  2],
       [ 3,  5,  7],
       [ 9, 12, 15]], dtype=int32)

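A useful way to read these results: the last entry of a cumulative sum is the ordinary sum. The sketch below checks this on the same 3x3 array; the variable names are illustrative.

```python
import numpy as np

arr = np.array([[0, 1, 2],
                [3, 4, 5],
                [6, 7, 8]])

# Without an axis, the array is flattened first.
flat_running = arr.cumsum()

# axis=0 accumulates down the rows, axis=1 across the columns.
down_rows = arr.cumsum(axis=0)
across_cols = arr.cumprod(axis=1)

# The final running total equals the plain sum.
print(flat_running[-1], arr.sum())  # 36 36

# The last row of the axis=0 result equals the per-column sums.
print(down_rows[-1].tolist())       # [9, 12, 15]
```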
Methods for Boolean arrays: sum, any, and all

>>> bools = np.array([True, False, True, True, False])
>>> bools.sum()
3
>>> bools.any()
True
>>> bools.all()
False
>>> # as top-level functions
>>> np.all(bools)
False
>>> np.sum(bools)
3

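sum works on Boolean arrays because True counts as 1 and False as 0, which makes it a handy way to count how many elements satisfy a condition. A minimal sketch with made-up values:

```python
import numpy as np

values = np.array([0.5, -1.2, 3.1, -0.4, 2.2])  # hypothetical sample data

positive = values > 0                 # Boolean array
n_positive = positive.sum()           # how many are positive
any_negative = (values < 0).any()     # is at least one negative?
all_positive = positive.all()         # are all of them positive?

print(n_positive, any_negative, all_positive)  # 3 True False
```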
Sorting

Sorting works much as it does for a Python list: the sort method sorts the array in place.

>>> arr = np.random.randn(8)
>>> arr
array([-2.97429771,  0.37645009, -0.04291609, -0.61994895, -0.26251303,
       -1.1557209 , -0.19910847, -0.11393288])
>>> arr.sort()
>>> arr
array([-2.97429771, -1.1557209 , -0.61994895, -0.26251303, -0.19910847,
       -0.11393288, -0.04291609,  0.37645009])

For multidimensional arrays, you can pass an axis argument to sort along any one axis:

>>> arr = np.random.randn(4, 5)
>>> arr
array([[-0.78510617, -0.02370449, -0.12615757, -0.15039283, -1.00503264],
       [ 0.24344011, -1.91231612,  0.80572501, -0.6740432 , -1.62471378],
       [-0.09096377,  1.79134715, -0.28566318, -0.8119145 , -0.20454602],
       [ 0.02648784,  0.57795444, -0.53447708, -0.74497177, -0.04684859]])
>>> arr.sort(1)
>>> arr
array([[-1.00503264, -0.78510617, -0.15039283, -0.12615757, -0.02370449],
       [-1.91231612, -1.62471378, -0.6740432 ,  0.24344011,  0.80572501],
       [-0.8119145 , -0.28566318, -0.20454602, -0.09096377,  1.79134715],
       [-0.74497177, -0.53447708, -0.04684859,  0.02648784,  0.57795444]])
>>> arr = np.random.randn(4, 5)
>>> arr
array([[-0.99257127,  0.36384095,  1.14265096,  0.23094948,  1.42900315],
       [ 0.07606583,  1.53456921,  1.15069057, -0.78014895, -0.24934741],
       [ 0.63191444,  0.23237672,  0.4590821 ,  0.01904812,  1.63680472],
       [-1.24936364, -0.44730791, -0.30612594, -1.05307121,  1.28685507]])
>>> arr.sort(0)
>>> arr
array([[-1.24936364, -0.44730791, -0.30612594, -1.05307121, -0.24934741],
       [-0.99257127,  0.23237672,  0.4590821 , -0.78014895,  1.28685507],
       [ 0.07606583,  0.36384095,  1.14265096,  0.01904812,  1.42900315],
       [ 0.63191444,  1.53456921,  1.15069057,  0.23094948,  1.63680472]])
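Random numbers make the effect of axis hard to follow by eye, so here is a small sketch with a hand-written integer grid (hypothetical data). Copies are sorted so that the original stays available for comparison.

```python
import numpy as np

grid = np.array([[3, 1, 2],
                 [9, 7, 8],
                 [6, 4, 5]])

# axis=1: each row is sorted independently.
rows_sorted = grid.copy()
rows_sorted.sort(axis=1)

# axis=0: each column is sorted independently.
cols_sorted = grid.copy()
cols_sorted.sort(axis=0)

print(rows_sorted.tolist())  # [[1, 2, 3], [7, 8, 9], [4, 5, 6]]
print(cols_sorted.tolist())  # [[3, 1, 2], [6, 4, 5], [9, 7, 8]]
```

Note that sorting along one axis does not keep rows or columns together: each row (or column) is reordered on its own.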

Note that the top-level np.sort function returns a sorted copy of the array, while the sort method sorts in place and modifies the array itself.

>>> arr = np.random.randn(4, 5)
>>> arr_repeat = np.sort(arr, axis=1)
>>> arr_repeat
array([[-0.64056336,  0.14082859,  0.44317426,  0.60988308,  0.77472024],
       [-1.63521891,  0.39869871,  0.55635461,  0.58039867,  0.59073797],
       [-1.62714899, -0.66642289, -0.16457651,  0.09046719,  0.5139126 ],
       [-0.79493979,  0.12287039,  0.50570075,  1.08870126,  1.34838367]])
>>> arr
array([[ 0.60988308,  0.44317426,  0.14082859,  0.77472024, -0.64056336],
       [ 0.59073797,  0.55635461,  0.58039867, -1.63521891,  0.39869871],
       [-0.16457651, -1.62714899, -0.66642289,  0.5139126 ,  0.09046719],
       [ 0.50570075,  1.34838367,  0.12287039,  1.08870126, -0.79493979]])
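The difference is easy to trip over, because the in-place method returns None. A minimal sketch with a three-element array (made-up data):

```python
import numpy as np

original = np.array([3, 1, 2])

# np.sort returns a new, sorted array and leaves the input alone.
sorted_copy = np.sort(original)
untouched = original.copy()      # snapshot: still [3, 1, 2]

# ndarray.sort sorts in place and returns None.
returned = original.sort()

print(sorted_copy.tolist())  # [1, 2, 3]
print(untouched.tolist())    # [3, 1, 2]
print(returned)              # None
print(original.tolist())     # [1, 2, 3]
```

So writing arr = arr.sort() is a bug: it replaces the array with None.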

sort also takes two further parameters, kind and order. kind selects the sorting algorithm: the default is quicksort, and mergesort and heapsort are also available ('quicksort', 'mergesort', 'heapsort'). order is a string or a list of strings naming the field or fields to sort by when the array has named fields.

>>> import numpy as np
>>> dtype = [('name', 'S10'), ('height', float), ('age', int)]
>>> values = [('Li', 1.8, 41), ('Wang', 1.9, 38), ('Duan', 1.7, 38)]
>>> a = np.array(values, dtype=dtype)
>>> np.sort(a, order='height')  # sort by the height field; here the argument is a string
array([('Duan', 1.7, 38), ('Li', 1.8, 41), ('Wang', 1.9, 38)],
      dtype=[('name', '|S10'), ('height', '<f8'), ('age', '<i4')])
>>> np.sort(a, order=['age', 'height'])  # sort by age first and, where ages are equal, by height; here the argument is a list
array([('Duan', 1.7, 38), ('Wang', 1.9, 38), ('Li', 1.8, 41)],
      dtype=[('name', '|S10'), ('height', '<f8'), ('age', '<i4')])

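The sketch below repeats the structured-array sort in a form that runs as a script, then illustrates kind. Two things here are assumptions rather than part of the original: the name field uses 'U10' instead of 'S10' so that names come back as ordinary Python strings, and np.argsort is used to make the stable behaviour of mergesort visible.

```python
import numpy as np

people = np.array(
    [('Li', 1.8, 41), ('Wang', 1.9, 38), ('Duan', 1.7, 38)],
    dtype=[('name', 'U10'), ('height', float), ('age', int)],
)

# Sort by age, breaking ties by height.
by_age_height = np.sort(people, order=['age', 'height'])
names = by_age_height['name'].tolist()
print(names)  # ['Duan', 'Wang', 'Li']

# mergesort is stable: the two 38-year-olds keep their original
# relative order (index 1 before index 2).
ages = people['age']
stable_idx = np.argsort(ages, kind='mergesort')
print(stable_idx.tolist())  # [1, 2, 0]
```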
Unique and other set logic operations

Uniqueness here simply means removing duplicates. The function for this is numpy.unique(), which returns the distinct values in sorted order.

>>> my_list = np.array([1, 3, 4, 6, 7, 4, 3, 1, 2])
>>> np.unique(my_list)
array([1, 2, 3, 4, 6, 7])

Note: The array itself does not have a unique method.
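The text above only shows np.unique, so the other set operations in this sketch (np.intersect1d and np.union1d, plus the return_counts option) are additions for illustration, using the same my_list values and some made-up second operands.

```python
import numpy as np

my_list = np.array([1, 3, 4, 6, 7, 4, 3, 1, 2])

# Distinct values (sorted) together with how often each one occurs.
uniq, counts = np.unique(my_list, return_counts=True)
print(uniq.tolist())    # [1, 2, 3, 4, 6, 7]
print(counts.tolist())  # [2, 1, 2, 2, 1, 1]

# Values present in both inputs (hypothetical second operand).
common = np.intersect1d(my_list, [2, 4, 8])
print(common.tolist())  # [2, 4]

# Values present in either input.
merged = np.union1d([1, 2], [2, 3])
print(merged.tolist())  # [1, 2, 3]
```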
