Data analysis and presentation: getting started with the NumPy library

Source: Internet
Author: User


These are my notes from the course "Python data analysis and display" taught by Song Tian, Beijing University of Technology. The course is well focused and clearly structured. Many thanks to the instructor for the excellent explanations.

Getting started with the NumPy library

Data dimensions

A dimension is a way of organizing a group of data. Data dimensions establish specific relationships among data items so that the same data can express different meanings.

One-dimensional data:

One-dimensional data consists of ordered or unordered items that stand in a peer relationship to one another and are organized linearly. It corresponds to concepts such as lists, arrays, and sets.

Lists and arrays: both are ordered structures for a group of data.

Differences:

List: element data types can differ

Array: all elements have the same data type

Two-dimensional data:

Two-dimensional data is made up of several pieces of one-dimensional data. It is a combination of one-dimensional data.

A table is a typical example of two-dimensional data. The table header is itself part of the two-dimensional data.

Multi-dimensional data:

Multi-dimensional data extends one- or two-dimensional data along a new dimension, for example a table to which a time dimension is added.

High-dimensional data:

High-dimensional data uses only the most basic binary relationship to express complex structure among data. The data is organized as key-value pairs.

Python representation of data dimension

One-dimensional data: list (ordered) and set (unordered) types

Two-dimensional data: List type

Multi-dimensional data: List type

High-dimensional data: dictionary type or data representation format (JSON, XML, YAML)
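
As a rough illustration of the mapping above, the following sketch builds one value of each kind using only built-in Python types. All variable names and sample values are made up for illustration.

```python
# One-dimensional data: a list keeps order, a set does not.
scores_list = [3.14, 2.72, 1.41]
scores_set = {3.14, 2.72, 1.41}

# Two-dimensional data: a list of rows, each row itself a list.
table = [
    [1, 2, 3],
    [4, 5, 6],
]

# Multi-dimensional data: one more level of nesting,
# here one table per (hypothetical) year.
tables_by_year = [table, table]

# High-dimensional data: key-value pairs in a dictionary.
record = {
    "name": "example",
    "address": {"city": "Beijing", "zipcode": "100081"},
}

print(len(table), len(table[0]))      # number of rows, number of columns
print(len(tables_by_year))            # size of the added dimension
print(record["address"]["city"])      # nested lookup by key
```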

NumPy array object: ndarray

NumPy is an open-source foundational library for scientific computing in Python. It provides a powerful n-dimensional array object (ndarray), broadcasting functions, tools for integrating C/C++/Fortran code, and functions for linear algebra, Fourier transforms, and random number generation. NumPy is the foundation of data processing and scientific computing libraries such as SciPy and Pandas.

Importing NumPy:
import numpy as np

Although the alias can be omitted or changed, we recommend that you use the alias mentioned above.

Benefits of introducing ndarray:

Example: Calculate A^2 + B^3, where A and B are one-dimensional arrays.

def pySum():
    a = [0, 1, 2, 3, 4]
    b = [9, 8, 7, 6, 5]
    c = []
    # Plain Python needs an explicit loop over the elements.
    for i in range(len(a)):
        c.append(a[i]**2 + b[i]**3)
    return c

print(pySum())

import numpy as np

def npSum():
    a = np.array([0, 1, 2, 3, 4])
    b = np.array([9, 8, 7, 6, 5])
    # The array expression applies to every element at once.
    c = a**2 + b**3
    return c

print(npSum())

Array objects remove the loops needed for element-by-element operations, so a one-dimensional vector behaves more like a single value. Because the array object is implemented and optimized specifically for this purpose, such computations also run faster.

Observation: in scientific computing, all the data along a dimension usually has the same type.

Because an array object uses a single data type, it saves both computation time and storage space.
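
The storage claim can be made concrete with a small sketch. It assumes CPython, where sys.getsizeof reports per-object sizes; the exact list figure varies by interpreter and version, so only the comparison matters here.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)

# Every element of the array occupies exactly itemsize bytes.
print(arr.itemsize)   # bytes per element
print(arr.nbytes)     # n * itemsize

# A list stores references to separate int objects, so the total
# (list container plus every int object) is larger.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)
print(list_bytes > arr.nbytes)
```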

N-dimensional array object: ndarray

ndarray is a multi-dimensional array object made up of two parts: the actual data and the metadata that describes it (data dimensions, data type, and so on). An ndarray generally requires all elements to have the same type (homogeneous), and array subscripts start from 0.

Use np.array() to generate an ndarray (in program output, ndarray appears under the alias array). Printing an array produced by np.array() shows the elements inside [], separated by spaces.

  • Axis: dimension for saving data
  • Rank: number of axes

Example: generate an ndarray

In [1]: import numpy as np

In [2]: a = np.array([[0,1,2,3,4],
   ...:               [9,8,7,6,5]])

In [3]: a
Out[3]:
array([[0, 1, 2, 3, 4],
       [9, 8, 7, 6, 5]])

In [4]: print(a)
[[0 1 2 3 4]
 [9 8 7 6 5]]

Attributes of the ndarray object

Attribute    Description
.ndim        Rank, that is, the number of axes (dimensions)
.shape       Size of the ndarray along each axis; for a matrix, n rows and m columns
.size        Number of elements in the ndarray, equal to n * m from .shape
.dtype       Element type of the ndarray
.itemsize    Size of each element in the ndarray, in bytes

Example: Test ndarray attributes

In [5]: a.ndim
Out[5]: 2

In [6]: a.shape
Out[6]: (2, 5)

In [7]: a.dtype
Out[7]: dtype('int32')

In [8]: a.itemsize
Out[8]: 4
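
The same attributes can be read in a plain script. In this sketch the dtype is passed explicitly, because the default integer type (int32 or int64) depends on the platform and NumPy version.

```python
import numpy as np

# Explicit dtype so the result does not depend on the platform default.
a = np.array([[0, 1, 2, 3, 4],
              [9, 8, 7, 6, 5]], dtype=np.int32)

print(a.ndim)      # 2
print(a.shape)     # (2, 5)
print(a.size)      # 10
print(a.dtype)     # int32
print(a.itemsize)  # 4

# size is the product of the entries of shape.
rows, cols = a.shape
print(a.size == rows * cols)
```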
Element types of ndarray

Data type     Description
bool          Boolean, True or False
intc          Same as the C int type, generally int32 or int64
intp          Integer used for indexing, same as the C ssize_t type, int32 or int64
int8          8-bit integer; range [-128, 127]
int16         16-bit integer; range [-32768, 32767]
int32         32-bit integer; range [-2^31, 2^31-1]
int64         64-bit integer; range [-2^63, 2^63-1]
uint8         8-bit unsigned integer; range [0, 255]
uint16        16-bit unsigned integer; range [0, 65535]
uint32        32-bit unsigned integer; range [0, 2^32-1]
uint64        64-bit unsigned integer; range [0, 2^64-1]
float16       16-bit half-precision floating point number: 1 sign bit, 5 exponent bits, 10 mantissa bits
float32       32-bit single-precision floating point number: 1 sign bit, 8 exponent bits, 23 mantissa bits
float64       64-bit double-precision floating point number: 1 sign bit, 11 exponent bits, 52 mantissa bits
complex64     Complex number whose real and imaginary parts are both 32-bit floating point numbers
complex128    Complex number whose real and imaginary parts are both 64-bit floating point numbers

A floating point value has the form (sign) mantissa × 2^exponent.

A complex value has the form real part (.real) + j × imaginary part (.imag).
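
The integer ranges and floating-point bit layouts of these types can be checked directly with np.iinfo and np.finfo. This is a minimal sketch; the loop variables are illustrative only.

```python
import numpy as np

# Integer types: smallest and largest representable value.
for t in (np.int8, np.int16, np.int32, np.int64,
          np.uint8, np.uint16, np.uint32, np.uint64):
    info = np.iinfo(t)
    print(t.__name__, info.min, info.max)

# Floating-point types: total bits, exponent bits, mantissa bits.
for t in (np.float16, np.float32, np.float64):
    info = np.finfo(t)
    print(t.__name__, info.bits, info.nexp, info.nmant)

# A complex64 element is two float32 values, so it takes 8 bytes.
print(np.dtype(np.complex64).itemsize)
```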

Comparison: Python's built-in numeric types are limited to integer, floating point, and complex. Why does ndarray support so many element types?

  • Scientific computing involves large amounts of data and places high demands on storage and performance.
  • Fine-grained element types help NumPy use storage space sensibly and optimize performance.
  • Fine-grained element types help programmers estimate the scale of a program reasonably.

Non-homogeneous ndarray objects

An ndarray can be built from non-homogeneous objects. Its elements then have the object type, so it cannot take effective advantage of NumPy. Avoid such arrays whenever possible.

Example: a non-homogeneous ndarray has dtype object. Recent NumPy releases raise a ValueError for ragged input like this unless dtype=object is passed, so it is spelled out here.

In [9]: x = np.array([[0,1,2,3,4],
   ...:               [9,8,7,6]], dtype=object)

In [10]: x.shape
Out[10]: (2,)

In [11]: x.dtype
Out[11]: dtype('O')

In [12]: x
Out[12]: array([list([0, 1, 2, 3, 4]), list([9, 8, 7, 6])], dtype=object)

In [13]: x.itemsize
Out[13]: 8

In [14]: x.size
Out[14]: 2
Creating and transforming ndarray arrays

(1) Create an ndarray from Python lists, tuples, and similar types.

x = np.array(list/tuple)
x = np.array(list/tuple, dtype=np.float32)

When np.array() is called without dtype, NumPy infers a dtype from the data.

Example: Create an ndarray

In [15]: x = np.array([0, 1, 2, 3])    # create from a list

In [16]: print(x)
[0 1 2 3]

In [17]: x = np.array((4, 5, 6, 7))    # create from a tuple

In [18]: print(x)
[4 5 6 7]

In [19]: x = np.array([[1, 2], [9, 8], (0.1, 0.2)])    # create from a mix of lists and tuples

In [20]: print(x)
[[1.  2. ]
 [9.  8. ]
 [0.1 0.2]]
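
The dtype inference described above can be observed directly. A minimal sketch with illustrative variable names:

```python
import numpy as np

ints = np.array([0, 1, 2, 3])                    # all integers: an integer dtype
mixed = np.array([0, 1.5, 2])                    # one float promotes every element to float64
forced = np.array([0, 1, 2], dtype=np.float32)   # an explicit dtype overrides inference

print(ints.dtype.kind)   # 'i'; the exact width (int32 or int64) depends on the platform
print(mixed.dtype)       # float64
print(forced.dtype)      # float32
```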
(2) Use NumPy functions such as arange, ones, and zeros to create an ndarray.

Function                Description
np.arange(n)            Similar to range(); returns an ndarray with elements from 0 to n-1
np.ones(shape)          Generates an array of all 1s with the given shape; shape is a tuple
np.zeros(shape)         Generates an array of all 0s with the given shape; shape is a tuple
np.full(shape, val)     Generates an array with the given shape in which every element is val
np.eye(n)               Creates an n × n identity matrix: 1 on the diagonal and 0 elsewhere
np.ones_like(a)         Generates an array of all 1s with the same shape as array a
np.zeros_like(a)        Generates an array of all 0s with the same shape as array a
np.full_like(a, val)    Generates an array with the same shape as array a in which every element is val

Other NumPy functions for creating an ndarray

np.linspace()           Fills an array with evenly spaced values between a start value and an end value
np.concatenate()        Joins two or more arrays into a new array

Example: Create an ndarray

In [21]: np.arange(10)
Out[21]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [22]: np.ones((3,6))
Out[22]:
array([[ 1.,  1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.,  1.]])

In [23]: np.zeros((3,6),dtype=np.int32)
Out[23]:
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0]])

In [24]: np.eye(5)
Out[24]:
array([[ 1.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  1.]])

In [25]: x = np.ones((2,3,4))

In [26]: print(x)
[[[ 1.  1.  1.  1.]
  [ 1.  1.  1.  1.]
  [ 1.  1.  1.  1.]]

 [[ 1.  1.  1.  1.]
  [ 1.  1.  1.  1.]
  [ 1.  1.  1.  1.]]]

In [27]: x.shape
Out[27]: (2, 3, 4)

In [28]: a = np.linspace(1, 10, 4)

In [29]: a
Out[29]: array([  1.,   4.,   7.,  10.])

In [30]: b = np.linspace(1, 10, 4, endpoint=False)

In [31]: b
Out[31]: array([ 1.  ,  3.25,  5.5 ,  7.75])

In [32]: c = np.concatenate((a,b))

In [33]: c
Out[33]: array([  1.  ,   4.  ,   7.  ,  10.  ,   1.  ,   3.25,   5.5 ,   7.75])
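
The session above does not exercise np.full or the *_like functions. The following sketch covers them under the same import; the array a and the fill values are arbitrary choices for illustration.

```python
import numpy as np

a = np.array([[0, 1, 2],
              [3, 4, 5]])

filled = np.full((2, 3), 7)               # shape (2, 3), every element is 7
ones = np.ones_like(a)                    # same shape and dtype as a, all 1
zeros = np.zeros_like(a)                  # same shape and dtype as a, all 0
nines = np.full_like(a, 9)                # same shape and dtype as a, all 9
grid = np.linspace(0, 1, 5)               # 5 evenly spaced values, end point included
joined = np.concatenate((ones, zeros))    # joined along the first axis

print(filled)
print(nines)
print(grid)
print(joined.shape)
```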
(3) Create an ndarray from a byte stream (raw bytes).

(4) Read a specific format from a file and create an ndarray.

Transforming ndarray arrays

For the created ndarray, you can perform dimension transformation and element type conversion.

Dimension transformation of the ndarray

Method                 Description
.reshape(shape)        Returns an array with the given shape without changing the elements; the original array is unchanged
.resize(shape)         Same as .reshape(), but modifies the original array
.swapaxes(ax1, ax2)    Swaps two of the array's n dimensions
.flatten()             Reduces the dimensionality and returns a flattened one-dimensional array; the original array is unchanged

In [34]: a = np.ones((2,3,4), dtype=np.int32)

In [35]: a.reshape((3,8))
Out[35]:
array([[1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1]])

In [36]: a
Out[36]:
array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]])

In [37]: a.resize((3,8))

In [38]: a
Out[38]:
array([[1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1]])

In [39]: a = np.ones((2,3,4), dtype=np.int32)

In [40]: a.flatten()
Out[40]:
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

In [41]: a
Out[41]:
array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]])

In [42]: b = a.flatten()

In [43]: b
Out[43]:
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
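
An array of all 1s hides which element ends up where, and the session does not show .swapaxes(). The sketch below repeats the four methods on np.arange(24) so the effect on shape and on the original array is visible. Variable names are illustrative.

```python
import numpy as np

a = np.arange(24).reshape((2, 3, 4))

r = a.reshape((3, 8))     # new shape; a keeps its own shape
s = a.swapaxes(0, 2)      # exchange the first and last axes
f = a.flatten()           # one-dimensional copy of the data
f[0] = 99                 # changing the copy does not touch a

print(a.shape, r.shape, s.shape, f.shape)
print(a[0, 0, 0])         # still 0
print(s[3, 2, 1] == a[1, 2, 3])

b = np.arange(24)
b.resize((3, 8))          # modifies b in place and returns None
print(b.shape)
```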
Type conversion of ndarray arrays

new_a = a.astype(new_type)

Example: array type conversion

In [44]: a = np.ones((2,3,4), dtype=int)

In [45]: a
Out[45]:
array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]])

In [46]: b = a.astype(float)

In [47]: b
Out[47]:
array([[[ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.]],

       [[ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.]]])

By default, the astype() method always creates a new array (a copy of the original data), even if the two types are the same.
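
The copy behaviour can be checked with np.shares_memory. A minimal sketch, with illustrative names:

```python
import numpy as np

a = np.ones((2, 3), dtype=np.int32)
b = a.astype(np.float64)   # different type: a new array
c = a.astype(np.int32)     # same type: still a new array by default

c[0, 0] = 100              # changing the copy leaves a untouched

print(b.dtype, c.dtype)
print(a[0, 0])
print(np.shares_memory(a, c))
```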

Conversion from an ndarray to a list
ls = a.tolist()

Example: converting an ndarray to a list

In [48]: a = np.full((2,3,4), 25, dtype=np.int32)

In [49]: a
Out[49]:
array([[[25, 25, 25, 25],
        [25, 25, 25, 25],
        [25, 25, 25, 25]],

       [[25, 25, 25, 25],
        [25, 25, 25, 25],
        [25, 25, 25, 25]]])

In [50]: a.tolist()
Out[50]:
[[[25, 25, 25, 25], [25, 25, 25, 25], [25, 25, 25, 25]],
 [[25, 25, 25, 25], [25, 25, 25, 25], [25, 25, 25, 25]]]
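
A smaller sketch of the same conversion shows that tolist() produces nested Python lists whose elements are plain Python numbers rather than NumPy scalars.

```python
import numpy as np

a = np.full((2, 3), 25, dtype=np.int32)
ls = a.tolist()

print(ls)               # [[25, 25, 25], [25, 25, 25]]
print(type(ls))         # <class 'list'>
print(type(ls[0][0]))   # <class 'int'>
```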
Manipulating ndarray arrays: indexing and slicing

Index: the process of retrieving a specific element of an array.

Slice: the process of retrieving a subset of the array's elements.

Indexing and slicing of one-dimensional arrays work much as they do for Python lists.

In [51]: a = np.array([9, 8, 7, 6, 5])

In [52]: a[2]
Out[52]: 7

In [53]: a[1:4:2]    # start : end (not included) : step, three values separated by colons; indices count up from 0 at the left or down from -1 at the right
Out[53]: array([8, 6])

Indexing of multi-dimensional arrays:

In [54]: a = np.arange(24).reshape((2,3,4))

In [55]: a
Out[55]:
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [56]: a[1,2,3]    # one index value per dimension, separated by commas
Out[56]: 23

In [57]: a[0,1,2]
Out[57]: 6

In [58]: a[-1,-2,-3]
Out[58]: 17

Slicing of multi-dimensional arrays:

In [59]: a[:, 1, -3]    # ":" selects an entire dimension
Out[59]: array([ 5, 17])

In [60]: a[:, 1:3, :]    # each dimension is sliced the same way as a one-dimensional array
Out[60]:
array([[[ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [61]: a[:, :, ::2]    # each dimension can use a step to skip elements
Out[61]:
array([[[ 0,  2],
        [ 4,  6],
        [ 8, 10]],

       [[12, 14],
        [16, 18],
        [20, 22]]])
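
The indexing and slicing rules above can be replayed as a script. The last part goes beyond the notes: a basic slice of an ndarray is a view, so assigning through it changes the array it was taken from. It works on a copy b so that a stays intact.

```python
import numpy as np

a = np.arange(24).reshape((2, 3, 4))

print(a[1, 2, 3])             # 23
print(a[-1, -2, -3])          # 17
print(a[:, 1, -3])            # [ 5 17]
print(a[:, 1:3, :].shape)     # (2, 2, 4)
print(a[:, :, ::2].shape)     # (2, 3, 2)

# Not covered in the notes: a slice is a view onto the same data.
b = a.copy()
first_rows = b[:, 0, :]       # row 0 of each 3 x 4 block
first_rows[:] = 0             # writes through to b
print(b[1, 0])
```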
Computing with ndarray arrays

Operations between an array and a scalar

An operation between an array and a scalar is applied to every element of the array.

Example: divide a by the mean of its elements

In [62]: a.mean()
Out[62]: 11.5

In [63]: a = a/a.mean()

In [64]: a
Out[64]:
array([[[ 0.        ,  0.08695652,  0.17391304,  0.26086957],
        [ 0.34782609,  0.43478261,  0.52173913,  0.60869565],
        [ 0.69565217,  0.7826087 ,  0.86956522,  0.95652174]],

       [[ 1.04347826,  1.13043478,  1.2173913 ,  1.30434783],
        [ 1.39130435,  1.47826087,  1.56521739,  1.65217391],
        [ 1.73913043,  1.82608696,  1.91304348,  2.        ]]])
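
A few more array-scalar operations, sketched on a small array so the results are easy to check by hand. The names are illustrative.

```python
import numpy as np

a = np.arange(6).reshape((2, 3))

shifted = a + 10            # 10 is added to every element
doubled = a * 2             # every element is multiplied by 2
centered = a - a.mean()     # a.mean() is a scalar, 2.5 for this array

print(shifted)
print(doubled)
print(centered)
print(centered.mean())      # the centered values average to zero
```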
NumPy unary functions

Functions that perform element-wise operations on the data in an ndarray

Function                              Description
np.abs(x) np.fabs(x)                  Calculates the absolute value of each element in the array
np.sqrt(x)                            Calculates the square root of each element in the array
np.square(x)                          Calculates the square of each element in the array
np.log(x) np.log10(x) np.log2(x)      Calculates the natural logarithm, base-10 logarithm, and base-2 logarithm of each element in the array
np.ceil(x) np.floor(x)                Calculates the ceiling or floor of each element in the array
np.rint(x)                            Rounds each element in the array to the nearest integer
np.modf(x)                            Returns the fractional and integer parts of each element as two separate arrays
np.cos(x) np.cosh(x)
np.sin(x) np.sinh(x)
np.tan(x) np.tanh(x)                  Calculates the ordinary and hyperbolic trigonometric functions of each element in the array
np.exp(x)                             Calculates the exponential of each element in the array
np.sign(x)                            Calculates the sign of each element in the array: 1 (+), 0, -1 (-)

Example: unary functions

In [65]: a = np.arange(24).reshape((2,3,4))

In [66]: np.square(a)
Out[66]:
array([[[  0,   1,   4,   9],
        [ 16,  25,  36,  49],
        [ 64,  81, 100, 121]],

       [[144, 169, 196, 225],
        [256, 289, 324, 361],
        [400, 441, 484, 529]]], dtype=int32)

In [67]: a = np.sqrt(a)

In [68]: a
Out[68]:
array([[[ 0.        ,  1.        ,  1.41421356,  1.73205081],
        [ 2.        ,  2.23606798,  2.44948974,  2.64575131],
        [ 2.82842712,  3.        ,  3.16227766,  3.31662479]],

       [[ 3.46410162,  3.60555128,  3.74165739,  3.87298335],
        [ 4.        ,  4.12310563,  4.24264069,  4.35889894],
        [ 4.47213595,  4.58257569,  4.69041576,  4.79583152]]])

In [69]: np.modf(a)
Out[69]:
(array([[[ 0.        ,  0.        ,  0.41421356,  0.73205081],
         [ 0.        ,  0.23606798,  0.44948974,  0.64575131],
         [ 0.82842712,  0.        ,  0.16227766,  0.31662479]],

        [[ 0.46410162,  0.60555128,  0.74165739,  0.87298335],
         [ 0.        ,  0.12310563,  0.24264069,  0.35889894],
         [ 0.47213595,  0.58257569,  0.69041576,  0.79583152]]]),
 array([[[ 0.,  1.,  1.,  1.],
         [ 2.,  2.,  2.,  2.],
         [ 2.,  3.,  3.,  3.]],

        [[ 3.,  3.,  3.,  3.],
         [ 4.,  4.,  4.,  4.],
         [ 4.,  4.,  4.,  4.]]]))
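
The rounding-related functions are easier to tell apart on an array that contains negative and fractional values. A minimal sketch; the sample values in x are arbitrary.

```python
import numpy as np

x = np.array([-2.5, -0.4, 0.0, 1.6, 9.0])

print(np.abs(x))       # magnitudes
print(np.sign(x))      # -1, 0, or 1 per element
print(np.ceil(x))      # smallest integer not below each element
print(np.floor(x))     # largest integer not above each element
print(np.rint(x))      # nearest integer; exact halves go to the even neighbour

# modf returns two arrays: fractional parts first, integer parts second.
frac, whole = np.modf(x)
print(frac)
print(whole)
```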
NumPy binary functions

Function                        Description
+ - * / **                      Applies the operation to corresponding elements of the two arrays
np.maximum(x, y) np.fmax()
np.minimum(x, y) np.fmin()      Element-wise maximum/minimum
np.mod(x, y)                    Element-wise modulo operation
np.copysign(x, y)               Assigns the sign of each element of array y to the corresponding element of array x
> < >= <= == !=                 Arithmetic comparison; produces a Boolean array

Example: NumPy binary Functions

In [70]: a = np.arange(24).reshape((2,3,4))

In [71]: b = np.sqrt(a)

In [72]: a
Out[72]:
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [73]: b
Out[73]:
array([[[ 0.        ,  1.        ,  1.41421356,  1.73205081],
        [ 2.        ,  2.23606798,  2.44948974,  2.64575131],
        [ 2.82842712,  3.        ,  3.16227766,  3.31662479]],

       [[ 3.46410162,  3.60555128,  3.74165739,  3.87298335],
        [ 4.        ,  4.12310563,  4.24264069,  4.35889894],
        [ 4.47213595,  4.58257569,  4.69041576,  4.79583152]]])

In [74]: np.maximum(a,b)
Out[74]:
array([[[  0.,   1.,   2.,   3.],
        [  4.,   5.,   6.,   7.],
        [  8.,   9.,  10.,  11.]],

       [[ 12.,  13.,  14.,  15.],
        [ 16.,  17.,  18.,  19.],
        [ 20.,  21.,  22.,  23.]]])

In [75]: a > b
Out[75]:
array([[[False, False,  True,  True],
        [ True,  True,  True,  True],
        [ True,  True,  True,  True]],

       [[ True,  True,  True,  True],
        [ True,  True,  True,  True],
        [ True,  True,  True,  True]]], dtype=bool)
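
The session above covers np.maximum and comparison only. The sketch below adds np.minimum, np.mod, and np.copysign on small arrays. The final lines go beyond the notes: np.maximum propagates NaN, while np.fmax ignores it when the other operand is a number.

```python
import numpy as np

x = np.array([1.0, -4.0, 9.0])
y = np.array([-2.0, 3.0, 9.0])

print(np.maximum(x, y))     # element-wise larger value
print(np.minimum(x, y))     # element-wise smaller value
print(np.mod(x, 4))         # remainder after dividing each element by 4
print(np.copysign(x, y))    # magnitudes of x with the signs of y
print(x >= y)               # Boolean array

# Not covered in the notes: NaN handling differs between maximum and fmax.
nan_max = np.maximum(np.nan, 1.0)
nan_fmax = np.fmax(np.nan, 1.0)
print(nan_max, nan_fmax)
```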
