
Common NumPy functions in Python 3

Last Update:2018-05-02 Source: Internet

Author: User

Tags arithmetic diff square root unpack


1. TXT file

(1) An identity matrix is a square matrix whose main-diagonal elements are 1 and whose remaining elements are 0.
In NumPy, the eye function creates such a two-dimensional array; it needs only one parameter, which specifies the number of rows and columns (and so the number of 1s on the diagonal).
For example, create a 3x3 array:

import numpy as np
I2 = np.eye(3)
print(I2)

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

(2) Use the savetxt function to store the data in a file. We need to specify the file name and the array to save.

np.savetxt('eye.txt', I2)  # create the file eye.txt to hold the data in I2
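
As a quick check of the round trip, the sketch below writes the identity matrix with savetxt and reads it back with loadtxt. The temporary directory and the variable names are assumptions made for illustration; the original example simply writes eye.txt to the current directory.

```python
import os
import tempfile

import numpy as np

identity = np.eye(3)  # 3x3 identity matrix

# Write to a throwaway directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "eye.txt")
    np.savetxt(path, identity)   # plain text, one matrix row per line
    restored = np.loadtxt(path)  # read the file back as a float array

print(np.array_equal(identity, restored))
```

By default savetxt separates values with a space; a delimiter argument can be passed if a different separator is wanted.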

2. csv file

CSV (comma-separated values) is a common file format.
Database dump files are typically in CSV format, with each field in the file corresponding to a column in a database table.
Spreadsheet software such as Microsoft Excel can also handle CSV files.

Note: The loadtxt function in NumPy makes it easy to read a CSV file: it splits the fields automatically and loads the data into NumPy arrays.

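The examples below read from a file named data.csv, in which the seventh field holds the closing price and the eighth field holds the volume: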

# usecols takes a tuple; (6, 7) selects the 7th and 8th fields
# unpack=True stores each column separately, so the closing prices and the volumes are assigned to c and v
c, v = np.loadtxt('data.csv', delimiter=',', usecols=(6, 7), unpack=True)

print(c)

[336.1  339.32 345.03 344.32 343.44 346.5  351.88 355.2  358.16 354.54 356.85 359.18 359.9  363.13 358.3  350.56 338.61 342.62 342.88 348.16 353.21 349.31 352.12 359.56 360.   355.36 355.76 352.47 346.67 351.99]

print(v)

[21144800. 13473000. 15236800.  9242600. 14064100. 11494200. 17322100. 13608500. 17240800. 33162400. 13127500. 11086200. 10149000. 17184100. 18949000. 29144500. 31162200. 23994700. 17853500. 13572000. 14395400. 16290300. 21521000. 17885200. 16188000. 19504300. 12718000. 16192700. 18138800. 16824200.]

print(type(c))
print(type(v))

<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
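
Because the contents of data.csv are not reproduced here, the following sketch feeds loadtxt a small in-memory CSV instead. The rows, the ticker "XYZ", and the column layout (symbol, date, placeholder, open, high, low, close, volume) are hypothetical; only the loadtxt arguments mirror the call above.

```python
import io

import numpy as np

# Hypothetical rows: symbol, date, placeholder, open, high, low, close, volume
csv_text = (
    "XYZ,01-01-2011,-,10.0,11.0,9.5,10.5,1000\n"
    "XYZ,02-01-2011,-,10.5,12.0,10.0,11.5,2000\n"
    "XYZ,03-01-2011,-,11.5,12.5,11.0,12.0,1500\n"
)

# Only columns 6 and 7 are parsed, so the text columns cause no trouble.
close, volume = np.loadtxt(
    io.StringIO(csv_text), delimiter=",", usecols=(6, 7), unpack=True
)

print(close)
print(volume)
```

Without unpack=True, loadtxt would return a single array with one row per line and one column per selected field.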

3. Volume-weighted average price: the average() function

VWAP overview:
VWAP (volume-weighted average price) is a very important quantity in economics.
It represents the "average" price of a financial asset.
The higher the volume traded at a price, the greater the weight of that price.
VWAP is a weighted average calculated with volume as the weight, and is often used in algorithmic trading.

vwap = np.average(c, weights=v)
print('Volume-weighted average price vwap =', vwap)

Volume-weighted average price vwap = 350.5895493532009
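
To make the weighting explicit, here is a minimal sketch that computes the same quantity by hand, as the sum of price times volume divided by the total volume. The three prices and volumes are made-up numbers, not taken from data.csv.

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])  # hypothetical closing prices
volumes = np.array([1.0, 1.0, 2.0])    # hypothetical traded volumes

vwap_builtin = np.average(prices, weights=volumes)
vwap_manual = np.sum(prices * volumes) / np.sum(volumes)

print(vwap_builtin, vwap_manual)
```

The last price carries twice the volume of the others, so the result is pulled above the plain mean of 20.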

4. Arithmetic mean: the mean() function

The mean function in NumPy calculates the arithmetic mean of the array elements.

print('Arithmetic mean of the elements of c: {}'.format(np.mean(c)))

Arithmetic mean of the elements of c: 351.0376666666667

5. Time-weighted average price

TWAP overview:
In economics, TWAP (time-weighted average price) is another indicator of the "average" price. Now that we have calculated the VWAP, let's calculate the TWAP. TWAP is really just a variant: the basic idea is that recent prices are more important, so they should be given a higher weight. The simplest method is to use the arange function to create a sequence of natural numbers that starts at 0 and increases by 1, with as many elements as there are closing prices. Of course, this is not necessarily the right way to calculate TWAP.

t = np.arange(len(c))
print('Time-weighted average price twap=', np.average(c, weights=t))

Time-weighted average price twap= 352.4283218390804
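
One consequence of starting the weights at 0 is that the oldest price is ignored entirely. The sketch below shows this on made-up prices and, purely as an illustrative alternative that is not part of the original, compares it with weights that start at 1.

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])  # hypothetical prices, oldest first

weights_from_zero = np.arange(len(prices))        # [0, 1, 2]
weights_from_one = np.arange(1, len(prices) + 1)  # [1, 2, 3] (assumed variant)

twap_from_zero = np.average(prices, weights=weights_from_zero)
twap_from_one = np.average(prices, weights=weights_from_one)

print(twap_from_zero)  # the first price contributes nothing
print(twap_from_one)   # every price contributes, recent ones more
```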

6. Maximum value and minimum value

h, l = np.loadtxt('data.csv', delimiter=',', usecols=(4, 5), unpack=True)
print('h data:\n{}'.format(h))
print('-' * 10)
print('l data:\n{}'.format(l))

h data:
[344.4  340.04 345.65 345.25 344.24 346.7  353.25 355.52 359.   360. 357.8  359.48 359.97 364.9  360.27 359.5  345.4  344.64 345.15 348.43 355.05 355.72 354.35 359.79 360.29 361.67 357.4  354.76 349.77 352.32]
----------
l data:
[333.53 334.3  340.98 343.55 338.55 343.51 347.64 352.15 354.87 348. 353.54 356.71 357.55 360.5  356.52 349.52 337.72 338.61 338.37 344.8 351.12 347.68 348.4  355.92 357.75 351.31 352.25 350.6  344.9  345.  ]

print('Maximum of h: {}'.format(np.max(h)))
print('Minimum of l: {}'.format(np.min(l)))

Maximum of h: 364.9
Minimum of l: 333.53

NumPy has a ptp function that calculates the range of values in an array.
It returns the difference between the maximum and minimum values of the array elements.
In other words, the return value equals max(array) - min(array).

print('Range (max - min) of h:\n{}'.format(np.ptp(h)))
print('Range (max - min) of l:\n{}'.format(np.ptp(l)))

Range (max - min) of h:
24.859999999999957
Range (max - min) of l:
26.970000000000027
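
The equivalence between ptp and max minus min can be checked directly. This is a small sketch on made-up values, chosen so that the arithmetic is exact in floating point.

```python
import numpy as np

values = np.array([3.0, 7.5, 1.5, 9.0])  # hypothetical data

value_range = np.ptp(values)                   # "peak to peak"
manual_range = np.max(values) - np.min(values)

print(value_range, manual_range)
```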

7. Statistical analysis

Median:
We could use thresholds to discard outliers, but there is a better approach: the median.
Arrange the values of a variable in order of size to form a sequence; the number in the middle of the sequence is the median.
For example, given the 5 values 1, 2, 3, 4, 5, the median is the middle value, 3.

m = np.loadtxt('data.csv', delimiter=',', usecols=(6,), unpack=True)
print('Median of m: {}'.format(np.median(m)))

Median of m: 352.055

# Sort the array, then look up the median by index
sorted_m = np.sort(m)  # the original used np.msort(m), which is deprecated in newer NumPy releases
print('m sorted:\n{}'.format(sorted_m))
N = len(m)
print('Median of m: {}'.format((sorted_m[N // 2] + sorted_m[(N - 1) // 2]) / 2))

m sorted:
[336.1  338.61 339.32 342.62 342.88 343.44 344.32 345.03 346.5  346.67 348.16 349.31 350.56 351.88 351.99 352.12 352.47 353.21 354.54 355.2 355.36 355.76 356.85 358.16 358.3  359.18 359.56 359.9  360.   363.13]
Median of m: 352.055
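
The index expression used above handles both odd and even lengths: for an odd length the two indices coincide, and for an even length they pick the two middle elements. A minimal sketch follows, with a hypothetical helper name median_by_sorting and made-up inputs.

```python
import numpy as np


def median_by_sorting(values):
    """Median via sorting; hypothetical helper mirroring the index formula."""
    ordered = np.sort(values)
    n = len(ordered)
    # For odd n both indices are the middle one; for even n they straddle it.
    return (ordered[n // 2] + ordered[(n - 1) // 2]) / 2


odd_case = median_by_sorting(np.array([5.0, 1.0, 3.0, 2.0, 4.0]))
even_case = median_by_sorting(np.array([4.0, 1.0, 3.0, 2.0]))

print(odd_case, even_case)
```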

Variance:
Variance is the sum of the squared differences between each data point and the arithmetic mean of all the data, divided by the number of data points.

print('variance =', np.var(m))

variance = 50.126517888888884

var_hand = np.mean((m - m.mean()) ** 2)
print('var =', var_hand)

var = 50.126517888888884

Note: The sample variance and the population variance are calculated differently. The population variance divides the sum of squared deviations by the number of data points, while the sample variance divides it by the number of data points minus 1; this n-1 is called the degrees of freedom. The difference ensures that the sample variance is an unbiased estimator. By default, np.var computes the population variance.
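
In NumPy the choice between the two divisors is controlled by the ddof ("delta degrees of freedom") argument of np.var. The sketch below uses a made-up data set whose mean is 5 and whose sum of squared deviations is 32.

```python
import numpy as np

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # hypothetical data

population_variance = np.var(data)      # divides by n (ddof=0 is the default)
sample_variance = np.var(data, ddof=1)  # divides by n - 1

print(population_variance)
print(sample_variance)
```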

8. Stock returns

In academic literature, the analysis of closing prices is often based on simple returns and logarithmic returns.
The simple return is the rate of change between two adjacent prices, while the logarithmic return is the difference between the logarithms of each pair of adjacent prices.
Recall from high school that the logarithm of a minus the logarithm of b equals the logarithm of a divided by b. The logarithmic return can therefore also be used to measure the rate of change in price.
Note that because a return is a ratio (for example, dollars divided by dollars, or any other currency unit), it is dimensionless.
In short, investors are most interested in the variance or standard deviation of the returns, as this represents the size of the investment risk.

(1) First, let's calculate the simple rate of return. The diff function in NumPy returns an array of the differences between adjacent array elements, which is somewhat analogous to differentiation in calculus. To calculate the return, we also need to divide each difference by the price of the preceding day. Note that diff returns an array with one fewer element than the closing price array:
returns = np.diff(arr) / arr[:-1]
Notice that the last value in the closing price array is not used as a divisor. Next, use the std function to calculate the standard deviation:
print("Standard deviation =", np.std(returns))

(2) The logarithmic rate of return is even simpler to calculate. We first use the log function to take the logarithm of each closing price, and then apply the diff function to the result:
logreturns = np.diff(np.log(c))
In general, we should check the input array to ensure that it contains no zeros or negative numbers; otherwise NumPy issues a runtime warning and the result contains -inf or nan values. In our case, however, the stock price is always positive, so the check can be omitted.
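
The relationship between the two kinds of return can be verified numerically: each logarithmic return equals log(1 + simple return). The sketch below uses made-up prices and includes the positivity check mentioned above; the variable names are illustrative.

```python
import numpy as np

prices = np.array([100.0, 110.0, 99.0])  # hypothetical closing prices

# Guard against zeros and negatives before taking logarithms.
if not np.all(prices > 0):
    raise ValueError("prices must be strictly positive")

simple_returns = np.diff(prices) / prices[:-1]  # [0.1, -0.1]
log_returns = np.diff(np.log(prices))           # log(p[i+1] / p[i])

print(simple_returns)
print(log_returns)
print(np.allclose(log_returns, np.log1p(simple_returns)))
```

Both arrays have one element fewer than prices, since each return needs a pair of adjacent prices.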

(3) We are likely to be interested in which trading days have positive returns.
Having completed the previous steps, we only need the where function, which returns the indices of all array elements that satisfy a given condition.
Enter the following code:
posretindices = np.where(returns > 0)
print("Indices with positive returns", posretindices)
This outputs the indices of all positive elements in the array:
Indices with positive returns (array([0, 1, 4, 5, 6, 7, 9, 10, 11, 12, 16, 17, 18, 19, 21, 22, 23, 25, 28]),)
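
Note that where, called with only a condition, returns a tuple containing one index array per dimension, which is why the output above is wrapped in parentheses. A small sketch on made-up returns:

```python
import numpy as np

sample_returns = np.array([0.01, -0.02, 0.0, 0.03])  # hypothetical returns

positive = np.where(sample_returns > 0)  # tuple with one array for a 1-D input
positive_indices = positive[0]           # the index array itself

print(positive)
print(sample_returns[positive_indices])  # the positive values
```

A return of exactly 0 does not satisfy the condition, so index 2 is excluded.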

(4) In investment research, volatility is a measure of price movement. Historical volatility can be calculated from historical price data, and calculating it (for example, annual or monthly volatility) requires the logarithmic returns. Here the annual volatility equals the standard deviation of the logarithmic returns divided by their mean, and then divided by the square root of the reciprocal of the number of trading days in a year, usually taken to be 252.
With the std and mean functions, the code looks like this:
annual_volatility = np.std(logreturns) / np.mean(logreturns)
annual_volatility = annual_volatility / np.sqrt(1. / 252.)

(5) Note the division inside the sqrt call. In Python 2, dividing two integers with / discards the fractional part, so floating-point numbers must be used to get the correct result; Python 3 changed this so that / always performs true division. The monthly volatility is calculated in the same way as the annual volatility:
annual_volatility * np.sqrt(1. / 12.)
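
The division behavior can be seen in a short sketch. In Python 3 the / operator always produces a float, while // performs the floor division that Python 2 applied to two integers; the variable names are illustrative.

```python
import numpy as np

true_quotient = 1 / 252    # Python 3: true division, yields a float
floor_quotient = 1 // 252  # floor division, what Python 2 gave for 1/252

print(true_quotient == 1. / 252.)  # same value with or without the dots
print(floor_quotient)              # 0, which would make the square root 0
print(np.sqrt(true_quotient))
```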

c = np.loadtxt('data.csv', delimiter=',', usecols=(6,), unpack=True)
returns = np.diff(c) / c[:-1]
print('Standard deviation of returns: {}'.format(np.std(returns)))
logreturns = np.diff(np.log(c))
posretindices = np.where(returns > 0)
print('Positions of the positive elements of returns:\n{}'.format(posretindices))
annual_volatility = np.std(logreturns) / np.mean(logreturns)
annual_volatility = annual_volatility / np.sqrt(1 / 252)
print('Annual volatility: {}'.format(annual_volatility))
print('Monthly volatility: {}'.format(annual_volatility * np.sqrt(1 / 12)))

Standard deviation of returns: 0.012922134436826306
Positions of the positive elements of returns:
(array([ 0,  1,  4,  5,  6,  7,  9, 10, 11, 12, 16, 17, 18, 19, 21, 22, 23,
       25, 28], dtype=int64),)
Annual volatility: 129.27478991115132
Monthly volatility: 37.318417377317765

This article draws on the book Python Data Analysis Basics Tutorial: NumPy Learning Guide.


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
