Pandas data visualization (III): time resampling
Python + pandas: generating specified dates and resampling - CSDN blog https://blog.csdn.net/LY_ysys629/article/details/73823803
pandas resample method - CSDN blog https://blog.csdn.net/wangshuang1631/article/details/52314944
——————————————————————————————————————————————————
Time series conversion:
import numpy as np
import pandas as pd
c = pd.Series(np.random.rand(5), index=pd.date_range('20180130', periods=5, freq='D'))  # create a timestamp-indexed series
d = c.to_period('M')  # convert the timestamps to monthly periods
print(c, type(c), c.index)
print(d, type(d), d.index)
2018-01-30 0.424927
2018-01-31 0.522582
2018-02-01 0.889830
2018-02-02 0.130641
2018-02-03 0.222065
Freq: D, dtype: float64 <class 'pandas.core.series.Series'> DatetimeIndex(['2018-01-30', '2018-01-31', '2018-02-01', '2018-02-02', '2018-02-03'], dtype='datetime64[ns]', freq='D')
2018-01 0.424927
2018-01 0.522582
2018-02 0.889830
2018-02 0.130641
2018-02 0.222065
Freq: M, dtype: float64 <class 'pandas.core.series.Series'> PeriodIndex(['2018-01', '2018-01', '2018-02', '2018-02', '2018-02'], dtype='period[M]', freq='M')
——————————————————————————————————————————————————
groupby: sum or average (mean) by group
c = pd.Series(np.random.rand(5), index=pd.date_range('20180130', periods=5, freq='D'))
d = c.to_period('M')  # convert to monthly periods
e = d.groupby(level=0).mean()  # or e = d.groupby(level=0).sum()
print(e)
Output:
2018-01 0.407399
2018-02 0.826991
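To make the grouping step reproducible, here is a minimal sketch that uses fixed values instead of random ones (the numbers 1.0 to 5.0 are illustrative and not taken from the run above). It converts a daily timestamp index to monthly periods and then aggregates the rows that share a period label.

```python
import pandas as pd

# Five daily observations that straddle a month boundary (illustrative values).
c = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0],
              index=pd.date_range('2018-01-30', periods=5, freq='D'))

# Collapse each timestamp to the month that contains it.
d = c.to_period('M')

# Rows with the same period label end up in the same group.
monthly_mean = d.groupby(level=0).mean()
monthly_sum = d.groupby(level=0).sum()

print(monthly_mean)
print(monthly_sum)
```

January holds the first two values and February the remaining three, so the two groups have different sizes.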
————————————————————————————————————————————————
Converting periods back to timestamps:
When a timestamp series is converted to a monthly ('M') period series and then back to a timestamp series, the day of each resulting timestamp is no longer the day of the original timestamp.
c = pd.Series(np.random.rand(3), index=pd.date_range('20180130', periods=3, freq='D'))
d = c.to_period('M')
f = d.to_timestamp(how='start')  # or how='end'
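The loss of the original day can be checked directly. The sketch below uses fixed illustrative values and compares the day of month before and after the round trip, for both how='start' and how='end'.

```python
import pandas as pd

c = pd.Series([1.0, 2.0, 3.0],
              index=pd.date_range('2018-01-30', periods=3, freq='D'))
d = c.to_period('M')

# how='start' maps every period to its first day.
f_start = d.to_timestamp(how='start')
# how='end' maps every period to its last day.
f_end = d.to_timestamp(how='end')

original_days = c.index.day.tolist()
start_days = f_start.index.day.tolist()
end_days = f_end.index.day.tolist()

print(original_days, start_days, end_days)
```

The original days 30, 31 and 1 are replaced by the first (or last) day of each month, so the round trip is not lossless.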
——————————————————————————————————————————————
Time resampling: timestamp-indexed series, 1-minute data to 5-minute bins
c = pd.Series(np.random.randint(0, 50, 11), index=pd.date_range('2018-01-30 9:30', periods=11, freq='min'))  # the upper bound 50 is an assumption; the original argument was garbled
d = c.resample('5min', label='right').sum()  # label='right' stamps each bin with its right edge (the end of the 5-minute period); the default is the start time
e = c.resample('5min').ohlc()
print(c, d, e)
Output:
Note: older pandas versions accepted how='ohlc'; the current syntax is .resample(...).ohlc().
print(c)
2018-01-30 09:30:00
2018-01-30 09:31:00 5
2018-01-30 09:32:00
2018-01-30 09:33:00
2018-01-30 09:34:00
2018-01-30 09:35:00
2018-01-30 09:36:00
2018-01-30 09:37:00 7
2018-01-30 09:38:00 4
2018-01-30 09:39:00
2018-01-30 09:40:00 40
print(d)
2018-01-30 09:35:00 140
2018-01-30 09:40:00 140
2018-01-30 09:45:00 40
Freq: 5T, dtype: int32
# label='right' stamps each 5-minute period with its right edge (9:35 for the first one); by default the left edge (9:30) is used.
print(e)
open high low close
2018-01-30 09:30:00 5
2018-01-30 09:35:00 4
2018-01-30 09:40:00 40
# c.resample('5min').ohlc(): the first value in each 5-minute period is the open and the last is the close.
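Because the values above come from a random run, the effect of label='right' is easier to see with known inputs. The following sketch assumes the values 0 through 10 (purely illustrative) and shows that the bins are identical and only the timestamps attached to them change.

```python
import pandas as pd

# Eleven one-minute observations, 09:30 through 09:40 (illustrative values 0..10).
c = pd.Series(range(11),
              index=pd.date_range('2018-01-30 09:30', periods=11, freq='min'))

# Default labelling: each bin carries its left edge (09:30, 09:35, 09:40).
left = c.resample('5min').sum()
# label='right': same bins, stamped with the right edge (09:35, 09:40, 09:45).
right = c.resample('5min', label='right').sum()
# ohlc() reports the first, highest, lowest and last value of every bin.
bars = c.resample('5min').ohlc()

print(left)
print(right)
print(bars)
```

The last bin contains only the 09:40 observation, which is why its open, high, low and close are all equal.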
————————————————————————————————————
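Time resampling: grouping daily data by month
c = pd.Series(np.random.randint(0, 50, 10), index=pd.date_range('2018-1-24', periods=10, freq='D'))  # the upper bound 50 is an assumption; the original argument was garbled
d = c.groupby(lambda x: x.month).sum()  # group by month number
e = c.groupby(c.index.to_period('M')).sum()  # group by monthly period
print(c)
print(d)
print(e)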
Output: print(c)
2018-01-24
2018-01-25
2018-01-26
2018-01-27
2018-01-28
2018-01-29
2018-01-30
2018-01-31
2018-02-01
2018-02-02
Freq: D, dtype: int32
print(d)
1 257
2
dtype: int32
print(e)
2018-01 257
2018-02
Freq: M, dtype: int32
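The two groupings agree here only because the data stay within one year. As a hedged illustration (the dates and values below are made up for the purpose), grouping by month number merges the same month from different years, while grouping by monthly period keeps them apart.

```python
import pandas as pd

# Two observations in January 2018 and one in January 2019 (illustrative values).
c = pd.Series([1, 2, 10],
              index=pd.to_datetime(['2018-01-24', '2018-01-25', '2019-01-24']))

# Month number only: both Januaries fall into group 1.
by_month_number = c.groupby(lambda ts: ts.month).sum()
# Monthly period: 2018-01 and 2019-01 are separate groups.
by_period = c.groupby(c.index.to_period('M')).sum()

print(by_month_number)
print(by_period)
```

For multi-year data the period-based grouping is usually the safer of the two.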
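c = pd.Series(np.random.randint(0, 50, 2), index=pd.date_range('20180401', periods=2, freq='W-FRI'))  # the upper bound 50 is an assumption; the original argument was garbled
d = c.resample('D').ffill(limit=2)  # upsample to daily, forward-filling at most 2 days
e = c.resample('W-MON').ffill()  # re-anchor the weekly series on Mondays
print(c)
print(d)
print(e)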
Output:
2018-04-06 34
2018-04-13 1
Freq: W-FRI, dtype: int32
2018-04-06 34.0
2018-04-07 34.0
2018-04-08 34.0
2018-04-09 NaN
2018-04-10 NaN
2018-04-11 NaN
2018-04-12 NaN
2018-04-13 1.0
Freq: D, dtype: float64
2018-04-09
2018-04-16 1
Freq: W-MON, dtype: int32
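The forward-fill behaviour can be reproduced with fixed inputs. This sketch reuses the values 34 and 1 that appear in the output above and relies on the method-style API (.ffill(limit=...)) rather than the older fill_method argument.

```python
import pandas as pd

# Two weekly observations that fall on Fridays (2018-04-06 and 2018-04-13).
c = pd.Series([34, 1], index=pd.date_range('2018-04-01', periods=2, freq='W-FRI'))

# Upsample to daily data; each observation is carried forward at most two days.
d = c.resample('D').ffill(limit=2)
# Re-anchor the weekly series on Mondays, carrying the last known value forward.
e = c.resample('W-MON').ffill()

print(d)
print(e)
```

With limit=2, the four days that are more than two days past the first observation stay NaN, which also forces the daily result to a float dtype.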
——————————————————————————————————————————————————
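Time resampling: sampling by year and quarter
c = pd.DataFrame(np.random.randint(2, 30, (15, 4)), index=pd.date_range('2018-3-2', periods=15, freq='M'), columns=list('ABCD'))  # the upper bound 30 is an assumption; 'M' (month end) is spelled 'ME' in pandas 2.2+
d = c.resample('A-DEC').sum()  # annual totals, year ending in December ('YE-DEC' in pandas 2.2+)
e = c.resample('A-MAR').sum()  # annual totals, year ending in March
f = c.resample('Q-DEC').sum()  # quarterly totals, year ending in December
print(c)
print(d)
print(e)
print(f)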
Output: print(c)
A B C D
2018-03-31 4
2018-04-30 3 5 6
2018-05-31 4 8
2018-06-30 3 4
2018-07-31 14 4
2018-08-31 9
2018-09-30 8 2 28
2018-10-31 9 3 4
2018-11-30 6 6
2018-12-31 6
2019-01-31
2019-02-28 5 7
2019-03-31
2019-04-30 9 9
2019-05-31 7 15
Output: print(d)
A B C D
2018-12-31 129 211
2019-12-31 80
Output: print(e)
A B C D
2018-03-31 4
2019-03-31 172 145 159 242
2020-03-31 24
Output: print(f)
             A   B   C   D
2018-03-31  20  29   4  25
2018-06-30  18  13  32  50
2018-09-30  55  49  30  76
2018-12-31  57  38  28  60
2019-03-31  42  45  69  56
2019-06-30  32  26  37  24
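To see which months land in which annual or quarterly bin, the sketch below fills the frame with ones so that every sum is simply a count of months. It passes offset objects (pd.offsets.MonthEnd, YearEnd, QuarterEnd) instead of string aliases; that substitution is an assumption made here so the example does not depend on how a given pandas version spells the aliases.

```python
import numpy as np
import pandas as pd

# Fifteen month-end rows, 2018-03-31 through 2019-05-31, all ones.
idx = pd.date_range('2018-03-02', periods=15, freq=pd.offsets.MonthEnd())
c = pd.DataFrame(np.ones((15, 4), dtype=int), index=idx, columns=list('ABCD'))

# Year ending in December (the 'A-DEC' rule in the text).
year_dec = c.resample(pd.offsets.YearEnd(month=12)).sum()
# Year ending in March (the 'A-MAR' rule in the text).
year_mar = c.resample(pd.offsets.YearEnd(month=3)).sum()
# Quarters ending in March, June, September and December ('Q-DEC').
quarter_dec = c.resample(pd.offsets.QuarterEnd(startingMonth=12)).sum()

print(year_dec)
print(year_mar)
print(quarter_dec)
```

The March year-end splits the same fifteen months into three bins (1, 12 and 2 months), which matches the three rows shown for print(e).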
——————————————————————————————————————————
Stock data period conversion:
c = pd.read_csv('601656', index_col='Date', parse_dates=True)
d = c['adj_close'].resample('W-FRI').ohlc()  # resample the adjusted closing price by week
d['vol'] = c['vol'].resample('W-FRI').sum()  # append the weekly trading volume to d
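Since the CSV file itself is not available here, the same conversion can be tried on a small synthetic frame. The prices and volumes below are invented for illustration; only the column names adj_close and vol and the 'W-FRI' rule come from the code above.

```python
import pandas as pd

# Ten business days: Mon 2018-01-01 through Fri 2018-01-12 (synthetic quotes).
idx = pd.date_range('2018-01-01', periods=10, freq='B')
c = pd.DataFrame({'adj_close': [10, 11, 12, 11, 13, 14, 13, 15, 16, 15],
                  'vol': [100] * 10}, index=idx)
c.index.name = 'Date'

# Weekly bars that close on Friday: first, max, min and last daily price.
weekly = c['adj_close'].resample('W-FRI').ohlc()
# Weekly volume is the sum of the daily volumes in the same bins.
weekly['vol'] = c['vol'].resample('W-FRI').sum()

print(weekly)
```

Both resample calls use the same rule, so the two results share an index and the volume column lines up with the price bars.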