Boosting performance with Cython

Even with my old PC (AMD Athlon II, 3GB RAM), I seldom run into performance issues when running vectorized code. Unfortunately, there are plenty of cases that cannot be easily vectorized, for example the drawdown function. My implementation of it was extremely slow, so I decided to use it as a test case for speeding things up. I'll be using the SPY time series with ~5k samples as test data. Here comes the original version of my drawdown function (as it is now implemented in the TradingWithPython library):

    import pandas as pd

    def drawdown(pnl):
        """
        calculate max drawdown and duration

        Returns:
            drawdown : vector of drawdown values
            duration : vector of drawdown duration
        """
        cumret = pnl

        # running maximum of the cumulative return, grown one element per step
        highwatermark = [0]

        idx = pnl.index
        drawdown = pd.Series(index=idx)
        drawdowndur = pd.Series(index=idx)

        for t in range(1, len(idx)):
            highwatermark.append(max(highwatermark[t-1], cumret[t]))
            drawdown[t] = (highwatermark[t] - cumret[t])
            drawdowndur[t] = (0 if drawdown[t] == 0 else drawdowndur[t-1] + 1)

        return drawdown, drawdowndur

    %timeit drawdown(spy)
    1 loops, best of 3: 1.21 s per loop

Hmm, 1.2 seconds is not too speedy for such a simple function. There are some things here that could be a great drag on performance, such as the list *highwatermark* that is being appended to on each loop iteration. Accessing Series by their index also involves some processing that isn't strictly necessary. Let's take a look at what happens when this function is rewritten to work with NumPy data:

    import numpy as np

    def dd(s):
        '''simple drawdown function'''
        # preallocate all arrays instead of appending inside the loop
        highwatermark = np.zeros(len(s))
        drawdown = np.zeros(len(s))
        drawdowndur = np.zeros(len(s))

        for t in range(1, len(s)):
            highwatermark[t] = max(highwatermark[t-1], s[t])
            drawdown[t] = (highwatermark[t] - s[t])
            drawdowndur[t] = (0 if drawdown[t] == 0 else drawdowndur[t-1] + 1)

        return drawdown, drawdowndur

    %timeit dd(spy.values)
    10 loops, best of 3: 27.9 ms per loop

Well, this is much faster than the original function, approximately a 40x speed increase. Still, there is much room for improvement by moving to compiled code with Cython. Now I rewrite the dd function from above, this time using optimisation tips that I've found in the Cython tutorial.
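
Before rewriting anything, it helps to have a tiny known-answer check that any faster version of dd should also pass. The sketch below repeats dd so that it runs on its own. The input array `toy` is made up for illustration and is not the SPY data.

```python
import numpy as np

def dd(s):
    '''simple drawdown function (same logic as the NumPy version above)'''
    highwatermark = np.zeros(len(s))
    drawdown = np.zeros(len(s))
    drawdowndur = np.zeros(len(s))

    for t in range(1, len(s)):
        # running maximum of the series seen so far
        highwatermark[t] = max(highwatermark[t-1], s[t])
        # distance below the running maximum
        drawdown[t] = highwatermark[t] - s[t]
        # number of consecutive samples spent below the running maximum
        drawdowndur[t] = 0 if drawdown[t] == 0 else drawdowndur[t-1] + 1

    return drawdown, drawdowndur

# hypothetical toy series: rises to 3, dips to 1, then makes a new high at 4
toy = np.array([0.0, 1.0, 3.0, 2.0, 1.0, 4.0])
drawdown, duration = dd(toy)

print(drawdown)   # [0. 0. 0. 1. 2. 0.]
print(duration)   # [0. 0. 0. 1. 2. 0.]
```

Note that the loop starts at t = 1 and the high-water mark starts from 0, so the first sample is never compared. A rewrite should reproduce that behaviour if the results are meant to match exactly.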
Using Cython to improve Python performance