which line code(function) consume more time when doing a backtest?
-
below is the information when I do a backtest. and I will give a simple analysis in the last.
total_use_time is 4.024263381958008 begin to run this params:,now_time is 2020-01-09 16:44:21 这次pre_load数据需要的时间:104.02971696853638 backtest consume time :665.2806780338287,结束时间为:2020-01-09 16:55:26 Wrote profile results to test_cmo.py.lprof Timer unit: 1e-07 s Total time: 665.281 s File: c:\anaconda3\lib\site-packages\backtrader\cerebro.py Function: run at line 1031 Line # Hits Time Per Hit % Time Line Contents ============================================================== 1031 @profile 1032 def run(self, **kwargs): 1033 '''The core method to perform backtesting. Any ``kwargs`` passed to it 1034 will affect the value of the standard parameters ``Cerebro`` was 1035 instantiated with. 1036 1037 If ``cerebro`` has not datas the method will immediately bail out. 1038 1039 It has different return values: 1040 1041 - For No Optimization: a list contanining instances of the Strategy 1042 classes added with ``addstrategy`` 1043 1044 - For Optimization: a list of lists which contain instances of the 1045 Strategy classes added with ``addstrategy`` 1046 ''' 1047 1 36.0 36.0 0.0 self._event_stop = False # Stop is requested 1048 1049 1 14.0 14.0 0.0 if not self.datas: 1050 return [] # nothing can be run 1051 1052 # 根据cerebro传递的参数来设置params的值 1053 1 71.0 71.0 0.0 pkeys = self.params._getkeys() 1054 1 15.0 15.0 0.0 for key, val in kwargs.items(): 1055 if key in pkeys: 1056 setattr(self.params, key, val) 1057 1058 # Manage activate/deactivate object cache 1059 1 48.0 48.0 0.0 linebuffer.LineActions.cleancache() # clean cache 1060 1 54.0 54.0 0.0 indicator.Indicator.cleancache() # clean cache 1061 1062 1 31.0 31.0 0.0 linebuffer.LineActions.usecache(self.p.objcache) # 默认是False 1063 1 27.0 27.0 0.0 indicator.Indicator.usecache(self.p.objcache) # 默认是False 1064 1065 1 13.0 13.0 0.0 self._dorunonce = self.p.runonce # 默认是True 1066 1 11.0 11.0 0.0 self._dopreload = self.p.preload # 默认是True 1067 1 16.0 16.0 0.0 self._exactbars = int(self.p.exactbars) # 默认是False 1068 1069 1 11.0 11.0 0.0 if self._exactbars: 1070 self._dorunonce = False # something is saving memory, no runonce 1071 self._dopreload = self._dopreload and self._exactbars < 1 1072 1073 1 12841.0 12841.0 0.0 self._doreplay = self._doreplay or any(x.replaying for x in self.datas) 1074 1 11.0 11.0 0.0 if self._doreplay: 1075 # preloading is not supported with replay. full timeframe bars 1076 # are constructed in realtime 1077 self._dopreload = False 1078 1079 1 14.0 14.0 0.0 if self._dolive or self.p.live: 1080 # in this case both preload and runonce must be off 1081 self._dorunonce = False 1082 self._dopreload = False 1083 1084 1 14.0 14.0 0.0 self.runwriters = list() 1085 1086 # Add the system default writer if requested 1087 1 12.0 12.0 0.0 if self.p.writer is True: 1088 wr = WriterFile() 1089 self.runwriters.append(wr) 1090 1091 # Instantiate any other writers 1092 1 23.0 23.0 0.0 for wrcls, wrargs, wrkwargs in self.writers: 1093 wr = wrcls(*wrargs, **wrkwargs) 1094 self.runwriters.append(wr) 1095 1096 # Write down if any writer wants the full csv output 1097 1 20.0 20.0 0.0 self.writers_csv = any(map(lambda x: x.p.csv, self.runwriters)) 1098 1099 1 11.0 11.0 0.0 self.runstrats = list() 1100 1101 1 10.0 10.0 0.0 if self.signals: # allow processing of signals 1102 signalst, sargs, skwargs = self._signal_strat 1103 if signalst is None: 1104 # Try to see if the 1st regular strategy is a signal strategy 1105 try: 1106 signalst, sargs, skwargs = self.strats.pop(0) 1107 except IndexError: 1108 pass # Nothing there 1109 else: 1110 if not isinstance(signalst, SignalStrategy): 1111 # no signal ... reinsert at the beginning 1112 self.strats.insert(0, (signalst, sargs, skwargs)) 1113 signalst = None # flag as not presetn 1114 1115 if signalst is None: # recheck 1116 # Still None, create a default one 1117 signalst, sargs, skwargs = SignalStrategy, tuple(), dict() 1118 1119 # Add the signal strategy 1120 self.addstrategy(signalst, 1121 _accumulate=self._signal_accumulate, 1122 _concurrent=self._signal_concurrent, 1123 signals=self.signals, 1124 *sargs, 1125 **skwargs) 1126 1127 1 9.0 9.0 0.0 if not self.strats: # Datas are present, add a strategy 1128 self.addstrategy(Strategy) 1129 1130 1 27.0 27.0 0.0 iterstrats = itertools.product(*self.strats) 1131 1 9.0 9.0 0.0 if not self._dooptimize or self.p.maxcpus == 1: 1132 # If no optimmization is wished ... or 1 core is to be used 1133 # let's skip process "spawning" 1134 2 38.0 19.0 0.0 for iterstrat in iterstrats: 1135 1 6652796857.0 6652796857.0 100.0 runstrat = self.runstrategies(iterstrat) 1136 1 32.0 32.0 0.0 self.runstrats.append(runstrat) 1137 1 12.0 12.0 0.0 if self._dooptimize: 1138 for cb in self.optcbs: 1139 cb(runstrat) # callback receives finished strategy 1140 else: 1141 if self.p.optdatas and self._dopreload and self._dorunonce: 1142 for data in self.datas: 1143 data.reset() 1144 if self._exactbars < 1: # datas can be full length 1145 data.extend(size=self.params.lookahead) # lookahead默认参数是0 1146 data._start() 1147 if self._dopreload: 1148 data.preload() 1149 1150 pool = multiprocessing.Pool(self.p.maxcpus or None) 1151 for r in pool.imap(self, iterstrats): 1152 # print("测试-------------------") 1153 # print(self.iterstrats) 1154 # print(r) 1155 self.runstrats.append(r) 1156 for cb in self.optcbs: 1157 cb(r) # callback receives finished strategy 1158 1159 pool.close() 1160 1161 if self.p.optdatas and self._dopreload and self._dorunonce: 1162 for data in self.datas: 1163 data.stop() 1164 1165 1 12.0 12.0 0.0 if not self._dooptimize: 1166 # avoid a list of list for regular cases 1167 1 16.0 16.0 0.0 return self.runstrats[0] 1168 1169 return self.runstrats Total time: 665.248 s File: c:\anaconda3\lib\site-packages\backtrader\cerebro.py Function: runstrategies at line 1185 Line # Hits Time Per Hit % Time Line Contents ============================================================== 1185 @profile 1186 def runstrategies(self, iterstrat, predata=False): 1187 ''' 1188 Internal method invoked by ``run``` to run a set of strategies 1189 ''' 1190 1 31.0 31.0 0.0 self._init_stcount() 1191 1192 1 14.0 14.0 0.0 self.runningstrats = runstrats = list() 1193 1 13.0 13.0 0.0 for store in self.stores: 1194 store.start() 1195 1196 1 12.0 12.0 0.0 if self.p.cheat_on_open and self.p.broker_coo: 1197 # try to activate in broker 1198 if hasattr(self._broker, 'set_coo'): 1199 self._broker.set_coo(True) 1200 1201 1 11.0 11.0 0.0 if self._fhistory is not None: 1202 self._broker.set_fund_history(self._fhistory) 1203 1204 1 12.0 12.0 0.0 for orders, onotify in self._ohistory: 1205 self._broker.add_order_history(orders, onotify) 1206 1207 1 216.0 216.0 0.0 self._broker.start() 1208 1209 1 14.0 14.0 0.0 for feed in self.feeds: 1210 feed.start() 1211 1212 1 10.0 10.0 0.0 if self.writers_csv: 1213 wheaders = list() 1214 for data in self.datas: 1215 if data.csv: 1216 wheaders.extend(data.getwriterheaders()) 1217 1218 for writer in self.runwriters: 1219 if writer.p.csv: 1220 writer.addheaders(wheaders) 1221 1222 # self._plotfillers = [list() for d in self.datas] 1223 # self._plotfillers2 = [list() for d in self.datas] 1224 1225 1 10.0 10.0 0.0 if not predata: 1226 1 13.0 13.0 0.0 if self.p.load_my_data: 1227 begin_time=time.time() 1228 self.datas = self.load_my_data_from_pickle() 1229 end_time=time.time() 1230 print("从pickle加载数据消耗的时间为:{}".format(end_time-begin_time)) 1231 # assert 0 1232 1 12.0 12.0 0.0 elif self.p.save_my_data: 1233 print("进入策略运行当中,开始加载数据") 1234 begin_time=time.time() 1235 import _pickle as pickle 1236 for data in self.datas: 1237 data.reset() 1238 if self._exactbars < 1: # datas can be full length 1239 data.extend(size=self.params.lookahead) 1240 data._start() 1241 if self._dopreload: 1242 data.preload() 1243 # file_name=iterstrat.__name__+'.pkl' 1244 with open("C:/result/"+"常用期货数据.pkl",'wb') as f: 1245 pickle.dump(self.datas,f) 1246 end_time=time.time() 1247 1248 print("每次pre_load数据需要的时间:{}".format(end_time-begin_time)) 1249 print("加载数据完毕,尝试使用pickle保存数据,再后续使用同样策略的时候进行读取") 1250 print("数据保存到pickle文件完毕,结束测试") 1251 assert 0 1252 else: 1253 1 15.0 15.0 0.0 begin_time=time.time() 1254 2116 54788.0 25.9 0.0 for data in self.datas: 1255 2115 579877.0 274.2 0.0 data.reset() 1256 2115 42506.0 20.1 0.0 if self._exactbars < 1: # datas can be full length 1257 2115 274101.0 129.6 0.0 data.extend(size=self.params.lookahead) 1258 2115 176030065.0 83229.3 2.6 data._start() 1259 2115 49135.0 23.2 0.0 if self._dopreload: 1260 2115 863010067.0 408042.6 13.0 data.preload() 1261 1 36.0 36.0 0.0 end_time=time.time() 1262 1 1717.0 1717.0 0.0 print("这次pre_load数据需要的时间:{}".format(end_time-begin_time)) 1263 # assert 0 1264 2 39.0 19.5 0.0 for stratcls, sargs, skwargs in iterstrat: 1265 1 3878.0 3878.0 0.0 sargs = self.datas + list(sargs) 1266 1 11.0 11.0 0.0 try: 1267 1 18474372.0 18474372.0 0.3 strat = stratcls(*sargs, **skwargs) 1268 except bt.errors.StrategySkipError: 1269 continue # do not add strategy to the mix 1270 1271 1 38.0 38.0 0.0 if self.p.oldsync: 1272 strat._oldsync = True # tell strategy to use old clock update 1273 1 14.0 14.0 0.0 if self.p.tradehistory: 1274 strat.set_tradehistory() 1275 1 17.0 17.0 0.0 runstrats.append(strat) 1276 1277 1 13.0 13.0 0.0 tz = self.p.tz 1278 1 17.0 17.0 0.0 if isinstance(tz, integer_types): 1279 tz = self.datas[tz]._tz 1280 else: 1281 1 66.0 66.0 0.0 tz = tzparse(tz) 1282 1283 1 11.0 11.0 0.0 if runstrats: 1284 # loop separated for clarity 1285 1 20.0 20.0 0.0 defaultsizer = self.sizers.get(None, (None, None, None)) 1286 2 42.0 21.0 0.0 for idx, strat in enumerate(runstrats): 1287 1 18.0 18.0 0.0 if self.p.stdstats: 1288 1 1103015.0 1103015.0 0.0 strat._addobserver(False, observers.Broker) 1289 1 43.0 43.0 0.0 if self.p.oldbuysell: 1290 strat._addobserver(True, observers.BuySell) 1291 else: 1292 1 32.0 32.0 0.0 strat._addobserver(True, observers.BuySell, 1293 1 5677762.0 5677762.0 0.1 barplot=True) 1294 1295 1 55.0 55.0 0.0 if self.p.oldtrades or len(self.datas) == 1: 1296 strat._addobserver(False, observers.Trades) 1297 else: 1298 1 1837714.0 1837714.0 0.0 strat._addobserver(False, observers.DataTrades) 1299 1300 1 38.0 38.0 0.0 for multi, obscls, obsargs, obskwargs in self.observers: 1301 strat._addobserver(multi, obscls, *obsargs, **obskwargs) 1302 1303 1 14.0 14.0 0.0 for indcls, indargs, indkwargs in self.indicators: 1304 strat._addindicator(indcls, *indargs, **indkwargs) 1305 1306 16 530.0 33.1 0.0 for ancls, anargs, ankwargs in self.analyzers: 1307 15 19681825.0 1312121.7 0.3 strat._addanalyzer(ancls, *anargs, **ankwargs) 1308 1309 1 27.0 27.0 0.0 sizer, sargs, skwargs = self.sizers.get(idx, defaultsizer) 1310 1 13.0 13.0 0.0 if sizer is not None: 1311 strat._addsizer(sizer, *sargs, **skwargs) 1312 1313 1 107.0 107.0 0.0 strat._settz(tz) 1314 1 1628369.0 1628369.0 0.0 strat._start() 1315 1316 1 39.0 39.0 0.0 for writer in self.runwriters: 1317 if writer.p.csv: 1318 writer.addheaders(strat.getwriterheaders()) 1319 1320 1 12.0 12.0 0.0 if not predata: 1321 2 25.0 12.5 0.0 for strat in runstrats: 1322 1 60.0 60.0 0.0 strat.qbuffer(self._exactbars, replaying=self._doreplay) 1323 1324 1 13.0 13.0 0.0 for writer in self.runwriters: 1325 writer.start() 1326 1327 # Prepare timers 1328 1 15.0 15.0 0.0 self._timers = [] 1329 1 15.0 15.0 0.0 self._timerscheat = [] 1330 1 15.0 15.0 0.0 for timer in self._pretimers: 1331 # preprocess tzdata if needed 1332 timer.start(self.datas[0]) 1333 1334 if timer.params.cheat: 1335 self._timerscheat.append(timer) 1336 else: 1337 self._timers.append(timer) 1338 1339 1 15.0 15.0 0.0 if self._dopreload and self._dorunonce: 1340 1 23.0 23.0 0.0 if self.p.oldsync: 1341 self._runonce_old(runstrats) 1342 else: 1343 1 5562175637.0 5562175637.0 83.6 self._runonce(runstrats) 1344 else: 1345 if self.p.oldsync: 1346 self._runnext_old(runstrats) 1347 else: 1348 self._runnext(runstrats) 1349 1350 2 57.0 28.5 0.0 for strat in runstrats: 1351 1 1484470.0 1484470.0 0.0 strat._stop() 1352 1353 1 54.0 54.0 0.0 self._broker.stop() 1354 1355 1 13.0 13.0 0.0 if not predata: 1356 2116 26165.0 12.4 0.0 for data in self.datas: 1357 2115 61436.0 29.0 0.0 data.stop() 1358 1359 1 28.0 28.0 0.0 for feed in self.feeds: 1360 feed.stop() 1361 1362 1 15.0 15.0 0.0 for store in self.stores: 1363 store.stop() 1364 1365 1 283595.0 283595.0 0.0 self.stop_writers(runstrats) 1366 1367 1 30.0 30.0 0.0 if self._dooptimize and self.p.optreturn: 1368 # Results can be optimized 1369 results = list() 1370 for strat in runstrats: 1371 for a in strat.analyzers: 1372 a.strategy = None 1373 a._parent = None 1374 for attrname in dir(a): 1375 if attrname.startswith('data'): 1376 setattr(a, attrname, None) 1377 1378 oreturn = OptReturn(strat.params, analyzers=strat.analyzers, strategycls=type(strat)) 1379 results.append(oreturn) 1380 1381 return results 1382 1383 1 13.0 13.0 0.0 return runstrats
-
@tianjixuetu said in which line code(function) consume more time when doing a backtest?:
first,when the cerebro run,the most part consume time is self.runstrategies(iterstrat)
secondly,when I run this strategy,the most part that consumes time is
(data._start(),2.6%)
(data._preload(),13%)
(self._runonce(runstrats),83.6%) -
@tianjixuetu It seems that ,when I add the three part,it equals 99.2%. now , I calculate the time consume on self._runonce(runstrats),to split every line and function,in order to know,how to add speed.
-
Total time: 545.011 s File: c:\anaconda3\lib\site-packages\backtrader\cerebro.py Function: _runonce at line 1693 Line # Hits Time Per Hit % Time Line Contents ============================================================== 1693 @profile 1694 def _runonce(self, runstrats): 1695 ''' 1696 Actual implementation of run in vector mode. 1697 1698 Strategies are still invoked on a pseudo-event mode in which ``next`` 1699 is called for each data arrival 1700 ''' 1701 2 41.0 20.5 0.0 for strat in runstrats: 1702 1 64200997.0 64200997.0 1.2 strat._once() 1703 1 144.0 144.0 0.0 strat.reset() # strat called next by next - reset lines 1704 1705 # The default once for strategies does nothing and therefore 1706 # has not moved forward all datas/indicators/observers that 1707 # were homed before calling once, Hence no "need" to do it 1708 # here again, because pointers are at 0 1709 1 29.0 29.0 0.0 datas = sorted(self.datas, 1710 1 21625.0 21625.0 0.0 key=lambda x: (x._timeframe, x._compression)) 1711 1712 1 7.0 7.0 0.0 while True: 1713 # Check next incoming date in the datas 1714 2399 304098795.0 126760.6 5.6 dts = [d.advance_peek() for d in datas] 1715 2399 868121.0 361.9 0.0 dt0 = min(dts) 1716 2399 34173.0 14.2 0.0 if dt0 == float('inf'): 1717 1 8.0 8.0 0.0 break # no data delivers anything 1718 1719 # Timemaster if needed be 1720 # dmaster = datas[dts.index(dt0)] # and timemaster 1721 2398 143918.0 60.0 0.0 slen = len(runstrats[0]) 1722 5074168 31242492.0 6.2 0.6 for i, dti in enumerate(dts): 1723 5071770 29992588.0 5.9 0.6 if dti <= dt0: 1724 465518 247441547.0 531.5 4.5 datas[i].advance() 1725 # self._plotfillers2[i].append(slen) # mark as fill 1726 else: 1727 # self._plotfillers[i].append(slen) 1728 pass 1729 1730 2398 93741.0 39.1 0.0 self._check_timers(runstrats, dt0, cheat=True) 1731 1732 2398 32843.0 13.7 0.0 if self.p.cheat_on_open: 1733 for strat in runstrats: 1734 strat._oncepost_open() 1735 if self._event_stop: # stop if requested 1736 return 1737 1738 2398 610418040.0 254553.0 11.2 self._brokernotify() 1739 2398 50153.0 20.9 0.0 if self._event_stop: # stop if requested 1740 return 1741 1742 2398 83868.0 35.0 0.0 self._check_timers(runstrats, dt0, cheat=False) 1743 1744 4796 42695.0 8.9 0.0 for strat in runstrats: 1745 2398 4161191006.0 1735275.6 76.4 strat._oncepost(dt0) 1746 2398 85888.0 35.8 0.0 if self._event_stop: # stop if requested 1747 return 1748 1749 2398 69803.0 29.1 0.0 self._next_writers(runstrats)
-
@tianjixuetu 在self._runonce(runstrats)中
代码 在该函数中占用的时间的比例 占用的总时间比例 dts = [d.advance_peek() for d in datas] 5.6% 4.68% datas[i].advance() 4.5% 3.76% self._brokernotify() 11.2% 9.36%
-
Line # Hits Time Per Hit % Time Line Contents ============================================================== 291 @profile 292 def _oncepost(self, dt): 293 2398 97209.0 40.5 0.0 for indicator in self._lineiterators[LineIterator.IndType]: 294 if len(indicator._clock) > len(indicator): 295 indicator.advance() 296 297 2398 27533.0 11.5 0.0 if self._oldsync: 298 # Strategy has not been reset, the line is there 299 self.advance() 300 else: 301 # strategy has been reset to beginning. advance step by step 302 2398 337940.0 140.9 0.0 self.forward() 303 304 2398 116440.0 48.6 0.0 self.lines.datetime[0] = dt 305 2398 4619271.0 1926.3 0.1 self._notify() 306 307 2398 105163575.0 43854.7 2.5 minperstatus = self._getminperstatus() 308 2398 30777.0 12.8 0.0 if minperstatus < 0: 309 self.next() 310 2398 10568.0 4.4 0.0 elif minperstatus == 0: 311 self.nextstart() # only called for the 1st value 312 else: 313 2398 735079500.0 306538.6 17.6 self.prenext() 314 315 2398 1672866860.0 697609.2 40.0 self._next_analyzers(minperstatus, once=True) 316 2398 1666655645.0 695019.0 39.8 self._next_observers(minperstatus, once=True) 317 318 2398 190217.0 79.3 0.0 self.clear()
-
code percent of total time self._next_analyzers(minperstatus, once=True) 25.55% self._next_observers(minperstatus, once=True) 25.42% self.prenext()/self.next() 11.24% dts = [d.advance_peek() for d in datas] 4.68% datas[i].advance() 3.76% self._brokernotify() 9.36% data._start() 2.6% data._preload() 13% total 95.61%
ok,from the backtest time consume information,it's better to delete self._next_analyzers() and self._next_observers(),if possible,or at last,speed up self._next_analyzers() and self._next_observers() and data._preload()
-
@tianjixuetu hahaha,just as which function consume time,when I prohibit the self._next_observers() and reduce the analyzer,
my every backtest time drop from 280s to 120s. -
Let me try to summarize your findings:
- Over 80% of the time is spent in actual backtesting (as opposed to something like data loading, which is productive but not real backtesting)
Something to be expected, I guess.
-
25% (total or of the 80%, it doesn't matter) is spent in collecting statistics for visualization (i.e.: "observers")
And your answer is:
@tianjixuetu said in which line code(function) consume more time when doing a backtest?:
,when I prohibit the self._next_observers()
Once again, have you tried to read the documentation? Because you can disable the standard observers when invoking
cerebro
(look forstdstats
in the cerebro documentation. It is also used in several samples and blog posts.)And
- 25% (total or of the 80%, it doesn't matter) is spent in collecting statistics for analysis (i.e. "analyzers")
And you answer is
@tianjixuetu said in which line code(function) consume more time when doing a backtest?:
and reduce the analyzer,
backtrader doesn't plug ANY analyzer as a default for some good reason. It would seem you actually don't need many of the analyzers you are plugging in.
And the conclusion is?
-
@backtrader Yes,I use stdstats=False to prohibit the obversers. At first, I add all the analyzers when I optimize my strategy,but,I find,I just need several indicator to test my strategy and all analyzer consume too much time,so I reduce them.
However,I don't use the backtrader's original optimize function,because there is something wrong when I do it,I can't sovle it,so every time,my backtest must consume 13%(data._preload()),is there some ways to pcikle the preloaded data?
my conclusion is that :
- reduce the data you load,the more data,the more time consumes(maybe linear,for example,5000+,10 mins,2500+,5mins+).
- when you optimize,you'd better prohibit the obverser.
- reduce the analyzer you add,may speed up your strategy.
-
However,I don't use the backtrader's original optimize function,because there is something wrong when I do it,I can't sovle it,so every time,my backtest must consume 13%(data._preload()),is there some ways to pcikle the preloaded data?
There is a bug with
optresult=True
in the current backtrader code when using particular analyzers which will result in unnecessary large result objects. As every process needs to pickle its result back to the one main process this main process will become a bottleneck with large result objects. The processes will pile up waiting for their chance to send their results which effectively brings them to a halt.
Not sure if you encountered this situation though. -
optreturn
ofc... I will never get this right... -
@vbs while optimize encountered different situation according different strategy, some very simple may run seccessfully,some complicated may not run,some unique may just walk and have no error reporting.I don't have so much time and experience to debug this.Now,I debug the analyzers,there are some errors.