Backtrader Community

    How to speed up almost 100 times when add data and preload data?

    General Discussion
    • tianjixuetu
      tianjixuetu last edited by

If you don't use many data feeds, you can ignore this post.

As we know, preloading data makes the backtest run faster than not preloading; however, the preload step itself takes a lot of time, so perhaps there is a way to speed it up.

When I load 5000+ futures contracts, every preload costs me 62.5 seconds. Terrible!
(screenshot: time consumed by each preload)
But if we save self.datas to a pickle after cerebro has prepared it, and read it back from the pickle, it takes just 0.66 seconds.
(screenshot: preloading the data from the pickle)
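The effect can be sketched in isolation (the file name, object sizes, and timings below are illustrative stand-ins, not the author's 5000-contract dataset):

```python
# Minimal sketch of the idea: pay the expensive build once, then reload the
# pickled result on later runs. Plain lists stand in for preloaded data feeds.
import os
import pickle
import tempfile
import time

feeds = [list(range(1000)) for _ in range(500)]  # stand-in for preloaded feeds
cache_path = os.path.join(tempfile.gettempdir(), "feeds_cache.pkl")

t0 = time.perf_counter()
with open(cache_path, "wb") as f:
    pickle.dump(feeds, f)
save_s = time.perf_counter() - t0

t0 = time.perf_counter()
with open(cache_path, "rb") as f:
    restored = pickle.load(f)
load_s = time.perf_counter() - t0

assert restored == feeds  # the cached copy is identical
print(f"save: {save_s:.3f}s  load: {load_s:.3f}s")
```

Unpickling skips the CSV parsing, datetime conversion, and line-buffer construction that preload() performs, which is presumably where the speedup comes from.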

The first time, run it so that self.datas is saved:

      cerebro.run(save_my_data=True)
      

After that, you can use the saved file to speed up later runs:

      cerebro.run(load_my_data=True)
      

How is it implemented?

Add a function load_my_data_from_pickle to cerebro, modify the function runstrategies, and add two params.

1. Add the two params:
      params = (
              ('preload', True),
              ('runonce', True),
              ('maxcpus', None),
              ('stdstats', True),
              ('oldbuysell', False),
              ('oldtrades', False),
              ('lookahead', 0),
              ('exactbars', False),
              ('optdatas', True),
              ('optreturn', True),
              ('objcache', False),
              ('live', False),
              ('writer', False),
              ('tradehistory', False),
              ('oldsync', False),
              ('tz', None),
              ('cheat_on_open', False),
              ('broker_coo', True),
              ('quicknotify', False),
        ('load_my_data', False),
        ('save_my_data', False)
          )
      
2. Add the function and modify runstrategies:
def load_my_data_from_pickle(self, path="normal_future_data.pkl"):
        '''Load the previously saved self.datas back from a pickle file'''
        import pickle
        with open(path, "rb") as f:
            my_data = pickle.load(f)
        return my_data
      
          def runstrategies(self, iterstrat, predata=False):
              '''
        Internal method invoked by ``run`` to run a set of strategies
              '''
              self._init_stcount()
      
              self.runningstrats = runstrats = list()
              for store in self.stores:
                  store.start()
      
              if self.p.cheat_on_open and self.p.broker_coo:
                  # try to activate in broker
                  if hasattr(self._broker, 'set_coo'):
                      self._broker.set_coo(True)
      
              if self._fhistory is not None:
                  self._broker.set_fund_history(self._fhistory)
      
              for orders, onotify in self._ohistory:
                  self._broker.add_order_history(orders, onotify)
      
              self._broker.start()
      
              for feed in self.feeds:
                  feed.start()
      
              if self.writers_csv:
                  wheaders = list()
                  for data in self.datas:
                      if data.csv:
                          wheaders.extend(data.getwriterheaders())
      
                  for writer in self.runwriters:
                      if writer.p.csv:
                          writer.addheaders(wheaders)
      
              # self._plotfillers = [list() for d in self.datas]
              # self._plotfillers2 = [list() for d in self.datas]
      
              if not predata:
                  if self.p.load_my_data:
                      # begin_time=time.time()
                      self.datas = self.load_my_data_from_pickle()
                      # end_time=time.time()
                      # print("every time pre_load consume time :{}".format(end_time-begin_time))
                      # assert 0
                  elif self.p.save_my_data:
                      
                      begin_time=time.time()
                      for data in self.datas:
                          data.reset()
                          if self._exactbars < 1:  # datas can be full length
                              data.extend(size=self.params.lookahead)
                          data._start()
                          if self._dopreload:
                              data.preload()
                      end_time=time.time()
                      print("every time pre_load consume time :{}".format(end_time-begin_time))
          
                      import pickle 
                      with open("normal_future_data.pkl",'wb') as f:
                           pickle.dump(self.datas,f)
                      
            assert 0  # intentional stop: data has been saved; rerun with load_my_data=True
                  else:
                      begin_time=time.time()
                      for data in self.datas:
                          data.reset()
                          if self._exactbars < 1:  # datas can be full length
                              data.extend(size=self.params.lookahead)
                          data._start()
                          if self._dopreload:
                              data.preload()
                      end_time=time.time()
                      print("every time pre_load consume time :{}".format(end_time-begin_time))
      
              for stratcls, sargs, skwargs in iterstrat:
                  sargs = self.datas + list(sargs)
                  try:
                      strat = stratcls(*sargs, **skwargs)
                  except bt.errors.StrategySkipError:
                      continue  # do not add strategy to the mix
      
                  if self.p.oldsync:
                      strat._oldsync = True  # tell strategy to use old clock update
                  if self.p.tradehistory:
                      strat.set_tradehistory()
                  runstrats.append(strat)
      
              tz = self.p.tz
              if isinstance(tz, integer_types):
                  tz = self.datas[tz]._tz
              else:
                  tz = tzparse(tz)
      
              if runstrats:
                  # loop separated for clarity
                  defaultsizer = self.sizers.get(None, (None, None, None))
                  for idx, strat in enumerate(runstrats):
                      if self.p.stdstats:
                          strat._addobserver(False, observers.Broker)
                          if self.p.oldbuysell:
                              strat._addobserver(True, observers.BuySell)
                          else:
                              strat._addobserver(True, observers.BuySell,
                                                 barplot=True)
      
                          if self.p.oldtrades or len(self.datas) == 1:
                              strat._addobserver(False, observers.Trades)
                          else:
                              strat._addobserver(False, observers.DataTrades)
      
                      for multi, obscls, obsargs, obskwargs in self.observers:
                          strat._addobserver(multi, obscls, *obsargs, **obskwargs)
      
                      for indcls, indargs, indkwargs in self.indicators:
                          strat._addindicator(indcls, *indargs, **indkwargs)
      
                      for ancls, anargs, ankwargs in self.analyzers:
                          strat._addanalyzer(ancls, *anargs, **ankwargs)
      
                      sizer, sargs, skwargs = self.sizers.get(idx, defaultsizer)
                      if sizer is not None:
                          strat._addsizer(sizer, *sargs, **skwargs)
      
                      strat._settz(tz)
                      strat._start()
      
                      for writer in self.runwriters:
                          if writer.p.csv:
                              writer.addheaders(strat.getwriterheaders())
      
                  if not predata:
                      for strat in runstrats:
                          strat.qbuffer(self._exactbars, replaying=self._doreplay)
      
                  for writer in self.runwriters:
                      writer.start()
      
                  # Prepare timers
                  self._timers = []
                  self._timerscheat = []
                  for timer in self._pretimers:
                      # preprocess tzdata if needed
                      timer.start(self.datas[0])
      
                      if timer.params.cheat:
                          self._timerscheat.append(timer)
                      else:
                          self._timers.append(timer)
      
                  if self._dopreload and self._dorunonce:
                      if self.p.oldsync:
                          self._runonce_old(runstrats)
                      else:
                          self._runonce(runstrats)
                  else:
                      if self.p.oldsync:
                          self._runnext_old(runstrats)
                      else:
                          self._runnext(runstrats)
      
                  for strat in runstrats:
                      strat._stop()
      
              self._broker.stop()
      
              if not predata:
                  for data in self.datas:
                      data.stop()
      
              for feed in self.feeds:
                  feed.stop()
      
              for store in self.stores:
                  store.stop()
      
              self.stop_writers(runstrats)
      
              if self._dooptimize and self.p.optreturn:
                  # Results can be optimized
                  results = list()
                  for strat in runstrats:
                      for a in strat.analyzers:
                          a.strategy = None
                          a._parent = None
                          for attrname in dir(a):
                              if attrname.startswith('data'):
                                  setattr(a, attrname, None)
      
                      oreturn = OptReturn(strat.params, analyzers=strat.analyzers, strategycls=type(strat))
                      results.append(oreturn)
      
                  return results
      
              return runstrats
      

      very good job!!!
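The pattern in the patch above boils down to a generic load-or-build helper (a sketch with illustrative names, not part of backtrader):

```python
import os
import pickle
import tempfile

def load_or_build(path, build):
    """Unpickle the cached object if present; otherwise build it and cache it.
    This mirrors the load_my_data / save_my_data branches in the patch."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    obj = build()
    with open(path, "wb") as f:
        pickle.dump(obj, f)
    return obj

cache = os.path.join(tempfile.gettempdir(), "demo_datas.pkl")
datas = load_or_build(cache, lambda: [list(range(100)) for _ in range(10)])
```

Note the patch uses `assert 0` to abort after saving; a helper like this avoids that by returning the freshly built object on the first run.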

      • notdbestcoiner
        notdbestcoiner @tianjixuetu last edited by

        @tianjixuetu Thanks for sharing. It helped me save over 140sec per run.

        • Sumit Pandey
          Sumit Pandey @tianjixuetu last edited by

@vladisld @ab_trader Can we add this feature to the base code?

          • Ibrahim Chippa
            Ibrahim Chippa last edited by

@tianjixuetu (screenshot of the error)

Can you help me with this error?
I get the same error if I use

    pickle.dump(self.datas, f_point)
            
            • A
              andy last edited by

@tianjixuetu (screenshot of the error)
When I use getdatabyname() to place an order, this error happens. Can you help me? Thanks!

              • tianjixuetu
                tianjixuetu @andy last edited by

@andy Maybe you changed the strategy, so you cannot get the data by name. I hit this problem too; perhaps it only works when the strategy name stays the same.

                • tianjixuetu
                  tianjixuetu @Ibrahim Chippa last edited by

@Ibrahim-Chippa I don't know. It seems pickle cannot handle it.

                  • A
                    andy @tianjixuetu last edited by

                    @tianjixuetu

Hi, I have not changed the strategy. I solved the problem by adding these lines, thanks again!

    with open("data.pkl", "rb") as f:
        self.datas = pickle.load(f)
    for data in self.datas:
        self.datasbyname[data._name] = data
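Andy's workaround can be demonstrated in isolation (SimpleNamespace stands in for a real feed object, which likewise carries a _name attribute; the contract names are made up):

```python
# Standalone demo of the fix: after unpickling self.datas, the name -> feed
# mapping must be rebuilt by hand.
import pickle
from types import SimpleNamespace

datas = [SimpleNamespace(_name="IF2101"), SimpleNamespace(_name="IC2101")]

# Round-trip through pickle, as the patch does with self.datas
restored = pickle.loads(pickle.dumps(datas))

# Rebuild the mapping that cerebro normally fills in during adddata()
datasbyname = {d._name: d for d in restored}

assert datasbyname["IF2101"]._name == "IF2101"
```

The rebuild is needed presumably because only self.datas was pickled, while the existing datasbyname dict still references the feed objects that were replaced, so lookups by name return stale feeds.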
                    
                    • frontline
                      frontline @andy last edited by

I'm utterly confused by this thread, sorry. If you have your datasets stashed away somewhere (loaded via the broker API, then saved as CSV, pickle, etc.), why not do this --

                          for symbol in symbols:
                              df = broker_api.cached_price_history_1day(symbol)
                              data = bt.feeds.PandasData(dataname=df, plot=False, **dkwargs)
                              cerebro.adddata(data, name=symbol)            
                      

(...where my broker_api.cached_price_history_1day() is smart enough to load disk-cached data if no update is needed)?

                        • D
                          dehati_paul last edited by

                          Hi,

This is extremely helpful. The issue I am having is that the pickle file saves only when my data classes are derived from bt.feeds.PandasData; when the classes are derived from bt.feeds.PandasDirectData (to speed up first-time loading) I get the following error. Any insights?

                          _pickle.PicklingError: Can't pickle <class 'pandas.core.frame.Pandas'>: attribute lookup Pandas on pandas.core.frame failed

                          Thanks,

                          AP
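A plausible explanation (my guess, not confirmed against the backtrader source): PandasDirectData walks the DataFrame with itertuples(), and the rows it holds are instances of a namedtuple class named "Pandas" that pandas creates on the fly, so pickle cannot look the class up to serialize them. The failure can be reproduced without pandas:

```python
# Reproduction sketch: pickling an instance of a dynamically created
# namedtuple class fails with the same kind of PicklingError, because
# pickle stores classes by name and cannot find this one at module level.
import pickle
from collections import namedtuple

def make_row():
    # pandas' DataFrame.itertuples builds a namedtuple class called "Pandas"
    # on the fly, much like this local one
    Row = namedtuple("Pandas", ["open", "close"])
    return Row(open=1.0, close=2.0)

row = make_row()
try:
    pickle.dumps(row)
    failed = False
except pickle.PicklingError:
    failed = True

assert failed  # attribute lookup for "Pandas" fails, as in the error above
```

A workaround consistent with the report is to keep the feeds based on bt.feeds.PandasData, whose state apparently pickles cleanly.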

                          • the world
                            the world last edited by

When I use optstrategy, it raises errors.

                            • the world
                              the world @the world last edited by

@the-world I solved this error by modifying the code that uses the pickle in cerebro.py:

                                          if self.p.optdatas and self._dopreload and self._dorunonce:
                                              if self.p.load_my_data:
                                                  begin_time = time.time()
                                                  self.datas = self.load_my_data_from_pickle()
                                                  end_time = time.time()
                                                  print("every time pre_load from pkl consume time :{}".format(end_time - begin_time))
                                              else:
                                                  begin_time = time.time()
                                                  for data in self.datas:
                                                      data.reset()
                                                      if self._exactbars < 1:  # datas can be full length
                                                          data.extend(size=self.params.lookahead)
                                                      data._start()
                                                      if self._dopreload:
                                                          data.preload()
                                                  end_time = time.time()
                                                  print("every time pre_load from raw consume time :{}".format(end_time-begin_time))
                              
                              
                              • the world
                                the world @dehati_paul last edited by

                                @dehati_paul said in How to speed up almost 100 times when add data and preload data?:

                                es only when my data classes are derived from bt.feeds.PandasData, but when the classes are derived from bt.feeds.PandasDirectData (to speed up first time loading) I get the following error. Any insights?
                                _pickle.PicklingError: Can't pickle <class 'pandas.core.frame.Pandas'>: attribute lookup Pa

                                I got the same error. Have you fixed it?

                                • tianjixuetu
                                  tianjixuetu @tianjixuetu last edited by

@tianjixuetu I have given up on this approach to speeding things up; I could not make it work reliably. I will look for a new way to speed up the backtest, maybe numpylines: in Python, numpy may be a good choice when dealing with huge amounts of data.
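The numpy direction can be illustrated with a toy comparison (a generic vectorization sketch, not backtrader's numpylines):

```python
# Compare a per-bar Python loop with one vectorized numpy pass.
# Integer data keeps both results exactly equal.
import time
import numpy as np

xs = [i % 100 for i in range(200_000)]  # stand-in for a long price series

t0 = time.perf_counter()
py_total = sum(x * x for x in xs)       # bar-by-bar Python arithmetic
t_py = time.perf_counter() - t0

arr = np.asarray(xs, dtype=np.int64)
t0 = time.perf_counter()
np_total = int((arr * arr).sum())       # one vectorized pass in C
t_np = time.perf_counter() - t0

assert py_total == np_total
print(f"python: {t_py:.4f}s  numpy: {t_np:.4f}s")
```

On typical hardware the vectorized pass is usually one to two orders of magnitude faster, which is the kind of win a bar-by-bar event loop cannot get for free.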

                                  • K
                                    kevkev last edited by

                                    @tianjixuetu did you find any better way to speed up the back tests?

                                    • CHENGXIN LI
                                      CHENGXIN LI @tianjixuetu last edited by

@tianjixuetu Hi, can we have a little discussion about how to speed up the backtest? Recently I have been trying to find a new way to solve it. My WeChat ID is 13247198760

                                      Copyright © 2016, 2017, 2018, 2019, 2020, 2021 NodeBB Forums | Contributors