Backtrader Community

    Backtrader with a lot of datafeed - trick to reduce data loading time

    General Discussion
    This topic has been deleted. Only users with topic management privileges can see it.
      lampalork last edited by

      Dear All,

I'm currently backtesting a strategy that needs a lot of data feeds. If I understand correctly how backtrader works, one of the first steps cerebro performs (when doing cerebro.run()) is to pre-load the data (assuming preload=True, which is the default setting). This step takes quite a bit of time in my case (maybe 3-4 minutes), but running through the bars afterwards is relatively fast (as fast as Python can be :)). Is there a way I could cache/store things so that I don't have to wait those 3-4 minutes every time I run a backtest? I was thinking about pickling the cerebro object (I know cerebro can be pickled), but I don't know which of cerebro's methods to call to load the data without launching cerebro.run(). Or would you have any other suggestion to shorten this annoying waiting time?

      thanks and regards
      Lamp'
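One common workaround (a sketch only; the cache path and loader below are hypothetical, and this is independent of backtrader itself) is to pickle the parsed bar data rather than the cerebro object, so later runs only pay the cost of unpickling instead of re-parsing every source file:

```python
import os
import pickle
import tempfile

# hypothetical cache location
CACHE_FILE = os.path.join(tempfile.gettempdir(), "bt_feed_cache.pkl")

def parse_feeds():
    """Stand-in for the slow step: parsing many CSV/source files into bars."""
    return {"SYM%d" % i: [float(j) for j in range(10)] for i in range(100)}

def load_feeds_cached():
    # First run pays the parsing cost and writes the cache;
    # subsequent runs just unpickle, which is usually much faster.
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE, "rb") as fh:
            return pickle.load(fh)
    feeds = parse_feeds()
    with open(CACHE_FILE, "wb") as fh:
        pickle.dump(feeds, fh)
    return feeds

feeds = load_feeds_cached()
```

Invalidating the cache when the source files change (e.g. by comparing modification times) is left out for brevity.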

        vladisld last edited by

        Probably not directly related to multiple data feeds speedup - but the following post discussed some ways of speeding up the feed's loading times (including some caching):

        How to speed up backtest

        Thanks
        Vlad

          lampalork last edited by

thanks, I'll take a look. Profiling the backtest is in any case probably a good start.

            backtrader administrators last edited by

            Pickling cerebro won't help.

            The way to achieve what you want is to develop your own data feed which would use pre-loaded data already residing in RAM.

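For what it's worth, one way to read this suggestion is a load-once, in-memory cache kept for the lifetime of the process; a feed.DataBase subclass could then fill its line buffers from the cached objects instead of re-parsing files. A minimal sketch (all names hypothetical):

```python
_FEED_CACHE = {}  # symbol -> parsed bar data, kept in RAM for the process lifetime

def get_feed_data(symbol, loader):
    """Load each symbol at most once; later backtests in the same
    interpreter session reuse the in-memory copy."""
    if symbol not in _FEED_CACHE:
        _FEED_CACHE[symbol] = loader(symbol)
    return _FEED_CACHE[symbol]

load_calls = []

def slow_loader(symbol):
    load_calls.append(symbol)   # track how often we really parse
    return [1.0, 2.0, 3.0]      # stand-in for parsed OHLC data

first = get_feed_data("ES", slow_loader)
second = get_feed_data("ES", slow_loader)
assert first is second          # second call reused the RAM copy
assert load_calls == ["ES"]     # the loader ran exactly once
```

Note this only helps across runs launched from the same interpreter session (e.g. a notebook or a driver script that runs several backtests); a fresh process starts with an empty cache.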
              tianjixuetu @vladisld last edited by

@vladisld thank you again.
@lampalork I also ran into this problem. I load more than 5000 futures contracts and backtest a strategy; it takes nearly 12 minutes, which is a bit long. I also want to find some way to speed things up. We guess the pre-loaded data consumes most of the time, so should we develop something to speed it up, as @backtrader said?

@ as you said, is our own data feed something like this code you wrote before?

```python
import pandas as pd
import backtrader as bt
from backtrader import feed


class PandasDirectData_NumPyLines(feed.DataBase):
    params = (
        ('datetime', 0),
        ('open', 1),
        ('high', 2),
        ('low', 3),
        ('close', 4),
        ('volume', 5),
        ('openinterest', 6),
    )

    datafields = [
        'datetime', 'open', 'high', 'low', 'close', 'volume', 'openinterest'
    ]

    def start(self):
        super(PandasDirectData_NumPyLines, self).start()
        # the DataFrame is passed in as ``dataname``
        self._df = self.p.dataname

    def preload(self):
        # Set the standard datafields - except for datetime
        for datafield in self.datafields[1:]:
            # get the column index
            colidx = getattr(self.params, datafield)

            if colidx < 0:
                # column not present -- skip
                continue

            # hand the whole column to the line buffer in one go,
            # instead of copying bar by bar
            l = getattr(self.lines, datafield)
            l.array = self._df.iloc[:, colidx]

        # datetime is taken from the DataFrame index and converted to
        # backtrader's float representation
        field0 = self.datafields[0]
        dts = pd.to_datetime(self._df.index)
        getattr(self.l, field0).array = dts.map(bt.date2num)

        self._last()
        self.home()
```

This is a strategy on which I run an optimization!

[screenshot: backtest_too_long_time.png]

If we can speed this up, we can give users a better experience!

                backtrader administrators @tianjixuetu last edited by

                @tianjixuetu said in Backtrader with a lot of datafeed - trick to reduce data loading time:

maybe the pre-loaded data consumes most of the time.

                Don't preload the data, run it again and compare the times.

                  tianjixuetu @backtrader last edited by

@backtrader It is very strange!!! When I use preload, it consumes less time than when I don't use it.
[screenshots: load_data.png, preload=True.png, preload=False.png]
                  my main code:

```python
import time

begin_time = time.time()
# ... some code that has nothing to do with this test ...
cerebro.broker.setcash(1000000.0)
cerebro.run(preload=False)
end_time = time.time()
print("preload=False total use time is : {}".format(end_time - begin_time))
```

What happened???

                    backtrader administrators last edited by

The obvious happened. preload=True is saving you time, even if you think you can develop some magical method to speed things up.

                    Your optimization is already using the preloaded data in all processes.

This thread is about keeping the data in memory across backtesting runs, for which you would need a data feed that sources from RAM across instantiations. This means keeping a second process running that holds the data in RAM and gives you a key to access it.
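On Python 3.8+, the standard library's multiprocessing.shared_memory module provides exactly this pattern: a holder process creates a named block, and the block's name is the key another process uses to attach. A minimal sketch (the payload here is a stand-in for the real preloaded bars):

```python
from multiprocessing import shared_memory

# Holder process: create a block in RAM and fill it with the
# (already parsed) feed bytes; the auto-generated name is the key.
payload = b"preloaded-bars"
shm = shared_memory.SharedMemory(create=True, size=len(payload))
shm.buf[:len(payload)] = payload
key = shm.name  # hand this key to the backtesting process

# Backtesting process: attach by key and read straight from RAM,
# with no disk parsing involved.
view = shared_memory.SharedMemory(name=key)
recovered = bytes(view.buf[:len(payload)])
view.close()

# The holder releases the block when no run needs it any more.
shm.close()
shm.unlink()
```

Mapping the raw bytes back into line arrays (e.g. via numpy.frombuffer) is left out; the point is only that the shared block survives as long as the holder process keeps it alive.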

                      vbs @lampalork last edited by

                      @lampalork
                      Which data feed are you using?

                        tianjixuetu @backtrader last edited by

@backtrader @lampalork @vladisld maybe this is a way to speed up: How to speed up almost 100 times when add data and preload data?
