    Help Understanding Replay on Intraday Data/Compression

    • Matt Wilson
      Matt Wilson last edited by

      I see others are having some issues with getting replaydata to work correctly, and I haven't seen a reproducible code example of what I'm trying to accomplish, so figured I'd ask the gods here.

      Question 1 - How exactly does replaydata work? I.e., how does it simulate the bars from the lower timeframe? As an example using the 1 minute timeframe, TradingView's replay feature will start with the open, then go to the next nearest price (whether it be high or low) on the 2nd tick, then will go to the opposite H/L price on the 3rd tick, before finally going to the close price on the 4th. Does BackTrader do this, or does it just simulate the entire 1 minute bar in one shot?
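For readers who want to see the tick path described above in code, here is a minimal sketch of expanding one OHLC bar into four pseudo-ticks (open, nearer extreme, farther extreme, close). It only illustrates the ordering described for TradingView; it is not backtrader code, and the function name `bar_to_ticks`, the even spacing of the ticks across the bar, and the sample prices are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

def bar_to_ticks(start, o, h, l, c, bar_seconds=60):
    """Expand one OHLC bar into four pseudo-ticks: open, nearer extreme, farther extreme, close."""
    # Visit whichever extreme is closer to the open first (ties go to the high).
    if abs(h - o) <= abs(o - l):
        path = [o, h, l, c]
    else:
        path = [o, l, h, c]
    # Assumption: spread the four ticks evenly over the bar (every 15 s for a 1-minute bar).
    step = timedelta(seconds=bar_seconds / 4)
    return [(start + i * step, price) for i, price in enumerate(path)]

# Hypothetical 1-minute bar, not taken from the dataset in this thread.
ticks = bar_to_ticks(datetime(2019, 12, 10, 9, 30), o=304.10, h=304.30, l=304.05, c=304.18)
for ts, price in ticks:
    print(ts.isoformat(), price)
```

Here the low is closer to the open than the high, so the path is open, low, high, close.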

      Question 2 - For argument's sake, let's say I have a 1 minute dataset in CSV format, and I'd like to "replay" it at the 5 minute timeframe. Would I be correct in assuming that the proper syntax would be:

      cerebro.replaydata(data, timeframe=bt.TimeFrame.Minutes, compression=5)

      ... or would extra options be needed to ensure the timeframes line up properly (I read about someone else having issues with this here)? When I run the example strategy:

          def next(self):
              # Simply log the closing price of the series from the reference
              # self.log('Close, %.2f' % self.dataclose[0])
      
              # Check if an order is pending ... if yes, we cannot send a 2nd one
              if self.order:
                  return
      
              # Check if we are in the market
              if not self.position:
      
                  # Not yet ... we MIGHT BUY if ...
                  if self.dataclose[0] > self.sma[0]:
      
                      # BUY, BUY, BUY!!! (with all possible default parameters)
                      self.log('BUY CREATE, %.2f' % self.dataclose[0])
      
                      # Keep track of the created order to avoid a 2nd order
                      self.order = self.buy()
      
              else:
      
                  if self.dataclose[0] < self.sma[0]:
                      # SELL, SELL, SELL!!! (with all possible default parameters)
                      self.log('SELL CREATE, %.2f' % self.dataclose[0])
      
                      # Keep track of the created order to avoid a 2nd order
                      self.order = self.sell()
      

      ...my output looks like this, which seems really off:

      Starting Portfolio Value: 10000.00
      2019-12-10 00:00:01.209599, BUY CREATE, 304.18
      2019-12-10 00:00:01.295999, BUY EXECUTED, Price: 304.12, Cost: 3041.23, Comm 3.04
      2019-12-10 00:00:02.851211, SELL CREATE, 304.22
      2019-12-10 00:00:02.937612, SELL EXECUTED, Price: 304.26, Cost: 3041.23, Comm 3.04
      2019-12-10 00:00:02.937612, OPERATION PROFIT, GROSS 1.35, NET -4.73
      2019-12-10 00:00:04.060820, BUY CREATE, 304.17
      2019-12-10 00:00:04.147221, BUY EXECUTED, Price: 304.19, Cost: 3041.90, Comm 3.04
      2019-12-10 00:00:05.356830, SELL CREATE, 304.25
      2019-12-10 00:00:05.443231, SELL EXECUTED, Price: 304.30, Cost: 3041.90, Comm 3.04
      ...
      2019-12-10 00:00:59.789245, BUY CREATE, 303.31
      2019-12-10 00:00:59.875645, BUY EXECUTED, Price: 303.32, Cost: 3033.20, Comm 3.03
      2019-12-10 00:01:00.221248, SELL CREATE, 303.30
      2019-12-10 23:59:59.999989, SELL EXECUTED, Price: 302.45, Cost: 3033.20, Comm 3.02
      2019-12-10 23:59:59.999989, OPERATION PROFIT, GROSS -8.70, NET -14.76
      2019-12-11 00:00:01.814403, BUY CREATE, 302.26
      2019-12-11 00:00:01.900804, BUY EXECUTED, Price: 302.25, Cost: 3022.47, Comm 3.02
      2019-12-11 00:00:03.024012, SELL CREATE, 302.50
      2019-12-11 00:00:03.110413, SELL EXECUTED, Price: 302.47, Cost: 3022.47, Comm 3.02
      2019-12-11 00:00:03.110413, OPERATION PROFIT, GROSS 2.22, NET -3.82
      

      ...and it goes on and on like this: trades get closed at the end of the day, and wayyy too many trades are placed in the meantime for a simple price-crossing-SMA strategy. (I can post the full code if desired, but without my dataset it might seem like pointless clutter, as I'm just following the documentation strategy.)
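As a side check on the log above, the figures of the second logged round trip can be recomputed by hand. This is only arithmetic on the rounded values printed in the log, so the result can differ from the logged NET by a cent; the position size and commission rate are inferred from those numbers, not taken from the posted code.

```python
# Values copied from the log above (already rounded to 2 decimals there).
buy_price, sell_price = 303.32, 302.45
cost = 3033.20
comm_buy, comm_sell = 3.03, 3.02

size = round(cost / buy_price)           # shares implied by cost / price
gross = (sell_price - buy_price) * size  # price move times position size
net = gross - comm_buy - comm_sell       # both commissions come off the gross
comm_rate = comm_buy / cost              # implied commission rate per side

print(size, round(gross, 2), round(net, 2), round(comm_rate, 4))
```

With roughly 6 in commission per round trip, trades that only capture a few cents of price movement end up net negative, which matches the NET column in the log.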

      Question 3 - With the above two things in mind, let's say I've created my own "tick" dataset as described in Q1 above that looks like this (notice the time stamps):

      50548b2c-c079-4e65-9dac-923427c8c00c-image.png

      ...how could I "replay" this data on the 5 minute timeframe, ensuring the times line up as described in some other posts re: replaydata? Before sending me to the documentation page, I've read it a few times, but there only seems to be direction on the daily/weekly timeframes, but not tick/intraday, so would love some clarification on it.

      Thanks!!! I hope this post can serve as a guide to anyone else looking to work with tick data (unless you can steer me to similar posts that I haven't seen already). Would like to see all 3 questions get answered here.

      • Pierre Cilliers 0
        Pierre Cilliers 0 last edited by

        Hi @Matt-Wilson

        Can you please share the code showing how you import your data into backtrader? For example, what does your data feed look like?

        Similar to this?

        datapath = os.path.join('../../filename.csv')
        data = bt.feeds.GenericCSVData(dataname=datapath,
                                                separator=";",
                                                fromdate=fromdate,
                                                todate=todate,
                                                dtformat=('%Y%m%d'),
                                                tmformat=('%H:%M:%S:%f'),
                                                timeframe=bt.TimeFrame.MicroSeconds,
                                                compression=1,
                                                date=0,
                                                time=1,
                                                open=2,
                                                high=3,
                                                low=2,
                                                close=3,
                                                volume=4,
                                                openinterest=-1
                                                )
        
        • Matt Wilson
          Matt Wilson last edited by

          @Pierre-Cilliers-0

          Certainly, this is what that looks like:

              datapath = './data/SPY1min.csv'
          
              data = bt.feeds.GenericCSVData(
                  dataname=datapath,
                  # fromdate=datetime.datetime(2000, 1, 1),
                  # todate=datetime.datetime(2000, 12, 31),
                  # nullvalue=0.0,
                  reverse=False,
                  dtformat=('%Y-%m-%d  %H:%M:%S'),
                  tmformat=('%H:%M:%S'),
                  datetime=0,
                  open=1,
                  high=2,
                  low=3,
                  close=4,
                  volume=5,
                  openinterest=-1
              )
          
              # Add the Data Feed to Cerebro
              cerebro.replaydata(data, timeframe=bt.TimeFrame.Minutes, compression=5)
          
          • Pierre Cilliers 0
            Pierre Cilliers 0 @Matt Wilson last edited by

            @matt-wilson

            Cool thanks.

            So as I have it, looking at your datetime format, it seems like your data contains seconds (I'm assuming it is in 1-second intervals). Note that you need to TELL backtrader what frequency your data is in when you create the bt.feeds data feed. Therefore you should always (for safety) add the timeframe and compression arguments to your bt.feeds call.

            As I have used replay in the past, I would suggest you try the following (if your data is in 1-second intervals). Side note: if it is in 15-second intervals (as in your previous dataset), then you would still use the same timeframe parameter but change to compression=15:

                datapath = './data/SPY1min.csv'
            
                data = bt.feeds.GenericCSVData(
                    dataname=datapath,
                    reverse=False,
                    dtformat=('%Y-%m-%d %H:%M:%S'),
                    # tmformat=('%H:%M:%S'), # this line might not be necessary?
                    timeframe=bt.TimeFrame.Seconds,
                    compression=1,
                    datetime=0,
                    open=1,
                    high=2,
                    low=3,
                    close=4,
                    volume=5,
                    openinterest=-1
                )
            
                # Add the Data Feed to Cerebro
                cerebro.replaydata(data, timeframe=bt.TimeFrame.Minutes, compression=5) # this will then be stored in variable self.data/self.data0 in 5 minute intervals
            

            A way to test this is to keep two data feeds and compare them when running your strategy: for instance, the 1-second interval data alongside the 5-minute replayed data. To test this, keep the bt.feeds call the same (as above) but replace your replaydata line with:

                    cerebro.replaydata(data, timeframe=bt.TimeFrame.Seconds)  # stored in variable self.data OR self.data0 which is every 1-second interval
                    data.plotinfo.plotmaster = data # ignore: this just ensures that it plots both intervals on the same plot
                    cerebro.replaydata(data, timeframe=bt.TimeFrame.Minutes, compression=5)  # stored in variable self.data1 which is every 5-minute interval
            

            Therefore, in your def next() you can print("1-second interval:", self.data0.close[0], "5-minute interval:", self.data1.close[0]) which will print out the close price of your 1-second interval and your 5 minute interval. The result you will be looking for is that the self.data0.close[0] will change each second but the self.data1.close[0] will stay the same for 5 minutes before changing.

            • Matt Wilson
              Matt Wilson @Pierre Cilliers 0 last edited by

              @pierre-cilliers-0

              Thank you for the suggestions; however, let me back up a second, as there seems to be something wrong with the way my datetimes are showing up in the BT log.

              I'm using a simple 1 minute dataset, taken from AlphaVantage, which looks like this:

              960518fe-309f-411d-93b2-423900deeae6-image.png

              Simple enough, the times shown are in NY time, and show extended trading hours (i.e. 4 AM).

              NOT using replaydata, just simply using adddata, my output for the log looks like this:

              b029d32c-26df-4951-b915-31165a6327de-image.png

              Notice the times are all at midnight, even within the same day, which isn't accurate. Shouldn't this show the datetime, down to the minute, at which each trade was placed?

              I'll show my full code below for reproducibility, but I'm using a 500 period moving average and the simple "price crossing MA" strategy, so there aren't a lot of trades happening. I just need to get this datetime issue sorted out, and then I can accurately implement your suggestions and go from there.

              The dataset that I'm using for these tests can be accessed from my Google Drive link here; run the code below against it to see the same output. Could this be because of how I'm defining the datetime format, even though it looks right? Or are orders really only fulfilled at the end of each day by default?
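One quick way to rule the datetime format in or out is to parse a single row of the CSV by hand before handing the format to backtrader. The sketch below uses the standard library's `datetime.strptime` as a stand-in; that backtrader parses the column the same way is an assumption, the helper name `check_dtformat` is made up for illustration, and the sample row is hypothetical rather than copied from the dataset.

```python
from datetime import datetime

def check_dtformat(sample, dtformat):
    """Return the parsed datetime, or None if the format does not match the sample."""
    try:
        return datetime.strptime(sample, dtformat)
    except ValueError:
        return None

sample = "2019-12-10 04:01:00"  # hypothetical row; substitute a line from your own CSV
print(check_dtformat(sample, "%Y-%m-%d %H:%M"))     # no match: the seconds are left over
print(check_dtformat(sample, "%Y-%m-%d %H:%M:%S"))  # matches
```

If the format parses correctly and the log still shows midnight timestamps, the format string is probably not the cause.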

              Thanks! Once I get this fixed, I will start on your suggestions.

              Code here:

              from __future__ import (absolute_import, division, print_function,
                                      unicode_literals)
              
              import datetime  # For datetime objects
              import os.path  # To manage paths
              import sys  # To find out the script name (in argv[0])
              
              # Import the backtrader platform
              import backtrader as bt
              
              # Create a Strategy
              class TestStrategy(bt.Strategy):
                  params = (('maperiod', 500),)
              
                  def log(self, txt, dt=None):
                      ''' Logging function for this strategy'''
                      dt = dt or self.datas[0].datetime.datetime(0)
                      # Attempting two different print methods here for the current
                      # datetime.
                      print('%s, %s' % (dt.isoformat(), txt))
                      # print('%s, %s' % (dt, txt))
              
                  def __init__(self):
                      # Keep a reference to the "close" line in the data[0] dataseries
                      self.dataclose = self.datas[0].close
                      # To keep track of pending orders and buy price/commission
                      self.order = None
                      self.buyprice = None
                      self.buycomm = None
                      # Add a MovingAverageSimple indicator
                      self.sma = bt.indicators.SimpleMovingAverage(
                          self.datas[0], period=self.params.maperiod)
              
                  def notify_order(self, order):
                      if order.status in [order.Submitted, order.Accepted]:
                          # Buy/Sell order submitted/accepted to/by broker - Nothing to do
                          return
                      # Check if an order has been completed
                      # Attention: broker could reject order if not enough cash
                      if order.status in [order.Completed]:
                          if order.isbuy():
                              self.log(
                                  'BUY EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' %
                                  (order.executed.price,
                                   order.executed.value,
                                   order.executed.comm))
              
                              self.buyprice = order.executed.price
                              self.buycomm = order.executed.comm
                          else:  # Sell
                              self.log('SELL EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' %
                                       (order.executed.price,
                                        order.executed.value,
                                        order.executed.comm))
                          self.bar_executed = len(self)
                      elif order.status in [order.Canceled, order.Margin, order.Rejected]:
                          self.log('Order Canceled/Margin/Rejected')
                      # Write down: no pending order
                      self.order = None
              
                  def notify_trade(self, trade):
                      if not trade.isclosed:
                          return
                      self.log('OPERATION PROFIT, GROSS %.2f, NET %.2f' %
                               (trade.pnl, trade.pnlcomm))
              
                  def next(self):
                      # Simply log the closing price of the series from the reference
                      # self.log('Close, %.2f' % self.dataclose[0])
                      # Check if an order is pending ... if yes, we cannot send a 2nd one
                      if self.order:
                          return
                      # Check if we are in the market
                      if not self.position:
                          # Not yet ... we MIGHT BUY if ...
                          if self.dataclose[0] > self.sma[0]:
                              # BUY, BUY, BUY!!! (with all possible default parameters)
                              self.log('BUY CREATE, %.2f' % self.dataclose[0])
                              # Keep track of the created order to avoid a 2nd order
                              self.order = self.buy()
                      else:
                          if self.dataclose[0] < self.sma[0]:
                              # SELL, SELL, SELL!!! (with all possible default parameters)
                              self.log('SELL CREATE, %.2f' % self.dataclose[0])
                              # Keep track of the created order to avoid a 2nd order
                              self.order = self.sell()
              
              
              if __name__ == '__main__':
                  # Create a cerebro entity
                  cerebro = bt.Cerebro()
              
                  # Add a strategy
                  cerebro.addstrategy(TestStrategy)
              
                  datapath = './data/SPY1min.csv'
              
                  data = bt.feeds.GenericCSVData(
                      dataname=datapath,
                      reverse=False,
                      dtformat=('%Y-%m-%d %H:%M'),
                      datetime=0,
                      open=1,
                      high=2,
                      low=3,
                      close=4,
                      volume=5,
                      openinterest=-1
                  )
              
                  # Add the Data Feed to Cerebro
                  cerebro.adddata(data)
                  # cerebro.replaydata(data, timeframe=bt.TimeFrame.Minutes, compression=15)
              
                  # Set our desired cash start
                  cerebro.broker.setcash(10000.0)
              
                  # Add a FixedSize sizer according to the stake
                  cerebro.addsizer(bt.sizers.FixedSize, stake=1)
              
                  # Set the commission
                  cerebro.broker.setcommission(commission=0.0001)
              
                  # Print out the starting conditions
                  print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())
              
                  # Run over everything
                  cerebro.run()
              
                  # Print out the final result
                  print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())
              
                  # Plot the result
                  cerebro.plot()
              
              • Matt Wilson
                Matt Wilson @Matt Wilson last edited by

                I should also point out the plot looks off, the green and red arrows are not showing on the price the trade was placed at, but way below/above them, and when I try using my 15 second dataset, it's even worse. Not sure if it's relevant or not, but just thought I'd mention it:

                d0c01ad8-515f-44ff-93c2-e4debbc77f18-image.png

                • Matt Wilson
                  Matt Wilson @Matt Wilson last edited by

                  Quick update,

                  With the help of this post re: the 1 minute issue, and this post re: the plotting issue, I've been able to fix these minor problems. I will be fairly busy until Thursday this week, but I will post an update with @Pierre-Cilliers-0 's suggestions after that.

                  • Matt Wilson
                    Matt Wilson @Matt Wilson last edited by

                    Ok last update until Thursday haha

                    So using @Pierre-Cilliers-0 's suggestions, and using my 15 second dataset again, I've added this print line to my next() function:

                    print(self.datas[0].datetime.datetime(0).isoformat(),"15-second close:", self.data0.close[0], "5-minute close:", self.data1.close[0])
                    

                    as well as these lines when defining my replaydata feeds and cerebro, re: the plotting issue (which still isn't perfect, but it's close enough):

                        # Create a cerebro entity
                        cerebro = bt.Cerebro(stdstats=False)
                    
                        # Add a strategy
                        cerebro.addstrategy(TestStrategy)
                    
                        datapath = 'SPY_5_1_5_days_backtest_dataset.csv'
                    
                        data = bt.feeds.GenericCSVData(
                            dataname=datapath,
                            reverse=False,
                            dtformat=('%Y-%m-%d %H:%M:%S'),
                            timeframe=bt.TimeFrame.Seconds, compression=15,
                            datetime=0,
                            open=1,
                            high=2,
                            low=3,
                            close=4,
                            volume=5,
                            openinterest=-1
                        )
                    
                        # Add the Data Feed to Cerebro
                        # cerebro.adddata(data)
                        cerebro.replaydata(data, timeframe=bt.TimeFrame.Seconds) 
                        data.plotinfo.plotmaster = data # ignore
                        cerebro.replaydata(data, timeframe=bt.TimeFrame.Minutes, compression=5) 
                    
                        cerebro.addobserver(
                            bt.observers.BuySell,
                            barplot=True,
                            bardist=0)  # buy / sell arrows
                    

                    Running this code, my 15-second close and my 5-minute close are identical: the 5-minute close == the 15-second close on every bar. I would imagine the 5-minute close should stay at the last fully formed 5-minute bar's close price until the 15-second dataset reaches the next 5-minute boundary, yes? Or is this normal output? Output is below:

                    72b754e5-eabc-42b5-ba4f-de8a078c109c-image.png

                    So I will need to figure this out; in the meantime:

                    ANSWER TO QUESTION 1 in OP
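                    replaydata uses the bars of the lower-timeframe feed as its "ticks": each lower-timeframe bar (through its close) updates the higher-timeframe bar being replayed, rather than an open/high/low/close path being simulated inside each bar.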

                    ANSWER TO QUESTION 2 in OP
                    The correct syntax when replaying 1 minute data up to a 5 minute interval would look like this:

                        data = bt.feeds.GenericCSVData(
                            dataname=datapath,
                            reverse=False,
                            dtformat=('%Y-%m-%d %H:%M:%S'),
                            timeframe=bt.TimeFrame.Minutes, compression=1,
                            datetime=0,
                            open=1,
                            high=2,
                            low=3,
                            close=4,
                            volume=5,
                            openinterest=-1
                        )
                    
                        # Add the Data Feed to Cerebro
                        cerebro.replaydata(data, timeframe=bt.TimeFrame.Minutes, compression=5)
                    

                    Once we get this resampled 5 minute close price thing lining up properly, I can answer question 3. Talk Thursday! Thanks again.

                    • Pierre Cilliers 0
                      Pierre Cilliers 0 last edited by

                      Hi @Matt-Wilson

                      Just a few points I want to bring up. Sorry if I don't respond to everything. I will try my best.

                      First thing first, I assume you are using the 1-minute interval data for your first questions.

                      @matt-wilson said in Help Understanding Replay on Intraday Data/Compression:

                      I'm using a simple 1 minute dataset, taken from AlphaVantage, which looks like this:

                      Note that if your trades are being executed at 23:59:59.999989, it means that backtrader is treating everything as daily bars. This is confirmed if you look at your plot: next to the SPY1min label it says (1 Day), which means that backtrader interprets the data at a daily interval.

                      I have two suggestions for this.
                      First, add the following lines to your bt.feeds:

                      ... ,
                      dtformat=('%Y-%m-%d %H:%M'),
                      timeframe=bt.TimeFrame.Minutes,
                      compression=1, 
                      ... ,
                      

                      This will ensure that Backtrader knows that the input data is in 1-minute intervals (compression=1, timeframe=bt.TimeFrame.Minutes).
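If you are unsure which timeframe and compression describe a CSV, the spacing of the rows can be measured before creating the feed. The following is a standard-library sketch, not part of backtrader: the helper name `infer_bar_spacing` and the sample rows are hypothetical, and the returned unit strings are simply chosen to mirror the attribute names on `bt.TimeFrame`.

```python
from collections import Counter
from datetime import datetime

def infer_bar_spacing(timestamps, dtformat="%Y-%m-%d %H:%M:%S"):
    """Return (unit, compression) for the most common gap between consecutive rows."""
    parsed = [datetime.strptime(t, dtformat) for t in timestamps]
    gaps = Counter(int((b - a).total_seconds()) for a, b in zip(parsed, parsed[1:]))
    if not gaps:
        raise ValueError("need at least two timestamps")
    seconds = gaps.most_common(1)[0][0]
    if seconds <= 0:
        raise ValueError("timestamps must be strictly increasing")
    if seconds % 86400 == 0:
        return "Days", seconds // 86400
    if seconds % 60 == 0:
        return "Minutes", seconds // 60
    return "Seconds", seconds

# Hypothetical rows; the last row of the second list simulates a gap in the data.
minute_rows = ["2019-12-10 04:00:00", "2019-12-10 04:01:00", "2019-12-10 04:02:00"]
quarter_rows = ["2019-12-10 04:00:00", "2019-12-10 04:00:15",
                "2019-12-10 04:00:30", "2019-12-10 04:05:00"]
print(infer_bar_spacing(minute_rows))
print(infer_bar_spacing(quarter_rows))
```

Using the most common gap rather than the first one keeps session breaks and missing rows from skewing the result.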

                      Secondly, I see you are indexing self.datas when assigning your moving average and closing price. You can use self.data/self.data0 instead. self.datas is the list of all data feeds added to cerebro, and self.data/self.data0 are shortcuts for the first one (self.datas[0]), which is your 1-minute interval data, so this makes explicit which feed each line refers to. Therefore try self.dataclose = self.data0.close and self.sma = bt.indicators.SimpleMovingAverage(self.data0, period=self.params.maperiod).

                      • Pierre Cilliers 0
                        Pierre Cilliers 0 @Matt Wilson last edited by

                        @matt-wilson Now touching on your final post.

                        @matt-wilson said in Help Understanding Replay on Intraday Data/Compression:

                        Running this code, my 15-second close and my 5-minute close are identical: the 5-minute close == the 15-second close on every bar. I would imagine the 5-minute close should stay at the last fully formed 5-minute bar's close price until the 15-second dataset reaches the next 5-minute boundary, yes? Or is this normal output? Output is below:

                        I will take the hit for this, sorry.
                        Because a bar's close price is dynamic (it updates continuously until the bar is complete), close[0] of the 15-second and 5-minute feeds will always be the same.
                        To explain this better: at any given point in time (let's say 10 seconds into your execution), the current price is assigned to the close of both your 15-second and 5-minute bars.
                        But note that the open price will show what we are looking for, as the open is static, not dynamic.

                        So please replace your line with the following to check whether replaydata works properly.

                        print(self.datas[0].datetime.datetime(0).isoformat(),"15-second open:", self.data0.open[0], "5-minute open:", self.data1.open[0])
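
The dynamic close versus static open behaviour can also be illustrated without backtrader. Below is a rough pure-Python sketch of how a forming 5-minute bar evolves as 15-second bars arrive; it is not backtrader's implementation, the `replay` function and the sample bars are made up, and it assumes each timestamp marks the start of its bar (backtrader's own boundary handling may differ).

```python
def replay(bars, period):
    """Yield (source_bar, forming_bar) for each source bar, like a replayed feed."""
    forming = None
    for t, o, h, l, c in bars:
        bucket = t // period  # which higher-timeframe bar this source bar belongs to
        if forming is None or forming["bucket"] != bucket:
            # First source bar of a new higher-timeframe bar: it fixes the open.
            forming = {"bucket": bucket, "open": o, "high": h, "low": l, "close": c}
        else:
            # Later source bars extend high/low and overwrite the close.
            forming["high"] = max(forming["high"], h)
            forming["low"] = min(forming["low"], l)
            forming["close"] = c
        yield (t, o, h, l, c), dict(forming)

# Hypothetical 15-second bars: (seconds since session start, open, high, low, close)
bars = [
    (0, 100.0, 100.2, 99.9, 100.1),
    (15, 100.1, 100.4, 100.0, 100.3),
    (30, 100.3, 100.3, 99.8, 99.9),
    (300, 99.9, 100.0, 99.7, 99.8),
]
rows = list(replay(bars, period=300))
for src, big in rows:
    print(src[0], "15s open/close:", src[1], src[4],
          "| 5m open/close:", big["open"], big["close"])
```

In this sketch the 5-minute close equals the 15-second close on every row, while the 5-minute open only changes when a new 5-minute bucket starts.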
                        

                        Chat Thursday mate.

                        • Matt Wilson
                          Matt Wilson @Pierre Cilliers 0 last edited by

                          @pierre-cilliers-0

                          Not a problem! Most of that was me working through it aloud :P

                          Ahhhh yes that makes sense, duh, the close is always moving in real time, of course. Which is good! I like knowing that's what's going on during the backtest anyway. So I think I'm all good now!

                          Your suggestions really helped me understand what's going on with replaydata, and I so appreciate it! So to answer my final question...

                          ANSWER TO QUESTION 3 in OP

                              data = bt.feeds.GenericCSVData(
                                  dataname=datapath,
                                  reverse=False,
                                  dtformat=('%Y-%m-%d %H:%M:%S'),
                                  timeframe=bt.TimeFrame.Seconds, compression=15,
                                  datetime=0,
                                  open=1,
                                  high=2,
                                  low=3,
                                  close=4,
                                  volume=5,
                                  openinterest=-1
                              )
                              # Add the Data Feed to Cerebro
                              # cerebro.adddata(data)
                              cerebro.replaydata(data, timeframe=bt.TimeFrame.Seconds)  # stored in variable self.data OR self.data0: the lower (15-second) timeframe in this case
                              data.plotinfo.plotmaster = data # ignore: this just ensures that it plots both intervals on the same plot
                              cerebro.replaydata(data, timeframe=bt.TimeFrame.Minutes, compression=5)  # stored in variable self.data1 which is every 5-minute interval
                          

                          Be sure to add the compression and timeframe arguments in the bt.feeds.WhateverCSVData() call to tell BT what kind of data it's originally looking at, then you can add your higher timeframe(s) when invoking the replaydata. Then you can access each dataset individually with self.data0 or self.data1 in the strategy class.

                          Thanks again!! Have a great end to the week!

                          • Pierre Cilliers 0
                            Pierre Cilliers 0 @Matt Wilson last edited by

                            @matt-wilson Cool man.

                            Yeah it took me a while to resample/replay my data correctly, but now that it is clear, you can cruise through your backtesting
