    Strange behavior around holidays

      Rapha last edited by

      Hi there,

      I am in the process of switching from Zipline to Backtrader and am analyzing the differences between the two backtesting engines for some algos.

      While the holdings and returns are very close for some periods, there are large variations at some points in time. One source of difference I have identified occurs around holidays, where the algo seems to loop twice on the day preceding the holiday and then behaves in a way I cannot explain shortly after.

      For instance, I have identified this behavior around Good Friday 2016 (25-03-2016), where the sequence is the following:

      • 24-03-2016 --> algo loops twice through next
      • 25-03-2016 --> Good Friday (market closed)
      • 26-03-2016 & 27-03-2016 --> weekend (market closed)
      • 28-03-2016 --> nothing happens
      • 29-03-2016 --> orders are submitted, accepted and executed, but no position is reported for that day in my log
      • 30-03-2016 --> back to normal

      As you'll see in my code, I tried two different options to add a calendar (one commented out), as that seemed to be the issue here, but it did not help.

      Below is my code, which is a slightly modified version of the following script:
      https://github.com/PacktPublishing/Machine-Learning-for-Algorithmic-Trading-Second-Edition/blob/master/08_ml4t_workflow/03_backtesting_with_backtrader.ipynb

      Remarks

      • My data feed does not contain any values for the holidays, but does for every other day when the market is open (28-03-2016 for instance)
      • My signal (predicted column of self.datas) is the output of a simple regression.

      My knowledge of Backtrader is still quite limited, and this issue has had me stuck for a while now. Any help would be highly appreciated. Please let me know if the log for the days mentioned above would help.

      from pathlib import Path
      import csv
      from time import time
      import datetime
      import numpy as np
      import pandas as pd
      import pandas_datareader.data as web
      import matplotlib.pyplot as plt
      import seaborn as sns
      
      import pandas_market_calendars as mcal
      
      import backtrader as bt
      from backtrader.feeds import PandasData
      
      import quantstats as qs
      
      #nyse = mcal.get_calendar('NYSE')
      class NYSE_2016(bt.TradingCalendar):
          params = dict(
              holidays=[
                  datetime.date(2016, 1, 1),
                  datetime.date(2016, 1, 18),
                  datetime.date(2016, 2, 15),
                  datetime.date(2016, 3, 25),
                  datetime.date(2016, 5, 30),
                  datetime.date(2016, 7, 4),
                  datetime.date(2016, 9, 5),
                  datetime.date(2016, 11, 24),
                  datetime.date(2016, 12, 26),
              ]
          )
      
      pd.set_option('display.expand_frame_repr', False)
      np.random.seed(42)
      sns.set_style('darkgrid')
      
      def format_time(t):
          m_, s = divmod(t, 60)
          h, m = divmod(m_, 60)
          return f'{h:>02.0f}:{m:>02.0f}:{s:>02.0f}'
      
      #----------------------------- BACKTRADER SETUP -------------------------------
      class FixedCommisionScheme(bt.CommInfoBase):
          """
          Simple fixed commission scheme for demo
          """
          params = (
              ('commission', .02),
              ('stocklike', True),
              ('commtype', bt.CommInfoBase.COMM_FIXED),
          )
      
          def _getcommission(self, size, price, pseudoexec):
              return abs(size) * self.p.commission
          
      #----------------- DATAFRAME LOADER -----------------
      OHLCV = ['open', 'high', 'low', 'close', 'volume']
      class SignalData(PandasData):
          """
          Define pandas DataFrame structure
          """
          cols = OHLCV + ['predicted']
      
          # create lines
          lines = tuple(cols)
      
          # define parameters
          params = {c: -1 for c in cols}
          params.update({'datetime': None})
          params = tuple(params.items())
          
      #----------------- STRATEGY -----------------
      class MLStrategy(bt.Strategy):
          params = (('n_positions', 20),
                    ('min_positions', 10),
                    ('verbose', False),
                    ('log_file', 'backtest.csv'))
      
          def log(self, txt, dt=None):
              """ Logger for the strategy"""
              dt = dt or self.datas[0].datetime.datetime(0)
              with Path(self.p.log_file).open('a') as f:
                  log_writer = csv.writer(f)
                  log_writer.writerow([dt.isoformat()] + txt.split(','))
      
          def notify_order(self, order):
              if order.status in [order.Submitted, order.Accepted]:
                  if order.status in [order.Submitted]:
                      self.log(f'{order.data._name},SUBMITTED')
                  if order.status in [order.Accepted]:
                      self.log(f'{order.data._name},ACCEPTED')
                  return
      
              if self.p.verbose:
                  if order.status in [order.Completed]:
                      p = order.executed.price
                      if order.isbuy():
                          self.log(f'{order.data._name},BUY executed,{p:.2f}')
                      elif order.issell():
                          self.log(f'{order.data._name},SELL executed,{p:.2f}')
      
                  elif order.status in [order.Canceled]:
                      self.log(f'{order.data._name},Order Canceled')
                  elif order.status in [order.Margin]:
                      self.log(f'{order.data._name},Order Margin')
                  elif order.status in [order.Rejected]:
                      self.log(f'{order.data._name},Order Rejected')
      
          def prenext(self):
              self.next()
      
          def next(self):
              self.log('next')
              today = self.datas[0].datetime.date()
      
              positions = [d._name for d, pos in self.getpositions().items() if pos]
              posdata = [d for d, pos in self.getpositions().items() if pos]   
              
              up, down = {}, {}
              missing = not_missing = 0
              for data in self.datas:
                  if data.datetime.date() == today:
                      if data.predicted[0] > 0:
                          up[data._name] = data.predicted[0]
                      elif data.predicted[0] < 0:
                          down[data._name] = data.predicted[0]
                  
              for ticker in posdata:
                  self.log(f'{ticker._name,self.getposition(data=ticker).size},POSITION')
                  
              shorts = sorted(down, key=down.get)[:self.p.n_positions]
              longs = sorted(up, key=up.get, reverse=True)[:self.p.n_positions]
              n_shorts, n_longs = len(shorts), len(longs)
      
              if n_shorts < self.p.min_positions or n_longs < self.p.min_positions:
                  longs, shorts = [], []
              else:
                  short_target = -1 / n_shorts
                  long_target = 1 / n_longs
                  
              
              for ticker in positions:
                  if ticker not in longs + shorts:
                      self.order_target_percent(data=ticker, target=0)
                      self.log(f'{ticker},CLOSING ORDER CREATED')
                  
              for ticker in shorts:
                  self.order_target_percent(data=ticker, target=short_target)
                  self.log(f'{ticker},SHORT ORDER CREATED')
              for ticker in longs:
                  self.order_target_percent(data=ticker, target=long_target)
                  self.log(f'{ticker},LONG ORDER CREATED')
      
                  
      #CREATE AND CONFIGURE CEREBRO INSTANCE
      cerebro = bt.Cerebro() 
      cash = 1000000
      cerebro.broker.setcash(cash)
      
      #------------------------------ ADD INPUT DATA --------------------------------
      idx = pd.IndexSlice
      data = pd.read_hdf('data.h5', 'backtest_data').sort_index()
      tickers = data.index.get_level_values(0).unique()
      
      for ticker in tickers:
          df = data.loc[idx[ticker, :], :].droplevel('ticker', axis=0)
          df.index.name = 'datetime'
          bt_data = SignalData(dataname=df)
          cerebro.adddata(bt_data, name=ticker)
          
      #---------------------------- RUN STRATEGY BACKTEST ---------------------------
      #cerebro.addcalendar(nyse)
      cerebro.addcalendar(NYSE_2016)
      cerebro.addanalyzer(bt.analyzers.PyFolio, _name='pyfolio')
      cerebro.addstrategy(MLStrategy, n_positions=20, min_positions=10,
                          verbose=True, log_file='backtesting_backtrader_log.csv')
      start = time()
      results = cerebro.run()
      ending_value = cerebro.broker.getvalue()
      duration = time() - start
      
      print(f'Final Portfolio Value: {ending_value:,.2f}')
      print(f'Duration: {format_time(duration)}')
      
      #GET PYFOLIO INPUTS
      pyfolio_analyzer = results[0].analyzers.getbyname('pyfolio')
      returns, positions, transactions, gross_lev = pyfolio_analyzer.get_pf_items()
      
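      # rename_axis returns a new object, so the result has to be assigned
      returns = returns.rename_axis(index={'index':'date'})
      gross_lev = gross_lev.rename_axis(index={'index':'date'})
      positions = positions.rename_axis(index={'Datetime':'date'})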
      
      returns.to_hdf('backtest.h5', 'backtrader/returns')
      positions.to_hdf('backtest.h5', 'backtrader/positions')
      transactions.to_hdf('backtest.h5', 'backtrader/transactions')
      gross_lev.to_hdf('backtest.h5', 'backtrader/gross_lev')
      
      #------------------------------- RUN PYFOLIO ----------------------------------
      returns = pd.read_hdf('backtest.h5', 'backtrader/returns')
      positions = pd.read_hdf('backtest.h5', 'backtrader/positions')
      transactions = pd.read_hdf('backtest.h5', 'backtrader/transactions')
      gross_lev = pd.read_hdf('backtest.h5', 'backtrader/gross_lev')
      
      benchmark = web.DataReader('SP500', 'fred', '2014', '2018').squeeze()
      benchmark = benchmark.pct_change().tz_localize('UTC')
      
      daily_tx = transactions.groupby(level=0)
      longs = daily_tx.value.apply(lambda x: x.where(x>0).sum())
      shorts = daily_tx.value.apply(lambda x: x.where(x<0).sum())
      
      fig, axes = plt.subplots(ncols=2, figsize=(15, 5))
      
      df = returns.to_frame('Strategy').join(benchmark.to_frame('Benchmark (S&P 500)'))
      df.add(1).cumprod().sub(1).plot(ax=axes[0], title='Cumulative Return')
      
      longs.plot(label='Long',ax=axes[1], title='Positions')
      shorts.plot(ax=axes[1], label='Short')
      positions.cash.plot(ax=axes[1], label='PF Value')
      axes[1].legend()
      sns.despine()
      fig.tight_layout()
      
      plt.show()
      plt.close()
      
        ab_trader last edited by

        I don't think you need to add a trading calendar if you don't use resampling and have no data on the holidays.

        Also, there are a lot of things going on in your script (necessary or not), so I would split it into simpler pieces to debug.

        And it is useful to have the output as well.

        • If my answer helped, hit reputation up arrow at lower right corner of the post.
        • Python Debugging With Pdb
        • New to python and bt - check this out
          Rapha @ab_trader last edited by

          @ab_trader Thanks for your feedback.

          I removed from my code whatever was not necessary to reproduce the issue (in my environment at least):

          from pathlib import Path
          import csv
          from time import time
          import numpy as np
          import pandas as pd
          import seaborn as sns
          
          import backtrader as bt
          from backtrader.feeds import PandasData
          
          pd.set_option('display.expand_frame_repr', False)
          np.random.seed(42)
          sns.set_style('darkgrid')
          
          def format_time(t):
              m_, s = divmod(t, 60)
              h, m = divmod(m_, 60)
              return f'{h:>02.0f}:{m:>02.0f}:{s:>02.0f}'
              
          #----------------- DATAFRAME LOADER -----------------
          OHLCV = ['open', 'high', 'low', 'close', 'volume']
          class SignalData(PandasData):
              cols = OHLCV + ['predicted']
          
              lines = tuple(cols)
          
              params = {c: -1 for c in cols}
              params.update({'datetime': None})
              params = tuple(params.items())
              
          #----------------- STRATEGY -----------------
          class MLStrategy(bt.Strategy):
              params = (('n_positions', 20),
                        ('min_positions', 10),
                        ('verbose', False),
                        ('log_file', 'backtest.csv'))
          
              def log(self, txt, dt=None):
                  """ Logger for the strategy"""
                  dt = dt or self.datas[0].datetime.datetime(0)
                  with Path(self.p.log_file).open('a') as f:
                      log_writer = csv.writer(f)
                      log_writer.writerow([dt.isoformat()] + txt.split(','))
          
              def notify_order(self, order):
                  if order.status in [order.Submitted, order.Accepted]:
                      if order.status in [order.Submitted]:
                          self.log(f'{order.data._name},SUBMITTED')
                      if order.status in [order.Accepted]:
                          self.log(f'{order.data._name},ACCEPTED')
                      return
          
                  if self.p.verbose:
                      if order.status in [order.Completed]:
                          p = order.executed.price
                          if order.isbuy():
                              self.log(f'{order.data._name},BUY executed,{p:.2f}')
                          elif order.issell():
                              self.log(f'{order.data._name},SELL executed,{p:.2f}')
          
                      elif order.status in [order.Canceled]:
                          self.log(f'{order.data._name},Order Canceled')
                      elif order.status in [order.Margin]:
                          self.log(f'{order.data._name},Order Margin')
                      elif order.status in [order.Rejected]:
                          self.log(f'{order.data._name},Order Rejected')
          
              def prenext(self):
                  self.next()
          
              def next(self):
                  self.log('next')
                  today = self.datas[0].datetime.date()
          
                  positions = [d._name for d, pos in self.getpositions().items() if pos]
                  posdata = [d for d, pos in self.getpositions().items() if pos]    
                  
                  up, down = {}, {}
                  for data in self.datas:
                      if data.datetime.date() == today:
                          if data.predicted[0] > 0:
                              up[data._name] = data.predicted[0]
                          elif data.predicted[0] < 0:
                              down[data._name] = data.predicted[0]
                      
                  for ticker in posdata:
                      self.log(f'{ticker._name,self.getposition(data=ticker).size},POSITION')
          
                  shorts = sorted(down, key=down.get)[:self.p.n_positions]
                  longs = sorted(up, key=up.get, reverse=True)[:self.p.n_positions]
                  n_shorts, n_longs = len(shorts), len(longs)
                  
                  if n_shorts < self.p.min_positions or n_longs < self.p.min_positions:
                      longs, shorts = [], []
                  else:
                      short_target = -1 / n_shorts
                      long_target = 1 / n_longs
                      
                  
                  for ticker in positions:
                      if ticker not in longs + shorts:
                          self.order_target_percent(data=ticker, target=0)
                          self.log(f'{ticker},CLOSING ORDER CREATED')
                      
                  for ticker in shorts:
                      self.order_target_percent(data=ticker, target=short_target)
                      self.log(f'{ticker},SHORT ORDER CREATED')
                  for ticker in longs:
                      self.order_target_percent(data=ticker, target=long_target)
                      self.log(f'{ticker},LONG ORDER CREATED')
          
          
          cerebro = bt.Cerebro()
          
          cash = 1000000
          
          cerebro.broker.setcash(cash)
          
          #------------------------------ ADD INPUT DATA --------------------------------
          idx = pd.IndexSlice
          data = pd.read_hdf('data.h5', 'backtest_data').sort_index()
          tickers = data.index.get_level_values(0).unique()
          
          for ticker in tickers:
              df = data.loc[idx[ticker, :], :].droplevel('ticker', axis=0)
              df.index.name = 'datetime'
              bt_data = SignalData(dataname=df)
              cerebro.adddata(bt_data, name=ticker)
              
          #---------------------------- RUN STRATEGY BACKTEST ---------------------------
          #cerebro.addcalendar(nyse)
          cerebro.addanalyzer(bt.analyzers.PyFolio, _name='pyfolio')
          cerebro.addstrategy(MLStrategy, n_positions=20, min_positions=10,
                              verbose=True, log_file='backtesting_backtrader_log.csv')
          start = time()
          results = cerebro.run()
          ending_value = cerebro.broker.getvalue()
          duration = time() - start
          

          Unfortunately the log generated from 24-03-2016 to 30-03-2016 is too long to be posted here in its entirety. I just left the first 3 lines of each part of the log (3 first tickers) and replaced the rest with 3 dots. I hope it's still helpful enough.

          2016-03-24T00:00:00,CNX,SUBMITTED
          2016-03-24T00:00:00,DDD,SUBMITTED
          2016-03-24T00:00:00,HOV,SUBMITTED
          ...
          2016-03-24T00:00:00,CNX,ACCEPTED
          2016-03-24T00:00:00,DDD,ACCEPTED
          2016-03-24T00:00:00,HOV,ACCEPTED
          ...
          2016-03-24T00:00:00,CNX,BUY executed,10.27
          2016-03-24T00:00:00,DDD,BUY executed,14.14
          2016-03-24T00:00:00,HOV,SELL executed,38.00
          ...
          2016-03-24T00:00:00,next
          2016-03-24T00:00:00,('AG', -6266),POSITION
          2016-03-24T00:00:00,('AKS', -9768),POSITION
          ...
          2016-03-24T00:00:00,AKS,CLOSING ORDER CREATED
          2016-03-24T00:00:00,AU,CLOSING ORDER CREATED
          2016-03-24T00:00:00,DNR,CLOSING ORDER CREATED
          ...
          2016-03-24T00:00:00,HMY,SHORT ORDER CREATED
          2016-03-24T00:00:00,TECK,SHORT ORDER CREATED
          2016-03-24T00:00:00,VAL,SHORT ORDER CREATED
          ...
          2016-03-24T00:00:00,BTU,LONG ORDER CREATED
          2016-03-24T00:00:00,SALT,LONG ORDER CREATED
          2016-03-24T00:00:00,BHC,LONG ORDER CREATED
          ...
          2016-03-24T00:00:00,AKS,SUBMITTED
          2016-03-24T00:00:00,AU,SUBMITTED
          2016-03-24T00:00:00,DNR,SUBMITTED
          ...
          2016-03-24T00:00:00,AKS,ACCEPTED
          2016-03-24T00:00:00,AU,ACCEPTED
          2016-03-24T00:00:00,DNR,ACCEPTED
          ...
          2016-03-24T00:00:00,AKS,BUY executed,4.19
          2016-03-24T00:00:00,AU,BUY executed,12.73
          2016-03-24T00:00:00,DNR,SELL executed,2.25
          ...
          2016-03-24T00:00:00,next
          2016-03-24T00:00:00,('AG', -5817),POSITION
          2016-03-24T00:00:00,('AMRX', 1200),POSITION
          2016-03-24T00:00:00,('AUY', -13857),POSITION
          ...
          2016-03-24T00:00:00,AG,CLOSING ORDER CREATED
          2016-03-24T00:00:00,AMRX,CLOSING ORDER CREATED
          2016-03-24T00:00:00,AUY,CLOSING ORDER CREATED
          ...
          2016-03-29T00:00:00,AG,SUBMITTED
          2016-03-29T00:00:00,AMRX,SUBMITTED
          2016-03-29T00:00:00,AUY,SUBMITTED
          ...
          2016-03-29T00:00:00,AG,ACCEPTED
          2016-03-29T00:00:00,AMRX,ACCEPTED
          2016-03-29T00:00:00,AUY,ACCEPTED
          ...
          2016-03-29T00:00:00,AG,BUY executed,6.42
          2016-03-29T00:00:00,AMRX,SELL executed,31.22
          2016-03-29T00:00:00,AUY,BUY executed,2.80
          ...
          2016-03-29T00:00:00,next
          2016-03-29T00:00:00,VAL,SHORT ORDER CREATED
          2016-03-29T00:00:00,CDE,SHORT ORDER CREATED
          2016-03-29T00:00:00,AG,SHORT ORDER CREATED
          ...
          2016-03-30T00:00:00,VAL,SUBMITTED
          2016-03-30T00:00:00,CDE,SUBMITTED
          2016-03-30T00:00:00,AG,SUBMITTED
          ...
          2016-03-30T00:00:00,VAL,ACCEPTED
          2016-03-30T00:00:00,CDE,ACCEPTED
          2016-03-30T00:00:00,AG,ACCEPTED
          ...
          2016-03-30T00:00:00,VAL,SELL executed,105.20
          2016-03-30T00:00:00,CDE,SELL executed,5.64
          2016-03-30T00:00:00,AG,SELL executed,6.84
          ...
          2016-03-30T00:00:00,next
          2016-03-30T00:00:00,('AG', -5418),POSITION
          2016-03-30T00:00:00,('AMRX', 1143),POSITION
          2016-03-30T00:00:00,('AUY', -12576),POSITION
          ...
          2016-03-30T00:00:00,AMRX,CLOSING ORDER CREATED
          2016-03-30T00:00:00,AZO,CLOSING ORDER CREATED
          2016-03-30T00:00:00,CYH,CLOSING ORDER CREATED
          ...
          

          I hope it helps.
          Best,
          Rapha

            ab_trader @Rapha last edited by

            @rapha

            I can't use your script since it requires your data feeds, so I ran some tests based on the basic bt scripts. I was able to get two calls of next for one date only when the data feed contains that date twice. So I would check the data feeds around those dates first. I would also check whether the 28-03-2016 date is present.
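A quick way to run the check suggested here is to look for (ticker, date) pairs that occur more than once. The sketch below uses a toy frame rather than the real feed; the level names `ticker` and `date` and the helper `duplicated_bars` are assumptions made for illustration and are not part of the original script.

```python
import pandas as pd

# Toy (ticker, date) MultiIndex frame standing in for the real `data`.
# Level names are assumed; adjust them to match the actual frame.
index = pd.MultiIndex.from_tuples(
    [
        ("AAA", pd.Timestamp("2016-03-23")),
        ("AAA", pd.Timestamp("2016-03-24")),
        ("AAA", pd.Timestamp("2016-03-24")),  # deliberate duplicate bar
        ("BBB", pd.Timestamp("2016-03-23")),
        ("BBB", pd.Timestamp("2016-03-24")),
    ],
    names=["ticker", "date"],
)
toy = pd.DataFrame({"close": [1.0, 1.1, 1.1, 2.0, 2.1]}, index=index)


def duplicated_bars(frame):
    """Return every row whose (ticker, date) pair occurs more than once."""
    return frame[frame.index.duplicated(keep=False)]


dupes = duplicated_bars(toy)
print(dupes)
```

An empty result means no ticker carries the same date twice, which rules out this particular cause.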

              Rapha last edited by

              @ab_trader
              Thanks for the heads up.

              As you suggested, I have investigated the data and, unfortunately, it does not look like duplicated dates are the root cause of the problem in my case.

              On 2016-03-24, I only have unique references:

              len(data.xs('2016-03-24',level = 1, drop_level = False).index)
              Out: 1033
              
              len(data.xs('2016-03-24',level = 1, drop_level = False).index.unique())
              Out: 1033
              

              On 2016-03-28, there are data as well:

              len(data.xs('2016-03-28',level = 1, drop_level = False).index)
              Out: 1030
              
              len(data.xs('2016-03-28',level = 1, drop_level = False).index.unique())
              Out: 1030
              
                ab_trader @Rapha last edited by

                @rapha why is the count for the 24th larger than the count for the 28th? They should be the same, right? How many tickers do you have on the 23rd?
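One way to answer that question is to diff the sets of tickers present on the two dates instead of comparing counts. This is a minimal sketch on toy data; the helper `tickers_on` and the level names `ticker` and `date` are hypothetical and would need to match the real frame.

```python
import pandas as pd

# Toy frame: "CCC" has a bar on the 24th but none on the 28th.
index = pd.MultiIndex.from_tuples(
    [
        ("AAA", pd.Timestamp("2016-03-24")),
        ("AAA", pd.Timestamp("2016-03-28")),
        ("BBB", pd.Timestamp("2016-03-24")),
        ("BBB", pd.Timestamp("2016-03-28")),
        ("CCC", pd.Timestamp("2016-03-24")),
    ],
    names=["ticker", "date"],
)
toy = pd.DataFrame({"close": [1.0, 1.1, 2.0, 2.1, 3.0]}, index=index)


def tickers_on(frame, day):
    """Return the set of tickers that have a bar on the given day."""
    return set(frame.xs(pd.Timestamp(day), level="date").index)


# Tickers present on the 24th that are missing on the 28th.
missing = tickers_on(toy, "2016-03-24") - tickers_on(toy, "2016-03-28")
print(sorted(missing))
```

Listing the names, rather than only the counts, points directly at the feeds worth inspecting.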

                  Rapha @ab_trader last edited by

                  @ab_trader good point. The 23rd has 1033 tickers. I have a total of 1034 tickers in my data feed; all of them are present on some days, but apparently not on every day:

                  len(data.index.get_level_values(0).unique())
                  Out[48]: 1034
                  
                  len(data.xs('2016-03-23',level = 1, drop_level = False).index.get_level_values(0).unique())
                  Out[49]: 1033
                  
                  len(data.xs('2016-03-03',level = 1, drop_level = False).index.get_level_values(0).unique())
                  Out[50]: 1034
                  

                  I will investigate the reason for these differences and post the explanation back here.
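A possible starting point for that investigation is to measure, for each date, what fraction of all tickers has a bar, and then list the tickers that trade on thinly covered dates. The sketch below is run on made-up data; `off_calendar_tickers`, the 0.5 threshold and the level names are assumptions, not something taken from the original script.

```python
import pandas as pd

# Toy data: three tickers share one calendar, while "XXX" trades on
# 2016-03-25 and skips 2016-03-28. All names and dates are illustrative.
bars = {
    "AAA": ["2016-03-24", "2016-03-28", "2016-03-29"],
    "BBB": ["2016-03-24", "2016-03-28", "2016-03-29"],
    "CCC": ["2016-03-24", "2016-03-28", "2016-03-29"],
    "XXX": ["2016-03-24", "2016-03-25", "2016-03-29"],
}
index = pd.MultiIndex.from_tuples(
    [(t, pd.Timestamp(d)) for t, dates in bars.items() for d in dates],
    names=["ticker", "date"],
)
toy = pd.DataFrame({"close": 1.0}, index=index)


def off_calendar_tickers(frame, min_coverage=0.5):
    """List tickers with bars on dates where most other tickers have none."""
    dates = frame.index.get_level_values("date")
    tickers = frame.index.get_level_values("ticker")
    # Share of the whole universe that has a bar on each date.
    coverage = frame.groupby(level="date").size() / tickers.nunique()
    thin_dates = coverage[coverage < min_coverage].index
    return sorted(set(tickers[dates.isin(thin_dates)]))


suspects = off_calendar_tickers(toy)
print(suspects)
```

This only flags tickers that trade on days when the bulk of the universe is closed; a ticker that merely skips a session would need the set comparison between dates instead.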

                    Rapha @ab_trader last edited by

                    @ab_trader: I have looked into my data, and while my feed is supposed to contain only tickers traded on the NYSE, 5 of them were actually traded on European and Asian markets, which follow a slightly different holiday calendar. Removing them solved the problem regarding the two calls of next on 24-03-2016, and my holdings are now very closely aligned in Zipline and Backtrader every day! Thanks for the help.
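The repeated timestamp can be illustrated with a small model in plain Python. It assumes, in line with the resolution above, that the strategy is stepped once per distinct date across all feeds, while the logger in the script stamps each line with the latest bar of `self.datas[0]`. This is a simplified illustration, not Backtrader's actual implementation, and the feed names and the `replay` helper are made up.

```python
from datetime import date


def replay(feeds):
    """Step once per distinct date across all feeds.

    Returns (clock_date, logged_date) pairs, where logged_date is the most
    recent bar of the first feed, mimicking a logger that reads datas[0].
    """
    first = next(iter(feeds))
    first_dates = set(feeds[first])
    timeline = sorted({d for dates in feeds.values() for d in dates})
    steps, logged = [], None
    for clock in timeline:
        if clock in first_dates:
            logged = clock  # the first feed delivered a new bar
        steps.append((clock, logged))
    return steps


# Hypothetical feeds: the second one has a bar on Good Friday 2016.
feeds = {
    "NYSE_TICKER": [date(2016, 3, 24), date(2016, 3, 28), date(2016, 3, 29)],
    "FOREIGN_TICKER": [date(2016, 3, 24), date(2016, 3, 25), date(2016, 3, 29)],
}
steps = replay(feeds)
for clock, logged in steps:
    print(clock, "logged as", logged)
```

In this model the step driven by the foreign bar on 2016-03-25 is logged as 2016-03-24, because the first feed has no new bar that day. It reproduces only the duplicated timestamp, not every detail of the log posted earlier.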

                    I still have some discrepancies in the returns, apparently caused by uneven weighting of my positions in Backtrader, which should follow a simple 1/N weighting scheme. I have not had time to look into this yet, and it is not related to my original question. But if you spot something odd in my code, please let me know.
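To quantify how uneven the weights are, the realized weight of each position can be compared with the 1/N target used in the strategy (`1 / n_longs` for longs, `-1 / n_shorts` for shorts). The sketch below is a diagnostic only and does not explain the cause; the helper `weight_deviation` and all numbers are hypothetical.

```python
def weight_deviation(sizes, prices, portfolio_value):
    """Compare realized position weights with an equal-weight target.

    Returns {ticker: (weight, target, weight - target)} for open positions.
    """
    open_sizes = {t: s for t, s in sizes.items() if s != 0}
    n_longs = sum(1 for s in open_sizes.values() if s > 0)
    n_shorts = sum(1 for s in open_sizes.values() if s < 0)
    report = {}
    for ticker, size in open_sizes.items():
        weight = size * prices[ticker] / portfolio_value
        target = 1 / n_longs if size > 0 else -1 / n_shorts
        report[ticker] = (weight, target, weight - target)
    return report


# Made-up holdings: two longs and one short in a portfolio worth 1000.
sizes = {"AAA": 50, "BBB": 40, "CCC": -100}
prices = {"AAA": 10.0, "BBB": 10.0, "CCC": 10.0}
report = weight_deviation(sizes, prices, portfolio_value=1000.0)
for ticker, (weight, target, diff) in report.items():
    print(f"{ticker}: weight={weight:+.2f} target={target:+.2f} diff={diff:+.2f}")
```

Feeding it the sizes from the POSITION log lines, the closing prices of the same day and the broker value would show which tickers drift furthest from the target.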
