Unable to get closing daily values for SPY



  • I'm attempting to capture closing prices for the day in an IB datafeed, requested at bt.TimeFrame.Days and resampled to the same. As discussed in other threads, it is my goal to capture the closing bar for the day and enter or exit ES based on evaluation of indicator values driven by the SPY data. I've set sessionend= as shown below, yet I am not seeing the bar in my strategy at the 16:00 closing time. Suggestions would be appreciated.

    Setting up data feed as follows:

            # SPY Live data timeframe resampled to 1 Day
            data1 = ibstore.getdata(dataname=args.live_spy, backfill_from=bfdata1,
                                    timeframe=bt.TimeFrame.Days, compression=1, sessionend=dt.time(16, 0))
            cerebro.resampledata(data1, name="SPY-daily", timeframe=bt.TimeFrame.Days, compression=1)
    

    Using the following code in Strategy next() to capture the bar.

            # We only care about ticks on the Daily SPY
            if not len(self.data_spy) > self.len_data_spy:
                return
            else:
                import pdb; pdb.set_trace()
                self.len_data_spy = len(self.data_spy)
    

    At the breakpoint above, I can see the following data:

    (Pdb) self.data_spy.sessionend
    0.625
    (Pdb) self.data_spy.DateTime
    6
    (Pdb) self.data_spy.buflen()
    4219
    (Pdb) self.data_spy.contractdetails.m_tradingHours
    '20170111:0400-2000;20170112:0400-2000'
    (Pdb) self.data_spy.contractdetails.m_timeZoneId
    'EST'
    (Pdb) self.data_spy.contractdetails.m_liquidHours
    '20170111:0930-1600;20170112:0930-1600'
    

  • administrators

    This seems bound to fail:

    if not len(self.data_spy) > self.len_data_spy:
        return
    

    When the len of your data is >= 1 the not turns that to False (consider it 0 for the comparison) and it will never be larger than something which is already >= 1



  • While that logic might not be very intuitive to look at, it does accomplish the goal since I would want to return if it is False.

    I think one possible bug there is that I could miss ticks if setting the self.len_data_spy counter equal to the len(self.data_spy). I've since changed this to be a += 1 counter, but I think the issue may still remain. Will see today when we hit the close.
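
    For reference, Python binds `not` more loosely than comparison operators, so `not a > b` is evaluated as `not (a > b)`. A minimal sketch, with plain integers standing in for `len(self.data_spy)` and `self.len_data_spy` (the function name is only illustrative):

```python
# Plain integers stand in for len(self.data_spy) and self.len_data_spy.
def should_skip(feed_len, seen_len):
    # Parsed by Python as: not (feed_len > seen_len)
    return not feed_len > seen_len


no_new_bar = should_skip(5, 5)  # feed has not grown since the last check
new_bar = should_skip(6, 5)     # feed grew by one bar
print(no_new_bar, new_bar)
```

    The first call returns True (skip this `next()` call) and the second returns False (process the new bar).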


  • administrators

    It seems it is always going to be False and never execute the return.

    Example

    • Assume initialization self.len_data_spy = 0 (not shown above)

    • As soon as len(self.data_spy) > 0 then not len(self.data_spy) -> False

    • And False > self.len_data_spy evaluates to False and you don't return

    • self.len_data_spy is updated and contains for sure a number > 0

    • And you repeat the cycle (not the initialization) with the same consequences

    You are mixing Minutes (for ES) and Days (for the SPY) and according to your narrative, making decisions based on the daily data, operating on the minute data. It is therefore assumed you have an indicator and/or lines operation on SPY, which means next will 1st be called when len(self.data_spy) > 0 evaluates to True, because the indicator/operation has increased the minimum period. (And if this doesn't hold true, then something is buggy in the platform)

    It may be that you have some other initialization value or the reasoning is incorrect, but it really seems like the logic won't actually do anything.



  • I have market data ticking through next() on both 1-minute interval for ES and on Daily interval for SPY.

    My goals are:

    • Ignore ticks coming through every minute for ES
    • Only see the tick for SPY at close of RTH which is 16:00 EST

    Is there a way to see if the tick that is coming through next() is for a particular data source? That would allow me to return if the tick is for self.data_es

    Otherwise, my only option here is to see if len(self.data_spy) is larger than my counter and if not, return.

    so the following is False unless the tick is on self.data_spy:

    if not len(self.data_spy) > self.len_data_spy:  # could be written if self.len_data_spy == len(self.data_spy)
        return  # ignore self.data_es tick
    
    ...do something...
    
    self.len_data_spy += 1
    

    The above logic seems to be OK, except that it lets through one ES tick at startup.
    I'll come back with more detail about what I am doing in the strategy if I fail to get this closing bar in the next 30 minutes.


  • administrators

    Is there a way to see if the tick that is coming through next() is for a particular data source? That would allow me to return if the tick is for self.data_es

    As explained, by checking if the len of a data feed has increased. When two or more data feeds are time aligned a single next call will let you evaluate a change in both data feeds at the same time.
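
    The length check can be sketched without backtrader at all. In the example below, plain lists stand in for data feeds (anything supporting `len()` would do), and `FeedLengthTracker` is a hypothetical helper, not part of the platform:

```python
class FeedLengthTracker:
    """Remember the last seen length of each feed and report which ones grew."""

    def __init__(self, feeds):
        self._feeds = dict(feeds)
        self._seen = {name: len(feed) for name, feed in self._feeds.items()}

    def advanced(self):
        """Return the names of the feeds that delivered a bar since the last call."""
        grown = set()
        for name, feed in self._feeds.items():
            current = len(feed)
            if current > self._seen[name]:
                grown.add(name)
                self._seen[name] = current
        return grown


# Lists stand in for the ES and SPY data feeds.
es, spy = [], []
tracker = FeedLengthTracker({'es': es, 'spy': spy})

es.append(2270.25)            # only ES delivers a bar
first = tracker.advanced()

es.append(2270.50)            # both feeds deliver in the same cycle
spy.append(226.50)
second = tracker.advanced()

third = tracker.advanced()    # nothing new since the previous call
```

    Inside a strategy the same idea would be applied at the top of `next()`, returning early when the SPY feed is not among the feeds that advanced.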



  • Still not getting this closing tick on self.data_spy. I've included most of the Strategy code below to see if there is something I am doing wrong here.

    One thought is that the SPY data is backfilled from static data that has no time information beyond the date. Not sure if that could impact what the IB feed that is supplementing it is storing regarding time. Also, the data in static backfill is daily data and is being augmented with daily timeframe data from IB. Does that show the time for 16:00 close?

    I guess the next debugging approach might be to break in the debugger at specific time for any tick to see what we have. I am open to other ideas as to how to sort this out.

    I've snipped out the code that I do not think is relevant. Happy to provide that if that detail is needed.

    class SampleStrategy(bt.Strategy):
        params = (
            ('live', False),
            ('maperiod', 200),
        )
    
        def log(self, txt, dt=None):
            ...
    
        def __init__(self):
            self.datastatus = False
            self.data_es = self.data0
            self.data_spy = self.data1
    
            # Add a MovingAverageSimple indicator based from SPY
            self.sma = btind.MovingAverageSimple(self.data_spy, period=self.p.maperiod)
    
        def start(self):
            self.len_data_spy = 0
    
        def notify_data(self, data, status, *args, **kwargs):
            if status == data.LIVE:
                self.datastatus = True
    
        def notify_store(self, msg, *args, **kwargs):
            ...
    
        def notify_order(self, order):
            ...
    
        def notify_trade(self, trade):
            ...
    
        def next(self):
            # We only care about ticks on the Daily SPY
            if len(self.data_spy) == self.len_data_spy:
                return
            elif self.len_data_spy == 0:
                self.len_data_spy = (len(self.data_spy) - 1)
    
            if self.order:
                return  # if an order is active, no new orders are allowed
    
            if self.p.live and not self.datastatus:
                return  # if running live and no live data, return
    
            if self.position:  # position is long or short
                if self.position.size < 0 and self.signal_exit_short:
                    self.order = self.close(data=self.data_es)
                    self.log('CLOSE: BUY TO COVER')
                elif self.position.size > 0 and self.signal_exit_long:
                    self.order = self.close(data=self.data_es)
                    self.log('CLOSE: SELL TO COVER')
                else:
                    self.log('NO TRADE EXIT')
    
            if not self.position:  # position is flat
                if self.signal_entry_long:
                    self.order = self.buy(data=self.data_es)
                    self.log('OPEN: BUY LONG')
                elif self.signal_entry_short:
                    self.order = self.sell(data=self.data_es)
                    self.log('OPEN: SELL SHORT')
                else:
                    self.log('NO TRADE ENTRY')
    
            self.len_data_spy += 1
    

  • administrators

    In another thread it was recommended to use replaydata, because it will continuously give you the current daily bar.

    It should be something like this

    data0 = ibstore.getdata(dataname='ES')  # 'ES' is a placeholder for the real contract name
    cerebro.resampledata(data0, timeframe=bt.TimeFrame.Minutes, compression=1)
    
    data1 = ibstore.getdata(dataname='SPY')
    cerebro.replaydata(data1, timeframe=bt.TimeFrame.Days, compression=1)
    

    In the strategy

    def next(self):
    
        if self.data0.datetime.time() >= datetime.time(16, 0):  # session has come to the end
            if self.data1.close[0] == MAGICAL_NUMBER:
                self.buy(data=self.data0)  # buying data0 which is ES, but check done on data1 which is SPY
    

    Rationale:

    • replaydata will give you every tick of the data (SPY in this case) but in a daily bar which is slowly being constructed
    • Because ES keeps ticking at minute level, once it has reached (or gone over) the end of session of SPY you can put your buying logic in place

    Note

    The time in this line needs to be adjusted to the local time zone of ES (information available in m_contractDetails)

        if self.data0.datetime.time() >= datetime.time(16, 0):  # session has come to the end
    

    or as an alternative tell the code to give you the time in EST (aka US/Eastern) timezone. For that timezone the end of the session is for sure 16:00

    import pytz
    EST = pytz.timezone('US/Eastern')
    ...
    ...
    def next(self):
        ...
        if self.data0.datetime.time(tz=EST) >= datetime.time(16, 0):  # session has come to the end
            ...
    


  • Is it also necessary to first call .resampledata() on data1 in your example because it is an IB feed, or is it enough to use .replaydata() instead?

    Also wanted to mention that while digging around in the debugger, I see the following with a very odd value for sessionend= which is not at all what I have set when creating these feeds.

    It is possible this is a bar sourced from the static file... but would still expect the setting to be applied.

    (Pdb) self.data_spy.sessionend
    0.7291666666666667
    (Pdb) self.data_spy.sessionstart
    0.0
    (Pdb) self.data_spy.datetime.time()
    datetime.time(19, 0)
    (Pdb) self.data_spy.datetime.date()
    datetime.date(2017, 1, 11)
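
    The sessionend values printed above look like fractions of a day. That is an assumption based on their magnitude; nothing in this thread confirms how the platform stores them. Under that assumption, a small hypothetical helper converts them back to a clock time:

```python
from datetime import time


def day_fraction_to_time(fraction):
    """Convert a fraction of a day (0.0 to 1.0) to a time, rounded to the second."""
    total_seconds = round(fraction * 86400)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return time(hours % 24, minutes, seconds)


first_value = day_fraction_to_time(0.625)                # value from the first post
second_value = day_fraction_to_time(0.7291666666666667)  # value shown above
print(first_value, second_value)
```

    Read this way, 0.625 corresponds to 15:00 and 0.7291666666666667 to 17:30, neither of which is the 16:00 passed to sessionend=.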
    


  • Looking at ibtest.py I think I have answered my question that the .replaydata() is in place of the .resampledata().

    However, I am unable to get this to run. Continually erroring out when starting the system. Remember that this is also the data source which I am using backfill_from to backfill from local static data since these indicators need several years of data. Not sure if that could be a factor.

      File "backtrader/strategy.py", line 296, in _next
        super(Strategy, self)._next()
      File "backtrader/lineiterator.py", line 236, in _next
        clock_len = self._clk_update()
      File "backtrader/strategy.py", line 285, in _clk_update
        newdlens = [len(d) for d in self.datas]
      File "backtrader/strategy.py", line 285, in <listcomp>
        newdlens = [len(d) for d in self.datas]
      File "backtrader/lineseries.py", line 432, in __len__
        return len(self.lines)
      File "backtrader/lineseries.py", line 199, in __len__
        return len(self.lines[0])
    ValueError: __len__() should return >= 0
    


  • Just to add another data point here:

    In the debugger, when running with .resampledata(), self.data_spy.datetime.time(tz=EST) always reports dt.time(19, 0)

    I've added a break to debugger today to break if it reports something other than dt.time(19,0)

    Will start looking at what might be happening when running .replaydata()



  • Finding the following:

    I have 3 data feeds configured and available in self.datas

    At the point in the code where this is failing, self.datas[0] has no size and the call to len fails.

    The code:

     def _clk_update(self):
            if self._oldsync:
                clk_len = super(Strategy, self)._clk_update()
                self.lines.datetime[0] = max(d.datetime[0]
                                             for d in self.datas if len(d))
                return clk_len
    
            import pdb; pdb.set_trace()
            newdlens = [len(d) for d in self.datas]
            if any(nl > l for l, nl in zip(self._dlens, newdlens)):
                self.forward()
    
            self.lines.datetime[0] = max(d.datetime[0]
                                         for d in self.datas if len(d))
            self._dlens = newdlens
    
            return len(self)
    

    Debugger output:

    > /home/inmate/.virtualenvs/backtrader3/lib/python3.4/site-packages/backtrader/strategy.py(286)_clk_update()
    -> newdlens = [len(d) for d in self.datas]
    (Pdb) self._oldsync
    False
    (Pdb) self.data_spy
    <backtrader.feeds.ibdata.IBData object at 0x810dcfb70>
    (Pdb) self.datas
    [<backtrader.feeds.ibdata.IBData object at 0x810dcf3c8>, <backtrader.feeds.ibdata.IBData object at 0x810dcfb70>, <backtrader.feeds.ibdata.IBData object at 0x810dd6320>]
    (Pdb) len(self.datas)
    3
    (Pdb) self.datas[0]._name
    'ES-minutes'
    (Pdb) self.datas[1]._name
    'SPY-daily'
    (Pdb) self.datas[2]._name
    'ES-daily'
    (Pdb) len(self.datas[0])
    *** ValueError: __len__() should return >= 0
    (Pdb) len(self.datas[1])
    1
    (Pdb) len(self.datas[2])
    1
    

    Code to setup the feed that is failing size check:

     # ES Futures Live data timeframe resampled to 1 Minute
            data0 = ibstore.getdata(dataname=args.live_es, fromdate=fetchfrom,
                                    timeframe=bt.TimeFrame.Minutes, compression=1)
            cerebro.resampledata(data0, name="ES-minutes", timeframe=bt.TimeFrame.Minutes, compression=1)
    

    Removing fromdate= gets past that error. But then the next error...

      File "/home/inmate/.virtualenvs/backtrader3/lib/python3.4/site-packages/backtrader/cerebro.py", line 809, in run
        runstrat = self.runstrategies(iterstrat)
      File "/home/inmate/.virtualenvs/backtrader3/lib/python3.4/site-packages/backtrader/cerebro.py", line 933, in runstrategies
        self._runnext(runstrats)
      File "/home/inmate/.virtualenvs/backtrader3/lib/python3.4/site-packages/backtrader/cerebro.py", line 1166, in _runnext
        dt0 = min((d for i, d in enumerate(dts)
    ValueError: min() arg is an empty sequence
    


  • If I change to use only .replaydata() for all of these feeds, and set exactbars < 1, I can avoid the above crash.

    Setting exactbars = 1 causes a crash in linebuffer.py



    I now find that the replayed feeds, accessed through the names I assigned to self.data0 and self.data1, are delivering Minute data.

    What am I missing?


  • administrators

    @RandyT

    Is it also necessary to first call .resampledata() on data1 in your example because it is an IB feed, or is it enough to use .replaydata() instead?

    Either resampledata or replaydata. They do similar but different things. See docs for Data Replay

    Also wanted to mention that while digging around in the debugger, I see the following with a very odd value for sessionend= which is not at all what I have set when creating these feeds.

    It is possible this is a bar sourced from the static file... but would still expect the setting to be applied.

    The platform tries not to be too intelligent. time(19, 0) for your assets (which seem to be in EST) is time(24, 0), i.e. time(0, 0), in UTC during winter time.

    sessionend will be used by the platform as a hint as to when intraday data has gone over the session to put that extra intraday data in the next bar.
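
    The 19:00 figure can be checked with the standard library alone. This is a sketch that assumes a fixed UTC-5 offset for EST (no daylight saving handling) and a bar stamped at midnight UTC; the date is taken from the debugger output quoted in this thread:

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-in for EST; real code would use a proper tz database entry.
EST_FIXED = timezone(timedelta(hours=-5), 'EST')

# A bar stamped at midnight UTC on 2017-01-12 ...
midnight_utc = datetime(2017, 1, 12, 0, 0, tzinfo=timezone.utc)

# ... is displayed as the previous evening when converted to EST.
as_est = midnight_utc.astimezone(EST_FIXED)
print(as_est.date(), as_est.time())
```

    The converted value falls on 2017-01-11 at 19:00, matching the date and time seen in the debugger.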


  • administrators

    @RandyT There are only partial insights into what's actually running. For example: up until today you had 2 data feeds, and suddenly there are 3. The actual value of fetchfrom may also play a role, since it seems to affect what happens when you have it in place and when you don't.

    With regards to replaydata and the timeframe/compression you get:

    • A replayed data feed will tick very often, but with the same length until the boundary of the timeframe/compression pair is met.

    That means that for a single minute, it may tick 60 times (1 per second). The len(self.datax) value remains constant until you move to the next minute. You are seeing the construction of a 1-minute bar replayed.

    That's why it was the idea above to use it in combination with the 1-minute resampled data for the ES, to make sure that you see the final values of the daily bar of the SPY.
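
    The behaviour described above can be imitated with a toy model. This is not the platform's implementation; `ToyReplayedFeed` and its `tick` method are made up for illustration, showing only that the length stays constant while the current bar keeps changing:

```python
class ToyReplayedFeed:
    """Toy model of a replayed daily feed: one bar per day, updated on every tick."""

    def __init__(self):
        self.bars = []            # completed bars plus the one under construction
        self._current_day = None

    def tick(self, day, price):
        if day != self._current_day:
            # Timeframe boundary crossed: start a new bar, so the length grows.
            self._current_day = day
            self.bars.append({'open': price, 'high': price,
                              'low': price, 'close': price})
        else:
            # Same day: update the bar in place, the length stays the same.
            bar = self.bars[-1]
            bar['high'] = max(bar['high'], price)
            bar['low'] = min(bar['low'], price)
            bar['close'] = price

    def __len__(self):
        return len(self.bars)


feed = ToyReplayedFeed()
for price in (10.0, 12.0, 9.0):
    feed.tick('2017-01-12', price)
len_during_day = len(feed)
bar_during_day = dict(feed.bars[-1])

feed.tick('2017-01-13', 11.0)
len_next_day = len(feed)
```

    Three ticks on the same day leave the length at 1 while high, low and close move; only the first tick of the following day increases the length to 2.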

    Since you seem to be stretching the limits of the platform and no real data feeds run during the weekend, that will give time to prepare a sample and see if some of your reports can be duly reproduced.



  • I added a third datafeed to give me some daily ES data to do position size calculations. I had been using the SPY for this but ultimately I want to use ES.

    fromdate is specifying a 7 hour retrieval start time in an attempt to reduce the startup/backfill times. Calculated as shown below. Seemed to work as expected with .resampledata() but immediately failed when changing to .replaydata() for these feeds.

    fetchfrom = (dt.datetime.now() - timedelta(hours=7))
    

    With some of the changes made today to avoid the crashes, I managed to get to a point where I could run and could print values for self.data_spy (data1) through the day based on timestamps of the ticks, but discovered that rather than the values building on the daily bar for self.data_spy it instead was giving me minute data.

    I will attempt to put together a more simple version over the weekend that will demonstrate some of these issues.

    Thanks again for your help with this.


  • administrators

    Side note following from all of the above:

    sessionend is currently not used to find out the end of a daily bar. The rationale behind:

    • Many real markets keep on delivering ticks after the sessionend

    Example:

    • Even if the official closing time of the Eurostoxx50 future is 22:00CET, the reality is that it will not close until around 22:05CET. Because of the end of day auction which takes place.

      Some platforms deliver that tick later in historical data integrated in the last tick at 22:00CET and some others deliver an extra single tick, usually 5 minutes later (the 5 minutes is a rule of thumb, because it does actually change)

    This is the same as when you consider the different out-of-RTH periods for products like ES.

    A resampled daily bar could be returned earlier by considering the sessionend, but any extra ticks would have to be discarded (or put into the next bar). Of course, the end user could set the sessionend to a time of its choosing to balance when to return the bar and start discarding values.

    Replayed bars, on the other hand, are constantly returned, hence the recommendation to use them combined with a time check.


  • administrators

    @randyt - please see this announcement with regards to synchronizing the resampling of daily bars with the end of the session.

    This should avoid the need to use replaydata



  • Great explanation in the announcement you made. I'll give this change a try.

    One point that I want to make sure is not lost is that on Friday, while looking at the data being returned by the replay to the Daily timeframe, I was seeing minute data being reported for OHLC in bars captured at specific times. I could see this because I was comparing the values to the charts seen in IB for minute data. It was not updating these values from the day, but instead was reporting them for the live data that had been replayed to the Daily timeframe. This issue was also seen in the values that my indicators were reporting, which should have been calculated on the Daily timeframe. Not sure if you looked at this in your work this weekend.

    I am going to revert back to .resampledata() for this system and will give it a try tomorrow. Seems I should be able to do the following in .next()

    if self.data_spy.datetime.time(tz=EST) != dt.time(16, 0):
        return
    
