
Race Condition Bug w/ Orders and Trades



  • Howdy!

    I've been using backtrader for a little while now, and I believe I've stumbled upon a bug when mixing shorts and longs with order_target_value or order_target_percent. I have a base strategy class which contains the following logic:

    def next(self):
        if self.datetime.date(ago=0) >= self.p.date_end:
            self.order_target_percent(target=0.0)
        elif self.datetime.date(ago=0) >= self.p.date_start:
            if self.buysig and self.position.size <= 0:
                print(self.position.price * self.position.size)
                if self.p.log_trades:
                    self.loginfo('Enter Long')
                self.order_target_percent(target=self.p.buy_target)
            elif self.sellsig and self.position.size >= 0:
                print(self.position.price * self.position.size)
                if self.p.log_trades:
                    self.loginfo('Enter Short')
                self.order_target_percent(target=self.p.sell_target)
    

    To be clear, this issue does not come up when either self.p.buy_target or self.p.sell_target is set to 0.0; it only appears when the strategy both longs and shorts. The code which determines self.sellsig and self.buysig is as follows:

    class Storm(Strategy):
    
        params = (
            ("date_start",datetime.datetime(2019, 1, 1).date()),
            ("date_end",datetime.datetime(2019, 12, 31).date()),
            ("buy_target",0.5),
            ("sell_target",0.0),
            ("bar_size",50),
            ("ma_type", "hma"),
            ("ma_period",25),
            ("ma_source", "close")
        )
    
        def __init__(self):
            # map each ma_type to its indicator class instead of one long
            # chained conditional
            ma_classes = {
                "hma": bt.ind.HullMovingAverage,
                "ema": bt.ind.ExponentialMovingAverage,
                "sma": bt.ind.SMA,
                "zlema": bt.ind.ZeroLagExponentialMovingAverage,
                "tema": bt.ind.TripleExponentialMovingAverage,
            }
            if self.p.ma_type in ma_classes:
                ma = ma_classes[self.p.ma_type](period=self.p.ma_period)
            else:
                ma = bt.ind.DicksonMovingAverage(period=self.p.ma_period,
                                                 gainlimit=50, hperiod=7)
            cross = bt.ind.CrossOver(self.data.close, ma)
            self.buysig = cross > 0
            self.sellsig = cross < 0
    

    When I execute a strategy where the self.p.sell_target is -0.5 and self.p.buy_target is 0.5, all the orders get placed, but the trades stop being logged due to what looks to me like a concurrency issue. I was tipped off to this behavior when plotting - if you'll notice, the trades stop appearing past a certain date in this plot.

    I tried this with a few other values and found that the problem is intermittent and does not seem to impact the results of the test. For instance, here's the same strategy but with a self.p.sell_target of -0.55 and self.p.buy_target of 0.55. As shown, the strategy has trades throughout the entire trading period (01/01/18-06/30/20).

    Given that this bug happens somewhat randomly, I believe it to be a race condition in the placing of orders / trades. When logging trades to the terminal with the above code snippet, this is what I see: on 2018-01-10T14:40:00, a short was entered, but the sell order wasn't completed until 2018-01-10T15:45:50, after a long was entered on 2018-01-10T15:20:30.

    As far as I'm aware, this bug is only occurring for this particular strategy above. I haven't been able to run the gamut on other strategies, but it seems like they are not affected by this issue, so far.

    So, if anyone has observed this and may be able to assist, it would be greatly appreciated.



    To add additional color here: I believe this is happening because this particular variant of the strategy trades very frequently. When I increase the ma_period, the issue resolves itself. Something to keep in mind for anyone trying to debug this.



    So, when I decrease the starting cash, this appears to resolve itself. It seems to be related to partial fills, which is strange because all my orders are market orders. Hmm.



    Yeah, it is definitely from a partial fill. I'm not sure why it breaks the trade log, or why the fill is even partial; the volume on the books well exceeds what I'm putting in for my target...



    Your charts show a 1-day data feed, but it seems you are working with intraday data based on your posts. Can you show the piece of code which adds the data feeds?



    Sure, the data I'm feeding in here was constructed from tick data which was sampled every 10 seconds and then converted into renko bars (bars created solely based on price movement). I shape this into a dataframe, which looks as such:

    datetime,open,high,low,close,volume,openinterest
    2018-07-02 13:47:40,6475.0,6660.0,6475.0,6660.0,5916.79,0
    2018-07-10 23:22:50,6475.0,6475.0,6290.0,6290.0,50797.91,0
    2018-07-12 23:18:00,6290.0,6290.0,6105.0,6105.0,13536.71,0
    2018-07-16 09:29:50,6290.0,6475.0,6290.0,6475.0,15851.37,0
    2018-07-16 11:33:50,6475.0,6660.0,6475.0,6660.0,1363.54,0
    2018-07-17 17:38:00,6660.0,6845.0,6660.0,6845.0,11528.71,0
    2018-07-17 17:41:10,6845.0,7030.0,6845.0,7030.0,1176.72,0
    2018-07-17 17:44:50,7030.0,7215.0,7030.0,7215.0,884.67,0
    2018-07-17 18:05:20,7215.0,7400.0,7215.0,7400.0,2078.74,0
    2018-07-20 16:39:00,7400.0,7585.0,7400.0,7585.0,36508.38,0
    2018-07-21 00:53:50,7400.0,7400.0,7215.0,7215.0,6652.38,0
    2018-07-23 02:12:50,7400.0,7585.0,7400.0,7585.0,10676.5,0
    2018-07-23 07:27:30,7585.0,7770.0,7585.0,7770.0,3631.52,0
    2018-07-24 05:22:00,7770.0,7955.0,7770.0,7955.0,11520.29,0
    2018-07-24 11:19:10,7955.0,8140.0,7955.0,8140.0,4012.97,0
    2018-07-24 12:07:40,8140.0,8325.0,8140.0,8325.0,1185.62,0
    2018-07-26 21:39:10,8140.0,8140.0,7955.0,7955.0,33597.03,0
    2018-07-31 12:55:40,7955.0,7955.0,7770.0,7770.0,39759.94,0
    2018-08-03 00:36:10,7585.0,7585.0,7400.0,7400.0,18057.3,0
    2018-08-04 14:20:30,7400.0,7400.0,7215.0,7215.0,13931.18,0
    2018-08-04 14:34:50,7215.0,7215.0,7030.0,7030.0,1050.95,0
    2018-08-06 22:33:40,7030.0,7030.0,6845.0,6845.0,19805.31,0
    2018-08-08 00:10:10,6845.0,6845.0,6660.0,6660.0,12069.17,0
    2018-08-08 04:12:40,6660.0,6660.0,6475.0,6475.0,3689.75,0
    2018-08-08 16:43:40,6475.0,6475.0,6290.0,6290.0,8357.13,0
    2018-08-10 21:05:10,6290.0,6290.0,6105.0,6105.0,28368.19,0
    2018-08-11 17:35:10,6290.0,6475.0,6290.0,6475.0,10946.08,0
    
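    For reference, a frame shaped like the sample above can be loaded with pandas, parsing the datetime column as the index (which is what the feed's datetime=None param expects); this snippet just uses the first two sample rows:

```python
import pandas as pd
from io import StringIO

# first two rows of the sample above
csv = """datetime,open,high,low,close,volume,openinterest
2018-07-02 13:47:40,6475.0,6660.0,6475.0,6660.0,5916.79,0
2018-07-10 23:22:50,6475.0,6475.0,6290.0,6290.0,50797.91,0
"""

# parse the datetime column and make it the index
df = pd.read_csv(StringIO(csv), parse_dates=['datetime'],
                 index_col='datetime')
print(df.shape)  # (2, 6)
```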

    I then feed it into the following DataFeed:

    #!/usr/bin/env python
    # -*- coding: utf-8; py-indent-offset:4 -*-
    
    from __future__ import (absolute_import, division, print_function,
                            unicode_literals)
    
    from backtrader.utils.py3 import filter, string_types, integer_types
    
    from backtrader import date2num
    import backtrader.feed as feed
    
    class PandasData(feed.DataBase):
        '''
        Uses a Pandas DataFrame as the feed source, using indices into column
        names (which can be "numeric")
    
        This means that all parameters related to lines must have numeric
        values as indices into the tuples
    
        Params:
    
          - ``nocase`` (default *True*) case insensitive match of column names
    
        Note:
    
          - The ``dataname`` parameter is a Pandas DataFrame
    
          - Values possible for datetime
    
            - None: the index contains the datetime
            - -1: no index, autodetect column
        - >= 0 or string: specific column identifier
    
          - For other lines parameters
    
            - None: column not present
            - -1: autodetect
        - >= 0 or string: specific column identifier
        '''
    
        params = (
            ('nocase', True),
    
            # Possible values for datetime (must always be present)
            #  None : datetime is the "index" in the Pandas Dataframe
            #  -1 : autodetect position or case-wise equal name
        #  >= 0 : numeric index to the column in the pandas dataframe
            #  string : column name (as index) in the pandas dataframe
            ('datetime', None),
    
            # Possible values below:
            #  None : column not present
            #  -1 : autodetect position or case-wise equal name
        #  >= 0 : numeric index to the column in the pandas dataframe
            #  string : column name (as index) in the pandas dataframe
            ('open', -1),
            ('high', -1),
            ('low', -1),
            ('close', -1),
            ('volume', -1),
            ('openinterest', None),
        )
    
        datafields = [
            'datetime', 'open', 'high', 'low', 'close', 'volume', 'openinterest'
        ]
    
        def __init__(self):
            super(PandasData, self).__init__()
    
            # these "colnames" can be strings or numeric types
            colnames = list(self.p.dataname.columns.values)
            if self.p.datetime is None:
                # datetime is expected as index col and hence not returned
                pass
    
            # try to autodetect if all columns are numeric
            cstrings = filter(lambda x: isinstance(x, string_types), colnames)
            colsnumeric = not len(list(cstrings))
    
            # Where each datafield find its value
            self._colmapping = dict()
    
            # Build the column mappings to internal fields in advance
            for datafield in self.getlinealiases():
                defmapping = getattr(self.params, datafield)
    
                if isinstance(defmapping, integer_types) and defmapping < 0:
                    # autodetection requested
                    for colname in colnames:
                        if isinstance(colname, string_types):
                            if self.p.nocase:
                                found = datafield.lower() == colname.lower()
                            else:
                                found = datafield == colname
    
                            if found:
                                self._colmapping[datafield] = colname
                                break
    
                    if datafield not in self._colmapping:
                        # autodetection requested and not found
                        self._colmapping[datafield] = None
                        continue
                else:
                    # all other cases -- used given index
                    self._colmapping[datafield] = defmapping
    
        def start(self):
            super(PandasData, self).start()
    
            # reset the length with each start
            self._idx = -1
    
            # Transform names (valid for .ix) into indices (good for .iloc)
            if self.p.nocase:
                colnames = [x.lower() for x in self.p.dataname.columns.values]
            else:
                colnames = [x for x in self.p.dataname.columns.values]
    
            for k, v in self._colmapping.items():
                if v is None:
                    continue  # special marker for datetime
                if isinstance(v, string_types):
                    try:
                        if self.p.nocase:
                            v = colnames.index(v.lower())
                        else:
                            v = colnames.index(v)
                    except ValueError as e:
                        defmap = getattr(self.params, k)
                        if isinstance(defmap, integer_types) and defmap < 0:
                            v = None
                        else:
                        raise e  # let the user know something failed
    
                self._colmapping[k] = v
    
        def _load(self):
            self._idx += 1
    
            if self._idx >= len(self.p.dataname):
                # exhausted all rows
                return False
    
            # Set the standard datafields
            for datafield in self.getlinealiases():
                if datafield == 'datetime':
                    continue
    
                colindex = self._colmapping[datafield]
                if colindex is None:
                    # datafield signaled as missing in the stream: skip it
                    continue
    
                # get the line to be set
                line = getattr(self.lines, datafield)
    
            # indexing for pandas: 1st is column, then row
                line[0] = self.p.dataname.iloc[self._idx, colindex]
    
            # datetime conversion
            coldtime = self._colmapping['datetime']
    
            if coldtime is None:
                # standard index in the datetime
                tstamp = self.p.dataname.index[self._idx]
            else:
                # it's in a different column ... use standard column index
                tstamp = self.p.dataname.iloc[self._idx, coldtime]
    
            # convert to float via datetime and store it
            dt = tstamp.to_pydatetime()
            dtnum = date2num(dt)
            self.lines.datetime[0] = dtnum
    
            # Done ... return
            return True
    

    Again, I don't see any issues when long-only or short-only, so inferring that the issue lies with the irregular time spacing of the bars seems pretty strange to me. I am still trying to understand why, after a partial fill, orders continue to be placed but no trades are logged... and I'm even more confused as to why there was ever a partial fill; the volume in these bars well exceeds the order sizes I'm placing.

    Thank you!



    Here is the excerpt from the documentation -

    Notifications may happen even several times for the same status in the case of Order.Partial. This status will not be seen in the backtesting broker (which doesn’t consider volume when matching) but it will for sure be set by real brokers.

    If bt can't execute the full order, it rejects it. This is the default behaviour.

    I guess the behavior you see comes from the way your data feed is passed into bt. My guess is that bt recognizes your data feed as daily, while the bars are actually intraday and vary in length (in seconds), and that this is the cause of what you see. Anyway, based on the buy-sell triangles and the cash / broker-value charts, all your orders are executed. Only the trade chart is off, but that could be a plotting-only issue, for the reason described above.



    While the performance is indeed unaffected, there are analyzers I have in place that use the trade log to calculate certain values of interest (e.g. drawdown, number of trades, etc.). If we are unable to get to the bottom of this, I will just rewrite them to use orders instead.

    Can you explain to me why it would reject an order when there was more than enough volume in the bar to fill the order's size? I looked into the code and was a bit confused by remsize and such; there doesn't seem to be much in the code regarding partial orders. Any direction here is greatly appreciated :)



  • @pomnl said in Race Condition Bug w/ Orders and Trades:

    While the performance is indeed unaffected, there are analyzers which I have in place that utilize the trade log as a means of calculating certain values of interest (e.g. drawdown, number of trades, etc...) - If we are unable to get to the bottom of this, I will just rewrite them to use orders, instead.

    I would go a different way: first, add a regular-bars data feed into bt, then apply renko to it as a filter or indicator (bt has both). Use the renko output as the signal, and issue orders against the first, regular-bars data feed. First, this approach is in line with the way bt is meant to be used; second, it excludes errors caused by using renko bars as the data for order execution. But you choose.

    @pomnl said in Race Condition Bug w/ Orders and Trades:

    Can you explain to me why it would reject an order when there was more than enough volume in a bar to fulfill the size of the order?

    As stated above, bt doesn't consider volume during order execution, so an order will not be rejected due to low volume. I also just noticed that in bt there is no partial execution during a backtest.



    What you say makes sense and removes potential errors due to the discrete price steps incurred by using renko bars as the main data feed. One of the chief reasons for feeding the data in this way is that it reduces the size of the data from GBs to MBs.

    Maybe it would be better for me to instead precalculate indicators for a strategy based on the renko data and feed those in along with the regular bar data. That way I get the signals based on renko and the continuous price data as well.

    I will look into all this and get back to you. Thank you!



  • @pomnl I've had success using this strategy with bulky data.



    @run-out Good to know! Although I am not really looking forward to this refactor :/



    @run-out also, to put this into perspective: I currently run tens of millions of tests across these datasets. I just have to ask, are you able to do that with this approach?



    @ab_trader I am working on implementing your suggested approach while trying to figure out why a partial fill is being executed in my backtesting. The order status is, in fact, order.Partial, but I really cannot figure out how or where it is being set.



  • @pomnl said in Race Condition Bug w/ Orders and Trades:

    I really cannot seem to figure out where the order is being marked as partially filled or how this is even occurring, the order status is, in fact, order.Partial - I can't seem to figure out how / where this is occurring.

    Partial filling can occur with live brokers. Which broker do you use, and how does that broker's data feed work together with your Pandas data feed?



  • @ab_trader BackBroker.

    So, I found the issue. It looks like the backbroker is attempting to fill a fractional order for 7.912682742907102 BTC but is only filling 7.912682742907101. Given that the difference is 0.000000000000001, this looks like a floating point issue. Anywho, I believe I have found it and will see if I can't address it. If I do, I can submit a patch (I'm sure it'll be like, one line) to the repo :)
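    For what it's worth, a last-digit mismatch like that is classic binary floating point; the usual toy example shows both the problem and why rounding to a fixed number of decimals papers over it:

```python
# two routes to "the same" decimal value need not compare equal
# in binary floating point
size = 0.1 + 0.2
print(size == 0.3)  # False
print(size)         # 0.30000000000000004

# rounding both sides to 8 decimals restores equality, which is the
# same trick as rounding the fractional order size
print(round(size, 8) == round(0.3, 8))  # True
```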



    Okay, so no patch needed, but let it be known: if you are going to use fractional sizing with the backbroker, you should round off your sizing in the CommissionInfo.

    Previously, I had:

    class Fractional(bt.CommissionInfo):
    
        def getsize(self, price, cash):
    
            return self.p.leverage * (cash / price)
    

    Now, I have:

    class Fractional(bt.CommissionInfo):
    
        def getsize(self, price, cash):
    
            return round(self.p.leverage * (cash / price), 8)
    

    Learn from my mistakes!

    Thank you @ab_trader for your help, here!



  • @pomnl said in Race Condition Bug w/ Orders and Trades:

    @run-out also, to put this into perspective, I currently run tens of millions of tests across these datasets. Just have to ask if you're able to do that with this strategy?

    I was doing a factor analysis, and the factor part (e.g. the first, precalculated part) didn't change much. I have also used it in other cases where it slowed things down. So I'm not sure how it will work out for you. Good luck.

