Error resampling data



  • Yesterday I updated the code to the latest version in the repository (1.9.29.108), and since then, when I resample data, next() always receives end-of-day data (the last value of the day), even though the feed has intraday data.

    Has anyone experienced the same? I've gone back to my previous version, 1.9.22.105, and everything is working fine.


  • administrators

    timeframe/compression is not being specified during the creation of the data feed



  • I haven't changed anything in my code, just updated the version. Is this something GenericCSVData now needs?
    In fact, if I try to add a 15-minute data feed without resampling, it gets resampled to end-of-day bars...


  • administrators

    Bolts and nuts have been tightened to clear some corner cases producing problems. The default timeframe/compression for data feeds is Days/1. Cases like having a data feed for which the wrong timeframe/compression is specified are being caught.
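
    Before declaring a timeframe/compression for a CSV file, it can help to check the spacing the file actually has. The following is a minimal standard-library sketch, not part of backtrader: the helper name infer_bar_spacing_minutes is hypothetical, and the sample timestamps assume 30-minute bars with an overnight gap.

```python
from collections import Counter
from datetime import datetime


def infer_bar_spacing_minutes(timestamps, fmt='%Y-%m-%d %H:%M:%S'):
    """Return the most common gap, in minutes, between consecutive timestamps."""
    parsed = [datetime.strptime(ts, fmt) for ts in timestamps]
    gaps = Counter(
        int((later - earlier).total_seconds() // 60)
        for earlier, later in zip(parsed, parsed[1:])
    )
    # The overnight gap shows up once per day, so the most common gap is the bar size
    return gaps.most_common(1)[0][0]


# Hypothetical sample: 30-minute bars spanning one session change
sample = [
    '2008-02-04 15:00:00',
    '2008-02-04 15:30:00',
    '2008-02-05 10:00:00',
    '2008-02-05 10:30:00',
]
spacing = infer_bar_spacing_minutes(sample)
print(spacing)  # 30
```

    A result of 30 suggests declaring Minutes/30 for the feed, while a result of 1440 would match the Days/1 default.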



  • I seem to be getting a similar error to ealvarpe above. I have also tried passing the timeframe and compression parameters when creating the generic CSV feed. I then add the resampled data through the resampledata method.

    The base data is as follows, with the close in the second-to-last column of the CSV file:
    2008-02-04 10:00:00,6.0,9582.45,9492.7,9560.7,9327.05
    2008-02-04 10:30:00,36.0,9639.1,9556.85,9598.7,9574.8
    2008-02-04 11:00:00,66.0,9607.1,9575.7,9588.4,9603.55
    2008-02-04 11:30:00,96.0,9633.15,9588.15,9614.3,9587.3
    2008-02-04 12:00:00,126.0,9634.35,9596.9,9604.95,9631.35
    2008-02-04 12:30:00,156.0,9653.9,9596.6,9634.95,9608.55
    2008-02-04 13:00:00,186.0,9666.0,9622.25,9653.25,9641.3
    2008-02-04 13:30:00,216.0,9708.6,9674.95,9698.4,9663.75
    2008-02-04 14:00:00,246.0,9700.8,9562.65,9565.1,9699.0
    2008-02-04 14:30:00,276.0,9589.0,9474.3,9530.3,9585.5
    2008-02-04 15:00:00,306.0,9554.9,9430.75,9502.95,9541.4
    2008-02-04 15:30:00,335.0,9594.35,9493.9,9576.95,9518.95
    2008-02-05 10:00:00,6.0,9577.3,9454.7,9484.25,9561.5
    2008-02-05 10:30:00,36.0,9533.1,9441.65,9523.2,9483.8
    2008-02-05 11:00:00,66.0,9554.5,9496.3,9500.7,9531.35
    2008-02-05 11:30:00,96.0,9537.55,9473.6,9480.55,9514.0
    2008-02-05 12:00:00,126.0,9497.2,9426.75,9426.75,9487.0
    2008-02-05 12:30:00,156.0,9452.5,9383.0,9422.7,9426.0
    2008-02-05 13:00:00,186.0,9477.95,9439.0,9474.75,9432.7

     tf=bt.TimeFrame.Days
     compr=1
    
    data = btfeeds.GenericCSVData(
    dataname='C:/bt-datas/bnf-30min-2008-15.txt',
    
        fromdate=dt.datetime(yearstart,1,1),
        todate=dt.datetime(yearend,3,31),
    
        nullvalue=0.0,
        dtformat=('%Y-%m-%d %H:%M:%S'),
        datetime=0,
        time=-1,
        open=5,
        high=2,
        low=3,
        close=4,
        volume=-1,
        openinterest=-1,
        timeframe = tf,
        compression = compr,
    )
    
    cerebro.resampledata(data, timeframe= tf, compression=compr)
    

    I tried logging data and data1:

    2008-02-04, Date: 2008-02-04 pos size: 0 data: 9560.7 data1: 9560.7
    2008-02-04, Date: 2008-02-04 pos size: 0 data: 9598.7 data1: 9598.7
    2008-02-04, Date: 2008-02-04 pos size: 0 data: 9588.4 data1: 9588.4
    2008-02-04, Date: 2008-02-04 pos size: 0 data: 9614.3 data1: 9614.3
    2008-02-04, Date: 2008-02-04 pos size: 0 data: 9604.95 data1: 9604.95
    2008-02-04, Date: 2008-02-04 pos size: 0 data: 9634.95 data1: 9634.95
    2008-02-04, Date: 2008-02-04 pos size: 0 data: 9653.25 data1: 9653.25
    2008-02-04, Date: 2008-02-04 pos size: 0 data: 9698.4 data1: 9698.4
    2008-02-04, Date: 2008-02-04 pos size: 0 data: 9565.1 data1: 9565.1
    2008-02-04, Date: 2008-02-04 pos size: 0 data: 9530.3 data1: 9530.3
    2008-02-04, Date: 2008-02-04 pos size: 0 data: 9502.95 data1: 9502.95
    2008-02-04, Date: 2008-02-04 pos size: 0 data: 9576.95 data1: 9576.95

    Resampling does not seem to be happening. I marked this issue in the same topic because this problem started after I upgraded from github. I was on an earlier version which was from a few months ago.

    Any help would be greatly appreciated. Thanks.


  • administrators

    It's unclear why (and if) resampling is not working. The output simply shows daily dates and what seem to be closing prices.

    In the very short snippet you show there is only "resampledata". How the code ends up with 2 data feeds is also unknown. Simply creating the data feed will not add it to the system.



  • Hi @backtrader, there is one GenericCSVData feed, which provides self.data. The resampledata method provides self.data1. I have followed the documentation.


  • administrators

    @kunalp What you do is unknown and that's why no analysis can take place. See:

    @kunalp said in Error resampling data:

    data = btfeeds.GenericCSVData(
        dataname='C:/bt-datas/bnf-30min-2008-15.txt',
    
        fromdate=dt.datetime(yearstart,1,1),
        todate=dt.datetime(yearend,3,31),
    
        nullvalue=0.0,
        dtformat=('%Y-%m-%d %H:%M:%S'),
        datetime=0,
        time=-1,
        open=5,
        high=2,
        low=3,
        close=4,
        volume=-1,
        openinterest=-1,
        timeframe = tf,
        compression = compr,
    )
    
    cerebro.resampledata(data, timeframe= tf, compression=compr)
    

    With that code (the only one available from your snippet above) the only thing one can say:

    • There is only one data feed in the system.
    • There is nothing to indicate that you have used cerebro.adddata(data). You may have, but the world doesn't know it.

    From the output:

    @kunalp said in Error resampling data:

    ...
    2008-02-04, Date: 2008-02-04 pos size: 0 data: 9560.7 data1: 9560.7
    2008-02-04, Date: 2008-02-04 pos size: 0 data: 9598.7 data1: 9598.7
    ...
    

    The components of each line:

    • A date (where from? Is there also a time component?)
    • Another date (where from? Is there also a time component?)
    • Position (possibly from self.position.size, but it is also unknown)
    • data and data1 with what seems to be closing prices. But where this is actually coming from is unknown.

    You may have a very complete script that according to your reasoning shows there is a problem in the platform. Yes, it may be. The platform may have many bugs or the script may be wrong.

    But neither of the above can be confirmed or denied with the information at hand. Problems can be looked into when complete information is available. And problems are better understood and diagnosed with small (but complete) samples. The snippet from above is small ... but rather incomplete and the output doesn't add any information at all.



  • @backtrader Apologies, I had not shared complete details in my earlier message. I am pasting more detailed and relevant sections of the code.

    To clarify, I have been using an adddata followed by resampledata. I also was not showing time in the log output. Now it is present in the log.

    Again, thanks for your patience in trying to solve the issue.

    def runstrat():
    	# Create a cerebro entity
    	cerebro = bt.Cerebro(stdstats=True, tradehistory=True)
    
    	# Add a strategy
    	cerebro.addstrategy(
    		SMAStrategy,
    	)
    
    	tf=bt.TimeFrame.Days
    	compr=1
    
    	data = btfeeds.GenericCSVData(
    		dataname='C:/bt-datas/bnf-30min-2008-14.txt',
    		fromdate=dt.datetime(yearstart,1,1),
    		todate=dt.datetime(yearend,3,31),
    		dtformat=('%Y-%m-%d %H:%M:%S'),
    		datetime=0,
    		time=-1,
    		open=5,
    		high=2,
    		low=3,
    		close=4,
    		volume=-1,
    		openinterest=-1,
    		timeframe = tf,
    		compression = compr,
    	)
    
    	cerebro.broker.setcommission(commission=kcommission, margin=kmargin, mult=kmult)
    	cerebro.broker.setcash(kcap)
    	cerebro.adddata(data)
    
    	cerebro.addanalyzer(btanalyzers.TradeAnalyzer, _name='mytrade')
    	cerebro.addanalyzer(btanalyzers.AnnualReturn, _name='myannual')
    	cerebro.addobserver(bt.observers.DrawDown)
    
    	cerebro.resampledata(data, timeframe= tf, compression=compr)
    
    	thestrats = cerebro.run(runonce=True)
    	thestrat = thestrats[0]
    
    	# Plot the result
    	# cerebro.plot(start=dt.datetime(yearstart,1,1), end = dt.datetime(yearend,12,31), style='line')
    
    	plotter = Plotter()
    	cerebro.plot(plotter=plotter)
    
    	print('Trade Analyzer:', thestrat.analyzers.mytrade.get_analysis())
    	print('Annual Return:', thestrat.analyzers.myannual.get_analysis())
    	pp.pprint(thestrat.analyzers.mytrade.get_analysis())
    
    if __name__ == '__main__':
    	runstrat()
    

    In next(self) I am creating a log:

        self.log('position size: %s, data: %s, data1: %s' % (self.position.size, self.data[0], self.data1[0]))
    

    The function for log is:

        def log(self, txt, dt=None):
            ''' Logging function for this strategy'''
            print('%s %s' % (self.data.datetime.datetime().isoformat(), txt))
    

    The results are as follows for the initial data points:

    2008-02-04T23:59:59.999989 position size: 0, data: 9560.7, data1: 9560.7
    2008-02-04T23:59:59.999989 position size: 0, data: 9598.7, data1: 9598.7
    2008-02-04T23:59:59.999989 position size: 0, data: 9588.4, data1: 9588.4
    2008-02-04T23:59:59.999989 position size: 0, data: 9614.3, data1: 9614.3
    2008-02-04T23:59:59.999989 position size: 0, data: 9604.95, data1: 9604.95
    2008-02-04T23:59:59.999989 position size: 0, data: 9634.95, data1: 9634.95
    2008-02-04T23:59:59.999989 position size: 0, data: 9653.25, data1: 9653.25
    2008-02-04T23:59:59.999989 position size: 0, data: 9698.4, data1: 9698.4
    2008-02-04T23:59:59.999989 position size: 0, data: 9565.1, data1: 9565.1
    2008-02-04T23:59:59.999989 position size: 0, data: 9530.3, data1: 9530.3
    2008-02-04T23:59:59.999989 position size: 0, data: 9502.95, data1: 9502.95
    2008-02-04T23:59:59.999989 position size: 0, data: 9576.95, data1: 9576.95

    The data file has the following data for 4th February, which is the point from which the log starts (there are indicators that do not produce values before 4th February, even though the data file starts on 1st January):

    2008-02-04 10:00:00,6.0,9582.45,9492.7,9560.7,9327.05
    2008-02-04 10:30:00,36.0,9639.1,9556.85,9598.7,9574.8
    2008-02-04 11:00:00,66.0,9607.1,9575.7,9588.4,9603.55
    2008-02-04 11:30:00,96.0,9633.15,9588.15,9614.3,9587.3
    2008-02-04 12:00:00,126.0,9634.35,9596.9,9604.95,9631.35
    2008-02-04 12:30:00,156.0,9653.9,9596.6,9634.95,9608.55
    2008-02-04 13:00:00,186.0,9666.0,9622.25,9653.25,9641.3
    2008-02-04 13:30:00,216.0,9708.6,9674.95,9698.4,9663.75
    2008-02-04 14:00:00,246.0,9700.8,9562.65,9565.1,9699.0
    2008-02-04 14:30:00,276.0,9589.0,9474.3,9530.3,9585.5
    2008-02-04 15:00:00,306.0,9554.9,9430.75,9502.95,9541.4
    2008-02-04 15:30:00,335.0,9594.35,9493.9,9576.95,9518.95

    In this data file, the second last column is the close price.


  • administrators

    In this new sample you are logging:

    • datetime of data0 which looking at the script corresponds to the non-resampled data feed
    • And then position.size and closing prices.

    The input data:

    2008-02-04 10:00:00,6.0,9582.45,9492.7,9560.7,9327.05
    2008-02-04 10:30:00,36.0,9639.1,9556.85,9598.7,9574.8
    

    Seems to have a Minutes/30 compression input.

    But during the creation of GenericCSVData the following is the input:

    		timeframe = tf,
    		compression = compr,
    

    And above that code

    	tf=bt.TimeFrame.Days
    	compr=1
    

    That means the input data is interpreted as Days/1, and during loading each timestamp is changed to the defined end of session for the data. The default is the end of the day, and because Python's time type has no 24:00:00, it is set as close as possible to midnight, which is the 23:59:59.999989 in your log.

    Your Minutes/30 bars are now interpreted as daily bars, and resampling them to Days/1 simply repeats the incoming bars.

    The only change needed is to tell GenericCSVData the actual timeframe and compression (Minutes/30).

    If you also know the end of session time (possibly the timestamp in the last bar of each day) and you tell it to GenericCSVData, the resampler can align the delivery of the resampled daily bar with the last intraday bar.
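
    The timestamp change described above can be pictured with a simplified standard-library model. This is a sketch of the behaviour as described in this post, not backtrader's actual code: the function loaded_timestamp and its declared_daily flag are hypothetical, and the end-of-day value is the one printed in the log earlier in the thread.

```python
from datetime import datetime, time

# Assumption: the "as close as possible to midnight" value seen in the log output
DEFAULT_SESSION_END = time(23, 59, 59, 999989)


def loaded_timestamp(raw, declared_daily, sessionend=DEFAULT_SESSION_END):
    """Simplified model of the timestamp a bar ends up with after loading."""
    parsed = datetime.strptime(raw, '%Y-%m-%d %H:%M:%S')
    if not declared_daily:
        # Intraday feeds keep the timestamp found in the file
        return parsed
    # Daily (or larger) feeds are moved to the end of session of that date
    end_of_session = datetime.combine(parsed.date(), sessionend)
    return max(parsed, end_of_session)


rows = ['2008-02-04 10:00:00', '2008-02-04 10:30:00', '2008-02-04 11:00:00']

as_days = [loaded_timestamp(r, declared_daily=True) for r in rows]
as_minutes = [loaded_timestamp(r, declared_daily=False) for r in rows]

print(len(set(as_days)))     # 1: every bar lands on the same end-of-day timestamp
print(len(set(as_minutes)))  # 3: the intraday timestamps survive
```

    With the feed declared as Days/1, each 30-minute row looks like a separate daily bar carrying the same end-of-day timestamp, which matches the repeated 23:59:59.999989 lines in the log.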



  • @backtrader Understood the first part, thanks for pointing out the error.

    So, GenericCSVData is fed (Minutes/30).

    Resampledata is fed (Days,1)

    I looked up the Data Feeds reference now for sessionend
    https://www.backtrader.com/docu/dataautoref.html

    	data = btfeeds.GenericCSVData(
    		dataname='C:/bt-datas/bnf-30min-2008-15.txt',
    		fromdate=dt.datetime(yearstart,1,1),
    		todate=dt.datetime(yearend,3,31),
    		dtformat=('%Y-%m-%d %H:%M:%S'),
    		datetime=0,
    		time=-1,
    		open=5,
    		high=2,
    		low=3,
    		close=4,
    		volume=-1,
    		openinterest=-1,
    		timeframe = bt.TimeFrame.Minutes,
    		compression = 30,
    		sessionstart = dt.time(10, 0, 0),
    		sessionend = dt.time(15, 30, 0),
    	)
    	cerebro.broker.setcommission(commission=kcommission, margin=kmargin, mult=kmult)
    	cerebro.broker.setcash(kcap)
    	cerebro.adddata(data)
    	
    	cerebro.addanalyzer(btanalyzers.TradeAnalyzer, _name='mytrade')
    	cerebro.addanalyzer(btanalyzers.AnnualReturn, _name='myannual')
    
    
    	cerebro.addobserver(bt.observers.DrawDown)
    
    	cerebro.resampledata(data, timeframe = bt.TimeFrame.Days, compression=1)
    

    Raw data is:

    2008-02-11 10:30:00,36.0,8830.1,8701.5,8729.15,8682.15
    2008-02-11 11:00:00,66.0,8742.05,8690.7,8688.65,8737.45
    2008-02-11 11:30:00,96.0,8701.55,8550.75,8542.15,8693.8
    2008-02-11 12:00:00,126.0,8548.0,8401.25,8395.65,8546.15
    2008-02-11 12:30:00,156.0,8394.8,8311.35,8351.8,8392.15
    2008-02-11 13:00:00,186.0,8399.25,8285.5,8285.5,8353.15
    2008-02-11 13:30:00,216.0,8335.65,8236.65,8279.3,8284.6
    2008-02-11 14:00:00,246.0,8435.05,8224.85,8395.55,8276.5
    2008-02-11 14:30:00,276.0,8480.65,8356.75,8456.75,8402.75
    2008-02-11 15:00:00,306.0,8538.45,8457.45,8460.85,8462.65
    2008-02-11 15:30:00,335.0,8472.55,8375.7,8395.5,8469.5
    2008-02-12 10:00:00,6.0,8637.9,8573.25,8573.25,8419.25
    2008-02-12 10:30:00,36.0,8563.25,8447.1,8484.75,8561.4
    2008-02-12 11:00:00,66.0,8508.85,8410.65,8450.05,8491.95
    2008-02-12 11:30:00,96.0,8554.25,8477.6,8535.75,8468.25
    2008-02-12 12:00:00,126.0,8563.45,8488.35,8519.6,8544.7
    2008-02-12 12:30:00,156.0,8525.3,8450.25,8450.25,8522.45
    2008-02-12 13:00:00,186.0,8559.0,8407.55,8533.9,8451.7
    2008-02-12 13:30:00,216.0,8596.65,8533.6,8559.3,8549.4
    2008-02-12 14:00:00,246.0,8610.7,8545.15,8541.65,8560.85
    2008-02-12 14:30:00,276.0,8564.45,8454.0,8463.3,8552.4
    2008-02-12 15:00:00,306.0,8561.7,8418.25,8530.55,8479.3
    2008-02-12 15:30:00,335.0,8548.3,8482.7,8513.15,8545.45
    2008-02-13 10:00:00,6.0,8714.65,8667.8,8667.8,8512.35

    Output is:
    2008-02-11T15:30:00 init short
    2008-02-11T15:30:00 position size: 0, data: 8395.5, data1: 8395.5

    SELL EXECUTED on: 2008-02-12T10:00:00

    2008-02-12T10:00:00 position size: -1, data: 8573.25, data1: 8395.5
    2008-02-12T10:30:00 position size: -1, data: 8484.75, data1: 8395.5
    2008-02-12T11:00:00 position size: -1, data: 8450.05, data1: 8395.5
    2008-02-12T11:30:00 position size: -1, data: 8535.75, data1: 8395.5
    2008-02-12T12:00:00 position size: -1, data: 8519.6, data1: 8395.5
    2008-02-12T12:30:00 position size: -1, data: 8450.25, data1: 8395.5
    2008-02-12T13:00:00 position size: -1, data: 8533.9, data1: 8395.5
    2008-02-12T13:30:00 position size: -1, data: 8559.3, data1: 8395.5
    2008-02-12T14:00:00 position size: -1, data: 8541.65, data1: 8395.5
    2008-02-12T14:30:00 position size: -1, data: 8463.3, data1: 8395.5
    2008-02-12T15:00:00 position size: -1, data: 8530.55, data1: 8395.5
    2008-02-12T15:30:00 position size: -1, data: 8513.15, data1: 8513.15
    2008-02-13T10:00:00 position size: -1, data: 8667.8, data1: 8513.15

    The closing price for the last data point for 11th feb is being used as the closing price for the resampled bar for 12th Feb and so on.

    Is there a logical error here on my part?


  • administrators

    @kunalp said in Error resampling data:

    The closing price for the last data point for 11th feb is being used as the closing price for the resampled bar for 12th Feb and so on.
    Is there a logical error here on my part?

    That's how resampling works. The daily bar has been completely resampled at 2008-02-11T15:30:00 and that's why both closing prices are the same.

    On 2008-02-12 and until the day is complete (the data reaches 2008-02-12T15:30:00) the only available resampled bar is that from 2008-02-11.

    If instead of only printing the timestamp of self.data0 (aka self.data) you also printed the timestamp of self.data1, you would see how it works.

    The other available option is replaying, which constructs the bar as it happens. That may be what you are looking for.
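
    The delivery timing described above can be reproduced with a small standard-library sketch. It is only a model of the behaviour discussed in this thread, not the platform's resampler: the function daily_close_seen is hypothetical, the bars are a subset of the raw data posted above, and the session end is assumed to be 15:30 as in the posted feed parameters.

```python
from datetime import datetime, time

SESSION_END = time(15, 30)  # assumed to match the sessionend used in the thread

# (timestamp, close) pairs taken from the raw data posted above
bars = [
    ('2008-02-11 15:00:00', 8460.85),
    ('2008-02-11 15:30:00', 8395.5),
    ('2008-02-12 10:00:00', 8573.25),
    ('2008-02-12 15:00:00', 8530.55),
    ('2008-02-12 15:30:00', 8513.15),
    ('2008-02-13 10:00:00', 8667.8),
]


def daily_close_seen(bars, sessionend=SESSION_END):
    """Yield (timestamp, intraday close, last completed daily close)."""
    last_daily_close = None  # nothing delivered until a session completes
    for raw, close in bars:
        ts = datetime.strptime(raw, '%Y-%m-%d %H:%M:%S')
        if ts.time() >= sessionend:
            # The final bar of the session completes the daily bar
            last_daily_close = close
        yield ts, close, last_daily_close


seen = list(daily_close_seen(bars))
for ts, close, daily in seen:
    print(ts.isoformat(), close, daily)
```

    In this model the daily close stays at 8395.5 through 2008-02-12 until the 15:30 bar arrives, and only then changes to 8513.15, which is the pattern in the posted output. A replay-style view would instead expose the bar under construction, so the daily close would follow each intraday close as it arrives.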



  • @backtrader Thanks again. Resampling is the behaviour I was looking for. Modified code accordingly.

