
Dates and time loaded from CSV files not precise



  • Hi,

    I'm seeing imprecise datetimes for daily OHLCV data loaded from CSV files:
    2017-08-27T23:59:59.999989 instead of 2017-08-28T00:00:00.0 (I assume it should be rounded like this)

    I'm using OHLCV data where the datetime is a UTC timestamp in the following format:

    binance,BTCUSDT,1502928000,1D,4261.4800000000,4485.3900000000,4200.7400000000,4285.0800000000,795.1503770000
    binance,BTCUSDT,1503014400,1D,4285.0800000000,4371.5200000000,3938.7700000000,4108.3700000000,1199.8882640000
    binance,BTCUSDT,1503100800,1D,4108.3700000000,4184.6900000000,3850.0000000000,4139.9800000000,381.3097630000
    binance,BTCUSDT,1503187200,1D,4139.9800000000,4211.0800000000,4032.6200000000,4086.2900000000,467.0830220000
    binance,BTCUSDT,1503273600,1D,4069.1300000000,4119.6200000000,3911.7900000000,4016.0000000000,691.7430600000
    binance,BTCUSDT,1503360000,1D,4016.0000000000,4104.8200000000,3400.0000000000,4040.0000000000,966.6848580000
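As a sanity check on the raw data, here is a minimal sketch that converts the timestamp column of the first two sample rows to UTC datetimes using only the standard library. The helper name `row_datetimes` and the `SAMPLE` constant are made up for this illustration and are not part of backtrader; the point is only to show that the input timestamps themselves fall exactly on midnight UTC.

```python
import csv
import datetime
import io

# First two rows of the sample data from this post.
SAMPLE = """\
binance,BTCUSDT,1502928000,1D,4261.4800000000,4485.3900000000,4200.7400000000,4285.0800000000,795.1503770000
binance,BTCUSDT,1503014400,1D,4285.0800000000,4371.5200000000,3938.7700000000,4108.3700000000,1199.8882640000
"""


def row_datetimes(text, ts_column=2):
    """Return the UTC datetime for the Unix timestamp found in each CSV row."""
    utc = datetime.timezone.utc
    return [
        datetime.datetime.fromtimestamp(int(row[ts_column]), tz=utc)
        for row in csv.reader(io.StringIO(text))
        if row  # skip empty lines
    ]


for dt in row_datetimes(SAMPLE):
    print(dt.isoformat())
# 2017-08-17T00:00:00+00:00
# 2017-08-18T00:00:00+00:00
```

So whatever moves the daily bars to the end of the day is happening after parsing, not in the CSV values.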

    My reader uses the following configuration:

    data = (btfeeds.GenericCSVData(
    	dataname=<my data source file>,
    	timeframe=bt.TimeFrame.Days,
    	compression=1,
    	nullvalue=0.0,
    	dtformat=1,
    	datetime=2,
    	open=4,
    	high=5,
    	low=6,
    	close=7,
    	volume=8,
    	openinterest=-1
    ))
    

    Is there a reason for this? It only happens with daily and weekly OHLCV data, 5min, 1H and 4H data do not produce such imprecision.

    I even tried to supply my own function to convert the timestamp to a Python datetime.datetime via the dtformat argument, but Backtrader gives me an error, AttributeError: 'Lines_LineSeries_DataSeries_OHLC_OHLCDateTime_Abst' object has no attribute '_dtstr', when I try that.

    Here's the function I tried:

    import datetime
    from dateutil import tz
    def parsedate(timestamp):
    	if isinstance(timestamp, str): timestamp = int(timestamp) # put up with timestamp being in string format
    	if timestamp > 1e10: timestamp //= 1000 # put up with timestamp being in milliseconds
    	return datetime.datetime.utcfromtimestamp(float(timestamp)).replace(tzinfo=tz.tzutc())
    

    // EDIT

    I also tried dtformat=2 (parses the timestamp to float instead of int according to the docs) but the result is the same.



  • As for the AttributeError: 'Lines_LineSeries_DataSeries_OHLC_OHLCDateTime_Abst' object has no attribute '_dtstr' when using dtformat=callable, I think that can be fixed as follows in the backtrader/backtrader/feeds/csvgeneric.py source code:

        def start(self):
            super(GenericCSVData, self).start()
    
            if isinstance(self.p.dtformat, string_types):
                self._dtstr = True
            elif isinstance(self.p.dtformat, integer_types):
                self._dtstr = False
                idt = int(self.p.dtformat)
                if idt == 1:
                    self._dtconvert = lambda x: datetime.utcfromtimestamp(int(x))
                elif idt == 2:
                    self._dtconvert = lambda x: datetime.utcfromtimestamp(float(x))
    
            else:  # assume callable
                self._dtstr = False # <-- THIS LINE IS MISSING IN THE CODE
                self._dtconvert = self.p.dtformat
    
        def _loadline(self, linetokens):
            # Datetime needs special treatment
            dtfield = linetokens[self.p.datetime]
            if self._dtstr:
                dtformat = self.p.dtformat
    


  • Hm, it seems the 23:59:59.999990 is added as the sessionend parameter during the _loadline() method in the backtrader/backtrader/feeds/csvgeneric.py file, which would mean that the datetime 2017-08-19T23:59:59.999989 actually means "the daily candle starting on 2017-08-19T00:00:00", not on 2017-08-20 (as I thought).

    But what it means and how to fix it I don't know :)


  • administrators

    @tomasrollo said in Dates and time loaded from CSV files not precise:

    2017-08-27T23:59:59.999989 instead of 2017-08-28T00:00:00.0 (I assume it should be rounded like this)

    That rounding would be wrong. That's a different day.

    @tomasrollo said in Dates and time loaded from CSV files not precise:

    Is there a reason for this? It only happens with daily and weekly OHLCV data, 5min, 1H and 4H data do not produce such imprecision.

    Sub-day timeframes already carry a time payload, but Days and larger timeframes need to be placed at the end of the period. Otherwise, if you mixed a 5min timeframe with a daily timeframe on the same day, the daily candle (stamped 00:00:00) would be delivered before any candle in the 5min timeframe.
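The ordering problem can be sketched with plain datetimes. This is only an illustration, assuming bars from different feeds are delivered in timestamp order and that each 5-minute bar is stamped at the end of its interval; `intraday_bars` and `delivery_order` are hypothetical helpers, not backtrader functions.

```python
import datetime

DAY = datetime.date(2017, 8, 27)


def intraday_bars(day, minutes=5, count=3):
    """Build a few (label, timestamp) pairs for 5-minute bars of one day."""
    start = datetime.datetime.combine(day, datetime.time(0, 0))
    step = datetime.timedelta(minutes=minutes)
    # Assumption: each bar carries the time at which its interval ends.
    return [("5min", start + step * (i + 1)) for i in range(count)]


def delivery_order(daily_stamp):
    """Return bar labels sorted by timestamp, i.e. the assumed delivery order."""
    bars = intraday_bars(DAY) + [("1D", daily_stamp)]
    return [label for label, _ in sorted(bars, key=lambda bar: bar[1])]


midnight = datetime.datetime.combine(DAY, datetime.time(0, 0))
end_of_day = datetime.datetime.combine(DAY, datetime.time(23, 59, 59, 999990))

print(delivery_order(midnight))    # ['1D', '5min', '5min', '5min']
print(delivery_order(end_of_day))  # ['5min', '5min', '5min', '1D']
```

With the midnight stamp the daily bar shows up before the intraday bars it summarizes; with the end-of-day stamp it arrives after them.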

    @tomasrollo said in Dates and time loaded from CSV files not precise:

    it seems the 23:59:59.999990 is added as the sessionend

    If you don't supply a sessionend, the end of the day is used (unfortunately Python has no end-of-day time constant or object that can play this role), unless you consider your session to be a single tick at 00:00:00.
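A rough standalone sketch of the behaviour described in this thread, not the actual backtrader code: the parsed date is combined with the session end (the 23:59:59.999990 value quoted earlier in the thread), and the later of the two timestamps is kept. The second half rests on an assumption of mine, namely that the datetime is then stored as a float number of days, which would explain why the value can come back a microsecond short when printed. All function names here are hypothetical.

```python
import datetime

SESSION_END = datetime.time(23, 59, 59, 999990)  # value quoted in this thread


def stamp_daily_bar(parsed, sessionend=SESSION_END):
    """Move a daily bar to the session end of its own day unless it is already later."""
    end_of_session = datetime.datetime.combine(parsed.date(), sessionend)
    return max(parsed, end_of_session)


def to_daynum(dt):
    """Naive datetime -> float days (ordinal plus fraction of day). Assumed storage format."""
    midnight = datetime.datetime.combine(dt.date(), datetime.time())
    return dt.toordinal() + (dt - midnight).total_seconds() / 86400.0


def from_daynum(num):
    """Float days -> naive datetime, truncating to whole microseconds."""
    days = int(num)
    micros = int((num - days) * 86400 * 1_000_000)
    return datetime.datetime.fromordinal(days) + datetime.timedelta(microseconds=micros)


parsed = datetime.datetime(2017, 8, 27)      # midnight, as read from the CSV
stamped = stamp_daily_bar(parsed)            # same day, 23:59:59.999990
roundtrip = from_daynum(to_daynum(stamped))  # may lose a few microseconds

print(stamped.isoformat())
print(roundtrip.isoformat())
```

The bar stays on its own day either way; only the last few microseconds are affected by the float round trip, since a double holding a day count of this size resolves roughly 10 microseconds.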



  • @backtrader aha, ok, now I get it, thanks! :)