cerebro.resample() introduces data in the future
-
Hi guys, I found some weird behaviour when using 1min data and cerebro.resample() to create 1h data, in order to make decisions from both timeframes.
It seems that when I'm making a decision based on the 1h data, I've already received 1min data from the future. Just take a look at the output (time, open, high, low, close):
m 2020-03-20 01:00:00 6219.15 6222.24 6212.39 6213.54 m
h 2020-03-20 01:00:00 6200.76 6269.24 6174.0 6219.15 h
---------Do strategy logic here------------
Note: 6219.15 is the close price of the 1h data at 2020-03-20 01:00:00, while 6219.15 is also the open price of the 1min data, which means that when I access the 1h data during the "Do strategy logic here" section, self.datas[0].lines[0] will give me information I shouldn't have at that time (when the 1h candle closes); in other words, it introduces data from the future. I'm not sure whether this is intended or a potential bug.
Here is the data and code I'm using to reproduce the output:
import datetime
import pandas as pd
import backtrader as bt


class Test(bt.Strategy):
    def log_bar(self, data, mark='*'):
        time = data.datetime.datetime(0)
        o = data.open[0]
        h = data.high[0]
        l = data.low[0]
        c = data.close[0]
        print(mark, time, o, h, l, c, mark)

    def log_logic(self):
        print("---------Do strategy logic here------------")

    def nextstart(self):
        self.lendata1 = 0

    def next(self):
        if len(self.data1) > self.lendata1:
            self.lendata1 = len(self.data1)
            self.log_bar(self.datas[0], 'm')
            self.log_bar(self.datas[1], 'h')
            self.log_logic()
            # do strategy logic here


if __name__ == '__main__':
    cerebro = bt.Cerebro()
    cerebro.addstrategy(Test)
    datapath = 'datas/BTC_USDT_1m.csv'
    data = pd.read_csv(datapath, index_col='datetime', parse_dates=True)
    datafeed = bt.feeds.PandasData(dataname=data)
    cerebro.adddata(datafeed)
    cerebro.resampledata(
        datafeed, timeframe=bt.TimeFrame.Minutes, compression=60)
    cerebro.run()
-
@xyshell said in cerebro.resample() introduces data in the future:
Note: 6219.15 is the close price of the 1h data at 2020-03-20 01:00:00, while 6219.15 is also the open price of the 1min data
No. There is no such thing as the closing price of the 1-hour data, because that data doesn't exist. Naming things properly does help. Additionally the information you provide is wrong, which is confirmed by looking at your data.
Note for the other readers: for whatever reason, this data defies all established standards and has the following format:
CHLOVTimestamp
Your data indicates that
- 6219.15 is the closing price of the 1-min bar at 00:59:00, hence the last bar to go into the resampling for 1-hour between 00:00:00 and 00:59:00 (60 bars, if all are present), which is first delivered to you as a resampled bar at 01:00:00
- 6219.15 is the opening price of the 1-min bar at 01:00:00
At 01:00:00 you have two bits of information available:
- The current 1-min bar for 01:00:00
- The 1-hour resampled data for the period 00:00:00 to 00:59:00
As expected.
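To make the two bullet lists above concrete, here is a minimal sketch in plain Python. It is not backtrader's resampler: the prices are synthetic, and `resample_hour` is a hypothetical helper that only illustrates the bookkeeping. It assumes each 1-min bar opens where the previous one closed, which is what the posted data appears to do.

```python
from datetime import datetime, timedelta

# Synthetic 1-minute bars as (timestamp, open, high, low, close).
# All prices are invented for illustration; they are NOT from the thread's CSV.
start = datetime(2020, 3, 20, 0, 0)
bars = []
price = 100.0
for i in range(61):  # 00:00:00 .. 01:00:00 inclusive
    o = price
    c = price + (1.0 if i % 2 == 0 else -0.5)
    bars.append((start + timedelta(minutes=i), o, max(o, c), min(o, c), c))
    price = c  # assumption: the next bar opens where this one closed


def resample_hour(minute_bars):
    """Hypothetical helper: collapse minute bars into one (open, high, low, close)."""
    return (
        minute_bars[0][1],               # open of the first bar
        max(b[2] for b in minute_bars),  # highest high
        min(b[3] for b in minute_bars),  # lowest low
        minute_bars[-1][4],              # close of the last bar (00:59:00)
    )


hour_end = datetime(2020, 3, 20, 1, 0)
inside = [b for b in bars if b[0] < hour_end]  # the 60 bars 00:00:00 .. 00:59:00
current = bars[-1]                             # the 1-min bar stamped 01:00:00
hour_bar = resample_hour(inside)

print("h", hour_end, *hour_bar)
print("m", *current)
```

The hourly close and the open of the 01:00:00 bar print the same number, yet the hourly bar was built only from bars up to 00:59:00, so nothing from the future went into it.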
-
@backtrader said in cerebro.resample() introduces data in the future:
At 01:00:00 you have two bits of information available:
- The current 1-min bar for 01:00:00
- The 1-hour resampled data for the period 00:00:00 to 00:59:00
As expected.
Thanks for the explanation, it's very clear. It's not introducing data from the future, but just delivering the 1-hour resampled data 1min late, so any decision made during "# do strategy logic here" will be executed at 01:00:01 instead of 01:00:00.
-
@xyshell said in cerebro.resample() introduces data in the future:
@backtrader said in cerebro.resample() introduces data in the future:
At 01:00:00 you have two bits of information available:
- The current 1-min bar for 01:00:00
- The 1-hour resampled data for the period 00:00:00 to 00:59:00
As expected.
Thanks for the explanation, it's very clear. It's not introducing data from the future, but just delivering the 1-hour resampled data 1min late, so any decision made during "# do strategy logic here" will be executed at 01:00:01 instead of 01:00:00.
I mean it will be executed at 01:01:00 instead of 01:00:00.
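A small sketch of that timing, in plain Python rather than backtrader itself. It assumes a market order is matched at the open of the next 1-min bar after the decision, which is my understanding of backtrader's default behaviour in backtesting; `fill_market_order` and the prices are hypothetical and only illustrate the timeline.

```python
from datetime import datetime, timedelta

# Hypothetical 1-minute opens around the hour boundary (invented prices).
t0 = datetime(2020, 3, 20, 0, 59)
minute_opens = [(t0 + timedelta(minutes=i), 100.0 + i) for i in range(4)]  # 00:59 .. 01:02


def fill_market_order(decision_time, bars):
    """Hypothetical helper: return (timestamp, open) of the first bar strictly after decision_time."""
    for ts, open_price in bars:
        if ts > decision_time:
            return ts, open_price
    return None  # no later bar available, so the order stays unfilled


# The next() call in which the resampled hourly bar first shows up.
decision_time = datetime(2020, 3, 20, 1, 0)
fill = fill_market_order(decision_time, minute_opens)
print("decided at", decision_time, "filled at", fill)
```

Under that assumption the order is matched on the 01:01:00 bar, one minute after the hourly bar is first visible.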