Resampling issues + adding multiple timeframes
-
Hi,
I have minute data and am first of all trying to resample it to 2-minutes.
Without resampling, the log properly gives me the close at each minute.
When I use the following code to resample to 2-minutes it seems to mess up the datetime:

    class resampling(bt.Strategy):

        def log(self, txt, dt=None):
            ''' Logging function for this strategy'''
            dt = dt or self.datas[0].datetime.datetime(0)
            print('%s, %s' % (dt.strftime("%Y-%m-%d %H:%M"), txt))

        def __init__(self):
            pass

        def next(self):
            self.log(self.data.close[0])


    if __name__ == '__main__':
        cerebro = bt.Cerebro()

        # Add a strategy
        cerebro.addstrategy(resampling)

        # Create a Data Feed
        minute_data = bt.feeds.PandasData(dataname=df2020)
        cerebro.resampledata(minute_data, bt.TimeFrame.Minutes, compression=2)

        cerebro.run()
result of the log:
    2020-01-02 23:59, 1.12075
    2020-01-02 23:59, 1.12086
    2020-01-02 23:59, 1.12083
    2020-01-02 23:59, 1.12079
    2020-01-02 23:59, 1.1208
    2020-01-02 23:59, 1.12086
    2020-01-02 23:59, 1.12075
    2020-01-02 23:59, 1.1207
    2020-01-02 23:59, 1.12084
    2020-01-02 23:59, 1.12065
    2020-01-02 23:59, 1.12074
    2020-01-02 23:59, 1.1206399999999999
Additionally, my goal is to have 3 timeframes: 2-minute, 5-minute and 10-minute. Then I also want to use indicators on each timeframe. I'm honestly clueless on how to do this.
thanks for your help
-
You need to set the timeframe of your data source, so instead of

    minute_data = bt.feeds.PandasData(dataname=df2020)

use

    minute_data = bt.feeds.PandasData(dataname=df2020, timeframe=bt.TimeFrame.Minutes, compression=1)
-
I actually tried this before, and then only a couple of rows show up. Below is all the output there is from the log, while there should be hundreds of rows:
    2020-01-02 23:59, 1.1171200000000001
    2020-01-03 23:59, 1.1154700000000002
    2020-01-05 23:59, 1.1154700000000002
    2020-01-06 23:59, 1.11958
    2020-01-07 23:59, 1.11513
    2020-01-08 23:59, 1.1103399999999999
    2020-01-09 23:59, 1.11059
    2020-01-10 23:59, 1.11196
-
@Jens-Halsberghe said in Resampling issues + adding multiple timeframes:
cerebro.resampledata(minute_data, bt.TimeFrame.Minutes, compression=2)
Try adding 'timeframe=' in front of bt.TimeFrame.Minutes:
cerebro.resampledata(minute_data, timeframe=bt.TimeFrame.Minutes, compression=2)
Gives me the following with different data:
    2020-01-02 09:32, 3234.75
    2020-01-02 09:34, 3237.5
    2020-01-02 09:36, 3239.0
    2020-01-02 09:38, 3237.75
    2020-01-02 09:40, 3238.0
    2020-01-02 09:42, 3237.75
    2020-01-02 09:44, 3238.0
-
@run-out Perfect! you always come to the rescue :) do you then also know how to refer to the datafeeds in the init / next functions?
I have now created the three timeframes as follows which seems to work. When I log, I'm getting the closes of the 2 minute timeframe. Can't see how to refer to the other two though.
    # Add a strategy
    cerebro.addstrategy(resampling)

    # Create a Data Feed
    minute_data = bt.feeds.PandasData(dataname=df2020)
    two_minute = cerebro.resampledata(minute_data, timeframe=bt.TimeFrame.Minutes, compression=2)
    five_minute = cerebro.resampledata(minute_data, timeframe=bt.TimeFrame.Minutes, compression=5)
    ten_minute = cerebro.resampledata(minute_data, timeframe=bt.TimeFrame.Minutes, compression=10)
-
The easiest way is by the order in which the feeds were added to cerebro. Use self.datas[n] like this:

    self.datas[0] is two_minute
    self.datas[1] is five_minute
    self.datas[2] is ten_minute

So getting the ten-minute close in next would be:

    def next(self):
        self.datas[2].close[0]
-
@run-out great.
For the 5 minute:
self.log(self.datas[1].close[0])
Instead of 09:05 it's showing 09:04 twice, which is a bit annoying when I want to reference a specific bar:
    2020-01-02 09:00, 1.12075
    2020-01-02 09:02, 1.12075
    2020-01-02 09:04, 1.12075
    2020-01-02 09:04, 1.12083
    2020-01-02 09:06, 1.12083
    2020-01-02 09:08, 1.12083
    2020-01-02 09:10, 1.12086
    2020-01-02 09:12, 1.12086
    2020-01-02 09:14, 1.12086
    2020-01-02 09:14, 1.1207
    2020-01-02 09:16, 1.1207
-
Did you follow @dasch's advice above? Not specifying the timeframe when loading the data can cause multiple printouts for the same bar.
-
@run-out I did and I had the issue where only a couple rows were showing up. tried it again now and now I'm getting all the rows for some reason
-
You should check the length of the data to see whether it really advanced.
next is called whenever any data source advances. So if the 5-minute data advances, next is called even though the 2-minute data has not moved (this is why, between 09:04 and 09:06, 09:04 is printed twice).
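The duplicated 09:04 can be reproduced without backtrader. The sketch below is a toy model, not backtrader's actual scheduling: it assumes bars are stamped at multiples of their compression, that `next` fires once for every minute in which at least one feed delivers a bar, and that the logged time is read from the 2-minute feed. The function name `simulate_log` is made up for this illustration.

```python
from datetime import datetime, timedelta


def simulate_log(start, minutes, clock_comp=2, other_comp=5):
    """Return the timestamps that would be logged when the time is read from
    the clock_comp-minute feed, but the strategy is also called whenever the
    other_comp-minute feed delivers a bar (toy model)."""
    stamps = []
    for m in range(minutes + 1):
        clock_advanced = m % clock_comp == 0
        other_advanced = m % other_comp == 0
        if not (clock_advanced or other_advanced):
            continue  # no feed delivered a bar, so next() would not be called
        # The logged time is the last completed bar of the clock feed.
        last_clock_bar = (m // clock_comp) * clock_comp
        stamps.append(start + timedelta(minutes=last_clock_bar))
    return stamps


start = datetime(2020, 1, 2, 9, 0)
labels = [s.strftime("%H:%M") for s in simulate_log(start, 16)]
print(labels)
```

Under these assumptions the call at minute 5 is triggered by the 5-minute feed alone, so the 2-minute clock still reads 09:04, and the same happens at minute 15 with 09:14. When the smallest compression divides the others (for example 2 and 4), every call coincides with a new 2-minute bar and no timestamp repeats.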
-
@dasch This is correct. I was playing with the code and if you set the time frames to say, [2, 4, 8] or [5, 10, 20] then the problem disappears.
-
@dasch Yes. I realized that because the lowest timeframe is 2 minutes, there will never be a timestamp ending on 5. I tried the following workaround, where I separately add the one-minute timeframe:
    # Add a strategy
    cerebro.addstrategy(resampling)

    # Create a Data Feed
    data = bt.feeds.PandasData(dataname=df2020)
    one_minute = cerebro.resampledata(data, timeframe=bt.TimeFrame.Minutes, compression=1)
    two_minute = cerebro.resampledata(data, timeframe=bt.TimeFrame.Minutes, compression=2)
    five_minute = cerebro.resampledata(data, timeframe=bt.TimeFrame.Minutes, compression=5)
    ten_minute = cerebro.resampledata(data, timeframe=bt.TimeFrame.Minutes, compression=10)
when logging the 5-minute closes I can see now it works properly
    2020-01-02 09:00, 1.12075
    2020-01-02 09:01, 1.12075
    2020-01-02 09:02, 1.12075
    2020-01-02 09:03, 1.12075
    2020-01-02 09:04, 1.12075
    2020-01-02 09:05, 1.12087
    2020-01-02 09:06, 1.12087
    2020-01-02 09:07, 1.12087
    2020-01-02 09:08, 1.12087
    2020-01-02 09:09, 1.12087
    2020-01-02 09:10, 1.12086
    2020-01-02 09:11, 1.12086
    2020-01-02 09:12, 1.12086
    2020-01-02 09:13, 1.12086
    2020-01-02 09:14, 1.12086
    2020-01-02 09:15, 1.1207799999999999
thanks for the help guys
-
When using multiple data sources, you will be notified of every change: when resampling, whenever a data source advances; when replaying, whenever a data source changes (in that case the data will not necessarily advance). So to know whether a data source advanced, you need to check its length.
example:
    # self._last_len is assumed to be set to 0 in __init__
    if len(self.datas[1]) > self._last_len:
        self.log(self.datas[1].close[0])
    self._last_len = len(self.datas[1])
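The length check can be wrapped in a small helper so that each feed gets its own counter. This is a minimal sketch, not part of backtrader: `AdvanceTracker` is a hypothetical name, and the plain list below only stands in for a 5-minute feed. The helper relies on nothing but `len()`, which backtrader data feeds support.

```python
class AdvanceTracker:
    """Remember the last seen length of a feed and report whether it grew."""

    def __init__(self):
        self._last_len = 0

    def advanced(self, feed):
        current = len(feed)
        grew = current > self._last_len
        self._last_len = current
        return grew


# Stand-in for a 5-minute feed: a list that gains a bar only on some calls.
five_minute = []
tracker = AdvanceTracker()
logged = []

# Each entry models one call to next(); None means the feed did not advance.
calls = [None, None, 1.12075, None, None, 1.12083, None]
for new_close in calls:
    if new_close is not None:
        five_minute.append(new_close)
    if tracker.advanced(five_minute):
        logged.append(five_minute[-1])

print(logged)
```

In a strategy, the same object could be created in `__init__` and queried in `next` with `self.datas[1]`, so that the 5-minute close is logged once per new 5-minute bar.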
-
The next problem I'm running into: when I create my custom indicator in init, I'm getting an error.

    def __init__(self):
        self.TI = Trend_Indicator(self.datas[0])
error:
    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    <ipython-input-39-1a3e58d6bea6> in <module>
         39 # cerebro.adddata(data)
         40
    ---> 41 cerebro.run()

    ~\Anaconda3\lib\site-packages\backtrader\cerebro.py in run(self, **kwargs)
       1125             # let's skip process "spawning"
       1126             for iterstrat in iterstrats:
    -> 1127                 runstrat = self.runstrategies(iterstrat)
       1128                 self.runstrats.append(runstrat)
       1129                 if self._dooptimize:

    ~\Anaconda3\lib\site-packages\backtrader\cerebro.py in runstrategies(self, iterstrat, predata)
       1215                 sargs = self.datas + list(sargs)
       1216                 try:
    -> 1217                     strat = stratcls(*sargs, **skwargs)
       1218                 except bt.errors.StrategySkipError:
       1219                     continue  # do not add strategy to the mix

    ~\Anaconda3\lib\site-packages\backtrader\metabase.py in __call__(cls, *args, **kwargs)
         86             _obj, args, kwargs = cls.donew(*args, **kwargs)
         87             _obj, args, kwargs = cls.dopreinit(_obj, *args, **kwargs)
    ---> 88             _obj, args, kwargs = cls.doinit(_obj, *args, **kwargs)
         89             _obj, args, kwargs = cls.dopostinit(_obj, *args, **kwargs)
         90             return _obj

    ~\Anaconda3\lib\site-packages\backtrader\metabase.py in doinit(cls, _obj, *args, **kwargs)
         76
         77     def doinit(cls, _obj, *args, **kwargs):
    ---> 78         _obj.__init__(*args, **kwargs)
         79         return _obj, args, kwargs
         80

    <ipython-input-39-1a3e58d6bea6> in __init__(self)
         18
         19         # self.DVM = Decreasing_Volatility_Mode(self.datas[0])
    ---> 20         self.TI = Trend_Indicator(self.datas[0])
         21
         22     def next(self):

    ~\Anaconda3\lib\site-packages\backtrader\indicator.py in __call__(cls, *args, **kwargs)
         51     def __call__(cls, *args, **kwargs):
         52         if not cls._icacheuse:
    ---> 53             return super(MetaIndicator, cls).__call__(*args, **kwargs)
         54
         55         # implement a cache to avoid duplicating lines actions

    ~\Anaconda3\lib\site-packages\backtrader\metabase.py in __call__(cls, *args, **kwargs)
         86             _obj, args, kwargs = cls.donew(*args, **kwargs)
         87             _obj, args, kwargs = cls.dopreinit(_obj, *args, **kwargs)
    ---> 88             _obj, args, kwargs = cls.doinit(_obj, *args, **kwargs)
         89             _obj, args, kwargs = cls.dopostinit(_obj, *args, **kwargs)
         90             return _obj

    ~\Anaconda3\lib\site-packages\backtrader\metabase.py in doinit(cls, _obj, *args, **kwargs)
         76
         77     def doinit(cls, _obj, *args, **kwargs):
    ---> 78         _obj.__init__(*args, **kwargs)
         79         return _obj, args, kwargs
         80

    <ipython-input-5-cc6e0edefdb2> in __init__(self)
         11         self.BB = bt.indicators.BollingerBands(self.datas[0])
         12
    ---> 13         self.ohlc.open = self.datas[0].open[0]
         14         self.ohlc.high = self.datas[0].high[0]
         15         self.ohlc.low = self.datas[0].low[0]

    ~\Anaconda3\lib\site-packages\backtrader\linebuffer.py in __getitem__(self, ago)
        161
        162     def __getitem__(self, ago):
    --> 163         return self.array[self.idx + ago]
        164
        165     def get(self, ago=0, size=1):

    IndexError: array index out of range
-
Your indicator seems to need some data to fill. Try to add a minperiod to it. Search the forum, since this is a common issue.
-
@dasch
thanks,
Yes, I forgot to mention that it uses EMAs (8 and 21). When I run it without resampling it works, but it doesn't when I resample.
I'll have a look
-
Without some code that reproduces the error it is hard to tell. But if your indicator needs some data to work on and calculates the trend in next, then you have to wait until both averages have enough data before accessing them.
So if you set up the sma in the trend indicator, set the min period to the period of the bigger sma:
self.addminperiod(period of bigger sma)
-
I mean ema. So in your case self.addminperiod(21) or higher if you do something with past ema data
-
@dasch I added self.addminperiod(21) in init in the indicator but that still doesn't seem to solve the issue
-
Provide some sample code to show the problem. Basically you try to access data which is not available. That’s all that can be said with the error you posted.
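To illustrate that point: the last frames of the traceback posted earlier show `self.ohlc.open = self.datas[0].open[0]` being executed inside the indicator's `__init__`, followed by `return self.array[self.idx + ago]` raising the IndexError. One plausible reading, offered as an assumption since the indicator code was not posted, is that a bar is being indexed before any bar has been delivered, which a minimum period would not change because it only affects when `next` starts being called. The sketch below is a toy model of that lookup, not backtrader's implementation; `ToyLine` and `ToyIndicator` are made-up names.

```python
class ToyLine:
    """Toy stand-in for a data line: values arrive one bar at a time."""

    def __init__(self):
        self.array = []
        self.idx = -1  # nothing delivered yet

    def forward(self, value):
        self.array.append(value)
        self.idx += 1

    def __getitem__(self, ago):
        # Same shape as the lookup shown in the traceback.
        return self.array[self.idx + ago]


class ToyIndicator:
    def __init__(self, open_line):
        # Keep a reference to the line; do not index it here.
        self.open_line = open_line
        self.last_open = None

    def next(self):
        # Indexing is safe once at least one bar has been delivered.
        self.last_open = self.open_line[0]


line = ToyLine()
try:
    line[0]  # what indexing during __init__ amounts to: no bars yet
    failed_in_init = False
except IndexError:
    failed_in_init = True

indicator = ToyIndicator(line)
line.forward(1.12075)
indicator.next()
print(failed_in_init, indicator.last_open)
```

If this reading applies, moving the `[0]` reads from `__init__` into `next` would be the change to try.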