PandasData won't read a datetime column
-
This is my code:
    import ipdb
    import backtrader as bt
    import pandas

    cerebro = bt.Cerebro(stdstats=False)
    cerebro.addstrategy(bt.Strategy)

    dataframe = pandas.read_json('BTC_XMR-30m.json')
    data = bt.feeds.PandasData(
        dataname=dataframe,
        datetime=1
    )
    cerebro.adddata(data)

    cerebro.run()
    cerebro.plot(style='bar')
dataframe looks like this:
          close                date      high       low      open  quoteVolume  \
    0  0.004200 2014-07-18 16:00:00  0.004316  0.004030  0.004045  3412.578648
    1  0.004180 2014-07-18 16:30:00  0.004295  0.004180  0.004295    80.464673
    2  0.004226 2014-07-18 17:00:00  0.004500  0.004150  0.004195  2041.539606
    3  0.004404 2014-07-18 17:30:00  0.004509  0.004226  0.004226  1840.058267
    4  0.004435 2014-07-18 18:00:00  0.004511  0.004404  0.004404   866.016845

          volume  weightedAverage
    0  14.166824         0.004151
    1   0.338532         0.004207
    2   8.865568         0.004343
    3   8.079922         0.004391
    4   3.851588         0.004447
According to the docs I can tell PandasData that the date is in column 1, which I did in the constructor. But then it fails with this:
    (exch) shinichi@ayanami ~/source/poloniex $ ipython --pdb importer.py
    ---------------------------------------------------------------------------
    KeyError                                  Traceback (most recent call last)
    ~/.virtualenvs/exch/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
       2441             try:
    -> 2442                 return self._engine.get_loc(key)
       2443             except KeyError:

    pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5280)()

    pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5126)()

    pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20523)()

    pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20477)()

    KeyError: 1

    During handling of the above exception, another exception occurred:

    KeyError                                  Traceback (most recent call last)
    ~/source/poloniex/importer.py in <module>()
         14 cerebro.adddata(data)
         15
    ---> 16 cerebro.run()
         17 cerebro.plot(style='bar')

    ~/.virtualenvs/exch/lib/python3.6/site-packages/backtrader/cerebro.py in run(self, **kwargs)
       1071                 # let's skip process "spawning"
       1072                 for iterstrat in iterstrats:
    -> 1073                     runstrat = self.runstrategies(iterstrat)
       1074                     self.runstrats.append(runstrat)
       1075             else:

    ~/.virtualenvs/exch/lib/python3.6/site-packages/backtrader/cerebro.py in runstrategies(self, iterstrat, predata)
       1145                 if self._exactbars < 1:  # datas can be full length
       1146                     data.extend(size=self.params.lookahead)
    -> 1147                 data._start()
       1148                 if self._dopreload:
       1149                     data.preload()

    ~/.virtualenvs/exch/lib/python3.6/site-packages/backtrader/feed.py in _start(self)
        201
        202     def _start(self):
    --> 203         self.start()
        204
        205         if not self._started:

    ~/.virtualenvs/exch/lib/python3.6/site-packages/backtrader/feeds/pandafeed.py in start(self)
        205             if v is None:
        206                 continue  # special marker for datetime
    --> 207             self._colmapping[k] = self.p.dataname.columns.get_loc(v)
        208
        209     def _load(self):

    ~/.virtualenvs/exch/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
       2442                 return self._engine.get_loc(key)
       2443             except KeyError:
    -> 2444                 return self._engine.get_loc(self._maybe_cast_indexer(key))
       2445
       2446         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

    pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5280)()

    pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5126)()

    pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20523)()

    pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20477)()

    KeyError: 1
    > /Users/shinichi/source/poloniex/pandas/_libs/hashtable_class_helper.pxi(1218)pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20477)()

    ipdb>
However, when I changed the code to look like this:
    data = bt.feeds.PandasData(
        dataname=dataframe,
        datetime='date'
    )
It still fails!
    (exch) shinichi@ayanami ~/source/poloniex $ ipython --pdb importer.py
    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    ~/source/poloniex/importer.py in <module>()
         14 cerebro.adddata(data)
         15
    ---> 16 cerebro.run()
         17 cerebro.plot(style='bar')

    ~/.virtualenvs/exch/lib/python3.6/site-packages/backtrader/cerebro.py in run(self, **kwargs)
       1071                 # let's skip process "spawning"
       1072                 for iterstrat in iterstrats:
    -> 1073                     runstrat = self.runstrategies(iterstrat)
       1074                     self.runstrats.append(runstrat)
       1075             else:

    ~/.virtualenvs/exch/lib/python3.6/site-packages/backtrader/cerebro.py in runstrategies(self, iterstrat, predata)
       1147                 data._start()
       1148                 if self._dopreload:
    -> 1149                     data.preload()
       1150
       1151             for stratcls, sargs, skwargs in iterstrat:

    ~/.virtualenvs/exch/lib/python3.6/site-packages/backtrader/feed.py in preload(self)
        433
        434     def preload(self):
    --> 435         while self.load():
        436             pass
        437

    ~/.virtualenvs/exch/lib/python3.6/site-packages/backtrader/feed.py in load(self)
        474
        475             if not self._fromstack(stash=True):
    --> 476                 _loadret = self._load()
        477                 if not _loadret:  # no bar use force to make sure in exactbars
        478                     # the pointer is undone this covers especially (but not

    ~/.virtualenvs/exch/lib/python3.6/site-packages/backtrader/feeds/pandafeed.py in _load(self)
        235         else:
        236             # it's in a different column ... use standard column index
    --> 237             tstamp = self.p.dataname.index[coldtime][self._idx]
        238
        239         # convert to float via datetime and store it

    IndexError: invalid index to scalar variable.
What's wrong with PandasData? So far the documentation isn't very clear; at one point I thought the standard procedure was to subclass bt.feeds.PandasData.
-
Even if it doesn't seem obvious, a two-line sample of the original data (you can fake it if you wish) would be key to actually reproducing your behavior.
All those ipdb statements are only confusing, giving information about many things which have nothing to do with the potential problem.
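For illustration only, a faked sample could look something like the sketch below. The values are copied from the first two rows of the frame posted above. The lookup on columns shows that get_loc resolves labels rather than positions, which matches the KeyError: 1 in the first traceback. The set_index call at the end is an assumption about a possible workaround, not the fix itself: it puts the timestamps in the index, which is where PandasData looks for them when datetime is left at its default of None.

```python
import pandas as pd

# Faked two-row sample with the same columns as the frame in the question.
dataframe = pd.DataFrame({
    "close": [0.004200, 0.004180],
    "date": pd.to_datetime(["2014-07-18 16:00:00", "2014-07-18 16:30:00"]),
    "high": [0.004316, 0.004295],
    "low": [0.004030, 0.004180],
    "open": [0.004045, 0.004295],
    "quoteVolume": [3412.578648, 80.464673],
    "volume": [14.166824, 0.338532],
    "weightedAverage": [0.004151, 0.004207],
})

# columns.get_loc() looks up a label, not a position: the label 'date'
# resolves to a position, while the integer 1 is not a column label here.
date_position = dataframe.columns.get_loc("date")
try:
    dataframe.columns.get_loc(1)
    integer_lookup_failed = False
except KeyError:
    integer_lookup_failed = True

# Possible workaround (an assumption, not the change in the commit): move the
# timestamps into the index, the layout PandasData reads when datetime=None.
indexed = dataframe.set_index("date")

print(date_position, integer_lookup_failed)
print(indexed)
```

With the frame indexed like this, passing dataname=indexed without a datetime argument should be enough, since nothing has to be looked up among the columns.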
In any case, this is probably due to a recent pull request that removed the usage of ix (deprecated in the latest versions of Pandas), replacing it with either loc (label based) or iloc (numeric based), and which didn't catch all use cases (in most cases the datetime timestamps are the index of the dataframe).
The latest development commit contains some code to correct the situation and simplify the autodetection: