using backtrader with pandas
-
I am new here and considering moving to backtrader, but i do have a lot of code that uses pandas dataframes. i am wondering if all that code needs to be rewritten using the backtrader objects that represent series?
For example, typically, i keep all my indicators and calculated values as part of the same ohlcv dataframe, since they all must align to the same datetimeindex. For example:
    import talib

    df = dataset  # dataset is a DataFrame with ohlcv columns and a datetime index
    slow_per = 21; fast_per = 8; rsi_per = 2; rsi_ob = 90; rsi_os = 10
    df['ma_slow'] = talib.SMA(df.close, timeperiod=slow_per)
    df['ma_fast'] = talib.SMA(df.close, timeperiod=fast_per)
    df['rsi'] = talib.RSI(df.close, timeperiod=rsi_per)
    _, _, df['macdhist'] = talib.MACD(df.close, 12, 26, 9)
This makes it simple in my code to slice and dice, wrangle, filter and perform calculations...and these operations are very optimized to boot with latest versions of the libraries.
Is there a possibility of working with pandas dataframes seamlessly with backtrader?
thanks much.
-
@dizzy0ny No, you need to use a data feed.
backtrader is an event-based backtesting framework, and based on your short description you are using vector-based backtesting.
The second mode is faster, the first one is more accurate, and you need to translate the logic from one method to the other.
The use of talib is allowed in the platform with a data feed, so the process will be easy: laborious, but easy.
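To make the "translate the logic from one method to the other" point concrete, here is a minimal pure-Python sketch (no backtrader, pandas or talib) of the same moving-average comparison written both ways. The function names and the sample prices are made up for illustration, and it assumes `fast <= slow`. The whole-history version computes every value up front; the bar-by-bar version only ever sees the prices received so far, which is roughly how an event-based strategy is driven.

```python
def sma(values, period):
    """Simple moving average; None until `period` values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)
        else:
            window = values[i + 1 - period : i + 1]
            out.append(sum(window) / period)
    return out


def vector_signals(closes, fast, slow):
    """Whole-history style: build both averages first, then compare them."""
    fast_ma, slow_ma = sma(closes, fast), sma(closes, slow)
    return [
        f is not None and s is not None and f > s
        for f, s in zip(fast_ma, slow_ma)
    ]


def event_signals(closes, fast, slow):
    """Bar-by-bar style: each decision uses only the bars seen so far."""
    seen, signals = [], []
    for price in closes:  # one "event" per new bar
        seen.append(price)
        if len(seen) < slow:  # not enough history yet
            signals.append(False)
            continue
        fast_ma = sum(seen[-fast:]) / fast
        slow_ma = sum(seen[-slow:]) / slow
        signals.append(fast_ma > slow_ma)
    return signals


closes = [10, 11, 12, 11, 10, 9, 10, 12, 13, 14]  # made-up prices
print(vector_signals(closes, 2, 4))
print(event_signals(closes, 2, 4))
```

Both functions give the same signals here; the difference is that the second one cannot accidentally read a future bar, because that bar has not been appended to `seen` yet.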
-
@dizzy0ny There is the PandasData class for reading data feeds from Pandas, but it really translates the DataFrame to Backtrader lines, so you are still working with BT lines and indicators in your strategy. Note that BT indicators support some mathematical operations, like the one below to compute daily percent change (analogous to the Pandas pct_change() function):
    self.rets = (self.datas[0].close / self.datas[0].close(-1) - 1)
But BT's math support is not nearly as rich as Pandas. However you could calculate all your indicators in Pandas and add them as additional columns in your DataFrame and load them into a custom feed. Something like this:
    import backtrader as bt
    import pandas as pd

    class CustomPandasFeed(bt.feeds.PandasData):
        # extra lines on top of the standard OHLCV ones
        lines = ('rtn1d', 'vol1m',)
        # map each line to the DataFrame column that feeds it
        params = (
            ('datetime', 'Date'),
            ('open', 'Open'),
            ('high', 'High'),
            ('low', 'Low'),
            ('close', 'Close'),
            ('volume', 'Volume'),
            ('openinterest', None),
            ('adj_close', 'Adj Close'),
            ('rtn1d', 'rtn1d'),
            ('vol1m', 'vol1m'),
        )
    ...
    df = pd.read_csv(datapath, parse_dates=['Date'])
    df['rtn1d'] = df.Close.pct_change(1)                     # 1-day return
    df['vol1m'] = df.rtn1d.rolling(21).std() * (252 ** 0.5)  # annualized 21-day volatility
    df = df.dropna(axis=0)
    data = CustomPandasFeed(dataname=df)
    ...
    cerebro.adddata(data)
Then those additional fields are accessible in your strategy as additional lines (i.e. self.datas[0].rtn1d and self.datas[0].vol1m).
-
@ultra1971 said in using backtrader with pandas:
based on your short description you are using vector-based backtesting.
The second mode is faster, the first one is more accurate, and you need to translate the logic from one method to the other.
The use of talib is allowed in the platform with a data feed, so the process will be easy: laborious, but easy.
thank you. i was curious as to what advantages there are in event based (or time based) systems. Seems to me with vector based, you can almost instantaneously generate the history for buy/sell signals - even if the signals are time based (i.e. event 2 is preceded by event 1 being True).
i will do some testing w/ backtrader
-
@davidavr said in using backtrader with pandas:
CustomPandasFeed
So for this approach, every additional series i would want in my dataframe (which will be converted to a list/array of 'lines') would first need to be defined in that CustomPandasFeed? Or will the existing PandasData do that without having to define a custom feed?
This was helpful, thanks.
-
@dizzy0ny I think you'd need a custom feed derived from PandasData, like in my example, with all the additional "lines". But I suppose you could create one that just dynamically added the lines based on the columns in your DataFrame. Doing it explicitly isn't that complicated, though, as my example shows.
As for vector vs. event-based backtesting, I think it's probably true that a vector-based approach is more powerful in some ways and bound to be a lot faster, but I think there is some logic that is easier to express in an event-driven approach, which might lead to fewer mistakes (although I'm speculating a bit). The event-driven approach is also "safer" in that you can't cheat and accidentally look at future values. Finally, Backtrader makes it pretty straightforward to switch from backtesting to live trading. That might be more challenging with a vector-based system.
BTW, here's a vector-based Python backtesting project I found that looks interesting: vectorbt
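On the idea of adding the lines dynamically from the DataFrame columns, one possible starting point is to derive the `lines` and `params` tuples from the column names. The sketch below is plain Python and does not import backtrader; `extra_feed_spec` and `SKIP` are hypothetical names, not part of the library, and treating "date" as a standard column is an assumption based on the example above. The commented-out part at the end is an untested guess at how the tuples could be handed to a `PandasData` subclass.

```python
# Names of the standard fields, plus "date" (an assumption for DataFrames
# that keep the timestamp in a "Date" column, as in the example above).
SKIP = {"datetime", "date", "open", "high", "low", "close", "volume", "openinterest"}


def extra_feed_spec(columns, skip=SKIP):
    """Return (lines, params) tuples for the non-standard columns.

    Hypothetical helper, not part of backtrader. Columns whose names are
    not valid Python identifiers (e.g. "Adj Close") are left out, because
    a line is accessed as an attribute such as self.datas[0].rtn1d.
    """
    extra = tuple(
        c for c in columns
        if isinstance(c, str) and c.isidentifier() and c.lower() not in skip
    )
    # Map each extra line to the column of the same name.
    params = tuple((name, name) for name in extra)
    return extra, params


columns = ["Date", "Open", "High", "Low", "Close", "Adj Close", "Volume", "rtn1d", "vol1m"]
lines, params = extra_feed_spec(columns)
print(lines)
print(params)

# With backtrader installed, the tuples might then be used roughly like this
# (untested sketch; the standard OHLCV mappings from the explicit example
# would still need to be supplied as params):
#   import backtrader as bt
#   DynamicFeed = type("DynamicFeed", (bt.feeds.PandasData,),
#                      {"lines": lines, "params": params})
#   data = DynamicFeed(dataname=df)
```

For a handful of columns the explicit class is probably still easier to read; something like this only starts to pay off when the set of indicator columns changes often.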
-
@davidavr yes i came across vectorbt earlier. it seems interesting as well.
-
@dizzy0ny said in using backtrader with pandas:
i was curious as to what advantages there are in event based (or time based) systems. Seems to me with vector based, you can almost instantaneously generate the history for buy/sell signals - even if the signals are time based (i.e. event 2 is preceded by event 1 being True).
This is a good question, but it is actually a generic backtesting question rather than a backtrader question. There are plenty of articles and discussions available. Try searching the net for "backtesting event vs. vector".