Dynamically set params and lines attribute of class PandasData
-
Hello,
I am very new to backtrader and hope to get some advice here.
Here is what I am trying to do:
Assume I get data as a pandas DataFrame. One or more columns are to be used in the strategy, but I don't know their names. How can I dynamically change the "params" attribute of bt.feeds.PandasData? Note that the resulting complications with the strategy are irrelevant at this point.
Consider the following reproducible example:
import datetime
import backtrader as bt
import numpy as np
import pandas as pd


class TestStrategy(bt.Strategy):

    def log(self, txt, dt=None):
        dt = dt or self.datas[0].datetime.date(0)
        print('%s, %s' % (dt.isoformat(), txt))

    def __init__(self):
        self.dataclose = self.datas[0].close
        self.col_a = self.datas[0].col_a
        self.order = None
        self.buyprice = None
        self.buycomm = None

    def notify_order(self, order):
        if order.status in [order.Submitted, order.Accepted]:
            return

        if order.status in [order.Completed]:
            if order.isbuy():
                self.log(
                    'BUY EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' %
                    (order.executed.price,
                     order.executed.value,
                     order.executed.comm))
                self.buyprice = order.executed.price
                self.buycomm = order.executed.comm
            else:  # Sell
                self.log('SELL EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' %
                         (order.executed.price,
                          order.executed.value,
                          order.executed.comm))
            self.bar_executed = len(self)

        elif order.status in [order.Canceled, order.Margin, order.Rejected]:
            self.log('Order Canceled/Margin/Rejected')

        # Write down: no pending order
        self.order = None

    def notify_trade(self, trade):
        if not trade.isclosed:
            return
        self.log('OPERATION PROFIT, GROSS %.2f, NET %.2f' %
                 (trade.pnl, trade.pnlcomm))

    def next(self):
        self.log('Close, %.2f' % self.dataclose[0])

        if self.order:
            return

        if not self.position:
            if (self.col_a[0] < 1):
                self.log('BUY CREATE, %.2f' % self.dataclose[0])
                self.order = self.buy(size=500)
        else:
            if (self.col_a[0] > 1):
                self.log('SELL CREATE, %.2f' % self.dataclose[0])
                self.order = self.sell(size=500)


class BacktraderDataDefintion(bt.feeds.PandasData):
    lines = ("col_a",)
    params = (
        ('datetime', None),
        ('open', -1),
        # ('high', None),
        # ('low', None),
        ('close', -1),
        # ('volume', None),
        ('col_a', -1),
    )


def createSampleData():
    a = np.random.randint(0, 4, size=15)
    open = np.random.randint(0, 10, size=15)
    close = np.random.randint(0, 10, size=15)
    df = pd.DataFrame({"Open": open, "Close": close, "col_a" if a[0] <= 1 else "col_b": a},
                      index=pd.date_range('2018-01-01', periods=15, freq='H'))
    return df


if __name__ == '__main__':
    cerebro = bt.Cerebro()
    cerebro.addstrategy(TestStrategy)
    cerebro.broker.setcommission(commission=0.001)

    # Create a Data Feed
    data = BacktraderDataDefintion(dataname=createSampleData())
    cerebro.adddata(data)
    cerebro.broker.setcash(10000.0)

    print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())
    cerebro.run()
    print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())
As you can see, the extra column in the data is sometimes called "col_a" and sometimes "col_b". How can I make sure that the class attributes "lines" and "params" in class "BacktraderDataDefintion" get adjusted automatically, depending on the pandas DataFrame?
The expected output would be:
lines = ("col_a",)
params = (
    ('datetime', None),
    ('open', -1),
    # ('high', None),
    # ('low', None),
    ('close', -1),
    # ('volume', None),
    ('col_a', -1),
)
or
lines = ("col_b",)
params = (
    ('datetime', None),
    ('open', -1),
    # ('high', None),
    # ('low', None),
    ('close', -1),
    # ('volume', None),
    ('col_b', -1),
)
depending on the columns in dataframe "df".
Hopefully someone has an idea.
-
I am not too sure what you are trying to accomplish, but in my experience it is quite effective to manipulate the DataFrame directly with pandas before importing it into backtrader. However, if you want to create additional lines for your data, for example a new line derived from moving-average data, you can initialize them in the strategy class itself.
-
Apparently I should have given setattr() a second try.
Adding:
setattr(BacktraderDataDefintion.params, "col_a", -1)
seems to work.
class BacktraderDataDefintion(bt.feeds.PandasData):
    lines = ("col_a",)
    params = (
        ('datetime', None),
        ('open', -1),
        # ('high', None),
        # ('low', None),
        ('close', -1),
        # ('volume', None),
    )


if __name__ == '__main__':
    cerebro = bt.Cerebro()
    cerebro.addstrategy(TestStrategy)
    cerebro.broker.setcommission(commission=0.001)

    # Add params
    setattr(BacktraderDataDefintion.params, "col_a", -1)

    # Create a Data Feed
    data = BacktraderDataDefintion(dataname=createSampleData())
    cerebro.adddata(data)
    cerebro.broker.setcash(10000.0)

    print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())
    cerebro.run()
    print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())
However, I am lost on the "lines" class attribute. Especially because of this:
dir(BacktraderDataDefintion.lines)
prints
['__class__', '__delattr__', .. 'buflen', 'close', 'col_a', 'datetime', 'extend', ... 'size', 'volume']
but:
getattr(BacktraderDataDefintion.lines, "col_a")
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
...
AttributeError: 'NoneType' object has no attribute 'lines'
How is this possible?
@Robin-Dhillon Manipulating the *.csv is not an option.
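One plausible explanation for the dir()/getattr() puzzle above, offered as an assumption about backtrader's internals rather than a confirmed reading of its source: if each line name is installed on the lines class as a descriptor whose __get__ reads from the instance, then dir() lists the name without calling the descriptor, while access through the class calls __get__ with obj=None and fails. The sketch below reproduces that behaviour in plain Python. LinesHolder, LineAliasSketch and LinesSketch are hypothetical names and are not backtrader classes.

```python
class LinesHolder:
    """Hypothetical stand-in for an object that owns the actual line buffers."""

    def __init__(self):
        self.lines = ["close-buffer", "col_a-buffer"]


class LineAliasSketch:
    """Hypothetical descriptor that maps an attribute name to a line index."""

    def __init__(self, index):
        self.index = index

    def __get__(self, obj, cls=None):
        # Access through the class passes obj=None, so obj.lines raises
        # AttributeError; access through an instance works.
        return obj.lines[self.index]


class LinesSketch(LinesHolder):
    close = LineAliasSketch(0)
    col_a = LineAliasSketch(1)


def class_access_error():
    """Return the error message raised by class-level access, or None."""
    try:
        getattr(LinesSketch, "col_a")
    except AttributeError as exc:
        return str(exc)
    return None


print("col_a" in dir(LinesSketch))   # the name is listed on the class
print(class_access_error())          # class-level access fails
print(LinesSketch().col_a)           # instance-level access succeeds
```

If this is indeed what happens, the attribute is meant to be read from a feed instance (as the strategy does with self.datas[0].col_a), not from the class.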
-
Sorry. Edit is not working.
Can you specify how you initialize a new line in a strategy based on a column in the existing dataframe?
-
You need to dynamically subclass PandasData using type, and with it the lines and params you want to have, based on the contents of your dataframe
-
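A minimal sketch of the suggestion above, assuming that every column which is not one of the standard PandasData fields should become an extra line whose position is autodetected (-1). The names STANDARD_FIELDS, extra_columns, make_feed_class and StandInFeed are hypothetical. The base class is passed in as an argument so that the helper does not import backtrader; in real use it would presumably be bt.feeds.PandasData.

```python
# Field names treated as "standard" in this sketch (an assumption; adjust as needed).
STANDARD_FIELDS = {"datetime", "open", "high", "low", "close", "volume", "openinterest"}


def extra_columns(columns):
    """Return the column names that are not standard fields, in original order."""
    return tuple(c for c in columns if str(c).lower() not in STANDARD_FIELDS)


def make_feed_class(base, columns, name="DynamicPandasData"):
    """Build a subclass of `base` whose lines/params cover the extra columns."""
    extras = extra_columns(columns)
    attrs = {
        "lines": extras,                               # one line per extra column
        "params": tuple((col, -1) for col in extras),  # -1: autodetect the column
    }
    # Three-argument type() creates the class at runtime.
    return type(name, (base,), attrs)


class StandInFeed:
    """Hypothetical placeholder used here instead of bt.feeds.PandasData."""


FeedClass = make_feed_class(StandInFeed, ["Open", "Close", "col_b"])
print(FeedClass.lines, FeedClass.params)
```

With backtrader installed, the call would look something like make_feed_class(bt.feeds.PandasData, df.columns), followed by cerebro.adddata(FeedClass(dataname=df)). Column names used this way need to be valid Python identifiers, since the strategy reads them as attributes.
-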
@lammy The initialization is entirely up to you. For example:
I have my close prices loaded into BT as:
self.dataclose = self.datas[0].close
Now suppose my strategy buys the asset/instrument when (self.dataclose x 2) is greater than some value 'x'.
self.close2 = self.dataclose*2
So essentially, all I did was take the pandas column, explicitly multiply it by two, and assign the result to another object, self.close2.
Now you can use this object in your next() method to create your buy/sell orders.
-
Big Thanks to @backtrader
Dynamic subclassing did the job. Absolutely fantastic!
Thanks for your input. My problem, though, was that I had no information about the column names of my data, hence I did not know upfront whether there is, e.g., a "closed" column.
-
Here is an example of what I did:
Generating the lines and params tuples by passing in list(df.columns):

def get_extra_df_columns(self, df_columns: list[str] = []) -> list[tuple]:
    lines: tuple[str, ...] = ()
    params: tuple[tuple, ...] = ()
    for column in df_columns:
        # Every column that is not a standard field becomes an extra line
        if column not in ['symbol', 'open', 'high', 'low', 'close', 'volume']:
            lines = lines + (column,)
            params = params + ((column, -1),)
    return [lines, params]
Creating the subclass from the df and returning it with the lines and params attributes set dynamically:

def _create_data_feeder(self, database: Database, df: pd.DataFrame):
    lines, params = database.get_extra_df_columns(df_columns=list(df.columns))
    return type('PandasDataFeed', (bt.feeds.PandasData,), {'lines': lines, 'params': params})
Adding the dataframe to cerebro via the created data_feeder subclass:

self.data_feeder = self._create_data_feeder(database=database, df=df)
self.cerebro.adddata(self.data_feeder(dataname=df, name=symbol))
The extra columns in the df are vwap and supertrend, and they are accessible without hardcoding the lines and params:

class TestStrategy(bt.Strategy):
    def __init__(self):
        for i, d in enumerate(self.datas):
            bt.ind.SMA(d.supertrend, period=1, subplot=False, plotname='supertrend')
            bt.ind.SMA(d.vwap, period=1, subplot=False, plotname='vwap')