@andi I just realized that the second column is already the actual portfolio value (i.e. cash + positions).
Best posts made by andi
-
RE: How to create pyfolio round trip tearsheet?
-
RE: Resampling from daily to monthly - lagging issue
@dasch said in Resampling from daily to monthly - lagging issue:
The datetime 2019-12-30 23:59:59 does not say there is no more data for this period, but the timestamp 2020-01-02 23:59:59 does. Between these two timestamps the period switches at 2020-01-01 00:00:00 and 2020-01-02 00:00:00.
I am probably not fully aware of how resampling works in backtrader. Your description implies (to me) that the resampling takes place at runtime, maybe in the `next` method? If this is the case, I can follow your description.
However, I thought the resampling takes place before I add the feed to `cerebro`. If backtrader resamples before running the strategy and adds the resampled feed, the program knows in advance when it is the last day of the month, because it knows the complete daily feed. Am I correct with that assumption?
Anyway, in real life we all know whether the 30th of a month will be the last trading day or not. And if this is the case, we know the monthly closing price as soon as the closing bell rings. At this stage I can use that price for any computation and issue an order, which will then be executed at the opening of the next bar.
So again, in my view, the implementation of the `resample` method is economically incorrect.
Latest posts made by andi
-
Extracting strategy signals
What's the best way to extract trading signals from a given strategy?
At the end of the day, I am looking for a `pd.DataFrame` comprising all securities in the attached data feeds in columns and daily observations in rows. If there is a buy signal for a security on a given day, I need a +1 in that cell; if it is a sell signal, I need a -1. Otherwise, I am expecting 0.
I don't care about position size, just the pure signal.
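To make the target layout concrete, here is a minimal sketch of the frame I have in mind. It assumes the signals are already available as `(date, ticker, signal)` records; how to collect them from a strategy is exactly the open question, and `signals_to_frame` is a hypothetical helper, not part of backtrader.

```python
import pandas as pd


def signals_to_frame(records, dates, tickers):
    """Build a dates x tickers frame of +1 / -1 / 0 from signal records.

    `records` is an iterable of (date, ticker, signal) tuples; this input
    format is an assumption made for illustration only.
    """
    # Start with 0 everywhere: no signal for any security on any day.
    frame = pd.DataFrame(
        0, index=pd.DatetimeIndex(dates), columns=list(tickers), dtype="int64"
    )
    for date, ticker, signal in records:
        frame.loc[pd.Timestamp(date), ticker] = signal
    return frame


dates = ["2020-01-02", "2020-01-03", "2020-01-06"]
tickers = ["DAX Index", "SPX Index"]
records = [("2020-01-02", "DAX Index", 1), ("2020-01-03", "SPX Index", -1)]
signals = signals_to_frame(records, dates, tickers)
```

Every cell that never received a record stays at 0, which matches the "otherwise 0" requirement.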
-
RE: Resampling from daily to monthly - lagging issue
@andi said in Resampling from daily to monthly - lagging issue:
What would you think would happen if I don't resample the daily feed but regularly add the monthly data as a second feed?
If I add another regular monthly data feed (instead of resampling my daily feed), all problems are gone. All prices behave as I previously laid out.
One is basically free to download the data with monthly periodicity or write your own resample function. I came up with something like this:
```python
import pandas as pd


def datafeed_to_monthly(df: pd.DataFrame) -> pd.DataFrame:
    """
    Resamples a daily data feed to a monthly time frame.

    The monthly datetime index exactly reflects the dates of the original
    daily index. For example, if the last trading day of May on the daily
    index is the 26th, the monthly bar for May is dated the 26th.

    Parameters
    ----------
    df
        Daily `pandas.DataFrame` comprising OHLC/OHLCV data. The dates must
        be the index of type `datetime`.

    Returns
    -------
    pandas.DataFrame
        Resampled monthly OHLC/OHLCV data.
    """
    # Work on a copy so the caller's frame does not gain a "date" column.
    df = df.copy()
    df["date"] = df.index
    # Aggregation rule per column; "date" keeps the last actual trading day.
    mapping = dict(date="last", open="first", high="max", low="min", close="last")
    if "volume" in df.columns:
        mapping["volume"] = "sum"
    # "BM" = business month end; newer pandas releases (2.2+) prefer "BME".
    return df.resample("BM").agg(mapping).set_index("date")
```
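For anyone who wants to check the idea on a few rows, here is a small self-contained sketch on made-up prices. It groups by calendar month with `to_period` instead of a resample alias; that swap is my own choice for the illustration (it avoids depending on the alias name), and `to_monthly_by_period` is a hypothetical helper name.

```python
import pandas as pd


def to_monthly_by_period(df: pd.DataFrame) -> pd.DataFrame:
    """Aggregate daily OHLC(V) rows per calendar month.

    Each monthly bar is labelled with the last trading day that actually
    appears in the daily index for that month.
    """
    agg = {"open": "first", "high": "max", "low": "min", "close": "last"}
    if "volume" in df.columns:
        agg["volume"] = "sum"
    agg["date"] = "last"
    # assign() returns a new frame, so the input is left untouched.
    work = df.assign(date=df.index)
    monthly = work.groupby(work.index.to_period("M")).agg(agg)
    return monthly.set_index("date")


# Synthetic daily bars spanning a month boundary (values are made up).
daily = pd.DataFrame(
    {
        "open": [10.0, 11.0, 12.0, 13.0],
        "high": [11.0, 12.5, 13.0, 14.0],
        "low": [9.5, 10.5, 11.5, 12.5],
        "close": [10.5, 12.0, 12.5, 13.5],
    },
    index=pd.to_datetime(["2021-05-25", "2021-05-26", "2021-06-01", "2021-06-02"]),
)
monthly = to_monthly_by_period(daily)
```

The May bar is dated 2021-05-26, i.e. the last date present in the daily data, not the calendar month end.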
As a result, I come to the conclusion that the implementation of the `resample` method is not what I would expect. I would probably consider it incorrect, at least with regard to resampling daily to monthly data. However, I am happy to discuss this interpretation.
-
RE: Resampling from daily to monthly - lagging issue
@run-out I have no access to my machine right now. As you know, I am just starting with backtrader, i.e. I am not familiar with `rightedge` and boundaries. However, if `rightedge=False` didn't do the trick, you could try `boundoff=1` (just guessing on my side, trial & error).
Anyway, I think it's a bit weird that we have to scratch our heads about upsampling. In my view, all the standard settings should lead to the result that I laid out previously. I am wondering whether the resampling method has a bug. On the other hand, backtrader seems to be a very mature framework and I would be surprised if I were the first one to stumble upon this issue.
Using cheat-on-open doesn't sound right either. I don't know if this would solve the issue at hand. However, it may lead to other "issues" down the road. I don't want to cheat, but I would like to have a realistic setup.
What would you think would happen if I don't resample the daily feed but regularly add the monthly data as a second feed?
-
RE: Resampling from daily to monthly - lagging issue
@run-out Let me try to get hold of this topic from a different perspective.
Let's assume I am operating on monthly data exclusively, i.e. only one data feed/no resampling. Let's say, I want to issue a buy order right after the monthly close if the closing value is greater than 100. It is my understanding that backtrader will issue this order right after the close and it will get executed with the opening price of the next bar, i.e. the very first price of the following trading day (which is then actually the opening price of the next monthly candle). Is this understanding correct? That would be a real-life scenario.
Now let's compare this to a situation where I use a daily feed as well as a resampled monthly feed. I again want to get my long order executed with the opening price of the first trading day of the new month. However, if I base my trading decision on monthly data (closing > 100), I can't take the decision after the close on the last trading day, because the resampled monthly data is not updated yet. It simply doesn't reflect the actual monthly closing price.
So my question is: how can there be a difference between outright monthly data and resampled monthly data? Shouldn't they be equal?
-
RE: Resampling from daily to monthly - lagging issue
@run-out I am totally with you. What bothers me is the fact that the very first value is not available at line 4. At the close of 2019-12-30, I am able to resample the daily data into the monthly time frame.
Let me elaborate a little bit on this issue. I need to compute trading signals as of month end (based on the monthly data feed); execution will take place at the open of the first trading day of the next month (based on the daily data feed). Now, when computing an SMA on a monthly basis, I know the new value as soon as I have a closing price for the month (rather than getting that value one day later).
Let's say I want to generate a long signal if my monthly closing price is above the monthly SMA. If I were to use monthly data exclusively, backtrader would generate this long signal as of the close of the last trading day in the month (i.e. the closing price of the monthly candle). And, equally important, this closing price will be part of the latest SMA value.
However, this is not the case when resampling the data, as can be seen in the screenshot. As a result, on the last trading day, I am comparing the correct closing value with an outdated SMA value.
As a general rule, the monthly OHLC values on a daily basis should be the same for the whole month with the exception of the last trading day of the month. Here, we get the next/new OHLC value.
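To pin down the one-row shift I am describing, here is a pandas-only sketch on made-up closes. It does not call backtrader; `observed` merely mimics the behaviour I see (monthly close visible one daily bar late), while `expected` is what I would like to get.

```python
import numpy as np
import pandas as pd

# Synthetic daily closes around two month boundaries (values are made up).
idx = pd.to_datetime(
    [
        "2019-12-27", "2019-12-30",
        "2020-01-02", "2020-01-03", "2020-01-30", "2020-01-31",
        "2020-02-03",
    ]
)
daily = pd.Series([100.0, 101.0, 102.0, 103.0, 104.0, 105.0, 106.0], index=idx)

# A day is a month end if the next trading day falls into another month.
# The final row is left False: without future data we cannot tell.
key = (daily.index.year * 12 + daily.index.month).to_numpy()
is_month_end = np.append(key[:-1] != key[1:], False)

# Expected: the monthly close is known on the month-end day itself.
expected = daily.where(is_month_end).ffill()

# Observed (mimicked): the same value only shows up one daily bar later.
observed = expected.shift(1)

comparison = pd.DataFrame({"daily": daily, "expected": expected, "observed": observed})
```

In `comparison`, the `observed` column is just `expected` moved down by one row, which is the shift I marked in the screenshot.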
Here's backtrader's result with more lines:
All the columns that I marked yellow should be moved upward by one row. Here's what I expect:
Are you with me?
-
RE: Find last trading day of month
I came up with the following work-around:
- Extracting the trading days from my data feeds
- Identifying the month-end dates
- Creating a timer that throws an alarm via the `notify_timer` method in my strategy class.
Extracting the trading days from my data feeds:
I stored my pricing data in csv files. So I created a simple function that reads in the csv file and extracts the days.
However you get your trading dates, make sure to save them as a `pandas.DatetimeIndex`.

```python
import datetime
from pathlib import Path

import pandas as pd


def get_trading_days(
    csv_file: Path, fromdate: datetime.date, todate: datetime.date
) -> pd.DatetimeIndex:
    """
    Extracts trading days from a csv-file.

    Parameters
    ----------
    csv_file
        Path to csv-file.
    fromdate
        First selected trading date.
    todate
        Last selected trading date.

    Returns
    -------
    pandas.DatetimeIndex
        DatetimeIndex comprising the selected trading dates.
    """
    path = Path("./resources").joinpath(csv_file)
    df = pd.read_csv(path, index_col=0)
    df.index = pd.to_datetime(df.index)
    # Convert the bounds to Timestamps so plain `datetime.date` values
    # compare cleanly against the DatetimeIndex.
    start, end = pd.Timestamp(fromdate), pd.Timestamp(todate)
    return df.loc[(df.index >= start) & (df.index <= end)].index
```
Identifying month-end dates and creating a timer class:
I created a `PeriodEndTradingDays` class. This class provides a private method called `_extract_period_end_dates`. The extracted dates serve the timer.

```python
# Timer.py
import datetime
from typing import Literal

import pandas as pd


class PeriodEndTradingDays:
    """
    Creates a timer, which notifies if the current trading day is the last
    trading day of the period.

    Parameters
    ----------
    dt_index
        Datetime index comprising all trading days.
    frequency
        Determines the frequency used in order to extract the last trading
        days. Could be weekly ('W'), monthly ('M'), quarterly ('Q'), or
        yearly ('Y').
    """

    def __init__(
        self, dt_index: pd.DatetimeIndex, frequency: Literal["W", "M", "Q", "Y"]
    ):
        self.dt_index = dt_index
        self.freq = frequency
        self.period_end_dates = self._extract_period_end_dates()

    def __call__(self, d: datetime.date) -> bool:
        return d in self.period_end_dates

    def _extract_period_end_dates(self):
        """Extracts the last trading days for a given frequency."""
        freq = self.freq.lower()
        if freq == "w":
            # `DatetimeIndex.week` was removed in pandas 2.0; use the ISO week.
            period = self.dt_index.isocalendar().week.to_numpy(dtype="int64")
        elif freq == "m":
            period = self.dt_index.month.to_numpy()
        elif freq == "q":
            period = self.dt_index.quarter.to_numpy()
        else:
            period = self.dt_index.year.to_numpy()
        period = pd.Series(period)
        # A day ends its period when the next trading day belongs to a
        # different one. The final date in the index is always flagged,
        # because nothing follows it.
        mask = period != period.shift(-1)
        return [ts.date() for ts in self.dt_index[mask.to_numpy()]]
```
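As a cross-check for the extracted dates, the same result can probably be obtained with a `groupby` on the calendar period. This is only a sketch on a made-up index; `last_trading_days` is a hypothetical helper name and not part of my timer class.

```python
import pandas as pd


def last_trading_days(dt_index: pd.DatetimeIndex, freq: str = "M") -> pd.DatetimeIndex:
    """Return the last date present in `dt_index` for each period of `freq`."""
    days = dt_index.to_series()
    # Group the dates by calendar period and keep the latest one per group.
    return pd.DatetimeIndex(days.groupby(dt_index.to_period(freq)).max())


# Made-up trading days; the last month is incomplete on purpose.
trading_days = pd.DatetimeIndex(
    ["2021-05-25", "2021-05-26", "2021-06-01", "2021-06-30", "2021-07-01"]
)
month_ends = last_trading_days(trading_days, "M")
```

Note that the last date of the index (2021-07-01 here) is reported as a period end as well, simply because no later date is known.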
In my `main.py` script, I save the trading days for all my feeds in a `dict` called `trading_days`.

```python
portfolio = ["DAX Index", "SPX Index"]
fromdate = datetime.datetime(2019, 12, 20)
todate = datetime.datetime(2021, 7, 31)
trading_days = {
    ticker: DataFeeds.get_trading_days(
        f"{ticker}.csv", fromdate=fromdate, todate=todate
    )
    for ticker in portfolio
}
```
Later, when adding my feeds to `cerebro`, I also add a `timer`. The `allow` keyword is important here. It makes use of my `PeriodEndTradingDays` class. You can find more about manually created timers here: https://www.backtrader.com/docu/timers/timers/

```python
for ticker in portfolio:
    cerebro.adddata(data=data_feeds[ticker], name=ticker)
    cerebro.add_timer(
        when=bt.timer.SESSION_END,
        allow=PeriodEndTradingDays(dt_index=trading_days[ticker], frequency="M"),
        strats=True,
        timername=ticker,
    )
```
Finally, in my strategy class, I override the `notify_timer` method. The timer's output (`True`/`False`) is saved in a `dict` called `rebal`:

```python
def notify_timer(self, timer, when, *args, **kwargs):
    self.rebal[kwargs.get("timername")] = True
```
Lastly, in my `next` method, I can check whether the flag in `self.rebal` is `True` for a given feed. Don't forget to reset the flags to `False` at the end of the `next` method!
I am happy to take any suggestions for improvement!
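The set-and-reset logic can be sketched without backtrader. `RebalanceFlags` below is a hypothetical stand-in for my strategy class (not a backtrader class), and the sketch assumes that `notify_timer` is called before `next` for the same bar.

```python
class RebalanceFlags:
    """Stand-in for the flag handling inside the strategy."""

    def __init__(self, tickers):
        # One flag per data feed, keyed by the timer name.
        self.rebal = {ticker: False for ticker in tickers}
        self.log = []

    def notify_timer(self, timer, when, *args, **kwargs):
        # The timer carries the feed name in the `timername` keyword.
        self.rebal[kwargs.get("timername")] = True

    def next(self, current_date):
        for ticker, due in self.rebal.items():
            if due:
                # Placeholder for the actual rebalancing logic.
                self.log.append((current_date, ticker))
        # Reset all flags so the signal does not leak into the next bar.
        self.rebal = dict.fromkeys(self.rebal, False)


flags = RebalanceFlags(["DAX Index", "SPX Index"])
flags.notify_timer(None, None, timername="DAX Index")
flags.next("2020-01-31")  # acts on the DAX flag, then resets it
flags.next("2020-02-03")  # nothing happens, flags were reset
```

Without the reset at the end of `next`, the second call would log a rebalance again.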
-
Resampling from daily to monthly - lagging issue
I am facing a lagging issue when resampling my daily data feed to a monthly time frame.
I added a regular daily feed to my cerebro instance. I then used `cerebro.resampledata` to add a monthly data feed to it. I would like to use the monthly data feed to compute an SMA indicator. Trades will be executed on the daily data feed.
I realized that the resampled monthly data feed is only available on the first trading day of a new month. That looks a bit weird to me. I would expect it to be available at the close of any given month.
Take a look at the screenshot. I created a writer to analyze the pricing. My daily data starts at 2019-12-20. The first monthly closing data point can be seen in line 5. However, this is already 2020-01-02 for the daily data feed.
Am I missing something?
-
RE: Find last trading day of month
@run-out
This is a bit weird, because if I don't add the calendar, the script runs smoothly. I am actually loading a csv file that contains data which I downloaded from Bloomberg.
Here's an excerpt from the csv file:
```
date,open,high,low,close,volume
2001-01-03,62.64,64.5,61.8,64.5,1087185.0
2001-01-04,65.23,65.23,63.67,63.67,1160550.0
2001-01-05,64.18,64.7,63.48,63.79,1789392.0
2001-01-08,63.68,64.3,63.65,63.73,562735.0
2001-01-09,64.54,64.6,63.67,63.81,943175.0
2001-01-10,64.06,64.11,62.92,63.48,944343.0
2001-01-11,63.9,64.44,63.4,64.44,678861.0
2001-01-12,64.83,65.42,64.68,64.89,621234.0
```
```python
import datetime

import backtrader as bt


class BloombergCSV(bt.feeds.GenericCSVData):
    # Column positions follow the csv excerpt above; -1 marks absent fields.
    params = (
        ("fromdate", datetime.datetime(1990, 1, 1)),
        ("todate", datetime.date.today()),
        ("nullvalue", float("NaN")),
        ("dtformat", "%Y-%m-%d"),
        ("tmformat", "%H.%M.%S"),
        ("datetime", 0),
        ("open", 1),
        ("high", 2),
        ("low", 3),
        ("close", 4),
        ("volume", 5),
        ("time", -1),
        ("openinterest", -1),
    )
```
Excerpt from main code:
```python
if __name__ == "__main__":
    print("--- Initializing backtest...")
    cerebro = bt.Cerebro()

    # --- data feeds ---
    dax = BloombergCSV(
        dataname="./resources/DAXEX.csv", fromdate=datetime.datetime(2020, 1, 1)
    )
    cerebro.adddata(data=dax, name="DAX")
    cerebro.resampledata(dax, name="DAX_monthly", timeframe=bt.TimeFrame.Months)

    # --- strategy ---
    cerebro.addstrategy(SMAMomentum, period=10, print_log=True)

    # --- analyzer ---
    cerebro.addanalyzer(CashMarket, _name="cashmarket")

    # --- execution ---
    print(f"Starting Portfolio Value: {cerebro.broker.getvalue():,.2f}")
    results = cerebro.run()
    print(f"Final Portfolio Value: {cerebro.broker.getvalue():,.2f}")
```
-
Find last trading day of month
I wanted to make use of backtrader's trading calendar functionality; in particular, I wanted to use the `last_monthday` method (https://www.backtrader.com/docu/tradingcalendar/tradingcalendar/). So I pip installed `pandas_market_calendars`. However, I'm a bit puzzled as to how to correctly add a market calendar to a `cerebro` instance.
I need German trading days, so I figured that `EUREX` will be the calendar to choose.

```python
...
cerebro.addcalendar("EUREX")
results = cerebro.run()  # This line (107) throws an error
```
How am I supposed to correctly add a market calendar?
Here's the traceback...
```
  File "C:/Users/D292498/PycharmProjects/pybt/src/bt/main.py", line 107, in <module>
    results = cerebro.run()
  File "C:\Users\D292498\AppData\Local\conda\conda\envs\pybt\lib\site-packages\backtrader\cerebro.py", line 1127, in run
    runstrat = self.runstrategies(iterstrat)
  File "C:\Users\D292498\AppData\Local\conda\conda\envs\pybt\lib\site-packages\backtrader\cerebro.py", line 1298, in runstrategies
    self._runnext(runstrats)
  File "C:\Users\D292498\AppData\Local\conda\conda\envs\pybt\lib\site-packages\backtrader\cerebro.py", line 1542, in _runnext
    drets.append(d.next(ticks=False))
  File "C:\Users\D292498\AppData\Local\conda\conda\envs\pybt\lib\site-packages\backtrader\feed.py", line 407, in next
    ret = self.load()
  File "C:\Users\D292498\AppData\Local\conda\conda\envs\pybt\lib\site-packages\backtrader\feed.py", line 523, in load
    retff = ff(self, *fargs, **fkwargs)
  File "C:\Users\D292498\AppData\Local\conda\conda\envs\pybt\lib\site-packages\backtrader\resamplerfilter.py", line 518, in __call__
    onedge, docheckover = self._dataonedge(data)  # for subdays
  File "C:\Users\D292498\AppData\Local\conda\conda\envs\pybt\lib\site-packages\backtrader\resamplerfilter.py", line 322, in _dataonedge
    ret = data._calendar.last_monthday(data.datetime.date())
  File "C:\Users\D292498\AppData\Local\conda\conda\envs\pybt\lib\site-packages\backtrader\tradingcal.py", line 94, in last_monthday
    return day.month != self._nextday(day)[0].month
  File "C:\Users\D292498\AppData\Local\conda\conda\envs\pybt\lib\site-packages\backtrader\tradingcal.py", line 250, in _nextday
    i = self.dcache.searchsorted(day)
  File "C:\Users\D292498\AppData\Local\conda\conda\envs\pybt\lib\site-packages\pandas\core\indexes\extension.py", line 253, in searchsorted
    return self._data.searchsorted(value, side=side, sorter=sorter)
  File "C:\Users\D292498\AppData\Local\conda\conda\envs\pybt\lib\site-packages\pandas\core\arrays\_mixins.py", line 190, in searchsorted
    value = self._validate_searchsorted_value(value)
  File "C:\Users\D292498\AppData\Local\conda\conda\envs\pybt\lib\site-packages\pandas\core\arrays\datetimelike.py", line 636, in _validate_searchsorted_value
    return self._validate_scalar(value, allow_listlike=True, setitem=False)
  File "C:\Users\D292498\AppData\Local\conda\conda\envs\pybt\lib\site-packages\pandas\core\arrays\datetimelike.py", line 564, in _validate_scalar
    raise TypeError(msg)
TypeError: value should be a 'Timestamp', 'NaT', or array of those. Got 'date' instead.
```