How to make pandasfeed and yahoofeed work together
-
I noticed that the daily data feed read from Yahoo actually stamps every bar's datetime index with the time 23:59:59... I know this is designed that way for a lot of good reasons. However, it is inconsistent with PandasFeed, where the datetime index is normally read in as 00:00:00 when initialized. An example can be found below:
With a pandas dataframe as below: (where I have manually changed the datetime index from 00:00:00 to something else)
    Index(['Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume'], dtype='object')
    2002-01-02 23:59:59.999999  115.110001  115.750000  113.809998  115.529999  82.820213  18651900
    2002-01-03 23:59:59.999999  115.650002  116.949997  115.540001  116.839996  83.759323  15743000
    2002-01-04 23:59:59.999999  117.169998  117.980003  116.550003  117.620003  84.318451  20140700
And tested a simple strategy
    import backtrader as bt
    from datetime import date

    class TestStrategy(bt.Strategy):
        def log(self, txt, dt=None):
            dt = dt or self.datas[0].datetime.date(0)
            print('%s, %s' % (dt, txt))

        def next(self):
            # datas[0] is the Yahoo feed, datas[-1] is the pandas feed
            self.log("From YahooFeed: {}".format(self.datas[0].datetime.time()))
            self.log("From pandasFeed: {}".format(self.datas[-1].datetime.time()))

    cerebro = bt.Cerebro()
    cerebro.adddata(bt.feeds.YahooFinanceData(dataname='SPY', fromdate=date(2002, 1, 1), todate=date(2003, 1, 1)))
    # pd_df is the dataframe shown above
    cerebro.adddata(bt.feeds.PandasData(dataname=pd_df, timeframe=bt.TimeFrame.Minutes))
    cerebro.addstrategy(TestStrategy)
    # Run over everything
    strats = cerebro.run()
The result is as below:
    2002-01-02, From YahooFeed: 23:59:59.999989
    2002-01-02, From pandasFeed: 00:00:00
    2002-01-03, From YahooFeed: 23:59:59.999989
    2002-01-03, From pandasFeed: 00:00:00
    2002-01-03, From YahooFeed: 23:59:59.999989
    2002-01-03, From pandasFeed: 00:00:00
- I know Minutes might not be the best timeframe, but it doesn't change the pandas feed's behavior.
- As the datetime is different, the next function is actually called twice for the same date, which is not what I want.
Is there a good practice to make them work together? Any parameters I missed in those feeders? Or should I change the datetime index in pandas before passing it to bt?
Currently I am extending the pandasfeed and overriding its load function to force every line to also add the additional amount of time.
-
@scott-lee said in How to make pandasfeed and yahoofeed work together:
Any parameters I missed in those feeders?
Why don't you simply set the sessionend parameter of the Yahoo data feed to have a time of 00:00:00?
-
Hi, thanks for the reply. Yes, I was also considering that, but I tried to avoid that based on people's discussions in https://community.backtrader.com/topic/1043/dates-and-time-loaded-from-csv-files-not-precise/5
-
There is a huge difference between trying to signal the end of the day (which is not something Python has a notion of) and setting the time to a precise value which is the beginning of the day, i.e. 00:00:00.
The end-of-day notion (as a default) is needed to properly synchronize 1-day feeds with intraday feeds. But it plays no role if you are trying to synchronize two feeds which share the 1-day timeframe.
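The synchronization point can be illustrated with a toy model. This is only a sketch that assumes bars are delivered in plain timestamp order; it is not backtrader's actual engine, and delivery_order is a made-up helper. It shows that a daily bar stamped at the start of the day would arrive before that day's intraday bars, while an end-of-day stamp puts it after them.

```python
from datetime import date, datetime, time

day = date(2002, 1, 2)
# Three hypothetical intraday bars on the same date
intraday = [datetime.combine(day, time(h, 0)) for h in (10, 12, 15)]

def delivery_order(daily_stamp):
    """Sort one daily bar together with the intraday bars by timestamp."""
    events = [(ts, "intraday") for ts in intraday] + [(daily_stamp, "daily")]
    return [label for _, label in sorted(events)]

# Daily bar stamped at the beginning of the day: delivered first
start_of_day = delivery_order(datetime.combine(day, time(0, 0, 0)))
# Daily bar stamped at the end of the day: delivered last
end_of_day = delivery_order(datetime.combine(day, time(23, 59, 59)))

print(start_of_day)
print(end_of_day)
```

With two daily feeds there are no intraday bars to order against, so the only requirement is that both feeds carry the same stamp.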
-
I see your point. I am going to force Yahoo to 00:00:00 when it's a pure daily strategy testing task.
Then, for a mixed intraday task, is there a proper way to read in the pandas feed's datetime as something other than 00:00:00? In the example above, I manually changed the pandas dataframe's index to timestamps with detailed %M and %S, but the datetime is still read as 00:00:00.
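For the pure daily case, forcing the Yahoo feed to 00:00:00 with the sessionend parameter suggested earlier in the thread might look roughly like the sketch below. The names YAHOO_KWARGS and build_cerebro are made up for illustration, pd_df stands for the dataframe from the first post, and this has not been run against a live Yahoo download.

```python
import datetime

# Settings for the Yahoo feed in a pure daily test. sessionend is the
# parameter suggested earlier in the thread; midnight matches the
# 00:00:00 stamps that PandasData reported in the first post.
YAHOO_KWARGS = dict(
    dataname='SPY',
    fromdate=datetime.date(2002, 1, 1),
    todate=datetime.date(2003, 1, 1),
    sessionend=datetime.time(0, 0, 0),
)

def build_cerebro(pd_df):
    """Hypothetical helper: add both daily feeds to one Cerebro instance."""
    # Imported here so the settings above can be inspected without backtrader
    import backtrader as bt

    cerebro = bt.Cerebro()
    cerebro.adddata(bt.feeds.YahooFinanceData(**YAHOO_KWARGS))
    cerebro.adddata(bt.feeds.PandasData(dataname=pd_df))
    return cerebro
```

Calling build_cerebro(pd_df) and then adding the strategy and calling run() would follow the same steps as in the original example.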
-
@scott-lee said in How to make pandasfeed and yahoofeed work together:
Then, for a mixed intraday task, is there a proper way to read in the pandas feed's datetime as something other than 00:00:00?
The software cannot infer or guess that the timestamp (in this case only the time-of-day part) carried by the feed is actually something you don't like.
A Yahoo data feed is only a daily data feed, so the assumption can be made that the prices given by a daily bar have to rest at the end of the session (if no end of session is specified, then at the end of the day).
But if the source already contains a "time of day", why remove it?
The 2 options:
- You change the timestamp directly in the dataframe
- You add a filter, Docs - Filters, and change the timestamp of each bar already inside the engine
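The first option (changing the timestamp directly in the dataframe) could be sketched as follows. set_time_of_day is a hypothetical helper, the toy frame reuses two Close values from the first post, and the 16:00 session close is only an assumed example value.

```python
import pandas as pd

def set_time_of_day(df, hours=0, minutes=0, seconds=0):
    """Return a copy of df whose index keeps each date but uses one fixed time of day."""
    out = df.copy()
    dates = pd.to_datetime(out.index).normalize()  # drop the existing time part
    out.index = dates + pd.Timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return out

# Toy frame with two of the Close values shown earlier in the thread
pd_df = pd.DataFrame(
    {"Close": [115.529999, 116.839996]},
    index=pd.to_datetime(["2002-01-02 23:59:59.999999",
                          "2002-01-03 23:59:59.999999"]),
)

daily = set_time_of_day(pd_df)              # 00:00:00, for a pure daily test
closing = set_time_of_day(pd_df, hours=16)  # assumed 16:00 session close

print(daily.index)
print(closing.index)
```

The adjusted frame would then be passed to PandasData as dataname in place of the original one.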
-
Ah, I see my problem now. Originally I was wondering why PandasData read my customized datetime index as 00:00:00. It was caused by the way I changed the datetime:

    data.index = pd.to_datetime(data.index) + pd.Timedelta("1 day") - pd.Timedelta("1 us")

As the time is too close to 00:00:00, something might happen in pandas, python or backtrader, and I then saw 00:00:00 in the logs while I was expecting 23:59:59.9999 etc.
After changing it to another datetime, the index was read properly as expected.
Thanks for your help. :) I don't have any further questions now.
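One plausible explanation for the "something might happen" part, which nobody in this thread confirmed, is floating-point rounding. The sketch below assumes that datetimes are kept internally as floating-point day numbers (day ordinal plus fraction of the day). to_float_days and from_float_days are illustrative helpers written for this post, not backtrader functions.

```python
from datetime import datetime, timedelta

ONE_DAY = timedelta(days=1)
MICROS_PER_DAY = 86_400_000_000

def to_float_days(dt):
    """Encode a datetime as day ordinal plus fraction of the day (a float64)."""
    midnight = datetime(dt.year, dt.month, dt.day)
    return dt.toordinal() + (dt - midnight) / ONE_DAY

def from_float_days(x):
    """Decode the float back into a datetime, rounded to whole microseconds."""
    whole = int(x)
    micros = round((x - whole) * MICROS_PER_DAY)
    return datetime.fromordinal(whole) + timedelta(microseconds=micros)

one_us_before = datetime(2002, 1, 2, 23, 59, 59, 999999)
eleven_us_before = datetime(2002, 1, 2, 23, 59, 59, 999989)

# For day numbers around 730,000, neighbouring float64 values are roughly
# 10 microseconds apart, so a stamp 1 microsecond before midnight cannot be
# stored exactly and rounds to the following midnight.
rounded = from_float_days(to_float_days(one_us_before))
kept = from_float_days(to_float_days(eleven_us_before))

print(rounded)
print(kept)
```

Under that assumption, a stamp a little further from midnight stays on its own date, which would match the observation that other datetimes were read as expected.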