How to make pandasfeed and yahoofeed work together



  • I noticed that the daily data feed read from Yahoo stamps every bar with a time of 23:59:59.... I know this is designed that way for good reasons. However, it is inconsistent with PandasFeed, where the datetime index is normally read in as 00:00:00 when initialized.

    An example can be found below.

    With a pandas dataframe as below (where I have manually changed the datetime index from 00:00:00 to something else):

    Index(['Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume'], dtype='object')
    2002-01-02 23:59:59.999999	115.110001	115.750000	113.809998	115.529999	82.820213	18651900
    2002-01-03 23:59:59.999999	115.650002	116.949997	115.540001	116.839996	83.759323	15743000
    2002-01-04 23:59:59.999999	117.169998	117.980003	116.550003	117.620003	84.318451	20140700
    

    And I tested a simple strategy:

    from datetime import date

    import backtrader as bt


    class TestStrategy(bt.Strategy):

        def log(self, txt, dt=None):
            # Prefix each message with the date of the current bar of the first feed
            dt = dt or self.datas[0].datetime.date(0)
            print('%s, %s' % (dt, txt))

        def next(self):
            # datas[0] is the Yahoo feed, datas[-1] is the pandas feed (added last)
            self.log("From YahooFeed: {}".format(self.datas[0].datetime.time()))
            self.log("From pandasFeed: {}".format(self.datas[-1].datetime.time()))


    cerebro = bt.Cerebro()

    cerebro.adddata(bt.feeds.YahooFinanceData(dataname='SPY',
                                              fromdate=date(2002, 1, 1),
                                              todate=date(2003, 1, 1)))
    # pd_df is the dataframe shown above
    cerebro.adddata(bt.feeds.PandasData(dataname=pd_df, timeframe=bt.TimeFrame.Minutes))

    cerebro.addstrategy(TestStrategy)
    # Run over everything
    strats = cerebro.run()
    

    The result is as below:

    2002-01-02, From YahooFeed: 23:59:59.999989
    2002-01-02, From pandasFeed: 00:00:00
    2002-01-03, From YahooFeed: 23:59:59.999989
    2002-01-03, From pandasFeed: 00:00:00
    2002-01-03, From YahooFeed: 23:59:59.999989
    2002-01-03, From pandasFeed: 00:00:00
    
    1. I know Minutes might not be the best timeframe, but it doesn't change the pandas feed's behavior.
    2. As the datetimes differ, the next function is actually called twice per day, which is not what I want.

    I am asking whether there is a good practice to make them work together. Any parameters I missed in those feeders? Or should I change the datetime index in pandas before passing it to bt?

    Currently I am extending the pandasfeed and overriding its load function so that every line also gets the additional amount of time added.


  • administrators

    @scott-lee said in How to make pandasfeed and yahoofeed work together:

    Any parameters I missed in those feeders?

    Why don't you simply set the sessionend parameter of the Yahoo data feed to have a time of 00:00:00?
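
As a rough illustration of the suggestion above, the sketch below uses only the standard library to show how the time attached to a daily bar decides whether two daily feeds line up. The helper `stamp_daily_bar` is hypothetical and is not backtrader code; in backtrader itself the suggestion amounts to passing something like `sessionend=datetime.time(0, 0)` when creating the Yahoo feed.

```python
from datetime import date, datetime, time


def stamp_daily_bar(day, sessionend=time(23, 59, 59, 999999)):
    """Hypothetical helper: attach a session-end time to a daily bar's date."""
    return datetime.combine(day, sessionend)


# Timestamp a pandas feed would carry when its index sits at midnight
pandas_stamp = datetime(2002, 1, 2)

# Default end-of-day stamp versus a stamp forced to 00:00:00
default_stamp = stamp_daily_bar(date(2002, 1, 2))
aligned_stamp = stamp_daily_bar(date(2002, 1, 2), sessionend=time(0, 0))

print(default_stamp == pandas_stamp)  # False: the two bars are seen at different times
print(aligned_stamp == pandas_stamp)  # True: both bars share one timestamp
```

When the two timestamps are equal, both bars belong to the same point in time, so a strategy would be expected to see them together rather than in two separate calls.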



  • Hi, thanks for the reply. Yes, I was also considering that, but I tried to avoid it based on the discussion in https://community.backtrader.com/topic/1043/dates-and-time-loaded-from-csv-files-not-precise/5


  • administrators

    There is a huge difference between trying to signal the end of the day (which is not something Python has a notion of) and setting the time to a precise time which is the beginning of the day, i.e.: 00:00:00.

    The end-of-day notion (as a default) is needed to properly synchronize 1-day feeds with intraday feeds. But it plays no role if you are trying to synchronize two feeds which share the 1-day timeframe.
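
To make the ordering argument concrete, here is a minimal standard-library sketch. It is not backtrader's synchronization logic; it only sorts hypothetical bars by timestamp to show where a daily bar lands relative to intraday bars of the same day, depending on the time it is stamped with.

```python
from datetime import date, datetime, time

day = date(2002, 1, 2)

# Two hypothetical intraday bars for the same trading day
intraday = [
    ("intraday", datetime.combine(day, time(9, 30))),
    ("intraday", datetime.combine(day, time(16, 0))),
]


def delivery_order(daily_time):
    """Return the labels of all bars sorted by timestamp."""
    daily = ("daily", datetime.combine(day, daily_time))
    events = intraday + [daily]
    return [label for label, _ in sorted(events, key=lambda e: e[1])]


start_of_day = delivery_order(time(0, 0))
end_of_day = delivery_order(time(23, 59, 59, 999999))

print(start_of_day)  # ['daily', 'intraday', 'intraday']
print(end_of_day)    # ['intraday', 'intraday', 'daily']
```

Stamped at 00:00:00, the daily bar (which already contains the day's close) would come before the intraday bars it summarizes; stamped at the end of the day, it comes after them.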



  • I see your point. I am going to force Yahoo to 00:00:00 when it's a pure daily strategy testing task.

    Then for a mixed intraday task, is there a proper way to read in the pandas feed's datetime as something other than 00:00:00? In the example above, I manually changed the pandas dataframe's index to timestamps with explicit minutes and seconds, but the datetime is still read as 00:00:00.


  • administrators

    @scott-lee said in How to make pandasfeed and yahoofeed work together:

    Then for a mixed intraday task, is there a proper way to read in the pandas feed's datetime as something other than 00:00:00?

    The software cannot infer or guess that the timestamp carried by the feed (in this case only the time-of-day part) is actually something you don't like.

    A Yahoo data feed is only a daily data feed, so the implication can be made that the prices given by a daily bar have to rest at the end of the session (if no end of session is specified, then at the end of the day).

    But if the source already contains a "time of day", why remove it?

    The 2 options:

    • You change the timestamp directly in the dataframe
    • You add a filter (see Docs - Filters) and change the timestamp of each bar inside the engine
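
The first option can be sketched with pandas alone. This is a minimal example under stated assumptions: the helper name `shift_to_session_end`, the chosen 23:59:59 session end, and the sample frame are all made up for illustration and are not part of backtrader or of the original post.

```python
import pandas as pd


def shift_to_session_end(df, hours=23, minutes=59, seconds=59):
    """Hypothetical helper: return a copy whose index carries a fixed time of day."""
    out = df.copy()
    # Drop any existing time-of-day, then add the desired offset from midnight
    midnight = pd.to_datetime(out.index).normalize()
    out.index = midnight + pd.Timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return out


# Small sample frame with a date-only index (values taken from the post above)
daily = pd.DataFrame(
    {"Close": [115.529999, 116.839996, 117.620003]},
    index=["2002-01-02", "2002-01-03", "2002-01-04"],
)

shifted = shift_to_session_end(daily)
print(shifted.index[0])  # 2002-01-02 23:59:59
```

The shifted frame can then be handed to the pandas feed in place of the original one; the original frame is left untouched because the helper works on a copy.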


  • Ah, I see my problem now. Originally I was wondering why PandasData read my customized datetime index as 00:00:00. It was caused by the way I changed the datetime:

    data.index = pd.to_datetime(data.index) + pd.Timedelta("1 day") - pd.Timedelta("1 us")
    

    As the resulting time is so close to 00:00:00, something might happen in pandas, Python or backtrader, and I saw 00:00:00 in the logs while I was expecting 23:59:59.999999.

    After changing it to another time of day, the index was read properly as expected.

    Thanks for your help. :) I don't have any further questions now.

