Struggling to implement custom Pandas data feed



  • Hi there.

    I would like to backtest and refine a strategy for which I currently have all the values in a Pandas data frame. The data frame is indexed by datetime.

    From reading the docs and other Pandas-related posts online, I've made a class, CustomDataLoader, that extends btfeeds.PandasData. It handles the new lines from the data frame (I'll post the code at the end).

    From what I can figure, everything seems to be ok, yet when I run the simple sample backtest, I hit a long error string, primarily:
    "-[NSApplication _setup:]: unrecognized selector sent to instance 0x7fc1d465f3d0"

    If someone could help me get the backtester to read my custom Pandas dataframe, I would be very appreciative!

    Here's the code:

    Simple Backtester:

    from __future__ import (absolute_import, division, print_function,
                            unicode_literals)
    
    import argparse
    
    import backtrader as bt
    
    import make_df
    
    
    def runstrat():
        args = parse_args()
    
        # Create a cerebro entity
        cerebro = bt.Cerebro()
    
        # Add a strategy
        cerebro.addstrategy(bt.Strategy)
    
        dataframe = make_df.get_df()
    
        if not args.noprint:
            print('--------------------------------------------------')
            print(dataframe)
            print('--------------------------------------------------')
    
        # Pass it to the backtrader datafeed and add it to the cerebro
        data = bt.feeds.PandasData(dataname=dataframe, nocase=False)
    
        cerebro.adddata(data)
    
        # Run over everything
        cerebro.run()
    
        # Plot the result
        cerebro.plot()
    
    
    def parse_args():
        parser = argparse.ArgumentParser(
            description='Pandas test script')
    
        parser.add_argument('--noheaders', action='store_true', default=False,
                            required=False,
                            help='Do not use header rows')
    
        parser.add_argument('--noprint', action='store_true', default=False,
                            help='Do not print the dataframe')
    
        return parser.parse_args()
    
    
    if __name__ == '__main__':
        runstrat()
    

    Custom Pandas Data Class:

    import backtrader.feeds as btfeeds
    
    
    class CustomDataLoader(btfeeds.PandasData):
        lines = ('Open', 'High', 'Low', 'Close', 'TSL_2', 'TSL_2_L/S', 'Fast_ATR', 'Slow_ATR', 'Impulse')
        params = (
            ('datetime', None),
            ('open', None),
            ('high', None),
            ('low', None),
            ('close', None),
            ('volume', None),
            ('openinterest', None),
            ('Open', 1),
            ('High', 2),
            ('Low', 3),
            ('Close', 4),
            ('TSL_2', 5),
            ('TSL_2_L/S', 6),
            ('Fast_ATR', 7),
            ('Slow_ATR', 8),
            ('Impulse', 9)
        )
        datafields = (['Open', 'High', 'Low', 'Close', 'TSL_2', 'TSL_2_L/S', 'Fast_ATR', 'Slow_ATR', 'Impulse'])
    


  • I should add that the Pandas data frame is being printed out, as per this line:

        if not args.noprint:
            print('--------------------------------------------------')
            print(dataframe)
            print('--------------------------------------------------')
    

    So, it's just the plotting that is not working - with this error:

    "-[NSApplication _setup:]: unrecognized selector sent to instance 0x7fc1d465f3d0"
    

    I have now tried another implementation of the Custom Pandas Data Class, code below, but I am still getting the error shown above.

    import backtrader.feeds as btfeeds
    
    
    class CustomDataLoader(btfeeds.PandasData):
        lines = ('TSL_2', 'TSL_2_L/S', 'Fast_ATR', 'Slow_ATR', 'Impulse')
        params = (
            ('datetime', None),
            ('open', 'Open'),
            ('high', 'High'),
            ('low', 'Low'),
            ('close', 'Close'),
            ('volume', None),
            ('openinterest', None),
            ('TSL_2', 5),
            ('TSL_2_L/S', 6),
            ('Fast_ATR', 7),
            ('Slow_ATR', 8),
            ('Impulse', 9)
        )
        datafields = btfeeds.PandasData.datafields + (['TSL_2', 'TSL_2_L/S', 'Fast_ATR', 'Slow_ATR', 'Impulse'])
    

    Still confused!!



  • @pipeline-punch said in Struggling to implement custom Pandas data feed:

    -[NSApplication _setup:]: unrecognized selector sent to instance

    Did you Google this first? It looks like a matplotlib crash.



  • Hi, I did Google it and did see the matplotlib-specific error. However, I thought backtrader used bokeh for graphing, not matplotlib? Perhaps I am way off, forgive my ignorance.

    Under that assumption, are you saying that nothing looks wrong with the code in the second post?

    Thanks for getting back so quickly :)



  • OK, so I was wrong: backtrader does use matplotlib for plotting. However, I fixed that error by adding this to my code:

    data = bt.feeds.PandasData(dataname=dataframe, volume=False, openinterest=False)
    

    That gets rid of that ugly long error, but now I am getting this error:

    ValueError: Location based indexing can only have [integer, integer slice (START point is INCLUDED, END point is EXCLUDED), listlike of integers, boolean array] types
    

    Again, I'm struggling with this error. Google points (logically) to pandas indexing, but this is weird, as I'm not indexing anything.

    Can anyone point me in the right direction?
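
For what it's worth, that message is what pandas raises when something other than an integer position reaches `.iloc`, for example a column label. The sketch below reproduces it on a made-up two-row frame. Whether a label is actually reaching `.iloc` inside the data feed here is only a guess; nothing posted in this thread confirms it.

```python
import pandas as pd

# Tiny stand-in frame; the column names mirror the thread, the values are made up.
df = pd.DataFrame(
    {"Open": [0.81, 0.80], "Close": [0.82, 0.81]},
    index=pd.to_datetime(["2015-01-01", "2015-01-04"]),
)

# Positional access works: row 0, column 1 (the "Close" column).
by_position = df.iloc[0, 1]

# Handing .iloc a label instead of a position raises the ValueError.
try:
    df.iloc[0, "Close"]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Translating the label into a position first avoids the problem.
close_pos = df.columns.get_loc("Close")
by_lookup = df.iloc[0, close_pos]

print(error_message)
```

So a useful first check is whether every column mapping ends up as an integer position (or is resolved from a name that really exists in the frame) before the data is read.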



  • Without a reproducible sample, it's hard to help. I suggest you upload runnable code and enough data to demonstrate the issue.



  • Looks like you've created your new CustomDataLoader data class, but you are still using bt's built-in Pandas data class:

    data = bt.feeds.PandasData(...)
    

    You may want to try to use your new data feed.



  • @ab_trader said in Struggling to implement custom Pandas data feed:

    Looks like you've created your new CustomDataLoader data class, but still use bt built-in Pandas data class:

    Gah! Thank you so much for that!

    Big thanks to anyone trying to help so far, it really is much appreciated as I am somewhat of a new coder, with a discretionary strategy that I have been using for years that I would like to turn into an algo. I have taken some python classes at Uni, but alas, I am still learning.

    So, even after pointing to the newly created class as you suggested, I still hit an error.

    I have greatly simplified the code into a single file so that someone can hopefully help me get to the bottom of this. Here it is:

    from __future__ import (absolute_import, division, print_function,
                            unicode_literals)
    
    import argparse
    
    import backtrader as bt
    import backtrader.feeds as btfeeds
    
    from make_df import get_df
    
    
    class StratData(btfeeds.DataBase):
        lines = ('datetime', 'Open', 'High', 'Low', 'Close', 'TSL_2', 'TSL_2_L/S', 'Fast_ATR', 'Slow_ATR', 'Impulse')
        params = (
            ('datetime', None),
            ('open', 'Open'),
            ('high', 'High'),
            ('low', 'Low'),
            ('close', 'Close'),
            ('volume', None),
            ('openinterest', None),
            ('TSL_2', 4),
            ('TSL_2_L/S', 5),
            ('Fast_ATR', 6),
            ('Slow_ATR', 7),
            ('Impulse', 8)
        )
    
        # if False:
        #     # No longer needed with version 1.9.62.122
        #     datafields = btfeeds.PandasData.datafields + (
        #         ['optix_close', 'optix_pess', 'optix_opt'])
    
    
    class StrategyPrint(bt.Strategy):
    
        def next(self):
            print('%03d %f %f, %f' % (
                len(self),
                self.data.l.Close[0],
                self.data.l.TSL_2[0],
                self.data.l.Impulse[0],))
    
    
    def runstrat():
        args = parse_args()
    
        # Create a cerebro entity
        cerebro = bt.Cerebro(stdstats=False)
    
        # Add a strategy
        cerebro.addstrategy(StrategyPrint)
    
        dataframe = get_df()
    
        if not args.noprint:
            print('--------------------------------------------------')
            print(dataframe)
            print('--------------------------------------------------')
    
        # Pass it to the backtrader datafeed and add it to the cerebro
        data = StratData(dataname=dataframe)
    
        cerebro.adddata(data)
    
        # Run over everything
        cerebro.run()
    
        # Plot the result
        if not args.noplot:
            cerebro.plot(style='bar')
    
    
    def parse_args():
        parser = argparse.ArgumentParser(
            description='Pandas test script')
    
        parser.add_argument('--noheaders', action='store_true', default=False,
                            required=False,
                            help='Do not use header rows')
    
        parser.add_argument('--noprint', action='store_true', default=False,
                            help='Do not print the dataframe')
    
        parser.add_argument('--noplot', action='store_true', default=False,
                            help='Do not plot the chart')
    
        return parser.parse_args()
    
    
    if __name__ == '__main__':
        runstrat()
    

    Running this prints my Pandas df, here's a head():

    --------------------------------------------------
                                  Open     High  ...  Slow_ATR  Impulse
    2015-01-01 22:00:00+00:00  0.81827  0.81856  ...  0.006917    green
    2015-01-04 22:00:00+00:00  0.80734  0.81074  ...  0.006949    green
    2015-01-05 22:00:00+00:00  0.80855  0.81574  ...  0.007015    green
    2015-01-06 22:00:00+00:00  0.80828  0.80902  ...  0.006871     grey
    2015-01-07 22:00:00+00:00  0.80756  0.81308  ...  0.006773     grey
    2015-01-08 22:00:00+00:00  0.81198  0.82089  ...  0.007171    green
    

    But instead of plotting/logging, it ultimately results in this error:

    Traceback (most recent call last):
      File "/Users/kayne/Desktop/PyCharm Algos/pandas_test.py", line 92, in <module>
        runstrat()
      File "/Users/kayne/Desktop/PyCharm Algos/pandas_test.py", line 71, in runstrat
        cerebro.plot(style='bar')
      File "/anaconda3/lib/python3.7/site-packages/backtrader/cerebro.py", line 996, in plot
        plotter.show()
      File "/anaconda3/lib/python3.7/site-packages/backtrader/plot/plot.py", line 795, in show
        self.mpyplot.show()
    AttributeError: 'Plot_OldSync' object has no attribute 'mpyplot'
    

    I apologise if this is really basic stuff. I have looked at every backtrader Pandas-related topic, the docs, GitHub, etc., but still can't figure this out. I know it is most likely very easy!

    Thanks in advance!
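
One thing visible in the printed frame above: `Impulse` holds strings (`green`, `grey`), while the strategy prints it with `%f`, and backtrader lines store floating-point values. A minimal sketch of converting the column to numbers with pandas before building the feed follows. The particular codes (`green` to 1.0, `grey` to 0.0) are arbitrary choices for illustration, not something from the original data.

```python
import pandas as pd

# Stand-in rows taken from the printed head(); only two columns are kept.
df = pd.DataFrame(
    {
        "Open": [0.81827, 0.80734, 0.80828],
        "Impulse": ["green", "green", "grey"],
    },
    index=pd.to_datetime(
        ["2015-01-01 22:00", "2015-01-04 22:00", "2015-01-06 22:00"], utc=True
    ),
)

# Hypothetical encoding: the numeric codes are arbitrary illustration values.
impulse_codes = {"green": 1.0, "grey": 0.0}

df["Impulse"] = df["Impulse"].map(impulse_codes)

# Any colour missing from the mapping becomes NaN, so check before backtesting.
unmapped = int(df["Impulse"].isna().sum())

print(df)
print("unmapped rows:", unmapped)
```

Inside the strategy, the codes can then be compared numerically, for example treating 1.0 as "green".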



  • @pipeline-punch said in Struggling to implement custom Pandas data feed:

    Data feed class

    class StratData(btfeeds.PandasData):
    
        lines = ('TSL_2', 'TSL_2_L/S', 'Fast_ATR', 'Slow_ATR', 'Impulse')
    
        params = (
            ('open', 'Open'),
            ('high', 'High'),
            ('low', 'Low'),
            ('close', 'Close'),
            ('volume', None),
            ('openinterest', None),
            ('TSL_2', -1),
            ('TSL_2_L/S', -1),
            ('Fast_ATR', -1),
            ('Slow_ATR', -1),
            ('Impulse', -1)
        )
    

    plot call

    cerebro.plot(style='bar', volume=False)
    

    Print in the next()

            print('%03d %f %f, %f' % (len(self), self.data.l.close[0], self.data.l.TSL_2[0], self.data.l.Impulse[0],))
    

    Tested on the following CSV file:

    ,Open,High,Low,Close,volume,openinterest,TSL_2,TSL_2_L/S,Fast_ATR,Slow_ATR,Impulse
    2019-01-01,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-02,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-03,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-04,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-05,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-06,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-07,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-08,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-09,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-10,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-11,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-12,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-13,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-14,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-15,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-16,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-17,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
    2019-01-18,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
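
To reproduce the test, the CSV above has to become a DataFrame with a datetime index, because the feed class leaves `datetime` at its default and therefore reads timestamps from the index. A minimal sketch with pandas, using the first three rows inline instead of a file (the inline string is only for illustration):

```python
import io

import pandas as pd

# First rows of the test file from the post, inlined so the sketch is self-contained.
csv_text = """,Open,High,Low,Close,volume,openinterest,TSL_2,TSL_2_L/S,Fast_ATR,Slow_ATR,Impulse
2019-01-01,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
2019-01-02,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
2019-01-03,100.0,120.0,95.0,110.0,10000.0,5000.0,50.0,60.0,2.0,5.0,10.0
"""

# The unnamed first column holds the dates: use it as the index and parse it.
df = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

print(df.index.dtype)
print(list(df.columns))
```

The resulting `df` is what would be passed as `dataname` to the feed class.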
    


  • ab_trader, you are awesome! Thank you so, so much for helping me get this up and running. Now that I know the structure of how to do this, I feel confident I can now pull in any custom Pandas data frame I would like! Now the fun "work" of statistical analysis and optimisation can begin!

    Now that I have this simple measure working, perhaps you could help with a future issue. If the custom Pandas DataFrame contained the same columns as we have been working with, but multiple "assets" instead of just one (as in the example we've been working with) - how would everything work? Similar to how we've done it above, or something a bit different to deal with the dual-indexing?

    Much appreciated :)



  • @pipeline-punch said in Struggling to implement custom Pandas data feed:

    If the custom Pandas DataFrame contained the same columns as we have been working with, but multiple "assets" instead of just one (as in the example we've been working with) - how would everything work? Similar to how we've done it above, or something a bit different to deal with the dual-indexing?

    bt's PandasData requires a separate DataFrame for each stock ticker. But since

    Now that I know the structure of how to do this, I feel confident I can now pull in any custom Pandas data frame I would like!

    then you can write your own Pandas data feed that takes care of multiple assets in a single DataFrame.

    It depends on the amount of time you have and your Python skills.
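
A simpler route than a custom multi-asset feed is to split the combined frame into one DataFrame per ticker and add each as its own feed. A minimal sketch of the splitting step with pandas, assuming a two-level index of (`asset`, `datetime`); the level names, tickers, and values are made up for illustration:

```python
import pandas as pd

# Hypothetical combined frame: two tickers, two dates each.
dates = pd.to_datetime(["2019-01-01", "2019-01-02"])
index = pd.MultiIndex.from_product(
    [["AAA", "BBB"], dates], names=["asset", "datetime"]
)
combined = pd.DataFrame(
    {"Close": [110.0, 111.0, 20.0, 21.0], "TSL_2": [50.0, 51.0, 9.0, 9.5]},
    index=index,
)

# One frame per ticker, each indexed by datetime only.
per_asset = {
    ticker: frame.droplevel("asset")
    for ticker, frame in combined.groupby(level="asset")
}

for ticker, frame in per_asset.items():
    print(ticker, frame.shape)
    # Each frame could then be handed to the feed class from this thread, e.g.
    # cerebro.adddata(StratData(dataname=frame), name=ticker)
```

Each per-ticker frame then has the same shape as the single-asset frame used earlier in the thread, so the same feed class applies unchanged.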


 
