Error: When strings show up where they aren't wanted...



  • Hi, thanks in advance for anyone's assistance here. I'm trying to backtest using my own data stored in a local CSV file. It's a relatively straightforward project (it's a riff on the Quickstart guide model), but I have hit a wall that has left me spinning my wheels for too long. (There doesn't appear to be a current thread that addresses this specific error message.) My CSV has 6 columns of data: one datetime time series and 5 custom indicators (no OHLC data). When I load my data with the parameters adjusted for my custom columns and run it, I get a "TypeError: must be real number, not str" raised by "self.array[self.idx + ago] = value" on line 222 of linebuffer.py. I'm pretty sure my data is loading correctly (printing the head shows everything is normal), but clearly I'm producing strings where only numbers are accepted. Can someone please advise if they know where this newbie got trapped? Many thanks!
    Relevant code:

    
    lines = (
            'Date',
            'Abra1',
            'Abra2',
            'Abra3',
            'Abra4',
            'Abra5',
    )
    
    params = (
    
            ('dtformat', '%m/%d/%Y'),
    
            ('Date', 0),
            ('time', -1),
            ('open', -1),
            ('high', -1),
            ('low', -1),
            ('close', -1),
            ('volume', -1),
            ('openinterest', -1),
            ('Abra1', 1),
            ('Abra2', 2),
            ('Abra3', 3),
            ('Abra4', 4),
            ('Abra5', 5),
    )
    
    datafields = btfeeds.PandasData.datafields + (['Date', 'Abra1', 'Abra2', 'Abra3', "Abra4", "Abra5"])
    
    mydict = dict(lines=tuple(lines), params=params, datafields=bt.feeds.PandasData.datafields + list(lines),)
    
    PandasDataAbra = type('PandasDataAbra', (btfeeds.PandasData,), mydict)
    
    dataframe = pd.read_csv("C:/Users/user/PycharmProjects/sampledata.csv",
                            skiprows=0,
                            header=0,
                            parse_dates=True,
                            )
    
    data = PandasDataAbra(dataname=dataframe)
    

    Error:

    File "C:\Users\user\PycharmProjects\venv\lib\site-packages\backtrader\linebuffer.py", line 222, in __setitem__
        self.array[self.idx + ago] = value
    TypeError: must be real number, not str
    

    Head (to check out underlying data):

     Date   Abra1   Abra2   Abra3   Abra4  Abra5
    0  3/26/2007  1.2628  1.3365  16.632  18.135  0.845
    1  3/27/2007  2.4871  0.1935  16.776  18.265  0.875
    2  3/28/2007 -1.3629 -0.4005  17.268  18.577  0.450
    3  3/29/2007 -2.8490 -0.1530  17.376  18.577  0.140
    4  3/30/2007 -0.8008 -0.3105  17.400  18.603  0.415
    

    All code:

    from __future__ import (absolute_import, division, print_function,
                            unicode_literals)
    
    import datetime # For datetime objects
    import os.path  # To manage paths
    import sys  # To find out the script name (in argv[0])
    
    import pandas as pd
    import backtrader as bt
    import pandas as pd
    import backtrader.feeds as btfeeds
    from backtrader.feeds import GenericCSVData
    from backtrader.feeds import PandasData
    from backtrader import Cerebro
    from backtrader import feed
    
    
    lines = (
            'Date',
            'Abra1',
            'Abra2',
            'Abra3',
            'Abra4',
            'Abra5',
    )
    
    params = (
    
            ('dtformat', '%m/%d/%Y'),
    
            ('Date', 0),
            ('time', -1),
            ('open', -1),
            ('high', -1),
            ('low', -1),
            ('close', -1),
            ('volume', -1),
            ('openinterest', -1),
            ('Abra1', 1),
            ('Abra2', 2),
            ('Abra3', 3),
            ('Abra4', 4),
            ('Abra5', 5),
    )
    
    datafields = btfeeds.PandasData.datafields + (['Date', 'Abra1', 'Abra2', 'Abra3', "Abra4", "Abra5"])
    
    mydict = dict(lines=tuple(lines), params=params, datafields=bt.feeds.PandasData.datafields + list(lines),)
    
    PandasDataAbra = type('PandasDataAbra', (btfeeds.PandasData,), mydict)
    
    class TestStrategy(bt.Strategy):
    
        def log(self, txt, dt=None):
            ''' Logging function for this strategy'''
            dt = dt or self.datas[0].datetime.date(0)
            print('%s, %s' % (dt.isoformat(), txt))
    
        def __init__(self):
            # Keep references to the custom lines that next() reads
            self.Abra4 = self.datas[0].Abra4
            self.Abra2 = self.datas[0].Abra2
    
            # To keep track of pending orders and buy price/commission
            self.order = None
            self.buyprice = None
            self.buycomm = None
    
        def notify_order(self, order):
            if order.status in [order.Submitted, order.Accepted]:
                # Buy/Sell order submitted/accepted to/by broker - Nothing to do
                return
    
            # Check if an order has been completed
            # Attention: broker could reject order if not enough cash
            if order.status in [order.Completed]:
                if order.isbuy():
                    self.log(
                        'BUY EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' %
                        (order.executed.price,
                         order.executed.value,
                         order.executed.comm))
    
                    self.buyprice = order.executed.price
                    self.buycomm = order.executed.comm
                else:  # Sell
                    self.log('SELL EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' %
                             (order.executed.price,
                              order.executed.value,
                              order.executed.comm))
    
                self.bar_executed = len(self)
    
            elif order.status in [order.Canceled, order.Margin, order.Rejected]:
                self.log('Order Canceled/Margin/Rejected')
    
            # Write down: no pending order
            self.order = None
    
        def notify_trade(self, trade):
            if not trade.isclosed:
                return
    
            self.log('OPERATION PROFIT, GROSS %.2f, NET %.2f' %
                     (trade.pnl, trade.pnlcomm))
    
        def next(self):
            # Simply log the closing price of the series from the reference
            self.log('Close, %.2f' % self.Abra4[0])
    
            # Check if an order is pending ... if yes, we cannot send a 2nd one
            if self.order:
                return
    
            # Check if we are in the market
            if not self.position:
    
                # Not yet ... we MIGHT BUY if ...
                if self.Abra4[0] > self.Abra2[0] and self.Abra4[0] > 0:
    
                        # BUY, BUY, BUY!!! (with default parameters)
                        self.log('BUY CREATE, %.2f' % self.Abra4[0])
    
                        # Keep track of the created order to avoid a 2nd order
                        self.order = self.buy()
            else:
    
                # Already in the market ... we might sell
                if self.Abra4[0] < self.Abra2[0]:
                    # SELL, SELL, SELL!!! (with all possible default parameters)
                    self.log('SELL CREATE, %.2f' % self.Abra4[0])
    
                    # Keep track of the created order to avoid a 2nd order
                    self.order = self.sell()
    
    if __name__ == '__main__':
        cerebro: Cerebro = bt.Cerebro()
    
        cerebro.addstrategy(TestStrategy)
    
        dataframe = pd.read_csv("C:/Users/user/PycharmProjects/sampledata.csv",
                                skiprows=0,
                                header=0,
                                parse_dates=True,
                                )
    
        data = PandasDataAbra(dataname=dataframe)
    
        cerebro.adddata(data)
    
        cerebro.broker.setcash(100000.0)
    
        cerebro.addsizer(bt.sizers.FixedSize, stake=10)
    
        cerebro.broker.setcommission(commission=0.001)
    
        print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())
        print(dataframe.head())
        cerebro.run()
    
        print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())
    
        # cerebro.plot()
    

  • administrators

    @tmi said in Error: When strings show up where they aren't wanted...:

    It's a relatively straightforward project (it's a riff on the Quickstart guide model)

    Sorry, it is not.

    @tmi said in Error: When strings show up where they aren't wanted...:

    My CSV has 6 columns of data: one datetime time series, and 5 custom indicators (no OHLC data)

    Running with no ohlc is a very advanced use case and you have to really know what you are doing.

    @tmi said in Error: When strings show up where they aren't wanted...:

                        self.order = self.buy()
    

    With no ohlc, what are you expecting from (for example) this statement? Which of the AbraX fields from your data should the broker use as the price to execute an order?

    You should consider configuring the PandasData subclass to assign open, high, low and close to correspond to the different AbraX in your data, using the column indices or the field names in the params declaration.

    If those AbraX fields have nothing to do with regular ohlc fields, well you said it was a simple variation of the quickstart example ... but it really isn't.

    @tmi said in Error: When strings show up where they aren't wanted...:

            ('Date', 0),
            ('time', -1),
    

    Please see the docs for PandasData. Those parameters/lines don't exist. There is a single field called datetime (and if Date existed, it would be a lowercase date).

    Which means that the data feed is trying to find the datetime field in the index of the dataframe, which is made of ints and not timestamps. If you see the docs (and this is standard practice when loading a datetime series), the index should be made of datetime timestamps. You probably want to check the pd.read_csv documentation to automatically set the index to the column containing the dates.

    Note: Seeing how the dates are displayed when you print the dataframe, it is likely they haven't been parsed and they are still strings, hence your error.
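    A minimal sketch of the index fix, assuming the CSV looks like the head printed in the question (the inline text below is a stand-in for the file). With parse_dates=True and no index_col, pandas only tries to parse the index, which here is the default integer range, so the Date column stays as strings. Naming Date as the index column lets the same flag parse it.

```python
import io

import pandas as pd

# Stand-in for sampledata.csv, built from the head shown in the question
csv_text = """Date,Abra1,Abra2,Abra3,Abra4,Abra5
3/26/2007,1.2628,1.3365,16.632,18.135,0.845
3/27/2007,2.4871,0.1935,16.776,18.265,0.875
3/28/2007,-1.3629,-0.4005,17.268,18.577,0.450
"""

dataframe = pd.read_csv(
    io.StringIO(csv_text),  # replace with the path to the real file
    header=0,
    index_col='Date',       # use the Date column as the index ...
    parse_dates=True,       # ... and parse that index into timestamps
)

# The index should now hold timestamps instead of strings
print(type(dataframe.index))
print(dataframe.head())
```

    If the dates are parsed, the printed head shows them as 2007-03-26 rather than 3/26/2007, and the remaining columns are all numeric.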

    Some personal advice

    @tmi said in Error: When strings show up where they aren't wanted...:

    PandasDataAbra = type('PandasDataAbra', (btfeeds.PandasData,), mydict)
    

    Unless you have good reasons for using type, the only thing you are achieving is the pollution of the global namespace with unneeded lines, params, mydict declarations.

    Keep things simple; using type isn't going to get you to magical places. Regular subclassing is already a very powerful tool.

    @tmi said in Error: When strings show up where they aren't wanted...:

    datafields = btfeeds.PandasData.datafields + (['Date', 'Abra1', 'Abra2', 'Abra3', "Abra4", "Abra5"])
    

    datafields hasn't been needed for a long time.

