Beginner issue PandasData



  • Hello, could someone please clarify what I am missing here? I am requesting tick data from Cassandra, converting it to a pandas DataFrame, and finally trying to feed it to backtrader through PandasData. Here is my snippet:

    from datetime import datetime
    import calendar
    import time
    import io
    import argparse
    import backtrader as bt
    import backtrader.feeds as btfeeds

    import pandas as pd
    from cassandra.cluster import Cluster
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.query import dict_factory

    cluster = Cluster(contact_points=['52.4.116.237'], port=9042)
    session = cluster.connect('bovespa')

    session.default_timeout = 60
    session.row_factory = dict_factory

    sql_query = "SELECT symbol, dtsession, preco, qty, time FROM {}.{} where dtsession = '2017-10-10' and symbol = 'PETR4' and time < '11:00:00' limit 1000 allow filtering ;".format('bovespa', 'tb_thikshist2')

    df = pd.DataFrame()

    for row in session.execute(sql_query):
        df = df.append(pd.DataFrame(row, index=[0]))
        #print(row)

    df['fulltime'] = df['dtsession'].map(str) + ' ' + df['time'].map(str)

    #df.sort_values('fulltime')
    df['fulltime'] = pd.to_datetime(df['fulltime'], format='%Y%m%d %H:%M:%S.%f')
    df = df.set_index('fulltime')
    df.sort_index(inplace=True)
    del df['dtsession']
    del df['time']
    del df['symbol']

    df = df.resample('1T', how={'preco' : 'ohlc', 'qty' : 'sum'})
    print(df.to_string)

    class PandasData(bt.feed.DataBase):
        '''
        The dataname parameter inherited from feed.DataBase is the pandas
        DataFrame
        '''

        params = (
            # Possible values for datetime (must always be present)
            #  None : datetime is the "index" in the Pandas Dataframe
            #  -1 : autodetect position or case-wise equal name
            #  >= 0 : numeric index to the column in the pandas dataframe
            #  string : column name (as index) in the pandas dataframe

            # Possible values below:
            #  None : column not present
            #  -1 : autodetect position or case-wise equal name
            #  >= 0 : numeric index to the column in the pandas dataframe
            #  string : column name (as index) in the pandas dataframe
            ('open', 0),
            ('high', 1),
            ('low', 2),
            ('close', 3),
            ('volume', 4),
            ('openinterest', None),
        )
    

    class SmaCross(bt.SignalStrategy):
        params = (('pfast', 2), ('pslow', 4),)

        def __init__(self):
            sma1, sma2 = bt.ind.EMA(period=self.p.pfast), bt.ind.EMA(period=self.p.pslow)
            self.signal_add(bt.SIGNAL_LONG, bt.ind.CrossOver(sma1, sma2))
    

    data = PandasData(dataname=df,timeframe=1)
    cerebro = bt.Cerebro()
    cerebro.adddata(data)
    cerebro.addstrategy(SmaCross)
    cerebro.broker.setcash(100000.0)
    cerebro.run()
    cerebro.plot()

    Here is part of my DataFrame:

    <bound method DataFrame.to_string of                          preco                                      qty
                              open       high        low      close     qty
    fulltime
    2017-10-10 10:07:00  16.129999  16.139999  16.100000  16.139999  193900
    2017-10-10 10:08:00  16.129999  16.170000  16.129999  16.170000   23600
    2017-10-10 10:09:00  16.170000  16.170000  16.129999  16.129999   26200
    2017-10-10 10:10:00  16.129999  16.150000  16.120001  16.120001   29300
    2017-10-10 10:11:00  16.129999  16.139999  16.120001  16.139999   20000
    2017-10-10 10:12:00  16.129999  16.129999  16.110001  16.120001   14400
    2017-10-10 10:13:00  16.120001  16.129999  16.070000  16.090000   42500
    2017-10-10 10:14:00  16.070000  16.080000  16.070000  16.080000   20700

    And finally, here is my error. Sorry to ask such a basic question, but I have been stuck on this for hours.

    File "C:\Python27\lib\site-packages\backtrader\indicators\basicops.py", line 364, in once
      dst[i] = math.fsum(src[i - period + 1:i + 1]) / period
    IndexError: array assignment index out of range



  • I thought it would be useful to leave this hint for beginners and share how I solved the problem myself. I think the issue was the format of the DataFrame; I came to that conclusion by comparing it with the CSV files used in the tutorials and documentation. Resampling with 'ohlc' leaves the columns under a two-level header (preco/open, ..., qty/qty), while the CSV samples have flat open, high, low, close and volume columns. If you are importing your own data, I recommend rebuilding the DataFrame after resampling it with pandas, so that there is no ambiguity in its structure. This is what I did with the resampled DataFrame to make it work:

    df = df.resample('5T').agg({'preco' : 'ohlc', 'qty' : 'sum'})
    d = {'open': df['preco']['open'], 'high': df['preco']['high'], 'low': df['preco']['low'], 'close': df['preco']['close'], 'volume': df['qty']['qty']}
    df2 = pd.DataFrame(data=d)
    

    The snippet above basically rebuilds the DataFrame with the flat column names (open, high, low, close, volume) I had seen in the CSV files.
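
    For anyone who wants to try the idea without a Cassandra connection, here is a minimal sketch of an alternative way to reach the same flat layout: it resamples the price and quantity columns separately, so the two-level header never appears. The tick values are made up for illustration, the helper name ticks_to_bars is hypothetical, and only the column names preco and qty come from the thread.

```python
import pandas as pd

# Made-up ticks for illustration only; the column names follow the thread.
ticks = pd.DataFrame(
    {
        "preco": [16.13, 16.14, 16.10, 16.17, 16.12],
        "qty": [100, 200, 300, 400, 500],
    },
    index=pd.to_datetime(
        [
            "2017-10-10 10:07:05",
            "2017-10-10 10:07:40",
            "2017-10-10 10:08:10",
            "2017-10-10 10:12:30",
            "2017-10-10 10:13:00",
        ]
    ),
)


def ticks_to_bars(ticks, rule="5min"):
    """Hypothetical helper: build flat OHLCV bars from a tick DataFrame."""
    # Resampling a single Series with ohlc() yields plain
    # open/high/low/close columns (no two-level header).
    ohlc = ticks["preco"].resample(rule).ohlc()
    # Traded quantity per interval becomes the volume column.
    volume = ticks["qty"].resample(rule).sum()
    bars = ohlc.assign(volume=volume)
    # Drop intervals without ticks (their OHLC values are NaN).
    return bars.dropna(subset=["close"])


bars = ticks_to_bars(ticks)
print(bars)
```

    The resulting frame has a datetime index and single-level open, high, low, close, volume columns, which is the same shape as df2 above. I would expect a frame like this to work with backtrader's built-in bt.feeds.PandasData as well, but I have not verified that here.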

