Backtrader Community


    Cerebro run spends 20 seconds without any analyzer

    General Discussion
    • Xavier Escudero
      Xavier Escudero last edited by

      Hi. I want to build a screener that filters more than 2000 stocks based on dynamic parameters entered by the user. I've created two classes:

      • ScreenerStrategy. The __init__ method loads all the data and creates the indicators.
      • ScreenerAnalyzer. The stop method applies the filters to produce the results.

      A run takes more than 20 seconds. I already had a working solution in pandas with a response time of about 2 seconds, but I want a better architecture, so I am trying to migrate it to backtrader.

      I've noticed something curious that maybe you can help me with.

      If I comment out the 'addanalyzer' line, cerebro.run takes just as long, so I think cerebro is doing a lot of work outside the analysis that I don't need. I've tried every cerebro parameter I could find, with no change. Can you help me?

      class ScreenerStrategy(bt.Strategy):
          def __init__(self):
              self.inds = dict()
              self.inds['RSI'] = dict()
              self.inds['SMA'] = dict()
      
              for i, d in enumerate(self.datas):
      
                  # For each indicator we want to track its value and whether it is
                  # bullish or bearish. We can do this by creating a new line that returns
                  # true or false.
      
                  # RSI
                  self.inds['RSI'][d._name] = dict()
                  self.inds['RSI'][d._name]['value']  = bt.indicators.RSI(d, period=14, safediv=True)
                  self.inds['RSI'][d._name]['bullish'] = self.inds['RSI'][d._name]['value']  > 50
                  self.inds['RSI'][d._name]['bearish'] = self.inds['RSI'][d._name]['value']  < 50
      
                  # SMA
                  self.inds['SMA'][d._name] = dict()
                  self.inds['SMA'][d._name]['value']  = bt.indicators.SMA(d, period=20)
                  self.inds['SMA'][d._name]['bullish'] = d.close > self.inds['SMA'][d._name]['value']
                  self.inds['SMA'][d._name]['bearish'] = d.close < self.inds['SMA'][d._name]['value']
      
      class ScreenerAnalyzer(bt.Analyzer):
          params = dict(period=10)
          
          def stop(self):
              print('-'*80)
              results = dict()
              for key, value in self.strategy.inds.items():
                  results[key] = list()

                  for nested_key, nested_value in value.items():
                      ...
      
      

      Execution:

              cerebro = bt.Cerebro(runonce=True,                         
                               stdstats=False, # Remove observers
                               writer=False,                         
                               optdatas=False,
                               optreturn=False
                               )
      
              cerebro.addstrategy(ScreenerStrategy)
      
              for i in range(len(self.data_list)):            
                  data = PandasData(
                      dataname=self.data_list[i][0], # Pandas DataFrame
                      name=self.data_list[i][1] # The symbol
                      )            
                  cerebro.adddata(data)
             
              # cerebro.addanalyzer(ScreenerAnalyzer)
              cerebro.run()
      
      • vladisld
        vladisld last edited by

        A wild guess: most of the time may be going into loading the data. I would measure it with timeit:

               start_time = timeit.default_timer()
               for i in range(len(self.data_list)):            
                   data = PandasData(
                        dataname=self.data_list[i][0], # Pandas DataFrame
                        name=self.data_list[i][1] # The symbol
                        )            
                   cerebro.adddata(data)
               elapsed = timeit.default_timer() - start_time
               print(f'loading data: {elapsed}')       
        
        
               #cerebro.addanalyzer(ScreenerAnalyzer)
               start_time = timeit.default_timer()
               cerebro.run()
               elapsed = timeit.default_timer() - start_time
               print(f'run: {elapsed}')       
        
        • Xavier Escudero
          Xavier Escudero @vladisld last edited by

          @vladisld 20 seconds is the time printed for run, not for the loading of the data

          • Xavier Escudero
            Xavier Escudero @vladisld last edited by

            @vladisld I've checked: 16 of those seconds are spent in __init__, creating the indicators for all the stocks (more than 1400).
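
One way to get a breakdown like this is Python's built-in cProfile. The sketch below is only an illustration: `build_indicators` and `run_screener` are hypothetical stand-ins for the strategy's `__init__` and for `cerebro.run()`, not backtrader code.

```python
import cProfile
import io
import pstats


def build_indicators(n_stocks):
    # Hypothetical stand-in for the per-stock indicator setup in __init__.
    return [sum(range(200)) for _ in range(n_stocks)]


def run_screener():
    # Hypothetical stand-in for cerebro.run().
    build_indicators(2000)


profiler = cProfile.Profile()
profiler.enable()
run_screener()
profiler.disable()

# Sort by cumulative time so the most expensive call chains come first.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

In the real program the same pattern would wrap the call to `cerebro.run()`, and the report would show which functions account for most of the run time.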

            My understanding was that __init__ is called only once and that the same object could then be reused for different analyses, but it doesn't really seem to work that way. Is there any way to load the data feeds and create the indicators outside the strategy/analyzer, and then run the analyzers/strategy on that data?

            • run-out
              run-out @Xavier Escudero last edited by

              @Xavier-Escudero said in Cerebro run spends 20 seconds without any analyzer:

              A run takes more than 20 seconds. I already had a working solution in pandas with a response time of about 2 seconds, but I want a better architecture, so I am trying to migrate it to backtrader.

              My inclination would be to do the heavy lifting in pandas before you load each data feed into cerebro. You can use bta-lib for the calculations and add a line to your data for each of your RSI and SMA lines, including the value/bullish/bearish indicators.

              Pandas with bta-lib should solve the time problem.
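
The idea of computing the indicators once up front and then filtering as often as needed can be sketched without any libraries. This is a plain-Python stand-in, not the pandas/bta-lib code itself; the helper names (`sma`, `precompute`, `screen`) and the sample prices are made up for illustration.

```python
from statistics import fmean


def sma(values, period):
    # Simple moving average; None until enough values are available.
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)
        else:
            out.append(fmean(values[i + 1 - period:i + 1]))
    return out


def precompute(closes_by_symbol, period=3):
    # Do the expensive indicator work once per symbol.
    table = {}
    for symbol, closes in closes_by_symbol.items():
        last_sma = sma(closes, period)[-1]
        last_close = closes[-1]
        table[symbol] = {
            "close": last_close,
            "sma": last_sma,
            "bullish": last_sma is not None and last_close > last_sma,
        }
    return table


def screen(table, predicate):
    # Cheap step: apply any user-supplied filter to the precomputed rows.
    return sorted(s for s, row in table.items() if predicate(row))


prices = {
    "AAA": [1.0, 2.0, 3.0, 4.0],
    "BBB": [4.0, 3.0, 2.0, 1.0],
    "CCC": [5.0, 5.0],  # too short for a 3-period SMA
}
table = precompute(prices, period=3)
bullish = screen(table, lambda row: row["bullish"])
bearish = screen(
    table, lambda row: row["sma"] is not None and row["close"] < row["sma"]
)
print(bullish, bearish)
```

Because the table is built only once, changing the user's filter only reruns `screen`, which is the cheap part.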

              RunBacktest.com

              • Xavier Escudero
                Xavier Escudero @run-out last edited by

                @run-out Great! Do you have an example I could work from?

                I've seen that bta-lib generates lines, but you can also get the result as a pandas DataFrame:

                sma = btalib.sma(df, period=20)
                df = sma.df
                

                I am feeding cerebro's 'adddata' a pandas DataFrame that holds the 'ohlcv' columns.

                1. Do I need to join the ohlcv data and a new sma column into a single pandas DataFrame, or can the lines be passed separately?

                2. I am also not sure whether I need to adapt my PandasData class (see below).

                Thanks in advance for your help!

                 cerebro = bt.Cerebro()
                 for i in range(len(self.data_list)):            
                      data = PandasData(
                           dataname=self.data_list[i][0], # Pandas DataFrame
                           name=self.data_list[i][1] # The symbol
                      )            
                      cerebro.adddata(data)
                
                class PandasData(btfeed.PandasData):
                    '''
                    The ``dataname`` parameter inherited from ``feed.DataBase`` is the pandas
                    DataFrame
                    '''
                
                    params = (
                        ('nullvalue', 0.0),
                        # Possible values for datetime (must always be present)
                        #  None : datetime is the "index" in the Pandas Dataframe
                        #  -1 : autodetect position or case-wise equal name
                        #  >= 0 : numeric index to the column in the pandas dataframe
                        #  string : column name (as index) in the pandas dataframe
                        ('datetime', None),
                
                        # Possible values below:
                        #  None : column not present
                        #  -1 : autodetect position or case-wise equal name
                        #  >= 0 : numeric index to the column in the pandas dataframe
                        #  string : column name (as index) in the pandas dataframe
                        ('open', 'o'),
                        ('high', 'h'),
                        ('low', 'l'),
                        ('close', 'c'),
                        ('volume', 'v'),
                        ('openinterest', None),
                    )
                
                • run-out
                  run-out @Xavier Escudero last edited by

                  @Xavier-Escudero said in Cerebro run spends 20 seconds without any analyzer:

                  1. Do I need to join the ohlcv data and a new sma column into a single pandas DataFrame, or can the lines be passed separately?

                  Yes, I would put the bta-lib lines beside the ohlcv in the same dataframe. See here for some examples:

                  https://community.backtrader.com/topic/2971/multiple-assets-with-custom-pandas-dataframe/2

                  https://community.backtrader.com/topic/2428/create-indicator-line-from-dataframe-not-from-data-in-cerebros/5

                  • Xavier Escudero
                    Xavier Escudero @run-out last edited by

                    @run-out Thanks. I tried it, but I get the following error:

                    clslines = baselines + lines
                    TypeError: can only concatenate tuple (not "str") to tuple
                    

                    The error seems to appear when I declare the lines field:

                    class PandasData(btfeed.PandasData):
                        '''
                        The ``dataname`` parameter inherited from ``feed.DataBase`` is the pandas
                        DataFrame
                        '''    
                        lines = ('sma')
                        
                        params = (
                            ('nullvalue', 0.0),
                            # Possible values for datetime (must always be present)
                            #  None : datetime is the "index" in the Pandas Dataframe
                            #  -1 : autodetect position or case-wise equal name
                            #  >= 0 : numeric index to the column in the pandas dataframe
                            #  string : column name (as index) in the pandas dataframe
                            ('datetime', None),
                    
                            # Possible values below:
                            #  None : column not present
                            #  -1 : autodetect position or case-wise equal name
                            #  >= 0 : numeric index to the column in the pandas dataframe
                            #  string : column name (as index) in the pandas dataframe
                            ('open', 'o'),
                            ('high', 'h'),
                            ('low', 'l'),
                            ('close', 'c'),
                            ('volume', 'v'),
                            ('openinterest', None),
                            ('sma', -1)        
                        )
                        
                        datafields = btfeed.PandasData.datafields + (
                            [
                                'sma'
                            ]
                        )    
                    

                    And I am creating the data as:

                      df = pd.DataFrame.from_records(ticks) 
                      df = df.join(btalib.sma(df, period=20).df)
                    

                    The data looks fine: every value is a float or NaN (for example, the first SMA values), so I don't understand what the error means.

                    Thanks again.

                    • Xavier Escudero
                      Xavier Escudero @Xavier Escudero last edited by

                      Update: it works after adding a ',' at the end of lines:

                      lines = ('sma',)
                      
                      • run-out
                        run-out @Xavier Escudero last edited by

                        @Xavier-Escudero said in Cerebro run spends 20 seconds without any analyzer:

                        Update: it works after adding a ',' at the end of lines:
                        lines = ('sma',)

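                        Without the comma, ('sma') is just the string 'sma' in parentheses. Adding the comma turns the right side into a one-element tuple, which can then be concatenated with the base lines tuple.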
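
The difference can be reproduced in plain Python. In this small sketch, `base_lines` is a made-up stand-in for the lines tuple of a parent class; only the tuple behaviour is the point.

```python
not_a_tuple = ('sma')   # parentheses alone do nothing: this is a str
one_tuple = ('sma',)    # the trailing comma makes a one-element tuple

# Hypothetical stand-in for the lines already defined by a parent class.
base_lines = ('close', 'volume')

try:
    base_lines + not_a_tuple
    error_message = None
except TypeError as exc:
    # Same kind of failure as the "baselines + lines" traceback above.
    error_message = str(exc)

combined = base_lines + one_tuple

print(type(not_a_tuple).__name__, type(one_tuple).__name__)
print(error_message)
print(combined)
```

The first concatenation fails because a tuple cannot be joined with a string, while the second one simply appends 'sma' to the existing lines.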
