
Running multiple instances



  • Hey guys, so I want to run 1 strategy but with various combinations of the passed parameters, and get the output so I can compare which set of parameters best suits what I'm after.

    Do any of you guys do this sort of thing, and if so, how do you approach it?

    • python threading with multiple instances of cerebro, each having its own "current" parameters
    • something else (what?)


  • read the docs

    cerebro.optstrategy
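
    A minimal optstrategy sketch (the SmaCross strategy and its period parameter below are illustrative, not taken from the docs):

    import backtrader as bt

    class SmaCross(bt.Strategy):
        params = (("period", 20),)

        def __init__(self):
            self.sma = bt.indicators.SMA(self.data, period=self.p.period)

    cerebro = bt.Cerebro()
    # Add your data feed here, e.g. cerebro.adddata(...)
    cerebro.optstrategy(SmaCross, period=range(10, 31, 5))  # one run per period value
    results = cerebro.run()  # one entry (itself a list of strategies) per parameter combination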



  • Hello, I'm aware of optstrategy, but I need to use the analyzer / benchmark results directly inside that strategy, so on each iteration I want to extract Sharpe, Sortino, CAGR, drawdown, etc. for that particular parameter run.

    Is that possible?



  • I.e. add the analyzer directly INTO the strategy, and then use its results in stop(self).



  • I can share the code I use for this. I have not attempted to generalize my code examples, but I hope this helps. I have two modes for running backtests with parameters, as per your question:

    1. Run individual backtests one at a time, and
    2. Multiprocessing.

    At the heart of each is a collection of dictionaries of parameters, one dictionary per test. So let's say you have ten tests; then you would have ten dictionaries with your parameters, each of which you pass to run_strat, the function that builds cerebro.

    Then it's just a matter of iterating through the collection, building cerebro for each scene, running your backtest, and waiting for the results. When you have your results, save them however you like.
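
    One way to build such a collection (the parameter names below are made up for illustration) is to expand a grid of values with itertools.product. Note that the controller below iterates scenarios directly and expects each item to be a scene dictionary, so a plain list of scenes works; if you keep a dict of dicts, iterate its .values() instead.

    import itertools

    param_grid = {
        "sma_period": [10, 20, 50],
        "brokerage_fee": [0.001],
        "initinvestment": [100000],
    }

    keys = list(param_grid)
    # One scene dict per combination of parameter values, plus the save flags used below.
    scenarios = [
        dict(zip(keys, values), save_agg=True, save_result=True, save_db=False)
        for values in itertools.product(*(param_grid[k] for k in keys))
    ]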

    Here is example code for sequentially running one backtest at a time. 'scenarios' is the collection of 'scenes', each a dictionary holding one set of parameters for one backtest.

    import uuid  # for the per-test id below

    def backtest_controller(scenarios):
        """ Runs multiple backtests, one scene at a time. """

        # Loop through the parameter combinations.
        loop = 1
        for scene in scenarios:
            print("Starting loop {}".format(loop))
            loop += 1
            # Short random id used to tie together the results of this test.
            scene["test_number"] = str(uuid.uuid4()).replace("-", "")[:10]

            # Run the main strategy.
            res, final_value = run_strat(scene)

            # Save the aggregate results spreadsheet.
            if scene["save_agg"]:
                ra.result(res, scene, scene["test_number"])

            # If there are transactions, save the detailed results (and optionally the database).
            if scene["save_result"]:
                if len(res[0].analyzers.getbyname("transactions").get_analysis()) > 0:
                    agg_dict = result(res, scene, scene["test_number"])
                    if scene["save_db"]:
                        df_to_db(agg_dict)
            print(f"Final value {final_value:.2f}")

        return None
    

    Note a few things. I create a test number to help track results later between multiple tables in the results spreadsheet.

    I have three different functions for saving data: an aggregate spreadsheet with test totals, a detailed spreadsheet with multiple sheets per test, and a Postgres database. These are ra.result, result, and df_to_db respectively.

    If you wish to use parallel processing, you need to pass a function to multiprocessing, so there's an extra function step.

    So the same code above is converted into a function that runs just one backtest:

    def backtest_controller_multi(scene):
        """ Runs a single backtest for one scene. """
        # Short random id used to tie together the results of this test.
        scene["test_number"] = str(uuid.uuid4()).replace("-", "")[:10]

        # Run the main strategy.
        res, final_value = run_strat(scene)

        # Save the aggregate results spreadsheet.
        if scene["save_agg"]:
            ra.result(res, scene, scene["test_number"])

        # Build and return the detailed results so the parent process can save them.
        if scene["save_result"]:
            agg_dict = result(res, scene, scene["test_number"])
            return agg_dict
    

    Note the input is only one dictionary (a single scene), not the whole collection of scenes.
    You call backtest_controller_multi through the multiprocessing pool as follows.

    import multiprocessing
    import time

    # st is the module holding backtest_controller_multi; change_params (the save flags)
    # and total_backtests come from the surrounding setup, which is not shown here.

    # multiprocessing.freeze_support()
    start_test = time.time()
    pool = multiprocessing.Pool(processes=multiprocessing.cpu_count() - 2)
    cum_backtest = 0
    backtest_with_trades = 0
    for agg_dict in pool.imap_unordered(st.backtest_controller_multi, scenarios):
        if (
            change_params["save_result"]
            and change_params["save_db"]
            and agg_dict is not None
        ):
            df_to_db(agg_dict)
            backtest_with_trades += 1
        cum_backtest += 1
        print(
            f"Backtests: {cum_backtest:3.0f} / {total_backtests:3.0f} "
            f"backtests with trades {backtest_with_trades:3.0f} -- Elapsed: {(time.time() - start_test):.2f}"
        )
    pool.close()
    print(f"\nThere were {backtest_with_trades} backtests with trades.")
    

    Note a few things.

    • multiprocessing.freeze_support() is commented out. This line is for Windows machines and allows this to run. (I'm on Linux.)
    • By using pool.imap_unordered in a for loop, you are able to start processing results while other worker processes are still running backtests, saving RAM and speeding up the whole run.

    In all instances you are now passing a dictionary to run_strat instead of using argparse. You can pass the scene in as kwargs to your strategy. Of course, you MUST have all of your parameters established in your strategy or backtrader will rightfully throw an error.
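
    For example, a strategy that accepts such a scene might declare every scene key in its params tuple (the names below are a hypothetical set mirroring the scene grid above; any key you pass in via **scene must appear here):

    class Strategy(bt.Strategy):
        params = (
            ("sma_period", 20),
            ("brokerage_fee", 0.001),
            ("initinvestment", 100000),
            ("from_date", None),
            ("to_date", None),
            ("trade_start", None),
            ("printon", False),
            ("save_agg", True),
            ("save_result", True),
            ("save_db", False),
            ("test_number", None),
        )

    Here is the run_strat function that builds and runs cerebro from one scene: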

    
    def run_strat(scene):
        """
        Sets up and runs a back test for a basic moving average strategy.
        :param scene: Dictionary containing all parameters.
        :return: Cerebro strategy object and total value.
        """
    
        # Cerebro create
        cerebro = bt.Cerebro()
    
        # Set Dates
        date_start = scene["from_date"]
        date_end = scene["to_date"]
    
        if scene["printon"]:
            print(
                "Running back test: loading data from {} with trading starts on {} to {}.\nLoading data...".format(
                    scene["from_date"], scene["trade_start"], scene["to_date"]
                )
            )
        else:
            pass
    
        cerebro.addstrategy(Strategy, **scene)
    
        # Broker
        cerebro.broker = bt.brokers.BackBroker()
    
        # Cash
        cerebro.broker.setcash(scene["initinvestment"])
        cerebro.broker.setcommission(commission=scene["brokerage_fee"])
    
        ...
    
        # Cerebro run
        strat = cerebro.run(tradehistory=True)
        
        return strat, cerebro.broker.getvalue()
        
    

    You can use the parameters in the dictionary 'scene' to both set up the cerebro run and feed into your strategy.
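
    To get the per-run metrics asked about in the original question (Sharpe, drawdown, etc.), one option (a sketch, not necessarily the exact setup used above) is to add the analyzers inside run_strat before cerebro.run() and read them off the returned strategy:

    # Inside run_strat, before cerebro.run():
    cerebro.addanalyzer(bt.analyzers.SharpeRatio, _name="sharpe")
    cerebro.addanalyzer(bt.analyzers.DrawDown, _name="drawdown")
    cerebro.addanalyzer(bt.analyzers.Transactions, _name="transactions")

    # After strat = cerebro.run(tradehistory=True), or on the res returned to the controller:
    sharpe = strat[0].analyzers.sharpe.get_analysis()  # e.g. {'sharperatio': ...}
    max_dd = strat[0].analyzers.drawdown.get_analysis()["max"]["drawdown"]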



  • @run-out I'm definitely missing something, just trying to understand: what is the advantage of using your method over a call to optstrategy?



  • @vladisld I'm not sure. No one has answered the OP's question yet. Perhaps there's no advantage to mine?



  • @run-out said in Running multiple instances:

    No one has answered the OP's question yet. Perhaps there's no advantage to mine?

    Your approach would only have an advantage if it can be run using multiple threads. Otherwise I would use optstrategy. As for the OP's question - it has been discussed on the forum several times and is described in the docs in the Cerebro section.

    @chewbacca said in Running multiple instances:

    Is that possible?

    Docs - Cerebro - Returning the results
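
    In code terms, a sketch of what the docs describe (reusing the illustrative SmaCross from earlier in the thread): add the analyzers to cerebro, run the optimization, and read each analyzer off the returned runs.

    cerebro = bt.Cerebro(optreturn=False)  # keep full strategy objects in the results
    cerebro.optstrategy(SmaCross, period=range(10, 31, 5))
    cerebro.addanalyzer(bt.analyzers.SharpeRatio, _name="sharpe")
    cerebro.addanalyzer(bt.analyzers.DrawDown, _name="drawdown")
    # Add your data feed here, e.g. cerebro.adddata(...)

    runs = cerebro.run()  # one list of strategies per parameter combination
    for run in runs:
        for strat in run:
            print(
                strat.p.period,
                strat.analyzers.sharpe.get_analysis(),
                strat.analyzers.drawdown.get_analysis()["max"]["drawdown"],
            )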



  • Perhaps I posted this on the wrong question. @chewbacca I would take the advice of @ab_trader and @vladisld.



  • @run-out I've encountered some issues while using optstrategy and adding an analyzer directly into the strategy's init:
    For example, an analyzer (for use with quantstats) added in init, with the results read in stop(), so I can directly gather what I need to evaluate that parameter run.

    @ init:

        self.testId = self.get_random_string(16)
        self._addanalyzer(CashMarket, _name=self.testId, start_date=self.p.start_date, end_date=self.p.end_date, data=self.vx2)

    @ stop():

        rets = self.analyzers.getbyname(self.testId).get_analysis()
        # ... get stuff, print it out

    For some reason BT processes keep eating memory until all memory is exhausted and the application gets killed by the kernel (OOM condition).

    I'll try your code in a few days, just to see if the memory leak is still present.



  • @chewbacca said in Running multiple instances:

    Not sure if this helps, but in my method I add the analyzer to cerebro before running. I run one backtest, then extract the analyzer data after cerebro and the backtest are finished. Each backtest is set up and run once by the multiprocessing pool.

    If you are saving the results of each backtest to a separate spreadsheet file, then the code above should be no problem memory-wise.

    Let me know if you have any questions.

