
progress bar / ETA during optimization



  • While running a large optstrategy it would be useful to get an ETA, or to monitor the percentage of parameter combinations completed so far. I tried the tqdm package, but it gets confused by the multiple processes. tqdm can deal with multiprocessing, but it needs to be called by the function that distributes the jobs, and I don't think that is possible without modifying the cerebro code.
    I also implemented a simple iteration counter in the strategy's stop method:

    progress_i = 0
    
    class MyStrat(bt.Strategy):
        .
        .
        .
        def stop(self):
            global progress_i
            progress_i += 1
            if progress_i % 100 == 0:
                print("%d jobs finished" %progress_i)
    
    def optimize():
        .
        .
        .
        cerebro.optstrategy(MyStrat, **parameter_ranges)
        optrets = cerebro.run()
        return optrets
    

    The problem here is that with 48 processors I get 48 (or maybe 47) outputs saying "100 jobs finished", because each worker process keeps its own copy of progress_i. I could tweak things so that progress_i is multiplied by the number of processes, but I am not sure I can make only one worker process print its progress. So I get 47 or 48 lines for each progress update, which is not very nice.

    Does anybody have a suggestion for somewhat prettier output?


  • administrators

    It seems you want to add Optimization Callbacks.

    See: Docs - Cerebro and look for optcallback



  • Thanks. Would you have an example script using callback / optcallback? I am struggling to understand how to use them.


  • administrators

    There isn't, but it is easy:

    • If your optimization is going to end up iterating over 1000 strategies, your callback will be called 1000 times, each time with the strategy that just finished its run.


  • @backtrader Sorry I am still kind of lost. In my main function I understand that I should have

    cerebro.optstrategy(MyStrategy, **param_range,
                            printlog=False)
    cerebro.optcallback(my_callback)
    cerebro.run()
    

    But where do I define my_callback? And what should it look like? Will it call some method of the strategies? If yes, which one?

    I was trying to do the dumbest test, i.e. print("finished") after each strategy finishes, but I couldn't get it to work. How would you do that?


  • administrators

    You define it wherever you want, but you obviously need to be able to give it to cerebro before entering run.

    From the document linked above:

    optcallback(cb)
    
      - Adds a callback to the list of callbacks that will be called with the optimizations when each of the strategies has been run
    
      - The signature: cb(strategy)
    

    You define it wherever you want, but you obviously need to be able to pass it to cerebro.
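
    A minimal sketch of the "print after each strategy finishes" test, relying only on the documented signature cb(strategy). The names my_callback and _run_numbers are illustrative, and the cerebro wiring is left as comments because it assumes an already configured cerebro, MyStrategy and param_range.

```python
import itertools

# Running count of finished optimization runs (illustrative helper).
_run_numbers = itertools.count(1)


def my_callback(strategy):
    """Callback with the documented signature cb(strategy).

    It receives whatever cerebro passes in for the run that just finished;
    this sketch ignores it and only reports how many runs have completed.
    """
    n = next(_run_numbers)
    print("finished run %d" % n)
    return n


# Hypothetical wiring, assuming cerebro, MyStrategy and param_range exist:
# cerebro.optstrategy(MyStrategy, **param_range)
# cerebro.optcallback(my_callback)
# cerebro.run()
```

    The callback can live at module level next to the strategy class; the only requirement stated in this thread is that it is registered before run is called.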



  • Is there a convenient way to get the total number of optimization combinations that will be run (besides counting them manually)?
    It would be nice to be able to print the optimization progress as a percentage, or as something like "done x out of y runs".

    As a workaround I am doing this currently:

    opt_count = len(list(itertools.product(*cerebro.strats)))
    

  • administrators

    @vbs said in progress bar / ETA during optimization:

    As a workaround I am doing this currently:
    opt_count = len(list(itertools.product(*cerebro.strats)))

    Which will explode with an out_of_memory exception as soon as you do anything which is not trivial.

    And no, you cannot, because you can pass something which generates values (a range is not a pure generator, but you could pass a generator) and have an infinite amount of possibilities.



  • Ok thanks. Too bad though. Yes, converting to a list can consume a huge amount of memory at that moment. But I think it could be rewritten to consume the generator and just count the loops without the intermediate list.
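
    A sketch of counting without the intermediate list, using made-up parameter ranges rather than cerebro.strats. count_combinations and count_sized are hypothetical helper names. Some caveats: itertools.product still copies each input into a tuple, so only the combinations are not materialised; counting never returns for an infinite generator; and a one-shot iterator is consumed by being counted, so it cannot be reused for the actual run afterwards.

```python
import functools
import itertools
import operator


def count_combinations(*iterables):
    """Count the items of itertools.product(*iterables) one at a time,
    without building a list of all combinations."""
    return sum(1 for _ in itertools.product(*iterables))


def count_sized(*iterables):
    """Shortcut when every input supports len(): multiply the lengths
    instead of iterating over the combinations at all."""
    return functools.reduce(operator.mul, (len(it) for it in iterables), 1)


# Illustrative ranges only: 10 * 3 * 2 combinations.
periods = range(10, 20)
multipliers = range(5, 8)
flags = [True, False]

print(count_combinations(periods, multipliers, flags))
print(count_sized(periods, multipliers, flags))
```

    The first helper still takes time proportional to the number of combinations; the second is constant time per input but only works for sized inputs such as ranges and lists.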

    How does it work having an infinite amount of runs? Would you evaluate each new result in optcallback and then somehow stop the run when it meets certain conditions?



  • @backtrader said in progress bar / ETA during optimization:

    It seems you want to add Optimization Callbacks.

    See: Docs - Cerebro and look for optcallback

    For me this callback does not work when using maxcpus=1, but it works fine with, for example, maxcpus=8. Is that intended, or possibly a bug?
    "Does not work" means that it never gets called.

    The documentation at least does not mention any limitations:
    https://www.backtrader.com/docu/cerebro.html#backtrader.Cerebro.optcallback


  • administrators

    @vbs said in progress bar / ETA during optimization:

    maxcpus=1

    This disables multiprocessing and runs as a regular backtesting process.



  • Yes, I understand it disables multiprocessing when using maxcpus=1 but it is still executing optimization and gives me optimization results, no?

    My point is that the optcallback is not called during the optimization process when maxcpus=1. It is called though when maxcpus is greater than 1.


  • administrators

    That was understood. The comment pointed out that it is following a different path in the code and using the same code as a non-optimizing run. Hence the lack of callback invocation.



  • Ok, so it is intended behavior. I already assumed it would be related to the code path. I think it would be good if optcallback were called regardless of the maxcpus parameter.
    I tried simply adding it myself, but I am not sure whether it introduces other unexpected behavior or side effects. So could it be as simple as this? Is there any reason not to have it?

            if not self._dooptimize or self.p.maxcpus == 1:
                # If no optimization is wished ... or 1 core is to be used
                # let's skip process "spawning"
                for iterstrat in iterstrats:
                    runstrat = self.runstrategies(iterstrat)
                    self.runstrats.append(runstrat)
                    if self._dooptimize:
                        for cb in self.optcbs:
                            cb(runstrat)  # callback receives finished strategy
    
    

    (last 3 lines added)


  • administrators

    Release 1.9.63.122



  • Did anyone figure out a progress bar solution (ideally for a Jupyter Notebook) that's simple enough to implement? @Benoît-Zuber perhaps?



  • @tw00000
    As far as I can see it works fine when using cerebro.optcallback() to manually trigger tqdm.update().
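
    A sketch of what that wiring might look like. make_progress_callback and CountingBar are hypothetical names; CountingBar is only a stand-in so the snippet runs without tqdm installed. It assumes the callback is invoked in the parent process, once per finished run, as described earlier in this thread.

```python
def make_progress_callback(bar):
    """Return a cb(strategy) that advances `bar` by one step per finished run.

    `bar` can be any object with an update(n) method, e.g. a tqdm instance.
    """
    def callback(strategy):
        bar.update(1)

    return callback


class CountingBar:
    """Stand-in for a tqdm bar so this sketch has no third-party imports."""

    def __init__(self, total):
        self.total = total
        self.n = 0

    def update(self, n=1):
        self.n += n
        print("done %d out of %d runs" % (self.n, self.total))


bar = CountingBar(total=3)
cb = make_progress_callback(bar)
for finished in ("run-a", "run-b", "run-c"):  # placeholders for strategies
    cb(finished)
```

    With tqdm, the stand-in would be replaced by something like pbar = tqdm(total=opt_count), registered with cerebro.optcallback(make_progress_callback(pbar)) before cerebro.run(), and closed with pbar.close() afterwards. Here opt_count is the number of combinations, which has to be known or counted beforehand.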



  • @tw00000 no I did not try much...



  • @vbs said in progress bar / ETA during optimization:

    @tw00000
    As far as I can see it works fine when using cerebro.optcallback() to manually trigger tqdm.update().

    I spent the last hour or so trying to figure out what that looks like, but I can't figure it out. There's not a whole lot of explanation for how to use tqdm.update() out there... Also, it would be great to have a progress bar for non-optimization runs of backtrader (i.e. just standard runs, but on long dataframes)

    The closest I've been able to get:

    Create a decorator function for tqdm, like so (borrowed from https://gist.github.com/duckythescientist/c06d87617b5d6ac1e00a622df760709d) :

    import time
    import threading
    import functools
    import tqdm
    
    def provide_progress_bar(function, estimated_time, tstep=0.2, tqdm_kwargs={}, args=[], kwargs={}):
        """Tqdm wrapper for a long-running function
    
        args:
            function - function to run
            estimated_time - how long you expect the function to take
            tstep - time delta (seconds) for progress bar updates
            tqdm_kwargs - kwargs to construct the progress bar
            args - args to pass to the function
            kwargs - keyword args to pass to the function
        ret:
            function(*args, **kwargs)
        """
        ret = [None]  # Mutable var so the function can store its return value
        def myrunner(function, ret, *args, **kwargs):
            ret[0] = function(*args, **kwargs)
    
        thread = threading.Thread(target=myrunner, args=(function, ret) + tuple(args), kwargs=kwargs)
        pbar = tqdm.tqdm(total=estimated_time, **tqdm_kwargs)
    
        thread.start()
        while thread.is_alive():
            thread.join(timeout=tstep)
            pbar.update(tstep)
        pbar.close()
        return ret[0]
    
    
    def progress_wrapped(estimated_time, tstep=0.2, tqdm_kwargs={}):
        """Decorate a function to add a progress bar"""
        def real_decorator(function):
            @functools.wraps(function)
            def wrapper(*args, **kwargs):
                return provide_progress_bar(function, estimated_time=estimated_time, tstep=tstep, tqdm_kwargs=tqdm_kwargs, args=args, kwargs=kwargs)
            return wrapper
        return real_decorator
    

    Add this to the Strategy:

    from tqdm_function import progress_wrapped
    
    @progress_wrapped(estimated_time=1500)
    class firstStrategy(bt.Strategy):
    

    But this fails with:

      0%|          | 0/1500 [00:00<?, ?it/s]Exception in thread Thread-4:
    Traceback (most recent call last):
      File "/anaconda3/lib/python3.6/threading.py", line 916, in _bootstrap_inner
        self.run()
      File "/anaconda3/lib/python3.6/threading.py", line 864, in run
        self._target(*self._args, **self._kwargs)
      File "~/tqdm_function.py", line 24, in myrunner
        ret[0] = function(*args, **kwargs)
      File "/anaconda3/lib/python3.6/site-packages/backtrader/metabase.py", line 86, in __call__
        _obj, args, kwargs = cls.donew(*args, **kwargs)
      File "/anaconda3/lib/python3.6/site-packages/backtrader/strategy.py", line 72, in donew
        _obj._id = cerebro._next_stid()
    AttributeError: 'NoneType' object has no attribute '_next_stid'
    
      0%|          | 0.2/1500 [00:00<00:21, 70.89it/s]