progress bar / ETA during optimization
-
While running a large optstrategy it would be useful to get an ETA or to monitor the percentage of parameter combinations that have been completed so far. I have tried using the tqdm package, but it gets confused by the multiple processes. tqdm can deal with multiprocessing, but it needs to be called by the function that distributes the jobs. I don't think that it is possible to do that without modifying the cerebro code.
I also implemented a simple iteration counter in the strategy's stop method:

    progress_i = 0

    class MyStrat(bt.Strategy):
        . . .
        def stop(self):
            global progress_i
            progress_i += 1
            if progress_i % 100 == 0:
                print("%d jobs finished" % progress_i)

    def optimize():
        . . .
        cerebro.optstrategy(MyStrat, **parameter_ranges)
        optrets = cerebro.run()
        return optrets
The problem here is that if I have 48 processors, then I'll get 48 (or maybe 47) outputs saying "100 jobs finished". I could tweak things so that progress_i is multiplied by the number of processes, but I am not sure that I can specify that only one slave process prints out its progress. This means I'll have 47 or 48 lines printed for each progress update, which is not very nice.
Has anybody any suggestion to make a somewhat prettier output?
-
It seems you want to add Optimization Callbacks.
See: Docs - Cerebro and look for optcallback
-
Thanks. Would you have any example script using callback / optcallback ? I struggle to understand how to use them.
-
There isn't, but it is easy:
- If your optimization is going to end up iterating over 1000 strategies, your callback will be called 1000 times, each time with the strategy that just finished its run.
-
@backtrader Sorry I am still kind of lost. In my main function I understand that I should have
    cerebro.optstrategy(MyStrategy, **param_range, printlog=False)
    cerebro.optcallbacks(my_callback)
    cerebro.run()
But where do I define my_callback? And what should it look like? Will it call some method of the strategies? If yes, which one?
I was trying to do the dumbest test, i.e. print("finished") after each strategy finishes, but I couldn't get it to work. How would you do that?
-
You define it wherever you want, but you obviously need to be able to give it to cerebro before entering run.
From the document linked above:
    optcallback(cb)
    Adds a callback to the list of callbacks that will be called with the optimizations when each of the strategies has been run.
    The signature: cb(strategy)
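A minimal sketch of what such a callback could look like, assuming (as the quoted documentation says) that it is called once per finished run with a single argument. `RunCounter` is a hypothetical helper, not part of backtrader, and `MyStrategy` and `param_range` are the placeholders used in the question above. The documented method name is `optcallback` (singular). The sketch also assumes the callback runs in the process that called `cerebro.run()`, so one counter sees every run and prints only one line per update.

```python
class RunCounter:
    """Callable that counts finished optimization runs (hypothetical helper)."""

    def __init__(self, report_every=100):
        self.report_every = report_every
        self.finished = 0

    def __call__(self, strategy):
        # Invoked once per finished run; the argument is whatever cerebro
        # hands to the callback for that run and is not inspected here.
        self.finished += 1
        if self.finished % self.report_every == 0:
            print("%d jobs finished" % self.finished)


# Usage sketch, with the placeholder names from the thread:
#   counter = RunCounter(report_every=100)
#   cerebro.optstrategy(MyStrategy, **param_range)
#   cerebro.optcallback(counter)   # register before calling run()
#   cerebro.run()
```

Any callable with one parameter would do; a class is used here only so the count lives on the object instead of in a module-level global.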
-
Is there a convenient way to get the total number of optimization combinations that will be run (besides counting it manually)?
It would be nice if it were possible to print the optimization progress as a percentage or something like "done x out of y runs". As a workaround I am doing this currently:
    opt_count = len(list(itertools.product(*cerebro.strats)))
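A possible lighter-weight variant is to count from the parameter ranges themselves, before they are handed to optstrategy, so that no list of combinations is ever built. This is only a sketch: the function names and the example `param_ranges` are hypothetical, and it assumes every range is finite. `count_by_lengths` additionally assumes each range has a length (a generator does not and will raise a `TypeError`), and treats a plain scalar or string as a single fixed value.

```python
import itertools
from collections.abc import Iterable


def count_by_lengths(param_ranges):
    """Multiply the sizes of the ranges; never iterates over combinations."""
    total = 1
    for values in param_ranges.values():
        # Assumption: a scalar or a string stands for one fixed value.
        if isinstance(values, str) or not isinstance(values, Iterable):
            continue
        total *= len(values)  # raises TypeError for generators (no length)
    return total


def count_by_iteration(param_ranges):
    """Walk the product lazily and count, without an intermediate list.

    Every value must be a finite iterable here; time grows with the count.
    """
    return sum(1 for _ in itertools.product(*param_ranges.values()))


# Hypothetical ranges, in the shape passed to optstrategy(**param_ranges).
param_ranges = {
    "fast": range(5, 25),        # 20 values
    "slow": range(30, 100, 10),  # 7 values
    "stop": (0.01, 0.02),        # 2 values
}
print(count_by_lengths(param_ranges))  # 20 * 7 * 2 = 280
```

Multiplying lengths is constant work per parameter, while counting by iteration takes time proportional to the number of combinations, so the first form is preferable whenever the ranges are sized.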
-
@vbs said in progress bar / ETA during optimization:
As a workaround I am doing this currently:
    opt_count = len(list(itertools.product(*cerebro.strats)))
Which will explode with an out_of_memory exception as soon as you do anything which is not trivial.
And no, you cannot, because you can pass something which generates values (a range is not a pure generator, but you could pass a generator) and have an infinite amount of possibilities.
-
Ok thanks. Too bad though. Yes, converting to a list can consume a huge amount of memory at that moment. But I think it could be rewritten to consume the generator and just count the loops without the intermediate list.
How does it work having an infinite amount of runs? Would you evaluate each new result in optcallback and then somehow stop the run when it meets certain conditions?
-
@backtrader said in progress bar / ETA during optimization:
It seems you want to add Optimization Callbacks.
See: Docs - Cerebro and look for optcallback
For me this callback does not work when using maxcpus=1. But it works fine, for example, with maxcpus=8. Is that intended or possibly a bug?
"Does not work" means that it never gets called. The documentation at least does not mention any limitations:
https://www.backtrader.com/docu/cerebro.html#backtrader.Cerebro.optcallback
-
@vbs said in progress bar / ETA during optimization:
maxcpus=1
This disables multiprocessing and runs as a regular backtesting process.
-
Yes, I understand it disables multiprocessing when using maxcpus=1, but it is still executing the optimization and gives me optimization results, no?
My point is that the optcallback is not called during the optimization process when maxcpus=1. It is called, though, when maxcpus is greater than 1.
-
That was understood. The comment pointed out that it is following a different path in the code and using the same code as a non-optimizing run. Hence the lack of callback invocation.
-
Ok, so it is intended behavior. I already assumed it would be related to the code path. I think it would be good if optcallback were called regardless of the parameter maxcpus.
I tried simply adding it myself, but I am not sure if it will introduce other unexpected behaviors or side effects. So could it be as simple as this? Any reason to not have it?

    if not self._dooptimize or self.p.maxcpus == 1:
        # If no optimization is wished ... or 1 core is to be used
        # let's skip process "spawning"
        for iterstrat in iterstrats:
            runstrat = self.runstrategies(iterstrat)
            self.runstrats.append(runstrat)
            if self._dooptimize:
                for cb in self.optcbs:
                    cb(runstrat)  # callback receives finished strategy

(last 3 lines added)
-
Release 1.9.63.122
-
Did anyone figure out a progress bar solution (ideally for a Jupyter Notebook) that's simple enough to implement? @Benoît-Zuber perhaps?
-
@tw00000
As far as I can see it works fine when using cerebro.optcallback() to manually trigger tqdm.update().
-
@tw00000 no I did not try much...
-
@vbs said in progress bar / ETA during optimization:
@tw00000
As far as I can see it works fine when using cerebro.optcallback() to manually trigger tqdm.update().
I spent the last hour or so trying to figure out what that looks like, but I can't figure it out. There's not a whole lot of explanation for how to use tqdm.update() out there... Also, it would be great to have a progress bar for non-optimization runs of backtrader (i.e. just standard runs, but on long dataframes).
The closest I've been able to get:
Create a decorator function for tqdm, like so (borrowed from https://gist.github.com/duckythescientist/c06d87617b5d6ac1e00a622df760709d):

    import time
    import threading
    import functools

    import tqdm


    def provide_progress_bar(function, estimated_time, tstep=0.2, tqdm_kwargs={}, args=[], kwargs={}):
        """Tqdm wrapper for a long-running function

        args:
            function - function to run
            estimated_time - how long you expect the function to take
            tstep - time delta (seconds) for progress bar updates
            tqdm_kwargs - kwargs to construct the progress bar
            args - args to pass to the function
            kwargs - keyword args to pass to the function
        ret:
            function(*args, **kwargs)
        """
        ret = [None]  # Mutable var so the function can store its return value

        def myrunner(function, ret, *args, **kwargs):
            ret[0] = function(*args, **kwargs)

        thread = threading.Thread(target=myrunner, args=(function, ret) + tuple(args), kwargs=kwargs)
        pbar = tqdm.tqdm(total=estimated_time, **tqdm_kwargs)

        thread.start()
        while thread.is_alive():
            thread.join(timeout=tstep)
            pbar.update(tstep)
        pbar.close()
        return ret[0]


    def progress_wrapped(estimated_time, tstep=0.2, tqdm_kwargs={}):
        """Decorate a function to add a progress bar"""
        def real_decorator(function):
            @functools.wraps(function)
            def wrapper(*args, **kwargs):
                return provide_progress_bar(function, estimated_time=estimated_time, tstep=tstep, tqdm_kwargs=tqdm_kwargs, args=args, kwargs=kwargs)
            return wrapper
        return real_decorator
Add this to the Strategy:
    from tqdm_function import progress_wrapped

    @progress_wrapped(estimated_time=1500)
    class firstStrategy(bt.Strategy):
But this fails with:
    0%| | 0/1500 [00:00<?, ?it/s]Exception in thread Thread-4:
    Traceback (most recent call last):
      File "/anaconda3/lib/python3.6/threading.py", line 916, in _bootstrap_inner
        self.run()
      File "/anaconda3/lib/python3.6/threading.py", line 864, in run
        self._target(*self._args, **self._kwargs)
      File "~/tqdm_function.py", line 24, in myrunner
        ret[0] = function(*args, **kwargs)
      File "/anaconda3/lib/python3.6/site-packages/backtrader/metabase.py", line 86, in __call__
        _obj, args, kwargs = cls.donew(*args, **kwargs)
      File "/anaconda3/lib/python3.6/site-packages/backtrader/strategy.py", line 72, in donew
        _obj._id = cerebro._next_stid()
    AttributeError: 'NoneType' object has no attribute '_next_stid'
    0%| | 0.2/1500 [00:00<00:21, 70.89it/s]
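The traceback suggests that the decorated strategy class ended up being instantiated inside the helper thread (Thread-4), where it had no cerebro to attach to. A less invasive route is the one suggested earlier in the thread: create the bar outside the strategy and advance it from an optcallback. The sketch below is one way that might look; `make_tqdm_callback` and `run_with_progress` are hypothetical names, `total_runs` has to be supplied by the caller, and, going by the discussion above, the callback may not fire with maxcpus=1 on releases before 1.9.63.122.

```python
def make_tqdm_callback(bar):
    """Return a one-argument callback that advances `bar` by one run.

    `bar` can be any object with an update(n) method, such as a tqdm bar.
    """
    def on_run_finished(strategy):
        # One finished optimization run means one step on the bar.
        bar.update(1)
    return on_run_finished


def run_with_progress(cerebro, total_runs):
    """Run an already configured optimization with a progress bar.

    Assumes `cerebro` has its data and optstrategy set up, and that
    `total_runs` is the number of parameter combinations.
    """
    # Imported here so the callback factory above has no hard dependency.
    # In a Jupyter notebook, `from tqdm.auto import tqdm` may render better.
    from tqdm import tqdm

    with tqdm(total=total_runs) as bar:
        cerebro.optcallback(make_tqdm_callback(bar))
        return cerebro.run()
```

Because tqdm derives its remaining-time estimate from `total` and the update rate, this gives an ETA as well as a percentage, without touching the strategy class.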
-
Just bumping this again, it seems strange that there's no simple way to find out:
- Estimate of how long your backtest will take
- Progress bar for the backtest in progress
Does anyone else have a solution for this?
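For the optimization case at least, a rough estimate can be derived from the callback mechanism discussed above: time the runs finished so far and extrapolate linearly. This is only a sketch under the assumption that runs take roughly equal time and that the total number of runs is known. `EtaTracker` is a hypothetical helper, not a backtrader class, and it does not cover a single non-optimizing backtest.

```python
import time


class EtaTracker:
    """One-argument callback that prints a linear ETA (hypothetical helper)."""

    def __init__(self, total, clock=time.monotonic):
        self.total = total
        self.done = 0
        self._clock = clock      # injectable so the timing can be tested
        self._start = clock()

    def __call__(self, strategy=None):
        self.done += 1
        elapsed = self._clock() - self._start
        # Average time per finished run, times the runs still outstanding.
        remaining = (self.total - self.done) * elapsed / self.done
        print("%d/%d runs done, about %.0f s remaining"
              % (self.done, self.total, remaining))
        return remaining


# Usage sketch: cerebro.optcallback(EtaTracker(total=opt_count))
```

The estimate is noisy for the first few runs and gets steadier as more runs complete.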
-