Suggestion: caching
-
Hi,
I'd like to make a suggestion to create an option to cache results.
When searching for a profitable strategy, backtrader is frequently used (in my case, and presumably by others too) in a loop: add an indicator or two, rerun, examine the results, and so on. This involves many iterations in which many of the same indicators are recalculated. Caching the calculated series would save a lot of computing time on reruns. In my opinion, caching would be a valuable optional addition to backtrader.
There are multiple ways caching could be achieved. One would be a cache file that is (re)written at the end of a run and checked for already available data at the beginning of a run. Maybe the Writer functionality could be used to implement this.
With a cache, one could avoid calculating the same series time and again.
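To make the cache-file idea a bit more concrete, here is a rough sketch in plain Python. It does not use any backtrader API: `sma` and `cached_indicator` are hypothetical names made up for illustration, and the cache key (a hash of indicator name, parameters and input data) is just one assumption about how stale entries could be avoided.

```python
import hashlib
import json
from pathlib import Path


def sma(values, period):
    """Plain simple moving average; stands in for any costly indicator."""
    return [sum(values[i - period + 1:i + 1]) / period
            for i in range(period - 1, len(values))]


def cached_indicator(cache_dir, func, values, **params):
    """Return (result, was_cached) for func(values, **params).

    Hypothetical helper: the result is stored as JSON in cache_dir and
    reused on later runs with the same indicator, parameters and data.
    """
    # The key covers the indicator name, its parameters and the input data,
    # so a change in any of them leads to a fresh calculation.
    payload = json.dumps([func.__name__, sorted(params.items()), values])
    key = hashlib.sha256(payload.encode()).hexdigest()
    path = Path(cache_dir) / f"{key}.json"

    if path.exists():
        return json.loads(path.read_text()), True   # cache hit

    result = func(values, **params)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result))
    return result, False                             # cache miss
```

Calling `cached_indicator("cache", sma, prices, period=20)` twice would compute the series on the first call and load it from disk on the second. Hashing the full input data is simple but not free for long series; a real implementation would probably fingerprint the data feed in a cheaper way.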
What do you think about this suggestion? Do you consider it useful?
-
Seems like a sensible idea to save some processing power.
-
@hans, could you roughly measure what percentage of the computing time is spent on indicators? How valuable this feature would be depends on that fraction.
-
Without knowing how much computing power @hans is consuming: in the simple case of large optimizations, the same indicator is bound to be calculated repeatedly on the same data. Once many variants of the indicators involved in an optimization have been calculated, the next iteration may need to calculate, for example, just one new variant.
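A toy illustration of how much repetition a parameter grid contains, using only the standard library. Nothing here is backtrader code: `sma`, `run_grid`, the made-up `PRICES` series and the "score" are all assumptions for the sake of the example, and `functools.lru_cache` only helps within a single process.

```python
from functools import lru_cache
from itertools import product

# Made-up price series, stored as a tuple so it can never change under the cache.
PRICES = tuple(float(p) for p in range(1, 51))

calls = {"count": 0}  # counts how often the indicator is really computed


@lru_cache(maxsize=None)
def sma(period):
    """Simple moving average over PRICES, memoized per period."""
    calls["count"] += 1
    return tuple(sum(PRICES[i - period + 1:i + 1]) / period
                 for i in range(period - 1, len(PRICES)))


def run_grid(fast_periods, slow_periods):
    """Evaluate every (fast, slow) combination with a toy score."""
    results = {}
    for fast, slow in product(fast_periods, slow_periods):
        # Toy score: last fast SMA value minus last slow SMA value.
        results[(fast, slow)] = sma(fast)[-1] - sma(slow)[-1]
    return results
```

With `run_grid([5, 10, 15], [20, 30, 40])` there are 9 combinations and 18 indicator lookups, but only 6 distinct series, so 12 of the lookups are served from the cache.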
At the same time and due to the GIL not allowing proper multi-threading, optimization on multi-core architectures is done with the multiprocessing module and that means that the calculation of the indicators would have to take place on shared memory (to avoid the overhead of communicating large amounts of data between processes) and that requires departing from the direct usage of the multiprocessing module.
-
my 2 cents..
The following approach might work using the existing multiprocessing architecture.
- Since optimization iterations are mainly based on parameters that are passed as ranges, we might simply first compute all the indicators that use those range-based parameters and save them to a file. Even this could be done with multiple processes, each dedicated to a specific indicator.
- Once this computation is done, the actual strategy processing would begin, using the various combinations of those indicators as applicable. It would only need to read the cached data for a given indicator from the file in read-only mode, so there is no need to worry about locking. This would still use multiprocessing as it is used today, just with the indicators computed separately beforehand.
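The two phases above could look roughly like this. It is only a sketch in plain Python with hypothetical names (`precompute`, `evaluate`, `optimize`, one JSON file per SMA period) and a toy score; it says nothing about how backtrader's indicators would have to change. The `mapper` argument defaults to the built-in `map`; the assumption is that something like `multiprocessing.Pool(...).map` could be passed instead, since both worker functions are top-level and only exchange small tuples.

```python
import json
from itertools import product
from pathlib import Path


def sma(values, period):
    """Plain simple moving average; stands in for any costly indicator."""
    return [sum(values[i - period + 1:i + 1]) / period
            for i in range(period - 1, len(values))]


def precompute(task):
    """Phase 1 worker: compute one indicator variant and write its own file."""
    cache_dir, values, period = task
    path = Path(cache_dir) / f"sma_{period}.json"
    path.write_text(json.dumps(sma(values, period)))
    return str(path)


def evaluate(task):
    """Phase 2 worker: read the cached series (read-only) and score one combo."""
    cache_dir, fast, slow = task
    series = {p: json.loads((Path(cache_dir) / f"sma_{p}.json").read_text())
              for p in (fast, slow)}
    # Toy score: last fast SMA value minus last slow SMA value.
    return (fast, slow), series[fast][-1] - series[slow][-1]


def optimize(cache_dir, values, fast_periods, slow_periods, mapper=map):
    """Run both phases; mapper may be swapped for a process pool's map."""
    Path(cache_dir).mkdir(parents=True, exist_ok=True)
    # Phase 1: every distinct period is computed exactly once.
    periods = sorted(set(fast_periods) | set(slow_periods))
    list(mapper(precompute, [(cache_dir, values, p) for p in periods]))
    # Phase 2: combinations only read the files written in phase 1.
    combos = product(fast_periods, slow_periods)
    return dict(mapper(evaluate, [(cache_dir, f, s) for f, s in combos]))
```

Because each file is written by exactly one phase-1 task and phase 2 starts only after phase 1 has finished, the readers never race with a writer, which is the "no worry about locking" point.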
-
The idea can in theory work, but needs a complete re-implementation of how the indicators work.