There doesn't seem to be a fully supported Python API but there are some projects:
With regard to implementation, the latest broker to be added was Oanda, and it should serve as guidance. How it is foreseen (but not set in stone):
Development of a Store which is a singleton (see the Oanda store -> Source)
This is the entity which actually talks to the API, creating threads and managing events if need be
Development of a Data Feed (as stated above, see the Oanda data feed -> Source)
Development of a Broker (as stated above, see the Oanda broker -> Source)
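The Store piece above is a singleton. As a purely illustrative sketch of that pattern (the class and method names here — `HypotheticalStore`, `start_streaming` — are assumptions, not the actual Oanda code):

```python
import threading

class HypotheticalStore(object):
    """Sketch of a broker Store singleton: the single entity that
    talks to the broker API and spawns threads / collects events.
    Illustrative only -- not backtrader's real OandaStore."""

    _singleton = None
    _lock = threading.Lock()

    def __new__(cls, *args, **kwargs):
        # Always hand back the one shared instance
        with cls._lock:
            if cls._singleton is None:
                cls._singleton = super().__new__(cls)
        return cls._singleton

    def __init__(self):
        if getattr(self, '_initialized', False):
            return  # already set up by a previous construction
        self._initialized = True
        self.notifs = []  # events collected from background threads

    def start_streaming(self):
        # A real store would spawn threads here to talk to the API
        t = threading.Thread(target=self._stream, daemon=True)
        t.start()
        return t

    def _stream(self):
        # Placeholder for the API conversation
        self.notifs.append('connected')
```

The Data Feed and Broker would then each take the store (or fetch the singleton themselves) and route their requests through it.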
backtrader takes a dual approach to the problem. This is controlled with the runonce (boolean) parameter, passed either to the instantiation of Cerebro or to cerebro.run, as in
The default is runonce=True
cerebro = Cerebro(runonce=True) # or False
cerebro = Cerebro()
cerebro.run(runonce=True) # or False
This could be called a pseudo-vectorized or half-vectorized approach. Built-in operations feature a once method which calculates things in batch mode in a tight inner loop.
Data feeds are fully pre-loaded
Indicators (and sub-indicators thereof) are pre-calculated in batch-mode
Then, the Strategy instance(s) are run step-by-step
The goal is to offer an increase in speed while still allowing for fine-grained logic in the next method of the strategy
Rough calculations indicate that it is somewhere between 20-30% faster than runonce=False
Drawback: Because indicators are pre-calculated (and therefore the buffers are pre-allocated), the data synchronization mechanism cannot pause the actual movement of a data feed when synchronizing the timestamps for the strategy, so the buffers cannot be kept at the same final length. This has no actual impact on backtesting, but because matplotlib expects all things to have the same x length for plotting, it may not be possible to create a plot of the backtest.
Drawback 2: The implementation of this mode prevents some indicators from being fully defined in recursive terms with a single formula. A choice had to be made between having this mode or having the recursive formulas.
Nice thing: If a user implements a custom Indicator and only provides a next method (intended for step-by-step operation, see below), the code automatically detects it and will still pre-calculate the indicator using the next method instead of the missing once method. The calculation loop will not be as tight as it could be, but users don't have to worry about implementing once
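A toy, framework-free illustration of that fallback (the `MiniIndicator`/`SMA` classes below are made up for the example; they are not backtrader's real base classes): when a subclass supplies only next, the batch pass can still emulate once by stepping next over every bar.

```python
class MiniIndicator:
    """Toy illustration of the runonce fallback: if a subclass
    defines only next(), the batch pass emulates once() by looping
    next() over every bar."""

    def __init__(self, data):
        self.data = data
        self.array = [None] * len(data)  # pre-allocated result buffer
        self.idx = -1  # current bar index

    def once(self, start, end):
        # Fallback batch pass: step the index and call next() per bar
        for self.idx in range(start, end):
            self.next()

class SMA(MiniIndicator):
    """Simple moving average that only defines next(), no once()."""
    period = 3

    def next(self):
        i = self.idx
        if i >= self.period - 1:
            window = self.data[i - self.period + 1:i + 1]
            self.array[i] = sum(window) / self.period

sma = SMA([1, 2, 3, 4, 5])
sma.once(0, len(sma.data))   # batch pass driven by next()
# sma.array -> [None, None, 2.0, 3.0, 4.0]
```

The loop is obviously not as tight as a real vectorized once, but it produces the same pre-calculated buffer.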
This is a 100% step-by-step mode, also named next because only the next method of the different indicators, strategies et al. plays a role.
Everything is calculated one step at a time. The reason is the addition of data feeds which provide data points one step at a time (not necessarily live feeds; it could be reading out of a socket or from a database connection).
If cerebro is run with preload=False (disable the preloading of data feeds) it will switch to this mode.
@Harel-Rozental said in Smart optimizations and Backtrader:
I managed to get my optimizations to run as quick as optstrategy by subclassing Cerebro and pulling some stuff out of run() and addstrategy() into different functions (one to initialize data loading, and another which gets re-executed from outside the class for optimization).
I also manage multiprocessing from the outside.
Do you mind sharing some code? Or a layout of what was implemented and where, in more detail?
Would be something like this
import pandas as pd
import backtrader as bt
from backtrader.utils import date2num

class PandasData(bt.feed.DataBase):
    # Column index of each field in the DataFrame (-1 -> not present)
    params = (
        ('open', 0), ('high', 1), ('low', 2), ('close', 3),
        ('volume', 4), ('openinterest', 5),
    )

    datafields = [
        'datetime', 'open', 'high', 'low', 'close', 'volume', 'openinterest'
    ]

    def start(self):
        self._df = self.p.dataname  # the DataFrame is passed as dataname

        # Set the standard datafields - except for datetime
        for datafield in self.datafields[1:]:
            # get the column index
            colidx = getattr(self.params, datafield)
            if colidx < 0:
                continue  # column not present -- skip
            l = getattr(self.lines, datafield)
            l.array = self._df.iloc[:, colidx].tolist()

        # datetime is taken directly from the DataFrame index
        field0 = self.datafields[0]
        dts = pd.to_datetime(self._df.index).to_series()
        getattr(self.l, field0).array = dts.apply(date2num).tolist()
Where datetime is taken directly from the index. The default column offsets for the other fields are probably off by one because of it, but they can luckily be configured
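A quick pandas-only illustration of why the offsets end up one off (assuming the DataFrame is built with the datetime column consumed as the index, e.g. via read_csv with index_col=0):

```python
import pandas as pd
from io import StringIO

# Small made-up CSV just for the demonstration
csv = """datetime,open,high,low,close,volume,openinterest
2020-01-02,10,11,9,10.5,1000,0
2020-01-03,10.5,12,10,11.5,1200,0
"""

# With the datetime column consumed as the index, 'open' lands at
# positional index 0 rather than 1 -- hence the one-off defaults
df = pd.read_csv(StringIO(csv), index_col=0, parse_dates=True)
print(df.columns.get_loc('open'))   # prints 0, not 1
```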
You can directly upload files into the thread. Not much input would be needed.
An alternative is to enclose the data (which is apparently csv) as a code block (with no python specifier) and it will allow simple copy & paste.
I have the following scenario in mind:
Pass the system the following inputs:
in period duration (5Y)
out period duration (1Y)
parameters and ranges
Let's say we have 10 years worth of data, then the ideal processing would be as follows:
Break the data in multiple chunks according to the periods defined above
In Period 1: 2000-2005
Out Period 1: 2005-2006
In Period 2: 2001-2006
Out Period 2: 2006-2007
In Period 3: 2002-2007
Out Period 3: 2007-2008
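The chunking above could be sketched as follows (the function name and the year-based signature are assumptions made for the illustration):

```python
def walkforward_splits(start_year, end_year, in_years=5, out_years=1):
    """Yield (in_start, in_end, out_start, out_end) year boundaries
    for rolling walk-forward windows. Illustrative sketch only."""
    splits = []
    y = start_year
    # Stop once an out-of-sample window would run past the data
    while y + in_years + out_years <= end_year:
        splits.append((y, y + in_years,            # in period
                       y + in_years,               # out starts at in end
                       y + in_years + out_years))  # out period
        y += out_years  # roll the window forward by the out duration
    return splits

splits = walkforward_splits(2000, 2010)
# -> [(2000, 2005, 2005, 2006), (2001, 2006, 2006, 2007), ...]
```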
Then start the execution.
Optimize parameters for first In Period (2000-2005) - cerebro.optstrategy()
Get best parameters from previous run and test on Out Period 1 - cerebro.addstrategy()
Optimize again on the next In Period
Test with the parameters from previous step on the next out period
Optimize again ......
Test again on the next Out Period .....
At the end plot the combined strategy during all out periods from 2005 to 2010 by taking into consideration that parameters are being changed every year with the optimized values from last 5 years worth of data.
I am open for any ideas for implementation.
So far I am thinking of reading the data with pandas, breaking it into chunks, looping through them and feeding them to cerebro consecutively.
I have no idea how to combine the outputs from each cerebro run and how to plot all individual strategy executions into a single figure.
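One way to structure the outer loop is to abstract the cerebro work behind two callables (`run_walkforward`, `optimize` and `evaluate` are hypothetical names: `optimize` would wrap a cerebro.optstrategy() run over the in-sample slice and return the best parameters, `evaluate` would wrap cerebro.addstrategy() over the out-of-sample slice):

```python
def run_walkforward(data_by_year, splits, optimize, evaluate):
    """Sketch of the walk-forward driver. data_by_year maps a year
    to its data chunk; splits are (in_s, in_e, out_s, out_e) year
    boundaries. Returns (best_params, out_of_sample_result) pairs."""
    results = []
    for in_s, in_e, out_s, out_e in splits:
        in_data = [data_by_year[y] for y in range(in_s, in_e)]
        out_data = [data_by_year[y] for y in range(out_s, out_e)]
        best = optimize(in_data)                     # ~ cerebro.optstrategy()
        results.append((best, evaluate(out_data, best)))  # ~ cerebro.addstrategy()
    return results

# Toy stand-ins just to exercise the loop: "optimize" picks the max
# value in-sample, "evaluate" sums the out-of-sample chunk
data_by_year = {y: y for y in range(2000, 2010)}
splits = [(2000, 2005, 2005, 2006), (2001, 2006, 2006, 2007)]
results = run_walkforward(data_by_year, splits, max, lambda d, p: sum(d))
```

The per-period out-of-sample results collected this way could then be stitched together for the combined 2005-2010 plot.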
The idea is always to have a generic model, as is the case with the timezones, in which you may pass pytz instances or something that, more or less, complies with the interface of pytz.
A similar approach will be taken.
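As a rough sketch of what "complies with the interface of pytz" can mean, an object only has to quack like one (the `FixedOffset` class here is purely illustrative, not part of backtrader or pytz):

```python
from datetime import datetime, timedelta, tzinfo

class FixedOffset(tzinfo):
    """Minimal object loosely complying with a pytz-style interface:
    the standard tzinfo methods plus pytz's localize()/normalize().
    Illustrative duck-typing sketch only."""

    def __init__(self, hours):
        self._offset = timedelta(hours=hours)

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)  # no daylight saving in this toy zone

    def tzname(self, dt):
        return 'UTC%+d' % (self._offset // timedelta(hours=1))

    # pytz adds these on top of the plain tzinfo interface
    def localize(self, dt):
        return dt.replace(tzinfo=self)

    def normalize(self, dt):
        return dt  # fixed offset: nothing to re-normalize

tz = FixedOffset(2)
d = tz.localize(datetime(2020, 1, 1, 12, 0))
```

Anything exposing this shape could then be passed wherever a pytz instance is accepted.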