I have 2 parameters I need to test with ranges from 1-200 (I'm not overoptimizing/curve-fitting), and the brute-force method takes forever on a laptop.
For reference: http://outlace.com/Simple-Genetic-Algorithm-in-15-lines-of-Python/
btw, loving the platform so far! nice work
Several established techniques can help you find the parameter set that best suits your needs. Let us look at some strategies:
Grid Search: One of the brute force methods, grid search is the most basic algorithm for hyperparameter tuning. Essentially, we divide the domain of the hyperparameters into a discrete grid. Then, using cross-validation, we try every possible combination of grid values. The optimal combination of hyperparameter values is the grid point that maximizes the average value in cross-validation.
Random Search: Random search is like grid search, but instead of testing all of the points in the grid, it only tests a random subset of them. The smaller this subset, the faster but less accurate the optimization. The larger this subset, the more precise the optimization, but the more it looks like a grid search.
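To make the difference concrete, here is a minimal sketch of both approaches on the same grid. The `score` function is a made-up stand-in for a backtest (it simply peaks at fast=20, slow=60), and the value ranges are assumptions chosen for illustration only.

```python
import itertools
import random


def score(fast, slow):
    # Hypothetical stand-in for a backtest; peaks at fast=20, slow=60.
    return -((fast - 20) ** 2 + (slow - 60) ** 2)


fast_values = range(5, 50, 5)    # 5, 10, ..., 45
slow_values = range(50, 100, 5)  # 50, 55, ..., 95

# Grid search: evaluate every combination on the grid.
grid = list(itertools.product(fast_values, slow_values))
best_grid = max(grid, key=lambda p: score(*p))

# Random search: evaluate only a random subset of the same grid.
rng = random.Random(0)
subset = rng.sample(grid, 20)
best_random = max(subset, key=lambda p: score(*p))

print("grid search:  ", best_grid, "after", len(grid), "evaluations")
print("random search:", best_random, "after", len(subset), "evaluations")
```

Grid search always finds the best grid point but pays for every evaluation. Random search is cheaper and can only do as well or worse on the same grid.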
I have prepared optimization statistics for different optimizers from Gradient-Free-Optimizers.
I did my tests on the following simple strategy (just for proof):
class SmaCross(bt.SignalStrategy):
    params = (
        ('fast', 10),
        ('slow', 30),
    )
    def __init__(self):
        sma1, sma2 = bt.ind.SMA(period=self.p.fast), bt.ind.SMA(period=self.p.slow)
        crossover = bt.ind.CrossOver(sma1, sma2)
        self.signal_add(bt.SIGNAL_LONG, crossover)
And the test parameters were:
"fast": np.arange(5, 150, 2),
"slow": np.arange(50, 150, 2)
The most interesting part is the optimization times and scores:
Built-in brute force: score:10035.94200515747, time:207.48 s, para: {"fast":57, "slow":56}
HillClimbingOptimizer: score:10023.95000076294, time:7.72 s, para:{'fast': 129, 'slow': 54}
RepulsingHillClimbingOptimizer: score:10023.95000076294, time:15.99 s, para:{'fast': 119, 'slow': 60}
SimulatedAnnealingOptimizer: score:10023.95000076294, time:7.24 s, para:{'fast': 131, 'slow': 52}
RandomSearchOptimizer: score:10026.352001190186, time:20.60 s, para:{'fast': 61, 'slow': 56}
RandomRestartHillClimbingOptimizer: score:10023.95000076294, time:10.14 s, para:{'fast': 127, 'slow': 52}
RandomAnnealingOptimizer: score:10023.95000076294, time:8.63 s, para:{'fast': 125, 'slow': 56}
ParallelTemperingOptimizer: score:10021.9880027771, time:15.08 s, para:{'fast': 147, 'slow': 50}
ParticleSwarmOptimizer: score:10030.186000823975, time:15.71 s, para:{'fast': 61, 'slow': 54}
EvolutionStrategyOptimizer: score:10023.95000076294, time:14.98 s, para:{'fast': 131, 'slow': 52}
DecisionTreeOptimizer: score:10035.94200515747, time:5.05 s, para:{'fast': 57, 'slow': 56}
To summarize:
DecisionTreeOptimizer was about 40x faster than the built-in brute force (5.05 s vs. 207.48 s) and found the same best score!
You can check my calculations on GitHub.
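For anyone who wants to redo the arithmetic, here is a small sketch that uses only the scores and times posted above. It computes each optimizer's speedup relative to the built-in brute force and how far its best score falls short of the brute-force score.

```python
# (score, seconds) pairs copied from the results posted above.
results = {
    "Built-in brute force": (10035.94200515747, 207.48),
    "HillClimbingOptimizer": (10023.95000076294, 7.72),
    "RepulsingHillClimbingOptimizer": (10023.95000076294, 15.99),
    "SimulatedAnnealingOptimizer": (10023.95000076294, 7.24),
    "RandomSearchOptimizer": (10026.352001190186, 20.60),
    "RandomRestartHillClimbingOptimizer": (10023.95000076294, 10.14),
    "RandomAnnealingOptimizer": (10023.95000076294, 8.63),
    "ParallelTemperingOptimizer": (10021.9880027771, 15.08),
    "ParticleSwarmOptimizer": (10030.186000823975, 15.71),
    "EvolutionStrategyOptimizer": (10023.95000076294, 14.98),
    "DecisionTreeOptimizer": (10035.94200515747, 5.05),
}

base_score, base_time = results["Built-in brute force"]
speedups = {name: base_time / seconds for name, (_, seconds) in results.items()}
gaps = {name: base_score - score for name, (score, _) in results.items()}

# Print the fastest optimizers first.
for name in sorted(results, key=lambda n: results[n][1]):
    print(f"{name:36s} {speedups[name]:6.1f}x faster, score gap {gaps[name]:7.3f}")
```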
Here is my very basic code example using a Simple Moving Average crossover strategy on 1 year of TSLA daily data from Yahoo:
# https://github.com/SimonBlanke/Gradient-Free-Optimizers
import numpy as np
from gradient_free_optimizers import EvolutionStrategyOptimizer
import datetime
# import dateutil.parser
# import pytz, tzlocal
import backtrader as bt
import backtrader.indicators as btind
import backtrader.feeds as btfeeds
class MA_CrossOver(bt.Strategy):
    # This is a long-only strategy which operates on a moving average cross
    alias = ('SMA_CrossOver',)
    params = (
        # period for the fast Moving Average
        ('fast', 10),
        # period for the slow moving average
        ('slow', 30),
        # moving average to use
        ('_movav', btind.MovAv.SMA)
    )
    def __init__(self):
        sma_fast = self.p._movav(period=self.p.fast)
        sma_slow = self.p._movav(period=self.p.slow)
        self.buysig = btind.CrossOver(sma_fast, sma_slow)
    def next(self):
        if self.position.size:
            if self.buysig < 0:
                self.sell()
        elif self.buysig > 0:
            self.buy()
def runstrat(para):  # smacrossover
    cerebro_opt = bt.Cerebro(runonce=True, optdatas=True)
    cerebro_opt.adddata(data)
    cerebro_opt.addstrategy(MA_CrossOver, fast=para["fast"], slow=para["slow"])
    cerebro_opt.run()
    return cerebro_opt.broker.getvalue()
#--- end runstrat ---
# Add the feed
fromdate = datetime.datetime.strptime('2020-06-01', '%Y-%m-%d')
todate = datetime.datetime.strptime('2021-06-02', '%Y-%m-%d')
data = btfeeds.YahooFinanceData(
    dataname='TSLA',
    fromdate=fromdate,
    todate=todate)
# -- smacrossover search space ---
search_space = {
    "fast": np.arange(5, 200, 1),
    "slow": np.arange(5, 200, 1),
}
iterations = 1000
opt = EvolutionStrategyOptimizer(search_space)
opt.search(runstrat, n_iter=iterations) # repo says > 10000 but that's looong
best_param_fast = opt.best_para['fast']
best_param_slow = opt.best_para['slow']
print('best_param_fast: ' + str(best_param_fast))
print('best_param_slow: ' + str(best_param_slow))
At the end of the optimization run, GFO will present statistics. The above example took 17 minutes to run, most of that time spent on 'evaluation' which is running the strategy.
If there is a way to make the strategy more efficient, that would certainly reduce the time. Reading the data from a local file instead of from Yahoo online would also help: this strategy takes 30% less time to run when reading locally.
'Iterations' is a GFO setting. The above example uses 1,000 iterations, while their example on GitHub uses 10,000, but that one optimizes a simple math function, not a trading strategy.
In this example, the function we're optimizing, 'runstrat', returns the net portfolio value at the end of the run. This is what GFO uses to evaluate the run and come up with the best parameters. Analyzers could be used here to return the Sharpe ratio, profit factor, MAE/MFE, or a combination.
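As an illustration of returning something other than the final portfolio value, here is a minimal sketch of a profit factor calculation in plain Python. It assumes you already have a list of per-trade P&L values (how you collect them from your strategy is up to you). The function name and the sample numbers are made up for this example.

```python
def profit_factor(trade_pnls):
    """Gross profit divided by gross loss over a list of per-trade P&L values."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    if gross_loss == 0:
        # No losing trades: the ratio is undefined, so report infinity
        # (or 0.0 when there were no winning trades either).
        return float("inf") if gross_profit > 0 else 0.0
    return gross_profit / gross_loss


# Hypothetical per-trade results, for illustration only.
pnls = [120.0, -40.0, 75.0, -60.0, 15.0]
print(profit_factor(pnls))
```

The objective function would then return this number (or a weighted mix of several such numbers) instead of the broker value, and the optimizer would maximize it in the same way.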
I hope this helps anyone who codes in this space.
If anyone has any ideas to improve the optimization speed, that would be amazing.
import numpy as np
param=np.arange(0.1, 0.9, 0.05)
return cerebro.broker.getvalue()
This is a super simple method that uses the final portfolio value as a measure of performance, but the true BT way would be to use Analyzers: https://www.backtrader.com/docu/analyzers/analyzers/
-D
I'm basically working on a project for research where I make a trading system which just sort of makes a little profit here and there, rather than trading constantly.
The built-in optimization (cerebro.optstrategy) was always there from day one. This thread is about Genetic Optimization, which implies not simply blindly traversing the cartesian product of all possibilities, but choosing and discarding paths based on the outcomes of each run. I.e.: if you have 3 parameters and one of the runs gives you a -50% profit, it is highly unlikely that slightly modifying one of the parameters for the next run is going to suddenly make you rich. It will likely deliver a result around -50%. You can therefore discard this part of the tree and focus on some other branches.
Thanks!
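The "keep the promising branches, discard the rest" idea can be sketched in a few lines. This is a simplified evolutionary search (selection plus mutation, no crossover), not a full genetic algorithm and not anything built into backtrader. The `fitness` function is a made-up stand-in for one backtest run, and all names and settings are assumptions for illustration.

```python
import random


def fitness(params):
    # Hypothetical stand-in for one backtest run; best at fast=20, slow=60.
    fast, slow = params
    return -((fast - 20) ** 2 + (slow - 60) ** 2)


def mutate(params, rng, step=5, low=1, high=200):
    # Nudge each parameter a little and clamp it to the allowed range.
    return tuple(min(high, max(low, p + rng.randint(-step, step))) for p in params)


def evolve(generations=40, population_size=20, keep=5, seed=0):
    rng = random.Random(seed)
    population = [(rng.randint(1, 200), rng.randint(1, 200))
                  for _ in range(population_size)]
    for _ in range(generations):
        # Keep the best candidates and discard the poorly performing ones.
        survivors = sorted(population, key=fitness, reverse=True)[:keep]
        # Refill the population with small variations of the survivors.
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(population_size - keep)]
        population = survivors + children
    return max(population, key=fitness)


best = evolve()
print(best, fitness(best))
```

With these settings the search spends 20 evaluations per generation, which is far fewer than the 40,000 combinations of a full 200 x 200 grid. Because the survivors are always carried over, the best score never gets worse from one generation to the next.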
Am I right that it is no longer necessary to use external libraries like optunity, because there is now a built-in optimizer (cerebro.optstrategy()) - see docs/quickstart/quickstart11.py?
Thanks in advance!
sobol, grid search, random search) won the race for Net profit optimization. grid search is the least time consuming. The particle swarm method might give better results with a larger number of evaluations and a larger number of runs. Probably it will be more effective on smaller parameter spaces.
Maybe a good approach is to make a 2-step optimization:
I've done an optimization study on Net profit and will post the results later.
I was also thinking that my number of evaluations (750 max) covers only a tiny part of all the possible configurations, and that may be the reason grid search does better: it covers the whole field, even if sparsely, and has more chances to get to the maximum. The others simply get stuck in a localized area.
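Just to put a number on "tiny part": a quick sketch using the parameter counts quoted in this thread (48 x 48 x 48 x 248 x 98) and the 750-evaluation maximum. The variable names are only for illustration.

```python
import math

# Value counts per parameter, as quoted in this thread (48 x 48 x 48 x 248 x 98).
values_per_parameter = [48, 48, 48, 248, 98]
total = math.prod(values_per_parameter)

max_evaluations = 750
coverage = max_evaluations / total

print(f"{total:,} configurations")
print(f"{max_evaluations} evaluations cover {coverage:.2e} of the space")
```

So even the largest run samples well under one millionth of the configurations. Any solver is working from a very sparse picture of the space.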
optunity.maximize. Maybe internally it is transformed into a different number of runs of the evaluation function; I didn't check the internal processing.
Best solver - grid search, which surprised me a lot.
Excellent work. Grid Search is basically a run of all permutations of parameters so I'm also surprised it took less time. My only guesses for the reasons behind these results are:
My personal preference in optimizers is to get the top 10 iterations by measure of profit, then pick the best one that yielded a strategy-specific statistic. This ensures that I'm finding the patterns in the market as opposed to my strategy.
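A minimal sketch of that two-stage selection, assuming each optimization run has been recorded as a dict. The numbers below are invented, and `win_rate` is just a placeholder for whatever strategy-specific statistic you care about.

```python
# Hypothetical optimization results: (fast, slow, profit, win_rate) per run.
rows = [
    (10, 50, 1200.0, 0.41),
    (12, 50, 1150.0, 0.47),
    (14, 55, 1100.0, 0.52),
    (16, 55, 1050.0, 0.44),
    (18, 60, 1000.0, 0.58),
    (20, 60, 950.0, 0.49),
    (22, 65, 900.0, 0.46),
    (24, 65, 850.0, 0.51),
    (26, 70, 800.0, 0.43),
    (28, 70, 750.0, 0.50),
    (30, 75, 300.0, 0.71),
    (32, 75, -200.0, 0.65),
]
runs = [
    {"params": {"fast": f, "slow": s}, "profit": profit, "win_rate": win_rate}
    for f, s, profit, win_rate in rows
]

# Stage 1: keep the 10 most profitable runs.
top10 = sorted(runs, key=lambda r: r["profit"], reverse=True)[:10]

# Stage 2: among those, pick the run with the best strategy-specific statistic.
chosen = max(top10, key=lambda r: r["win_rate"])
print(chosen)
```

Note that the run with the highest win rate overall (0.71) is not selected, because its profit does not make the top 10.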
Full disclosure: I'm not an expert by any means and I only know what I've read out there.. Also, Python is new to me, but hope it all helps people somehow nonetheless.
If you can point me to a discussion board for optunity, I would appreciate it.
According to the docs, this is the place to post optunity questions and comments:
https://gitter.im/claesenm/optunity
The creator also provides an email address:
marc.claesen@esat.kuleuven.be
I don't see why grid-search should be faster than "smarter" algorithms, though obviously it is always optimal (in the constrained region).
Weird.
My intention is to try to make backtrader let me use optstrategy's speed with smart algorithms such as particle swarm or a variation of hill-climbing
optunity module. It may not be a discussion related to bt, but it would be nice to hear from people who have used or are using optunity. If you can point me to a discussion board for optunity, I would appreciate it. I've run a number of their solvers with different numbers of evaluations and received interesting results.
Strategy: 5 parameter strategy, 3 indicators. Total amount of parameter configurations 48 x 48 x 48 x 248 x 98 = 2,687,827,968 :). Recovery Factor (RF) was maximized.
For optimization I've used the following optunity solvers: particle swarm, sobol, random search, cma-es, grid search with standard settings. I know that other trading software widely uses particle swarm and cma-es, so these solvers were my main hope.
I've made 4 runs for each of the following number of evaluations: 100, 250, 500 and 750 (twice only).
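For readers not familiar with the Recovery Factor, here is a minimal sketch using one common definition (net profit divided by maximum drawdown). The exact formula used in this study is not given in the post, so treat this as an assumption. The function names and the equity curve are made up for illustration.

```python
def max_drawdown(equity):
    """Largest peak-to-trough drop of an equity curve, in currency units."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst


def recovery_factor(equity):
    """Net profit divided by maximum drawdown (one common definition)."""
    net_profit = equity[-1] - equity[0]
    drawdown = max_drawdown(equity)
    if drawdown == 0:
        # No drawdown at all: the ratio is undefined, so report infinity
        # for a profitable curve and 0.0 otherwise.
        return float("inf") if net_profit > 0 else 0.0
    return net_profit / drawdown


# Hypothetical equity curve, for illustration only.
equity = [10000, 10400, 10100, 10900, 10300, 11200]
print(recovery_factor(equity))
```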
Results:
particle swarm, sobol, random search: for the same number of evaluations, each separate run gave different sets of parameters and a different optimal RF. These RFs were quite far from the maximum value.
cma-es: the same optimal parameters for 100, 250 and 500 evaluations resulted in zero RFs (meaning roughly negative returns of the strategy). Only 750 evaluations resulted in some positive values, again far from the maximum value.
grid search: the same optimal parameters for all runs with the same number of evaluations; the number of evaluations affects the result only slightly. It produced the maximum RF of the whole study. It also took 2-2.5 times less time compared to the other solvers for the same number of evaluations.
Best solver - grid search, which surprised me a lot.
Inviting @d416 @Harel-Rozental
pyevolve library for evolutionary optimization. This is a first shot for the system shown above (2 params only, but let's be consistent); it takes longer than particle swarm optimization.
# example of optimizing SMA crossover strategy parameters using
# evolutionary Optimization in the pyevolve python library
# http://pyevolve.sourceforge.net/
from datetime import datetime
import backtrader as bt
from pyevolve import G1DList
from pyevolve import GSimpleGA
class SmaCross(bt.SignalStrategy):
    params = (
        ('sma1', 10),
        ('sma2', 30),
    )
    def __init__(self):
        SMA1 = bt.ind.SMA(period=int(self.params.sma1))
        SMA2 = bt.ind.SMA(period=int(self.params.sma2))
        crossover = bt.ind.CrossOver(SMA1, SMA2)
        self.signal_add(bt.SIGNAL_LONG, crossover)
data0 = bt.feeds.YahooFinanceData(dataname='YHOO',
                                  fromdate=datetime(2011, 1, 1),
                                  todate=datetime(2012, 12, 31))
def runstrat(chromosome):
    cerebro = bt.Cerebro()
    cerebro.addstrategy(SmaCross, sma1=chromosome[0], sma2=chromosome[1])
    cerebro.adddata(data0)
    cerebro.run()
    return cerebro.broker.getvalue()
genome = G1DList.G1DList(2)
genome.setParams(rangemin=2, rangemax=55)
genome.evaluator.set(runstrat)
ga = GSimpleGA.GSimpleGA(genome)
ga.setGenerations(5)
ga.setMutationRate(0.2)
ga.setCrossoverRate(1.0)
ga.setPopulationSize(10)
ga.evolve(freq_stats=1)
print(ga.bestIndividual())
The PSO algorithm is part of the optunity python package at https://github.com/claesenm/optunity
Essentially the steps are:
This method works well, although the optunity library has some known limitations (e.g. it can only work with float-based parameters, thus having to convert to int in the strategy). Hopefully this is helpful for anyone trying to do the same thing.
# example of optimizing SMA crossover strategy parameters using
# Particle Swarm Optimization in the optunity python library
# https://github.com/claesenm/optunity
from datetime import datetime
import backtrader as bt
import optunity
import optunity.metrics
class SmaCross(bt.SignalStrategy):
    params = (
        ('sma1', 10),
        ('sma2', 30),
    )
    def __init__(self):
        SMA1 = bt.ind.SMA(period=int(self.params.sma1))
        SMA2 = bt.ind.SMA(period=int(self.params.sma2))
        crossover = bt.ind.CrossOver(SMA1, SMA2)
        self.signal_add(bt.SIGNAL_LONG, crossover)
data0 = bt.feeds.YahooFinanceData(dataname='YHOO',
                                  fromdate=datetime(2011, 1, 1),
                                  todate=datetime(2012, 12, 31))
def runstrat(sma1, sma2):
    cerebro = bt.Cerebro()
    cerebro.addstrategy(SmaCross, sma1=sma1, sma2=sma2)
    cerebro.adddata(data0)
    cerebro.run()
    return cerebro.broker.getvalue()
opt = optunity.maximize(runstrat, num_evals=100, sma1=[2, 55], sma2=[2, 55])
optimal_pars, details, _ = opt
print('Optimal Parameters:')
print('sma1 = %.2f' % optimal_pars['sma1'])
print('sma2 = %.2f' % optimal_pars['sma2'])
cerebro = bt.Cerebro()
cerebro.addstrategy(SmaCross, sma1=optimal_pars['sma1'], sma2=optimal_pars['sma2'])
cerebro.adddata(data0)
cerebro.run()
cerebro.plot()
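The float-to-int conversion mentioned above is done inside the strategy with int(). An alternative, sketched here as an assumption rather than anything optunity provides, is to wrap the objective so the conversion happens in one place. The decorator name is made up, it assumes the optimizer passes parameters as keyword arguments, and it rounds to the nearest integer instead of truncating like int().

```python
import functools


def with_int_params(objective):
    """Round every keyword argument to the nearest int before calling objective."""
    @functools.wraps(objective)
    def wrapper(**params):
        rounded = {name: int(round(value)) for name, value in params.items()}
        return objective(**rounded)
    return wrapper


@with_int_params
def runstrat(sma1, sma2):
    # Hypothetical stand-in for the backtest above; it just echoes its inputs.
    return sma1, sma2


print(runstrat(sma1=12.7, sma2=30.2))
```

With this in place the strategy itself can take plain integer periods, and the same strategy class can be reused with optimizers that already produce integers.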
One possible way would be to implement the reward/scoring mechanism by means of an analyzer. This is possible because analyzers can host sub-analyzers.
This mechanism offers an advantage: switching to a different reward mechanism implies only switching from analyzer_reward_A to analyzer_reward_B.
In any case, such an analyzer_reward_X would need to offer a well-defined interface.
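To illustrate what a "well-defined interface" could look like, here is a plain-Python sketch. None of these classes are backtrader analyzers; the names (`Reward`, `get_reward`, `NetProfitReward`, `WorstStepReward`) are hypothetical and only show how the optimizer side can stay unchanged when the reward is swapped.

```python
from abc import ABC, abstractmethod


class Reward(ABC):
    """Hypothetical interface: every reward exposes a single score."""

    @abstractmethod
    def get_reward(self, equity):
        """Return a float where higher means better."""


class NetProfitReward(Reward):
    def get_reward(self, equity):
        return float(equity[-1] - equity[0])


class WorstStepReward(Reward):
    def get_reward(self, equity):
        # Largest single-step loss as a non-positive number,
        # so a value closer to zero is better.
        steps = [b - a for a, b in zip(equity, equity[1:])]
        return float(min(steps + [0]))


def evaluate(reward, equity):
    # The optimizer only depends on the interface, not on the concrete reward.
    return reward.get_reward(equity)


equity = [100, 110, 105, 120]  # made-up equity curve
print(evaluate(NetProfitReward(), equity))
print(evaluate(WorstStepReward(), equity))
```

Switching from reward A to reward B is then a one-line change at the call site, which is the advantage described above.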