linear regression and std #211

From Issue #211
Hi,
Could you include in the next release both linear regression and standard deviation? I think these indicators help people to calculate ratios over the time series.

    S1 = timeseries close
    S2 = timeseries close
    rolling_beta = pd.ols(y=S1, x=S2, window_type='rolling', window=30)
    spread = S2 - rolling_beta.beta['x'] * S1
    std_30 = pd.Series.rolling(spread, window=30, center=False).std()

Rgds,
JJ 
Standard Deviation is already included (since many versions ago)

OK, thanks, I will look up Standard Deviation. Congrats on the community!

There seem to be several linear regressions, including channels, slopes ...

This code could be adapted to use two time series in backtrader:
https://github.com/vikasrtr/pyLinearRegression/blob/master/models/LinearRegression.py
https://github.com/vikasrtr/pyLinearRegression/blob/master/models/LinearRegressionGradientDescent.py 

    def linreg(X, Y):
        """Linear regression y = ax + b"""
        if len(X) != len(Y):
            raise ValueError('unequal length')
        N = len(X)
        Sx = Sy = Sxx = Syy = Sxy = 0.0
        for x, y in zip(X, Y):
            Sx = Sx + x
            Sy = Sy + y
            Sxx = Sxx + x * x
            Syy = Syy + y * y
            Sxy = Sxy + x * y
        det = Sxx * N - Sx * Sx
        # slope a and intercept b of the least-squares fit
        a, b = (Sxy * N - Sy * Sx) / det, (Sxx * Sy - Sx * Sxy) / det
        meanerror = residual = 0.0
        for x, y in zip(X, Y):
            meanerror = meanerror + (y - Sy / N) ** 2
            residual = residual + (y - a * x - b) ** 2
        RR = 1 - residual / meanerror  # coefficient of determination
        ss = residual / (N - 2)
        Var_a, Var_b = ss * N / det, ss * Sxx / det
        return a, b, RR

My idea is to use linear regression to test these posts:
https://www.quantopian.com/posts/howtobuildapairstradingstrategyonquantopian
https://www.quantinsti.com/blog/pairtradingstrategybacktestingusingquantstrat/

    if X and Y are cointegrated:
        calculate Beta between X and Y
        calculate spread as X - Beta * Y
        calculate zscore of spread

        # entering trade (spread is away from mean by two sigmas):
        if zscore > 2:
            sell spread (sell 1000 of X, buy 1000 * Beta of Y)
        if zscore < -2:
            buy spread (buy 1000 of X, sell 1000 * Beta of Y)

        # exiting trade (spread converged close to mean):
        if we're short spread and zscore < 1:
            close the trades
        if we're long spread and zscore > -1:
            close the trades

    # loop: repeat above on each new bar, recalculating rolling Beta and spread etc.
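
The entry and exit rules in the pseudocode above can be sketched in plain Python. This is only an illustration under stated assumptions: the function names zscore and pair_signal and the position labels are made up for the example, new trades are only opened when flat, the thresholds of 2 and 1 follow the pseudocode, and the z-score is taken over whatever window of spread values is passed in.

```python
import math


def zscore(spread_window):
    """Z-score of the latest spread value relative to the window."""
    n = len(spread_window)
    mean = sum(spread_window) / n
    # population standard deviation over the window
    std = math.sqrt(sum((s - mean) ** 2 for s in spread_window) / n)
    if std == 0.0:
        return 0.0  # flat spread: no deviation to trade on
    return (spread_window[-1] - mean) / std


def pair_signal(z, position, entry=2.0, exit_=1.0):
    """Return the action for the current bar.

    position is one of 'flat', 'short', 'long' (hypothetical labels).
    """
    if position == 'flat':
        if z > entry:
            return 'sell_spread'  # sell X, buy Beta * Y
        if z < -entry:
            return 'buy_spread'   # buy X, sell Beta * Y
    elif position == 'short' and z < exit_:
        return 'close'            # short spread has converged
    elif position == 'long' and z > -exit_:
        return 'close'            # long spread has converged
    return 'hold'
```

On each new bar a strategy would recompute the spread window, call zscore on it, and pass the result together with its current position to pair_signal.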

hi guys,
I am interested in that as well :D
Remroc

It seems that the beta is the important part here

    import pandas as pd
    import backtrader as bt

    class OLS_Beta(bt.indicators.PeriodN):
        _mindatas = 2  # ensure at least 2 data feeds are passed
        lines = (('beta'),)
        params = (('period', 30),)

        def next(self):
            y, x = (d.get(size=self.p.period) for d in (self.data0, self.data1))
            r_beta = pd.ols(y=y, x=x, window_type='rolling', window=self.p.period)
            self.lines.beta[0] = r_beta.beta['x']

or the spread directly

    import pandas as pd
    import backtrader as bt

    class Spread(bt.indicators.PeriodN):
        _mindatas = 2  # ensure at least 2 data feeds are passed
        lines = (('spread'),)
        params = (('period', 30),)

        def next(self):
            y, x = (d.get(size=self.p.period) for d in (self.data0, self.data1))
            r_beta = pd.ols(y=y, x=x, window_type='rolling', window=self.p.period)
            self.lines.spread[0] = self.data1[0] - r_beta.beta['x'] * self.data0[0]

Is this in line?

Yes, I think these classes cover our requirements

Awesome. I'll give it a try tonight...
U rock as usual, thanks DRo :astonished:

Hello DRo,
I was not able to run your proposition. I am receiving :
TypeError: 'builtin_function_or_method' object is not iterable
Also, it seems pd.ols will be deprecated. I have tried with statsmodels.api:
http://statsmodels.sourceforge.net/stable/examples/notebooks/generated/ols.html

    import backtrader.indicators as btind
    import statsmodels.api as sm

    class OLS_Transformation(btind.PeriodN):
        _mindatas = 2  # ensure at least 2 data feeds are passed
        lines = (('slope'), ('intercept'), ('spread'), ('spread_mean'),
                 ('spread_std'), ('zscore'),)
        params = (('period', 10),)

        def next(self):
            #y, x = (d.get(size=self.p.period) for d in (self.data0, self.data1))
            p0 = self.data0.get(size=self.p.period)
            p1 = sm.add_constant(self.data1.get(size=self.p.period), prepend=True)
            # prepend=True puts the constant first, so params is (intercept, slope)
            intercept, slope = sm.OLS(p0, p1).fit().params
            #r_beta = pd.ols(y=p1, x=x, window_type='rolling', window=self.params.period)
            self.lines.slope[0] = slope
            self.lines.intercept[0] = intercept
            self.lines.spread[0] = self.data0.close[0] - (slope * self.data1.close[0] + intercept)
            self.lines.spread_mean[0] = btind.MovAv.SMA(self.lines.spread, period=self.p.period)
            self.lines.spread_std[0] = btind.StandardDeviation(self.lines.spread, period=self.p.period)
            self.lines.zscore[0] = (self.lines.spread[0] - self.lines.spread_mean[0]) / self.lines.spread_std[0]

Do you have any recommendations for this path?
Many thanks...
remroc

Those were typed snippets, not actual tested code. The actual line which produced the error would be helpful in understanding what part of the snippet is trying to iterate over a builtin_function_or_method.

Hello DRo,
Below is the error I receive when using the snippet.
For info, in my code I am only using the class OLS_Beta, and I have instantiated a self.beta signal in the Strategy's __init__ through:
self.beta = OLS_Beta(self.data0, self.data1)
The window parameter of pd.ols is window and not windows...
Many thanks for your insights :D

C:\Dev\Anaconda2\python.exe C:/Trading/backtradermaster1.9.8.99David/samples/pairtrading/pairtrading.py
C:/Trading/backtradermaster1.9.8.99David/samples/pairtrading/pairtrading.py:48: FutureWarning: The pandas.stats.ols module is deprecated and will be removed in a future version. We refer to external packages like statsmodels, see some examples here: http://statsmodels.sourceforge.net/stable/regression.html
  r_beta = pd.ols(y=y, x=x, window_type='rolling', window=self.p.period)
Traceback (most recent call last):
  File "C:/Trading/backtradermaster1.9.8.99David/samples/pairtrading/pairtrading.py", line 329, in <module>
    runstrategy()
  File "C:/Trading/backtradermaster1.9.8.99David/samples/pairtrading/pairtrading.py", line 272, in runstrategy
    oldsync=args.oldsync)
  File "C:\Dev\Anaconda2\lib\sitepackages\backtrader\cerebro.py", line 809, in run
    runstrat = self.runstrategies(iterstrat)
  File "C:\Dev\Anaconda2\lib\sitepackages\backtrader\cerebro.py", line 926, in runstrategies
    self._runonce(runstrats)
  File "C:\Dev\Anaconda2\lib\sitepackages\backtrader\cerebro.py", line 1245, in _runonce
    strat._once()
  File "C:\Dev\Anaconda2\lib\sitepackages\backtrader\lineiterator.py", line 274, in _once
    indicator._once()
  File "C:\Dev\Anaconda2\lib\sitepackages\backtrader\lineiterator.py", line 294, in _once
    self.oncestart(self._minperiod - 1, self._minperiod)
  File "C:\Dev\Anaconda2\lib\sitepackages\backtrader\indicator.py", line 124, in oncestart_via_nextstart
    self.nextstart()
  File "C:\Dev\Anaconda2\lib\sitepackages\backtrader\lineiterator.py", line 324, in nextstart
    self.next()
  File "C:/Trading/backtradermaster1.9.8.99David/samples/pairtrading/pairtrading.py", line 48, in next
    r_beta = pd.ols(y=y, x=x, window_type='rolling', window=self.p.period)
  File "C:\Dev\Anaconda2\lib\sitepackages\pandas\stats\interface.py", line 143, in ols
    return klass(**kwargs)
  File "C:\Dev\Anaconda2\lib\sitepackages\pandas\stats\ols.py", line 642, in __init__
    OLS.__init__(self, y=y, x=x, weights=weights, **self._args)
  File "C:\Dev\Anaconda2\lib\sitepackages\pandas\stats\ols.py", line 70, in __init__
    self._index, self._time_has_obs) = self._prepare_data()
  File "C:\Dev\Anaconda2\lib\sitepackages\pandas\stats\ols.py", line 102, in _prepare_data
    self._weights_orig)
  File "C:\Dev\Anaconda2\lib\sitepackages\pandas\stats\ols.py", line 1298, in _filter_data
    lhs = Series(lhs, index=rhs.index)
  File "C:\Dev\Anaconda2\lib\sitepackages\pandas\core\series.py", line 137, in __init__
    index = _ensure_index(index)
  File "C:\Dev\Anaconda2\lib\sitepackages\pandas\indexes\base.py", line 3409, in _ensure_index
    return Index(index_like)
  File "C:\Dev\Anaconda2\lib\sitepackages\pandas\indexes\base.py", line 287, in __new__
    subarr = com._asarray_tuplesafe(data, dtype=object)
  File "C:\Dev\Anaconda2\lib\sitepackages\pandas\core\common.py", line 1384, in _asarray_tuplesafe
    values = list(values)
TypeError: 'builtin_function_or_method' object is not iterable

Process finished with exit code 1

The problem with the snippet is that pd.ols chokes on regular Python array-like structures (array.array, list, etc.) and needs pandas-specific structures. The adapted and working code:

    class OLS_Beta(bt.indicators.PeriodN):
        _mindatas = 2  # ensure at least 2 data feeds are passed
        lines = (('beta'),)
        params = (('period', 30),)

        def next(self):
            y, x = (pd.Series(d.get(size=self.p.period)) for d in self.datas)
            r_beta = pd.ols(y=y, x=x, window_type='full_sample')
            self.lines.beta[0] = r_beta.beta['x']

Here the values from the dataX feeds are put into pd.Series instances. There is, imho, no need to use a rolling operation because pd.ols only receives the needed data each time (the latest available data). The same concept can be applied to other pandas operations and, I guess, to the code ported over to statsmodels.
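
Since pd.ols is deprecated, the same idea (regress only the latest window on each bar, with no rolling machinery) can also be sketched without pandas. The helper below is hypothetical and not part of backtrader, pandas or statsmodels: it fits y = beta * x + alpha by ordinary least squares on the last period values of two plain sequences, such as the arrays returned by d.get(size=period).

```python
def latest_window_beta(y, x, period):
    """OLS slope and intercept of y on x over the last `period` values."""
    ys, xs = list(y)[-period:], list(x)[-period:]
    if len(ys) != len(xs) or len(xs) < 2:
        raise ValueError('need two windows of equal length >= 2')
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # centered sums of squares and cross products
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    if sxx == 0.0:
        raise ValueError('x is constant over the window')
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return beta, alpha


# Usage: only the latest 3 values of each series enter the fit
beta, alpha = latest_window_beta([3.0, 5.0, 7.0, 9.0], [1.0, 2.0, 3.0, 4.0], 3)
```

Inside an indicator's next method, the returned beta could be written to self.lines.beta[0] in place of the pd.ols result.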

@backtrader thanks DRo !
I will try this tomorrow...

Hello DRo,
Thanks, it is working. Pair Trading is operational...
I have implemented the statsmodels version as well, and it seems to retrieve the same beta and spread...
Quick question, how can I:
- show 2 data lines in the same subplot (i.e. PEP and KO in the same subplot)?
- increase the height of the indicator's subplot?
- plot only 1 line of a multi-line indicator in a dedicated subplot?
Many thanks,
RemRoc

Plotting options are detailed here: https://www.backtrader.com/docu/plotting/plotting.html

@backtrader thanks,
Is it possible to contribute to backtrader by sharing the pair trading strategy as a sample in the distribution?
If yes, please let me know how to do it.
Thanks
Remroc