linear regression and std #211

There may be two ways for it, which may be used simultaneously or not:

Using a category in the forum for user examples/contributed code

Creating a pull request in the main repository https://github.com/mementum/backtrader
There could also be a contrib package with, for example, the following subpackages: indicators, strategies, analyzers, observers.


Hi,
I have checked the strategy:
    class OLS_Transformation(bt.indicators.PeriodN):
        _mindatas = 2  # ensure at least 2 data feeds are passed
        lines = (('beta'), ('spread'), ('spread_mean'), ('spread_std'), ('zscore'),)
        params = (('period', 30),)

        def next(self):
            y, x = (pd.Series(d.get(size=self.p.period)) for d in self.datas)
            r_beta = pd.ols(y=y, x=x, window_type='full_sample')
            self.lines.beta[0] = r_beta.beta['x']
            self.lines.spread[0] = self.data1[0] - r_beta.beta['x'] * self.data0[0]
            self.lines.spread_mean[0] = bt.indicators.MovAv.SMA(self.lines.spread, period=self.p.period)
            self.lines.spread_std[0] = bt.indicators.StandardDeviation(self.lines.spread, period=self.p.period)
            self.lines.zscore[0] = (self.lines.spread[0] - self.lines.spread_mean[0]) / self.lines.spread_std[0]

I have the following problem:

    in next
        self.lines.spread_mean[0] = bt.indicators.MovAv.SMA(self.lines.spread, period=self.p.period)
    File "build\bdist.winamd64\egg\backtrader\linebuffer.py", line 221, in __setitem__
        self.array[self.idx + ago] = value
    TypeError: a float is required

Yes, that code is terribly wrong. The assignment of a bt.indicators.SMA instance to spread_mean[0], which is in an array.array of type d (aka double, or float in Python), is bound to fail each and every time.

Lines objects (Indicators et al.) are meant to be instantiated during __init__.

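The TypeError can be reproduced outside backtrader with the standard library alone. This is only a minimal sketch: FakeIndicator is a hypothetical stand-in for an indicator instance, not a backtrader class, and the buffer is a bare array of doubles rather than a real line buffer.

```python
from array import array


class FakeIndicator:
    """Hypothetical stand-in for an indicator instance (not a backtrader class)."""


# A buffer of C doubles, i.e. the typecode 'd' mentioned in the reply above.
buffer = array('d', [0.0])

# Assigning a plain number works.
buffer[0] = 1.5

# Assigning an arbitrary object is rejected with a TypeError.
try:
    buffer[0] = FakeIndicator()
    assignment_failed = False
except TypeError:
    assignment_failed = True

print(assignment_failed)
```

The slot only accepts numbers, so an object assigned to index [0] inside next can never be stored there, whatever it would later compute.
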
Thanks. I was using the above code; I have changed it and now it is working well:

    self.ols = OLS_Transformation(self.data0, self.data1)
    self.spread_mean = bt.indicators.MovingAverageSimple(self.ols.spread, period=self.p.period)
    self.spread_std = bt.indicators.StandardDeviation(self.ols.spread, period=self.p.period)

@backtrader : DRo, I have sent a pull request for 3 files :
 sample folder : pairtrading.py
 data folder : dailyKO.csv & dailyPEP.csv
https://github.com/mementum/backtrader/pull/223/files
Let me know how you would improve the code...
Thanks & cheers,
Remroc

@junajo10 a possible approach to keep it clear is to separate the OLS_Beta in 2 parts

    import backtrader as bt
    import pandas as pd
    import statsmodels.api as sm


    class OLS_Slope_Intercept(bt.indicators.PeriodN):
        _mindatas = 2  # ensure at least 2 data feeds are passed
        lines = ('slope', 'intercept',)
        params = (('period', 10),)

        def next(self):
            p0 = pd.Series(self.data0.get(size=self.p.period))
            p1 = pd.Series(self.data1.get(size=self.p.period))
            p1 = sm.add_constant(p1, prepend=True)
            # the constant is prepended, so the intercept comes first in params
            intercept, slope = sm.OLS(p0, p1).fit().params

            self.lines.slope[0] = slope
            self.lines.intercept[0] = intercept


    class OLS_Transformation(bt.indicators.PeriodN):
        _mindatas = 2  # ensure at least 2 data feeds are passed
        lines = ('spread', 'spread_mean', 'spread_std', 'zscore',)
        params = (('period', 10),)

        def __init__(self):
            self._slint = OLS_Slope_Intercept(*self.datas)

            # lines objects are combined here, in __init__, and evaluated later
            self.l.spread = self.data0 - (self._slint.slope * self.data1 + self._slint.intercept)
            self.l.spread_mean = bt.ind.SMA(self.l.spread, period=self.p.period)
            self.l.spread_std = bt.ind.StdDev(self.l.spread, period=self.p.period)
            self.l.zscore = (self.l.spread - self.l.spread_mean) / self.l.spread_std
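
For readers who want to check the arithmetic without backtrader, pandas or statsmodels, here is a rough standard-library sketch of the same idea on a single window. The function names are made up for illustration, and it simplifies the indicator above: it fits one regression on the window and takes the z-score of the last residual, instead of averaging a spread line built from per-bar fits.

```python
import math


def ols_slope_intercept(y, x):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def spread_zscore(y, x):
    """Z-score of the latest residual y - (slope * x + intercept) in the window."""
    slope, intercept = ols_slope_intercept(y, x)
    spread = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
    mean = sum(spread) / len(spread)
    std = math.sqrt(sum((s - mean) ** 2 for s in spread) / len(spread))
    return (spread[-1] - mean) / std


# Exact line y = 2x + 1: the fit recovers slope and intercept.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
slope, intercept = ols_slope_intercept([3.0, 5.0, 7.0, 9.0, 11.0], x)

# Same window with the last point pushed up: the latest residual is positive.
zscore = spread_zscore([3.0, 5.0, 7.0, 9.0, 12.0], x)
```

A perfectly linear window has zero residual spread, so spread_zscore would divide by zero there; real code would need to guard against that.
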

@remroc to be handled

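I think one more feature is needed: checking cointegration between the two series of the pair.

    import backtrader as bt
    import pandas as pd
    from statsmodels.tsa.stattools import coint


    class isCointegrated(bt.indicators.PeriodN):
        _mindatas = 2  # ensure at least 2 data feeds are passed
        lines = ('cointegration',)

        def next(self):
            # coint needs actual values, so fetch the last `period` values of each feed
            # (period is the PeriodN parameter; pass a window long enough for the test)
            y, x = (pd.Series(d.get(size=self.p.period)) for d in (self.data0, self.data1))
            results = coint(x, y)
            self.lines.cointegration[0] = results[1]  # p-value

    # you should define a p-value limit and compare it with the line value before operating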

thanks junajo10

Do you know how to solve this problem?

    self.ols = OLS_Transformation(self.data0, self.data1)
    self.hurst = hurst(pd.Series(self.ols.spread))

    >Intel MKL ERROR: Parameter 6 was incorrect on entry to DGELSD.
    nan

    def hurst(ts):
        """Returns the Hurst Exponent of the time series vector ts"""
        # Create the range of lag values
        lags = range(2, 100)
        # Calculate the array of the variances of the lagged differences
        tau = [sqrt(std(subtract(ts[lag:], ts[:-lag]))) for lag in lags]
        # Use a linear fit to estimate the Hurst Exponent
        poly = polyfit(log(lags), log(tau), 1)
        # Return the Hurst exponent from the polyfit output
        return poly[0]*2.0

That problem cannot be solved, for sure. OLS_Transformation is a class/object in the backtrader ecosystem (aka a lines object) and it is not precalculated as such. It is a lazily evaluated object.

Something, for example, which is likely to fail (without knowing what the actual culprit is):

    pd.Series(self.ols.spread)

A pandas.Series expects actual data and not an object which is yet to be evaluated.

Thanks, do you know any alternative way to develop hurst in the backtrader ecosystem?

Happy Xmas guys
Cheers
Remroc 
@backtrader thanks DRo for the merges :thumbsup:

Something like this

    import backtrader as bt
    from numpy import asarray, log10, polyfit, sqrt, std, subtract


    class HurstExponent(bt.indicators.PeriodN):
        '''
        References:

          - https://www.quantopian.com/posts/hurstexponent
          - https://www.quantopian.com/posts/somecodefromerniechansnewbookimplementedinpython

        Interpretation of the results

          1. Geometric random walk (H=0.5)
          2. Mean-reverting series (H<0.5)
          3. Trending Series (H>0.5)
        '''
        alias = ('Hurst',)
        lines = ('hurst',)
        params = (('period', 40),)

        def __init__(self):
            super(HurstExponent, self).__init__()
            # Prepare the lags array
            self.lags = asarray(range(2, self.p.period // 2))
            self.log10lags = log10(self.lags)

        def next(self):
            # Fetch the data
            ts = asarray(self.data.get(size=self.p.period))

            # Calculate the array of the variances of the lagged differences
            tau = [sqrt(std(subtract(ts[lag:], ts[:-lag]))) for lag in self.lags]

            # Use a linear fit to estimate the Hurst Exponent
            poly = polyfit(self.log10lags, log10(tau), 1)

            # Return the Hurst exponent from the polyfit output
            self.lines.hurst[0] = poly[0] * 2.0

A sample chart:
Although something must be wrong ... because the value is never close to 0.5. It's just a sample.

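One way to sanity-check an estimator of this kind is to feed it synthetic series whose behaviour is known. The sketch below re-implements the same lagged-difference calculation with the standard library only. The helper names, the 5000-sample length and the lag range are assumptions chosen for the check. It does not establish why the chart above stays away from 0.5.

```python
import math
import random


def _std(values):
    """Population standard deviation (the default behaviour of numpy.std)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def _slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def hurst(ts, max_lag=100):
    """Hurst exponent estimate from the scaling of lagged differences."""
    lags = range(2, max_lag)
    # sqrt of the standard deviation of ts[lag:] - ts[:-lag] for every lag
    tau = [
        math.sqrt(_std([b - a for a, b in zip(ts[:-lag], ts[lag:])]))
        for lag in lags
    ]
    # slope of log(tau) against log(lag), doubled as in the code above
    return 2.0 * _slope([math.log(lag) for lag in lags], [math.log(t) for t in tau])


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(5000)]

# Random walk: cumulative sum of the noise.
walk = []
level = 0.0
for step in noise:
    level += step
    walk.append(level)

h_walk = hurst(walk)    # should land in the neighbourhood of 0.5
h_noise = hurst(noise)  # white noise is strongly mean-reverting: well below 0.5
```

With a window of only 40 values and lags up to 19, as in the indicator's defaults, each estimate rests on far fewer points than in this check.
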
I have tested it and it is working well. Could you add it in the next release? Also, if you consider it important, the half-life Quantopian post helps people calibrate the period in the pair trading strategy.
https://www.quantopian.com/posts/pairtradewithcointegrationandmeanreversiontests
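
For context, a commonly used half-life estimate regresses the bar-to-bar change of the spread on its lagged level and converts the slope into a decay time. The sketch below assumes that definition; whether the linked post computes it exactly this way is not confirmed here, and the function name is only illustrative.

```python
import math


def half_life(spread):
    """Half-life of mean reversion: regress the change on the lagged level."""
    lagged = spread[:-1]
    delta = [b - a for a, b in zip(spread[:-1], spread[1:])]
    mean_x = sum(lagged) / len(lagged)
    mean_y = sum(delta) / len(delta)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, delta))
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    lam = sxy / sxx  # negative for a mean-reverting series
    return -math.log(2) / lam


# Synthetic spread that closes 10% of its distance to zero on every bar.
spread = [10.0 * 0.9 ** t for t in range(50)]
estimate = half_life(spread)  # about ln(2) / 0.1 bars
```

A half-life of a few bars suggests a short period for the moving average and standard deviation of the spread, and a long half-life suggests a longer one. A non-negative slope means the series is not mean-reverting and the estimate is meaningless.
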

@junajo10 thanks for sharing the hurst implementation and the Quantopian post :thumbsup:

Hurst is being added to the development branch using the frompackages functionality.

And also OLS_Slope_InterceptN, OLS_BetaN, OLS_TransformationN and CointN (all untested), also using packages and frompackages.

Notice the N at the end of the name to indicate that each delivered value refers to a calculation performed on the last N values. This is specified using the standard period parameter.

Commit: https://github.com/mementum/backtrader/commit/65704a90869a88ff4bb9f5ab559f8803e677e9e9

FWIW, WRT cointegration testing, I've been pretty impressed with this code vs. the tools available in stats.