Hi @backtrader,
Any update/ETA on this?
@backtrader, I'm not entirely sure what you mean by the first option, but if it is what I think it is, the Quandl API would be used much like GenericCSV. I'm OK with this. That said, I'm also OK with option 2.
I appreciate your looking into this and look forward to a resolution of the matter. Of course, I trust your judgment regarding which option you plan to implement. Personally, I feel that option 2 would be more user-friendly. Please let us know when a solution is available.
Thanks, as usual, for your effective and quick replies!
@nomnom, I agree with what @backtrader says. If you retrieved the data from Yahoo like I did (in my case using YahooFinanceData), you will observe random inconsistencies (sometimes everything is fine, sometimes everything is bad up to a point and then gets better**, and sometimes the whole dataset is bad). If you read the post which @backtrader referenced, you will see that I had exactly the same issues when I tried to plot the data using candlesticks.
Try seeing if you get the same problem with a different data source (e.g. Quandl).
** Try extending the timeframe for your example (make it go all the way up to yesterday). I'm willing to bet that you'll see that it "improves" after a certain date.
One more thing: I think the column with the empty value does not represent the volume. From the Quandl web UI for BMW (click on Table), we see the following columns:
Date,Open,High,Low,Close,Change,Traded Volume,Turnover,Last Price of the day,Daily Traded Units,Daily Turnover
And, correspondingly, we see that linetokens is:
['2016-07-27', '77.64', '78.8', '77.51', '78.36', '', '1971401.0', '154530852.0', '', '', '']
So we are expecting to see the volume where Change actually is.
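To make the mismatch easier to see, here is a minimal sketch that pairs the column names with the tokens quoted above. It only uses the header and the row from this post; the variable names are mine.

```python
# Header and row copied from the post above.
header = ("Date,Open,High,Low,Close,Change,Traded Volume,Turnover,"
          "Last Price of the day,Daily Traded Units,Daily Turnover")
linetokens = ['2016-07-27', '77.64', '78.8', '77.51', '78.36', '',
              '1971401.0', '154530852.0', '', '', '']

columns = header.split(',')
row = dict(zip(columns, linetokens))

# Position of each column of interest within the row.
change_index = columns.index('Change')
volume_index = columns.index('Traded Volume')

print(change_index, repr(row['Change']))
print(volume_index, row['Traded Volume'])
```

The empty token sits in the Change column, one position before Traded Volume, which is consistent with the float conversion failing on an empty string.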
I checked 3 different databases: WIKI, FSE and NSE. The columns returned are different in each case.
WIKI: Date,Open,High,Low,Close,Volume,Ex-Dividend,SplitRatio,Adj.Open,Adj.High,Adj.Low,Adj.Close,Adj.Volume
NSE: Date,Open,High,Low,Last,Close,TotalTradeQuantity,Turnover(Lacs)
FSE: Date,Open,High,Low,Close,Change,TradedVolume,Turnover,LastPriceoftheDay,DailyTradedUnits,DailyTurnover
So, @backtrader, I guess you will have to treat each one differently. One way to retrieve the columns programmatically is to query the database like this: https://www.quandl.com/api/v3/datasets.json?database_code=<WIKI/NSE/FSE/ETC>&per_page=100&sort_by=id&page=1&api_key=<XXXX>. Every "dataset" node contains a child called "column_names" which is a list of the columns returned.
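As a rough sketch of what reading those column names could look like: the snippet below parses a response body and collects "column_names" per dataset. The JSON layout (a top-level "datasets" list whose entries carry "dataset_code" and "column_names") is my assumption based on the description above, and the sample text is a hand-trimmed stand-in, not real API output.

```python
import json


def column_names_by_dataset(response_text):
    """Map each dataset code to its list of column names.

    Assumes the layout {"datasets": [{"dataset_code": ..., "column_names": [...]}]}.
    """
    payload = json.loads(response_text)
    return {entry["dataset_code"]: entry["column_names"]
            for entry in payload["datasets"]}


# Hypothetical, trimmed response used only for illustration.
sample_response = json.dumps({
    "datasets": [
        {"dataset_code": "BMW_X",
         "column_names": ["Date", "Open", "High", "Low", "Close", "Change",
                          "Traded Volume", "Turnover"]},
    ]
})

columns = column_names_by_dataset(sample_response)
print(columns["BMW_X"])
```

With a lookup like this, a feed could decide per database which position holds the volume instead of hard-coding it.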
Hope this helps.
@dasch, yeah I thought so too.
@backtrader, can this be handled internally such that a warning is thrown and the empty value is interpreted as 0/NaN/infinity/etc. so that the process may continue to execute?
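To illustrate what I mean, here is a minimal sketch of a tolerant conversion. The helper name to_float and the NaN default are my own choices for this example; this is not how the Quandl feed in backtrader is implemented.

```python
import math
import warnings


def to_float(token, default=float('nan')):
    """Convert a CSV token to float; warn and return `default` if it cannot be parsed."""
    try:
        return float(token)
    except ValueError:
        warnings.warn('could not convert %r to float, using %r' % (token, default))
        return default


# Numeric tokens from the FSE row quoted in this thread (date column dropped).
tokens = ['77.64', '78.8', '77.51', '78.36', '', '1971401.0']
values = [to_float(tok) for tok in tokens]
print(values)
```

The empty token becomes NaN and a warning is issued, so loading can continue instead of stopping with a ValueError.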
@dasch, it still doesn't work even if I provide the API key. I get the same error with this line:
quandl_finance_data = bt.feeds.Quandl(dataname="BMW_X", fromdate=from_date, todate=to_date, round=False, adjclose=False, apikey="xxxx", dataset="FSE")
However, if I switch to the WIKI dataset (instead of FSE) and use a ticker like YHOO, everything works fine. So I'm guessing the problem is with the usage of the FSE dataset.
Hi again @dasch and @backtrader,
As usual, thanks for your quick replies! Instead of retrieving the data from every symbol twice (once adjusted, once not) and then taking the correct values from each, I decided to see if I could switch from Yahoo to Quandl instead. I found the BMW ticker (from the FSE dataset, which seems to be free to access) and decided to take it for a test run. Sadly, I didn't get very far. Here's the code (pretty much the same as before except the data acquisition part):
from datetime import datetime, timedelta, date
import backtrader as bt
# date range: one year back from today
lookback_timedelta_days = timedelta(days=365)
now_datetime = datetime.now()
to_date = datetime(now_datetime.year, now_datetime.month, now_datetime.day)
from_date = to_date - lookback_timedelta_days
# OLD FEED FROM YAHOO
# yahoo_finance_data = bt.feeds.YahooFinanceData(dataname="BMW.F", fromdate=from_date, todate=to_date, adjclose=False,
#                                                swapcloses=False, round=False)
# NEW FEED FROM QUANDL
quandl_finance_data = bt.feeds.Quandl(dataname="BMW_X", fromdate=from_date, todate=to_date, dataset='FSE', round=False,
                                      adjclose=False)
# start cerebro
cerebro = bt.Cerebro()
cerebro.adddata(data=quandl_finance_data)
# run cerebro
cerebro_result = cerebro.run()
# capture results
date_series = cerebro_result[0].data.datetime.array
open_series = list(cerebro_result[0].data.open.array)
high_series = list(cerebro_result[0].data.high.array)
low_series = list(cerebro_result[0].data.low.array)
close_series = list(cerebro_result[0].data.close.array)
volume_series = list(cerebro_result[0].data.volume.array)
# write one CSV row per day
with open('results.txt', 'w') as file_obj:
    file_obj.write('Date,Open,High,Low,Close,Volume\n')
    for day_iter in range(len(date_series)):
        this_date = date.fromordinal(int(date_series[day_iter]))
        this_open = open_series[day_iter]
        this_high = high_series[day_iter]
        this_low = low_series[day_iter]
        this_close = close_series[day_iter]
        this_volume = volume_series[day_iter]
        file_obj.write('%s,%f,%f,%f,%f,%d\n' % (this_date, this_open, this_high, this_low, this_close, this_volume))
I get this error:
/usr/bin/python2.7 <bla>/btYahooIssue.py
Traceback (most recent call last):
File "<bla>/btYahooIssue.py", line 22, in <module>
cerebro_result = cerebro.run()
File "/usr/local/lib/python2.7/dist-packages/backtrader/cerebro.py", line 1073, in run
runstrat = self.runstrategies(iterstrat)
File "/usr/local/lib/python2.7/dist-packages/backtrader/cerebro.py", line 1149, in runstrategies
data.preload()
File "/usr/local/lib/python2.7/dist-packages/backtrader/feed.py", line 682, in preload
while self.load():
File "/usr/local/lib/python2.7/dist-packages/backtrader/feed.py", line 476, in load
_loadret = self._load()
File "/usr/local/lib/python2.7/dist-packages/backtrader/feed.py", line 704, in _load
return self._loadline(linetokens)
File "/usr/local/lib/python2.7/dist-packages/backtrader/feeds/quandl.py", line 111, in _loadline
v = float(linetokens[next(i)])
ValueError: could not convert string to float:
Process finished with exit code 1
What am I doing wrong?
Thanks, @dasch and @backtrader for your quick answers!
@dasch: I tried your suggestion, but unfortunately, the results are still inconsistent. One good tip from your answer, however, was to use round=False. Thanks for that! I guess we will never get the correct results due to the seemingly inconsistent behavior of the data returned by Yahoo. From the BT data feed reference:
If True the allegedly adjusted close and non-adjusted close will be swapped. The downloads with the new v7 API show at random times the closing prices swapped. There is no known pattern
If I use your suggestion, i.e., adjclose=False, swapcloses=True, round=False, we get the following:
From Yahoo:
Date,Open,High,Low,Close,Adj Close,Volume
2016-07-27,76.838997,78.705002,76.838997,78.142998,75.114166,12079
Ours:
Date,Open,High,Low,Close,Volume
2016-07-27,79.937379,81.878627,79.937379,78.142998,12079
We see that the close (and volume) values match, but nothing else does.
Now, if I use adjclose=True, swapcloses=True, round=False, we get the following:
From Yahoo:
Date,Open,High,Low,Close,Adj Close,Volume
2016-07-27,76.838997,78.705002,76.838997,78.142998,75.114166,12079
Ours:
Date,Open,High,Low,Close,Volume
2016-07-27,76.838997,78.705002,76.838997,75.114166,11610
We see that everything matches except for the close and volume values (exactly the opposite of the previous case). After a certain date, they do start matching, but I can't rely on something like this.
So it seems that if I want to get all the OHLCV values correct, I have to use a combination of the two approaches above.
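As a rough sketch of such a combination, the snippet below takes open, high and low from the adjclose=True row and close and volume from the adjclose=False row, using the two "Ours" rows quoted above. The function name combine_rows is made up for this example, and it only reflects the behavior I observed for the affected date range.

```python
def combine_rows(adjusted_row, unadjusted_row):
    """Merge two (date, open, high, low, close, volume) rows for the same day.

    Open/high/low come from the adjclose=True download,
    close/volume from the adjclose=False download.
    """
    date_a, open_a, high_a, low_a, _, _ = adjusted_row
    date_u, _, _, _, close_u, volume_u = unadjusted_row
    if date_a != date_u:
        raise ValueError('rows belong to different dates: %s vs %s' % (date_a, date_u))
    return (date_a, open_a, high_a, low_a, close_u, volume_u)


# "Ours" rows from the two runs described above.
unadjusted = ('2016-07-27', 79.937379, 81.878627, 79.937379, 78.142998, 12079)
adjusted = ('2016-07-27', 76.838997, 78.705002, 76.838997, 75.114166, 11610)

combined = combine_rows(adjusted, unadjusted)
print(combined)
```

For this one day the combined row equals the Yahoo row (without the Adj Close column). Since the values start matching on their own after a certain date, a real implementation would also need to know where that date lies.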
@backtrader: I suppose I have been living under a rock. So far, I have only required the close values, and since I always used adjclose=False, I got the proper values. Now I need the other values too and have therefore only just started venturing out from under my rock. For the record, I never meant to imply that there is some bug with BT. Please don't take it personally. I have always maintained in all my other posts that BT is an absolutely amazing product. Unfortunately, I do not see any alternative to using YahooFinanceData because I require mostly German and Indian equities and so far, I haven't been able to find an alternative service that provides these free of charge (I'm only doing this as a hobby in my free time and would therefore be unwilling to pay for a service). If you know a better source, I am all ears. Thank you again for your continued support!
Hello again BT community!
I have noticed an interesting problem which occurs on certain Yahoo symbols (I think - I have to admit, I haven't tested very extensively) when using the YahooFinanceData API. I notice that for the affected symbols, the OHLCV (open, high, low, close, volume) values are incorrect up to a certain point in time, after which they are correct.
To demonstrate this odd behavior, I have written a small script. Please execute the code below:
from datetime import datetime, timedelta, date
import backtrader as bt
# date range: one year back from today
lookback_timedelta_days = timedelta(days=365)
now_datetime = datetime.now()
to_date = datetime(now_datetime.year, now_datetime.month, now_datetime.day)
from_date = to_date - lookback_timedelta_days
# get Yahoo finance data
yahoo_finance_data = bt.feeds.YahooFinanceData(dataname="BMW.F", fromdate=from_date, todate=to_date, adjclose=False)
# start cerebro
cerebro = bt.Cerebro()
cerebro.adddata(data=yahoo_finance_data)
# run cerebro
cerebro_result = cerebro.run()
# capture results
date_series = cerebro_result[0].data.datetime.array
open_series = list(cerebro_result[0].data.open.array)
high_series = list(cerebro_result[0].data.high.array)
low_series = list(cerebro_result[0].data.low.array)
close_series = list(cerebro_result[0].data.close.array)
volume_series = list(cerebro_result[0].data.volume.array)
# write one CSV row per day (header line ends with a newline)
with open('results.txt', 'w') as file_obj:
    file_obj.write('Date,Open,High,Low,Close,Volume\n')
    for day_iter in range(len(date_series)):
        this_date = date.fromordinal(int(date_series[day_iter]))
        this_open = open_series[day_iter]
        this_high = high_series[day_iter]
        this_low = low_series[day_iter]
        this_close = close_series[day_iter]
        this_volume = volume_series[day_iter]
        file_obj.write('%s,%f,%f,%f,%f,%d\n' % (this_date, this_open, this_high, this_low, this_close, this_volume))
Sorry for the messy coding. It's just intended as a demo. As you see, we retrieve the historical OHLCV data for the symbol "BMW.F" going a year back and then write it to a file (results.txt) with the aim that we can compare this file against the one retrieved from Yahoo's web-interface (go here, then click "download data"). Please save the file and remove the "adjusted close" column because we don't need it (I would have attached the file, but uploads are not allowed).
If you compare the two files, you will see that there are big differences in the OHLC values up to May 11 2017. There are never any errors for the volume information, by the way. After this point, there are still differences, but they are very small (maybe precision errors?). Please see the image below to see what I mean. Up to May 11 2017, we see consistent differences on the order of ~3 Euros. After this point, the difference is on the order of ~<=0.01 Euros.
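For anyone who wants to do the comparison without eyeballing the files, here is a minimal sketch that reports the largest OHLC difference per date between two CSV sources with the Date,Open,High,Low,Close,Volume layout. The function name is hypothetical, and the two in-memory "files" hold a single sample row each, purely for illustration.

```python
import csv
import io


def ohlc_differences(ours_file, reference_file):
    """Return {date: largest absolute O/H/L/C difference} for dates present in both files."""
    ours = {row['Date']: row for row in csv.DictReader(ours_file)}
    diffs = {}
    for ref in csv.DictReader(reference_file):
        mine = ours.get(ref['Date'])
        if mine is None:
            continue  # date missing on our side, nothing to compare
        diffs[ref['Date']] = max(abs(float(mine[col]) - float(ref[col]))
                                 for col in ('Open', 'High', 'Low', 'Close'))
    return diffs


# Illustrative single-row samples standing in for results.txt and the Yahoo download.
ours_csv = io.StringIO(
    "Date,Open,High,Low,Close,Volume\n"
    "2016-07-27,79.937379,81.878627,79.937379,78.142998,12079\n")
yahoo_csv = io.StringIO(
    "Date,Open,High,Low,Close,Volume\n"
    "2016-07-27,76.838997,78.705002,76.838997,78.142998,12079\n")

diffs = ohlc_differences(ours_csv, yahoo_csv)
print(diffs)
```

To run it on real files, pass two objects opened with open(...) instead of the StringIO samples.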
I have also created a candlestick chart using the data retrieved from the YahooFinanceData API. It confirms what we see above. Before May 11, the data looks extremely weird where not only are the values wrong, but (almost) every day, the close is lower than the open. After May 11, we see that reality is reflected more accurately.
Finally, as I said, this doesn't seem to happen for all symbols. Here are a few others where I notice this behavior, along with the date on which the bad data becomes good: ARL.DE: June 1 2017; ADJ.DE: May 4 2017; AIRA.DE: April 11 2017; etc.
Thanks for reading this long post. I hope you can figure out what the issue is! Please let me know if I can provide any additional help.
pip show backtrader
Name: backtrader
Version: 1.9.51.121
Summary: BackTesting Engine
Home-page: https://github.com/mementum/backtrader
Author: Daniel Rodriguez
Author-email: danjrod@gmail.com
License: GPLv3+
Location: /usr/local/lib/python2.7/dist-packages
Requires:
@backtrader, is there any news/update on this issue? Thanks in advance.
@Maxim-Korobov I was going crazy with the same error. Thanks for this post!
@backtrader, thanks. It's a very valid use-case for me at least. Do you think this could be included in a future commit? Many thanks in advance!
Hello,
Is there any way to use bt.feeds.YahooFinanceData() with a proxy?
Thanks in advance
Once again, thank you so much, @backtrader, for your quick and helpful answer. I think I have everything I need for now.
Hi again @backtrader,
Many thanks for your quick answer! I'm afraid I'll require a follow-up :(.
First things first, where is the get() API documented? I couldn't find it.
I tried your first solution using the get() API and this works:
import datetime  # For datetime objects
import numpy
# Import the backtrader platform
import backtrader as bt
# Create a Strategy
class TestStrategy(bt.Strategy):
    def __init__(self):
        # Keep a reference to the "volume" line in the data[0] dataseries
        self.datavolume = self.datas[0].volume
    def next(self):
        # historical volume based decisions <- WORKS!!
        if self.datavolume[0] / numpy.mean(self.datavolume.get(size=10, ago=-1)) >= 1.25:
            print('buy')
            self.buy()
if __name__ == '__main__':
    # Create a cerebro entity
    cerebro = bt.Cerebro()
    # Add a strategy
    cerebro.addstrategy(TestStrategy)
    # Path to the CSV file (here: next to the script)
    datapath = 'goog.csv'
    # Create a Data Feed
    data = bt.feeds.YahooFinanceCSVData(
        dataname=datapath,
        # Do not pass values before this date
        fromdate=datetime.datetime(2016, 1, 1),
        # Do not pass values after this date
        todate=datetime.datetime(2016, 12, 31),
        reverse=True)
    # Add the Data Feed to Cerebro
    cerebro.adddata(data)
    # Set our desired cash start
    cerebro.broker.setcash(100000.0)
    # Print out the starting conditions
    print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())
    # Run over everything
    cerebro.run()
    # Print out the final result
    print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())
Here above, I hope I'm using the size and ago parameters properly. Based on your description, I'm supposing that size indicates my "lookback" period and ago indicates from what point prior to the current data point the lookback shall begin.
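To check my own understanding, I wrote a plain-Python emulation of how I believe size and ago select the window. This is an assumption on my part (the window ends `ago` bars from the current one and spans `size` values), not backtrader's actual code; the function and the sample values are made up.

```python
def get(values, idx, size=1, ago=0):
    """Return `size` values ending at position idx + ago (inclusive).

    `idx` plays the role of the current bar; ago=-1 ends the window at the previous bar.
    """
    start = idx + ago - size + 1
    end = idx + ago + 1
    return values[start:end]


volumes = [100, 110, 120, 130, 140, 150]
idx = 5  # the current bar is the last value, 150

previous_three = get(volumes, idx, size=3, ago=-1)  # excludes the current bar
latest_three = get(volumes, idx, size=3, ago=0)     # includes the current bar

print(previous_three)
print(latest_three)
```

If this reading is right, get(size=10, ago=-1) gives the 10 values before the current bar, oldest first, which is what the volume comparison above needs.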
Unfortunately, I couldn't get your second solution to work (maybe I've missed something):
import datetime  # For datetime objects
# Import the backtrader platform
import backtrader as bt
# Create a Strategy
class TestStrategy(bt.Strategy):
    def __init__(self):
        # Keep a reference to the "volume" line in the data[0] dataseries
        self.datavolume = self.datas[0].volume
        self.mysignal = (self.data.volume / bt.ind.Average(self.data.volume, period=10)) >= 1.25 #<- DOES NOT WORK!!
    def next(self):
        # signal-based decision
        if self.mysignal:
            print('buy')
            self.buy()
if __name__ == '__main__':
    # Create a cerebro entity
    cerebro = bt.Cerebro()
    # Add a strategy
    cerebro.addstrategy(TestStrategy)
    # Path to the CSV file (here: next to the script)
    datapath = 'goog.csv'
    # Create a Data Feed
    data = bt.feeds.YahooFinanceCSVData(
        dataname=datapath,
        # Do not pass values before this date
        fromdate=datetime.datetime(2016, 1, 1),
        # Do not pass values after this date
        todate=datetime.datetime(2016, 12, 31),
        reverse=True)
    # Add the Data Feed to Cerebro
    cerebro.adddata(data)
    # Set our desired cash start
    cerebro.broker.setcash(100000.0)
    # Print out the starting conditions
    print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())
    # Run over everything
    cerebro.run()
    # Print out the final result
    print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())
The line where I add the signal fails with the error:
<path>python bt_example.py
Starting Portfolio Value: 100000.00
Traceback (most recent call last):
File "bt_example.py", line 74, in <module>
cerebro.run()
File "C:\Python27\lib\site-packages\backtrader\cerebro.py", line 810, in run
runstrat = self.runstrategies(iterstrat)
File "C:\Python27\lib\site-packages\backtrader\cerebro.py", line 877, in runstrategies
strat = stratcls(*sargs, **skwargs)
File "C:\Python27\lib\site-packages\backtrader\metabase.py", line 87, in __call__
_obj, args, kwargs = cls.doinit(_obj, *args, **kwargs)
File "C:\Python27\lib\site-packages\backtrader\metabase.py", line 77, in doinit
_obj.__init__(*args, **kwargs)
File "bt_example.py", line 13, in __init__
self.mysignal = (self.data.volume / bt.ind.Average(self.data.volume, period=10)) >= 1.25
TypeError: unsupported operand type(s) for /: 'LineBuffer' and 'Average'
Also, even if this were to work, how would we calculate the average for all volume values starting from the previous day and going X days back? As far as I understand, with this code snippet, the signal is generated when the current volume divided by the average of the last X volume values exceeds or equals 1.25.
One more question: is there a simple way to filter out volume values which are 0?
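To make the last two questions concrete, here is a plain-Python sketch (no backtrader involved) of the rule I have in mind: compare the current volume against the average of the previous X days, skipping days with zero volume. The function name, the threshold default and the sample numbers are all made up for illustration.

```python
def volume_spike(volumes, t, lookback, threshold=1.25):
    """True if volumes[t] is at least `threshold` times the mean of the
    non-zero volumes in the `lookback` days before t (day t itself excluded)."""
    window = [v for v in volumes[max(0, t - lookback):t] if v > 0]
    if not window:
        return False  # nothing to compare against
    average = sum(window) / float(len(window))
    return volumes[t] / average >= threshold


volumes = [100, 0, 100, 100, 120]

# Zero-volume day skipped: mean of [100, 100, 100] is 100, ratio is 1.2.
print(volume_spike(volumes, t=4, lookback=4))
# A clear spike for comparison: mean of [100, 100] is 100, ratio is 1.3.
print(volume_spike([100, 100, 130], t=2, lookback=2))
```

If the zero were kept in the first example, the average would drop to 75 and the ratio would rise to 1.6, so the zero day alone would trigger a buy. That is the effect I would like to avoid.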
Finally, I may be wrong, but there's perhaps a typo in the indicator documentation (scroll down to the documentation for "Average"):
Shouldn't it be:
Formula:
av = sum(data(period)) / period
Dear bt community,
First of all, let me commend you on this amazing software! What you have created here is truly amazing!
I am quite new to bt and while I've been able to do everything I've wanted so far, the one thing that escapes me is the ability to retrieve historical data in a strategy. What I would like to do is to make a buy/sell decision based on the average traded volume over the last x days (this is obviously not a real-world scenario, just an example). I looked through the documentation and I see that it's possible to get individual values, e.g., something like self.datas[0].volume[-10]. However, what I want to do is to get something like self.datas[0].volume[-1 : -10 : -1], i.e., to get the last 10 traded volumes in one shot. If I do this, however, I get a TypeError.
Here's a concrete example (hacked from an example in the quickstart documentation):
import datetime  # For datetime objects
import numpy
# Import the backtrader platform
import backtrader as bt
# Create a Strategy
class TestStrategy(bt.Strategy):
    def __init__(self):
        # Keep a reference to the "volume" line in the data[0] dataseries
        self.datavolume = self.datas[0].volume
    def next(self):
        # historical volume based decisions <- THIS DOES NOT WORK!!
        if self.datavolume[0] / numpy.mean(self.datavolume[-1 : -10 : -1]) >= 1.25:
            # BUY, BUY, BUY!!! (with all possible default parameters)
            self.buy()
        # single-value volume based decisions <- THIS WORKS
        if self.datavolume[0] > self.datavolume[-1]:
            # previous volume greater than the one before it
            if self.datavolume[-1] > self.datavolume[-2]:
                # BUY, BUY, BUY!!! (with all possible default parameters)
                self.buy()
if __name__ == '__main__':
    # Create a cerebro entity
    cerebro = bt.Cerebro()
    # Add a strategy
    cerebro.addstrategy(TestStrategy)
    # Path to the CSV file (here: next to the script)
    datapath = 'goog.csv'
    # Create a Data Feed
    data = bt.feeds.YahooFinanceCSVData(
        dataname=datapath,
        # Do not pass values before this date
        fromdate=datetime.datetime(2016, 1, 1),
        # Do not pass values after this date
        todate=datetime.datetime(2016, 12, 31),
        reverse=True)
    # Add the Data Feed to Cerebro
    cerebro.adddata(data)
    # Set our desired cash start
    cerebro.broker.setcash(100000.0)
    # Print out the starting conditions
    print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())
    # Run over everything
    cerebro.run()
    # Print out the final result
    print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())
Please see the part where I say "<- THIS DOES NOT WORK!!" :). The exact error is:
<path>python bt_example.py
Starting Portfolio Value: 100000.00
Traceback (most recent call last):
File "bt_example.py", line 64, in <module>
cerebro.run()
File "C:\Python27\lib\site-packages\backtrader\cerebro.py", line 810, in run
runstrat = self.runstrategies(iterstrat)
File "C:\Python27\lib\site-packages\backtrader\cerebro.py", line 929, in runstrategies
self._runonce(runstrats)
File "C:\Python27\lib\site-packages\backtrader\cerebro.py", line 1302, in _runonce
strat._oncepost(dt0)
File "C:\Python27\lib\site-packages\backtrader\strategy.py", line 269, in _oncepost
self.nextstart() # only called for the 1st value
File "C:\Python27\lib\site-packages\backtrader\lineiterator.py", line 324, in nextstart
self.next()
File "bt_example.py", line 18, in next
if self.datavolume[0] / numpy.mean(self.datavolume[-1 : -10 : -1]) >= 1.25:
File "C:\Python27\lib\site-packages\backtrader\linebuffer.py", line 163, in __getitem__
return self.array[self.idx + ago]
TypeError: unsupported operand type(s) for +: 'int' and 'slice'
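The last frame of the traceback suggests why this fails: the index I pass is added to an integer position, and a slice cannot be added to an int. Here is a tiny stand-in class that mimics just that one line (`self.array[self.idx + ago]`); the class is made up for illustration and is not backtrader's LineBuffer.

```python
class TinyLine(object):
    """Minimal stand-in that indexes relative to a current position."""

    def __init__(self, values, idx):
        self.array = values
        self.idx = idx

    def __getitem__(self, ago):
        # Same expression as in the traceback: works for ints, not for slices.
        return self.array[self.idx + ago]


line = TinyLine([10, 20, 30, 40], idx=3)
current = line[0]    # value at the current position
previous = line[-1]  # value one step back

try:
    line[-1:-3:-1]   # a slice object reaches __getitem__ as `ago`
    slice_failed = False
except TypeError as exc:
    slice_failed = True
    print(exc)

print(current, previous, slice_failed)
```

So single relative indices work, while any slice ends up in the `int + slice` addition and raises the TypeError.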
Does anyone know how I can achieve what I want without resorting to for loops, etc.?
Many thanks in advance!