Trouble with date format using backtrader & panda
-
Hi,
I am sorry, I suspect others have already had this trouble, but I couldn't find a solution that matches exactly.
1/ Context
I intend to study and implement strategies to be run with cryptocurrencies.
I found the "Backtest-rookies" blog, which gives some tutorials with CCXT, pandas & backtrader: awesome! :)
I used the script they propose to extract data from a crypto exchange using CCXT and save it in a CSV file:
https://backtest-rookies.com/2018/03/08/download-cryptocurrency-data-with-ccxt/
(for backtesting first, this is what I need)
In the script, a pandas.DataFrame is used to store the OHLCV data and to write it to a CSV file:
    # Get data
    data = exchange.fetch_ohlcv(args.symbol, args.timeframe)
    header = ['Timestamp', 'Open', 'High', 'Low', 'Close', 'Volume']
    df = pd.DataFrame(data, columns=header).set_index('Timestamp')

    # Save it
    symbol_out = args.symbol.replace("/","")
    filename = '{}-{}-{}.csv'.format(args.exchange, symbol_out,args.timeframe)
    df.to_csv(filename)
2/ Settings
To import the data back into backtrader, I used the script given in the backtrader documentation:
https://www.backtrader.com/docu/pandas-datafeed/pandas-datafeed.html?highlight=panda
I ran into an error with the initial script, one that another user had already mentioned in the forum:
https://community.backtrader.com/topic/676/bug-using-pandas-hdf
Following that topic, I tweaked the initial script by adding one line. So the modified script:
    from __future__ import (absolute_import, division, print_function,
                            unicode_literals)

    import argparse

    import backtrader as bt
    import backtrader.feeds as btfeeds

    import pandas as pd


    def runstrat():
        args = parse_args()

        # Create a cerebro entity
        cerebro = bt.Cerebro(stdstats=False)

        # Add a strategy
        cerebro.addstrategy(bt.Strategy)

        # Get a pandas dataframe
        datapath = ('../data_crypto/binance-BTCUSDT-1h_light.csv')

        # Simulate the header row isn't there if noheaders requested
        skiprows = 1 if args.noheaders else 0
        header = None if args.noheaders else 0

        dataframe = pd.read_csv(datapath,
                                skiprows=skiprows,
                                header=header,
                                parse_dates=True,
                                index_col=0)

        dataframe.index = pd.to_datetime(dataframe.index)

        if not args.noprint:
            print('--------------------------------------------------')
            print(dataframe)
            print('--------------------------------------------------')

        # Pass it to the backtrader datafeed and add it to the cerebro
        data = bt.feeds.PandasData(dataname=dataframe)

        cerebro.adddata(data)

        # Run over everything
        cerebro.run()

        # Plot the result
        cerebro.plot(style='bar')


    def parse_args():
        parser = argparse.ArgumentParser(
            description='Pandas test script')

        parser.add_argument('--noheaders', action='store_true', default=False,
                            required=False,
                            help='Do not use header rows')

        parser.add_argument('--noprint', action='store_true', default=False,
                            help='Print the dataframe')

        return parser.parse_args()


    if __name__ == '__main__':
        runstrat()
The added line is:
    dataframe.index = pd.to_datetime(dataframe.index)
3/ Trouble
When displayed by backtrader in the log, the date and time are lost: the dates printed start at 1970-01-01 00:25:54.
Well, I am quite sure Bitcoin did not exist yet in 1970... :)
I can add that the time data in the saved CSV file reads, for instance:
    Timestamp,Open,High,Low,Close,Volume
    1554897600000,5197.01,5245.0,5195.33,5234.42,1635.59772
    1554901200000,5234.42,5246.39,5216.17,5231.85,1342.13614
    1554904800000,5231.85,5237.33,5220.0,5231.0,899.434882
    1554908400000,5231.01,5246.0,5212.04,5245.06,1175.720961
Here is what it looks like in the log after reading by backtrader:
                                   Open     High      Low    Close       Volume
    Timestamp
    1970-01-01 00:25:54.897600  5197.01  5245.00  5195.33  5234.42  1635.597720
    1970-01-01 00:25:54.901200  5234.42  5246.39  5216.17  5231.85  1342.136140
    1970-01-01 00:25:54.904800  5231.85  5237.33  5220.00  5231.00   899.434882
    1970-01-01 00:25:54.908400  5231.01  5246.00  5212.04  5245.06  1175.720961
I understand that I am losing the date/time information when reading back the CSV data.
Does anyone know what trick I am missing (probably a whole lot...)?
Thank you in advance for your help.
Have a good day.
Best regards,
Pierre
-
@pierrot said in Trouble with date format using backtrader & panda:
Here is what it looks like in the log after reading by backtrader:
                                   Open     High      Low    Close       Volume
    Timestamp
    1970-01-01 00:25:54.897600  5197.01  5245.00  5195.33  5234.42  1635.597720
    1970-01-01 00:25:54.901200  5234.42  5246.39  5216.17  5231.85  1342.136140
    1970-01-01 00:25:54.904800  5231.85  5237.33  5220.00  5231.00   899.434882
    1970-01-01 00:25:54.908400  5231.01  5246.00  5212.04  5245.06  1175.720961
That is not the backtrader log. It is the result of this
@pierrot said in Trouble with date format using backtrader & panda:
    if not args.noprint:
        print('--------------------------------------------------')
        print(dataframe)
        print('--------------------------------------------------')
It's the actual content of the dataframe BEFORE you have loaded it as a data feed into backtrader. It is obvious that if the dataframe says the year is 1970, the same is going to happen inside the backtest.
You need to help pandas to understand what you are actually using as input. Try the following:
    dataframe.index = pd.to_datetime(dataframe.index, unit='ms')
-
Hi @backtrader,
Thank you very much for your reply.
I have just found a solution, and am coming back to explain it.
You are perfectly right, the trouble comes from the date format that is written to the CSV file in the first place.
So, for the interested reader: no modification is required in the backtrader Python script.
A single extra line does the job in the script that writes the CSV file (the one found on the backtest-rookies blog):
    data = [[exchange.iso8601(candle[0])] + candle[1:] for candle in data]
The solution actually comes from the CCXT bug tracker:
https://github.com/ccxt/ccxt/issues/4478
I will post this as a comment on the backtest-rookies blog as well.
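For readers who want to see what that conversion does without calling an exchange, here is a rough standard-library sketch. The helper `ms_to_iso8601` is a hypothetical stand-in for `exchange.iso8601`, not CCXT's own code, and its exact output format is an assumption. The candle rows are the first two from the CSV excerpt earlier in the thread.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_iso8601(ms):
    """Hypothetical helper: millisecond epoch -> ISO 8601 UTC string."""
    dt = EPOCH + timedelta(milliseconds=ms)
    # Append the millisecond part explicitly, then the UTC designator.
    return dt.strftime('%Y-%m-%dT%H:%M:%S') + '.{:03d}Z'.format(ms % 1000)


# Two OHLCV candles shaped like the rows in the CSV excerpt:
# [timestamp_ms, open, high, low, close, volume]
candles = [
    [1554897600000, 5197.01, 5245.0, 5195.33, 5234.42, 1635.59772],
    [1554901200000, 5234.42, 5246.39, 5216.17, 5231.85, 1342.13614],
]

# Same shape as the one-line fix: replace only the first field of each candle.
rows = [[ms_to_iso8601(candle[0])] + candle[1:] for candle in candles]

print(rows[0][0])  # 2019-04-10T12:00:00.000Z
```

Because the CSV then holds readable date strings instead of bare integers, the reading side no longer has to guess a unit.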
Thanks again @backtrader
Have a good afternoon.
Best regards,
Pierre