Nightmarish Quantitative Errors Before Christmas

We made a horrible error in our previous post and would like to correct it before Christmas comes and any of our readers goes over the original notebook and is misled. We will also give a word of warning about a pandas function that, because of its ease of use (assumptions being the mother of all errors), is prone to misuse, and that we misused without thinking twice, or even once.


We will illustrate the past and future price change error with this example, which can also be of help to first-time users of the research environment in QuantConnect.


The offending code is this:

time_frames = [1,3,5,10,15,22,66,132]
# P for price, R for returns, D for direction of the price change:
for column in price_history.columns:
    for frame in time_frames:
        price_history['P_'+str(frame)+"_"+column+'_FUT'] =  price_history[column].shift(-frame)
        price_history['R_'+str(frame)+"_"+column+'_FUT'] = price_history[column].pct_change(-frame)
        price_history['D_'+str(frame)+"_"+column+'_FUT'] = price_history[column].pct_change(-frame) > 0
        price_history['R_'+str(frame)+"_"+column+'_PAST'] = price_history[column].pct_change(frame)
        price_history['D_'+str(frame)+"_"+column+'_PAST'] = price_history[column].pct_change(frame) > 0

When computing our targets, the future returns and direction, we pass "-frame" to move the reference in the negative direction and then take the percentage change, expecting the change of the future price with respect to the current one. What we actually obtain is the change of the current price with respect to the future one: pct_change(-frame) computes price[t] / price[t + frame] - 1. If the future price is higher, the returns calculated this way are negative. Error. True will indicate that the price dropped, False that the price went up. The correct code is:

time_frames = [1,3,5,10,15,22,66,132]
# P for price, R for returns, D for direction of the price change:
for column in price_history.columns:
    for frame in time_frames:
        price_history['P_'+str(frame)+"_"+column+'_FUT'] =  price_history[column].shift(-frame)
        price_history['R_'+str(frame)+"_"+column+'_FUT'] = -price_history[column].pct_change(-frame)
        price_history['D_'+str(frame)+"_"+column+'_FUT'] = -price_history[column].pct_change(-frame) > 0
        price_history['R_'+str(frame)+"_"+column+'_PAST'] = price_history[column].pct_change(frame)
        price_history['D_'+str(frame)+"_"+column+'_PAST'] = price_history[column].pct_change(frame) > 0
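
A quick way to see what pct_change does with a negative period is to run it on a tiny series. The sketch below uses made-up prices and hypothetical variable names; it is not taken from the original notebook. It also suggests that negating pct_change(-n) restores the sign but still divides by the future price, so if a conventional forward return (divided by the current price) is wanted, one possible alternative is pct_change(n).shift(-n).

```python
import pandas as pd

# Made-up prices, for illustration only: up from 100 to 110, then down to 99.
prices = pd.Series([100.0, 110.0, 99.0])
n = 1  # look-ahead, in rows

# pct_change(-n) computes prices[t] / prices[t + n] - 1:
# the current price measured against the future one.
raw = prices.pct_change(-n)

# Negating restores the sign, but the denominator is still the future price.
negated = -raw

# Conventional forward return: (prices[t + n] - prices[t]) / prices[t].
forward = prices.pct_change(n).shift(-n)

print(pd.DataFrame({"price": prices, "raw": raw,
                    "negated": negated, "forward": forward}))
```

For the first row the price goes from 100 to 110: raw is negative, negated is 10/110 and forward is 10/100. The direction columns agree between negated and forward; only the magnitudes differ.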

The error is difficult to catch in research: no strange values are observed in the predictive power of the data, because the machine learning model does not really care what the direction is, only that whatever it is can be predicted with a certain consistency. The problem comes in the next phase: in the backtest we read the prediction of the machine learning model as "price going up" when True and "price going down" when False, while the labels meant the exact opposite.
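
Since model metrics will not reveal a flipped label, a cheap guard is to check the direction column against a direct comparison of future and current prices before training. This is a minimal sketch, assuming a single price column; the function name check_future_direction and the sample data are hypothetical and not part of the original notebook.

```python
import pandas as pd

def check_future_direction(price: pd.Series, direction: pd.Series, frame: int) -> bool:
    """Return True if `direction` is True exactly where the price `frame` rows ahead is higher."""
    future = price.shift(-frame)
    valid = future.notna()  # the last `frame` rows have no future price
    expected = future[valid] > price[valid]
    return bool((direction[valid] == expected).all())

# Made-up prices, for illustration only.
price = pd.Series([10.0, 11.0, 10.5, 12.0, 11.0])
frame = 1

flipped = price.pct_change(-frame) > 0     # labels as in the offending code
corrected = -price.pct_change(-frame) > 0  # labels as in the corrected code

print(check_future_direction(price, flipped, frame))    # False
print(check_future_direction(price, corrected, frame))  # True
```

The rows without a future price are excluded from the check because a comparison against NaN evaluates to False, so those rows are labeled as "down" by both versions of the code.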


With the changes made, the dataframes now show future and past with the correct directionality:


Remember that information in ostirion.net does not constitute financial advice; we do not hold positions in any of the companies or assets that we mention in our posts at the time of posting. If you are in need of algorithmic model development, deployment, verification or validation, do not hesitate to contact us. We will also be glad to help you with your predictive machine learning or artificial intelligence challenges.


© 2019 Ostirion

-Honest work & Mathematical precision-