Correlation Index Using a Parameter Grid

In our last analysis, we found that the correlation index for the top traded stocks in the SPY index could potentially contain information about its future behavior. Using the 60 trading-day realized correlation for an arbitrary number of companies, we found that entering or exiting positions in the index did not improve our investment performance. It did not hurt it either.

So, how much good and how much harm can we do by using this correlation index to fuel our decisions regarding SPY? We will try looking at the performance extremes inside a parameter grid for the model. This is usually called an optimization procedure in many fields; in the field of financial markets, it generally leads to overfitting instead of optimization.

Optimization tools such as the one available in QuantConnect are often misunderstood, misused, and abused. Let us first use it, then discuss the implications of outstanding and horrible results and interpret them coldly, without hating or loving the false discovery.

We will modify the model shown at the bottom of our previous post to accept external parameters that can be altered in a grid. In this specific tool, parameters are read from a sidebar integrated into the tool, which automatically launches the subsequent backtests. Other tools read parameters from text files in various formats and turn them into regular variables for the code to read. We will initially input the following values as parameters:

  • The number of stocks that make up the "top".

  • The period used to compute the correlation.

  • The lowest correlation value to generate a "sell" signal.

The code for parameter input is this:

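# Get parameters, falling back to a default when one is missing.
# A missing parameter is expected to come back as None, which int() and
# float() reject with TypeError; unparsable text raises ValueError.
# Universe size that makes up the top:
try:
    self.n_stocks = int(self.GetParameter("n_stocks"))
except (TypeError, ValueError):
    self.Log('Defaulting stocks parameter.')
    self.n_stocks = 15

# Period to compute correlation:
try:
    self.corr_period = int(self.GetParameter("corr_period"))
except (TypeError, ValueError):
    self.Log('Defaulting period parameter.')
    self.corr_period = 60

# Minimum correlation for "sell" signal:
try:
    self.min_corr = float(self.GetParameter("min_corr"))
except (TypeError, ValueError):
    self.Log('Defaulting minimum correlation parameter.')
    self.min_corr = 0.2

# Risk control parameter (not used in this post):
try:
    self.rc = float(self.GetParameter("risk_factor"))
except (TypeError, ValueError):
    self.Log('Defaulting risk parameter.')
    self.rc = 0.03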

We read each value from the parameter storage inside a Python try-except block, logging a brief warning and falling back to a default if a value is not found. There is an extra parameter in our code, the risk factor, that we are not using today; it is there for future models. We are also careful with the type we expect from the parameter grid: each grid parameter is cast to the appropriate type (int or float) to avoid errors.
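The same read, cast, and default pattern can be factored into a single helper. This is only a minimal sketch, assuming the parameter getter returns None when a name is not defined; `read_param` and the `fake_store` dictionary are hypothetical names used for illustration and are not part of the QuantConnect API:

```python
def read_param(getter, name, cast, default, log=print):
    """Read `name` through `getter`, cast it, and fall back to `default`.

    `getter` stands in for a call such as self.GetParameter; it is assumed
    to return None when the parameter is not defined.
    """
    try:
        return cast(getter(name))
    except (TypeError, ValueError):
        # TypeError: the getter returned None; ValueError: unparsable text.
        log(f"Defaulting {name} parameter to {default}.")
        return default


# Hypothetical parameter store standing in for the optimizer sidebar.
fake_store = {"n_stocks": "15", "min_corr": "not-a-number"}

n_stocks = read_param(fake_store.get, "n_stocks", int, 15)      # found and cast
corr_period = read_param(fake_store.get, "corr_period", int, 60)  # missing
min_corr = read_param(fake_store.get, "min_corr", float, 0.2)   # unparsable
```

Inside the algorithm, the getter would be the platform's own parameter call and the logger would be the algorithm's logging method.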

These parameter values are part of our algorithm's signal generation and the universe selection modules, so these values have to end up in the correct class. We have to pass them at the initialization of the algorithm:


And make sure they find their way into the initialization of each of the modules and become attributes of the corresponding class:

def __init__(self, market, corr_period, min_corr, risk):
        self.symbol_data = {}
        self.market = market

        # These are taken from parameter grid:
        self.period = corr_period
        self.min_corr = min_corr

        # Normal, non-parametric variables:
        self.Name = 'Correlation at Top'
        self.fut_ret = risk  
        # Future returns are not calculated.
        self.counter = False
        self.refresh = 2

The model is all set for a grid optimization test. We will first perform a trial run with an exaggerated parameter, setting as an example the minimum correlation to 0.8 as our parameter input:

This leads to the following results where, as expected, the minimum correlation value is too high and a buy signal is triggered very infrequently over a five-year period:

Running the optimization requires defining the parameter variables we want to iterate through and the value of the step. For a quick and inexpensive sample (around $1 in computation costs), we will iterate for the correlation period and the minimum correlation values, leaving the number of top stocks fixed at 15:
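As an illustration of how such a grid expands into individual backtests, the sketch below enumerates every combination of two parameters with the third held fixed. The ranges and step sizes are hypothetical, chosen only so that the grid has 30 combinations; they are not the values used in the run described here:

```python
from itertools import product

# Hypothetical ranges: 6 periods x 5 thresholds = 30 backtests.
corr_periods = range(10, 70, 10)        # 10, 20, ..., 60 trading days
min_corrs = [0.1, 0.2, 0.3, 0.4, 0.5]   # step of 0.1

grid = [
    {"corr_period": period, "min_corr": threshold, "n_stocks": 15}
    for period, threshold in product(corr_periods, min_corrs)
]

print(len(grid))  # 30
```

The number of backtests is the product of the number of values per parameter, so adding a third iterated parameter multiplies the cost accordingly.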

The analysis, which uses daily data with 16 tickers in the universe, takes just a few minutes. We have 30 possible combinations of correlation periods and minimum correlations. At the end of the optimization we are gifted with this overwhelmingly informative chart:

We can see that, depending on the combination of correlation period and minimum correlation, the equity at the end of the period varies by up to 100%. Where are the good values? Are these values consistent with our hypothesis that low correlations lead to market drops? The parameter grid results table gives us this information:

In terms of the Sharpe ratio, the merits of the model are concentrated in the lowest correlation values. This is consistent with our hypothesis and past observations, but it does not ensure that this behavior will hold in the future. In any case, the model is only slightly above the performance of SPY for these 5 years, a little better than a buy-and-hold strategy. We use this grid optimization as a sanity check for the model, not as a verification of the strategy: we could not scientifically claim that these results are expected in the future from this data alone. What we can do now, with this sanity check done, is carry the model with these "best" parameters, a correlation period of 20 and a minimum correlation of 0.2, over to a longer period. We are not hating or loving the optimization; we are just using it.
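For reference, the Sharpe ratio used to rank grid results can be sketched as follows. This assumes the common textbook definition (mean excess daily return divided by its sample standard deviation, scaled by the square root of 252 trading days); the figure reported by the backtesting platform may be computed differently. The function name and the toy returns are hypothetical:

```python
from math import sqrt
from statistics import mean, stdev


def annualized_sharpe(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from a list of daily returns."""
    excess = [r - risk_free_daily for r in daily_returns]
    return mean(excess) / stdev(excess) * sqrt(periods_per_year)


# Toy daily returns, for illustration only.
toy_returns = [0.010, -0.005, 0.004, 0.002, -0.003]
print(annualized_sharpe(toy_returns))
```

With a zero risk-free rate the ratio is insensitive to leverage: doubling every return doubles both the mean and the standard deviation, so the ranking of grid cells does not change.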

Let's simulate the period from 2005 to now using the low-correlation-at-the-top strategy:

In the long run, the correlation-at-the-top index is better at preventing losses during crisis periods. It does a good job especially during the 2008 subprime crisis, when the correlation index could show the symptoms of something going wrong in certain sectors of the economy, hopefully present in the top 15 SPY stocks. Sudden, full-system crashes such as the COVID-19 panic are more difficult for the model to spot, as the correlation is maintained: everyone is dropping hard, so the model thinks that there is nothing wrong. This long-period trial does outperform the buy-and-hold strategy, by an additional 200% total return, with a Sharpe ratio of 0.8 compared to the 0.5-ish of the historical SPY holding strategy.

In the very long run, this strategy, as analyzed via optimization and window expansion, may have some merit. In our next post, the last in this correlation at the top series, we will add momentum and volatility measures, at their simplest accepted definitions, to better inform the model. We will set up a meta-optimization to generate the false discovery of the best combination of correlation at the top, momentum, and volatility that totally beats and destroys the market... if you have a time machine. We will also find the worst of the discoveries, the one that generates the most losses, so that we have the full picture of the optimization process and do not fool ourselves into a false discovery or reject a valid one. There is a trade-off between the acceptance of the bad and the rejection of the good.

The information here does not constitute financial advice; we do not hold positions in any of the companies or assets that we mention in our posts at the time of posting. If you require quantitative model development, deployment, verification, or validation, do not hesitate to contact us. We will also be glad to help you with your machine learning or artificial intelligence challenges when applied to asset management, trading, or risk evaluations.
