Friday, 25 April 2008

Has the Commitments Data Lost Its Relevance? Uh, No

I was doing some out-of-sample testing on my trading setup for the NASDAQ 100 index based on the Commitments of Traders reports, and I found something interesting. It dispels the notion that this weekly government data on trader positioning in major markets has lost relevance in recent years. The NDX setup, which combines the signals from a couple of superior setups for the commercial traders and small traders, achieved a 20.8-percent compound annual growth rate from 2003 to 2007 (or 19.7 percent if a 0.2-percent commission is included per trade). During that time, the index itself had a CAGR of 15.4 percent. Not bad when you consider that the setup was in the market only 69 percent of the time. (When the two signals don't concur, the setup goes to cash.) The details of this setup are in a note at the end of this post, so you can replicate the signals.
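
To put the two growth rates side by side, here is a minimal sketch of the CAGR arithmetic. It assumes 2003 through 2007 counts as five full years; the function names are illustrative only and not taken from any particular library. Under that assumption, each dollar grows to roughly $2.57 in the setup versus about $2.05 in the index.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.208 means 20.8 percent)."""
    return (end_value / start_value) ** (1 / years) - 1


def growth_multiple(annual_rate, years):
    """Total growth of one unit compounded at annual_rate for the given years."""
    return (1 + annual_rate) ** years


YEARS = 5  # assumption: 2003 through 2007 counted as five full years

setup_multiple = growth_multiple(0.208, YEARS)  # setup CAGR quoted in the post
index_multiple = growth_multiple(0.154, YEARS)  # NDX CAGR quoted in the post

print(f"setup: {setup_multiple:.2f}x, index: {index_multiple:.2f}x")
```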

On a side note, I think it's fairly obvious it's harder to beat the market during a bull period like we've seen since 2003. That fact alone would tend to reduce the margin of outperformance. The real question is how the strategy does once the bull ends, as it may be doing now. A connected question is, how do you know what kind of market it is in the first place? I think the best longer-term strategy will perform well in a bull, bear or congested market. So even if we could prove the COTs data has lost its power since 2003, which, as I show above, it clearly hasn't, I think you'd want to see how it does in a bear market before jumping to any conclusions.

As for the out-of-sample testing on my NDX setup, it was also interesting. This kind of testing is done to help see how the setup would have done in real-life trading. In six tests (I plan to add a few more as I refine my testing further), the setup achieved an out-of-sample CAGR equal to 91 percent of the in-sample CAGR. In other words, if CAGR during the in-sample test window was 10 percent, then the out-of-sample CAGR would have been 9.1 percent. Out-of-sample "efficiency" above 60 to 70 percent is considered robust. Out-of-sample efficiency for regressed annual return was 76 percent. The out-of-sample efficiency for the Sharpe score was 1.5 (that is, 150 percent), with an out-of-sample Robust Sharpe efficiency of 1.1 and an out-of-sample largest drawdown only 38 percent as large as the in-sample drawdown. On average, the out-of-sample efficiency of all these scores was 1.4, meaning that the out-of-sample results were 40 percent better than those in the backtesting window.
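
The efficiency figures can be reproduced with a few lines of arithmetic. This is only a sketch. It assumes the drawdown figure enters the average inverted (in-sample drawdown divided by out-of-sample drawdown, so that a smaller out-of-sample drawdown scores above 1). The post does not state this, but it is the reading under which the five numbers average to about 1.4. The function and dictionary names are made up for illustration.

```python
def efficiency(out_of_sample, in_sample):
    """Out-of-sample result expressed as a fraction of the in-sample result."""
    return out_of_sample / in_sample


# The post's own example: 10 percent in-sample, 9.1 percent out-of-sample.
example = efficiency(9.1, 10.0)  # about 0.91

# Efficiencies quoted in the post, written as ratios.
efficiencies = {
    "cagr": 0.91,
    "regressed_annual_return": 0.76,
    "sharpe": 1.5,
    "robust_sharpe": 1.1,
    # Assumption: drawdown is inverted so that "smaller is better" reads as > 1.
    "largest_drawdown": 1 / 0.38,
}

average = sum(efficiencies.values()) / len(efficiencies)
print(round(average, 1))  # 1.4
```

Taking the drawdown ratio as 0.38 instead would pull the average down to about 0.93, so the inversion matters to the headline number.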

I also like to get a more robust look at all this out-of-sample data by dividing it by its standard deviation in the six out-of-sample tests. I find this evens out any unusual data spikes resulting from a small number of exceptionally strong or weak trades in a single test window. By that measure, the average out-of-sample score was even better, at 2.0, which suggests that the out-of-sample data was quite consistent across the six tests, another good sign. I intend to start publishing more data like this for all my setups on my Latest Signals page table. I've also been improving my testing procedures to optimize my search for the top setups and hope to have an improved S&P 500 setup to announce soon. See you in a little while with this afternoon's COTs data update.
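
One way to read the standard-deviation adjustment is as the mean of a score across the test windows divided by its standard deviation across those same windows. The sketch below assumes that reading and uses the sample standard deviation; the six values are hypothetical placeholders, not results from the actual tests.

```python
from statistics import mean, stdev


def consistency_score(values):
    """Mean of the per-window scores divided by their sample standard deviation.

    Assumption: this is one plausible reading of "dividing it by its standard
    deviation"; the post does not say which standard-deviation formula is used.
    """
    return mean(values) / stdev(values)


# Hypothetical efficiencies from six out-of-sample windows (illustration only).
window_efficiencies = [0.8, 1.1, 0.9, 1.0, 1.2, 0.7]

print(round(consistency_score(window_efficiencies), 2))
```

A single outlier window raises the standard deviation and so lowers the score, which is the smoothing effect described above.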

NDX setup parameter values: My trading setup for the NASDAQ 100 combines the signals from two different setups. The first setup goes long when the commercial trader net percentage-of-open-interest position is at or above -1.05 standard deviations relative to its three-week moving average. It goes short when the position is at or below -1.05 standard deviations relative to the moving average. The setup uses a five-week trade delay. The second setup goes long when the small trader net percentage-of-open-interest position is at or below 0.45 standard deviations relative to its 14-week moving average and goes short when the position is at or above 1.4 standard deviations relative to the average. This setup has no trade delay. I'll post a spreadsheet for this setup sometime early next week.
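
The rules above can be sketched in code roughly as follows. This is an illustration under several assumptions that the note does not spell out: the standard deviation is taken over the same trailing window as the moving average (including the current week), a reading exactly at -1.05 counts as long for the first setup, and the zone between 0.45 and 1.4 for the second setup is treated as no signal. All function names are hypothetical, and the inputs are weekly net positions as a percentage of open interest.

```python
from statistics import mean, pstdev


def deviation_scores(series, window):
    """Distance of each value from its trailing mean, in standard deviations."""
    scores = []
    for i in range(len(series)):
        if i + 1 < window:
            scores.append(None)  # not enough history yet
            continue
        recent = series[i + 1 - window : i + 1]
        spread = pstdev(recent)  # assumption: same window as the average
        scores.append(None if spread == 0 else (series[i] - mean(recent)) / spread)
    return scores


def commercial_signal(score):
    """First setup: long at or above -1.05, short below it."""
    if score is None:
        return 0
    return 1 if score >= -1.05 else -1


def small_trader_signal(score):
    """Second setup: long at or below 0.45, short at or above 1.4."""
    if score is None:
        return 0
    if score <= 0.45:
        return 1
    if score >= 1.4:
        return -1
    return 0  # assumption: no signal between the two thresholds


def delay(signals, weeks):
    """Shift signals later by the given number of weeks, padding with cash."""
    return ([0] * weeks + signals)[: len(signals)]


def combined_signal(commercial_net_pct, small_net_pct):
    """1 = long, -1 = short, 0 = cash when the two setups disagree."""
    first = delay(
        [commercial_signal(s) for s in deviation_scores(commercial_net_pct, 3)], 5
    )
    second = [small_trader_signal(s) for s in deviation_scores(small_net_pct, 14)]
    return [a if a == b else 0 for a, b in zip(first, second)]
```

Calling `combined_signal` with two equal-length lists of weekly readings returns one position per week. The early weeks come out as cash because the 14-week window and the five-week delay both need history to fill in first.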


Ryan said...

Interesting analysis.

As a Pardo fan, do you evaluate the robustness of the model by trying the parameter set on different markets? If not, why not?

Do you use daily data or weekly data? I have realized that I am greatly underestimating my drawdown by using only weekly open and close.


Alex Roslin said...

Hi Ryan,

Thanks for your message. Pardo actually recommends using different parameter values for different markets. I blogged about this on March 19, here:

I use weekly price data for testing, but I would consider a daily close below my stop level a reason to exit a trade.