Stock Market Nirvana: Butter in Bangladesh
Hallelujah to Jason Zweig at The Wall Street Journal for tackling the subject of data mining through his interview with David Leinweber, author of Nerds on Wall Street. All this talk about Goldman Sachs, High Frequency Trading (HFT) and quantitative models is making my head spin and distorting the true value of data modeling. Quantitative modeling should serve as a handy device in your toolbox, not a robotic “black box” solely relied on for buy and sell recommendations. As the article points out, all types of sites and trading platforms are hawking their proprietary tools and models du jour.
The problem with many of these models, even the ones that work, is that financial market behavior and the factors driving it are constantly changing. Any strategy earning outsized profits will therefore eventually be discovered by other financial vultures and competed away. As Mr. Leinweber points out, these models become meaningless if the data is sliced and diced until it yields manipulated relationships and predictive advice that make no sense.
Butter in Bangladesh: To drive home the shortcomings of data mining, Leinweber uses a powerful example in his book, Nerds on Wall Street, of butter production in Bangladesh. In searching for the most absurd data possible to explain the returns of the S&P 500 index, Leinweber discovered that butter production in Bangladesh was an excellent predictor of stock market returns, explaining 75% of the variation of historical returns. The Wall Street Journal goes on to add:
By tossing in U.S. cheese production and the total population of sheep in both Bangladesh and the U.S., Mr. Leinweber was able to “predict” past U.S. stock returns with 99% accuracy.
For some money managers, the satirical stab Leinweber was making with the ridiculous analysis was lost in translation – after the results were introduced Leinweber had multiple people request his dairy-sheep model. “A distressing number of people don’t get that it was a joke,” Leinweber sighed.
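For the curious, here is a minimal sketch of why this kind of search "works" so well on paper. It does not use Leinweber's data or method; everything in it is simulated, and the names (`r_squared`, `best_spurious_fit`) and the numbers (20 years, 1,000 candidate series, an 8% mean return with 15% volatility) are my own assumptions for illustration. The idea: generate random "returns," then test a pile of equally random candidate predictors and keep the one that fits best.

```python
import random


def r_squared(xs, ys):
    """Squared Pearson correlation: share of variation in ys 'explained' by xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def best_spurious_fit(n_years=20, n_candidates=1000, seed=0):
    """Return (best, median) R^2 from fitting pure-noise series to noise returns."""
    rng = random.Random(seed)
    # Simulated annual returns (assumed: 8% mean, 15% volatility).
    returns = [rng.gauss(0.08, 0.15) for _ in range(n_years)]
    scores = []
    for _ in range(n_candidates):
        # Each candidate is random noise with no link to the returns.
        candidate = [rng.gauss(0, 1) for _ in range(n_years)]
        scores.append(r_squared(candidate, returns))
    scores.sort()
    return scores[-1], scores[len(scores) // 2]


best, median = best_spurious_fit()
print(f"Best R^2 among 1,000 random series: {best:.2f}")
print(f"Median R^2: {median:.2f}")
```

The typical random series explains almost nothing, but the best of a thousand looks like a discovery. None of the candidates has any connection to the returns, so the "winner" is purely a product of how many series were tried.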
Super Bowl Crystal Ball: Leinweber is not the first person to discover the illogical use of meaningless factors in quantitative models. Industry observers have noticed stocks tend to perform well in years when an old National Football League team wins the Super Bowl. Unfortunately, this year we had two “old” NFL teams play each other (Pittsburgh Steelers and Arizona Cardinals). Oops, I guess we need to readjust those models again.
Other bizarre studies have linked stock market performance to the number of nine-year-olds living in the U.S., and positive stock market returns to smog reduction.
Data Mining Avoidance Rules:
1) Sniff Test: The data results have to make sense. Correlation between variables does not necessarily equate to causation.
2) Cut Data into Slices: By dividing the data into pieces, you can see how robust the relationships are across the whole data set.
3) Account for Costs: The results may look wonderful, but the model creator must verify that all trading costs, fees, and taxes are included before anyone can be confident the results will hold up in the real world.
4) Let Data Brew: What looks good on paper might not work in real life. “If a strategy’s worthwhile,” Mr. Leinweber says, “then it’ll still be worthwhile in six months or a year.”
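Rules 2 and 3 can be sketched in a few lines. This is an illustration under my own assumptions, not a recipe from Leinweber or the Journal: the data is simulated noise, and the function names (`slice_check`, `net_return`) and the cost figures are hypothetical. The slice check picks the best-fitting random predictor using only the first half of the data, then scores that same predictor on the second half. The cost function simply subtracts an assumed per-trade cost, scaled by turnover, from a gross return.

```python
import random


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def slice_check(n_years=40, n_candidates=1000, seed=1):
    """Rule 2: choose a predictor on the first slice, then score it on the second."""
    rng = random.Random(seed)
    returns = [rng.gauss(0.08, 0.15) for _ in range(n_years)]
    half = n_years // 2
    candidates = [
        [rng.gauss(0, 1) for _ in range(n_years)] for _ in range(n_candidates)
    ]
    # The winner is selected using ONLY the first half of the data.
    winner = max(candidates, key=lambda c: r_squared(c[:half], returns[:half]))
    first = r_squared(winner[:half], returns[:half])
    second = r_squared(winner[half:], returns[half:])
    return first, second


def net_return(gross_return, turnover, cost_per_trade):
    """Rule 3: gross return minus costs; turnover is the fraction of the
    portfolio traded per year, cost_per_trade the cost as a fraction of
    the amount traded (both assumed inputs)."""
    return gross_return - turnover * cost_per_trade


first_half, second_half = slice_check()
print(f"R^2 on the slice used to pick the model: {first_half:.2f}")
print(f"R^2 on the untouched slice: {second_half:.2f}")
print(f"10% gross, 200% turnover, 1% cost: {net_return(0.10, 2.0, 0.01):.2%} net")
```

A relationship that is real should show up in both slices. One that was mined out of noise will usually look strong only on the slice it was chosen from.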
Not everyone has a PhD in statistics, but you don’t need one to ask tough, skeptical questions. Doing so will help you avoid the land mines buried in many quantitative models. Happy butter churning…
Wade W. Slome, CFA, CFP®
Plan. Invest. Prosper.