January 22, 2013

Nate Silver’s The Signal and the Noise

TL;DR: An interesting and entertaining book, with fun anecdotes and stories. Paradoxically, the narrative is itself a bit noisy and perhaps lacks a more formal treatment of some topics (frequentism vs. Bayesianism). Nevertheless, very enjoyable and very stimulating.

Key takeaways from the book

  • We are designed to “find patterns in random noise”. Information overload forces us to simplify the world, which introduces a lot of bias.

    [But] men may construe things after their fashion,

    Clean from the purpose of the things themselves.

    The Tragedy of Julius Caesar, Shakespeare.

  • We need models and theories within which the data makes sense; data mining alone is not enough. This runs contrary to Chris Anderson’s Data Deluge article, in which he claims that “with enough data, the numbers speak for themselves.”

  • Incentives not to predict accurately: rating agencies and the housing bubble.

  • Risk vs. uncertainty (risk can be quantified with known odds; uncertainty cannot): the Pearl Harbor example is very telling.

    There is a tendency in our planning to confuse the unfamiliar with the improbable. The contingency we have not considered seriously looks strange; what looks strange is thought improbable; what is improbable need not be considered seriously.

  • The Bayesian approach.

    The explanation of the Bayesian approach is didactic, but the book then falls into a Manichean divide between frequentists and Bayesians.

    This is perhaps the part of the book where Nate Silver contradicts himself. The frequentist approach is described as the devil:

    Essentially, the frequentist approach toward statistics seeks to wash its hands of the reason that predictions most often go wrong: human error. It views uncertainty as something intrinsic to the experiment rather than something intrinsic to our ability to understand the real world.

    And yet, at the same time, Nate Silver uses many “frequentist” tools in his own analyses. Somehow Nate Silver’s term “frequentist” describes both an “evil” mindset and an extremely useful statistical theory and toolbox. With respect to the quote above, I would counter that treating uncertainty as intrinsic to the experiment works well in cases where we do understand the “world” around the experiment. Moreover, in practice we often use both “Bayesian” and “frequentist” approaches, as sketched just below. More thoughts on this in Normal Deviate’s blog post.
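
To make the contrast concrete, here is a minimal sketch (mine, not the book’s) of Bayesian updating in the prior/likelihood form Silver presents; the particular numbers are illustrative, not taken from the book:

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """One application of Bayes' theorem in the x, y, z form Silver uses:
    posterior = x*y / (x*y + z*(1 - x)), where x is the prior probability,
    y = P(evidence | hypothesis true), z = P(evidence | hypothesis false)."""
    x, y, z = prior, p_evidence_if_true, p_evidence_if_false
    return (x * y) / (x * y + z * (1 - x))

# Illustrative numbers: evidence that shows up 75% of the time when the
# hypothesis is true and 10% of the time when it is false, starting from
# a skeptical 5% prior. Yesterday's posterior becomes today's prior.
p = 0.05
for step in range(1, 4):
    p = bayes_update(p, 0.75, 0.10)
    print(f"after update {step}: {p:.3f}")
# after update 1: 0.283  -- one piece of evidence barely moves a strong prior
# after update 2: 0.748
# after update 3: 0.957  -- repeated evidence eventually dominates the prior
```

Note that the two likelihoods (y and z) would themselves typically be estimated from observed frequencies, that is, with “frequentist” tools, which is precisely why the two approaches coexist in practice.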

I particularly liked:

  • the part on trading incentives and herding

    The trader sells but the market rises. This scenario, however, is a disaster. Not only will the trader have significantly underperformed his peers—he’ll have done so after having stuck his neck out, screaming that they were fools. It is extremely likely that he will be fired. And he will not be well-liked, so his prospects for future employment will be dim. His career earnings potential will have been dramatically reduced.

  • the tale of Kasparov vs Deep Blue. Really fascinating.

  • the analysis of poker.

NB: Nate Silver is best known for his predictions in the last US presidential election. During the campaign, he analyzed polls and estimated the probabilities of Obama or Romney winning. His blog, FiveThirtyEight (538), has been widely praised for the success of its election forecasts.

The Signal and the Noise: Why Most Predictions Fail but Some Don’t