Pitfalls in statistics

2010-03-19

Statistics are powerful, but they are easy to use sloppily and incorrectly. This article identifies a few important pitfalls related to statistical significance, problems with testing multiple hypotheses simultaneously, issues with clinical trials and meta-analyses, and the usefulness of Bayes’ Theorem.

While statistics are a far sounder way of knowing about the world than anecdotes and intuition, it is definitely worrisome that they are so poorly understood. My graduate statistics course at Oxford is evidence enough of just how poorly the subject can be taught.

Pearl March 22, 2010 at 2:26 pm

Before I go after those and forget again what I came to mention, have you seen this? http://www.tarsandswatch.org/tar-nation-play-game-now

Milan March 22, 2010 at 2:35 pm

Some other statistically related things:

How to lie with statistics (the second entry is mine)

The seductiveness of the bell curve

The Black Swan

Milan March 22, 2010 at 2:37 pm

Thanks also for the link to the game.

Some more on climate change and games:

UK dev and climate scientist in £1m project

Chevron’s climate game

Stabilization Wedge Game

. July 15, 2010 at 1:49 pm

By the Numbers
A terrific new book of essays encourages us all to be skeptical about statistics.
By Jack Shafer
Posted Wednesday, July 14, 2010, at 5:00 PM ET

“The creation, selection, promotion, and proliferation of numbers are … the stuff of politics,” the editors write in their introduction. No debate lasts very long without a reference to data, and as the numbers boil their way into the argument, you must challenge them or be burned blind by them. The essays presented in Sex, Drugs, and Body Counts—about human trafficking, the Bosnian death count, the Darfur genocide, armed conflict, drugs, terrorism, and more—counsel exactly that sort of skepticism. Here are the questions the book’s editors say readers should ask when confronted with numbers:

Where do the estimates come from, who produces them, what legitimating function do they serve, and how (if at all) are they explained in official reporting? What are the implications and consequences (intended and un-intended) of choosing one set of numbers over another? To what degree are the numbers accepted or challenged, and why? What purpose do they serve?

Later, they write:

Numbers should provoke especially tough questions when the activity being measured is secretive, hidden, and clandestine. “How could they know that? How could they measure that?”

Often, the editors write, even the most rigorous-seeming statistics conceal squishy measurements. Inflated numbers are designed to create the sense that something must be done now. Depressed counts are intended to convince the recipients that the problem is too small to worry about. Whether it’s body counts in Iraq or kilos of Colombian drugs, the creators and disseminators of the numbers usually have greater interest in their size than their veracity.

. July 16, 2010 at 2:18 am

The politics of census forms

Editorial
Friday, July 16, 2010

http://www.thestar.com/article/836488--the-politics-of-census-forms

“The stunning announcement — without any public consultation — has caught Canadians by surprise. But it has also brought them together in unexpected ways.

The public sees this move for what it is — an echo of the paranoid impulses of America’s far right, which has long campaigned against the U.S. census as a symbol of government intrusiveness. Clement claims some MPs get complaints from some constituents. But where are the beefs? The federal Privacy Commissioner says only a handful of concerns have been raised in recent years.

There is no groundswell of anti-census sentiment in the land. To the contrary, Clement’s pandering has sparked a united outcry that cuts across ethnic, linguistic, political and class lines: be they bankers or social activists, they want to keep the census intact. This fiasco has also brought the opposition parties together. Harper should heed the cry and reverse the decision.”

Milan December 12, 2011 at 8:48 pm

This is a nice illustration of the limitations of summary statistics:

Anscombe’s quartet
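
For anyone who wants to check this themselves, here is a minimal Python sketch. It uses the four datasets from Anscombe's 1973 paper and only the standard library. The names QUARTET, pearson, summarize, and summaries are my own choices for illustration, not anything from the linked page.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's four datasets. Sets I to III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(x, y):
    """Pearson correlation coefficient, computed by hand."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ssx * ssy)


def summarize(x, y):
    """Return the usual summary statistics for one dataset."""
    return {
        "mean_x": mean(x),
        "mean_y": mean(y),
        "var_x": variance(x),  # sample variance (n - 1 denominator)
        "var_y": variance(y),
        "r": pearson(x, y),
    }


summaries = {name: summarize(x, y) for name, (x, y) in QUARTET.items()}

for name, s in summaries.items():
    print(f"{name:>3}: mean_x={s['mean_x']:.2f} mean_y={s['mean_y']:.2f} "
          f"var_x={s['var_x']:.2f} var_y={s['var_y']:.2f} r={s['r']:.3f}")
```

The four printed rows come out nearly identical, even though a scatter plot of each set looks completely different: one roughly linear, one curved, one linear with a single outlier, and one where a single point produces the whole correlation. The summary numbers alone cannot tell these apart.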