In the November 2011 issue of Psychological Science, Joseph P. Simmons, Leif D. Nelson, and Uri Simonsohn published an interesting article about the undisclosed flexibility in data collection, analysis, and reporting that inflates actual false-positive rates in psychology. The researchers state that it is unacceptably easy to publish “statistically significant” evidence consistent with any hypothesis.
The major problem they identify is what they call “researcher degrees of freedom” – more precisely, the decisions researchers make during the research process: e.g., which observations should be included or excluded? How much data should be collected? Which control variables should be used?
“It is rare, and sometimes impractical, for researchers to make all these decisions beforehand. Rather, it is common (and accepted practice) for researchers to explore various analytic alternatives, to search for a combination that yields “statistical significance,” and to then report only what “worked.”
Simmons et al. note that they don’t think researchers act with malicious intent; rather, ambiguity about how best to make these decisions, combined with the researcher’s desire to find a statistically significant result, is the culprit. In many cases, they state, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not.
To demonstrate the problem, the authors conducted two experiments designed to show something false: that certain songs can change listeners’ age.
In a first study, the researchers investigated whether listening to a children’s song induces an age contrast, making people feel older. They ran the experiment, collected the data – and got a significant result.
In a second study, they conceptually replicated and extended the first: they investigated whether listening to a song about older age actually makes people younger. This study, too, “succeeded”.
But they didn’t stop there. The authors used computer simulations of experimental data to estimate how these degrees of freedom influence research results. They assessed the impact of four common degrees of freedom:
- the flexibility in choosing among dependent variables,
- the flexibility in choosing the sample size,
- the flexibility of using covariates, and
- the flexibility of reporting subsets of experimental conditions.
They also examined various combinations of these degrees of freedom.
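To make the idea concrete, here is a minimal Monte Carlo sketch of my own (not the authors’ actual simulation code) illustrating two of these degrees of freedom under the null hypothesis: reporting whichever of two correlated dependent variables “works”, and collecting more data when the first test fails. The group sizes, correlation, and z-test with known variance are all simplifying assumptions for illustration.

```python
import math
import random

Z_CRIT = 1.96  # two-sided critical value for alpha = .05


def significant(g1, g2):
    """Two-sample z-test on group means, assuming known unit variance."""
    z = (sum(g1) / len(g1) - sum(g2) / len(g2)) / math.sqrt(1 / len(g1) + 1 / len(g2))
    return abs(z) > Z_CRIT


def draw_group(rng, n, rho=0.5):
    """n subjects with two correlated DVs; the true effect is zero for both."""
    dv1 = [rng.gauss(0, 1) for _ in range(n)]
    dv2 = [rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for x in dv1]
    return dv1, dv2


def run(n_sims=4000, n=20, seed=42):
    """Compare a strict analysis to a 'flexible' one across many null datasets."""
    rng = random.Random(seed)
    strict = flexible = 0
    for _ in range(n_sims):
        a1, a2 = draw_group(rng, n)
        b1, b2 = draw_group(rng, n)
        # Strict analysis: one pre-specified DV, fixed sample size.
        if significant(a1, b1):
            strict += 1
        # Flexible analysis: declare success if EITHER DV is significant;
        # if neither is, add 10 subjects per group and test both again.
        hit = significant(a1, b1) or significant(a2, b2)
        if not hit:
            e1, e2 = draw_group(rng, 10)
            f1, f2 = draw_group(rng, 10)
            hit = (significant(a1 + e1, b1 + f1)
                   or significant(a2 + e2, b2 + f2))
        if hit:
            flexible += 1
    return strict / n_sims, flexible / n_sims


if __name__ == "__main__":
    strict_rate, flexible_rate = run()
    print(f"strict false-positive rate:   {strict_rate:.3f}")
    print(f"flexible false-positive rate: {flexible_rate:.3f}")
```

The strict analysis stays near the nominal 5% false-positive rate, while the flexible one lands noticeably higher, even though there is no true effect in any simulated dataset. Combining more degrees of freedom, as the paper does, pushes the rate higher still.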
Readers interested in details should examine the full paper.
To solve the problems discussed, Simmons et al. suggest six concrete requirements for authors and four guidelines for reviewers, all of which impose only a minimal burden on the publication process.
# Update #
One thing I forgot to mention is that the authors don’t think that posting materials and data would solve the problem. Although they strongly support the idea that all journals should require authors to make their original materials and data publicly available, they don’t think this would adequately address the challenge of researcher degrees of freedom, because examining the data would impose too high a cost on readers and reviewers.
“Readers should not need to download data, load it into their statistical packages, and start running analyses to learn the importance of controlling for father’s age; nor should they need to read pages of additional materials to learn that the researchers simply dropped the “Hot Potato” condition.
Furthermore, if a journal allows the redaction of a condition from the report, for example, it would presumably also allow its redaction from the raw data and “original” materials, making the entire transparency effort futile.”
photo: Bill McIntyre, flickr.com