20130620 - Misinterpretation of 'p' (2000) (3e)

PEREZGONZALEZ Jose D [ed] (2013). Misinterpretation of 'p' (2000) (3e) [5]. Knowledge (ISSN 1177-4576), 2013, pages 103-105.

Misinterpretations of 'p' and 'sig'

Haller and Krauss (2000 [2]) carried out a study on common misinterpretations of tests of significance among German psychology students and academics, which partly replicates one done by Oakes (1986 [3]). Typically, most of these misinterpretations confuse p-values (ie, the probability of the observed data, or of more extreme data, when assuming that the null hypothesis is true) and, especially, statistical significance [7], with the probability of proving or disproving hypotheses (be this the null hypothesis or an alternative hypothesis). Another misinterpretation is the so-called "replication fallacy", which occurs when the p-value is assumed to inform of the probability of finding similar results if the research were to be repeated.
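The difference between the p-value and the probability of the null hypothesis can be illustrated with a small simulation. The following is a minimal sketch under assumptions that are not part of the study: half of all tested null hypotheses are true (`prior_null`), real effects have a standardized size of 0.5 (`effect`), each group has 20 observations with known unit variance (so a z-test is used instead of a t-test), and significance is set at 0.05. All function and parameter names are hypothetical.

```python
import math
import random


def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_p_value(rng, effect, n):
    """Simulate two independent groups of size n (unit variance) and
    return the two-sided z-test p-value for the difference in means."""
    group_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    group_b = [rng.gauss(effect, 1.0) for _ in range(n)]
    diff = sum(group_b) / n - sum(group_a) / n
    standard_error = math.sqrt(2.0 / n)  # known unit variance (assumption)
    return two_sided_p_from_z(diff / standard_error)


def share_of_true_nulls_among_significant(
    trials=10000, prior_null=0.5, effect=0.5, n=20, alpha=0.05, seed=1
):
    """Among simulated studies with p < alpha, return the proportion
    in which the null hypothesis was actually true."""
    rng = random.Random(seed)
    significant = 0
    significant_and_null = 0
    for _ in range(trials):
        null_is_true = rng.random() < prior_null
        p = simulate_p_value(rng, 0.0 if null_is_true else effect, n)
        if p < alpha:
            significant += 1
            if null_is_true:
                significant_and_null += 1
    return significant_and_null / significant


print(share_of_true_nulls_among_significant())
```

Under these assumptions, the proportion of significant results that come from true null hypotheses depends on the prior and on the power of the test, so it need not equal the significance level or any particular p-value. Changing `prior_null` or `effect` changes the result, which is the point of the illustration.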

Haller and Krauss found that most participants endorsed at least one of the six misinterpretations presented (see table 1). Overall, 100% of psychology students held one or more misinterpretations (mean=2.5), as did almost 90% of psychology researchers (mean=2) and 80% of instructors of statistics in psychology (mean=1.9). The authors found the high percentage of instructors with misinterpretations worrisome, as instructors may pass those misinterpretations on to their students. Another interesting result, although one not highlighted by the authors, is the high percentage of researchers (including instructors when carrying out and publishing research) with misinterpretations, as they may perpetuate them when publishing, peer-reviewing others' publications, and making research-informed decisions (such as chairing committees, granting funding, etc).

Table 1. Percentages of misinterpretations regarding tests of significance

| Common misinterpretations [6] | Instructors f | Instructors % | Researchers f | Researchers % | Students f | Students % |
|---|---|---|---|---|---|---|
| Significance disproves the null hypothesis | 3 | 10% | 6 | 15% | 15 | 34% |
| The p-value informs of the probability of the null hypothesis | 5 | 17% | 10 | 26% | 14 | 32% |
| Significance proves the alternative hypothesis | 3 | 10% | 5 | 13% | 9 | 20% |
| The p-value informs of the probability of the alternative hypothesis | 10 | 33% | 13 | 33% | 26 | 59% |
| 'P' informs of the probability of a wrong decision when rejecting the null | 22 | 73% | 26 | 67% | 30 | 68% |
| The p-value informs of the probability of the results if replicated | 11 | 37% | 19 | 49% | 18 | 41% |
| (Participants who answered that all of above were false) | 6 | 20% | 4 | 10% | 0 | 0% |
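The percentages in table 1 follow from the frequencies and the group sizes given in the sample description (30 instructors, 39 researchers, 44 students). The short sketch below recomputes them, together with the share of each group holding at least one misinterpretation. Variable and function names are hypothetical, and the frequencies themselves are the estimates shown in the table, not raw data.

```python
# Group sizes as reported in the sample description.
group_sizes = {"instructors": 30, "researchers": 39, "students": 44}

# Frequencies from table 1, in row order (six misinterpretations).
frequencies = {
    "instructors": [3, 5, 3, 10, 22, 11],
    "researchers": [6, 10, 5, 13, 26, 19],
    "students": [15, 14, 9, 26, 30, 18],
}

# Participants who answered that all statements were false.
all_false = {"instructors": 6, "researchers": 4, "students": 0}


def percentages(group):
    """Percentage of the group endorsing each misinterpretation."""
    n = group_sizes[group]
    return [round(100 * f / n) for f in frequencies[group]]


def at_least_one(group):
    """Percentage of the group holding one or more misinterpretations."""
    n = group_sizes[group]
    return 100 * (n - all_false[group]) / n


for group in group_sizes:
    print(group, percentages(group), f"{at_least_one(group):.1f}%")
```

The last figure per group is simply the complement of the "all false" row, which is how the 80%, almost 90% and 100% figures in the text relate to the table.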


Research approach

Replication study using a German sample. The original study had been carried out by Oakes (1986 [3]) with a British sample of psychology academics 15 years earlier.


A convenience sample of 113 participants from departments of psychology in 6 German universities: 44 participants were psychology students, 39 participants were research psychologists not involved with teaching statistics, and 30 participants were instructors of statistics in psychology (including lecturers and tutors).


Oakes's (1986 [3]) questionnaire translated into German:

  • The questionnaire consisted of a short scenario and six statements. The scenario described a small study with two independent samples, and provided the relevant results: a t-test with 'p=0.01'.
  • The six statements asked for a true / false decision regarding whether each particular statement reflected a logical interpretation of the results. Unbeknownst to the participants, all statements were false, representing six common misinterpretations of tests of significance.
  • The study also provided the 'hint' that "several or none of the statements may be correct".
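The "replication fallacy" can be made concrete for a result such as the 'p=0.01' in the scenario. The sketch below is only an illustration under strong assumptions that are not part of the questionnaire: the test statistic is treated as normally distributed (a z approximation rather than the t-test of the scenario, whose degrees of freedom are not given here), the true effect is taken to be exactly the effect originally observed, and the replication uses the same sample sizes. The function name and parameters are hypothetical.

```python
from statistics import NormalDist


def replication_probability(p_original, alpha):
    """Probability that an identical replication is significant at level
    alpha (two-sided), assuming the true effect equals the originally
    observed effect and a normal approximation for the test statistic."""
    nd = NormalDist()
    z_observed = nd.inv_cdf(1 - p_original / 2)  # z matching the original p
    z_critical = nd.inv_cdf(1 - alpha / 2)       # two-sided critical value
    same_direction = 1 - nd.cdf(z_critical - z_observed)
    opposite_direction = nd.cdf(-z_critical - z_observed)
    return same_direction + opposite_direction


print(round(replication_probability(0.01, 0.01), 3))
print(round(replication_probability(0.01, 0.05), 3))
```

Under these assumptions, a replication of a study with p=0.01 would reach p<0.01 again only about half of the time, and p<0.05 in roughly three out of four attempts. Neither figure is the 99% that the fallacy suggests.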


The data were analyzed using descriptive statistics (frequencies, percentages and means).

Generalization potential

This particular research was done with a sample of psychology academics and students from different universities in Germany, and its design appears to be more valid than that of previous studies. It also found trends similar to those reported by Oakes (1986 [3]) in the U.K. and by Falk and Greenbaum (1995 [1]) in Israel. Thus, these results may be generalizable to the following populations (in order of decreasing generalization power):

  • German, British and Israeli psychology academics and researchers (including students).
  • Psychology professionals trained in German, British and Israeli universities.
  • Psychology professionals and academics elsewhere.
  • Other scientists (especially from the social sciences, medicine and business) who rely on the null hypothesis significance testing (NHST) procedure.

+++ References +++
1. FALK Ruma & Charles W GREENBAUM (1995). Significance tests die hard: the amazing persistence of a probabilistic misconception. Theory & Psychology, 1995, volume 5, number 1, pages 75-98. DOI 10.1177/0959354395051004.
2. HALLER Heiko & Stefan KRAUSS (2000). Misinterpretations of significance. A problem students share with their teachers. Methods of Psychological Research Online, 2002, volume 7, number 1, pages 1-20.
3. OAKES Michael (1986). Statistical inference: a commentary for the social and behavioral sciences. John Wiley & Sons (Chichester, UK), 1986.
4. PEREZGONZALEZ Jose D [ed] (2012). Misinterpretation of 'p' (2000) (2e)5. Journal of Knowledge Advancement & Integration (ISSN 1177-4576), 2012, pages 146-148.
+++ Notes +++
5. This edition updates the previous edition [4] by estimating frequencies for table 1, thus making the table more comparable across related articles.
6. The original research statements have been rephrased here.
7. The example provided to participants used p=0.01, thus it can be interpreted as having the dual role of a 'p-value' and a 'conventional level of significance'.

Want to know more?

WikiofScience - Hypotheses testing (disambiguation)
This WikiofScience page lists alternative methods for testing the probability of data or hypotheses.
WikiofScience - Null hypothesis significance testing
This WikiofScience page reflects on the pseudoscientific bases of the null hypothesis significance testing (NHST) procedure typically used in the social sciences and medicine.
WikiofScience - Related studies
You can find more information on two related studies on WikiofScience. One was the original study done by Oakes in 1986; the other study was a replication done by Falk and Greenbaum in 1995.


Jose D PEREZGONZALEZ (2013). Massey University, Turitea Campus, Private Bag 11-222, Palmerston North 4442, New Zealand. (JDPerezgonzalez).

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License