P-values in statistics
'P-values' are statistics which describe the probability of obtaining the observed data (or data more extreme) under a given hypothesis, typically the null hypothesis. They are typically used inferentially, as a statistic about the probability of observing similar data in the population at large. Being probabilities, they are expressed as decimal numbers ranging from 0 to 1 (e.g., p = .025).
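As a minimal sketch of how such a probability can be obtained, the following computes a one-sided exact binomial 'p-value'. The coin-flip data, the fair-coin null hypothesis, and the function name `binomial_p_value` are illustrative assumptions, not part of the original text.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided exact p-value: P(X >= successes) if the null hypothesis holds."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical data: 15 heads in 20 flips of a coin assumed fair under the null.
p = binomial_p_value(15, 20)
print(f"p = {p:.3f}")  # prints "p = 0.021"
```

The result is a probability between 0 and 1. It says nothing by itself about whether the result is 'significant'.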
Statistical packages normally report 'p-values' as 'p' or as 'significance'. 'P' can be taken as shorthand for 'probability (of the data)'. 'Significance' should be taken to mean 'exact level of significance' (as described by Gigerenzer ###). This double labeling, however, adds to the confusion about what a 'p-value' is, and partly perpetuates a pseudoscientific approach to hypothesis testing.
To reduce cognitive workload and increase clarity, it seems beneficial to separate a 'p-value' from any decision that can be based upon it, such as tests of significance or hypothesis testing. This implies both a conceptual separation and a nomenclature separation:
- The conceptual separation would seek to refer to 'p-values' as a descriptive statistic about the probability of data (including posterior probabilities, likelihood, etc.). They can, of course, also be used inferentially. 'P-values' should then be reported as exact statistics, just as means, medians, and standard deviations are.
- The nomenclature separation would seek to reserve the terms 'probability (of the data)' and 'p-value' for the statistic described above, with 'p' being an acceptable shorthand under such usage. Other terms such as 'significance', 'level of significance', or 'exact level of significance' should be discouraged in this context, because significance refers to the interpretation of a 'p-value' or to a decision based upon it, and both are independent of the statistic itself.
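The two separations above can be sketched in code by keeping the reporting of 'p' apart from any decision made upon it. This is only an illustration: the function names `format_p` and `is_significant`, the three-decimal format, and the `p <= alpha` decision rule are assumptions, not conventions stated in the text.

```python
def format_p(p: float, digits: int = 3) -> str:
    """Report p as an exact statistic, e.g. 'p = .025'."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    floor = 10 ** -digits  # smallest value that can be shown at this precision
    if p < floor:
        return f"p < {floor:.{digits}f}".replace("0.", ".", 1)
    text = f"{p:.{digits}f}"
    # Drop the leading zero, since p is usually written as '.025'.
    return "p = " + (text[1:] if text.startswith("0") else text)


def is_significant(p: float, alpha: float) -> bool:
    """A decision based on p; the threshold alpha is chosen by the researcher."""
    return p <= alpha  # assumed decision rule


p = 0.025                       # hypothetical p-value
print(format_p(p))              # the statistic: "p = .025"
print(is_significant(p, 0.05))  # one possible decision: True
print(is_significant(p, 0.01))  # another decision from the same p: False
```

The same 'p-value' leads to different decisions under different thresholds, which is why the statistic and the decision are kept in separate functions.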
(Video by KhanAcademy, embedded from YouTube on 31 March 2012)
Want to know more?