PEREZGONZALEZ Jose D (2013). Pseudoscience in Goode (2002). Knowledge (ISSN 2324-1624), 2013, pages 127-128.
Are pilots at risk of accidents due to fatigue?
In 2002, Goode carried out research to assess the potential effect of duty hours (as a proxy for fatigue) on human-factors-related accidents in commercial aviation. Although the study kept relatively well to a quasiscientific approach in its methods, it also made some unwarranted pseudoscientific statements as well as misleading ones. Both are summarized below.
- The title ("Are pilots at risk of accidents due to fatigue?") and parts of the abstract ("[…] there is likely to be a reduction in the risk of commercial aviation accidents due to pilot fatigue") suggested a correlation between fatigue and accidents in the study (and both the research problem and the discussion sustained an interpretive bias in favor of such a relationship). However, the study did not actually measure fatigue, only duty hours; thus, it misled the reader towards a correlation that is unwarranted given the research methods used.
- The author was ambivalent in his use of inferences. On the one hand, he stated that "identifying fatigue in the flight crew exposure data can be done only by inference" (page 312) while also posing the assumption that if results were not significant then "one could infer that pilot human factor accidents are not affected by work schedule parameters" (pages 309, 310, 311). On the other hand, when results turned out significant, he stopped inferring, being instead positive that "there is a discernible pattern of increased probability of an accident the greater the hours of duty time for pilots" (pages 311, 312). In reality, inference applies either way, not just when results are not significant.
- Furthermore, the assumption that if distributions were as expected under the null hypothesis, then "one could infer that pilot human factor accidents are not affected by work schedule parameters" (pages 309, 310, 311) was also incorrect: not achieving statistical significance is not proof in favor of the null hypothesis.
- The author stated that chi-square test results exceeding the 5% significance threshold were highly significant, and that because of that "there is a discernible pattern of increased probability of an accident the greater the hours of duty time for pilots" (pages 311, 312). Such a statement seemed to suggest two incorrect conclusions if based only on such statistical significance: that the results definitely showed that the human factors accidents sampled were affected by the work schedule parameters measured, and that the results had practical importance. Both are incorrect insofar as such decisions are not warranted by any mathematical result but are made by the researcher, taking into consideration not only the statistics but also the overall quality of the research methods.
- The graph used on page 312 was misleading. It plotted two sets of proportions using different scales. The percentage scale for exposure and accidents ran from 0% to 40%, and this was matched to a different scale showing relative accident proportion running from 0 to 6.
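The scale problem in the last point can be illustrated with a minimal Python sketch. It assumes that the relative accident proportion is the accident share divided by the exposure share in each duty-hour bin (this page does not define it), and it uses made-up bins and shares that are not Goode's data. The point is only that shares are percentages while the relative proportion is a unitless ratio, so drawing both against matched axes invites a false visual comparison.

```python
# Hypothetical duty-hour bins and shares, for illustration only (NOT Goode's data).
exposure_share = {"0-3 h": 0.30, "4-6 h": 0.40, "7-9 h": 0.20, "10+ h": 0.10}
accident_share = {"0-3 h": 0.25, "4-6 h": 0.35, "7-9 h": 0.20, "10+ h": 0.20}


def relative_proportion(accidents, exposure):
    """Accident share divided by exposure share per bin (assumed definition).

    A ratio of 1.0 means accidents occur in proportion to exposure.
    """
    return {bin_label: accidents[bin_label] / exposure[bin_label]
            for bin_label in exposure}


ratios = relative_proportion(accident_share, exposure_share)

for bin_label, ratio in ratios.items():
    # Shares are percentages of a whole; the ratio is unitless.
    print(f"{bin_label}: exposure {exposure_share[bin_label]:.0%}, "
          f"accidents {accident_share[bin_label]:.0%}, ratio {ratio:.2f}")
```

Reporting the shares and the ratio in separate columns (or separate panels) keeps a reader from treating the height of a ratio line as if it were a percentage.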
Stepping up to science
This study was exploratory and correlational in nature, and did not provide replication of data. Thus, it falls under the quasiscientific category. The following are steps that the author could have taken to resolve the above evidence of pseudoscience:
- The title, abstract, research problem and discussion should have been more specific to the variables measured by the study: contemporary duty hours (as a proxy for historical duty hours) and historical accidents. Any connection to fatigue in the study should have been clearly stated as a theoretical or reasonable one, not as an empirical one.
- Inferences work both ways: for non-significant results as well as for significant ones. Although the author was correct in using the philosophical frame of 'inferring' from results to conclusions instead of the one of 'proving' or 'disproving' hypotheses, such a philosophical frame needs to be extended to interpreting significant results as well.
- The correct interpretation for a lack of statistical significance is that results are not significant. This can be due to a myriad of reasons, chief among them methodological ones (such as a lack of research power). It can also be due to a lack of correlation between the research variables, from which it could be inferred that there was no correlation between those variables in the population at large.
- The correct interpretation for a show of statistical significance is that the results are significant. This is to be interpreted as follows: either a rare result occurred, one that happens less than 5% of the time when no association exists, or it can be inferred that there was a true association between duty time and aviation accidents. The latter is a logical inference made given the low probability of the results and the methodological constraints of the research.
- Importance (or practical significance) does not depend on statistical significance but on the effect size of the results and a reasonable interpretation of such effect size.
- The graphical representation of data should be clear and relevant, avoiding misleading representations of the results.
- This study would move into scientific territory if it were replicated elsewhere, preferably with samples other than American airlines.
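The points above about significance, research power and effect size can be sketched in Python. This is a minimal illustration under stated assumptions, not a reanalysis of Goode's data: the two duty-time bins, the exposure shares and the accident counts are hypothetical, and Cohen's w is used as one common effect size for chi-square tests (the study and this page do not name one). The p-value helper is valid only for one degree of freedom, which is all a two-bin comparison needs.

```python
import math


def chi_square_gof(observed, expected_share):
    """Chi-square goodness-of-fit statistic: observed counts vs expected shares."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p)
               for o, p in zip(observed, expected_share))


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def cohens_w(chi2, n):
    """Cohen's w effect size for a chi-square statistic on n observations."""
    return math.sqrt(chi2 / n)


# Hypothetical exposure shares for two bins: shorter vs longer duty time.
exposure_share = [0.8, 0.2]

# Hypothetical accident counts with the SAME proportions (70% / 30%)
# in a small sample and in a sample ten times larger.
small_sample = [21, 9]
large_sample = [210, 90]

chi2_small = chi_square_gof(small_sample, exposure_share)
chi2_large = chi_square_gof(large_sample, exposure_share)

p_small = p_value_df1(chi2_small)
p_large = p_value_df1(chi2_large)

w_small = cohens_w(chi2_small, sum(small_sample))
w_large = cohens_w(chi2_large, sum(large_sample))

print(f"small sample: chi2={chi2_small:.3f}, p={p_small:.4f}, w={w_small:.2f}")
print(f"large sample: chi2={chi2_large:.3f}, p={p_large:.6f}, w={w_large:.2f}")
```

With these made-up counts the smaller sample is not significant at the 5% level while the larger one is, yet the effect size is identical in both. Non-significance here reflects low power rather than the absence of an association, and significance alone says nothing about how large or important the association is.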
Want to know more?
- PubMed - Original article's abstract
- Access to the original article can be obtained via ScienceDirect.
- WikiofScience - Relationship between pilot duty hours and accidents
- This WikiofScience page offers a review of Goode's work, once the pseudoscience evidence is removed.
Jose D PEREZGONZALEZ (2013). Massey University, Turitea Campus, Private Bag 11-222, Palmerston North 4442, New Zealand. (JDPerezgonzalez).