
Software for error detection goes well beyond scrutiny. This vignette presents broadly similar packages and apps, with no claim to completeness.

Please contact me if you know about relevant software that isn’t listed here (email: jung-lukas@gmx.net).

  • For good reason, statcheck by Sacha Epskamp and Michèle Nuijten is the best-known error detection software. It recomputes p-values from reported test statistics, such as t or F, and checks whether the reported p-values are consistent with them. Even better, it operates on PDF files automatically, enabling users to scan large numbers of published articles. Steve Haroz built a simple edition of the statcheck web app.

  • James Heathers’ SPRITE algorithm reconstructs possible distributions of raw data from summary statistics. James also wrote a light introduction to SPRITE. For R users, it was implemented in rsprite2 by Lukas Wallrich, building on code by Nick Brown. Jordan Anaya developed a Python-based SPRITE app.

  • The R package validate by Mark P.J. van der Loo provides numerous tools for data checking.

  • The delta-F test for linearity, a.k.a. the “Förster test”, was implemented in Dale J. Barr’s R package forsterUVA.

  • Several R packages leverage the Benford distribution of naturally occurring numbers to assess whether reported numbers are, in fact, natural. These packages include:

    • benford.analysis by Carlos Cinelli contains various sophisticated tools for inspecting data using the Benford distribution.

    • jfa by Koen Derks offers a full statistical auditing suite (including Benford analysis).

  • Emerging from the Pruitt investigations, there is now R software for analyzing sequences:

    • The package twopointzerothree (by an anonymous developer) checks data for sequences of perfectly correlated numbers. These numbers are either duplicates of each other or they are duplicates offset by some constant amount; hence the name.

    • Similarly, the sequenceSniffer app by Anne Rutten detects repetitions in sequences.
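To make the idea behind p-value consistency checks (see the statcheck entry above) more concrete, here is a minimal Python sketch. It is not statcheck's code or API: it only covers a two-sided z test, and the function names and the rounding tolerance are assumptions made for illustration. Real tools also handle t, F, and other statistics, one-sided tests, and rounding of the test statistic itself.

```python
from statistics import NormalDist


def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a z statistic under the standard normal."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def p_is_consistent(z: float, reported_p: float, decimals: int = 3) -> bool:
    """Check a reported p-value against the one recomputed from z.

    Assumption: the reported p-value was rounded to `decimals` places,
    so any recomputed value within half a unit of the last decimal
    counts as consistent. Rounding of z itself is ignored here.
    """
    tolerance = 0.5 * 10 ** (-decimals)
    return abs(two_sided_p_from_z(z) - reported_p) <= tolerance


if __name__ == "__main__":
    # A reported "z = 1.96, p = .05" passes; "z = 1.96, p = .01" does not.
    print(p_is_consistent(1.96, 0.05))
    print(p_is_consistent(1.96, 0.01))
```

A flagged result only says that the reported numbers do not fit together under these assumptions. It does not say which of them is wrong.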
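The Benford-based packages listed above compare the leading digits of reported numbers with the frequencies that Benford's law predicts, namely log10(1 + 1/d) for a leading digit d. The following Python sketch shows that comparison with a plain chi-square statistic. It does not reproduce benford.analysis or jfa, and all function names are hypothetical.

```python
import math
from collections import Counter


def benford_expected(digit: int) -> float:
    """Expected proportion of leading digit `digit` (1-9) under Benford's law."""
    return math.log10(1 + 1 / digit)


def first_digit(x: float) -> int:
    """Leading nonzero digit of a nonzero number, read off scientific notation."""
    return int(f"{abs(x):.15e}"[0])


def benford_chi_square(values) -> float:
    """Chi-square statistic (8 degrees of freedom) of observed vs. expected
    first-digit counts. Zeros are skipped because they have no leading digit."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    n = len(digits)
    return sum(
        (counts.get(d, 0) - n * benford_expected(d)) ** 2 / (n * benford_expected(d))
        for d in range(1, 10)
    )


if __name__ == "__main__":
    powers_of_two = [2 ** k for k in range(1, 101)]  # known to follow Benford closely
    three_digit_numbers = list(range(100, 1000))     # every leading digit equally often
    print(benford_chi_square(powers_of_two))
    print(benford_chi_square(three_digit_numbers))
```

A large statistic is only a prompt for a closer look: many legitimate data sets, such as values confined to a narrow range, are not expected to follow Benford's law in the first place.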
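The sequence checks mentioned last look for stretches in which one series duplicates another, either exactly or shifted by a constant. A rough Python sketch of that idea follows. It is not the algorithm used by twopointzerothree or sequenceSniffer: the function name, the minimum run length, and the tolerance are all assumptions chosen for illustration, and it only compares two series position by position.

```python
def constant_offset_runs(a, b, min_length=3, tol=1e-9):
    """Find maximal runs of positions where b[i] - a[i] stays constant.

    Returns a list of (start, end, offset) tuples with `end` exclusive.
    An offset of 0 means the two series are exact duplicates in that run.
    """
    n = min(len(a), len(b))
    runs = []
    start = 0
    while start < n:
        offset = b[start] - a[start]
        end = start + 1
        # Extend the run while the difference matches the run's first offset.
        while end < n and abs((b[end] - a[end]) - offset) <= tol:
            end += 1
        if end - start >= min_length:
            runs.append((start, end, offset))
        start = end
    return runs


if __name__ == "__main__":
    series_a = [10, 13, 17, 12, 90, 44]
    series_b = [70, 15, 19, 14, 10, 3]
    # Positions 1 to 3 of series_b equal series_a plus 2.
    print(constant_offset_runs(series_a, series_b))
```

Short runs occur by chance in coarse data, so `min_length` should be set with the resolution and range of the measurements in mind.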