
Consistency tests

Check the numerical consistency of summary statistics. Three tests are currently implemented:

GRIM

Test reported means or percentages for numerical consistency with reported sample sizes

grim()
The GRIM test (granularity-related inconsistency of means)
grim_map()
GRIM-test many cases at once
grim_map_seq()
GRIM-testing with dispersed inputs
grim_map_total_n()
GRIM-testing with hypothetical group sizes
grim_plot()
Visualize GRIM test results
grim_probability() grim_ratio() grim_total()
Possible GRIM inconsistencies
grim_granularity() grim_items()
Granularity of non-continuous scales
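
The idea behind GRIM can be illustrated with a minimal, language-neutral sketch. It is written in Python and does not reproduce the package's R interface: the name `grim_consistent` and its parameters are hypothetical. It assumes integer-valued raw data and accepts a reported mean if rounding half up or half down from some integer sum reproduces it. The mean is passed as a string so that trailing zeros are kept.

```python
from decimal import Decimal, ROUND_FLOOR, ROUND_HALF_DOWN, ROUND_HALF_UP


def grim_consistent(mean: str, n: int, items: int = 1) -> bool:
    """Hypothetical helper: can `mean` arise from n * items integer values?"""
    reported = Decimal(mean)
    # Granularity of the reported mean, e.g. 0.01 for "5.19".
    quantum = Decimal(1).scaleb(reported.as_tuple().exponent)
    total = n * items
    # The true sum of integer data is an integer next to mean * total.
    lower_sum = int((reported * total).to_integral_value(rounding=ROUND_FLOOR))
    for candidate in (lower_sum, lower_sum + 1):
        exact_mean = Decimal(candidate) / Decimal(total)
        # Assumption: either direction of rounding at 5 is acceptable.
        for mode in (ROUND_HALF_UP, ROUND_HALF_DOWN):
            if exact_mean.quantize(quantum, rounding=mode) == reported:
                return True
    return False


print(grim_consistent("5.19", 40))  # False: 207/40 = 5.175 and 208/40 = 5.2
print(grim_consistent("5.18", 40))  # True: 207/40 = 5.175 rounds up to 5.18
```

Checking only the two integer sums nearest to the product of mean and sample size is enough here, because any sum that reproduces the reported mean must lie within half a rounding unit of it.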

GRIMMER

Test reported means and standard deviations for numerical consistency with reported sample sizes

grimmer()
The GRIMMER test (granularity-related inconsistency of means mapped to error repeats)
grimmer_map()
GRIMMER-test many cases at once
grimmer_map_seq()
GRIMMER-testing with dispersed inputs
grimmer_map_total_n()
GRIMMER-testing with hypothetical group sizes

DEBIT

Test reported means and standard deviations of binary data for numerical consistency with reported sample sizes

debit()
The DEBIT (descriptive binary) test
debit_map()
Apply DEBIT to many cases
debit_map_seq()
Using DEBIT with dispersed inputs
debit_map_total_n()
Use DEBIT with hypothetical group sizes
debit_plot()
Visualize DEBIT results
sd_binary_groups() sd_binary_0_n() sd_binary_1_n() sd_binary_mean_n()
Standard deviation of binary data
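
DEBIT rests on the fact that, for binary data, the sample standard deviation follows from the mean and the sample size alone. The sketch below shows that relationship in Python. The function names are hypothetical and are not the package's R functions, and the formula assumes the sample SD with an n - 1 denominator.

```python
import math
import statistics


def sd_binary_from_mean_n(mean: float, n: int) -> float:
    """Sample SD of n binary (0/1) values with the given mean."""
    return math.sqrt(n / (n - 1) * mean * (1 - mean))


def sd_binary_from_groups(zeros: int, ones: int) -> float:
    """Sample SD from the counts of 0s and 1s."""
    n = zeros + ones
    return sd_binary_from_mean_n(ones / n, n)


# Cross-check against the SD computed from the raw values.
data = [0] * 3 + [1] * 7
print(math.isclose(sd_binary_from_groups(3, 7), statistics.stdev(data)))  # True
```

A reported SD that cannot be reconciled with the value implied by the reported mean and sample size, after allowing for rounding of both, is what such a test would flag.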

Duplicate detection

Blunt functions to tentatively discover and count duplicate numeric values; interpret results with care

duplicate_count()
Count duplicate values
duplicate_count_colpair()
Count duplicate values by column
duplicate_tally()
Count duplicates at each observation
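
As a rough illustration of what counting duplicate values across a table involves, here is a small Python sketch. The name `count_values` and the list-of-rows input are assumptions made for this example; the package's own functions work on R data frames and may differ in output shape.

```python
from collections import Counter


def count_values(rows: list[list[str]]) -> list[tuple[str, int]]:
    """Frequency of each value across all cells, most frequent first."""
    counts = Counter(value for row in rows for value in row)
    return counts.most_common()


table = [
    ["4.22", "5.99"],
    ["4.22", "7.10"],
    ["5.99", "4.22"],
]
print(count_values(table))  # [('4.22', 3), ('5.99', 2), ('7.10', 1)]
```

Values are kept as strings so that "7.10" and "7.1" are not silently merged. A high count alone does not indicate a problem, which is why results need careful interpretation.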

Summarize scrutiny tests

Follow up on scrutiny’s testing functions by computing specific summary statistics

audit()
Summarize scrutiny objects
audit_seq() audit_total_n()
Summarize output of sequence mappers and total-n mappers

Developer tools

Helpers for implementing error detection techniques

Function factories

Easily create new mapper functions for consistency tests

function_map()
Create new *_map() functions
function_map_seq()
Create new *_map_seq() functions
function_map_total_n()
Create new *_map_total_n() functions
reverse_map_seq()
Reverse the *_map_seq() process
reverse_map_total_n()
Reverse the *_map_total_n() process

Rounding

Reconstruct rounding procedures

round_up_from() round_down_from() round_up() round_down()
Common rounding procedures
round_ceiling() round_floor() round_trunc() anti_trunc() round_anti_trunc()
Uncommon rounding procedures
reround()
General interface to reconstructing rounded numbers
reround_to_fraction() reround_to_fraction_level()
Generalized rounding to the nearest fraction of a specified denominator
unround()
Reconstruct rounding bounds
rounding_bias()
Compute rounding bias
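
The difference between rounding up and rounding down from 5, and the notion of rounding bounds, can be sketched as follows. This is an illustrative Python example with hypothetical names (`round_half`, `rounding_bounds`), not the package's R interface. It uses decimal arithmetic to sidestep binary floating-point surprises and is only meant for positive values.

```python
from decimal import Decimal, ROUND_HALF_DOWN, ROUND_HALF_UP


def round_half(value: str, digits: int, up: bool = True) -> Decimal:
    """Round at `digits` decimal places; a trailing 5 goes up or down."""
    quantum = Decimal(1).scaleb(-digits)
    mode = ROUND_HALF_UP if up else ROUND_HALF_DOWN
    return Decimal(value).quantize(quantum, rounding=mode)


def rounding_bounds(reported: str) -> tuple:
    """Interval of unrounded values that could yield `reported`."""
    x = Decimal(reported)
    half_unit = Decimal(1).scaleb(x.as_tuple().exponent) / 2
    return x - half_unit, x + half_unit


print(round_half("2.345", 2, up=True))   # 2.35
print(round_half("2.345", 2, up=False))  # 2.34
print(rounding_bounds("2.35"))           # (Decimal('2.345'), Decimal('2.355'))
```

Whether each bound is itself included depends on the rounding procedure assumed, which this sketch leaves open.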

Counting decimal places

decimal_places() decimal_places_scalar()
Count decimal places
decimal_places_df()
Count decimal places in a data frame
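
Counting decimal places is only reliable on strings, because numeric types drop trailing zeros. A minimal Python sketch of the idea, using the hypothetical name `count_decimal_places` and assuming a period as the decimal separator:

```python
def count_decimal_places(x: str) -> int:
    """Number of digits after the decimal point; trailing zeros count."""
    _, separator, fraction = x.partition(".")
    return len(fraction) if separator else 0


print(count_decimal_places("2.50"))     # 2
print(count_decimal_places("7"))        # 0
print(count_decimal_places(str(2.50)))  # 1, the trailing zero was lost
```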

Sequence generation

seq_endpoint() seq_distance() seq_endpoint_df() seq_distance_df()
Sequence generation at decimal level
seq_test_ranking()
Rank sequence test results
seq_disperse() seq_disperse_df()
Sequence generation with dispersion at decimal level
disperse() disperse2() disperse_total()
Vary hypothetical group sizes
seq_length() `seq_length<-`()
Set sequence length
is_seq_linear() is_seq_ascending() is_seq_descending() is_seq_dispersed()
Is a vector a certain kind of sequence?
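
Generating a sequence "at decimal level" means stepping in units of the last reported decimal place. The Python sketch below shows one way to disperse values around a reported number. The name `disperse_around`, the `steps` parameter, and the choice to leave out the reported value itself are all assumptions made for illustration.

```python
from decimal import Decimal


def disperse_around(reported: str, steps: int = 3) -> list[str]:
    """Neighbors of `reported`, spaced by one unit of its last decimal place."""
    center = Decimal(reported)
    unit = Decimal(1).scaleb(center.as_tuple().exponent)
    return [str(center + k * unit) for k in range(-steps, steps + 1) if k != 0]


print(disperse_around("5.19"))
# ['5.16', '5.17', '5.18', '5.20', '5.21', '5.22']
```

Returning strings keeps trailing zeros such as the one in "5.20", which matters when the neighbors are fed into a granularity-based test.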

Documentation templates

Return standardized building blocks for documenting specific kinds of functions

write_doc_audit()
Documentation template for audit()
write_doc_audit_seq()
Documentation template for audit_seq()
write_doc_audit_total_n()
Documentation template for audit_total_n()
write_doc_factory_map_conventions()
Documentation template for function factory conventions

Consistency test helpers

Call these helpers inside of your own functions that implement consistency tests at various levels

check_mapper_input_colnames()
Check that a mapper's input has correct column names
manage_helper_col()
Helper column operations
manage_key_colnames()
Enable name-independent key column identification
unnest_consistency_cols()
Unnest a test result column
audit_cols_minimal()
Compute minimal audit() summaries
check_audit_special()
Alert user if more specific audit_*() summaries are available

Data frame predicates

Test for mapper function output

is_map_df() is_map_basic_df() is_map_seq_df() is_map_total_n_df()
Is an object a consistency test output tibble?

Testing for subsets / supersets

Predicate functions for logical relations between vectors, supporting flexible data entry and tidy evaluation

Testing for numeric-like vectors

is_numeric_like()
Test whether a vector is numeric or coercible to numeric

Data wrangling

Automate formatting tasks with a special relevance to error detection

restore_zeros() restore_zeros_df()
Restore trailing zeros
split_by_parens()
Split columns by parentheses, brackets, braces, or similar
before_parens() inside_parens()
Extract substrings from before and inside parentheses
row_to_colnames()
Turn row values into column names
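
Restoring trailing zeros amounts to padding every number to a common number of decimal places and returning strings. A minimal Python sketch, assuming the target width is the largest number of decimal places found in a non-empty input; the name `pad_trailing_zeros` is hypothetical.

```python
from decimal import Decimal


def pad_trailing_zeros(values: list) -> list[str]:
    """Format numbers as strings with a shared number of decimal places."""
    places = [
        max(0, -Decimal(str(v)).normalize().as_tuple().exponent)
        for v in values
    ]
    width = max(places)
    return [f"{v:.{width}f}" for v in values]


print(pad_trailing_zeros([4, 6.9, 5.44]))  # ['4.00', '6.90', '5.44']
```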

Datasets

Small example datasets to demonstrate how grim_map() and debit_map() work

pigs1
Means and sample sizes for GRIM-testing
pigs2
Percentages and sample sizes for GRIM-testing
pigs3
Binary means and standard deviations for using DEBIT
pigs4
Data with duplications
pigs5
Means, SDs, and sample sizes for GRIMMER-testing

Superseded

duplicate_detect()
Detect duplicate values