function_map_seq() is the engine that powers functions such as grim_map_seq(). It creates new, "factory-made" functions that apply consistency tests such as GRIM or GRIMMER to sequences of specified variables. The sequences are centered around the reported values of those variables.
By default, sequences are generated only around inconsistent reported values, and only those sequences are tested. This provides an easy and powerful way to assess whether small errors in computing or reporting may be responsible for inconsistencies in published statistics.
For background and more examples, see the sequence mapper section of Consistency tests in depth.
Usage
function_map_seq(
  .fun,
  .var = Inf,
  .reported,
  .name_test,
  .name_key_result = "consistency",
  .name_class = NULL,
  .args_disabled = NULL,
  .dispersion = 1:5,
  .out_min = "auto",
  .out_max = NULL,
  .include_reported = FALSE,
  .include_consistent = FALSE,
  ...
)
Arguments
- .fun
Function such as grim_map(), or one made by function_map(): It will be used to test columns in a data frame for consistency. Test results are logical and need to be contained in a column called "consistency" that is added to the input data frame. This modified data frame is then returned by .fun.
- .var
String. Variables that will be dispersed by the manufactured function. Defaults to .reported.
- .reported
String. All variables the manufactured function can disperse in principle.
- .name_test
String (length 1). The name of the consistency test, such as "GRIM", to be optionally shown in a message when using the manufactured function.
- .name_key_result
(Experimental) Optionally, a single string that will be the name of the key result column in the output. Default is "consistency".
- .name_class
String. If specified, the tibbles returned by the manufactured function will inherit this string as an S3 class. Default is NULL, i.e., no extra class.
- .args_disabled
String. Optionally, names of the basic *_map() function's arguments. These arguments will throw an error if specified when calling the factory-made function.
- .dispersion
Numeric. Sequence with steps up and down from the reported values. It will be adjusted to these values' decimal level. For example, with a reported 8.34, the step size is 0.01. Default is 1:5, for five steps up and down.
- .out_min, .out_max
If specified when calling a factory-made function, output will be restricted so that it's not below .out_min or above .out_max. Defaults are "auto" for .out_min, i.e., a minimum of one decimal unit above zero; and NULL for .out_max, i.e., no maximum.
- .include_reported
Logical. Should the reported values themselves be included in the sequences originating from them? Default is FALSE because this might be redundant and bias the results.
- .include_consistent
Logical. Should the function also process consistent cases (from among those reported), not just inconsistent ones? Default is FALSE because the focus should be on clarifying inconsistencies.
- ...
These dots must be empty.
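The interplay of .dispersion, .out_min, and .include_reported can be illustrated with a minimal base R sketch. The helper disperse_sketch() below is hypothetical and is not part of any package; it only mimics the behavior described above (steps at the reported value's decimal level, and a lower bound of one decimal unit above zero as a rough stand-in for the "auto" minimum). The actual implementation may differ.

```r
# Hypothetical helper for illustration only; not the package's implementation.
# `reported` is a string so that trailing zeros (e.g., "8.30") are preserved.
disperse_sketch <- function(reported, dispersion = 1:5,
                            include_reported = FALSE) {
  # Count decimal places from the string form of the reported value
  parts <- strsplit(reported, ".", fixed = TRUE)[[1]]
  digits <- if (length(parts) == 2) nchar(parts[2]) else 0
  step <- 10^(-digits)  # e.g., 0.01 for "8.34"

  center <- as.numeric(reported)
  below <- center - rev(dispersion) * step
  above <- center + dispersion * step
  out <- if (include_reported) c(below, center, above) else c(below, above)

  # Rough stand-in for an "auto" minimum: drop anything below one decimal
  # unit above zero. Comparing against half a step avoids floating-point
  # trouble because all values are multiples of `step`.
  out <- out[out > step / 2]

  # Return strings with the original number of decimal places
  formatC(out, format = "f", digits = digits)
}

disperse_sketch("8.34")
disperse_sketch("0.03", dispersion = 1:3)
```

With a reported "8.34", the sketch steps from 8.29 to 8.39 in increments of 0.01 and leaves out 8.34 itself unless include_reported = TRUE. With "0.03", the values at or below zero are dropped.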
Value
A function such as those below. ("Testable statistics" are variables that can be selected via var, and are then varied. All variables except for those in parentheses are selected by default.)
Manufactured function | Testable statistics | Test vignette |
grim_map_seq() | "x", "n", ("items") | vignette("grim") |
grimmer_map_seq() | "x", "sd", "n", ("items") | vignette("grimmer") |
debit_map_seq() | "x", "sd", "n" | vignette("debit") |
The factory-made function will also have dots, ..., to pass arguments down to .fun, i.e., the basic mapper function such as grim_map().
Details
All arguments of function_map_seq() set the defaults for the arguments in the manufactured function. They can still be specified differently when calling the latter.
If functions created this way are exported from other packages, they should be written as if they were created with purrr adverbs; see explanations there, and examples in the export section of Consistency tests in depth.
This function is a so-called function factory: It produces other functions, such as grim_map_seq(). More specifically, it is a function operator because it also takes functions as inputs, such as grim_map(). See Wickham (2019, ch. 10-11).
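The general pattern of a function operator whose arguments become overridable defaults can be sketched in base R as follows. All names here (make_seq_tester(), is_even(), even_seq()) are hypothetical and chosen for illustration; this is a toy model of the idea, not the package's implementation.

```r
# Hypothetical function operator: takes a test function and returns a new
# function that applies the test to values around a starting point.
make_seq_tester <- function(.fun, .dispersion = 1:5) {
  # Evaluate the factory's arguments now so the closure captures them
  force(.fun)
  force(.dispersion)

  # `.dispersion` only sets the default; callers can still override it
  function(x, dispersion = .dispersion) {
    candidates <- c(x - rev(dispersion), x + dispersion)
    data.frame(candidate = candidates, consistency = .fun(candidates))
  }
}

# A toy "consistency test": is the value even?
is_even <- function(x) x %% 2 == 0

even_seq <- make_seq_tester(is_even, .dispersion = 1:2)

even_seq(7)                    # uses the default set in the factory call
even_seq(7, dispersion = 1:3)  # overrides that default
```

The manufactured function closes over .fun, so the test to apply is fixed at creation time, while the dispersion remains adjustable at call time.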
Conventions
The name of a function returned by function_map_seq() should mechanically follow from that of the input function. For example, grim_map_seq() derives from grim_map(). This pattern fits best if the input function itself is named after the test it performs on a data frame, followed by _map: grim_map() applies GRIM, grimmer_map() applies GRIMMER, etc.
Much the same is true for the classes of data frames returned by the manufactured function via the .name_class argument of function_map_seq(). It should be the function's own name preceded by the name of the package that contains it, or by an acronym of that package's name. Therefore, some existing classes are scr_grim_map_seq and scr_grimmer_map_seq.
References
Wickham, H. (2019). Advanced R (Second Edition). CRC Press/Taylor and Francis Group. https://adv-r.hadley.nz/index.html
Examples
# Function definition of `grim_map_seq()`:
grim_map_seq <- function_map_seq(
  .fun = grim_map,
  .reported = c("x", "n"),
  .name_test = "GRIM"
)
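A manufactured function of this kind might then be called as sketched below. This assumes that the scrutiny package is installed and that its example data frame pigs1, with "x" and "n" columns, is available. The var argument is the one described in the Value section.

```r
# Assumes the scrutiny package is installed; `pigs1` is assumed to be an
# example data frame shipped with it.
library(scrutiny)

# Disperse and test all variables selected by default:
out <- grim_map_seq(pigs1)

# Restrict dispersion to the "x" variable only:
out_x <- grim_map_seq(pigs1, var = "x")
```

Both calls return a data frame in which the key result column is named "consistency" by default.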