The moder package determines single or multiple modes (most frequent values). By default, its mode_
functions check whether missing values make this impossible, and return NA
in this case. They have no dependencies.
Mode functions fill a gap in measures of central tendency in R. mean()
and median()
are built into the standard library, but there is a lack of properly NA
-sensitive functions for calculating the mode. Use moder for this!
Get started
Find the first mode with mode_first()
Everything is fine here:
mode_first(c(7, 8, 8, 9, 9, 9))
#> [1] 9
But what if some values are missing? Maybe there are so many missings that it’s impossible to tell which value is the most frequent one. If both NA
s below are secretly 2
, then 2
is the (first) mode. Otherwise, 1
is. The mode is unclear, so the function returns NA
:
mode_first(c(1, 1, 2, NA, NA))
#> [1] NA
Ignore NA
s using na.rm = TRUE
if there is a strong rationale for it:
mode_first(c(1, 1, 2, NA, NA), na.rm = TRUE)
#> [1] 1
The next example is different. Even if the NA
stands in for 8
, there will only be three instances of 8
but four instances of 7
. The mode is 7
, independent of the true value behind NA
.
mode_first(c(7, 7, 7, 7, 8, 8, NA))
#> [1] 7
Find all modes with mode_all()
This function captures multiple modes:
If some values are missing but there would be multiple modes when ignoring NA
s, mode_all()
returns NA
. That’s because missings can easily create an imbalance between the equally-frequent known values:
If NA
masks either 1
or 2
, that number is the (single) mode. As before, if the mode depends on missing values, the function returns NA
.
Yet na.rm = TRUE
makes the function ignore this:
Find the single mode (or NA
) with mode_single()
mode_single()
is stricter than mode_first()
: It returns NA
by default if there are multiple modes. Otherwise, it works the same way.
mode_single(c(3, 4, 4, 5, 5, 5))
#> [1] 5
mode_single(c("x", "x", "y", "y", "z"))
#> [1] NA
Find possible modes
These minimal and maximal sets of modes are possible given the missing value:
mode_possible_min(c("a", "a", "a", "b", "b", "c", NA))
#> [1] "a"
mode_possible_max(c("a", "a", "a", "b", "b", "c", NA))
#> [1] "a" "b"
Acknowledgements
Ken Williams’ mode functions on Stack Overflow were pivotal to moder.