mode_is_trivial() checks whether all values in a given vector are equally frequent. In such cases, the mode carries little information.
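
For a vector without missing values, the check amounts to comparing frequency counts. The following is a minimal base-R sketch of that idea, not the package's implementation. The name all_equally_frequent() is hypothetical and used only for illustration, and the helper deliberately refuses vectors that contain NAs.

```r
# Hypothetical helper for illustration; not part of the package.
# Assumes `x` contains no missing values.
all_equally_frequent <- function(x) {
  stopifnot(!anyNA(x))
  counts <- as.vector(table(x))
  # Zero or one distinct count means that no value
  # is more frequent than any other.
  length(unique(counts)) <= 1
}

all_equally_frequent(c(1, 1, 2, 2))
#> [1] TRUE
all_equally_frequent(c(1, 1, 2))
#> [1] FALSE
```

Handling of missing values is what sets mode_is_trivial() apart from a simple check like this one.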

Usage

mode_is_trivial(x, na.rm = FALSE, max_unique = NULL)

Arguments

x

A vector to search for its modes.

na.rm

Logical. Should missing values in x be removed before computation proceeds? Default is FALSE.

max_unique

Numeric or string. If the maximum number of unique values in x is known, set max_unique to that number. This rules out the possibility that NAs represent additional values beyond that number (see examples). Set it to "known" instead if no values other than those already known can occur. Default is NULL, which assumes no maximum.

Value

Logical (length 1).

Details

The function returns TRUE whenever x has length < 3 because no value can then be more frequent than another. Otherwise, it returns NA in these cases:

  • Some x values are missing, all known values are equally frequent, and the number of missing values is divisible by the number of unique known values. Thus, the missings don't necessarily break the tie among known values, and it is unknown whether there is a value with a different frequency.

  • The NAs "fill up" the non-modal values exactly, i.e., without any NAs remaining, so that all known values might be modes.

  • Some NAs remain after "filling up" the non-modal values with NAs (so that they are hypothetically modes), and the number of remaining NAs is divisible by the number of unique known values.

  • There are so many missing values that they might form mode-sized groups of values that are not among the known values, and the number of NAs is divisible by the modal frequency so that all (partly hypothetical) values might be equally frequent. You can limit the number of such hypothetical values by specifying max_unique. The function might then return FALSE instead of NA.
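
The "filling up" arithmetic in the list above can be worked through in base R. This is a simplified sketch, not the package's implementation: fill_up_arithmetic() and the names of its list elements are hypothetical, it assumes at least one known value, and it ignores max_unique. The last check applies divisibility to the NAs left over after filling up, which is an assumed reading that agrees with the `c(7, 7, 7, 8, rep(NA, 5))` example on this page.

```r
# Hypothetical helper for illustration; not part of the package.
# Assumes `x` has at least one known (non-NA) value.
fill_up_arithmetic <- function(x) {
  known <- x[!is.na(x)]
  stopifnot(length(known) > 0)
  n_na <- sum(is.na(x))
  counts <- as.vector(table(known))
  modal_freq <- max(counts)
  # NAs needed to raise every non-modal value to the modal frequency:
  n_needed <- sum(modal_freq - counts)
  # NAs left over afterwards (negative if there are too few NAs):
  n_left <- n_na - n_needed
  list(
    n_needed = n_needed,
    n_left = n_left,
    # Could the leftover NAs be spread evenly across the known values?
    even_across_known = n_left >= 0 && n_left %% length(counts) == 0,
    # Could the leftover NAs form whole groups of unknown values,
    # each as large as the modal frequency? (Assumed reading.)
    even_as_new_values = n_left >= 0 && n_left %% modal_freq == 0
  )
}

res <- fill_up_arithmetic(c(7, 7, 7, 8, rep(NA, 5)))
```

Here, `res$n_needed` is 2 (two NAs lift `8` to the frequency of `7`) and `res$n_left` is 3. The three leftover NAs cannot be split evenly between `7` and `8`, so `res$even_across_known` is FALSE, but they could form one group of a third, unknown value, so `res$even_as_new_values` is TRUE.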

Examples

# The mode is trivial if
# all values are equal...
mode_is_trivial(c(1, 1, 1))
#> [1] TRUE

# ...and even if all unique
# values are equally frequent:
mode_is_trivial(c(1, 1, 2, 2))
#> [1] TRUE

# It's also trivial if
# all values are different:
mode_is_trivial(c(1, 2, 3))
#> [1] TRUE

# Here, the mode is nontrivial
# because `1` is more frequent than `2`:
mode_is_trivial(c(1, 1, 2))
#> [1] FALSE

# Two of the `NA`s might be `8`s, and
# the other three might represent a value
# different from both `7` and `8`. Thus,
# it's possible that all three distinct
# values are equally frequent:
mode_is_trivial(c(7, 7, 7, 8, rep(NA, 5)))
#> [1] NA

# The same is not true if all values,
# even the missing ones, must represent
# one of the known values:
mode_is_trivial(c(7, 7, 7, 8, rep(NA, 5)), max_unique = "known")
#> [1] FALSE