Compute the sample median.
By default, median2() works like the standard median() unless one or more values are missing: median() always returns NA in this case, but median2() checks if the median can be determined nevertheless.
Arguments
- x
Numeric or similar. Vector to search for its median.
- na.rm
Logical. If set to TRUE, missing values are removed before computation proceeds. Default is FALSE.
- na.rm.amount
Numeric. Alternative to na.rm that only removes a specified number of missing values. Default is 0.
- na.rm.from
String. If na.rm.amount is used, from which position in x should missing values be removed? Options are "first", "last", and "random". Default is "first".
- even
Character. What to do if x has an even length and contains no missing values (or they were removed). The default, "mean", averages the two central values, "low" returns the lower central value, and "high" returns the higher one.
- ...
Optional further arguments for methods. Not used in the default method.
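The partial removal described for na.rm.amount and na.rm.from can be pictured with a small base R sketch. The helper drop_some_na() is hypothetical and not part of the package; in particular, capping the amount at the number of missing values present is an assumption made here, not documented behavior.

```r
# Hypothetical helper (an illustration, not the package's code): drop up to
# `amount` missing values from `x`, counting from the position named by `from`.
drop_some_na <- function(x, amount = 0, from = c("first", "last", "random")) {
  from <- match.arg(from)
  na_pos <- which(is.na(x))
  # Assumption: never try to remove more NAs than are present
  amount <- min(amount, length(na_pos))
  if (amount == 0) {
    return(x)
  }
  drop <- switch(
    from,
    first  = na_pos[seq_len(amount)],
    last   = na_pos[length(na_pos) - amount + seq_len(amount)],
    random = na_pos[sample.int(length(na_pos), amount)]
  )
  x[-drop]
}

x <- c(NA, 0, 1, NA, 1, NA)
drop_some_na(x, amount = 1)                 # drops the NA at position 1
drop_some_na(x, amount = 2, from = "last")  # drops the NAs at positions 4 and 6
```

Any missing values that remain after such a step would still have to be handled by the median computation itself.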
Value
Length-1 vector of the same type as x. The only exception occurs if x is logical or integer and its length is even, in which case the return value is double.
The output is NA (of the same type as x) if and only if the median can't be determined because of missing values, or if there are no values.
Details
median2() is a generic function, so new methods can be defined for it. As with stats::median() from base R, the default method described here should work for most classes for which a median is a reasonable concept (e.g., "Date").
If a new method is necessary, please make sure it deals with missing values like the default method does. See "Implementing the algorithm" for further details.
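As a rough guide to what "dealing with missing values" means here, the following base R sketch shows one way to decide whether a median is determined despite NAs. It is not the package's implementation: the function name median_if_determinable() is hypothetical, it only covers numeric input, and it only mimics the default even = "mean" behavior. The idea is that the median is known if it comes out the same whether all missing values are smaller or all are larger than the known ones.

```r
# Hypothetical sketch, not the package's implementation: return the median of
# a numeric vector if it does not depend on the missing values, else NA.
median_if_determinable <- function(x) {
  n <- length(x)
  n_na <- sum(is.na(x))
  known <- sort(x)  # sort() drops missing values by default
  # Central positions in the fully sorted vector (identical if n is odd)
  lo <- floor((n + 1) / 2)
  hi <- ceiling((n + 1) / 2)
  if (n_na == 0 && n > 0) {
    return(mean(known[c(lo, hi)]))
  }
  # Known values that could end up at the center: from the case where all
  # NAs are smaller (`first`) to the case where all NAs are larger (`last`)
  first <- lo - n_na
  last <- hi
  if (n == 0 || first < 1 || last > length(known)) {
    return(NA_real_)
  }
  # The median is determined only if this whole range holds a single value
  if (known[first] == known[last]) known[first] else NA_real_
}

median_if_determinable(c(0, 1, 1, 1, NA))
#> [1] 1
median_if_determinable(c(0, 1, 1, 1, NA, NA))
#> [1] NA
```

A new method would presumably need an equivalent check for its own class, including a suitable typed NA as the fallback value.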
Examples
# If no values are missing,
# it works like `median()`:
median(1:4)
#> [1] 2.5
median2(1:4)
#> [1] 2.5
median(c(1:3, 100, 1000))
#> [1] 3
median2(c(1:3, 100, 1000))
#> [1] 3
# With some `NA`s, the median can
# sometimes still be determined...
median2(c(0, 1, 1, 1, NA))
#> [1] 1
median2(c(0, 0, NA, 0, 0, NA, NA))
#> [1] 0
# ...unless there are too many `NA`s...
median2(c(0, 1, 1, 1, NA, NA))
#> [1] NA
# ...or too many unique values:
median2(c(0, 1, 2, 3, NA))
#> [1] NA