horns() measures the dispersion in a sample of clamped observations based on the scale limits. It ranges from 0 to 1:
- 0 means no variation, i.e., all observations have the same value.
- 1 means that the observations are evenly split between the extremes, with none in between.
horns_uniform() computes the value that horns() would return for a uniform distribution within given scale limits. This can be useful as a point of reference for horns().
These two functions create the horns and horns_uniform columns in closure_generate().
horns_rescaled() is a version of horns() that is normalized by scale length, such that 0.5 always indicates a uniform distribution, independent of the number of scale points. It is meant to enable comparison across scales of different lengths, but it is harder to interpret for an individual scale. Even so, the range and the meaning of 0 and 1 are the same as for horns().
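To illustrate why the uniform reference value depends on the number of scale points, here is a short derivation based on the definition of \(h\) given under Details. It assumes \(k\) equally spaced scale points with spacing \(d\), each with relative frequency \(1/k\); the symbols \(d\) and \(h_{\text{uniform}}\) are introduced here for illustration only. The variance of such a discrete uniform distribution is \(d^2 (k^2 - 1) / 12\), and the scale range is \((k - 1) d\), so:
$$ h_{\text{uniform}} = \frac {\frac{1}{12} d^2 (k^2 - 1)} {\frac{1}{4} (k - 1)^2 d^2} = \frac {k + 1} {3 (k - 1)} $$
Under these assumptions, a 5-point scale gives \(6 / 12 = 0.5\), a 7-point scale gives \(8 / 18 \approx 0.444\), and the value approaches \(1/3\) as \(k\) grows. This is the dependence on scale length that a rescaled index would need to remove.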
Usage
horns(freqs, scale_min, scale_max)
horns_uniform(scale_min, scale_max)
horns_rescaled(freqs, scale_min, scale_max)
Arguments
- freqs
Numeric. Vector with the frequencies (relative or absolute) of binned observations; e.g., a vector with 5 elements for a 1-5 scale.
- scale_min, scale_max
Numeric (length 1 each). Minimum and maximum of the scale on which the values were measured. These can be lower and upper bounds (e.g., with a 1-5 Likert scale) or empirical min and max reported in an article. The latter should be preferred if available because they constrain the scale further.
Details
The horns index \(h\) is defined as:
$$ h = \frac {\sum_{i=1}^{k} f_i (s_i - \bar{s})^2} {\frac{1}{4} (s_{\max} - s_{\min})^2} $$
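where \(k\) is the number of scale points (i.e., the length of freqs here), \(f_i\) is the relative frequency of the \(i\)th scale point, \(s_i\); \(\bar{s}\) is the sample mean, \(s_{\max}\) is the upper bound of the scale, and \(s_{\min}\) is its lower bound. The numerator is the variance of the binned observations, and the denominator is the largest variance possible within the scale limits, which is reached when half of the observations lie at each extreme.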
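The formula above can be sketched directly in R. This is a minimal illustration, not the package's implementation: the function name horns_sketch() is hypothetical, and the code assumes that the scale points are equally spaced between scale_min and scale_max, with one point per element of freqs.

```r
# Hypothetical re-implementation of the formula for h; not the package's source.
horns_sketch <- function(freqs, scale_min, scale_max) {
  # Assumption: equally spaced scale points, one per element of `freqs`
  scale_points <- seq(scale_min, scale_max, length.out = length(freqs))
  # Turn absolute counts into relative frequencies
  # (relative frequencies that already sum to 1 are left unchanged)
  rel_freqs <- freqs / sum(freqs)
  # Sample mean of the binned observations
  scale_mean <- sum(rel_freqs * scale_points)
  # Numerator: variance of the binned observations
  variance <- sum(rel_freqs * (scale_points - scale_mean)^2)
  # Denominator: largest possible variance within the scale limits
  variance_max <- (scale_max - scale_min)^2 / 4
  variance / variance_max
}

horns_sketch(c(150, 0, 0, 0, 150), 1, 5)
#> [1] 1
horns_sketch(c(60, 60, 60, 60, 60), 1, 5)
#> [1] 0.5
```

Dividing by sum(freqs) is what lets the sketch accept either absolute or relative frequencies, in line with the description of the freqs argument.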
Its name was inspired by Heathers (2017a), which defines the "horns of no confidence" as a reconstructed sample "where an incorrect, impossible or unlikely value set has all its constituents stacked into its highest or lowest bins to try meet a ludicrously high SD". In its purest form, this is a case where horns() returns 1. However, note that the implications for the plausibility of any given set of summary statistics depend on the substantive context of the data (Heathers 2017b).
Examples
# For simplicity, all examples use a 1-5 scale and a total N of 300.
# ---- With all values at the extremes
horns(freqs = c(300, 0, 0, 0, 0), scale_min = 1, scale_max = 5)
#> [1] 0
horns(c(150, 0, 0, 0, 150), 1, 5)
#> [1] 1
horns(c(100, 0, 0, 0, 200), 1, 5)
#> [1] 0.8888889
# ---- With some values in between
horns(c(60, 60, 60, 60, 60), 1, 5)
#> [1] 0.5
horns(c(200, 50, 30, 20, 0), 1, 5)
#> [1] 0.2113889
horns(c(150, 100, 50, 0, 0), 1, 5)
#> [1] 0.1388889
horns(c(100, 40, 20, 40, 100), 1, 5)
#> [1] 0.7333333