A delay-discounting task asks a person to trade a smaller-sooner reward against a larger-later one across a range of delays. The indifference point (IP) at each delay is the smaller-sooner amount that feels equivalent to the later reward, expressed as a proportion of that later reward. By construction an IP lives on the closed interval \([0, 1]\):
Real titration data pile up at both ends, and the example data
shipped with this package (data(dd_ip)) reproduce the
pattern: a meaningful fraction of observations sit exactly at 0
or 1. That single fact rules out the two error distributions people
reach for first:
The Scale-Location-Truncated beta (SLT-beta) distribution (Kim, Koffarnus & Franck, 2024; Kim, Kaplan, Koffarnus & Franck, 2025) keeps the beta’s natural fit to proportion data but assigns finite probability to the endpoints, so 0 and 1 are modeled rather than thrown away.
Write the mean of the IP at delay \(D\) as a discounting function of a single rate \(k\), using either Mazur’s hyperbola \(\mu = 1/(1 + kD)\) or the exponential \(\mu = e^{-kD}\), and let \(\phi\) be a precision parameter (larger \(\phi\) gives a tighter spread around the curve). With shape parameters \(a = \mu\phi\) and \(b = (1-\mu)\phi\), and two fixed micro-constants \(s = 1{+}10^{-7}\) and \(l = 10^{-8}\), the log-density of an observed IP \(y \in [0,1]\) is
\[ \log f(y \mid \mu, \phi) = \underbrace{\log\Gamma(a{+}b) - \log\Gamma(a) - \log\Gamma(b)}_{-\log B(a,b)} + (a{-}1)\log\!\big(\tfrac{y}{s}{+}l\big) + (b{-}1)\log\!\big(1 - \big(\tfrac{y}{s}{+}l\big)\big) - \log s - \log Z, \]
where the truncation normalizer is \(Z = \mathrm{pbeta}(\tfrac{1}{s}{+}l,\,a,\,b) - \mathrm{pbeta}(l,\,a,\,b)\).
The intuition is a microscopic stretch: the data window \(y\in[0,1]\) is mapped just inside the open interval where the ordinary beta density is finite; the kernel is therefore never evaluated at exactly 0 or 1. As \(s\to 1\) and \(l\to 0\) the SLT-beta collapses to the ordinary beta.
slt_logpdf <- function(y, mu, phi, s = 1.0000001, l = 1e-8) {
a <- mu * phi; b <- (1 - mu) * phi
lgamma(a + b) - lgamma(a) - lgamma(b) +
(a - 1) * log(y / s + l) + (b - 1) * log(1 - (y / s + l)) -
log(s) - log(pbeta(1 / s + l, a, b) - pbeta(l, a, b))
}
yy <- seq(0, 1, length.out = 400)
op <- par(mfrow = c(1, 2), mar = c(4, 4, 2, 1))
for (p in list(c(mu = 0.5, phi = 4), c(mu = 0.15, phi = 6))) {
plot(yy, exp(slt_logpdf(yy, p["mu"], p["phi"])), type = "l", lwd = 2,
xlab = "indifference point", ylab = "density",
main = sprintf("mu = %.2f, phi = %g", p["mu"], p["phi"]))
lines(yy, dbeta(yy, p["mu"] * p["phi"], (1 - p["mu"]) * p["phi"]), lty = 2)
points(c(0, 1), exp(slt_logpdf(c(0, 1), p["mu"], p["phi"])), pch = 16)
}SLT-beta (solid) vs. ordinary beta (dashed). The SLT-beta carries finite density to y = 0 and y = 1; the ordinary beta diverges or vanishes there.
Because the variance of a beta is \(\mu(1-\mu)/(\phi+1)\), the SLT-beta automatically gives less noise near the floor and ceiling and more in the middle: the heteroscedastic pattern real IP data actually show, and exactly what a Gaussian model gets wrong.
The package ships dd_ip, a synthetic delay-discounting
dataset (100 subjects across six delays, in the id,
x (delay), y (indifference point) long format)
that reproduces the boundary pattern of real titration data. A handful
of synthetic values fall just above 1 and are clamped to the
ceiling.
data(dd_ip)
dd_ip$id <- factor(dd_ip$id)
dd_ip$y <- pmin(dd_ip$y, 1) # clamp a few synthetic values to the [0, 1] ceiling
c(observations = nrow(dd_ip),
subjects = nlevels(dd_ip$id),
at_floor_or_ceiling = sum(dd_ip$y %in% c(0, 1)))
#> observations subjects at_floor_or_ceiling
#> 600 100 129Many observations sit at exactly 0 and 1. Counting subjects with at least one boundary value, and asking which error models can even evaluate them:
beta_logpdf <- function(y, mu, phi) { # ordinary beta: -Inf at y in {0,1}
a <- mu * phi; b <- (1 - mu) * phi
lgamma(a + b) - lgamma(a) - lgamma(b) + (a - 1) * log(y) + (b - 1) * log(1 - y)
}
ids <- levels(dd_ip$id)
yof <- function(i) dd_ip$y[dd_ip$id == i]
bnd <- vapply(ids, function(i) any(yof(i) %in% c(0, 1)), logical(1))
beta_ok <- vapply(ids, function(i) all(is.finite(beta_logpdf(yof(i), 0.5, 8))), logical(1))
slt_ok <- vapply(ids, function(i) all(is.finite(slt_logpdf(yof(i), 0.5, 8))), logical(1))
data.frame(
subjects = length(ids),
with_boundary_IP = sum(bnd),
evaluable_under_ordinary_beta = sum(beta_ok),
evaluable_under_SLT_beta = sum(slt_ok)
)
#> subjects with_boundary_IP evaluable_under_ordinary_beta
#> 1 100 76 24
#> evaluable_under_SLT_beta
#> 1 100Every subject with a boundary value is lost to ordinary beta regression and retained by the SLT-beta. Fitting each subject individually (estimating \(\log k\) and \(\log\phi\) so that both stay positive) and comparing the recovered rate to nonlinear least squares confirms that the SLT-beta agrees with NLS on the bulk of subjects and succeeds on the boundary subjects that NLS-plus-beta could not model:
fit_slt <- function(y, D) { # per-subject SLT MLE (log k, log phi)
nll <- function(th) {
mu <- pmin(pmax(1 / (1 + exp(th[1]) * D), 1e-6), 1 - 1e-6)
v <- -sum(slt_logpdf(y, mu, exp(th[2]))); if (is.finite(v)) v else 1e10
}
o <- optim(c(log(0.01), log(8)), nll, control = list(reltol = 1e-11))
exp(o$par[1])
}
nls_k <- function(y, D) { # multi-start NLS (single start is fragile)
for (st in c(exp(-10), 1e-3, 1e-2, 1e-1)) {
m <- tryCatch(nls(y ~ 1 / (1 + k * D), start = list(k = st)), error = function(e) NULL)
if (!is.null(m)) return(coef(m)[["k"]])
}
NA_real_
}
xof <- function(i) dd_ip$x[dd_ip$id == i]
k_slt <- vapply(ids, function(i) fit_slt(yof(i), xof(i)), numeric(1))
k_nls <- vapply(ids, function(i) nls_k(yof(i), xof(i)), numeric(1))
sprintf("cor(log k): SLT vs NLS = %.3f | median k: SLT = %.4f, NLS = %.4f",
cor(log(k_slt), log(k_nls), use = "complete.obs"),
median(k_slt), median(k_nls, na.rm = TRUE))
#> [1] "cor(log k): SLT vs NLS = 0.978 | median k: SLT = 0.5328, NLS = 0.5038"A caveat the next section resolves: fit one subject at a time and the SLT-beta likelihood can occasionally run off to a degenerate solution. For a person whose indifference points are almost all 0s and 1s, a fit with near-zero precision that places all its mass on the endpoints can score a higher likelihood than any sensible discounting curve. The fix is not to fit people in isolation.
A mixed-effects model shares information across people. The subject
random intercept on \(\log k\) acts as
a prior that pulls each person’s estimate toward the population; the
degenerate per-subject solutions described above are thus regularized
away, which is one of the main reasons to prefer
fit_dd_tmb() over fitting each subject separately.
The following chunks require TMB (a compiled C++ template;
has_tmb = TRUE on this build):
fit <- fit_dd_tmb(
data = dd_ip, # columns: id, x (delay), y (indifference point)
equation = "mazur", # or "exponential"
family = "sltb", # default; "gaussian" reproduces the NLS-style fit
random_effects = k ~ 1 # random intercept on log k (fit internally on log scale)
)
#> Fitting TMB mixed-effects discounting model (mazur, sltb)...
#> Subjects: 100, Observations: 600
#> Multi-start: best NLL = -2245.78 (start set 1 of 3)
#> Converged (NLL = -2245.78). Done.
summary(fit)
#>
#> TMB Mixed-Effects Discounting Model Summary
#> ==================================================
#>
#> Call:
#> fit_dd_tmb(data = dd_ip, equation = "mazur", family = "sltb",
#> random_effects = k ~ 1)
#>
#> Equation: mazur
#> Family: sltb
#> Backend: TMB_mixed
#> Convergence: Yes
#> Subjects: 100 Observations: 600
#>
#> --- Fixed Effects (k) ---
#> term estimate std.error statistic p.value
#> k:(Intercept) 0.3655 0.042 -8.7651 <2e-16
#>
#> --- Variance Components ---
#> Component Estimate Scale
#> sigma_u (log10-k RE SD) 0.4635 log10
#> phi (precision) 8.9213 natural
#>
#> --- Fit Statistics ---
#> Log-likelihood: 2245.78
#> AIC: -4485.56 BIC: -4472.37
tidy(fit) # population log k, precision; back-transformed k
#> # A tibble: 3 × 8
#> term estimate std.error statistic p.value component estimate_scale
#> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
#> 1 k:(Intercept) 0.366 0.0420 -8.77 1.87e-18 fixed natural
#> 2 sigma_u (log1… 0.464 NA NA NA variance log10
#> 3 phi (precisio… 8.92 NA NA NA variance natural
#> # ℹ 1 more variable: term_display <chr>
ranef(fit) # per-subject discount rates
#> id u_i k
#> 1 P1 -0.38236419 0.243029667
#> 2 P10 0.89627476 0.951404751
#> 3 P100 0.11224820 0.412035020
#> 4 P11 0.04970521 0.385427478
#> 5 P12 -0.56628851 0.199711092
#> 6 P13 0.35044262 0.531309814
#> 7 P14 -0.68745308 0.175483795
#> 8 P15 0.37270213 0.544084198
#> 9 P16 0.35771995 0.535452814
#> 10 P17 -0.83495986 0.149920516
#> 11 P18 0.56670332 0.669259114
#> 12 P19 -0.72473678 0.168637581
#> 13 P2 0.14790837 0.428020142
#> 14 P20 0.57456253 0.674896830
#> 15 P21 0.57828145 0.677581081
#> 16 P22 -0.68844173 0.175298716
#> 17 P23 -1.12811081 0.109640892
#> 18 P24 0.59038612 0.686392174
#> 19 P25 0.50477070 0.626449068
#> 20 P26 -2.37264320 0.029045339
#> 21 P27 0.47191768 0.604862881
#> 22 P28 0.77873779 0.839231355
#> 23 P29 -0.13457312 0.316607698
#> 24 P3 0.09144984 0.402988977
#> 25 P30 -0.14907303 0.311745447
#> 26 P31 1.27264359 1.421768886
#> 27 P32 0.65210765 0.733133386
#> 28 P33 0.97853349 1.038713784
#> 29 P34 0.89342387 0.948514118
#> 30 P35 1.13486992 1.227339699
#> 31 P36 0.40341731 0.562216990
#> 32 P37 -0.27346274 0.272985737
#> 33 P38 1.05166878 1.123045505
#> 34 P39 -1.22037771 0.099357979
#> 35 P4 0.03818545 0.380717425
#> 36 P40 -0.09825501 0.329121712
#> 37 P41 0.36177434 0.537774984
#> 38 P42 0.35603167 0.534488804
#> 39 P43 -1.47320900 0.075858492
#> 40 P44 -2.59810502 0.022833099
#> 41 P45 0.93522195 0.991788504
#> 42 P46 -0.58244912 0.196295808
#> 43 P47 0.55310519 0.659615657
#> 44 P48 -1.44965354 0.077789899
#> 45 P49 -1.89016940 0.048609876
#> 46 P5 0.88540804 0.940433533
#> 47 P50 0.51567933 0.633785655
#> 48 P51 0.56939435 0.671184174
#> 49 P52 0.69095662 0.764172185
#> 50 P53 0.34820776 0.530043952
#> 51 P54 0.15132118 0.429582119
#> 52 P55 -2.57732994 0.023345062
#> 53 P56 -0.38855228 0.241429778
#> 54 P57 0.64995449 0.731450450
#> 55 P58 -0.58170763 0.196451224
#> 56 P59 0.51883862 0.635926432
#> 57 P6 -0.17586961 0.302955408
#> 58 P60 0.32237447 0.515628606
#> 59 P61 -0.04535667 0.348238885
#> 60 P62 0.32261638 0.515761759
#> 61 P63 0.17996621 0.442919135
#> 62 P64 0.27369445 0.489521361
#> 63 P65 0.33490735 0.522572500
#> 64 P66 -0.65291379 0.182073826
#> 65 P67 -1.34451945 0.087027706
#> 66 P68 -0.42210016 0.232937762
#> 67 P69 0.30424952 0.505749292
#> 68 P7 -0.18982200 0.298477197
#> 69 P70 -0.47443689 0.220282247
#> 70 P71 -3.62279144 0.007648515
#> 71 P72 1.01860420 1.084102760
#> 72 P73 0.39635476 0.557994809
#> 73 P74 -0.51052306 0.211959044
#> 74 P75 1.10465276 1.188386704
#> 75 P76 0.08854329 0.401740720
#> 76 P77 -0.48818524 0.217073358
#> 77 P78 0.51163496 0.631055651
#> 78 P79 -2.66282838 0.021308982
#> 79 P8 0.14380620 0.426150172
#> 80 P80 0.30080213 0.503891767
#> 81 P81 1.14401107 1.239373237
#> 82 P82 0.31114100 0.509483112
#> 83 P83 0.05790988 0.388817589
#> 84 P84 0.18431215 0.444978448
#> 85 P85 -1.12244962 0.110305400
#> 86 P86 0.92938712 0.985631023
#> 87 P87 0.09506885 0.404548634
#> 88 P88 0.31350286 0.510769104
#> 89 P89 -0.43901139 0.228770907
#> 90 P9 0.41635290 0.570033245
#> 91 P90 0.73021176 0.796870458
#> 92 P91 -0.13013197 0.318112066
#> 93 P92 0.71785036 0.786425648
#> 94 P93 0.40033465 0.560370178
#> 95 P94 1.04563497 1.115836112
#> 96 P95 -1.07720768 0.115762641
#> 97 P96 -1.87521275 0.049392110
#> 98 P97 0.54039037 0.650724370
#> 99 P98 0.73834755 0.803820409
#> 100 P99 0.36389045 0.538990993Group contrasts use the same estimated-marginal-means surface
computed on the log k scale and back-transformed, so a
between-group comparison reads as a ratio of discount
rates. The example below adds a synthetic two-group factor for
illustration:
set.seed(2)
grp <- data.frame(
id = levels(dd_ip$id),
group = factor(sample(c("A", "B"), nlevels(dd_ip$id), replace = TRUE))
)
dd_grp <- merge(dd_ip, grp, by = "id")
fit_grp <- fit_dd_tmb(dd_grp, equation = "mazur", factors = "group")
#> Fitting TMB mixed-effects discounting model (mazur, sltb)...
#> Subjects: 100, Observations: 600
#> Multi-start: best NLL = -2247.59 (start set 3 of 3)
#> Converged (NLL = -2247.59). Done.
get_dd_param_emms(fit_grp) # EMM of k per group
#> # A tibble: 2 × 6
#> level k k_log std.error conf.low conf.high
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 group=A 0.455 -0.788 0.161 0.332 0.623
#> 2 group=B 0.296 -1.22 0.157 0.217 0.403
get_dd_comparisons(fit_grp, compare_specs = ~group, adjust = "holm") # ratio-of-k contrasts
#> $k
#> $k$emmeans
#> # A tibble: 2 × 6
#> level k k_log std.error conf.low conf.high
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 group=A 0.455 -0.788 0.161 0.332 0.623
#> 2 group=B 0.296 -1.22 0.157 0.217 0.403
#>
#> $k$contrasts_log10
#> # A tibble: 1 × 8
#> contrast estimate std.error statistic df conf.low conf.high p.value
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 group=A - group… 0.187 0.0974 1.92 Inf -0.00408 0.378 0.0551
#>
#> $k$contrasts_ratio
#> # A tibble: 1 × 5
#> contrast ratio conf.low conf.high p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 group=A - group=B 1.54 0.991 2.39 0.0551
#>
#>
#> attr(,"class")
#> [1] "beezdiscounting_comparison"
#> attr(,"backend")
#> [1] "tmb"
#> attr(,"adjustment_method")
#> [1] "holm"
#> attr(,"compare_specs_used")
#> [1] "~group"
#> attr(,"contrast_type_used")
#> [1] "pairwise"
#> attr(,"contrast_by_used")
#> [1] "NULL"family = "sltb" (default) |
family = "gaussian" |
|
|---|---|---|
| Respects \([0,1]\) | yes (incl. boundary mass at 0 and 1) | no (can predict outside) |
| Variance | shrinks near bounds, grows mid-range | constant |
| Boundary data | modeled | only via the bounded mean |
Matches legacy fit_dd() |
— | yes (NLS-style) |
Use "sltb" for inference on real IP data; use
"gaussian" when continuity with a classic least-squares
discounting analysis is required or as a quick baseline.
The SLT-beta density was validated before release: it integrates to
one, has finite density at both boundaries, reproduces the published
moments, and recovers known discount rates from simulated data (rank
correlation 0.99 for Mazur, 0.95 for the exponential). On the bundled
example data, every boundary subject is fit successfully where ordinary
beta regression cannot be evaluated at all. The full Monte-Carlo study
lives in the package’s dev/ directory.