--- title: "Mixed-effects discounting with TMB" author: "Brent Kaplan" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Mixed-effects discounting with TMB} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} has_tmb <- requireNamespace("TMB", quietly = TRUE) knitr::opts_chunk$set(collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 4.2, eval = has_tmb) set.seed(1) library(beezdiscounting) ``` Fitting each subject's discounting curve in isolation ignores that subjects are samples from a population. The mixed-effects tier in `beezdiscounting` estimates a population discount rate and each subject's rate at the same time, borrowing strength across subjects so that noisy individuals are pulled toward the group. `fit_dd_tmb()` is the engine for that tier. For a practical guide to this style of multilevel analysis of discounting indifference points, see Young (2017). This vignette is a tour of the `fit_dd_tmb()` workflow and the methods that act on a fitted model. It pairs with two other vignettes: `vignette("sltb-discounting")` explains *why* indifference points need a bounded error distribution and develops the scale-location-truncated beta (SLT-beta) likelihood used below, and `vignette("dd-group-comparisons")` goes deeper on contrasting groups. Here the focus is the model object and what you can do with it. ## Why TMB The discount rate is estimated on the natural-log scale, where the log-k predictor is linear in the fixed effects, which makes each subject's rate `k_i = exp(population log-k + random deviation)`. Integrating the random deviations out of the likelihood has no closed form. Template Model Builder (TMB) handles it with a Laplace approximation and automatic differentiation: exact gradients, a fast compiled objective, and maximum-likelihood estimates in seconds rather than minutes. TMB must be installed for any of this to run. ## Some data `simulate_dd_ip()` generates indifference-point data from the same model `fit_dd_tmb()` fits, which makes it convenient for a reproducible walk-through. The built-in `dd_ip` dataset works exactly the same way. ```{r sim} sim <- simulate_dd_ip( n_subjects = 15, delays = c(7, 30, 90, 180, 365, 730), log_k_pop = log(0.01), sigma_u = 0.5, phi = 10, family = "sltb", equation = "mazur", seed = 1 ) head(sim) ``` The data are long format with columns `id`, `x` (delay), and `y` (indifference point on `[0, 1]`), the same layout used throughout the package. ## Fitting a model A one-group hyperbolic model with a random intercept on log-k and the SLT-beta likelihood: ```{r fit} fit <- fit_dd_tmb( sim, equation = "mazur", family = "sltb", random_effects = k ~ 1 ) fit ``` The `random_effects = k ~ 1` formula puts a subject random intercept on the discount rate. `family = "sltb"` selects the bounded error distribution; use `family = "gaussian"` for unbounded residuals. ## Inspecting the fit `summary()` reports the fixed effect (the population `k` on the natural scale), the variance components, and fit statistics: ```{r summary} summary(fit) ``` `tidy()` returns the same information as a data frame (one row per term, with the scale each estimate is reported on), which is handy for tables and for stacking models: ```{r tidy} tidy(fit) ``` `glance()` gives a one-row model summary for comparisons: ```{r glance} glance(fit) ``` ## Subject-level rates `ranef()` returns each subject's random deviation (`u_i`) and their fitted discount rate (`k`): ```{r ranef} head(ranef(fit)) ``` These are the shrinkage estimates: subjects with noisy or extreme data are pulled toward the population rate, more so when they have fewer or messier observations. ## Predictions `predict()` returns fitted indifference points. The `level` argument chooses between the population-average curve (random effects set to their mean) and subject-specific curves. Predict the population curve on a fine grid of delays and plot it over the raw points: ```{r predict-plot} grid <- data.frame(x = seq(1, 730, length.out = 100)) head(predict(fit, newdata = grid, level = "population")) # plot() draws the population curve over the observed points directly; type = # "individual" adds the per-subject (shrinkage) curves. plot(fit, type = "population") ``` For data that includes an `id` column, request both levels at once with `level = c("population", "subject")` to get the population prediction and each subject's prediction side by side. ## Diagnostics `augment()` attaches fitted values and residuals to the model frame, including standardized residuals that account for the SLT-beta's non-constant variance: ```{r augment} aug <- augment(fit) head(aug) plot(fit, type = "resid") ``` ## Choosing an equation `fit_dd_tmb()` fits four discounting functions: `"mazur"` and `"exponential"` (one-parameter), and `"green-myerson"` and `"rachlin"` (two-parameter hyperboloids that add a curvature exponent). Fit several and compare them with `glance()`: ```{r compare-eq} equations <- c("mazur", "exponential", "green-myerson") fits <- lapply(equations, function(eq) fit_dd_tmb(sim, equation = eq, family = "sltb", random_effects = k ~ 1)) do.call(rbind, lapply(fits, function(f) glance(f)[c("equation", "logLik", "AIC", "BIC")])) ``` Because these data were generated from a hyperbola, Mazur wins on AIC; the exponential fits worse, and the extra parameter in Green-Myerson does not earn its keep. With real data the ranking is an empirical question worth checking. ## Beyond one group and one random effect Add between-subject factors with the `factors` argument, then get per-group rates and contrasts. Simulate two groups and fit a model with a condition effect on log-k: ```{r groups} sim2 <- simulate_dd_ip( n_subjects = 30, delays = c(7, 30, 90, 180, 365, 730), log_k_pop = log(0.01), sigma_u = 0.6, phi = 12, family = "sltb", equation = "mazur", n_conditions = 2, delta_k = c(0, log(2)), seed = 42 ) fit2 <- fit_dd_tmb(sim2, equation = "mazur", family = "sltb", random_effects = k ~ 1, factors = "condition") get_dd_param_emms(fit2) ``` `get_dd_comparisons()` turns those marginal means into contrasts, reported both on the log10 scale and as multiplicative ratios with Holm-adjusted p-values: ```{r comparisons} cmp <- get_dd_comparisons(fit2, compare_specs = ~ condition, contrast_type = "pairwise", adjust = "holm") cmp$k$contrasts_ratio ``` See `vignette("dd-group-comparisons")` for the full treatment of group inference. The random-effects structure can grow too. With the SLT-beta family, `random_effects = k + phi ~ 1` adds a subject random effect on the precision, letting each subject have their own spread around the curve; the two-parameter equations support `k + s ~ 1` for a random curvature. The `covariance_structure` argument controls whether those two random effects are allowed to correlate. ## Practical notes - **TMB is required.** The fitting functions load a compiled C++ template; if TMB is not installed, the calls fail. - **Convergence.** `fit_dd_tmb()` runs a short multi-start by default and keeps the best fit that passes a sanity check. A returned model carries a `converged` flag; when standard errors cannot be computed, the uncertainty columns come back `NA`. - **Column names.** Whatever you call your columns via `id_var`, `x_var`, and `y_var`, the model frame and all method output use the canonical `id`, `x`, `y`. - **Factor coding.** Between-subject factors enter through `model.matrix()` with the session's contrasts (treatment coding by default). The group means and contrasts from `get_dd_param_emms()` and `get_dd_comparisons()` are estimated marginal means, so they do not depend on the contrast coding or on the choice of reference level. The raw fixed-effect coefficients are read relative to that reference level, so if you interpret them directly (rather than through `emmeans`), effect coding (`contr.sum`) and centering any continuous covariates make the main effects easier to read, especially with interactions (Young, 2017). ## Relationship to Young (2017) The model fit here is the one-stage nonlinear multilevel approach Young (2017) advocates for indifference-point data: the population and each subject are estimated together on the log-k scale, with a random intercept on log k, instead of fitting each subject in isolation and averaging the per-subject estimates. The two-parameter equations can extend this, via `random_effects = k + s ~ 1`, to a correlated random effect on the curvature, as Young describes for the hyperboloids. Two things differ from Young's treatment. First, the default error family is the bounded SLT-beta rather than normal residuals, so the non-constant variance of indifference points (and observations at exactly 0 or 1) is modeled instead of assumed away; `family = "gaussian"` recovers the normal-error model for a closer comparison. Second, estimation uses TMB's Laplace approximation with exact automatic differentiation in place of `nlme`. See `vignette("sltb-discounting")` for the error model and Young (2017) for the multilevel framework. ## See also - `vignette("sltb-discounting")` for the SLT-beta error model and why bounded indifference points need it. - `vignette("dd-group-comparisons")` for comparing discount rates between groups. - `vignette("delay-discounting-basics")` for single-subject fitting, `k`, and AUC. - `vignette("bayesian-discounting")` for the same models in a Bayesian framework. ## References - Young, M. E. (2017). Discounting: A practical guide to multilevel analysis of indifference data. *Journal of the Experimental Analysis of Behavior, 108*(1), 97-112.