The beezdemand package includes functionality for
fitting two-part mixed effects hurdle demand models
using Template Model Builder (TMB). This approach is particularly useful
when:
| Scenario | Recommended Approach |
|---|---|
| Few zeros, zeros are measurement artifacts | fit_demand_fixed() or
fit_demand_mixed() |
| Many zeros, zeros represent true non-consumption | fit_demand_hurdle() |
| Need to understand factors affecting whether someone consumes | fit_demand_hurdle() |
| Need individual-level estimates of “quitting price” | fit_demand_hurdle() |
The hurdle model has two parts:
\[\text{logit}(\pi_{ij}) = \beta_0 + \beta_1 \cdot \log(\text{price} + \epsilon) + a_i\]
Where:
With 3 random effects: \[\log(Q_{ij}) = (\log Q_0 + b_i) + k \cdot (\exp(-\alpha_i \cdot \text{price}) - 1) + \varepsilon_{ij}, \qquad \alpha_i = \exp(\log\alpha + c_i)\]
With 2 random effects: \[\log(Q_{ij}) = (\log Q_0 + b_i) + k \cdot (\exp(-\alpha \cdot \text{price}) - 1) + \varepsilon_{ij}, \qquad \alpha = \exp(\log\alpha)\]
Where:
The random effects follow a multivariate normal distribution:
Your data should be in long format with columns for:
library(beezdemand)
# Example data structure
knitr::kable(
head(apt, 10),
caption = "Example APT data structure (first 10 rows)"
)| id | x | y | group |
|---|---|---|---|
| 19 | 0.0 | 10 | a |
| 19 | 0.5 | 10 | a |
| 19 | 1.0 | 10 | a |
| 19 | 1.5 | 8 | a |
| 19 | 2.0 | 8 | a |
| 19 | 2.5 | 8 | a |
| 19 | 3.0 | 7 | a |
| 19 | 4.0 | 7 | a |
| 19 | 5.0 | 7 | a |
| 19 | 6.0 | 6 | a |
# View summary
summary(fit2)
#>
#> Two-Part Mixed Effects Hurdle Demand Model
#> ============================================
#>
#> Call:
#> fit_demand_hurdle(data = apt, y_var = "y", x_var = "x", id_var = "id",
#> random_effects = c("zeros", "q0"), verbose = 0)
#>
#> Convergence: Yes
#> Number of subjects: 10
#> Number of observations: 160
#> Random effects: 2 (zeros, q0)
#>
#> Fixed Effects:
#> --------------
#> Estimate Std. Error z value
#> beta0 -392.08424 146.77064 -2.671
#> beta1 135.79403 50.43489 2.692
#> log_q0 1.93813 0.11712 16.548
#> log_k 0.55505 0.09013 6.158
#> log_alpha -2.31478 0.16822 -13.760
#> logsigma_a 5.76791 1.48287 3.890
#> logsigma_b -1.04114 0.22836 -4.559
#> logsigma_e -1.63184 0.06063 -26.914
#> rho_ab_raw -0.19191 0.23754 -0.808
#>
#> Variance Components:
#> --------------------
#> Estimate Std. Error
#> alpha 0.0988 0.0166
#> k 1.7420 0.1570
#> var_a 102316.4496 303443.2535
#> var_b 0.1246 0.0569
#> cov_ab -21.4106 12.6866
#> var_e 0.0382 0.0046
#>
#> Correlations:
#> -------------
#> Estimate Std. Error
#> rho_ab -0.1896 0.229
#>
#> Model Fit:
#> ----------
#> Log-likelihood: 2.31
#> AIC: 13.38
#> BIC: 41.06
#>
#> Demand Metrics (Group-Level):
#> -----------------------------
#> Pmax (price at max expenditure): 20.0000
#> Omax (max expenditure): 30.9810
#> Q at Pmax: 1.5491
#> Elasticity at Pmax: -0.4772
#> Method: numerical_optimize_observed_domain
#>
#> Derived Parameters (Individual-Level Summary):
#> ----------------------------------------------
#> Q0 (Intensity):
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 3.622 5.496 7.585 7.365 8.544 11.623
#> Alpha:
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.09879 0.09879 0.09879 0.09879 0.09879 0.09879
#> Breakpoint:
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 7.591 13.549 16.183 17.986 21.796 34.419
#> Pmax:
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 20 20 20 20 20 20
#> Omax:
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 16.15 24.51 33.83 32.85 38.11 51.85
# Extract coefficients
coef(fit2)
#> beta0 beta1 logsigma_a logsigma_b logsigma_e
#> -392.08424414 135.79402761 5.76791287 -1.04113716 -1.63183781
#> rho_ab_raw Q0 alpha k
#> -0.19191230 6.94572447 0.09878809 1.74202549
# Standardized tidy summaries
tidy(fit2) |> head()
#> # A tibble: 6 × 9
#> term estimate std.error statistic p.value component estimate_scale
#> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
#> 1 beta0 -392. 147. -2.67 7.55e- 3 zero_probabi… logit
#> 2 beta1 136. 50.4 2.69 7.09e- 3 zero_probabi… logit
#> 3 Q0 6.95 0.813 16.5 1.65e-61 consumption natural
#> 4 k 1.74 0.157 6.16 7.34e-10 consumption natural
#> 5 alpha 0.0988 0.0166 -13.8 4.43e-43 consumption natural
#> 6 logsigma_a 5.77 1.48 3.89 1.00e- 4 variance natural
#> # ℹ 2 more variables: term_display <chr>, estimate_internal <dbl>
glance(fit2)
#> # A tibble: 1 × 9
#> model_class backend nobs n_subjects n_random_effects converged logLik AIC
#> <chr> <chr> <int> <int> <int> <lgl> <dbl> <dbl>
#> 1 beezdemand_h… TMB_hu… 160 10 2 TRUE 2.31 13.4
#> # ℹ 1 more variable: BIC <dbl>
# Get subject-specific parameters
head(get_subject_pars(fit2))
#> id a_i b_i Q0 alpha breakpoint Pmax Omax
#> 1 19 -88.44328 0.5148913 11.623368 0.09878809 34.419417 20 51.84539
#> 2 30 -21.33116 -0.6512299 3.621529 0.09878809 20.997058 20 16.15363
#> 3 38 40.35748 -0.2348862 5.491712 0.09878809 13.330758 20 24.49548
#> 4 60 31.73720 0.1663613 8.202899 0.09878809 14.204504 20 36.58857
#> 5 68 31.19351 0.4228981 10.601806 0.09878809 14.261495 20 47.28877
#> 6 106 116.82464 -0.2317221 5.509116 0.09878809 7.590564 20 24.57311
#> Pmax_unconditional Omax_unconditional
#> 1 19.999999 51.84539
#> 2 19.999999 16.13182
#> 3 12.765578 20.06101
#> 4 13.605392 30.70281
#> 5 13.660199 39.74372
#> 6 7.270466 16.35863Use check_demand_model() and the plotting helpers for
quick post-fit checks.
check_demand_model(fit2)
#>
#> Model Diagnostics
#> ==================================================
#> Model class: beezdemand_hurdle
#>
#> Convergence:
#> Status: Converged
#>
#> Random Effects:
#>
#> Residuals:
#> Mean: -0.008684
#> SD: 0.9478
#> Range: [-3.129, 2.549]
#> Outliers: 2 observations
#>
#> --------------------------------------------------
#> Issues Detected (1):
#> 1. Detected 2 potential outliers (|resid| > 3)
#>
#> Recommendations:
#> - Investigate outlying observations
plot_residuals(fit2)$fitted
#> NULL
plot_qq(fit2)| Parameter | Interpretation |
|---|---|
beta0 |
Part I intercept: baseline log-odds of zero consumption |
beta1 |
Part I slope: change in log-odds per unit increase in log(price) |
logQ0 |
Log of intensity parameter (population average) |
k |
Scaling parameter for demand decay |
alpha |
Population elasticity \(\exp(\log\alpha)\); subject values are \(\exp(\log\alpha + c_i)\) in the 3-RE model |
The subject_pars data frame contains:
| Parameter | Description |
|---|---|
a_i |
Random effect for Part I (zeros probability) |
b_i |
Random effect for Part II (intensity) |
c_i |
Random effect for alpha (3-RE model only) |
Q0 |
Individual intensity: \(\exp(\log Q_0 + b_i)\) |
alpha |
Individual elasticity: \(\exp(\log\alpha + c_i)\) (or \(\exp(\log\alpha)\) for 2-RE) |
breakpoint |
Price where P(zero) = 0.5: \(\exp(-(\beta_0 + a_i) / \beta_1) - \epsilon\) |
Pmax |
Price at maximum expenditure |
Omax |
Maximum expenditure |
# Compare nested models
compare_hurdle_models(fit3, fit2)
# Unified model comparison (AIC/BIC + LRT when appropriate)
compare_models(fit3, fit2)
# Output:
# Likelihood Ratio Test
# =====================
# Model n_RE LogLik df AIC BIC
# Full (3 RE) 3 -1234.56 12 2493.12 2543.21
# Reduced (2 RE) 2 -1245.78 9 2509.56 2547.89
#
# LR statistic: 22.44
# df: 3
# p-value: 5.2e-05A significant p-value suggests the 3-RE model provides a better fit.
A hurdle model carries two different “predicted consumption” quantities, and choosing the wrong one quietly distorts any score computed against observed data:
type = "demand" (the default since beezdemand 0.3.0) is
the marginal expectation (1 - P0) * E[Y | Y > 0].
Observed consumption includes zeros, so this is the quantity to score
against raw data (cross-validation error, calibration checks, or model
comparison on predictions).type = "response" is the conditional positive mean
E[Y | Y > 0]: expected consumption given that any is
purchased. Scoring raw data with it systematically overstates
prediction error wherever the probability of zero consumption is large
(typically at high prices).# Marginal expectation (the default): score this against observed y
pred <- predict(fit2, type = "demand", prices = c(0.5, 1, 5, 10))
pred[, c("x", "prob_zero", "predicted_consumption", ".fitted")]
#> # A tibble: 40 × 4
#> x prob_zero predicted_consumption .fitted
#> <dbl> <dbl> <dbl> <dbl>
#> 1 0.5 3.54e-250 10.9 10.9
#> 2 0.5 4.96e-221 3.39 3.39
#> 3 0.5 3.07e-194 5.15 5.15
#> 4 0.5 5.53e-198 7.69 7.69
#> 5 0.5 3.21e-198 9.94 9.94
#> 6 0.5 4.97e-161 5.16 5.16
#> 7 0.5 3.35e-228 8.11 8.11
#> 8 0.5 5.98e-224 7.66 7.66
#> 9 0.5 2.75e-212 6.56 6.56
#> 10 0.5 7.71e-184 4.48 4.48
#> # ℹ 30 more rowsThe prob_zero and predicted_consumption
columns expose the two parts; .fitted is their product
(1 - prob_zero) * predicted_consumption. See the
Scoring predictions section of
?predict.beezdemand_hurdle for details.
Population demand curve from 2-RE hurdle model.
Probability of zero consumption as a function of price.
# Distribution of individual parameters
plot(fit2, type = "parameters")
plot(fit2, type = "parameters", parameters = c("Q0", "alpha", "Pmax"))
# Individual demand curves
plot(fit2, type = "individual", subjects = c("1", "2", "3", "4"))
# Single subject with observed data
plot_subject(fit2, subject_id = "1", show_data = TRUE, show_population = TRUE)The simulate_hurdle_data() function generates data from
the hurdle model:
# Simulate with default parameters
sim_data <- simulate_hurdle_data(
n_subjects = 100,
seed = 123
)
head(sim_data)
# id x y delta a_i b_i
# 1 1 0.00 12.345678 0 -0.523456 0.1234567
# 2 1 0.50 11.234567 0 -0.523456 0.1234567
# ...
# Custom parameters
sim_custom <- simulate_hurdle_data(
n_subjects = 100,
logQ0 = log(15), # Q0 = 15
alpha = 0.1, # Lower elasticity
k = 3, # Higher k (ensures Pmax exists)
stop_at_zero = FALSE, # All prices for all subjects
seed = 456
)The run_hurdle_monte_carlo() function assesses model
performance through simulation.
Note: Monte Carlo simulations are computationally intensive and not run during vignette building. The example below shows typical usage and expected output format.
# Run Monte Carlo study
mc_results <- run_hurdle_monte_carlo(
n_sim = 100, # Number of simulations
n_subjects = 100, # Subjects per simulation
n_random_effects = 2, # 2-RE model
verbose = TRUE,
seed = 123
)
# View summary
print_mc_summary(mc_results)
# Monte Carlo Simulation Summary
# ==============================
#
# Simulations: 100 attempted, 95 converged (95.0%)
#
# Parameter True Mean_Est Bias Rel_Bias% Emp_SE Mean_SE SE_Ratio Coverage_95% N
# beta0 -2.0 -2.01 -0.01 0.5 0.12 0.11 0.92 94.7 95
# beta1 1.0 1.02 0.02 2.0 0.08 0.08 1.00 95.8 95
# logQ0 2.3 2.29 -0.01 -0.4 0.05 0.05 1.00 94.7 95
# k 2.0 2.03 0.03 1.5 0.15 0.14 0.93 93.7 95
# alpha 0.5 0.51 0.01 2.0 0.04 0.04 1.00 95.8 95
# ...| Metric | Target | Interpretation |
|---|---|---|
| Bias | ~0 | Estimates should be unbiased |
| Relative Bias | < 5% | Acceptable bias relative to true value |
| SE Ratio | ~1.0 | Model SEs match empirical variability |
| Coverage 95% | ~95% | CIs should contain true value 95% of time |
# Fit hurdle model
hurdle_fit <- fit_demand_hurdle(data,
y_var = "y", x_var = "x", id_var = "id",
random_effects = c("zeros", "q0"), verbose = 0
)
# Extract subject parameters
hurdle_pars <- get_subject_pars(hurdle_fit)
# These can be merged with other analyses
# e.g., correlate with individual characteristics
cor(hurdle_pars$Q0, subject_characteristics$age)
cor(hurdle_pars$breakpoint, subject_characteristics$dependence_score)The hurdle model is implemented using Template Model Builder (TMB), which provides:
fit <- fit_demand_hurdle(
data,
y_var = "y",
x_var = "x",
id_var = "id",
epsilon = 0.001, # Constant for log(price + epsilon)
tmb_control = list(
max_iter = 200, # Maximum iterations
eval_max = 1000, # Maximum function evaluations
trace = 0 # Optimization trace level
),
verbose = 1 # 0 = silent, 1 = progress, 2 = detailed
)For difficult optimization problems:
Zhao, T., Luo, X., Chu, H., Le, C.T., Epstein, L.H. and Thomas, J.L. (2016), A two-part mixed effects model for cigarette purchase task data. Jrnl Exper Analysis Behavior, 106: 242-253. https://doi.org/10.1002/jeab.228
Hursh, S. R., & Silberberg, A. (2008). Economic demand and essential value. Psychological Review, 115(1), 186-198.
vignette("beezdemand") – Getting started with
beezdemandvignette("model-selection") – Choosing the right model
classvignette("fixed-demand") – Fixed-effect demand
modelingvignette("mixed-demand") – Mixed-effects nonlinear
demand modelsvignette("mixed-demand-advanced") – Advanced
mixed-effects topicsvignette("tmb-mixed-effects") – TMB mixed-effects
demand models (continuous-only)vignette("cross-price-models") – Cross-price demand
analysisvignette("group-comparisons") – Group comparisonsvignette("migration-guide") – Migrating from
FitCurves()