This article explains how to include covariates in your Hierarchical Bayes (HB) MaxDiff analysis to improve accuracy. Advances in computing have made it simple to include complex respondent-specific covariates in HB MaxDiff models. There are a few reasons why we may want to do this in practice:
- A standard model that assumes each respondent's part-worth (utility) is drawn from the same normal distribution may be too simplistic. Information drawn from additional covariates may improve the estimates of the part-worths. This is likely to be the case for surveys with fewer questions and therefore less information.
- Additionally, when respondents are segmented, we may be worried that the estimates for one segment are biased. Another concern is that HB may shrink the segment means overly close to each other. This is especially problematic if sample sizes vary greatly between segments.
In this example, we'll take a MaxDiff analysis that does not use covariates and include them to produce a more accurate model.
Requirements
- Familiarity with MaxDiff analysis. Check out our Introduction to MaxDiff article.
- A Hierarchical Bayes MaxDiff analysis output. See How to Use Hierarchical Bayes for MaxDiff for more info on what is required to prepare this. The raw data used in the example below is located here and the design file here. Our data set asked 315 Americans ten questions about the attributes they look for in a U.S. president. Each question asked the respondents to pick their most and least important attributes from a set of five. MDVersion is the Version variable. MDmost and MDleast are the best and worst variables.
Please note these steps require a Displayr license.
Method
- Select your MaxDiff model on the page.
- In the object inspector, under Data > Model > Covariates: select your covariate variable. Ours is called 2016 Voting:
- This calculation is going to take about 10 minutes or so (it is doing a lot!). The current MaxDiff output will be updated to show the results adjusted by the covariates.
- Check that the algorithm has both converged to, and adequately sampled from, the posterior distribution. See this post for a detailed overview.
You can see that the in-sample accuracy of the model increased slightly, from 86.5% to 88.2%, once the covariates were included. The cost was about 5 minutes of additional computation time, because adding the covariates makes the model more complex.
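To make the accuracy figure more concrete, here is a minimal Python sketch of one common way to score in-sample accuracy: the share of questions in which the alternative with the highest estimated part-worth is the one the respondent actually picked as most important. This definition is an assumption for illustration and may differ from how Displayr computes the number it reports. The function name `in_sample_accuracy` and the array layouts are hypothetical.

```python
import numpy as np


def in_sample_accuracy(beta, shown, chosen_best):
    """Share of questions where the top-utility shown alternative was picked as best.

    beta:        (n_respondents, n_alternatives) estimated part-worths
    shown:       (n_respondents, n_questions, n_shown) indices of alternatives shown
    chosen_best: (n_respondents, n_questions) index of the alternative picked as best
    """
    n_respondents = beta.shape[0]
    # Look up each respondent's part-worths for the alternatives they were shown.
    rows = np.arange(n_respondents)[:, None, None]
    utilities = beta[rows, shown]
    # Position (within each question) of the alternative with the highest part-worth.
    top_position = utilities.argmax(axis=2)
    predicted = np.take_along_axis(shown, top_position[..., None], axis=2)[..., 0]
    return float((predicted == chosen_best).mean())


# Tiny made-up example: one respondent, four alternatives, two questions.
example_beta = np.array([[3.0, 2.0, 1.0, 0.0]])
example_shown = np.array([[[0, 1, 2], [1, 2, 3]]])
example_chosen = np.array([[0, 3]])
example_accuracy = in_sample_accuracy(example_beta, example_shown, example_chosen)
```

In the tiny example the first question is predicted correctly and the second is not. Because the same responses are used both to fit and to score the model, a figure computed this way tends to flatter the model compared with a holdout check.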
Technical Notes
In the usual HB model, we model the part-worths for the i-th respondent as β_i ~ N(μ, Σ). Note that the mean and covariance parameters μ and Σ do not depend on i and are the same for each respondent in the population. The simplest way to include respondent-specific covariates in the model is to modify μ to be dependent on the respondent's covariates.
We do this by modifying the model for the part-worths to β_i ~ N(Θx_i, Σ), where x_i is a vector of known covariate values for the i-th respondent and Θ is a matrix of unknown regression coefficients. Each row of Θ is given a multivariate normal prior. The covariance matrix, Σ, is re-expressed into two parts: a correlation matrix and a vector of scales, and each part receives its own prior distribution.
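The structure of this model can be sketched by simulating from it. The Python snippet below is an illustration only, not Displayr's implementation: it draws part-worths from N(Θx_i, Σ) and builds Σ from a correlation matrix and a vector of scales. The number of alternatives, the number of covariate columns, the dummy coding of the covariate, and the distributions used to generate Θ, the scales, and the correlation matrix are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

n_respondents = 315   # sample size of the example data set
n_alternatives = 10   # hypothetical number of attributes
n_covariates = 3      # hypothetical: intercept plus two dummy-coded groups

# x_i: one row per respondent (intercept column plus group dummies).
group = rng.integers(0, n_covariates, size=n_respondents)
X = np.zeros((n_respondents, n_covariates))
X[:, 0] = 1.0
for g in range(1, n_covariates):
    X[group == g, g] = 1.0

# Theta: regression coefficients, one row per alternative.
# Drawn from a standard normal here purely as a stand-in for its prior.
Theta = rng.normal(0.0, 1.0, size=(n_alternatives, n_covariates))

# Sigma is rebuilt from a vector of scales and a correlation matrix:
# Sigma = diag(scales) @ correlation @ diag(scales)
scales = rng.uniform(0.5, 1.5, size=n_alternatives)
A = rng.normal(size=(n_alternatives, n_alternatives))
raw_cov = A @ A.T + n_alternatives * np.eye(n_alternatives)  # positive definite
d = np.sqrt(np.diag(raw_cov))
correlation = raw_cov / np.outer(d, d)
Sigma = np.diag(scales) @ correlation @ np.diag(scales)

# Respondent-specific means Theta x_i, then beta_i ~ N(Theta x_i, Sigma).
means = X @ Theta.T
chol = np.linalg.cholesky(Sigma)
beta = means + rng.normal(size=(n_respondents, n_alternatives)) @ chol.T
```

Respondents with the same covariate values share the same mean Θx_i, so each group is shrunk toward its own mean rather than toward a single population-wide μ.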
Displayr uses the No-U-Turn Sampler (NUTS) from Stan, the state-of-the-art software for fitting Bayesian models. This package allows us to quickly and efficiently estimate our model without having to worry about selecting the tuning parameters that are frequently a major hassle in Bayesian computation and machine learning. We fit the models using 1,000 iterations and eight Markov chains. The package also provides a number of features for visualizing the results and diagnosing any issues with the model fit.
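One widely used convergence diagnostic is the R-hat statistic, which compares the variation between chains with the variation within them. The sketch below implements the classic split R-hat formula in Python with NumPy. It is meant to show the idea and is not necessarily the exact variant that Stan or Displayr reports. The function name `split_r_hat`, the simulated draws, and the 500 retained draws per chain are assumptions for the example.

```python
import numpy as np


def split_r_hat(draws):
    """Classic split R-hat for one parameter.

    draws: array of shape (n_chains, n_iterations) of posterior draws.
    """
    n_chains, n_iter = draws.shape
    half = n_iter // 2
    # Split each chain in two so that drift within a chain is also detected.
    halves = np.concatenate([draws[:, :half], draws[:, half:2 * half]], axis=0)
    n = halves.shape[1]
    between = n * halves.mean(axis=1).var(ddof=1)   # between-chain variance B
    within = halves.var(axis=1, ddof=1).mean()      # within-chain variance W
    var_plus = (n - 1) / n * within + between / n   # pooled variance estimate
    return float(np.sqrt(var_plus / within))


rng = np.random.default_rng(0)

# Eight chains that all sample the same distribution.
well_mixed = rng.normal(size=(8, 500))
# Eight chains stuck around different values, which signals non-convergence.
stuck = well_mixed + np.arange(8)[:, None]

r_hat_good = split_r_hat(well_mixed)
r_hat_bad = split_r_hat(stuck)
```

Values close to 1 suggest the chains agree with each other, while values noticeably above 1 (a common rule of thumb is above 1.1) suggest the sampler should be run for longer.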
Next
How to Create a Prediction-Accuracy Table
How to Save Respondent-Level Preference Shares from a MaxDiff Latent Class Analysis (or Hierarchical Bayes)
How to Perform an Advanced Analysis of Experimental Data (MaxDiff)