How to Use Qualtrics MaxDiff Data in Displayr

Displayr can analyze MaxDiff variables from a variety of survey platforms, but depending on how they have been set up, you may need to reformat them first. One such example is the output from Qualtrics' dedicated MaxDiff module.

For example, suppose respondents are shown several lists of companies and, for each list, choose the company they like the most (best) and the one they like the least (worst).

Displayr's version of MaxDiff analysis requires separate Best and Worst variables for each question or task, and that the alternative labels (company names) are the category options.

Typically, this is programmed in Qualtrics (and similar platforms) as a series of matrix questions with piped alternatives, whereby the variables are by company name for each question or task, and the category labels represent the Best or Worst selections.
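The difference between the two layouts can be sketched in plain R. The data frame, column names, and "Best"/"Worst" labels below are hypothetical toy values, not taken from a real Qualtrics export, and the helper function pick is an illustrative assumption rather than part of Displayr.

```r
# Hypothetical toy data: one MaxDiff task in the Qualtrics-style layout.
# Columns are the alternatives; values are the Best/Worst selections.
task1 <- data.frame(
  Apple     = c("Best", NA, "Worst"),
  Microsoft = c(NA, "Best", NA),
  IBM       = c("Worst", "Worst", "Best"),
  stringsAsFactors = FALSE
)

# Layout Displayr needs: one Best and one Worst variable per task,
# with the alternative names as the categories.
pick <- function(row, label) {
  hit <- names(row)[!is.na(row) & row == label]
  if (length(hit) == 1) hit else NA_character_
}
best1  <- apply(task1, 1, pick, label = "Best")
worst1 <- apply(task1, 1, pick, label = "Worst")
reshaped <- data.frame(Best1 = best1, Worst1 = worst1)
print(reshaped)
```

The rest of this article performs the same reshaping across every task, using Displayr variable sets instead of a hand-built data frame.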

Requirements

A data set with MaxDiff variable sets where the alternatives are the variables and the Best and Worst selections are the category labels, as per the above example.

These instructions cater to a non-randomized single version design (respondents see the same alternatives for each question) and a randomized multiple version design (where respondents were shown random alternatives for each question).
Make sure your MaxDiff selection variables, and any variables that store the attributes shown (when your design is randomized), have each been set up as Nominal - Multi variable sets, one for each task or iteration in the design. For example, if your design asks each respondent to evaluate a set of 4 alternatives 6 times, your project should contain 6 Nominal - Multi variable sets of 4 variables each.
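As a rough illustration of that structure, the sketch below builds hypothetical stand-ins for the variable sets (empty data frames, with "Google" added as a made-up fourth alternative) and checks that there are 6 sets of 4 variables each. In Displayr you would check this in the Data Sources tree instead.

```r
# Hypothetical stand-ins for the Nominal - Multi variable sets:
# 6 tasks, each holding 4 alternatives.
make.task <- function() {
  as.data.frame(matrix(NA_character_, nrow = 3, ncol = 4,
                       dimnames = list(NULL, c("Apple", "Microsoft", "IBM", "Google"))))
}
tasks <- replicate(6, make.task(), simplify = FALSE)

# One set per task, and the same number of variables in every set.
n.tasks <- length(tasks)
alts.per.task <- vapply(tasks, ncol, integer(1))
stopifnot(n.tasks == 6, all(alts.per.task == 4))
```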
Note: Do not import the MaxDiff Design exported from Qualtrics. These steps build the design from the respondent data and are incompatible with designs exported from Qualtrics.

Method

Part 1 - Tidy the data

Relabeling the variable sets first makes the later steps easier. Let's say you have 6 MaxDiff variable sets, Q14 to Q19, which represent each question or iteration of the MaxDiff survey question.

In the Data Sources tree, select the first variable set for the Most/Least choices for the first MaxDiff question. In our example, that's Q14.
In the Object Inspector, select General > General > Label and change the label to MD1.
Repeat the two steps above for each of Q15 to Q19 (the Most/Least choices for the other questions), labeling them MD2 to MD6.
You should end up with six variable sets labeled MD1 to MD6.
If you have a randomized design where not all respondents were shown the same alternatives for each question, you will also have an additional set of variables that store the alternatives (e.g., Apple, Microsoft, IBM) shown to each respondent for each question. These are usually denoted as Question.Option_MAXDIFF and found directly below vers_MAXDIFF (which stores the design's version number).
These should similarly be combined into Nominal - Multi variable sets for each design "question" or task, and labeled MD1A to MD6A to help with later steps. Each set needs to have the options in order 1-X and should have the same 1-X labels across all sets. So, in the above example, all the variables starting with "1." (of which there are 5) should be combined as a set, and all those starting with "2." as another set, and so on. Note that if these have instead been imported as text variables, you will first need to select all of them and change Data > Properties > Structure to Nominal.
If not all the respondents in the file were given the MaxDiff questions, then the MaxDiff version variable (usually labeled vers_MAXDIFF) will have a value of 0 for those who were not shown the MaxDiff questions. This needs to be treated as missing data, so you will need to click on the variable > Missing Values and set the 0 code to Exclude from analyses.
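The effect of excluding the 0 code can be illustrated in plain R. The version values below are hypothetical. In Displayr the exclusion is done through the Missing Values setting, not with code like this.

```r
# Hypothetical version values; 0 marks respondents who were not shown
# the MaxDiff questions.
vers <- c(1, 2, 0, 3, 0, 2)

# Treat version 0 as missing, as the Missing Values setting does.
vers.clean <- ifelse(vers == 0, NA, vers)

# Only respondents with a real version remain available for the design.
kept <- vers.clean[!is.na(vers.clean)]
print(kept)
```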

Part 2 - Add the Design

If you have an experimental design, you can paste this into your document by using Ctrl + v (or Cmd + v on a Mac). Otherwise, you can use R code to create a design based on your data. In this example, we will do the latter.

Go to Calculation > Custom Code from the toolbar.
Click on the page to insert the custom output.
Enter a name under General > General > Name. In this example, we will change the name to design.
Enter your R code in the code editor. For non-randomized single version designs, use the code below, updating the line marked #UPDATE to match your study:

#UPDATE - ensure all of/only your MaxDiff most/least variable sets are within the list()
questions = list(MD1, MD2, MD3, MD4, MD5, MD6)
n.questions = length(questions)
question.attributes = lapply(questions, colnames)
alts.per.question = length(question.attributes[[1]])
attributes = unique(unlist(question.attributes))

design = matrix(NA, nrow = n.questions, ncol = alts.per.question + 2)
colnames(design) = c('Version', 'Question', paste0('Option.', 1:alts.per.question))
design[, 'Version'] = rep(1, nrow(design))
design[, 'Question'] = 1:n.questions
for (q in 1:n.questions) {
    for (a in 1:alts.per.question) {
        design[q, a + 2] = which(attributes == question.attributes[[q]][a])
    }
}

design
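To see what this produces, here is a small self-contained sketch that runs the same logic on hypothetical stand-ins for the variable sets: three tasks, each showing 3 of 4 made-up alternatives. It uses match() in place of the inner loop, which should give the same indices. The output name toy.design is an assumption for illustration only.

```r
# Hypothetical stand-ins for MD1..MD3 (empty data frames; only the
# column names matter for building the design).
blank <- function(alts) {
  as.data.frame(matrix(NA_character_, nrow = 2, ncol = length(alts),
                       dimnames = list(NULL, alts)))
}
questions <- list(blank(c("Apple", "Microsoft", "IBM")),
                  blank(c("Microsoft", "IBM", "Google")),
                  blank(c("Apple", "IBM", "Google")))

n.questions <- length(questions)
question.attributes <- lapply(questions, colnames)
alts.per.question <- length(question.attributes[[1]])
attributes <- unique(unlist(question.attributes))

toy.design <- matrix(NA, nrow = n.questions, ncol = alts.per.question + 2)
colnames(toy.design) <- c("Version", "Question", paste0("Option.", 1:alts.per.question))
toy.design[, "Version"] <- 1
toy.design[, "Question"] <- 1:n.questions
for (q in 1:n.questions) {
  # Map each label shown in task q to its index in the master attribute list.
  toy.design[q, -(1:2)] <- match(question.attributes[[q]], attributes)
}
print(toy.design)
```

Each row is one task, and the Option columns hold the position of each alternative in the combined attribute list.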

For randomized multiple version designs, use the code below instead, updating the two lines marked #UPDATE to match your study:

#UPDATE - ensure all of/only your MaxDiff alternative variable set names are within the list()
pipe.questions = list(MD1A, MD2A, MD3A, MD4A, MD5A, MD6A)
#UPDATE - ensure your MaxDiff version variable name is correct
version = vers_MAXDIFF

#get basics of design
n.questions = length(pipe.questions)
n.options = NCOL(pipe.questions[[1]])

#stack alternatives on top of one another
alts = do.call(rbind, pipe.questions)
#add in the version number and get unique combinations
des = unique(data.frame("Version" = rep(version, n.questions),
                        "Question" = rep(1:n.questions, each = length(version)),
                        alts))
#remove blank versions and order by version
des = des[!is.na(des$Version), ]
des = des[order(des$Version), ]
#make the data numeric to get alternative number
des = data.frame(lapply(des, as.numeric))
#rename the columns for each option
colnames(des)[3:NCOL(des)] = paste0("Option.", 1:n.options)
#return final table named design
design = des
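The sketch below runs the same steps on hypothetical toy data (2 tasks, 2 options, 4 respondents, one of whom has no version) so you can see the shape of the result. All values are invented. Because the alternatives are converted to numbers through their factor codes, the sketch assumes every variable shares the same level order, in line with the requirement that all sets use the same 1-X labels.

```r
# Hypothetical stand-ins for MD1A and MD2A: what each respondent was shown.
levs <- c("Apple", "Microsoft", "IBM")
f <- function(x) factor(x, levels = levs)
pipe.questions <- list(
  data.frame(a = f(c("Apple", "IBM", "Apple", "IBM")),
             b = f(c("IBM", "Microsoft", "IBM", "Microsoft"))),
  data.frame(a = f(c("Microsoft", "Apple", "Microsoft", "Apple")),
             b = f(c("Apple", "IBM", "Apple", "IBM")))
)
# Hypothetical version variable; NA marks a respondent with no version.
version <- c(1, 2, 1, NA)

n.questions <- length(pipe.questions)
n.options <- NCOL(pipe.questions[[1]])

# Stack the tasks, attach version and task number, keep unique rows.
alts <- do.call(rbind, pipe.questions)
des <- unique(data.frame(Version = rep(version, n.questions),
                         Question = rep(1:n.questions, each = length(version)),
                         alts))
des <- des[!is.na(des$Version), ]
des <- des[order(des$Version), ]
# Factor codes become the alternative numbers.
des <- data.frame(lapply(des, as.numeric))
colnames(des)[3:NCOL(des)] <- paste0("Option.", 1:n.options)
print(des)
```

The result has one row per version and task, which is the layout the design table needs.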

Part 3 - Create the best/worst variables

Hover over a variable in the Data Sources tree, and select + > Custom Code > R > Numeric - Multi to create a variable set.
Type the number of variables you want to create in the How Many Variables Do You Want to Create? box. In this example, we will create 12 new variables (6 best variables and 6 worst variables).
Click OK.

Each of the variables in this variable set shares the same R template code, which is used to create empty variables.
The next step is to update the template code. In the Data Sources tree, click on Variable 1 in the New R Variable Set. The current code can be found in the Object Inspector under Data > R Code > Edit Code.

For a non-randomized single version design, insert the code below on line 17, just before the line of code that says new.data.

Note that you will need to make the following edits to the code:
- Update the second line to list the names of all of your MaxDiff variable sets.
- Update the fourth and fifth lines with the labels of the best and worst categories in your MaxDiff variable sets (capitalization counts!).

#Update with the names of the MaxDiff variable sets
questions = list(MD1, MD2, MD3, MD4, MD5, MD6)
#Update with the labels of the best and worst categories in the MaxDiff variable sets
bestlabel = "Most important"
worstlabel = "Least important"

#strip HTML tags from labels; [^>]* stops at the first '>' so text between tags is kept
removeHTML <- function(htmlString) {
    return(gsub('<[^>]*>', '', htmlString))
}

n.questions = length(questions)
question.attributes = lapply(questions, colnames)
alts.per.question = length(question.attributes[[1]])
attributes = removeHTML(unique(unlist(question.attributes)))
md.list = list()

get.best.worst = function (question, attributes, best.label = bestlabel, worst.label = worstlabel) {
    #return the label selected in a row, or NA if nothing was selected
    option.selected.in.row = function (v) {
        option = v[v != 'FALSE']
        if (length(option) == 0) {
            return(NA)
        } else {
            return(option)
        }
    }

    q = question
    #repeat the (HTML-stripped) alternative labels on every row so they match the factor levels
    q.labels = matrix(removeHTML(colnames(q)), nrow = nrow(q), ncol = ncol(q), byrow = TRUE)
    q.best = q == best.label & !is.na(q)
    q.best[q.best] = q.labels[q.best]
    q.best = apply(q.best, 1, option.selected.in.row)
    q.worst = q == worst.label & !is.na(q)
    q.worst[q.worst] = q.labels[q.worst]
    q.worst = apply(q.worst, 1, option.selected.in.row)
    q.best = factor(q.best, levels = attributes)
    q.worst = factor(q.worst, levels = attributes)
    return(data.frame(q.best, q.worst))
}

for (i in 1:n.questions) {
    md = get.best.worst(questions[[i]], attributes = attributes)
    md.list[[i]] = md
}

#combine all tasks and number the columns: q.best1, q.worst1, q.best2, ...
x = do.call(cbind, md.list)
colnames(x) = paste0(colnames(x), rep(1:n.questions, each = 2))

new.data = x
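The core idea of this step, for a single task, can be sketched on hypothetical toy data. The data frame md1 and the helper chosen below are illustrative assumptions and a simplified stand-in for the code above, not a replacement for it.

```r
# Hypothetical stand-in for one MaxDiff variable set (MD1), 3 respondents.
md1 <- data.frame(
  Apple     = c("Most important", NA, "Least important"),
  Microsoft = c("Least important", "Most important", NA),
  IBM       = c(NA, "Least important", "Most important"),
  stringsAsFactors = FALSE
)
attributes <- colnames(md1)

# For each respondent, find the column whose value equals the given label.
chosen <- function(q, label) {
  idx <- apply(q == label & !is.na(q), 1,
               function(r) if (any(r)) which(r)[1] else NA)
  factor(colnames(q)[idx], levels = attributes)
}
toy <- data.frame(q.best1 = chosen(md1, "Most important"),
                  q.worst1 = chosen(md1, "Least important"))
print(toy)
```

Each respondent ends up with one best and one worst value per task, and the alternative names become the categories.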

For a randomized multiple version design, you will need to use and edit the code below instead:
- Update the second line to list out the names of all of your MaxDiff variable sets, and the fourth line to list out the names of all the variable sets that store the attributes that were shown.
- Update the sixth and seventh lines with the labels of the best and worst categories in your MaxDiff variable sets (capitalization counts!).

#Update with the names of the MaxDiff variable sets
questions = list(MD1, MD2, MD3, MD4, MD5, MD6)
#Update with the names of the variable sets that store what attributes were shown
pipe.questions = list(MD1A, MD2A, MD3A, MD4A, MD5A, MD6A)
#Update with the labels of the best and worst categories in the MaxDiff variable sets
bestlabel = "Most important"
worstlabel = "Least important"

#strip HTML tags from labels; [^>]* stops at the first '>' so text between tags is kept
removeHTML <- function(htmlString) {
    return(gsub('<[^>]*>', '', htmlString))
}

n.questions = length(questions)
question.attributes = lapply(questions, colnames)
alts.per.question = length(question.attributes[[1]])
#collect every alternative label that appears in any of the "shown" variable sets
attributes = removeHTML(Reduce(union, lapply(pipe.questions, function(x) Reduce(union, lapply(x, function(y) levels(y))))))
md.list = list()

#pipe.question is the single variable set of shown attributes for this task
get.best.worst = function (question, pipe.question, attributes, best.label = bestlabel, worst.label = worstlabel) {
    #return the label selected in a row, or NA if nothing was selected
    option.selected.in.row = function (v) {
        option = v[v != 'FALSE']
        if (length(option) == 0) {
            return(NA)
        } else {
            return(option)
        }
    }

    q = question
    #labels are what each respondent was shown, HTML-stripped to match the factor levels
    q.labels = as.matrix(pipe.question)
    q.labels[] = removeHTML(q.labels)
    q.best = q == best.label & !is.na(q)
    q.best[q.best] = q.labels[q.best]
    q.best = apply(q.best, 1, option.selected.in.row)
    q.worst = q == worst.label & !is.na(q)
    q.worst[q.worst] = q.labels[q.worst]
    q.worst = apply(q.worst, 1, option.selected.in.row)
    q.best = factor(q.best, levels = attributes)
    q.worst = factor(q.worst, levels = attributes)
    return(data.frame(q.best, q.worst))
}

for (i in 1:n.questions) {
    md = get.best.worst(questions[[i]], pipe.questions[[i]], attributes = attributes)
    md.list[[i]] = md
}

#combine all tasks and number the columns: q.best1, q.worst1, q.best2, ...
x = do.call(cbind, md.list)
colnames(x) = paste0(colnames(x), rep(1:n.questions, each = 2))

new.data = x
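The difference from the non-randomized case is that the label for each selection comes from what the respondent was shown, not from the column name. The sketch below illustrates that on hypothetical toy data for one task. It assumes every respondent made exactly one best and one worst choice, so it is simpler than the code above, and the names md1, md1a, o1, and o2 are invented for illustration.

```r
# Hypothetical toy data for one task with 2 options and 2 respondents.
levs <- c("Apple", "Microsoft", "IBM")
# Selections, by option position.
md1 <- data.frame(o1 = c("Most important", "Least important"),
                  o2 = c("Least important", "Most important"),
                  stringsAsFactors = FALSE)
# Alternatives shown in each option position.
md1a <- data.frame(o1 = factor(c("Apple", "IBM"), levels = levs),
                   o2 = factor(c("IBM", "Microsoft"), levels = levs))

shown <- as.matrix(md1a)
best.mask <- as.matrix(md1) == "Most important"
worst.mask <- as.matrix(md1) == "Least important"

# Transpose before indexing so the picks come out in respondent order.
best <- factor(t(shown)[t(best.mask)], levels = levs)
worst <- factor(t(shown)[t(worst.mask)], levels = levs)
print(data.frame(q.best1 = best, q.worst1 = worst))
```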

9. The Data Sources tree should now look like this:

10. Click on New R Variable Set in the Data Sources tree.

11. In the Object Inspector, change the structure to Nominal - Multi Grid with unordered categories.

You can drag the new variable set onto the page to see a summary of the choices, like below, to confirm.
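If you prefer a quick numeric check, a simple best-minus-worst count is one way to sanity-check the choices. The sketch below uses hypothetical toy output for 2 tasks and 3 respondents. The column names follow the q.best1, q.worst1 pattern created above, but the values are invented.

```r
# Hypothetical output of Part 3 for 2 tasks and 3 respondents.
levs <- c("Apple", "Microsoft", "IBM")
x <- data.frame(
  q.best1  = factor(c("Apple", "Apple", "IBM"), levels = levs),
  q.worst1 = factor(c("IBM", "Microsoft", "Apple"), levels = levs),
  q.best2  = factor(c("Microsoft", "Apple", "Apple"), levels = levs),
  q.worst2 = factor(c("IBM", "IBM", "Microsoft"), levels = levs)
)

# Count how often each alternative was picked as best and as worst.
is.best <- grepl("best", colnames(x))
best.counts  <- table(factor(unlist(lapply(x[is.best], as.character)), levels = levs))
worst.counts <- table(factor(unlist(lapply(x[!is.best], as.character)), levels = levs))

# No respondent should pick the same alternative as best and worst in a task.
clash <- as.character(x$q.best1) == as.character(x$q.worst1)

print(best.counts - worst.counts)
```

Counts like these are only a rough check that the variables were built correctly. They are not a substitute for the MaxDiff analysis itself.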

The questions are now ready to be analyzed in Displayr.