Displayr can analyze MaxDiff variables from a variety of survey platforms, but depending on how they have been set up, you may need to reformat them first. One such example is the output from Qualtrics' dedicated MaxDiff module.
For example, let us suppose you have respondents choose which company they like the most (best) and which the least (worst) from several alternative lists of companies.
- Displayr's version of MaxDiff analysis requires separate Best and Worst variables for each question or task, with the alternative labels (company names) as the categories.
- Typically, this is programmed in Qualtrics (and similar platforms) as a series of matrix questions with piped alternatives: each question or task has one variable per company name, and the category labels record the Best or Worst selection.
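To make the two layouts concrete, here is a minimal sketch in plain R for a single task. The data frame `task1`, the helper `pick`, the company names, and the "Best"/"Worst" labels are made-up illustrations, not output from Qualtrics or Displayr.

```r
# Hypothetical matrix-style layout for one MaxDiff task:
# one column per company, values record the Best/Worst selection.
task1 <- data.frame(
  Apple     = c("Best",  NA,      "Worst"),
  Microsoft = c(NA,      "Best",  NA),
  IBM       = c("Worst", "Worst", "Best"),
  stringsAsFactors = FALSE
)

# Illustrative helper: for each respondent, return the name of the
# column whose value equals `label`, or NA if no single column matches.
pick <- function(df, label) {
  apply(df, 1, function(row) {
    hit <- names(row)[!is.na(row) & row == label]
    if (length(hit) == 1) hit else NA_character_
  })
}

# Layout needed for analysis: one Best and one Worst variable per task,
# with the company names as the categories.
target <- data.frame(
  Best1  = factor(pick(task1, "Best"),  levels = colnames(task1)),
  Worst1 = factor(pick(task1, "Worst"), levels = colnames(task1))
)
print(target)
```

The rest of this article builds the same kind of Best/Worst variables inside Displayr, for every task at once.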
Requirements
A data set with MaxDiff variable sets where the alternatives are the variables and the Best and Worst selections are the category labels, as per the above example.
- These instructions cater to a non-randomized single version design (respondents see the same alternatives for each question) and a randomized multiple version design (where respondents were shown random alternatives for each question).
- Make sure your MaxDiff selection variables and any variables that store the attributes shown (when your design is randomized) have each been set up as a set of Nominal - Multi questions, one for each task or iteration in the design. If your design asks each respondent to evaluate a set of 4 alternatives 6 times, then there should be 6 sets of Nominal - Multi variables made up of 4 variables each in your project.
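As a quick illustration of this requirement, the sketch below uses stand-in data frames (not real Displayr variable sets) to confirm that a design with 6 tasks of 4 alternatives yields 6 sets of 4 variables each. The helper names `make_set` and `check_sets` are hypothetical.

```r
# Stand-in for one variable set with 4 variables (alternatives A to D).
make_set <- function() {
  data.frame(A = NA_character_, B = NA_character_,
             C = NA_character_, D = NA_character_)
}

# Six stand-in sets, playing the role of the six MaxDiff tasks.
sets <- replicate(6, make_set(), simplify = FALSE)

# Hypothetical helper: TRUE when there are `n.tasks` sets and every set
# has exactly `n.alts` variables.
check_sets <- function(sets, n.tasks, n.alts) {
  length(sets) == n.tasks &&
    all(vapply(sets, ncol, integer(1)) == n.alts)
}

check_sets(sets, n.tasks = 6, n.alts = 4)
```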
Method
Part 1 – Tidy the data
This just makes things easier later on. Let's say you have 6 MaxDiff variable sets, Q14 to Q19, which represent each question or iteration of the MaxDiff survey question.
- In the Data Sources tree, select Q14 (the variable set for the Most/Least choices for the first MaxDiff question).
- In the Object Inspector, select General > GENERAL > Label and change the label to MD1.
- Repeat the two steps above for each of Q15 to Q19 (the Most/Least choices for the other questions), labeling them MD2 to MD6.
The result should end up looking like this:
- If you have a randomized design where not all respondents were shown the same alternatives for each question, you will also have an additional set of variables that store the alternatives (e.g. Apple, Microsoft, IBM) shown to each respondent for each question. These are usually denoted as Question.Option_MAXDIFF and found directly below vers_MAXDIFF (which stores the design's version number).
- These should similarly be combined into Nominal - Multi variable sets for each design "question" or task, and labeled MD1A to MD6A to help with later steps. Each set needs to have the options in order 1-X and should have the same 1-X labels across all sets. So, in the above example, all the variables starting with "1." (of which there are 5) should be combined as a set, and all those starting with "2." as another set, and so on. Note, if these have instead been imported as text variables, you will need to first select all of them and change Data > Properties > Structure to Nominal.
- If not all the respondents in the file were given the MaxDiff questions, then the MaxDiff version variable (usually labeled vers_MAXDIFF) will have a version of 0 for those who were not shown the MaxDiff questions. This needs to be treated as missing data, so you will need to click on the variable > Missing Values and set the 0 code to Exclude from analyses.
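For intuition, excluding code 0 has the same effect as recoding it to missing in plain R. The vector below is made-up data; in Displayr the change is made through the Missing Values setting described above, not through code.

```r
# Hypothetical version numbers; 0 marks respondents who were not
# shown the MaxDiff questions.
vers <- c(1, 2, 0, 3, 0, 2)

# Treat 0 as missing so those respondents drop out of later steps.
vers[vers == 0] <- NA

# Number of respondents with a real design version.
sum(!is.na(vers))
```

This matters because the randomized-design code in Part 2 drops rows whose version is missing.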
Part 2 – Add the Design
If you have an external design, you can paste this into your document using the Table icon > Paste or Enter Table. Otherwise, you can use R code to create a design based on your data. In this example, we will do the latter.
- Select the Anything icon > Calculation > Custom Code.
- Click on your page where you wish to have this Calculation inserted, and drag to create a box for the output.
- Enter a name under GENERAL > Name. In this example, we will change the name to design.
- Enter your R code in the object inspector under General > R CODE. For non-randomized single version designs, use the code below, updating it for your study wherever you see #UPDATE:
#UPDATE - ensure all of/only your MaxDiff most/least variable sets are within the list()
questions = list(MD1, MD2, MD3, MD4, MD5, MD6)
n.questions = length(questions)
question.attributes = lapply(questions, colnames)
alts.per.question = length(question.attributes[[1]])
attributes = unique(unlist(question.attributes))
design = matrix(NA, nrow = n.questions, ncol = alts.per.question + 2)
colnames(design) = c('Version', 'Question', paste0('Option.', 1:alts.per.question))
design[, 'Version'] = rep(1,nrow(design))
design[, 'Question'] = 1:n.questions
for (q in 1:n.questions) {
    for (a in 1:alts.per.question) {
        design[q, a + 2] = which(attributes == question.attributes[[q]][a])
    }
}
design
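To see what this produces, here is a small self-contained sketch on made-up data. The three stand-in data frames and the company names are assumptions for illustration, and the sketch uses `match()` in place of the nested loop, which should give the same indices.

```r
# Stand-in variable sets: only the column names (alternatives) matter here.
MD1 <- data.frame(Apple = NA, Microsoft = NA, IBM = NA)
MD2 <- data.frame(Google = NA, Apple = NA, Dell = NA)
MD3 <- data.frame(IBM = NA, Dell = NA, Google = NA)

questions <- list(MD1, MD2, MD3)
question.attributes <- lapply(questions, colnames)

# Alternatives are numbered in order of first appearance.
attributes <- unique(unlist(question.attributes))

# One row per task; each option cell is the index of the alternative
# within `attributes`.
design <- cbind(
  Version  = 1,
  Question = seq_along(questions),
  t(sapply(question.attributes, match, table = attributes))
)
colnames(design)[-(1:2)] <- paste0("Option.", 1:3)
design
```

In this toy case the second task (Google, Apple, Dell) becomes the row 4, 1, 5, because Google and Dell are the fourth and fifth distinct alternatives encountered.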
For randomized multiple version designs, use the code below instead, updating the two lines marked #UPDATE for your study:
#UPDATE - ensure all of/only your MaxDiff alternative variable set names are within the list()
pipe.questions = list(MD1A, MD2A, MD3A, MD4A, MD5A, MD6A)
#UPDATE - ensure your MaxDiff version variable name is correct
version = vers_MAXDIFF
#get basics of design
n.questions = length(pipe.questions)
n.options = NCOL(pipe.questions[[1]])
#stack alternatives on top of one another
alts=do.call(rbind,pipe.questions)
#add in the version number and get unique combinations
des = unique(data.frame("Version"=rep(version,n.questions),
                        "Question"=rep(1:n.questions,each=length(version)),
                        alts))
#remove blank versions and order by version
des = des[!is.na(des$Version),]
des = des[order(des$Version),]
#make the data numeric to get alternative number
des = data.frame(lapply(des, as.numeric))
#rename the columns for each option
colnames(des)[3:NCOL(des)]=paste0("Option.",1:n.options)
#return final table named design
design = des
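The same stacking logic can be traced on a toy data set. Everything below (4 respondents, 2 tasks, 2 options per task, the company names) is made up for illustration. Note that the final numeric codes come from the factor levels, which is why the alternative sets need identical labels across tasks.

```r
# Shared factor levels across all sets.
lv <- c("Apple", "Microsoft", "IBM")

# Made-up "alternatives shown" sets for two tasks.
MD1A <- data.frame(
  X1 = factor(c("Apple", "IBM", "Apple", "IBM"), levels = lv),
  X2 = factor(c("IBM", "Microsoft", "IBM", "Microsoft"), levels = lv)
)
MD2A <- data.frame(
  X1 = factor(c("Microsoft", "Apple", "Microsoft", "Apple"), levels = lv),
  X2 = factor(c("Apple", "IBM", "Apple", "IBM"), levels = lv)
)
version <- c(1, 2, 1, 2)

pipe.questions <- list(MD1A, MD2A)
n.questions <- length(pipe.questions)

# Stack tasks, attach version and task number, and keep unique rows.
alts <- do.call(rbind, pipe.questions)
des <- unique(data.frame(
  Version  = rep(version, n.questions),
  Question = rep(seq_len(n.questions), each = length(version)),
  alts
))

# Order by version, then convert factors to their integer codes.
des <- des[order(des$Version), ]
des <- data.frame(lapply(des, as.numeric))
des
```

Eight stacked rows collapse to four design rows here, one per version and task.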
Part 3 – Create the best/worst variables
- Hover over any variable in the Data Sources tree and click the plus sign that appears.
- From the Insert Variable(s) menu, select Custom Code > Multiple R Variables > Numeric to create a Numeric - Multi variable set.
- Type the number of variables you want to create in the How Many Variables Do You Want to Create? box. In this example, we will create 12 new variables (6 best variables and 6 worst variables).
- Click OK. Each of the variables in this variable set shares the same R template code, which is used to create empty variables.
- The next step is to update the template code with the following R code.
- In the Data Sources tree, click on Variable 1 in the New R Variable Set. The current code will be in the object inspector > General > R CODE box.
- For a non-randomized single version design, insert the following code on line 17, just before the line of code that says new.data.
Note, you will need to make the following edits to the code:
- Update the second line to list out the names of all of your MaxDiff variable sets.
- Update the fourth and fifth lines with the labels of the best and worst categories in your MaxDiff variable sets (capitalization counts!).
#Update with the names of the MaxDiff variable sets
questions = list(MD1, MD2, MD3, MD4, MD5, MD6)
#Update with the labels of the best and worst categories in the MaxDiff variable sets
bestlabel = "Most important"
worstlabel = "Least important"
removeHTML <- function(htmlString) {
    # Remove each HTML tag individually, keeping the text between tags
    return(gsub('<[^>]*>', '', htmlString))
}
n.questions = length(questions)
question.attributes = lapply(questions, colnames)
alts.per.question = length(question.attributes[[1]])
attributes = removeHTML(unique(unlist(question.attributes)))
md.list = list()
get.best.worst = function (question, attributes, best.label = bestlabel, worst.label = worstlabel) {
    option.selected.in.row = function (v) {
        option = v[v != 'FALSE']
        if (length(option) == 0) {
            return(NA)
        } else {
            return(option)
        }
    }
    q = question
    q.labels = matrix(colnames(q), nrow = nrow(q), ncol = ncol(q), byrow = TRUE)
    q.best = q == best.label & !is.na(q)
    q.best[q.best] = q.labels[q.best]
    q.best = apply(q.best, 1, option.selected.in.row)
    q.worst = q == worst.label & !is.na(q)
    q.worst[q.worst] = q.labels[q.worst]
    q.worst = apply(q.worst, 1, option.selected.in.row)
    q.best = factor(q.best, levels = attributes)
    q.worst = factor(q.worst, levels = attributes)
    best.worst = data.frame(q.best, q.worst)
}
for (i in 1:n.questions) {
    md = get.best.worst(questions[[i]], attributes = attributes)
    md.list[[i]] = md
}
x = do.call(cbind, md.list)
colnames(x) = paste0(colnames(x), rep(1:n.questions, each = 2))
new.data = x
- For a randomized multiple version design, you will need to use and edit the code below instead:
- Update the second line to list out the names of all of your MaxDiff variable sets, and the fourth line to list out the names of all the variable sets that store the attributes that were shown.
- Update the sixth and seventh lines with the labels of the best and worst categories in your MaxDiff variable sets (capitalization counts!).
#Update with the names of the MaxDiff variable sets
questions = list(MD1, MD2, MD3, MD4, MD5, MD6)
#Update with the names of the variable sets that store what attributes were shown
pipe.questions = list(MD1A, MD2A, MD3A, MD4A, MD5A, MD6A)
#Update with the labels of the best and worst categories in the MaxDiff variable sets
bestlabel = "Most important"
worstlabel = "Least important"
removeHTML <- function(htmlString) {
    # Remove each HTML tag individually, keeping the text between tags
    return(gsub('<[^>]*>', '', htmlString))
}
n.questions = length(questions)
question.attributes = lapply(questions, colnames)
alts.per.question = length(question.attributes[[1]])
attributes = removeHTML(Reduce(union, lapply(pipe.questions, function(x) Reduce(union, lapply(x, function(y) levels(y))))))
md.list = list()
get.best.worst = function (question, pipe.questions, attributes, best.label = bestlabel, worst.label = worstlabel) {
    option.selected.in.row = function (v) {
        option = v[v != 'FALSE']
        if (length(option) == 0) {
            return(NA)
        } else {
            return(option)
        }
    }
    q = question
    q.labels = pipe.questions[[i]]
    q.best = q == best.label & !is.na(q)
    q.best[q.best] = q.labels[q.best]
    q.best = apply(q.best, 1, option.selected.in.row)
    q.worst = q == worst.label & !is.na(q)
    q.worst[q.worst] = q.labels[q.worst]
    q.worst = apply(q.worst, 1, option.selected.in.row)
    q.best = factor(q.best, levels = attributes)
    q.worst = factor(q.worst, levels = attributes)
    best.worst = data.frame(q.best, q.worst)
}
for (i in 1:n.questions) {
    md = get.best.worst(questions[[i]], pipe.questions, attributes = attributes)
    md.list[[i]] = md
}
x = do.call(cbind, md.list)
colnames(x) = paste0(colnames(x), rep(1:n.questions, each = 2))
new.data = x
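The key difference in the randomized case is that the chosen alternative is looked up by position in the "attributes shown" set rather than read from the column name. The sketch below isolates that idea for one task. The data frames `shown` and `answer` and the helper `lookup` are hypothetical and are not part of the code above.

```r
# Made-up data for one task: which alternative sat in each position...
shown <- data.frame(P1 = c("Apple", "IBM"),
                    P2 = c("IBM", "Microsoft"),
                    stringsAsFactors = FALSE)
# ...and what each respondent answered for each position.
answer <- data.frame(P1 = c("Most important", NA),
                     P2 = c("Least important", "Most important"),
                     stringsAsFactors = FALSE)

# Hypothetical helper: name of the alternative shown in the position
# that received `label`, or NA when no single position did.
lookup <- function(answer, shown, label) {
  a <- as.matrix(answer)
  s <- as.matrix(shown)
  vapply(seq_len(nrow(a)), function(r) {
    hit <- which(!is.na(a[r, ]) & a[r, ] == label)
    if (length(hit) == 1) unname(s[r, hit]) else NA_character_
  }, character(1))
}

best  <- lookup(answer, shown, "Most important")
worst <- lookup(answer, shown, "Least important")
```

Here the second respondent chose position 2 as best, which maps to Microsoft for that respondent, and has no worst selection, so `worst` is missing for them.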
9. The Data Sources tree should now look like this:
10. Click on New R Variable Set in the Data Sources tree
11. In the Object Inspector, change the structure to Nominal - Multi Grid with unordered categories
The results are as follows:
You can drag the new variable set onto the page to see a summary of the choices like below to confirm.
The questions are now ready to be analyzed in Displayr.
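If you want an extra check on the new variables, a sketch along the following lines may help. It uses made-up best and worst selections for one task (the names `best`, `worst`, and `same.pick` are illustrative) to count selections per alternative and to confirm that no respondent picked the same alternative as both best and worst.

```r
# Made-up best/worst selections for one task; NA = task not answered.
alts  <- c("Apple", "Microsoft", "IBM")
best  <- factor(c("Apple", "IBM", "Apple", NA), levels = alts)
worst <- factor(c("IBM", "Apple", "Microsoft", NA), levels = alts)

# Count how often each alternative was chosen as best and as worst.
counts <- rbind(Best = table(best), Worst = table(worst))

# A respondent should never pick the same alternative as best and worst.
same.pick <- !is.na(best) & !is.na(worst) &
  as.character(best) == as.character(worst)

counts
any(same.pick)
```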
Next
How to Create a MaxDiff Experimental Design
How to Do MaxDiff Latent Class Analysis
How to Use Hierarchical Bayes for MaxDiff
How to Import Data from Qualtrics