This article describes how to create a choice-based conjoint model using discrete choice experiment data in Displayr.
Requirements
- A document containing a choice-based modeling data set
- An appropriate choice model design (see the list of supported formats below)
Method
1. From the menus, select Anything > Advanced Analysis > Choice Modeling and select one of the following models:
- Hierarchical Bayes - this model is more flexible in modeling the characteristics of each respondent and tends to produce a model that better fits the data
- Latent class analysis - to be used when you want a segmentation of respondents
- Multinomial logit - equivalent to single-class latent class analysis
A new R output called choice.model will appear on your page.
2. From the object inspector, select one of the following options for the Experimental Design > Design source:
- Data set - select variables from a data set to specify the design. Variables need to be supplied corresponding to the version, task, and attribute columns of a design. See here for an example.
- Experimental design R output - select an R output in the project to supply the choice model design (created using Anything > Advanced Analysis > Choice Modeling > Experimental Design).
- Sawtooth CHO format - supply the design using a Sawtooth CHO file. You'll need to upload the CHO file to the project as a data set (first rename it to end in .csv instead of .cho) so that Displayr can recognize it. The new data set will contain a text variable, which should be supplied to the CHO file text variable input. Important: the .csv file needs to be uploaded to the cloud drive and then added to the project from there, as Displayr won't allow a direct upload of a text file.
- Sawtooth dual file format - supply the design through a Sawtooth design file (from the Sawtooth dual file format). You'll need to upload this file to the project as a data set. The version, task, and attributes from the design should be supplied to the corresponding inputs (similar to the Data set option).
- JMP format - supply the design through a JMP design file. You'll need to upload this file to the project as a data set. The version, task, and attributes from the design should be supplied to the corresponding inputs (similar to the Data set option).
- Experiment variable set - supply the design through an Experiment variable set in the project.
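As an aside, the rename step for the Sawtooth CHO format above can be done in a file manager or with base R; the filename here is a placeholder:

```r
# Stand-in for a CHO file exported from Sawtooth ("study1.cho" is a placeholder).
file.create("study1.cho")
# Rename it to end in .csv so Displayr can recognize it as a data set.
file.rename("study1.cho", "study1.csv")
```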
3. When Data set, Sawtooth dual file format, or JMP format are selected, choose the variables from your design data set containing the Version, Task, and Attributes.
Note: if you are working with an Alchemer (formerly SurveyGizmo) data set, the ResponseID from the conjoint data set is used as the Version and the Set Number as the Task.
Alternative-specific designs are supported in Attributes (attributes that do not apply to an alternative are coded as a 0). Any alternatives for which all of the values are missing are identified as 'None of these' alternatives and will have coefficients estimated as an alternative-specific constant with the label None of these.
4. For most of these options, you'll also need to provide the attribute levels through a spreadsheet-style data editor. To enter them, select Enter attribute levels and supply the levels in columns, with the attribute name in the first row and the attribute levels in subsequent rows.
Note that this step is optional for the JMP format if the design file already contains attribute-level names.
5. Code some categorical attributes as numeric - Whether to treat some categorical attributes as numeric. If checked, a text box will appear below to allow the attribute and numeric coding to be specified as a comma-separated list, e.g. Weight, 1, 2, 3, 4. When one text box is filled, another text box will appear for another attribute to be specified.
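To illustrate what this numeric coding does (a sketch of the idea, not Displayr's internal code): a comma-separated list such as Weight, 1, 2, 3, 4 maps the four levels of a hypothetical Weight attribute, in order, to the numbers 1 to 4:

```r
# Hypothetical Weight attribute with four levels, listed in design order.
weight <- factor(c("250g", "500g", "750g", "1kg", "500g"),
                 levels = c("250g", "500g", "750g", "1kg"))
# Numeric codes supplied in the text box, one per level, in level order.
numeric.codes <- c(1, 2, 3, 4)
weight.numeric <- numeric.codes[as.integer(weight)]
weight.numeric  # 1 2 3 4 2
```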
6. Next, you'll need to select the Respondent Data. Whether respondent data needs to be explicitly provided depends on how you supplied the design in the previous step. If an Experiment Question or CHO file was provided, there is no need to separately provide the data, as Experiment Questions and CHO files already contain the choices made by the respondents.
For the other methods of supplying the design, the respondent Choices and the Tasks or Version corresponding to these choices need to be provided from variables in the project. Each variable corresponds to a question in the choice experiment, and the variables need to be provided in the same order as the questions.
Note the following:
- If you have a 'None of these' option, you will need to code it with the index at which the 'None of these' option appears in the design. For example, if 'None of these' is the fifth option shown, then it should get a value of 5 for the relevant variables in your data set. See Variable Sets for how to confirm and modify your data.
- If you have a dual-response 'none' design, the following requirements apply:
- The variables with the dual-response 'none' data must be a Binary - Multi variable set.
- Under Data > Data Values > Select categories, Count This Value must be selected for the category which indicates that the respondent would purchase their selected choice.
- The response categories which indicate the respondent would purchase their selected choice and the one which indicates the respondent would not purchase their selected choice should also be set to Include in analyses in the Missing Data column.
- You will additionally need to select the corresponding 'Yes/No' questions in the Dual-response 'none' choice field of the choice model output.
- If your conjoint data comes from Alchemer, see How to Convert Alchemer Conjoint Data for Analysis in Displayr. Displayr will then add the appropriate questions containing the choices and the design version to the respondent data set.
- Instead of using respondent data, there is also an option to use simulated data by changing the Data source setting to Simulated choices from priors. See this blog post for more information on using simulated data.
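The 'None of these' coding described above can be sketched in R as follows (the variable, and the assumption that none-choices arrive as NA, are hypothetical):

```r
# Hypothetical choices for one task: alternatives 1-4 were shown, and NA
# marks respondents who picked 'None of these'.
choices <- c(2, NA, 4, 1, NA)
none.index <- 5  # 'None of these' was the fifth option shown in the design
choices[is.na(choices)] <- none.index
choices  # 2 5 4 1 5
```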
7. If Sawtooth CHO Format was selected as the Design source, select the Respondent IDs, which is a variable containing respondent IDs corresponding to those in the CHO file.
8. If Experimental design R output is selected as the Design source, choose the Prior source - either use priors from the choice model design output or enter the priors manually. If the design output contains no priors, prior means and standard deviations of 0 are assumed.
9. Enter the Simulated sample size - the number of simulated respondents to generate.
10. Dual-response 'none' choice - (Optional) Variables indicating dual-response 'None of these' choices. Should be the same number of variables as Choices and Tasks. These variables should be combined as a Pick Any question, with the category that indicates the respondent would purchase their selected choice being selected as Count this value in the Values.
11. Select one of the following options from the Missing data input, which determines how Displayr will deal with missing data, if any:
- Use partial data is the default setting which ignores questions with missing data but keeps other questions for analysis
- Exclude cases with missing data removes respondents from the analysis if any of the selected questions contain missing data
- Error if missing data shows an error message if any respondents have missing data on any of the selected questions
12. In the Model section, if Latent Class Analysis or Hierarchical Bayes is selected as the model Type, enter the Number of classes you want the model to create.
13. OPTIONAL: Enter a value for Questions left out for cross-validation. If there are too many classes, the computation time will be long, and the model may overfit the data. To determine the amount of overfitting in the data, set Questions left out for cross-validation to be greater than the default of 0. This will allow you to compare the output's in-sample and out-of-sample prediction accuracies.
14. Tick Alternative-specific constants to include alternative-specific constants in the model.
15. Indicate the Seed, which is the random seed used to determine the random initial parameters of the model and also used to determine the random questions to leave out for cross-validation. The default is 123.
16. Indicate the number of Iterations used in the Hierarchical Bayes analysis.
17. All other options are more advanced and can be left at their default values; they are detailed below. For more information see Checking Convergence When Using Hierarchical Bayes for Conjoint Analysis and How to Improve Choice Model Accuracy Using Covariates.
- Respondent-specific covariates - variables containing respondent-specific covariates to be included in the model.
- Chains - the number of chains used in the Hierarchical Bayes analysis.
- Maximum tree depth - the maximum tree depth parameter. Only increase this if warnings about "tree depth" are shown.
- Adapt delta - the adapt delta parameter. Only increase this if warnings suggest increasing adapt delta.
18. OPTIONAL: Apply a filter to the model by selecting a filter variable from the Filter(s) input at the top of the object inspector.
19. OPTIONAL: Apply a weight to the model by selecting a weight variable from the Weight input at the top of the object inspector.
20. Press the Calculate button to run the model.
The following options are also available once the model has run:
- Diagnostics
- Save Variable(s)
- Proportion of Correct Predictions
Technical Details
An R package called flipChoice is used to run the Hierarchical Bayes analysis. flipChoice uses rstan, the R interface to Stan, to fit the underlying Bayesian statistical model.
Adaptive Choice Models
Please note that the choice modeling analysis tools do not support adaptive choice-based conjoint experiments. Such experiments develop the design of future choice tasks based on previous respondent answers and can involve multiple styles of questioning. Thus, while it may be possible to manually reconstruct the design in a way that is compatible with the choice modeling tools, it is unclear whether such designs are consistent with the assumptions of the analysis methods used by these tools.
See Also
A worked example is available in this blog post.
For further information on Hierarchical Bayes modeling, please refer to Chapter 5 of Bayesian Statistics and Marketing.
Additional Properties
When using this feature you can obtain additional information that is stored by the R code which produces the output.
- To do so, select Create > R Output.
- In the R CODE, paste: item = YourReferenceName
- Replace YourReferenceName with the reference name of your item. Find this in the Report tree or by selecting the item and then going to Properties > General > Name from the object inspector on the right.
- Below the first line of code, you can paste in snippets from below or type in str(item) to see a list of available information.
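Putting these steps together, a sketch outside Displayr looks like this; in Displayr the first line would instead be item = choice.model (or whatever your output's reference name is), and the component names below are illustrative, so run str(item) first to see what is actually stored:

```r
# Stand-in object; in Displayr, 'item' would be your choice model output,
# and the component names here are hypothetical.
item <- list(log.likelihood = -1234.5,
             respondent.parameters = matrix(0, nrow = 2, ncol = 3))
str(item, max.level = 1)  # list the available components
item$log.likelihood       # then extract the one you need
```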
For a more in-depth discussion on extracting information from objects in R, check out our blog post here.
Next
How to Change the Specification of a Choice Model
How to Read Displayr's Choice Model Output
How to Remove Random Choosers From a Choice-Based Conjoint Model
How to Create an Experimental Design for Choice-Based Conjoint
How to Set Up a Choice-Based Conjoint Analysis in Qualtrics
How to Preview a Choice-Based Conjoint Questionnaire
How to Compare Multiple Choice Models
How to Create a Utilities Plot
How to Save Utilities From a Choice Model
How to Save Class Membership From a Choice Model