This article describes how to do a Latent Class Analysis in Displayr. Latent Class Analysis is a statistical technique for grouping together similar observations (i.e., creating segments). There are also some technical details on how to interpret a Latent Class Analysis at the bottom of this article.
To follow these steps, you need a data set containing the variables that you want to use as inputs to the Latent Class Analysis.
1. Log in to Displayr and load a document.
2. Load the data set that contains the variables that you want to use as inputs to the Latent Class Analysis.
3. From the toolbar menu, select Anything > Advanced Analysis > Cluster > Latent Class Analysis.
4. On the next screen, select the variables that you want to include as inputs to the Latent Class Analysis from the Available data list. The selected variables will appear in the Data to display list. As an example, I've used a data set containing statements on a 5-point agree/disagree scale about attitudes toward mobile technology.
5. Select the number of segments you want to create:
- Select Work out number of groups automatically if you want Displayr to determine the number of groups with the greatest differences using the Bayesian Information Criterion (BIC), or
- Select Specify the number of groups and enter a value for the number of segments you want to create.
For this example, I've selected the latter option and entered a value of 4.
6. OPTIONAL: Apply a filter if you want to create a segmentation for a specific subgroup.
7. OPTIONAL: Select a weight if you want the input variables weighted.
8. Click the Create Latent Class Analysis button.
The Latent Class output will then be generated. The first column shows the distribution of responses for the entire sample used in the analysis. Each additional column shows the response distribution for one of the segments.
A new single response variable is added to the bottom of the data set called "Latent Class Analysis" with a date/time stamp in the variable label.
This variable contains the segment assigned to each respondent. Create a table using this variable to see the distribution of segment assignments.
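For intuition about the automatic option in step 5, the sketch below shows how a BIC comparison across candidate numbers of groups typically works. It assumes the standard definition of BIC (minus twice the log-likelihood plus a penalty that grows with the number of parameters) and the usual parameter count for a latent class model with categorical inputs. The function names and the log-likelihood values are hypothetical and made up purely for illustration; this article does not describe Displayr's exact computation.

```python
import math

def lca_param_count(n_classes, levels_per_variable):
    """Free parameters in a latent class model with categorical inputs:
    (n_classes - 1) class sizes, plus (levels - 1) response
    probabilities per variable within each class."""
    return (n_classes - 1) + n_classes * sum(k - 1 for k in levels_per_variable)

def bic(log_likelihood, n_params, n_obs):
    """Standard BIC; lower values indicate a better trade-off
    between fit and model complexity."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def pick_n_classes(log_likelihoods, levels_per_variable, n_obs):
    """log_likelihoods maps a number of classes to its maximized log-likelihood.
    Returns the number of classes with the lowest BIC, plus all BIC scores."""
    scores = {
        c: bic(ll, lca_param_count(c, levels_per_variable), n_obs)
        for c, ll in log_likelihoods.items()
    }
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical inputs: six 5-point statements, 500 respondents,
# and made-up log-likelihoods for 2 to 5 classes.
levels = [5] * 6
fits = {2: -4200.0, 3: -4050.0, 4: -3980.0, 5: -3960.0}
best, scores = pick_n_classes(fits, levels, n_obs=500)
print(best, scores)
```

Adding classes always improves the log-likelihood, so the penalty term is what stops the comparison from simply favoring the largest model.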
9. To get a diagnostic report of your Latent Class Analysis, select Anything > Advanced Analysis > Cluster > Diagnostic > Analysis Report.
The results are as follows:
To better understand how to interpret the output of the LCA, please see our technical documentation. Also keep in mind that the estimated size of each segment, as a percentage and as a number of respondents, is shown at the bottom of each segment.
When crosstabs are created using the segmentation variable, the segment sizes commonly differ from the numbers shown in these boxes, although the differences are typically small. This is because the segment sizes shown on the tree are estimates that allow for uncertainty in segment membership (e.g., a person may have a 33% chance of being in one segment, a 66% chance of being in another, and a 1% chance of being in a third). By contrast, when the segments variable is selected in crosstabs, each person is assumed to be in one and only one segment. The difference between these two assumptions causes the differences in results.
When a weight is used, the total population size (as shown in the Population in the top node) is the Effective Sample Size for the sample that has been used for the segmentation.
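The difference between the two kinds of segment size can be illustrated with a small sketch. It assumes that the estimated sizes are obtained by summing each respondent's membership probabilities, while the crosstab sizes count each respondent once in their most likely segment. It also shows Kish's formula, a common definition of effective sample size. The function names and the probabilities are hypothetical, and Displayr's own calculations may differ in detail.

```python
def segment_sizes(posteriors):
    """posteriors: one list of membership probabilities per respondent.
    Returns (soft, hard) sizes per segment:
    soft sums the probabilities, hard counts the most likely segment."""
    n_segments = len(posteriors[0])
    soft = [sum(row[k] for row in posteriors) for k in range(n_segments)]
    hard = [0] * n_segments
    for row in posteriors:
        hard[row.index(max(row))] += 1
    return soft, hard

def effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / sum of squared weights."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

# Hypothetical membership probabilities for four respondents and three segments.
# The first row is the 33% / 66% / 1% example from the text.
posteriors = [
    [0.33, 0.66, 0.01],
    [0.90, 0.05, 0.05],
    [0.40, 0.55, 0.05],
    [0.10, 0.10, 0.80],
]
soft, hard = segment_sizes(posteriors)
print("Estimated sizes:", soft)
print("Crosstab sizes: ", hard)

# Hypothetical weights: unequal weights reduce the effective sample size.
print(effective_sample_size([2.0, 1.0, 1.0]))
```

Both sets of sizes add up to the same total; only the way respondents are split between segments differs.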
Grid questions need a little more treatment before you use them in your Latent Class Analysis (LCA). Either stack the data file or change the Question Type to Binary - Multi or Numeric - Multi.
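As a rough illustration of what stacking means, the sketch below reshapes grid data so that each respondent contributes one case per grid row, with the grid columns becoming ordinary variables. The variable naming pattern (row label, underscore, column label), the function name, and the sample records are all hypothetical; in practice the stacking is done on the data file itself rather than in code like this.

```python
def stack_grid(wide_records, id_key, grid_rows, grid_columns):
    """Turn one record per respondent into one record per respondent
    per grid row. Assumes wide variables are named '<row>_<column>'."""
    stacked = []
    for record in wide_records:
        for row_label in grid_rows:
            case = {id_key: record[id_key], "grid_row": row_label}
            for col_label in grid_columns:
                case[col_label] = record[f"{row_label}_{col_label}"]
            stacked.append(case)
    return stacked

# Hypothetical grid: two brands (rows) rated on two attributes (columns).
wide = [
    {"id": 1, "BrandA_fun": 5, "BrandA_reliable": 3,
     "BrandB_fun": 2, "BrandB_reliable": 4},
    {"id": 2, "BrandA_fun": 1, "BrandA_reliable": 5,
     "BrandB_fun": 3, "BrandB_reliable": 3},
]
stacked = stack_grid(wide, "id", ["BrandA", "BrandB"], ["fun", "reliable"])
for case in stacked:
    print(case)
```

After stacking, the analysis treats each respondent-by-row combination as a separate observation, so the number of cases is the number of respondents multiplied by the number of grid rows.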