This article describes how to go from a Latent Class Analysis segmentation output to a Segment Comparison Table, which profiles the segments against other questions in your survey (e.g., demographic, behavioral, or attitudinal questions).
Requirements
- A variable in your dataset containing cluster/segment membership.
- Demographic, behavioral, attitudinal, or other questions in your dataset that you want to use to profile the segments.
Method
- From the toolbar, select Anything > Advanced Analysis > Cluster > Segment Comparison Table.
- From the Data tab of the object inspector on the left, select the variable containing the segments from the Variable dropdown box.
- From the Profiling variables dropdown, select the questions you want to use to profile the segments.
- Click the Calculate button to generate the comparison table. The segments will be displayed across the top with the profiling question categories in the rows.
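To make the layout of the resulting table concrete, the following is a minimal sketch of the column percentages shown for a categorical profiling question. It is not Displayr's implementation: the data and the function name `column_percentages` are hypothetical and only illustrate "segments across the top, categories in the rows".

```python
from collections import Counter, defaultdict

# Hypothetical respondent-level data: one segment label and one answer each.
segments = ["A", "A", "B", "B", "B", "A"]
gender = ["Male", "Female", "Female", "Female", "Male", "Male"]


def column_percentages(segments, answers):
    """Return {category: {segment: % of that segment giving the answer}}."""
    seg_sizes = Counter(segments)
    counts = defaultdict(Counter)
    for seg, ans in zip(segments, answers):
        counts[ans][seg] += 1
    return {
        cat: {
            seg: 100.0 * counts[cat][seg] / seg_sizes[seg]
            for seg in sorted(seg_sizes)
        }
        for cat in sorted(counts)
    }


table = column_percentages(segments, gender)
for cat, row in table.items():
    # One row per category, one column per segment.
    print(cat, {seg: round(pct, 1) for seg, pct in row.items()})
```

Each column sums to 100% because the percentages are computed within a segment, not across segments.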
Options
Use one of Variable containing segment membership (default) or K-Means. The second option allows users to construct a K-Means model and profile the predicted clusters in one step.
Variable A nominal or ordinal variable. The categories of the variable will make up the columns of this table.
Profiling variables One or more variables or variable sets. These can be of any type except text.
Show index values Show column percentages and averages as a proportion of the total for the row. This is the same as shown in Displayr tables.
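As a worked illustration of an index value, the sketch below divides each segment's value by the value for the whole row. The numbers and the helper name `index_value` are hypothetical, and whether the product displays the result as 1.6 or 160 is not stated here, so the sketch keeps the raw ratio.

```python
def index_value(segment_value, row_total):
    """Express a segment's value as a proportion of the row total."""
    if row_total == 0:
        raise ValueError("row total must be non-zero")
    return segment_value / row_total


# Hypothetical row: % of each segment aged 18-24, and the % in the total sample.
row = {"Segment 1": 40.0, "Segment 2": 10.0}
total = 25.0

indices = {seg: index_value(value, total) for seg, value in row.items()}
for seg, idx in indices.items():
    print(seg, round(idx, 2))
```

An index above 1 means the segment is over-represented on that row relative to the total; an index below 1 means it is under-represented.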
Shade Provides the option to color each cell based on the standardized value in the cell. The color can be applied in one of the following ways:
- None No shading is applied
- Cell colors Shading is applied to the cell fill color
- Font colors Shading is applied to the text in the cell
- Arrows Shading is applied to up/down arrows next to the cell value
- Fonts and arrows Shading is applied to both the arrow and the cell text
- Boxes Shading is applied to a box drawn around the cell text. The width and corner roundness of the box can be adjusted.
- Bars Shading is applied to the bars, while the length of the bars reflects the difference between the cell value and the row mean.
Thresholds Values in each cell are standardized and compared against these thresholds to determine whether a cell is shaded by the color for Very small values, Small values, Large values, or Very large values. Numeric values are standardized by dividing by 2 * standard deviation as suggested by Gelman (2007). Column percentages are standardized by dividing by row totals (i.e. they are the index values).
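The sketch below shows one way the standardize-then-compare logic for a numeric variable could look. Several details are assumptions rather than documented behavior: that the standardized value is the segment mean minus the overall mean, divided by two standard deviations; that the population standard deviation is used; and the threshold values themselves, which are hypothetical placeholders.

```python
from statistics import mean, pstdev

# Hypothetical thresholds; the real defaults are set in the Thresholds options.
THRESHOLDS = {"very_small": -0.4, "small": -0.2, "large": 0.2, "very_large": 0.4}


def standardize_numeric(segment_mean, all_values):
    """Assumed form: (segment mean - overall mean) / (2 * standard deviation)."""
    sd = pstdev(all_values)
    if sd == 0:
        return 0.0
    return (segment_mean - mean(all_values)) / (2 * sd)


def shade_band(z, thresholds=THRESHOLDS):
    """Map a standardized value to the color band it would fall into."""
    if z <= thresholds["very_small"]:
        return "Very small"
    if z <= thresholds["small"]:
        return "Small"
    if z >= thresholds["very_large"]:
        return "Very large"
    if z >= thresholds["large"]:
        return "Large"
    return "No shading"


all_values = [1, 2, 3, 4, 5, 6, 7, 8]  # hypothetical numeric profiling variable
z = standardize_numeric(6.5, all_values)  # hypothetical segment mean of 6.5
band = shade_band(z)
print(round(z, 3), band)
```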
Only shade significant results By default, cell values which are very large or very small but not statistically significant will not be shaded. This may occur because there are fewer observations for a particular category.
Color cell text conditional on significance Color the text in the cells based on whether the average or column percentage is significantly different from the value for observations not in that segment (i.e. column; compare Crosstabs of Proportions).
Non-significant font color The text in cells which are not significant will be shown in this color instead of the font color.
Use non-parametric test For numeric variables, use the ranks instead of the numeric values to conduct the t-test. This is the same as using a non-parametric test in the statistical assumptions for a Q Table.
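To illustrate what "use the ranks instead of the numeric values" means, the sketch below replaces each value with its rank (ties share the average rank) and then computes a t statistic comparing one segment with everyone else. The data are hypothetical, and the unequal-variance (Welch) form of the statistic is an assumption; the exact variant used by the software is not stated here.

```python
from math import sqrt
from statistics import mean, variance


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def welch_t(a, b):
    """t statistic for two independent samples with unequal variances."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


# Hypothetical data: note the outlier (200) that would dominate a raw-value test.
values = [10, 200, 12, 11, 15, 13, 14, 12]
in_segment = [True, True, False, False, True, False, True, False]

ranks = average_ranks(values)
seg = [r for r, member in zip(ranks, in_segment) if member]
rest = [r for r, member in zip(ranks, in_segment) if not member]
t_ranks = welch_t(seg, rest)
print(ranks, round(t_ranks, 3))
```

Working on ranks limits the influence of extreme values: the outlier contributes only the highest rank, not its full magnitude.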
False discovery rate correction Adjust p-values to account for the multiple tests conducted in the table.
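As an illustration of a false discovery rate adjustment, the sketch below implements the Benjamini-Hochberg procedure, a widely used FDR method. That this is the exact procedure applied by the table is an assumption, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, one per cell tested in the table.
p = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(p)
print([round(x, 4) for x in adj])
```

Each adjusted p-value is then compared against one minus the confidence level; a cell that is significant on its raw p-value can become non-significant after adjustment.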
Confidence level Threshold used in the test to determine significance.
Column labels Optional comma-separated list to override the column labels (or segment names).
Rows to hide Specify rows to hide as a comma-separated list. Row names which are not used in the table will be ignored. Use double quotes to escape row names containing commas.
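The quoting rule for this list can be illustrated with Python's standard `csv` module, which treats double-quoted fields the same way. This is only a sketch of the input format; the function names and row names are hypothetical, and the software's own parser may differ in edge cases.

```python
import csv


def parse_row_list(text):
    """Split a comma-separated list, honoring double-quoted names with commas."""
    reader = csv.reader([text], skipinitialspace=True)
    return [name.strip() for name in next(reader)]


def visible_rows(all_rows, rows_to_hide):
    """Drop the listed rows; names that are not in the table are ignored."""
    hidden = set(parse_row_list(rows_to_hide))
    return [row for row in all_rows if row not in hidden]


# Hypothetical row names; one contains a comma and so must be quoted.
rows = ["Male", "Female", "Income: $50,000 or more", "Age"]
hide = 'Male, "Income: $50,000 or more", Not in table'

parsed = parse_row_list(hide)
kept = visible_rows(rows, hide)
print(parsed)
print(kept)
```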
Decimals shown for percentages Number of decimals shown for categoric variables.
Decimals shown for numeric data Number of decimals shown for numeric variables.
Font family The font family for all text in the table.
Font size The font size of all text in the table.
Font units The units in which the font size is specified. This can be either "px" or "pt".
Row height The height of a row in the table (with no word wrap). This value will be specified in terms of font units. By default it is 5 + font size.
Column widths A comma-separated list of values (including units, e.g. "px", "pt", "%") specifying the widths of the columns. Each value applies to a single column, starting from the left. Any remaining columns with no specified width will be equally sized to fill the remaining space.
Column header fill The color of the column header cells.
Row header fill The color of the row header cells containing the names of the profiling variables.
Summary rows fill The color of the cells in the first two rows showing the breakdown of the segmentation variable.
Cell fill The color of the cells, excluding row/column headers and cells colored by conditional formatting.
Border color The color of the border.
Border width The width of the border in pixels.
Next
How to Analyze Data by Groups/Segments
How to Do Latent Class Analysis
How to Do K-Means Cluster Analysis
How to Save K-Means Cluster Membership