How to Create a Segmentation Comparison Table

This article describes how to go from a Latent Class Analysis segmentation output to a Segmentation Comparison Table, which profiles the segments against other questions in your survey (e.g., demographic, behavioral, or attitudinal questions).

Requirements

A variable in your dataset containing cluster/segment membership.
Demographic, behavioral, attitudinal, or other questions in your dataset that you want to use to profile the segments.

Method

From the Report tree select + > Advanced Analysis > Cluster > Segment Comparison Table.
From the Data tab of the object inspector on the left, select the variable containing the segments from the Variable dropdown box.
From the Profiling variables dropdown, select the questions you want to use to profile the segments.
Click the Calculate button to generate the comparison table. The segments will be displayed across the top with the profiling question categories in the rows.
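To make the layout of the resulting table concrete, the following is a minimal sketch of how column percentages for one categorical profiling question could be computed outside the tool. The function name `column_percentages` and the sample data are hypothetical, and this is not presented as the tool's own implementation.

```python
from collections import Counter, defaultdict


def column_percentages(segments, answers):
    """Return {category: {segment: percent}}; percentages sum to 100 within each segment."""
    counts = defaultdict(Counter)  # segment -> counts of each answer category
    for seg, ans in zip(segments, answers):
        counts[seg][ans] += 1

    table = {}
    for category in sorted(set(answers)):
        table[category] = {
            seg: 100.0 * counts[seg][category] / sum(counts[seg].values())
            for seg in sorted(counts)
        }
    return table


# Hypothetical data: segment membership and one profiling question.
segments = ["A", "A", "A", "A", "B", "B", "B", "B"]
gender = ["Male", "Male", "Male", "Female", "Female", "Female", "Female", "Male"]

table = column_percentages(segments, gender)
for category, row in table.items():
    print(category, row)  # segments across the top, categories in the rows
```

Each inner dictionary corresponds to one row of the comparison table, with one entry per segment column.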

Options

Use one of Variable containing segment membership (default) or K-Means. The second option allows users to construct a K-Means model and profile the predicted clusters in one step.

Variable A nominal or ordinal variable. The categories of the variable will make up the columns of this table.

Profiling variables One or more variables or variable sets. These can be of any type except text.

Show index values Show column percentages and averages as a proportion of the total for the row. This is the same as shown in Displayr tables.
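As a rough illustration of an index value, the sketch below divides each segment's value by the value for the whole row. The function name `index_values` and the numbers are hypothetical, and whether the tool displays the ratio as 1.5 or as 150 is not stated here, so the plain ratio is used as an assumption.

```python
def index_values(row_by_segment, row_total):
    """Express each segment's value as a proportion of the value for the whole row."""
    return {seg: value / row_total for seg, value in row_by_segment.items()}


# Hypothetical row: 75% of segment A and 25% of segment B are male,
# while 50% of the total sample is male.
male_row = {"A": 75.0, "B": 25.0}
indices = index_values(male_row, row_total=50.0)
print(indices)
```

An index above 1 indicates that the segment is over-represented on that row relative to the total, and an index below 1 indicates that it is under-represented.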

Shade Provides the option to color each cell based on the standardized value in the cell. The color can be applied to:

None No shading is applied
Cell colors Shading is applied to the cell fill color
Font colors Shading is applied to the text in the cell
Arrows Shading is applied to up/down arrows next to the cell value
Fonts and arrows Shading is applied to both the arrow and the cell text
Boxes Shading is applied to a box drawn around the cell text. The width and corner roundness of the box can be adjusted.
Bars Shading is applied to the bars, while the length of the bars reflects the difference between the cell value and the row mean

Thresholds Values in each cell are standardized and compared against these thresholds to determine whether a cell is shaded by the color for Very small values, Small values, Large values, or Very large values. Numeric values are standardized by dividing by 2 × the standard deviation as suggested by Gelman (2007). Column percentages are standardized by dividing by row totals (i.e., they are the index values).
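The standardization of numeric values can be sketched as follows. This sketch assumes that the segment mean is centered on the overall mean before dividing by 2 × the standard deviation, and it uses the sample standard deviation; neither detail is confirmed by this article. The threshold values and the names `standardized_segment_mean` and `shade_level` are hypothetical.

```python
from statistics import mean, stdev


def standardized_segment_mean(values, in_segment):
    """(segment mean - overall mean) / (2 * SD); centering is an assumption."""
    segment_values = [v for v, flag in zip(values, in_segment) if flag]
    return (mean(segment_values) - mean(values)) / (2 * stdev(values))


def shade_level(z, very_small=-0.2, small=-0.1, large=0.1, very_large=0.2):
    """Map a standardized value to a shading level using hypothetical thresholds."""
    if z <= very_small:
        return "Very small"
    if z <= small:
        return "Small"
    if z >= very_large:
        return "Very large"
    if z >= large:
        return "Large"
    return None  # between the Small and Large thresholds: no shading


# Hypothetical numeric profiling variable and segment membership flags.
values = [1, 2, 3, 4, 5, 6]
in_segment = [False, False, False, True, True, True]

z = standardized_segment_mean(values, in_segment)
print(round(z, 3), shade_level(z))
```

Dividing by two standard deviations puts numeric variables on a scale that is roughly comparable across variables, so one set of thresholds can be applied to every numeric row.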

Only shade significant results. By default, cell values that are very large or small but not statistically significant will not be shaded. This may occur because there are fewer observations for a particular category.

Color cell text conditional on significance. Color the text in the cells based on whether the average or column percentage is significantly different from the value for observations not in that segment (i.e., column; compare Crosstabs of Proportions).

Non-significant font color: The text in cells that are not significant will be shown in this color instead of the font color.

Use a non-parametric test: For numeric variables, use the ranks instead of the numeric values to conduct the t-test. This is the same as using a non-parametric test in the statistical assumptions for a Q Table.
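The idea of testing on ranks can be sketched as below: values are replaced by their ranks (ties share the average rank) and a t-statistic is computed on those ranks for observations inside versus outside the segment. The use of Welch's unequal-variance form is an assumption, as this article does not say which t-test is applied, and all function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def welch_t(a, b):
    """Welch t-statistic for two independent samples (an assumed choice of test)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_t_statistic(values, in_segment):
    """t-statistic comparing ranks inside the segment with ranks outside it."""
    ranks = average_ranks(values)
    inside = [r for r, flag in zip(ranks, in_segment) if flag]
    outside = [r for r, flag in zip(ranks, in_segment) if not flag]
    return welch_t(inside, outside)


# Hypothetical data with one extreme value; ranking limits its influence.
values = [1, 2, 3, 100]
in_segment = [False, False, True, True]
print(average_ranks(values), rank_t_statistic(values, in_segment))
```

Because the extreme value 100 is replaced by the rank 4, it has no more influence on the test than any other largest value would.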

False discovery rate correction: Adjust p-values to account for the multiple tests conducted in the table.
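As an illustration of what a false discovery rate adjustment does, the sketch below applies the Benjamini-Hochberg procedure, one widely used method. This article does not state which procedure the tool uses, so the choice of method, the function name `benjamini_hochberg`, and the p-values are all assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the result.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four cells of a table.
p_values = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(p_values)
print([round(p, 4) for p in adjusted])
```

A cell is then treated as significant only if its adjusted p-value is below 1 minus the confidence level, so fewer cells are flagged than with the unadjusted p-values.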

Confidence level: Threshold used in the test to determine significance.

Column labels: Optional comma-separated list to override the column labels (or segment names).

Rows to hide: Specify rows to hide as a comma-separated list. Row names that are not used in the table will be ignored. Use double quotes to escape row names containing commas.
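The quoting rule can be illustrated with Python's standard `csv` module, which handles double-quoted fields containing commas. This is only a sketch of how such a list might be interpreted; the function names `parse_rows_to_hide` and `visible_rows` and the row names are hypothetical.

```python
import csv


def parse_rows_to_hide(text):
    """Split a comma-separated list, honoring double quotes around names with commas."""
    reader = csv.reader([text], skipinitialspace=True)
    return [name.strip() for name in next(reader) if name.strip()]


def visible_rows(all_rows, rows_to_hide_text):
    """Drop the listed rows; names that do not match any row are simply ignored."""
    hidden = set(parse_rows_to_hide(rows_to_hide_text))
    return [row for row in all_rows if row not in hidden]


# Hypothetical row names; one contains a comma and so must be quoted.
all_rows = ["Age", "Income, household", "Gender", "Region"]
rows_to_hide = 'Age, "Income, household", Unknown'

print(parse_rows_to_hide(rows_to_hide))
print(visible_rows(all_rows, rows_to_hide))
```

Here "Unknown" does not match any row and has no effect, while the quoted name is treated as a single row despite its comma.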

Decimals shown for percentages Number of decimals shown for categoric variables.

Decimals shown for numeric data Number of decimals shown for numeric variables.

Font family The font family for all text in the table.

Font size The font size of all text in the table.

Font units The units in which the font size is specified. This can be either "px" or "pt".

Row height The height of a row in the table (with no word wrap). This value will be specified in terms of font units. By default it is 5 + font size.

Column widths A comma-separated list of values (including units, e.g. "px", "pt", "%") specifying the widths of the columns. Each value applies to a single column, starting from the left. Any remaining columns with no specified width will be equally sized to fill the remaining space.

Column header fill The color of the column header cells.

Row header fill The color of the row header cells containing the names of the profiling variables.

Summary rows fill The color of the cells in the first two rows showing the breakdown of the segmentation variable.

Cell fill The color of the cells, excluding row/column headers and cells colored by conditional formatting.

Border color The color of the border.

Border width The width of the border in pixels.

Next

How to Analyze Data by Groups/Segments

How to Do Latent Class Analysis

How to Do K-Means Cluster Analysis

How to Save K-Means Cluster Membership
