This article describes how to compare two tables by comparing each cell in one table with the corresponding cell in the other table.
If the primary statistic in the tables is a percentage computed from categorical variables, then the values are compared using the test for Proportions specified in Significance Testing. If the primary statistic is an average computed from numeric variables, then the test for Means is used instead. A number of different options are available for displaying significant values.
Requirements
Please note these steps require a Displayr license.
- Two tables with the same row and column names and the same primary statistic. The primary statistic can be one of "Average", "%", "Total %", "Column %", or "Row %". Both tables must also include the Sample Size and Standard Error cell statistics for the primary statistic used. For example, when the primary statistic is "Column %", the input tables must also contain "Sample Size" and "Standard Error".
- The two tables cannot be crosstabs of Nominal - Multi, Binary - Grid, or Numeric - Grid variable sets where the variable sets are crossed by another variable. However, those variable set types will work as SUMMARY tables. If using a grid variable set in the table, first reduce the complexity of the table by flattening the variable set or restructuring it in some other way.
Technical Details
Statistical Tests
Only some of the tests used by Displayr's automated statistical testing on tables are supported by this tool. This is because some test results cannot be replicated without access to the raw data, and the Table of Differences only has access to the table statistics that are available on the input tables.
The following options are not supported:
- Quantum and Survey Reporter tests.
- Non-parametric tests for numeric data.
- Non-parametric tests for weighted proportions.
- Tests of correlations.
If these options are chosen in the Advanced Statistical Testing Assumptions, the Table of Differences will fall back to using z-Tests for proportions, and t-Tests for means.
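As a rough illustration of how a test can be run from table statistics alone, the sketch below computes a two-sided z-test for one pair of cells using only the primary statistic and its Standard Error. It assumes the two tables come from independent samples and is not Displayr's exact computation; the function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def z_test_from_table_stats(value1, se1, value2, se2):
    """Two-sided z-test of (Table 2 - Table 1) for a single cell.

    value1/value2 are the primary statistics and se1/se2 their standard
    errors, all on the same scale. Assumes independent samples.
    """
    diff = value2 - value1
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)  # standard error of the difference
    z = diff / se_diff
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    return diff, z, p_value


# Illustrative values only: proportions 0.40 and 0.52 with standard errors 0.03 and 0.04.
diff, z, p_value = z_test_from_table_stats(0.40, 0.03, 0.52, 0.04)
print(round(diff, 2), round(z, 2), round(p_value, 4))
```

For means, the same statistic would be compared against a t distribution, which additionally needs degrees of freedom derived from the Sample Size statistic; that step is omitted here.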
The other settings from the Advanced Statistical Testing Assumptions that the Table of Differences incorporates into its testing are:
- The Bessel's Correction for Proportions and Means tests.
- The Extra deff design effect constant.
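The article does not give the formulas behind these two settings. As a textbook sketch only, Bessel's correction replaces n with n - 1 in the variance, and a design effect constant inflates the variance by that factor. The function below is hypothetical and shows the direction of each adjustment for a proportion; how Displayr combines them internally is an assumption here.

```python
import math


def proportion_se(p, n, bessel=False, extra_deff=1.0):
    """Textbook standard error of a proportion p from a sample of size n.

    bessel=True divides by n - 1 instead of n (Bessel's correction).
    extra_deff multiplies the variance (a design effect constant).
    """
    denominator = n - 1 if bessel else n
    variance = p * (1.0 - p) / denominator
    return math.sqrt(variance * extra_deff)


# Illustrative values only.
print(proportion_se(0.5, 101))                               # no adjustments
print(proportion_se(0.5, 101, bessel=True))                  # slightly larger
print(proportion_se(0.5, 101, bessel=True, extra_deff=4.0))  # doubled by deff = 4
```

Both adjustments widen the standard error, so differences need to be larger before they are flagged as significant.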
Using Rules
If one or both of the input tables have a Rule applied which weights the data (for example, see How to Apply a Weight to a Specific Column Only), then there is no way for the Table of Differences tool to know that the data has been weighted, and an unweighted test will be used.
Method
- Go to Table > Specialty > Table of Differences.
- From Inputs > Table of Differences, select your existing tables for Table 1 and Table 2.
OPTIONAL: You can adjust the following settings from Inputs:
- Show: This option controls which values are shown in the output table. It does not affect the test statistic or p-values.
- Differences (Table 2 - Table 1).
- Primary statistic of Table 2, with differences not shown but reflected by the shading options.
- Primary statistic of Table 2 with differences, with separate formatting controls for the primary statistic and the difference values.
- Show significant values by shading in: Controls which element is shaded to reflect the p-values from the test performed on the two tables.
- None - No shading is applied.
- Cell colors - Shading is applied to the cell fill color.
- Arrows - Shading is applied to up/down arrows next to the cell value.
- Boxes - Shading is applied to a box drawn around the cell text. Additional controls are associated with this option. A separate control for the border color is provided for each significance level, and the border width, corner roundness, and padding around the box can be adjusted. Note that the padding is specified in pixels, so if the output is not large enough to accommodate the box including its padding, the box will be truncated.
- Number of significance levels - Controls the number of thresholds (and corresponding shades) applied to the table.
- Rows/Columns to ignore - A comma-separated list of row/column names which should not be shown in the output table.
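The settings above can be pictured with a small sketch: differences are computed cell by cell as Table 2 minus Table 1, ignored rows and columns are dropped, and each p-value is mapped to one of several significance levels. The function names, the nested-dictionary table layout, and the threshold values are all hypothetical; the article does not state which thresholds the tool uses.

```python
def parse_ignore_list(text):
    """Split a comma-separated list of row/column names into a set."""
    return {name.strip() for name in text.split(",") if name.strip()}


def table_of_differences(table1, table2, ignore=""):
    """Return Table 2 - Table 1 for each cell, skipping ignored names.

    Tables are dicts of the form {row name: {column name: value}} and
    are assumed to share the same row and column names.
    """
    skip = parse_ignore_list(ignore)
    differences = {}
    for row, cells in table2.items():
        if row in skip:
            continue
        differences[row] = {
            col: value - table1[row][col]
            for col, value in cells.items()
            if col not in skip
        }
    return differences


def significance_level(p_value, thresholds=(0.05, 0.01, 0.001)):
    """Count how many thresholds the p-value falls below (0 = no shading).

    The thresholds are illustrative assumptions, one per shade.
    """
    return sum(p_value < t for t in thresholds)


# Illustrative values only.
table1 = {"Coke": {"Male": 40.0, "NET": 45.0}, "Pepsi": {"Male": 25.0, "NET": 22.0}}
table2 = {"Coke": {"Male": 52.0, "NET": 50.0}, "Pepsi": {"Male": 20.0, "NET": 21.0}}
print(table_of_differences(table1, table2, ignore="NET"))
print(significance_level(0.03), significance_level(0.005), significance_level(0.2))
```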
You can adjust the following settings from Format:
- Font family - The font family for all text in the table.
- Font size - The font size of all text in the table.
- Font units - The units in which the font size is specified. This can be either "px" or "pt".
- Number of decimals shown - Separate controls are shown for the primary statistic and difference.
- Prefix/Suffix - Optional text to prepend or append to the primary statistic/difference.
- Font family/size of differences - Font controls for the primary statistic and the difference (separately). These default to the general font settings, but can be changed so that the primary statistic and the difference are shown in different fonts.
- Show +/- sign on differences - Always prepend differences with "+" or "-".
- Automatically determine the font color - Text will be automatically colored black or white depending on the background color. It takes into account the cell fill color, and conditional shading (both in cell color and box).
- Show legend - Whether or not to show a legend explaining the shaded cells/boxes/arrows in the footer of the table.
- Show column/row headers - Whether to show column/row headers for the table. In some cases, where the Table of Differences is shown next to other visual elements, information about the row/column names may already be present.
- Show borders around row/column headers - By default, borders are only placed around the table cells, but they can be extended to include the column/row headers.
- Row height automatically fills R output - This is the default option. Users can adjust the spacing by dragging and resizing the output. Note, however, that if the table has many rows, then it may be better to set it to a fixed row height.
- Row height - The height of a row in the table (with no word wrap). This option is only shown if row height does not automatically fill R output. This value will be specified in terms of font units. By default, it is 5 + font size. If there are too many rows to show, a scrollbar will be shown.
- Column widths - A comma-separated list of values (including units, e.g. "px", "pt", "%") specifying the widths of the columns. Each value applies to a single column, starting from the left. Any remaining columns with no specified width will be equally sized to fill the remaining space. Note that the first value will be applied to the row headers (if they are shown).
- Column header fill - The color of the column header cells.
- Row header fill - The color of the row header cells containing the names of the profiling variables.
- Cell fill - The color of the cells, excluding row/column headers and cells colored by conditional formatting.
- Collapse borders - Whether the borders of adjacent cells should be collapsed into a single line. This is the default, but there is also the option to not collapse borders, in which case the "gap between rows" or "gap between columns" can be manually adjusted.
- Border color - The color of the border.
- Border width - The width of the border in pixels.
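To make the Column widths and Row height rules concrete, here is a minimal sketch of how such settings could be interpreted. It is not the tool's implementation: the function names, the "(row headers)" label, and the accepted unit pattern are assumptions for illustration.

```python
import re

# Assumption: a width is a number followed by a unit such as px, pt, or %.
WIDTH_PATTERN = re.compile(r"^\d+(\.\d+)?([a-z]+|%)$")


def parse_column_widths(text, column_names, show_row_headers=True):
    """Assign comma-separated widths to columns from left to right.

    The first value goes to the row headers when they are shown.
    Columns without a value map to None, meaning they share the
    remaining space equally.
    """
    widths = [w.strip() for w in text.split(",") if w.strip()]
    for width in widths:
        if not WIDTH_PATTERN.match(width):
            raise ValueError(f"Width needs a number and a unit: {width!r}")
    targets = (["(row headers)"] if show_row_headers else []) + list(column_names)
    assigned = {name: None for name in targets}
    for name, width in zip(targets, widths):
        assigned[name] = width
    return assigned


def default_row_height(font_size):
    """Default row height described above: 5 + font size, in font units."""
    return 5 + font_size


print(parse_column_widths("120px, 25%", ["Male", "Female"]))
print(default_row_height(10))
```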