Introduction
This article describes how to control which categories are used when computing percentages and averages on tables.
Please note these steps require a Displayr license.
Method
The categories used when computing statistics on tables are controlled via Value Attributes (see the example below). These can be edited in a variety of ways, including by:
- Selecting a variable in the Data Sources tree and clicking one of the options under Data > Properties > DATA VALUES on the right.
- Right-clicking on a table category that you wish to exclude and selecting Delete or Remove.
- When setting the Structure of a Binary variable set via the Fix button in the orange warning above.
- Automatically when using Combine.
- Via QScript.
Individual categories are included or excluded as follows:
- If a category is set to Exclude from analyses in the Missing Values column, it is excluded from all calculations. (Note that if you can also see a Missing data row, this refers to observations that are marked as missing values in the original data file.)
- If a category is set to Include in percentages (but not averages) in the Missing Values column, it will be excluded from all numeric calculations (e.g., Average, Median), but will be included when computing percentages. Note that you will only see the Value column on data that can be represented numerically (e.g., it will not appear for binary variable sets).
- If it is a Date/Time variable set, there is a completely different set of options (see Setting Time Periods for Date Questions).
- If you have Binary - Multi or Binary - Grid variable sets, there is a column called Count This Value, which determines the numerator when computing percentages. In the example below, there are six unique categories in the data file: Like, Love, Neither like nor dislike, Hate, Dislike, and Missing data, and the settings shown tell us that:
- Anybody with Missing data is excluded from any calculations.
- The analysis will count up the number of people that have selected either Like or Love. That is, this number will be the numerator in any calculations of percentages (i.e., the bit that goes above the line in a fraction).
- The base used in calculating percentages consists of everybody except those people that have Missing data. Thus, this particular example will compute Top 2 Box percentage scores (i.e., the proportion of people that said Like or Love from amongst all those people that selected one of the five categories).
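The arithmetic behind the Top 2 Box example above can be sketched in Python. This is only an illustration of the calculation, not Displayr's own code: the response data, the use of None for Missing data, and the function name top_box_percentage are all assumptions made for the example.

```python
import math

# Hypothetical responses; None stands in for "Missing data" (an assumption).
responses = ["Like", "Love", "Hate", None, "Neither like nor dislike",
             "Love", "Dislike", None, "Like", "Hate"]

# Categories assumed to be ticked in the Count This Value column.
COUNT_THIS_VALUE = {"Like", "Love"}

def top_box_percentage(values, counted):
    """Percentage of non-missing responses that fall in the counted categories."""
    valid = [v for v in values if v is not None]        # base excludes Missing data
    if not valid:
        return math.nan                                  # no base, no percentage
    numerator = sum(1 for v in valid if v in counted)   # people who said Like or Love
    return 100.0 * numerator / len(valid)

top2 = top_box_percentage(responses, COUNT_THIS_VALUE)
print(f"Top 2 Box: {top2:.1f}%")
```

Here 4 of the 8 non-missing responses are Like or Love, so the sketch reports 50.0%. The two missing responses affect neither the numerator nor the base.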
Excluding categories when computing percentages
Nominal and Nominal - Multi variable sets
Right-click on the table category you wish to exclude and select Delete. This causes the table to be recomputed with this category removed. You can see which categories have been removed by selecting the variable in your Data Sources tree and clicking Data > Properties > DATA VALUES > Missing values. These will be the categories that have been set to Exclude from analyses in the Missing Values column.
Alternatively, to remove a category from the table without affecting the table's calculations, right-click the category and select Hide. This removes the category from view and sets it to Hide but include in NET calculations, so the percentages for the remaining categories are unchanged.
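To make the difference between Delete and Hide concrete, the following Python sketch compares the two on a small set of answers. It mimics the arithmetic only and is not Displayr's implementation: the brand data and the percentages helper are hypothetical.

```python
from collections import Counter

# Hypothetical single-response answers (an assumption for illustration).
answers = ["Coke", "Pepsi", "Other", "Coke", "Other",
           "Coke", "Pepsi", "Coke", "Other", "Coke"]

def percentages(values, excluded=frozenset()):
    """Percentage per category, with excluded categories dropped from the base."""
    kept = [v for v in values if v not in excluded]
    counts = Counter(kept)
    return {cat: 100.0 * n / len(kept) for cat, n in counts.items()}

original = percentages(answers)

# Delete: "Other" is excluded from analyses, so the base shrinks from 10 to 7.
after_delete = percentages(answers, excluded={"Other"})

# Hide: "Other" is only removed from view; the base stays at 10.
after_hide = {cat: pct for cat, pct in original.items() if cat != "Other"}

for label, table in [("Original", original), ("Delete", after_delete), ("Hide", after_hide)]:
    print(label, {cat: round(pct, 1) for cat, pct in table.items()})
```

With Delete, Coke rises from 50% to about 71.4% because the three Other responses leave the base. With Hide, Coke stays at 50% and the Other row is simply not shown.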
To undo the removal of categories from a table, select the table and click Data > Properties > UNHIDE ROWS AND COLUMNS > Unhide Categories Shown in Rows/Columns. You can also select the variable set in the Data Sources tree and modify the Missing Values settings under Data > Properties > DATA VALUES > Missing values, or else click Reset to revert all changes.
Excluding categories from averages
Numeric, Numeric - Multi, and Numeric - Grid variable sets, and STATISTICS > Right and STATISTICS > Below
In some cases, you may wish to keep a category showing in the table but remove its contribution to the Average, Sum, or other numeric statistics displayed via STATISTICS > Right or STATISTICS > Below. For example, a rating scale question may include a Don't Know category: you may want to see how many respondents selected it without those respondents contributing to the average score for the variable set.
To achieve this, select the variable in your Data Sources tree, click Data > Properties > DATA VALUES > Values, and set Missing Values for this category to Include in percentages (but not averages). This value is then no longer used in the calculation of the average.
The same method is used to change the way the Average, Sum, or other numeric statistics are calculated for tables showing Numeric, Numeric - Multi, and Numeric - Grid variable sets.
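As a rough illustration of what Include in percentages (but not averages) does to the numbers, the Python sketch below computes an average that skips a Don't Know code while still counting that code in the percentages. The ratings, the code 99 for Don't Know, and the helper names are assumptions for the example, not values or functions from Displayr.

```python
from statistics import mean

# Hypothetical 5-point ratings; 99 is an assumed code for "Don't Know".
DONT_KNOW = 99
ratings = [5, 4, 99, 3, 5, 99, 2, 4]

def average_excluding(values, excluded_code):
    """Average over the values that are not the excluded code."""
    return mean(v for v in values if v != excluded_code)

def percentage_of(values, code):
    """Share of all responses, including the excluded code, that equal code."""
    return 100.0 * values.count(code) / len(values)

average = average_excluding(ratings, DONT_KNOW)    # 23 / 6, Don't Know ignored
pct_dont_know = percentage_of(ratings, DONT_KNOW)  # 2 of 8 responses

print(f"Average rating: {average:.2f}")
print(f"Don't Know: {pct_dont_know:.0f}%")
```

The average is about 3.83 rather than the 27.63 that would result from treating 99 as a rating, while Don't Know still appears as 25% of responses.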