A variable set is a group of one or more variables. Variable sets are used to create summary tables and crosstabs, and can be accessed via code. A variable set has a structure, which determines the characteristics of any table created from it (e.g., which statistics are shown and how statistical testing is performed).
Variable sets are groups of one or more variables
Variable sets can also be used in calculations. The easiest way to do this is to drag the variable set into your code.
Variable sets are used to create tables (and can be accessed via code)
Variable sets are the building blocks of tables. A summary table summarizes the data from a single variable set. A crosstab is a table that contains two or more variable sets.
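The difference between a summary table and a crosstab can be sketched in a few lines of Python. This is only an illustration of the idea, not how Displayr builds tables. The data, the variable names, and the `summary_table` and `crosstab` helpers are all hypothetical, and each variable set here holds just one variable.

```python
from collections import Counter

# Hypothetical data: each list is one variable, with one entry per respondent.
gender = ["Male", "Female", "Female", "Male", "Female"]
preferred_cola = ["Coke", "Pepsi", "Coke", "Coke", "Pepsi"]


def summary_table(variable):
    """Summarize a single variable: count the respondents in each category."""
    return Counter(variable)


def crosstab(row_variable, column_variable):
    """Combine two variables: count respondents in each (row, column) pair."""
    return Counter(zip(row_variable, column_variable))


summary = summary_table(preferred_cola)
table = crosstab(preferred_cola, gender)

for (cola, sex), count in sorted(table.items()):
    print(f"{cola:6} {sex:7} {count}")
```

The summary table needs only one input, whereas the crosstab needs a row input and a column input. That is why the number of variable sets involved distinguishes the two kinds of table.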
Consequently, the key to creating tables is to create and modify variable sets. This is discussed in more detail in xxx.
A variable set has a structure, which determines how the variable set is used when creating tables (e.g., whether to show averages or percentages and how statistical tests are performed). When a variable set is selected, its structure is shown in the Structure field in the object inspector, on the right side of the screen. Sometimes a definition of the structure appears immediately to the right, as in the case below.
The structure of a variable set is also indicated by the icon that is used to represent the variable set.
The structure of a variable set determines how it is analyzed when it is used to create a table. For example, a summary table of a Binary - Grid variable set (defined below) appears as a grid.
The structure of a variable set is made up of its measurement scale and its set type. The measurement scale describes the properties of the data: whether it is text, binary, nominal, ordinal, numeric, or date/time. The set type describes whether the set contains a single variable or multiple variables and, if multiple, how those variables are arranged. Displayr supports 13 variable set structures.
An individual variable has a measurement scale. Displayr recognizes the following measurement scales:
- Text: Typically used to store unstructured text data.
- Binary: There can be only two categories (e.g., Yes, No), or two categories and missing values.
- Nominal: Two or more categories that are not in any natural order (e.g., Red, Green, Blue).
- Ordinal: Two or more categories with an unambiguous ordering (e.g., Dislike, Ambivalent, Like).
- Numeric: Data where a number is stored, and the number has no associated label (e.g., 1.23, 1, 0).
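To make the link between measurement scale and table content more concrete, here is a minimal sketch in which the scale decides whether a variable is summarized as an average or as percentages. The `summarize` function, the use of `None` for missing values, and the choice to exclude missing values from the base are assumptions made for this example. They are not a description of Displayr's internals.

```python
from collections import Counter
from statistics import mean


def summarize(values, scale):
    """Choose a summary statistic based on the (assumed) measurement scale."""
    # Assumption for this sketch: None marks a missing value and is dropped.
    observed = [v for v in values if v is not None]
    if scale == "Numeric":
        return {"Average": mean(observed)}
    if scale in ("Binary", "Nominal", "Ordinal"):
        counts = Counter(observed)
        return {cat: 100 * n / len(observed) for cat, n in counts.items()}
    raise ValueError(f"This sketch defines no summary for scale {scale!r}")


numeric_summary = summarize([2, 0, None, 4], "Numeric")
nominal_summary = summarize(["Red", "Blue", "Red", "Red"], "Nominal")

print(numeric_summary)
print(nominal_summary)
```

The same raw values would be summarized differently if the scale were changed, which is why setting the correct scale matters before creating tables.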
There are numerous possible set types, but they broadly fall into four groups:
- Variable sets containing a single variable.
- Variable sets containing a set of variables with no order or a one-dimensional order.
- Variable sets containing a set of variables that can be arranged in a two-dimensional order (grids). These are illustrated below.
- Variable sets containing structural dependencies between the variables. These are illustrated below.
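The grid set type in the list above can be sketched as follows. Each variable is assumed to carry a (row, column) label pair, so a flat collection of variables can be laid out in two dimensions. The data, the labels, and the `grid_summary` helper are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical grid data: one binary variable per (brand, occasion) pair,
# where 1 = selected and 0 = not selected, with one entry per respondent.
variables = {
    ("Coca-Cola", "At home"): [1, 0, 1, 1],
    ("Coca-Cola", "Out and about"): [0, 0, 1, 0],
    ("Pepsi", "At home"): [0, 1, 0, 0],
    ("Pepsi", "Out and about"): [1, 1, 0, 0],
}


def grid_summary(variables):
    """Arrange percent-selected figures as rows (brands) by columns (occasions)."""
    grid = {}
    for (row, column), values in variables.items():
        grid.setdefault(row, {})[column] = 100 * sum(values) / len(values)
    return grid


grid = grid_summary(variables)

for brand, cells in grid.items():
    print(brand, cells)
```

With only a one-dimensional order, the same four variables would be shown as a single list. The second label is what allows them to be shown as a brands-by-occasions grid.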
Displayr's 13 variable set structures
- Binary - Multi. Consider data that comes from a question that asks "Which of these brands have you drunk in the past 7 days?" with the options Coke, Pepsi, Dr Pepper, and None of these. The most convenient way of storing this data is as four binary variables, with one for each brand and one for None of these. In Displayr, this is stored as a Binary - Multi variable set.
- Nominal - Multi. This is used for storing sets of related nominal variables. For example, the main food type consumed at breakfast, at lunch, at dinner, etc.
- Ordinal - Multi. This is used for sets of ratings. For example, satisfaction with KFC, McDonald's, etc.
- Numeric - Multi. This is used for sets of related numeric variables. For example, "Q2a. How many times did you drink these brands in the past week? ___ Coca-Cola ___ Diet Coke ___ Coke Zero ___ Pepsi ___ Diet Pepsi ___ Pepsi Max"
- Binary - Multi (Compact). This is for data that is stored as Nominal - Multi, but needs to be interpreted as Binary - Multi.
- Binary - Grid. This is for Binary - Multi data where there is a clear two-dimensional structure. For example, "Which of these did you drink 'at home' in the past week? Coca-Cola, Diet Coke, Coke Zero, Pepsi, Diet Pepsi, Pepsi Max. And which of these did you drink 'out and about' in the past week? Coca-Cola, Diet Coke, Coke Zero, Pepsi, Diet Pepsi, Pepsi Max."
- Numeric - Grid. This is for Numeric - Multi data where there is a clear two-dimensional structure. For example, "How many times did you drink these brands 'at home' in the past week? ___ Coca-Cola ___ Diet Coke ___ Coke Zero ___ Pepsi ___ Diet Pepsi ___ Pepsi Max. And how many times did you drink these brands 'out and about' in the past week? ___ Coca-Cola ___ Diet Coke ___ Coke Zero ___ Pepsi ___ Diet Pepsi ___ Pepsi Max."
- Ranking. This is used when each variable contains the ranking that a respondent has assigned to an object.
- Experiment. This is used for some advanced statistical analysis. It's documented on the Q Wiki.
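The Binary - Multi (Compact) entry in the list above is easier to picture with a small example. This sketch assumes a compact layout in which each nominal variable holds one of the brands a respondent selected, with `None` where there is no further selection. The layout, the variable names, and the `expand_compact` helper are all assumptions for illustration; the source does not specify how Displayr stores or converts this data.

```python
# Hypothetical compact storage: each variable holds one selected brand per
# respondent, with None meaning "no further selection".
mention_1 = ["Coke", "Pepsi", "Dr Pepper"]
mention_2 = ["Pepsi", None, None]


def expand_compact(compact_variables, categories):
    """Turn compact nominal variables into one 0/1 variable per category."""
    # Regroup the data so that each tuple holds one respondent's selections.
    respondents = list(zip(*compact_variables))
    return {
        category: [int(category in selections) for selections in respondents]
        for category in categories
    }


binary = expand_compact([mention_1, mention_2], ["Coke", "Pepsi", "Dr Pepper"])

for brand, flags in binary.items():
    print(brand, flags)
```

After expansion there is one binary variable per brand, which matches the shape described for Binary - Multi. The compact form instead needs only as many variables as the maximum number of selections made by any respondent.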