A variable set is a group of one or more variables. Variable sets are used to create summary tables and crosstabs, and can be accessed via code. A variable set has a structure, which determines the characteristics of the table that is created (e.g., which statistics are shown, how statistical testing is performed).
Variable sets are groups of one or more variables
Variable sets can also be used in calculations. The easiest way to do this is to drag the variable set into your code.
Variable sets are used to create tables (and can be accessed via code)
Variable sets are the building blocks of tables. A summary table summarizes the data from a single variable set. A crosstab is a table that contains two or more variable sets.
Consequently, the key to creating tables is to create and modify variable sets.
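The difference between a summary table and a crosstab can be sketched outside Displayr. The following is a minimal illustration in plain Python, not Displayr's own code: the data and the function names `summary_table` and `crosstab` are hypothetical, and each variable set is assumed to contain a single nominal variable.

```python
from collections import Counter

# Hypothetical survey data: each list is one variable, one entry per respondent.
age_group = ["18-34", "35-54", "18-34", "55+", "35-54", "18-34"]
preferred_cola = ["Coke", "Pepsi", "Coke", "Coke", "Pepsi", "Pepsi"]


def summary_table(values):
    """Percentage of respondents in each category of a single variable."""
    counts = Counter(values)
    total = len(values)
    return {category: 100 * n / total for category, n in counts.items()}


def crosstab(rows, columns):
    """Number of respondents in each (row category, column category) pair."""
    return Counter(zip(rows, columns))


# A summary table uses one variable set; a crosstab combines two.
summary = summary_table(preferred_cola)
table = crosstab(age_group, preferred_cola)

print(summary)
print(dict(table))
```

In this sketch the summary table depends only on `preferred_cola`, while the crosstab needs both variables, which mirrors the distinction described above.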
A variable set has a structure, which determines how the variable set is used when creating tables (e.g., whether to show averages or percentages and how statistical tests are performed). When a variable set is selected, its structure is shown in the Structure field in the object inspector, on the right side of the screen. Sometimes a definition of the structure appears immediately to the right, as in the case below.
The structure of a variable set is also indicated by the icon that is used to represent the variable set.
The structure of a variable set determines how it is analyzed when it is used to create a table. For example, a summary table of a Binary - Grid variable set (defined below) appears as a grid.
The structure of a variable set is made up of its measurement scale and its set type. The measurement scale refers to the properties of the data: whether it is text, binary, nominal, ordinal, numeric, or date/time. The set type refers to whether there is a single variable or multiple variables; multiple variables can be arranged in several ways. Displayr supports 13 variable set structures.
An individual variable has a measurement scale. This determines how the values are treated when used in tables and other analyses. Displayr recognizes the following measurement scales:
- Text: Typically this is used to store unstructured text data.
- Binary: There can only be two categories (e.g., Yes, No), or two categories and missing values.
- Nominal: Two or more categories that are not in any natural order (e.g., Red, Green, Blue).
- Ordinal: Two or more categories with an ordering (e.g., Dislike, Ambivalent, Like).
- Numeric: Data where a number is stored, and the number has no associated label (e.g., 1.23, 1, 0).
- Date/Time: Dates stored on a continuous scale that can be grouped into time periods for easy analysis.
There are numerous possible set types, but they broadly fall into four groups:
- Variable sets containing a single variable.
- Variable sets containing a set of variables with no order or a one-dimensional order.
- Variable sets containing a set of variables that can be arranged in a two-dimensional order (grids). These are illustrated below.
- Variable sets containing structural dependencies between the variables. These are illustrated below.
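One way to picture how a structure combines a measurement scale with a set type is the small model below. It is only a sketch: the `SetType` labels and the `structure_label` function are hypothetical names chosen for illustration, and not every combination the function can produce is necessarily a structure that Displayr offers.

```python
from enum import Enum


class MeasurementScale(Enum):
    TEXT = "Text"
    BINARY = "Binary"
    NOMINAL = "Nominal"
    ORDINAL = "Ordinal"
    NUMERIC = "Numeric"
    DATE_TIME = "Date/Time"


class SetType(Enum):
    # Hypothetical labels for the four broad groups of set types.
    SINGLE = "single variable"
    MULTI = "multiple variables, unordered or one-dimensional"
    GRID = "multiple variables in a two-dimensional grid"
    DEPENDENT = "variables with structural dependencies"


# Suffix appended to the scale name for the first three groups.
_SUFFIX = {SetType.SINGLE: "", SetType.MULTI: " - Multi", SetType.GRID: " - Grid"}


def structure_label(scale, set_type):
    """Build a structure name such as 'Binary - Grid' from its two parts."""
    if set_type not in _SUFFIX:
        # Structures with dependencies are not named after a scale,
        # so this sketch does not try to label them.
        raise ValueError("no scale-based label for this set type")
    return scale.value + _SUFFIX[set_type]


print(structure_label(MeasurementScale.BINARY, SetType.GRID))
print(structure_label(MeasurementScale.NUMERIC, SetType.MULTI))
print(structure_label(MeasurementScale.NOMINAL, SetType.SINGLE))
```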
Displayr's 13 variable set structures
These are 13 variable set structures in Displayr along with an example of what each looks like in a table:
- Binary - Multi. Consider data that comes from a question that asks Which of these brands have you drunk in the past 7 days? Coke, Pepsi, Dr Pepper, None of these. The most convenient way of storing this data is as four binary variables, with one for each brand and one for None of these. In Displayr, this is stored as a Binary - Multi variable set.
- Nominal - Multi. This is used for storing sets of related nominal variables. E.g., main food type consumed at breakfast, at lunch, at dinner, etc.
- Ordinal - Multi. This is used for sets of ratings. E.g., satisfaction with KFC, McDonald's, etc.
- Numeric - Multi. This is used for sets of related numeric variables. For example, How many times did you drink these brands in the past week? ___ Coca-Cola ___ Diet Coke ___ Coke Zero ___ Pepsi ___ Diet Pepsi ___ Pepsi Max
- Binary - Multi (Compact). This is for data that is stored as Nominal - Multi, but needs to be interpreted as Binary - Multi.
- Binary - Grid. This is for Binary - Multi data where there is a clear two-dimensional structure. For example, Which of these did you drink 'at home' in the past week? Coca-Cola, Diet Coke, Coke Zero, Pepsi, Diet Pepsi, Pepsi Max. And which of these did you drink 'out and about' in the past week? Coca-Cola, Diet Coke, Coke Zero, Pepsi, Diet Pepsi, Pepsi Max.
- Numeric - Grid. This is for Numeric - Multi data where there is a clear two-dimensional structure. For example, How many times did you drink these brands 'at home' in the past week? ___ Coca-Cola ___ Diet Coke ___ Coke Zero ___ Pepsi ___ Diet Pepsi ___ Pepsi Max. And how many times did you drink these brands 'out and about' in the past week? ___ Coca-Cola ___ Diet Coke ___ Coke Zero ___ Pepsi ___ Diet Pepsi ___ Pepsi Max.
- Ranking. This is used when each variable contains the ranking that a respondent has assigned to an object. It consists of multiple numeric variables that represent a ranking, where the highest number is most preferred and ties are permitted.
For example, Rank the following brands according to how much you like them... Place a 3 next to the brand you like most, a 2 next to your next preferred brand and a 1 next to your least preferred brand.
Note that if your question uses lower numbers to indicate more preferred alternatives, you will need to reverse the values assigned to each rank.
- Experiment. This structure is used to represent the various types of experiments, from fully randomized experiments through to Conjoint Analysis and Choice Modeling. See Experiments for more information.
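Two of the structures above imply a simple transformation of the stored data: Binary - Multi (Compact) reinterprets nominal values as one binary flag per category, and a ranking collected with 1 as most preferred needs its values reversed. The sketch below illustrates both in plain Python. It is not Displayr's implementation: `expand_compact` and `reverse_ranks` are hypothetical helper names, missing values are assumed to be stored as `None`, and ranks are assumed to run from 1 to the number of items ranked.

```python
def expand_compact(responses, categories):
    """Turn nominal-coded selections into one 0/1 flag per category.

    Each row of `responses` holds the categories one respondent selected,
    one per nominal variable, with None where nothing was recorded.
    """
    return [
        {category: int(category in row) for category in categories}
        for row in responses
    ]


def reverse_ranks(ranks, n_items):
    """Flip ranks so that the highest number becomes the most preferred.

    With n_items = 3, a rank of 1 becomes 3, 2 stays 2, and 3 becomes 1.
    Missing ranks (None) are left untouched.
    """
    return [None if r is None else n_items + 1 - r for r in ranks]


# Hypothetical data: two respondents, up to three selections each.
compact = [["Coke", "Pepsi", None], ["Dr Pepper", None, None]]
brands = ["Coke", "Pepsi", "Dr Pepper"]
binary = expand_compact(compact, brands)

# One respondent's ranks for three brands, where 1 meant "like most".
flipped = reverse_ranks([1, 3, 2], n_items=3)

print(binary)
print(flipped)
```

Passing the number of items explicitly, rather than inferring it from the largest rank present, keeps the reversal consistent for respondents who did not rank every item.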