Introduction
Displayr can automatically code a single simple text variable, or multiple text variables (keeping responses alphabetically ordered). This is distinct from Displayr’s semi-automatic and manual categorization features, which require you to create categories and allocate responses to them. It is also distinct from Displayr’s automatic categorization feature. Here, Displayr uses automatic rules to create and allocate codes behind the scenes. This is very useful for CSV files, where categorical data is often stored as labels rather than as numeric values (e.g. a question such as “What is your favourite animal?” would have data values of “Ants”, “Dogs”, “Cats”).
Method
How to automatically code a single simple text variable
- Select the variable in the Data Sets tree.
- In the Object Inspector on the right, change the Structure from Text to Nominal.
How to automatically code multiple text variables
- Select the text variables which should have a shared code frame in the Data Sets tree.
- Right-click and select Combine, or select Combine in the toolbar.
- The text variables have now been combined into a single variable set.
- In the Object Inspector on the right, change the Structure for the new variable set from Text – Multi to Nominal – Multi.
How automatic coding works
Key points
- Converting a Text variable to a Nominal variable via Structure will automatically code text responses.
- Auto-coded variables that are part of the same question (e.g. Binary – Multi, Nominal – Multi, etc.) share the same code frame. This means all text responses from all the source text variables will be coded together and have the same numeric values.
- When automatically coding multiple text variables that are related, first use Combine to combine them into a Text – Multi variable set, and then change the Structure to Nominal – Multi (or any other categorical type). This ensures responses from all variables are alphabetically ordered.
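The idea of a shared code frame can be illustrated with a small Python sketch. This is not Displayr's implementation. The function name `shared_code_frame`, the variable names, and the choice to treat blank responses as missing are assumptions made for illustration. It pools the responses from several related text variables, so the same response receives the same numeric value whichever variable it appears in.

```python
def shared_code_frame(variables):
    """Pool responses from several text variables into one code frame.

    `variables` maps a (hypothetical) variable name to its list of responses.
    Returns (values, coded): `values` maps each normalised response to a
    number, `coded` maps each variable name to its list of coded values.
    """
    def normalise(text):
        # Ignore leading/trailing spaces and letter case.
        return text.strip().casefold()

    # Collect the unique non-blank responses across ALL variables,
    # in alphabetical order.
    pooled = sorted(
        {normalise(r) for responses in variables.values() for r in responses if r.strip()}
    )
    values = {key: number for number, key in enumerate(pooled, start=1)}

    # Blank responses get None (assumption: treated as missing).
    coded = {
        name: [values.get(normalise(r)) for r in responses]
        for name, responses in variables.items()
    }
    return values, coded


pets = {
    "first_pet": ["Dogs", "cats", "Ants"],
    "second_pet": ["Cats", "Birds", ""],
}
values, coded = shared_code_frame(pets)
print(values)  # {'ants': 1, 'birds': 2, 'cats': 3, 'dogs': 4}
print(coded)   # {'first_pet': [4, 3, 1], 'second_pet': [3, 2, None]}
```

Because both variables are coded against one pooled list, “cats” in the first variable and “Cats” in the second both receive the value 3.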
How Displayr automatically codes text responses
- As with the manual categorization feature, spaces at the start and end of responses are ignored, and coding is not case sensitive. For example, “ dogs” and “Dogs ” will both be coded as the same category.
- When making a label for the coded categories, Displayr chooses the label that occurs most often in the text responses. For example, if the responses were “coke”, “COKE”, “Coke” and “Coke”, the auto-coded question would use “Coke” as the label for the category.
- The coded categories are in alphabetical order, both in the Value Attributes dialog and on tables.
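The three rules above can be sketched in Python. This is an illustration of the described behaviour, not Displayr's actual code. The function name `auto_code`, the handling of blank responses, and the tie-breaking when two spellings are equally common are all assumptions.

```python
from collections import Counter, defaultdict


def auto_code(responses):
    """Return (code_frame, coded) for a list of text responses.

    code_frame maps numeric value -> label; coded is the list of values.
    """
    # Group the spellings of each response under a normalised key
    # (trimmed, case-insensitive) and count how often each spelling occurs.
    variants = defaultdict(Counter)
    for text in responses:
        trimmed = text.strip()
        if trimmed:
            variants[trimmed.casefold()][trimmed] += 1

    # Categories are numbered in alphabetical order.
    ordered_keys = sorted(variants)
    values = {key: number for number, key in enumerate(ordered_keys, start=1)}

    # The label is the most frequent spelling. On a tie, the spelling seen
    # first is used here (an assumption; the article does not say).
    code_frame = {
        values[key]: variants[key].most_common(1)[0][0] for key in ordered_keys
    }

    # Blank responses are coded as None (assumption: treated as missing).
    coded = [values.get(text.strip().casefold()) for text in responses]
    return code_frame, coded


responses = [" dogs", "Dogs ", "Dogs", "coke", "COKE", "Coke", "Coke", "Ants"]
code_frame, coded = auto_code(responses)
print(code_frame)  # {1: 'Ants', 2: 'Coke', 3: 'Dogs'}
print(coded)       # [3, 3, 3, 2, 2, 2, 2, 1]
```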
How Displayr deals with changes in the data file
- Whenever the source text variables are updated (from either an updated data file or an edit within Displayr), the responses are automatically re-coded and the code frame is updated.
- Whenever auto-coded variables are combined into a multi-variable set, their code frames change to include unique responses from all other input text variables.
- Whenever auto-coded variables are moved from a multi-variable set to their own single variables, their code frames stop including responses from the other text variables and only include their own responses. Importantly, their category values stay the same.
- Existing text responses always keep the same category value (e.g. if “Ants” was originally the first alphabetical response with an auto-coded value of 1, and “Aardvarks” appeared in the new data, “Ants” would remain with a value of 1, and “Aardvarks” would get a new unique value).
- The category labels may change if a different spelling of a text response becomes the most frequent one (e.g. if the new responses were “coke”, “COKE”, “Coke”, “Coke”, “coke” and “coke”, the new label would be “coke”).
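The update behaviour described above can be sketched as follows. This Python example is a hedged illustration rather than Displayr's implementation. The function name `update_code_frame` is hypothetical, and numbering new responses from the next unused integer is an assumption (the article only says they get a new unique value).

```python
from collections import Counter, defaultdict


def update_code_frame(previous_values, responses):
    """Re-code responses while keeping previously assigned values.

    previous_values maps normalised response -> value from the last run.
    Returns (values, labels): the updated mapping and value -> label.
    """
    variants = defaultdict(Counter)
    for text in responses:
        trimmed = text.strip()
        if trimmed:
            variants[trimmed.casefold()][trimmed] += 1

    # Existing responses keep their values, even if that breaks
    # alphabetical numbering. Values for responses that no longer appear
    # are retained in this sketch (an assumption).
    values = dict(previous_values)
    next_value = max(values.values(), default=0) + 1
    for key in sorted(variants):
        if key not in values:
            values[key] = next_value  # assumption: next unused integer
            next_value += 1

    # Labels are recomputed each time, so they can change with the data.
    labels = {
        values[key]: counts.most_common(1)[0][0] for key, counts in variants.items()
    }
    return values, labels


old_data = ["Ants", "coke", "COKE", "Coke", "Coke"]
values_v1, labels_v1 = update_code_frame({}, old_data)
print(values_v1, labels_v1)  # {'ants': 1, 'coke': 2} {1: 'Ants', 2: 'Coke'}

new_data = old_data + ["Aardvarks", "coke", "coke"]
values_v2, labels_v2 = update_code_frame(values_v1, new_data)
print(values_v2)  # {'ants': 1, 'coke': 2, 'aardvarks': 3}
print(labels_v2)  # {1: 'Ants', 2: 'coke', 3: 'Aardvarks'}
```

In the second run “Ants” keeps the value 1 even though “Aardvarks” now sorts first, and the label for value 2 changes from “Coke” to “coke” because the lower-case spelling has become the most frequent.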
Next
How to Semi-Automatically Code Text Data
How to Refine and Edit Text Categories After Categorization
How to Manually Code Mutually Exclusive (Single Response) Text Data