This article tells you how to automatically code unstructured text data. It will take you from unstructured verbatim (raw) text responses:
To a state where the verbatims are automatically categorized:
You will need a Text variable in order to perform automatic coding. Text variables are represented by a small "a" icon next to the variable in the Data Sets tree:
- From the toolbar, go to Anything > Advanced Analysis > Text Analysis > Automatic Categorization > Unstructured Text.
- In the object inspector, under Inputs > DATA SOURCE > Text variable, select the text variable you want to automatically code.
- In the object inspector, under Inputs > DATA SOURCE > Category Creation, select Create New Categorization.
- Under Inputs > CATEGORIES > Number of categories, enter the number of categories you want to create. The default is 10.
- To save the categorizations for use in tables and other outputs, select the automatic categorization output on your page and click Inputs > SAVE VARIABLE(S) > Categories or First Category from the object inspector.
  - Categories: Saves variables containing the categories to the data set. Where there are multiple input variables, a separate set of variables is added for each one.
  - First Category: Saves a variable containing the first category mentioned to the data set. Where there are multiple input categories, the first category of each will be saved as a separate variable.
NOTE: Variables created with SAVE VARIABLE(S) > Categories or First Category may become invalid if the output changes, for example because the input text variable or the input settings were modified. If that happens, delete the saved variables and create them again.
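
To make the difference between the two save options more concrete, here is a minimal Python sketch. It is not the product's implementation: the `assigned` data, the category names, and both helper functions are hypothetical, and it assumes that Categories produces one 0/1 variable per category while First Category produces a single variable holding the first category mentioned in each response.

```python
from typing import Dict, List, Optional

# Hypothetical result of an automatic categorization: for each response,
# the categories assigned to it, in the order they were mentioned.
assigned: List[List[str]] = [
    ["Price", "Service"],
    ["Service"],
    [],                      # response that matched no category
    ["Quality", "Price"],
]

# Hypothetical category names (assumption, not product output).
category_names: List[str] = ["Price", "Service", "Quality"]


def categories_variables(
    assigned: List[List[str]], names: List[str]
) -> Dict[str, List[int]]:
    """Build one 0/1 variable per category (assumed shape of 'Categories')."""
    return {
        name: [1 if name in cats else 0 for cats in assigned]
        for name in names
    }


def first_category_variable(assigned: List[List[str]]) -> List[Optional[str]]:
    """Build one variable with the first category mentioned, or None if empty."""
    return [cats[0] if cats else None for cats in assigned]


binary_vars = categories_variables(assigned, category_names)
first_cat = first_category_variable(assigned)

for name, values in binary_vars.items():
    print(name, values)
print("First category:", first_cat)
```

In this sketch a response can belong to several categories in the first form but to only one in the second, which is why the two options suit different kinds of tables.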