This article describes how to go from verbatim text responses, in either a single language or multiple languages, to a state where the responses are translated into the language of your choice and automatically categorized.
This feature automatically categorizes the text variable containing unstructured text into single-response or multiple-response categories.
You can allow the algorithm to determine appropriate categories and their labels automatically based on patterns observed in the data. Alternatively, you can provide a partial categorization of cases in the data, and the algorithm will predict which of the user-specified categories the remaining cases belong to, using the method described here: How to Automatically Classify New Text Data Using an Existing Categorization.
These categories for all cases can then be saved by clicking Inputs > SAVE VARIABLE(S) > Categories.
Requirements
You will need a Text variable to perform automatic coding. Text variables are represented by a small "a" next to the variable in the Data Sets tree.
OPTIONAL: If your input variable contains multiple languages, you will need to supply a nominal variable indicating the language of each response, which enables multiple languages to be translated at the same time.
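To illustrate how a language variable lines up with a text variable, here is a minimal Python sketch. It is not the product's implementation: the names `responses`, `languages`, and `group_by_language` are hypothetical, and the data is made up. It pairs each case's text with its language label and groups the cases by language.

```python
from collections import defaultdict

def group_by_language(responses, languages):
    """Group case indices by language label (hypothetical helper)."""
    if len(responses) != len(languages):
        raise ValueError("Each response needs exactly one language label")
    groups = defaultdict(list)
    for case, (text, lang) in enumerate(zip(responses, languages)):
        if text and text.strip():  # ignore empty responses
            groups[lang].append(case)
    return dict(groups)

# Made-up example data: one text value and one nominal language value per case
responses = ["Great service", "Trop cher", "", "Sehr gut", "Livraison lente"]
languages = ["English", "French", "English", "German", "French"]
batches = group_by_language(responses, languages)
```

The key point is that the two variables must have one value per case, in the same case order, so that every response can be matched to a source language.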
Method
- From the toolbar, go to Anything > Advanced Analysis > Text Analysis > Automatic Categorization > Unstructured Text.
- In the object inspector, under Inputs > DATA SOURCE > Text variable, select the text variable you would like to categorize and translate.
- Under Inputs > TRANSLATE (GOOGLE CLOUD TRANSLATION), specify the Source language. If your text variable contains more than one language, select Specify with variable, then choose the nominal variable that identifies the languages from the Source language variable dropdown.
- Specify the Output language that you would like the responses to be translated to.
- Click Calculate if Automatic is not already ticked.
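The translation step above can be pictured with the following Python sketch. It makes several assumptions: `translate_responses` and `fake_translate` are hypothetical names, and the lookup table merely stands in for a real translation service (no actual translation API is called here). It shows how a per-case source language and a single output language combine.

```python
def translate_responses(responses, source_languages, output_language, translate_fn):
    """Translate each response from its own source language (illustrative only)."""
    translated = []
    for text, source in zip(responses, source_languages):
        if source == output_language or not text.strip():
            translated.append(text)  # nothing to translate
        else:
            translated.append(translate_fn(text, source, output_language))
    return translated

# Stand-in for a translation service: a tiny made-up lookup table
_LOOKUP = {
    ("Trop cher", "French", "English"): "Too expensive",
    ("Sehr gut", "German", "English"): "Very good",
}

def fake_translate(text, source, target):
    # Falls back to the original text when no translation is known
    return _LOOKUP.get((text, source, target), text)

responses = ["Great service", "Trop cher", "Sehr gut"]
source_languages = ["English", "French", "German"]
english = translate_responses(responses, source_languages, "English", fake_translate)
```

Passing the translator in as a function keeps the sketch independent of any particular service; the categorization would then run on the translated text.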
OPTIONAL:
You can provide a partial categorization of cases in the data, and the algorithm will predict which of the user-specified categories the remaining cases belong to, using the method described here: How to Automatically Classify New Text Data Using an Existing Categorization.
These categories for all cases can then be saved by clicking Inputs > SAVE VARIABLE(S) > Categories.
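As a rough picture of what completing a partial categorization means, the Python sketch below assigns each uncoded response to the user-specified category whose coded examples share the most words with it. This simple word-overlap rule is an assumption chosen for illustration and is not the algorithm the software uses; `predict_categories`, `tokenize`, and the sample data are hypothetical.

```python
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and split on whitespace, keeping alphabetic words only."""
    return [w for w in text.lower().split() if w.isalpha()]

def predict_categories(texts, partial_labels, fallback="Uncategorized"):
    """Fill in missing (None) labels using word overlap with coded cases."""
    # Count the words seen in each category among the already-coded cases
    counts = defaultdict(Counter)
    for text, label in zip(texts, partial_labels):
        if label is not None:
            counts[label].update(tokenize(text))

    completed = []
    for text, label in zip(texts, partial_labels):
        if label is not None:
            completed.append(label)  # keep the user's own coding
            continue
        tokens = tokenize(text)
        scores = {cat: sum(c[t] for t in tokens) for cat, c in counts.items()}
        if scores and max(scores.values()) > 0:
            completed.append(max(scores, key=scores.get))
        else:
            completed.append(fallback)  # no overlap with any category
    return completed

texts = ["price is too high", "delivery was slow", "the price went up",
         "slow delivery again", "love the colour"]
partial = ["Price", "Delivery", None, None, None]
categories = predict_categories(texts, partial)
```

In this sketch every case ends up with exactly one category, which mirrors a single-response categorization; the completed list plays the role of the saved Categories variable.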
Next
How to Automatically Translate Text Variables into Other Languages
How To Automatically Code Unstructured Text Data
How to Automatically Classify New Text Data Using an Existing Categorization