This article describes how to translate text into different languages when using Displayr's text categorization tools. For example, you can automatically translate Chinese text:
...to another language such as English:
...and then categorize the responses into a desired output language such as French:
If you'd like to create new translated variables instead of categorizing the translated text, please see: How to Automatically Translate Text Variables into Other Languages
Requirements
- A data set with a text variable.
- If translating from multiple languages, you can use a variable that stores the language of each text response. See Displayr's list of currently supported languages.
- Familiarity with Displayr's manual or semi-automatic text categorization tool. For more information see: How to Semi-Automatically Code Text Data
Text translation is available through the following tools:
- Text Analysis > Semi-Automatic Categorization > Mutually Exclusive Categories
- Text Analysis > Semi-Automatic Categorization > Multiple Overlapping Categories
- Text Analysis > Manual Categorization > Mutually Exclusive Categories
- Text Analysis > Manual Categorization > Multiple Overlapping Categories
Translating text containing a single language
The following example uses the Manual Categorization tool to translate Chinese text, but the same process works with the Semi-Automatic Categorization tool. It assumes that all the text is in a single language.
- Select the text variable from the Data Sets tree, hover over it, and click the button that appears to the right of it.
- Select Anything > Data > Variables > Text Categorization > Manual > Mutually Exclusive Categories > New.
Alternatively, select Anything > Advanced Analysis > Text Analysis > Manual Categorization > Mutually Exclusive Categories > New.
- Click the Translate button to the right of Inputs and Back Coding.
The results are as follows:
- From the Source language dropdown, choose one of the following options:
  - Automatically detect language.
  - Specify with variable. Use this option if the source language is identified by a variable in your data set. It is particularly useful if your file contains multiple languages. Note that if your language variable has missing data, Displayr will make a best guess at the language.
  - A specific language. The default is English.
- From the Output language dropdown, choose the language you want to translate the text to.
In this example, the source language is detected automatically and the text is translated to English:
- Click the Translate button.
Important: When the source language and output language are different, your data will be sent over an encrypted connection to Google to be translated. Click Cancel if you do not want your data sent to Google.
When the translation is complete, a confirmation message appears. In this example:
- Click OK.
Notice that "[Translated]" appears next to the verbatims:
To see the original, untranslated text of a response, hover over the response. The original text will appear in a tooltip:
Translating text containing multiple languages
If your text contains multiple source languages, all of it can be translated at once, as long as you have a variable that identifies the source language of each verbatim. See Displayr's list of currently supported languages.
In this example, the variable Multilingual contains both French and German text. The second variable, Language, identifies whether the text is French or German.
To translate the text:
- From the Insert Variable(s) menu, select Text Categorization > Manual > Mutually Exclusive Categories > New.
- Click the Translate button.
- From the Source language dropdown, select Specify with variable.
The source text is translated into the desired language by Google.
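The per-response logic described above can be pictured with a small sketch. This is not Displayr's implementation: the function name, the `AUTO_DETECT` sentinel, and the sample data are all hypothetical, and the sketch only illustrates how a language variable might determine the source language for each response, with a fallback to detection when the value is missing.

```python
from typing import List, Optional

# Hypothetical sentinel meaning "let the translation service guess the language".
AUTO_DETECT = "auto"


def resolve_source_languages(languages: List[Optional[str]]) -> List[str]:
    """Return one source-language label per response.

    Missing or blank entries fall back to AUTO_DETECT, mirroring the
    "best guess" behaviour described for language variables with missing data.
    """
    resolved = []
    for lang in languages:
        if lang is None or not lang.strip():
            resolved.append(AUTO_DETECT)
        else:
            resolved.append(lang.strip())
    return resolved


# Illustrative data shaped like the Multilingual and Language variables.
multilingual = ["Bonjour tout le monde", "Guten Morgen", "Merci beaucoup"]
language = ["French", "German", None]

sources = resolve_source_languages(language)
for text, source in zip(multilingual, sources):
    print(f"{source}: {text}")
```

In this sketch, the third response has no language recorded, so it is marked for automatic detection rather than being skipped.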
Automatic source language detection
If you choose to have Google automatically detect the source languages, the detected languages are summarized after translation. You may choose to cancel and re-translate with different settings if you are unhappy with the detection results.
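As a rough illustration of what such a summary contains, the sketch below counts how many responses were assigned to each detected language. The function name and the sample labels are hypothetical; the actual summary is produced by Displayr in the translation dialog, and its layout may differ.

```python
from collections import Counter
from typing import List, Tuple


def summarize_detected(detected: List[str]) -> List[Tuple[str, int]]:
    """Count responses per detected language, most frequent first."""
    return Counter(detected).most_common()


# Hypothetical detection results, one label per response.
detected = ["French", "German", "French", "French"]

summary = summarize_detected(detected)
for lang, count in summary:
    print(f"{lang}: {count} responses")
```

A language that appears with only a handful of responses in a summary like this can be a sign of misdetection, which is one reason to cancel and re-translate with a specific source language or a language variable.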
Automatic translation for categorizations with translation settings
Once you have chosen the translation settings in the translation dialog and saved the categorization, all responses are automatically translated using the saved settings in future editing sessions of this categorization in this document.
For initial categorizations, and for subsequent categorizations where no translation settings are specified, all data is assumed to be in English.
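One way to think about these defaults is as a settings record that starts out as English and only triggers translation once the source and output languages differ. The sketch below is a conceptual model only: the `TranslationSettings` class and its field names are hypothetical, and the English default for the output language is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationSettings:
    """Hypothetical model of the settings saved with a categorization."""

    source_language: str = "English"  # data is assumed to be English by default
    output_language: str = "English"  # assumption: output also defaults to English

    def needs_translation(self) -> bool:
        # Text is only sent for translation when the two languages differ.
        return self.source_language != self.output_language


# A new categorization with no translation settings specified.
default = TranslationSettings()

# A categorization saved after choosing Chinese to English.
saved = TranslationSettings(source_language="Chinese", output_language="English")

print(default.needs_translation(), saved.needs_translation())
```

Modelled this way, a categorization without saved settings never triggers translation, while one with saved settings applies them each time it is reopened.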
Once you have customized and consented to translation settings for the categorization, any subsequent categorizations will mention the translation settings: