Text analysis encompasses a range of techniques for analyzing data stored as text. This article answers some of the most frequently asked questions (FAQ) about Text Analysis and the Displayr features that support it. For a more detailed review of these features, see Finding the Best Text Analysis for Your Data.
Questions:
- What is Text Categorization?
- How do I enable AI for Text Categorization?
- How do I load data for Text Analysis?
- How do I use AI to discover themes in my text data?
- How can I automatically classify responses into themes?
- Can I edit AI-created themes?
- Can I create my own themes?
- Can I edit how the AI has classified responses?
- Can the AI reclassify responses that have already been classified?
- Can I classify responses into themes manually?
- Can I delete themes?
- Do I have to categorize all responses in one go, or can I come back and continue work later?
- Can I add more data to my Text Categorization later on?
- Can I use the same themes and classification on other data?
- Can I move Text Categorization into Displayr from another tool?
- How do I present my classified data?
- Do you have Word Clouds?
- How do I combine categorical data with my text data using the same, or possibly more, categories (known as back coding)?
- Can I do qualitative analysis in Displayr?
What is Text Categorization?
In Displayr, we talk about Text Categorization as an umbrella term that covers all our functions that allow you to go from unstructured text data to structured data (categorized, or classified, into themes).
How do I enable AI for Text Categorization?
Before you can use any AI-powered features in Displayr, you need to accept the associated terms and conditions. Click your avatar icon (top right), select Account Settings, and on the General tab, scroll down to the Displayr AI section. Change the drop-down to Enabled. A dialog will pop up with a link to the terms and conditions, allowing you to accept or decline them.
How do I load data for Text Analysis?
In the Data Sources pane, click + Add Data (or just the + icon) to bring up the data importing dialog. Select where your data is stored (for example, a simple data file such as a .sav or .csv), and Displayr will load it into your document.
For more details, see How to Import Data into Displayr.
How do I use AI to discover themes in my text data?
Under Data Sources select the text variable you want to classify, then hover and click + > Text Categorization.
- If you want to classify responses into multiple themes, select Multiple themes and click Start.
- If you want to classify responses into a single theme, then select Only one theme and click Start.
The text categorization dialog will open. Click Create and Displayr's AI will start working out what themes you have in your data. You can set the number of themes Displayr will create - the default is 10.
How can I automatically classify responses into themes?
Once you have some themes, in the Text Categorization module, click Classify and the AI will work out which responses match which themes.
Can I edit AI-created themes?
Yes. Right-click any theme and select Rename to change its label. The AI will take the change into account when classifying new responses into that theme. See the gif below for an example of how to rename an AI-created theme.
Can I create my own themes?
To create your own theme, right-click in the Themes section of the Text Categorization module and select Add Theme. A prompt will open for you to type the text/label of the theme.
Can I edit how the AI has classified responses?
In the Show text from dropdown, select the theme you want to review. You can reclassify the responses manually by selecting any response(s) in the list of responses, clicking the relevant theme and then selecting:
- Add to if you want to add the responses to that theme.
- Move to if you want to move the responses from their current theme(s) to the selected theme(s).
- Remove from if you want to remove the responses from the selected theme.
Can the AI reclassify responses that have already been classified?
In the Show text from dropdown, select a theme where you want the AI to reclassify the responses.
- Move all the responses you want to reclassify to Uncategorized.
- Change the text of existing themes, or add new themes manually that you want AI to use.
- Click Classify to run the AI over the uncategorized responses.
Can I classify responses into themes manually?
In the Show text from dropdown, select the theme you want to review or Uncategorized for any responses the AI has been unable to classify. Select one or more responses in the list of responses, select one or more themes, and then select the action button below the themes panel.
- Add to if you want to add the responses to that theme.
- Move to if you want to move the responses from their current theme(s) to the selected theme(s).
- Remove from if you want to remove the responses from the selected theme.
- Categorize as is used for any responses that are Uncategorized when first assigning them to themes.
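The difference between these actions can be pictured with a small conceptual model. The Python sketch below is only an illustration under stated assumptions, not Displayr's implementation: it assumes each response maps to a set of themes, that an empty set means Uncategorized, and that Move to replaces all of a response's current themes. The function names are hypothetical.

```python
# Conceptual model only (not Displayr code): each response ID maps to the set
# of themes it belongs to. An empty set stands for "Uncategorized".

def add_to(assignments, response_ids, themes):
    """Add to: keep the existing themes and add the selected ones."""
    for rid in response_ids:
        assignments[rid] = assignments[rid] | set(themes)

def move_to(assignments, response_ids, themes):
    """Move to: drop the current themes and assign only the selected ones."""
    for rid in response_ids:
        assignments[rid] = set(themes)

def remove_from(assignments, response_ids, themes):
    """Remove from: take the responses out of the selected themes only."""
    for rid in response_ids:
        assignments[rid] = assignments[rid] - set(themes)

assignments = {
    1: {"Price"},
    2: {"Price", "Service"},
    3: set(),  # uncategorized
}

add_to(assignments, [1], ["Service"])     # 1 is now in Price and Service
move_to(assignments, [2], ["Quality"])    # 2 leaves Price and Service for Quality
remove_from(assignments, [1], ["Price"])  # 1 keeps only Service
move_to(assignments, [3], ["Price"])      # comparable to Categorize as for an uncategorized response

print(assignments)
```

In this model, Categorize as behaves like Move to applied to a response that has no themes yet.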
Can I delete themes?
Right-click any theme and select Delete. Any responses classified to that theme will be removed from it and placed back in the uncategorized responses.
You can also select Delete and Reclassify as. This will bring up a dialog where you can select which existing theme you would like to move the responses to.
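The two delete options can be sketched in the same conceptual way. This is an illustration only, with a hypothetical `delete_theme` helper; it assumes that a response which also belongs to other themes stays in those themes, and only becomes uncategorized when no themes remain.

```python
# Conceptual model only (not Displayr code): response ID -> set of themes.

def delete_theme(assignments, theme, reclassify_as=None):
    """Remove a theme; optionally move its responses to another existing theme."""
    for themes in assignments.values():
        if theme in themes:
            themes.discard(theme)
            if reclassify_as is not None:
                themes.add(reclassify_as)

# Delete: responses lose the theme and may end up uncategorized.
assignments = {1: {"Price"}, 2: {"Price", "Service"}, 3: {"Service"}}
delete_theme(assignments, "Price")
uncategorized = [rid for rid, themes in assignments.items() if not themes]

# Delete and Reclassify as: responses move to the chosen existing theme.
other = {1: {"Cost"}, 2: {"Cost", "Service"}}
delete_theme(other, "Cost", reclassify_as="Price")

print(uncategorized)
print(other)
```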
Do I have to categorize all responses in one go, or can I come back and continue work later?
You can definitely take a break! Hit Save when you're done for the moment. This will generate a new variable set in your Data Sources view that holds the classification for each case in your data file. To continue your classification, select this categorized variable (not the text variable!), and in the object inspector under Data > Properties, click Edit Categorization.
Can I add more data to my Text Categorization later on?
Yes. If you're currently working on a data set where you know you'll get more data later, then all you need to do is update the existing data set. The additional data in any variables you've already classified will then automatically be available in the Text Categorization module for you to work with. For more on updating your data set, see How to Update a Data Set in Displayr.
Can I use the same themes and classification on other data?
Yes. If you are reusing your already-created themes on other text variables in the same document, there are options to reuse via the menus. Alternatively, you can export your classification and import it into the categorization tool to reuse in the same or a separate document. More detail on the two methods:
- To reuse via the menus: Select the text variable to classify under Data Sources, and then go to Text Categorization. Select the Reuse Existing Categorization tab and choose how you want to reuse the classification, either:
- Click Reuse by sharing - any changes you make to the classification will flow through to all other variable sets that share this classification.
- Click Reuse by duplicating - this essentially copies the classification to the new text variable so that any changes you make to the classification will NOT apply to the other variable sets that use it.
- To reuse by exporting: In the Text Categorization Module you can use the Export button to create and save a .QCodes file. When classifying any other variable set, in any Displayr document, you can use the Import button to use the same themes and classification for that data.
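If it helps, the difference between sharing and duplicating is loosely analogous to referencing an object versus copying it. The Python sketch below is an analogy only; the `categorization` dictionary is a made-up stand-in and does not reflect how Displayr stores a classification.

```python
import copy

# Hypothetical stand-in for a saved classification.
categorization = {"themes": ["Price", "Service"]}

shared = categorization                      # like Reuse by sharing: the same object
duplicated = copy.deepcopy(categorization)   # like Reuse by duplicating: an independent copy

# A later change to the original classification...
categorization["themes"].append("Quality")

print(shared["themes"])      # ['Price', 'Service', 'Quality'] (change flows through)
print(duplicated["themes"])  # ['Price', 'Service'] (change does not apply)
```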
Can I move Text Categorization into Displayr from another tool?
Yes, that's possible. It's even OK if you've only classified some of the responses, as missing values are allowed in the data. You will first need to export the data from your other tool. At a minimum, Displayr needs:
- the text variables that were classified
- the variables that store how each response has been classified, formatted as either:
- A single column in the data set, if each response was classified into only a single theme.
- One column for each theme, with binary values (1 for "fits into theme" and 0 for "does not fit into theme"), if each response was classified into multiple themes.
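As an illustration of the multiple-theme layout, the Python sketch below reshapes a hypothetical export into one binary column per theme and renders it as CSV text. The input structure, column names, and the choice to leave cells blank for unclassified responses are all assumptions made for this example; adapt them to whatever your other tool actually exports.

```python
import csv
import io

# Hypothetical export from another tool: each response with its list of themes.
# None means the response has not been classified yet.
coded = [
    {"id": 1, "text": "Too expensive", "themes": ["Price"]},
    {"id": 2, "text": "Pricey but friendly staff", "themes": ["Price", "Service"]},
    {"id": 3, "text": "It was fine", "themes": None},
]

# All themes that appear anywhere in the export, in a stable order.
themes = sorted({t for row in coded for t in (row["themes"] or [])})

def to_binary_rows(rows, themes):
    """Build one column per theme: 1 = fits, 0 = does not fit, blank = unclassified."""
    out = []
    for row in rows:
        rec = {"id": row["id"], "text": row["text"]}
        for theme in themes:
            if row["themes"] is None:
                rec[theme] = ""  # assumed to be read as a missing value
            else:
                rec[theme] = 1 if theme in row["themes"] else 0
        out.append(rec)
    return out

binary_rows = to_binary_rows(coded, themes)

# Render as CSV text; write csv_text to a .csv file to import it.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "text"] + themes)
writer.writeheader()
writer.writerows(binary_rows)
csv_text = buffer.getvalue()
print(csv_text)
```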
Once you have the data formatted for Displayr:
- Import your data file into Displayr (see How to Import Data into Displayr).
- Under Data Sources select the text variable(s) and the corresponding variable(s) that hold the classification data.
- Generate a .QCodes file (a file in a special format used for text classification in Displayr) that can be imported into our Text Categorization module based on the variables selected. In the toolbar go to Anything > Advanced Analyses > Text Analysis > Create QCodes File.
- Enter a descriptive file name to save the existing classification as a .QCodes file to your Displayr Cloud Drive.
- Go to your avatar icon in the top right, and select Displayr Cloud Drive.
- Locate the file you just created, and click the name to download it to your computer.
- Select your text variable under Data Sources, hover, click the + icon, and then select Text Categorization.
- In the dialog, select the same type of classification as you already have in the file you're importing, i.e., Multiple themes or Only one theme.
- In the Text Categorization module, click the Import button at the bottom of the screen, and then load the .QCodes file you just downloaded above.
The classification will now be "live" and you can work with it in Displayr's text classification tools, including running the AI classification tool and other features. You can read more about using an existing categorization and creating a QCodes file at How to Use a Categorization in a Different Document.
How do I present my classified data?
Displayr is a fully-fledged data app and dashboard-building tool, so there's a wide range of ways to do that. The easiest way is just to drag your classified variable set (the one with the suffix - Categorized if not renamed) from the Data Sources pane onto the Page to create a table, and then convert that to a visualization.
See How to Create a Bar Chart for more on bar charts, which is just one of dozens of chart types available in Displayr.
Do you have Word Clouds?
To show a word cloud, drag a text variable from the Data Sources pane onto the Page, then from the object inspector, click the Visualization selector. Next, click the Text Analysis group, and then Word Cloud. The text data in the table on the page will automatically be converted to a Word Cloud. You can work with the text directly in the Word Cloud, e.g., drag off themes that you want to remove from the final version.
For more on this, see How to Create a Word Cloud.
How do I combine categorical data with my text data using the same, or possibly more, categories (known as back coding)?
An example of this would be classifying an "Other" response from a survey back into the main question. You can do this in our Text Categorization module:
- Select the relevant text variable(s) to classify under Data Sources, hover, click the + icon, and select Text Categorization.
- Select the appropriate classification type based on the Structure of the categorical variable set you want to use for the categories (i.e. Multiple themes for Binary-Multi sets and Only one theme for Nominal).
- Click Inputs and Back Coding in the module that pops up.
- In the Categorization Input dialog, use the Corresponding back coding variables drop-down to select the variable set you want to combine the classified responses with, then click OK.
For further details about back coding, see How to Back Code Variables in Displayr.
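To make the idea of back coding concrete, here is a minimal Python sketch of the logic for a single-response question. It is a conceptual illustration with made-up data and a hypothetical `back_code` function, not how Displayr performs back coding internally.

```python
# Made-up example: the closed-ended answer per respondent, plus the theme
# assigned to each "Other (please specify)" text response.
main_answer = {1: "Coke", 2: "Other", 3: "Pepsi", 4: "Other"}
other_text_theme = {2: "Pepsi", 4: "Fanta"}

def back_code(main_answer, other_text_theme):
    """Replace "Other" with the theme assigned to the respondent's text, if any."""
    merged = {}
    for rid, answer in main_answer.items():
        if answer == "Other" and rid in other_text_theme:
            merged[rid] = other_text_theme[rid]
        else:
            merged[rid] = answer
    return merged

back_coded = back_code(main_answer, other_text_theme)
print(back_coded)  # {1: 'Coke', 2: 'Pepsi', 3: 'Pepsi', 4: 'Fanta'}
```

Respondent 2 is folded back into an existing category (Pepsi), while respondent 4 introduces a new one (Fanta), which is why the combined variable can end up with more categories than the original question.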
Can I do qualitative analysis in Displayr?
Displayr is, first and foremost, a tool for quantitative analysis. However, you can use Displayr's general AI tools to refine and work with your text data before you start classifying it into themes.
Select a text variable under Data Sources, hover, and click the + > AI > Custom - Text.
This feature allows you to specify your own prompt and will give you a new text variable that you can use in further functions. For example, you can prompt the AI to:
1. Summarize responses (particularly useful if you're working with lengthy interview transcriptions).
2. Pull out a set of keywords or phrases.
3. Extract particularly salient quotes.
The output depends on how you write the AI prompt. Once done, the generated variable(s) can be used in all other analyses (including AI text classification) in Displayr.