When you want to analyze the keywords present in text data, you can create a term document matrix, which produces binary data mapping keywords to each respondent in your data set. This article describes how to convert raw text verbatims into a term document matrix, which represents the words in the text as a table (or matrix) of numbers.
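To make the idea concrete, here is a minimal base R sketch of a binary term document matrix built by hand. The verbatims and the object names (verbatims, keywords, binary.matrix) are hypothetical and only for illustration. The text analysis tools described below handle tokenization and keyword extraction far more carefully than this whitespace split does.

```r
# Hypothetical verbatims, one per respondent (not taken from any real data set)
verbatims <- c("great service", "slow service", "great price")

# Lower-case each verbatim and split it into words on whitespace
tokens <- strsplit(tolower(verbatims), "\\s+")

# The keyword set: every distinct word, sorted alphabetically
keywords <- sort(unique(unlist(tokens)))

# One row per respondent, one column per keyword; 1 means the keyword is present
binary.matrix <- t(sapply(tokens, function(words) as.integer(keywords %in% words)))
dimnames(binary.matrix) <- list(paste0("respondent.", seq_along(verbatims)), keywords)

print(binary.matrix)
```

Each row describes one respondent and each column one keyword, so the matrix can be used like any other set of binary variables.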
Requirements
A verbatim text variable that contains sentences or phrases. Text variables are represented by an A next to the variable in the Data Sources tree.
Method
- From the Report tree or toolbar (if working on a Page) select + > Advanced Analysis > Text Analysis > Advanced > Setup Text Analysis.
- From the object inspector, select the text variable in the Data > Text Analysis Options > Text variable drop-down or drag the text variable from your Data Sources tree into the Text variable field.
- Make any modifications to the options in your text analysis setup as described in How to Extract Keywords and Phrases from Text.
- From the Report tree select + > Advanced Analysis > Text Analysis > Advanced > Term Document Matrix.
- From the object inspector > Data > Setup item, select the text analysis output that you set up in the preceding steps.
- OPTIONAL: Update Minimum document count as needed based on your text analysis.
- Go to Calculation > Custom code.
- Click on the page to create the output.
- Paste the following code into the R Code editor:
    # Convert the sparse term document matrix into a regular (non-sparse) matrix
    library(tm)
    non.sparse.matrix <- as.matrix(term.document.matrix)
In the code above, replace term.document.matrix with the name of your term document matrix output. You can find this name by selecting the object and going to object inspector > General > General > Name.
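The conversion can also be tried outside of a project with a small standalone sketch. It assumes the tm package is installed and uses hypothetical verbatims. The sparse object here is built directly with tm, so its layout may differ from the output your document produces. The last lines illustrate, as an assumption about what a minimum document count threshold does, how to keep only terms that appear in at least min.count documents.

```r
library(tm)

# Hypothetical verbatims standing in for a text variable
verbatims <- c("great service", "slow service", "great price")

# Build a sparse, binary-weighted document-term matrix with tm
corpus <- VCorpus(VectorSource(verbatims))
sparse.matrix <- DocumentTermMatrix(corpus, control = list(weighting = weightBin))

# as.matrix() expands the sparse object into an ordinary matrix
non.sparse.matrix <- as.matrix(sparse.matrix)

# Illustration only: keep terms found in at least min.count documents
min.count <- 2
document.counts <- colSums(non.sparse.matrix > 0)
frequent.terms <- non.sparse.matrix[, document.counts >= min.count, drop = FALSE]

print(frequent.terms)
```

A non-sparse matrix stores every cell, including the zeros, so it can become large when there are many respondents and many keywords.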
Next
How to Extract Keywords and Phrases from Text
How to Hook Up a Term Document Matrix or Sparse Matrix to Custom R Code