This article describes how to go from a table of text to a term document matrix that represents the words in the text as a table (or matrix) of numbers.
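As a rough illustration of that transformation, the sketch below counts words by hand in base R. The example responses and the names `responses`, `tokens`, `terms`, and `term.counts` are hypothetical, and the layout (terms in rows, responses in columns) is an assumption; this is not how Displayr builds its output, only a picture of what "text as a matrix of numbers" means.

```r
# Hypothetical example responses (not taken from the article)
responses <- c("good service", "slow service", "good price good service")

# Split each response into lower-case words
tokens <- strsplit(tolower(responses), " ", fixed = TRUE)

# The vocabulary: every distinct word, sorted alphabetically
terms <- sort(unique(unlist(tokens)))

# Count how often each term occurs in each response
term.counts <- vapply(
    tokens,
    function(words) as.integer(table(factor(words, levels = terms))),
    integer(length(terms))
)
dimnames(term.counts) <- list(terms, paste0("response", seq_along(responses)))

# One row per term, one column per response, one number per cell
print(term.counts)
```

Most cells in a table like this are zero once the vocabulary grows, which is why term document matrices are usually stored in a sparse form.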
Requirements
A verbatim text variable that contains sentences or phrases. Text variables are represented by an A next to the variable in the Data Sources tree:
Please note these steps require a Displayr license.
Method
1. Create a table by dragging a text variable onto a Page.
2. From the toolbar, go to Anything > Advanced Analysis > Text Analysis > Advanced > Setup Text Analysis.
3. From the object inspector, select the text variable in the Data > Text Analysis Options > Text variable drop-down or drag the text variable from your Data Sources tree into the Text variable field.
4. Make any modifications to the options in your text analysis setup as described in How to Set Up Your Text for Analysis.
5. Go to Anything > Advanced Analysis > Text Analysis > Advanced > Term Document Matrix.
6. From the object inspector > Data > Setup item, select your text analysis output that was created in Steps 3-4.
7. OPTIONAL: Update Minimum document count as needed based on your text analysis.
8. Go to Calculation > Custom code.
9. Click on the page to create the output.
10. Paste the following code into the R Code editor:
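# Load the tm text mining package
library(tm)
# Convert the term document matrix to an ordinary (non-sparse) matrix
non.sparse.matrix <- as.matrix(term.document.matrix)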
In the code above, replace term.document.matrix with the name of the output that was created in step 6. You can find this by selecting the object and going to General > Name.
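To see what the conversion does outside Displayr, here is a minimal sketch that uses the tm package directly. The example text and the names `example.text` and `example.tdm` are made up for illustration. It assumes the Displayr output behaves like a tm term document matrix, which this article does not confirm.

```r
library(tm)

# Hypothetical example responses (not taken from the article)
example.text <- c("good service", "slow service", "good price good service")

# Build a sparse term document matrix with tm: terms in rows, documents in columns
example.tdm <- TermDocumentMatrix(VCorpus(VectorSource(example.text)))

# Convert it to an ordinary matrix, as in the code above
non.sparse.matrix <- as.matrix(example.tdm)

# Every term/document combination now has an explicit cell, including the zeros
print(dim(non.sparse.matrix))
print(non.sparse.matrix["good", ])

# An ordinary matrix works with standard functions, e.g. total count per term
term.totals <- rowSums(non.sparse.matrix)
print(term.totals)
```

A non-sparse matrix stores every zero explicitly, so it can use much more memory than the sparse version when there are many documents or many distinct terms.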
Next
How to Set Up Your Text for Analysis
How to Hook Up a Term Document Matrix or Sparse Matrix to Custom R Code