t-Distributed Stochastic Neighbor Embedding (t-SNE) is a technique for compressing high-dimensional data into a small number of dimensions. It attempts to preserve local structure by maintaining the distribution of each point's neighbors. In contrast, Principal Component Analysis preserves large-scale relationships.
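To make "the distribution of each point's neighbors" concrete, here is a minimal sketch in Python of how such a distribution can be computed for one point. The function name `neighbor_distribution`, the toy data, and the fixed bandwidth `sigma` are assumptions made for illustration. t-SNE normally tunes the bandwidth for each point from a perplexity setting, so this is a simplification and not the implementation used by the menu item described in this article.

```python
import math

def neighbor_distribution(points, i, sigma=1.0):
    """Return p(j | i): how likely point i is to pick point j as a neighbor.

    Uses a Gaussian kernel with a fixed bandwidth `sigma`. This is a
    simplification: t-SNE chooses a bandwidth per point from the perplexity.
    """
    weights = []
    for j, q in enumerate(points):
        if j == i:
            weights.append(0.0)  # a point is never its own neighbor
            continue
        sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], q))
        weights.append(math.exp(-sq_dist / (2 * sigma ** 2)))
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical toy data: two nearby points and one distant point, in 3 dimensions.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0)]
p = neighbor_distribution(points, 0)
```

In this toy example almost all of the probability for the first point falls on its close neighbor and almost none on the distant point. A t-SNE layout tries to place points in low dimensions so that these neighbor probabilities are reproduced as closely as possible.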
t-SNE is used primarily to compress data to two dimensions for visualization. This article describes how to create a two-dimensional scatterplot in which each object is represented as a point. You will need one of the following as input:
- Many variables from your data set (high-dimensional data)
- A distance matrix (see How to Create a Distance Matrix), which can also be pasted in as a table
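For readers unfamiliar with the distance matrix format, the sketch below builds one in Python from a small table of observations. The helper `distance_matrix`, the sample rows, and the choice of Euclidean distance are assumptions for illustration only. The distance matrix feature referenced above may offer other distance measures.

```python
import math

def distance_matrix(rows):
    """Return a square matrix of pairwise Euclidean distances between rows."""
    n = len(rows)
    return [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]

# Hypothetical data: three objects measured on two variables.
rows = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
d = distance_matrix(rows)
```

The result has one row and one column per object, zeros on the diagonal, and is symmetric, which is the general shape expected of a distance matrix pasted in as a table.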
1. From the menus, select Anything > Advanced Analysis > Dimension Reduction > t-SNE.
2. Follow the instructions in How to Create a Dimension Reduction Scatterplot to further configure the output. If using variables, start from Step 3 of that section. If using a distance matrix, start from Step 2 of that section.