t-Distributed Stochastic Neighbor Embedding (t-SNE) is a technique for compressing high-dimensional data into a small number of dimensions. The method attempts to preserve local structure by maintaining the distribution of the neighbors of each point. This contrasts with Principal Component Analysis, which preserves large-scale relationships.
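As a rough illustration of what "the distribution of the neighbors of each point" means, the sketch below computes Gaussian neighbor probabilities for a single point. It is a simplification under stated assumptions: the bandwidth `sigma` is fixed here, whereas t-SNE tunes a bandwidth for each point from a perplexity setting. The function name `neighbor_probabilities` and the toy `points` are hypothetical and are not part of Displayr.

```python
import math

def neighbor_probabilities(points, i, sigma=1.0):
    """Return p(j|i): how strongly point i treats each other point j as a neighbor."""
    weights = []
    for j, q in enumerate(points):
        if j == i:
            weights.append(0.0)  # a point is not its own neighbor
            continue
        sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], q))
        # Gaussian weight: close points get weights near 1, distant points near 0.
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    return [w / total for w in weights]  # normalize so the weights sum to 1

# Hypothetical toy data: two tight pairs of points in three dimensions.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0), (5.1, 5.0, 5.0)]
probs = neighbor_probabilities(points, 0)
print(probs)
```

Almost all of the probability for the first point falls on its close partner. t-SNE looks for a low-dimensional layout in which each point's neighbor distribution stays as similar as possible to this high-dimensional one.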
t-SNE is used primarily to compress data to two dimensions for visualization. This article describes how to create a two-dimensional scatterplot in which each object is represented as a point. To follow the steps, you need:
- A Displayr document containing the variables that you want to use as inputs to the t-SNE analysis.
1. From the menus, select Anything > Advanced Analysis > Dimension Reduction > t-SNE.
2. From the object inspector, under Inputs > Variables, select the input variables from your data set.
3. Click the Calculate button to generate the t-SNE output.
4. OPTIONAL: To color the points in the scatterplot by group, select a variable in the Group variable field.
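For readers who want to see the same workflow outside the menus, here is a minimal sketch in Python. It assumes the scikit-learn library and its bundled Iris data set as stand-ins for your own input variables and group variable. It is not the code Displayr runs, and the results will not match Displayr's output exactly.

```python
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

iris = load_iris()
X = iris.data         # input variables: 150 rows, 4 numeric columns
groups = iris.target  # stands in for the optional Group variable

# Compress the four input variables to two dimensions.
# random_state fixes the seed so repeated runs give the same layout.
embedding = TSNE(n_components=2, perplexity=30.0, random_state=0).fit_transform(X)

# Each row of `embedding` is the (x, y) position of one object in the scatterplot.
for label in sorted(set(groups)):
    xy = embedding[groups == label]
    print(iris.target_names[label], xy.shape)
```

Plotting the two columns of `embedding` with `groups` as the color gives the kind of grouped scatterplot described in step 4.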