Introduction
This article describes how to produce a two-dimensional scatterplot that visualizes either high-dimensional numeric data or a distance matrix.
Requirements
- High-dimensional variables in your data set, or
- A distance matrix
Method 1: Using Variables
1. From the toolbar menu, select Anything > Advanced Analysis > Dimension Reduction > Dimension Reduction Scatterplot.
2. Select one of the available dimension reduction techniques from the Algorithm input:
- Principal Component Analysis
- t-SNE
- MDS - Metric
- MDS - Non-metric
3. Select your input variables from the Variables drop-down list.
4. OPTIONAL: Tick the Normalize variables checkbox to normalize the data:
- For t-SNE and MDS, each variable is standardized to the range [0, 1].
- For PCA, the correlation matrix is used rather than the covariance matrix.
5. OPTIONAL: When Create binary variable from categories is checked, each unordered categorical variable with N categories is converted into N-1 binary indicator variables. Otherwise, each such variable is converted to a single numeric variable with integers representing the categories (as happens for ordered categorical variables).
6. OPTIONAL: Enter a value for Perplexity, a parameter used by the t-SNE algorithm that is related to the number of nearest neighbors considered when placing each data point. The typical useful range is 5 to 50, and the default value is 10.
- Low values imply that immediate local structure is most important.
- High values increase the impact of more distant neighbors and global structure.
7. Select a Group variable to categorize the output. If numeric, the data are shaded from light (lowest values) to dark (highest). If categorical, data points are colored according to their category.
8. Click the Calculate button to generate the scatterplot.
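The optional settings in steps 4 to 6 can be illustrated with a small standard-library Python sketch. It is not Displayr's implementation. The function names are hypothetical, and the choice of reference category and the handling of constant variables are assumptions made only for illustration. The perplexity function uses the standard definition, 2 raised to the Shannon entropy in bits, which shows why perplexity is often read as an effective number of neighbors.

```python
import math


def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant variable is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def indicator_columns(labels):
    """Convert N unordered categories into N-1 binary (0/1) indicator variables.

    Assumption: the first category in sorted order is the reference level
    and gets no indicator of its own.
    """
    categories = sorted(set(labels))
    return {cat: [1 if lab == cat else 0 for lab in labels] for cat in categories[1:]}


def integer_codes(labels):
    """Convert categories into a single numeric variable of integers 1..N."""
    categories = sorted(set(labels))
    index = {cat: i + 1 for i, cat in enumerate(categories)}
    return [index[lab] for lab in labels]


def perplexity(probabilities):
    """Perplexity of a discrete distribution: 2 ** (Shannon entropy in bits)."""
    entropy = -sum(p * math.log2(p) for p in probabilities if p > 0)
    return 2 ** entropy


colors = ["red", "green", "blue", "green"]
scaled = min_max_scale([2, 4, 6])
indicators = indicator_columns(colors)   # three categories give two indicators
codes = integer_codes(colors)
uniform_over_four = perplexity([0.25, 0.25, 0.25, 0.25])
```

A distribution spread evenly over k neighbors has perplexity k, so a low setting concentrates each point's attention on a few close neighbors, while a high setting spreads it across more distant ones.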
Method 2: Using a Distance Matrix
1. From the toolbar menu, select Anything > Advanced Analysis > Dimension Reduction > Dimension Reduction Scatterplot.
2. Select a distance matrix input:
- Choose a distance matrix that you have already created in your document from the Distance matrix drop-down box, or
- Select the paste or type distance matrix option to enter the distance matrix manually.
3. OPTIONAL: Enter a value for Perplexity, a parameter used by the t-SNE algorithm that is related to the number of nearest neighbors considered when placing each data point. The typical useful range is 5 to 50, and the default value is 10.
4. Click the Calculate button to generate the scatterplot.
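If you plan to paste or type a distance matrix, it can help to sanity-check it first. The following standard-library Python sketch builds a Euclidean distance matrix from a few points and checks the general properties of a distance matrix (square, zero diagonal, symmetric, non-negative). The function names are hypothetical, and the checks are an assumption about what a well-formed matrix looks like, not a statement of what Displayr validates.

```python
import math


def euclidean_distance_matrix(points):
    """Return the matrix of pairwise Euclidean distances between points."""
    return [[math.dist(p, q) for q in points] for p in points]


def is_valid_distance_matrix(matrix, tol=1e-9):
    """Check that a matrix is square, symmetric, non-negative, with a zero diagonal."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return False
    for i in range(n):
        if abs(matrix[i][i]) > tol:
            return False  # each item must be at distance 0 from itself
        for j in range(i + 1, n):
            if matrix[i][j] < 0 or abs(matrix[i][j] - matrix[j][i]) > tol:
                return False  # distances must be non-negative and symmetric
    return True


points = [(0, 0), (3, 4), (6, 8)]
distances = euclidean_distance_matrix(points)
ok = is_valid_distance_matrix(distances)
```

A matrix that fails these checks, for example one whose upper and lower triangles disagree, usually indicates a data-entry error worth fixing before running the analysis.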
See Also
How to Do Principal Component Analysis in Displayr
How to Create a Principal Component Analysis Biplot
How to Create a Component Plot from a Principal Component Analysis
How to Create a Goodness of Fit Plot from a Dimension Reduction Output
How to Create a Scree Plot from a Principal Component Analysis
How to Save Components/Dimensions from a Dimension Reduction Output
How to do Multidimensional Scaling from Variables
How to do Multidimensional Scaling from a Distance Matrix
How to Create a Distance Matrix