A Principal Components Analysis Biplot (or PCA biplot for short) is a two-dimensional chart that represents the relationships between the rows and columns of a table. This article describes how to take a table with rows and columns and generate a Principal Components Analysis Biplot from it.
- Any table that contains rows and columns, including contingency tables, grids or even raw data.
- The objects that are the focus of the analysis should be in the rows of the table. For example, if analyzing brand associations, the brands should be shown in the rows.
1. Select the table that you want to use as the input to the Principal Components Analysis Biplot. For this example, we'll use a binary brand/attribute grid.
2. From the toolbar menu, select Anything > Advanced Analysis > Dimension Reduction > Principal Components Analysis Biplot.
3. From the object inspector on the right, select the table from the Input table drop-down list. If needed, you can find the name of the input table by selecting the table and then going to Properties > General > Name in the object inspector.
4. Set the Normalization to Principal, Row principal, Column principal, Symmetrical or None. For example, set the normalization to Column principal if the analysis focuses on differences between the columns of the table.
5. Set the Output to Scatterplot, Moonplot or Text.
6. OPTIONAL: Update Rows/Columns to ignore.
7. OPTIONAL: Set Row/Column title.
8. OPTIONAL: Set the colors of the rows and columns.
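For readers who want a rough idea of what the normalization options in step 4 do, the following is a minimal Python sketch of biplot coordinates computed from a table with a singular value decomposition. It is not Displayr's implementation: the function name `biplot_coordinates`, the column centering, the sample counts, and the mapping of each option to a scaling follow common biplot conventions and are assumptions. The None option is left out because its scaling is not described here.

```python
import numpy as np

def biplot_coordinates(table, normalization="Row principal", n_dims=2):
    """Return (row_coords, col_coords) for a PCA-style biplot.

    Hypothetical helper: scalings follow common biplot conventions,
    not a documented Displayr algorithm.
    """
    X = np.asarray(table, dtype=float)
    X = X - X.mean(axis=0)  # assumption: center each column
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U, s, V = U[:, :n_dims], s[:n_dims], Vt[:n_dims].T

    if normalization == "Row principal":
        # Singular values applied to the rows only
        return U * s, V
    if normalization == "Column principal":
        # Singular values applied to the columns only
        return U, V * s
    if normalization == "Symmetrical":
        # Singular values split evenly between rows and columns
        return U * np.sqrt(s), V * np.sqrt(s)
    if normalization == "Principal":
        # Singular values applied to both rows and columns
        return U * s, V * s
    raise ValueError(f"Unsupported normalization: {normalization}")

# Made-up counts: 4 brands (rows) by 3 attributes (columns)
table = np.array([
    [52, 10, 31],
    [18, 44, 12],
    [9, 38, 40],
    [27, 15, 22],
])

rows, cols = biplot_coordinates(table, "Column principal")
print(rows.round(2))
print(cols.round(2))
```

Under the Row principal, Column principal and Symmetrical scalings in this sketch, the product of the row and column coordinates approximates the centered table, which is what allows rows and columns to be read against each other on one chart.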
The PCA biplot allows us to see the associations between the brands and the attributes (image categories). We can see that the image categories "Here today, gone tomorrow", "Fashionable", "Bureaucratic", and "Don’t know much about them" most strongly distinguish the brands. The chart also shows patterns of correlation between the image categories. For example, there is a correlation between "Unreliable" and "Here today, gone tomorrow".
The biplot also allows us to see associations between the brands. For example, Virgin mobile, Orange (Hutchison), AAPT, and New Tel have similar response patterns and are most strongly distinguished by "Don’t know much about them". With regard to this category, Telstra (Mobile Net) is located far on the other side of the origin, which indicates that it has a roughly opposite response pattern. In the table we can see that Telstra (Mobile Net) has the lowest score for this image category, whereas the other brands score much higher.
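The reading of correlations between image categories described above can be checked numerically. The sketch below assumes a column principal scaling and column-centered data (both assumptions, as are the made-up numbers), and compares the cosine of the angle between two attribute vectors with the ordinary correlation between the same two columns.

```python
import numpy as np

# Made-up scores: 3 brands (rows) by 3 image categories (columns)
X = np.array([
    [10.0, 2.0, 8.0],
    [4.0, 9.0, 5.0],
    [1.0, 7.0, 2.0],
])

Xc = X - X.mean(axis=0)  # center each column
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Column principal coordinates on the first two dimensions
cols = Vt[:2].T * s[:2]

def cosine(a, b):
    """Cosine of the angle between two vectors drawn from the origin."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

cos01 = cosine(cols[0], cols[1])
corr01 = float(np.corrcoef(X[:, 0], X[:, 1])[0, 1])
print(round(cos01, 4), round(corr01, 4))
```

With only three rows, two dimensions capture the centered table completely, so the two values agree. For larger tables the two-dimensional chart is an approximation, and the angle gives an indication of the correlation rather than its exact value.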
The principal coordinates are scaled by the square root of the singular values (i.e., by the standard deviations of the estimated components). The first component is given a scale of 1, and the second component’s scale is the ratio of the standard deviation of the second component to the standard deviation of the first. Some other statistics programs scale differently from Displayr. For example, R first scales by the standard deviation (rather than by the ratio of the standard deviations) and then applies a second scaling so that the row and column points can be charted in the same space. Although this changes the absolute magnitude of the numbers, it does not change the interpretation (i.e., the relativities are equivalent), and the same conclusions will be drawn from the data regardless of the scaling.
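As an illustration of why the choice of scaling does not change the interpretation, the sketch below compares two scalings of the same components: one where the first component has a scale of 1 and the second is scaled by the ratio of standard deviations, and one where each component is scaled by its own standard deviation. This is one reading of the description above, not Displayr's or R's actual code, and the data and variable names are made up.

```python
import numpy as np

# Made-up counts: 4 brands (rows) by 3 attributes (columns)
X = np.array([
    [52.0, 10.0, 31.0],
    [18.0, 44.0, 12.0],
    [9.0, 38.0, 40.0],
    [27.0, 15.0, 22.0],
])

Xc = X - X.mean(axis=0)  # center each column
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Standard deviations of the components (assumed n - 1 denominator)
sd = s / np.sqrt(Xc.shape[0] - 1)

# Scaling A: first component scaled by 1, second by sd2 / sd1
ratio_scale = np.array([1.0, sd[1] / sd[0]])
coords_ratio = U[:, :2] * ratio_scale

# Scaling B: each component scaled by its own standard deviation
coords_sd = U[:, :2] * sd[:2]

# The two sets of coordinates differ only by the constant factor sd[0],
# so relative positions on the chart are identical.
print(np.allclose(coords_sd, coords_ratio * sd[0]))
```

Because the two configurations differ only by a single multiplicative constant, distances and angles between points keep the same proportions under either scaling.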