*Hierarchical cluster analysis* is an algorithm that groups similar objects into *clusters*. The endpoint is a set of clusters, where each cluster is distinct from the others and the objects within each cluster are broadly similar to each other. This article describes how to conduct a hierarchical cluster analysis in Displayr.

Please note, Displayr's hierarchical cluster analysis tool treats the *variables* as the *cases*, so it does not produce segments in the traditional sense (e.g., it is used for creating segments of brands, rather than segments of people). If you want to group similar respondents together, consider an alternative method such as Latent Class Analysis or k-means cluster analysis.

## Requirements

- Hierarchical clustering can be performed with either *raw data* or a *distance matrix*. When raw data is used, the distance matrix is automatically computed in the background.
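To make the raw data versus distance matrix distinction concrete, here is a minimal Python sketch of how a distance matrix between variables could be derived from raw data. It is an illustration only, not Displayr's implementation: the device names and 0/1 values are made up, the `distance_matrix` helper is hypothetical, and it assumes Euclidean distance computed between columns, since the variables are treated as the cases.

```python
import math

# Hypothetical raw data: one row per respondent, one 0/1 column per device.
variables = ["Phone", "Tablet", "Laptop"]
raw = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 1],
    [1, 0, 1],
]

def distance_matrix(rows):
    """Return Euclidean distances between the columns (variables) of `rows`."""
    columns = list(zip(*rows))  # transpose so each variable becomes a "case"
    n = len(columns)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(columns[i], columns[j])))
            dist[i][j] = dist[j][i] = d  # the matrix is symmetric
    return dist

matrix = distance_matrix(raw)
for name, row in zip(variables, matrix):
    print(name, [round(d, 3) for d in row])
```

Supplying a distance matrix directly simply skips this step, which is useful when the distances come from another source.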

## Method

- From the **toolbar**, select **Anything > Advanced Analysis > Cluster > Hierarchical Cluster Analysis**.
- From the **object inspector**, select the variables from your data set that you want to use as inputs to the cluster analysis. For this example, we've used binary variables showing device ownership from a technology survey.
- Enter a value for the **Number of clusters** that you want to create.
- OPTIONAL: Select a distance measure from the **Distance** input. This is the formula used to compute the distance between points, prior to clustering. For more information, see the documentation for the R function dist, which is used for the distance matrix computation.
  - **Euclidean** (default)
  - **Maximum**
  - **Manhattan**
  - **Canberra**
  - **Binary**

- OPTIONAL: Select the algorithm to use to form the clusters from the **Clustering method** input. For more details, see the documentation for the R function hclust.
  - **Ward1 (ward.D)**
  - **Ward2 (ward.D2)** (default) - Commonly known as *Ward's method*
  - **Single**
  - **Complete**
  - **Average**
  - **McQuitty**
  - **Median**
  - **Centroid**

- OPTIONAL: Tick **Variable names**. This displays variable names in the output.
- OPTIONAL: Tick **Categorical as binary**. This represents unordered categorical variables as binary variables. Otherwise, they are represented as sequential integers (i.e., 1 for the first category, 2 for the second, etc.). *Numeric - Multi* variables are treated according to their numeric values and not converted to binary.
- OPTIONAL: Set the **Label margin**, which is the width of the right-hand margin to accommodate long labels.
- Click the **Calculate** button to generate the custom analysis output.
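For readers curious about what the **Distance** options in the steps above compute, the following Python sketch gives simplified versions of the five measures. These follow the usual textbook definitions and are assumptions, not Displayr's code: the Canberra version skips terms where both values are zero without the rescaling that the R documentation describes, and the binary version returns 0 when neither vector has any non-zero value. The `phone` and `tablet` vectors are made-up ownership data.

```python
import math

def euclidean(x, y):
    # Square root of the summed squared differences
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def maximum(x, y):
    # Largest absolute difference in any single position
    return max(abs(a - b) for a, b in zip(x, y))

def manhattan(x, y):
    # Sum of absolute differences
    return sum(abs(a - b) for a, b in zip(x, y))

def canberra(x, y):
    # Sum of |a - b| / (|a| + |b|), skipping positions where both values are 0
    return sum(abs(a - b) / (abs(a) + abs(b)) for a, b in zip(x, y) if a != 0 or b != 0)

def binary(x, y):
    # Share of positions where exactly one value is non-zero,
    # among positions where at least one value is non-zero
    either = sum(1 for a, b in zip(x, y) if a != 0 or b != 0)
    only_one = sum(1 for a, b in zip(x, y) if (a != 0) != (b != 0))
    return only_one / either if either else 0.0

MEASURES = {
    "euclidean": euclidean,
    "maximum": maximum,
    "manhattan": manhattan,
    "canberra": canberra,
    "binary": binary,
}

# Hypothetical ownership of two devices across four respondents
phone = (1, 1, 0, 1)
tablet = (0, 1, 0, 0)
results = {name: fn(phone, tablet) for name, fn in MEASURES.items()}
print(results)
```

With binary inputs such as device ownership, the **Binary** measure ignores respondents who own neither device, whereas **Euclidean** and **Manhattan** count those respondents as agreement.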

The output is a *dendrogram*, which shows the distance between the variables. Each cluster is displayed in a separate color.
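The idea behind the dendrogram can be sketched in a few lines of Python. The example below is a naive agglomerative procedure under stated assumptions, not the algorithm Displayr runs: it supports only single, complete, and average linkage (not the default Ward method), the `agglomerate` function is hypothetical, and the distance matrix is invented for illustration. Each recorded merge height corresponds to a branch point in a dendrogram, and stopping at a chosen number of clusters is equivalent to cutting the tree.

```python
def agglomerate(dist, labels, n_clusters, linkage="average"):
    """Merge the two closest clusters repeatedly until `n_clusters` remain.

    Returns (clusters, merges). `merges` lists each merge as
    (left_labels, right_labels, height), in the order performed.
    """
    combine = {
        "single": min,                            # closest pair of members
        "complete": max,                          # farthest pair of members
        "average": lambda ds: sum(ds) / len(ds),  # mean over all pairs
    }[linkage]
    clusters = [[i] for i in range(len(labels))]  # start with one item per cluster
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pair_dists = [dist[i][j] for i in clusters[a] for j in clusters[b]]
                height = combine(pair_dists)
                if best is None or height < best[0]:
                    best = (height, a, b)
        height, a, b = best
        merges.append((
            [labels[i] for i in clusters[a]],
            [labels[i] for i in clusters[b]],
            height,
        ))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [[labels[i] for i in c] for c in clusters], merges

# Hypothetical distances between four device-ownership variables
labels = ["Phone", "Laptop", "Tablet", "Smartwatch"]
dist = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]

clusters, merges = agglomerate(dist, labels, n_clusters=2)
print(clusters)
for left, right, height in merges:
    print(left, "+", right, "at height", height)
```

Variables that merge at a low height are the most similar, which is why they sit close together and share a color in the output.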

## Next

How to Analyze Data by Groups/Segments

How to Do Latent Class Analysis

How to Create a Segmentation Comparison Table

How to Do Mixed Mode Cluster Analysis in Displayr
