Cluster analysis is a set of methods for grouping objects into categories based on their similarity. As a descriptive technique, it discovers structure in data without needing an explanation of why that structure exists.
Uses
identify groups of similar cases based on a set of attributes
classify cases into relatively homogeneous groups
Example
cluster multiple individuals of one species from different localities to determine population membership
How this is done
include all variables that are believed to be significant for characterizing the cases
standardize variables that are measured in different units
compute measures of similarity among the cases. Note that distance is the inverse of similarity: the smaller the distance between two cases, the more similar they are
based on these measures, group cases into clusters using one of several clustering algorithms
terminate the formation of clusters when all objects are included in one big cluster (agglomerative) or have been split into individual cases (divisive)
plot the clusters as a function of the coefficient of similarity: dendrogram, icicle plot
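The steps in this section can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes z-score standardization, Euclidean distance, and single linkage (distance between two clusters taken as the distance between their closest members), none of which is prescribed by these notes. The data values and the function names `standardize`, `euclidean`, and `agglomerate` are hypothetical. The recorded merge history is the information a dendrogram would display.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(cases):
    """Rescale each variable (column) to mean 0 and standard deviation 1."""
    columns = list(zip(*cases))
    means = [mean(col) for col in columns]
    # A constant column has standard deviation 0; fall back to 1 to avoid dividing by zero.
    sds = [pstdev(col) or 1.0 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in cases]


def euclidean(a, b):
    """Straight-line distance between two cases."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(cases):
    """Single-linkage agglomerative clustering (an assumed choice of algorithm).

    Returns the merge history as (left_members, right_members, distance) tuples,
    ending when all cases are in one big cluster.
    """
    z = standardize(cases)
    clusters = [[i] for i in range(len(z))]  # start with every case on its own
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest to each other.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclidean(z[a], z[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges


# Hypothetical data: body length (mm) and mass (g) for five individuals.
cases = [[50.0, 5.1], [52.0, 5.3], [51.0, 5.0], [80.0, 9.8], [82.0, 10.1]]
merges = agglomerate(cases)
for left, right, dist in merges:
    print(left, "+", right, f"merged at distance {dist:.2f}")
```

Reading the merge history from the last entry backwards gives the divisive view of the same tree: the final merge is the first split. Without the standardization step, the length variable would dominate the distances simply because it is measured on a larger scale.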
A variety of distance and similarity measures can be used. For the following pair of cases, consider the advantages and disadvantages of the different measures