Multi-dimensional Scaling
Uses
- find a way to "re-arrange" objects in an
efficient manner based on similarities or dissimilarities
- detect meaningful underlying dimensions that allow
an efficient explanation of the observed distances
- find a low-dimensional space, in which differences
among objects match the true (i.e., higher-order) proximities
among them as closely as possible
- reduce the observed complexity of nature and explain
the distance matrix in terms of fewer underlying dimensions;
larger differences should be represented by larger distances
Examples
- Analyze observer bias by determining similarities
in measures obtained from different observers.
- Examine similarities in food preference among a group
of closely related species. What type of prey traits can explain
the detected patterns? What does that tell us about sensory perceptions
of predators?
- Examine patterns in the development or evolution of brains. In
what way are brains shaped?
- Assess the goodness of fit of proximity data to spatial distance
models.
How this is done
- MDS constructs a configuration of points in space
from information about differences between the
points
- Like cluster analysis, this is a technique that makes no predictions but helps us interpret how different items relate to each other. It creates a lower-dimensional representation of the data set in which the original, multidimensional distances are represented as closely as possible.
- Given enough dimensions we can represent the distances
among points exactly. The distance between two objects can
be mapped with a single line segment, but an increasing number
of objects demands a space with more and more dimensions (i.e.,
up to n-1 dimensions for n objects). This becomes difficult to
visualize and interpret. Thus, the emphasis is on finding a
spatial representation in a low-dimensional form.
- Similar to factor analysis, but it can use any general similarity
measure instead of a specific covariance matrix. For each
unit we need to know its nearest neighbor, and then the next,
and so on in rank order.
- Objects with known rank-differences between them are moved
around within a given space (composed of a particular number
of dimensions) until "lack of fit" has been minimized
(i.e., goodness-of-fit has been maximized).
- Examine how well the original distances between objects can
be reproduced by a newly derived configuration using a stress
measure, e.g., raw stress $\phi = \sum_{i<j} (d_{ij} - \delta_{ij})^2$,
where $d_{ij}$ is the distance between objects i and j in the
current solution and $\delta_{ij}$ is the observed distance. Stress
quantifies how well the actual distribution of distances (nearest
neighbors) is represented by the current solution. The smaller
this measure, the better the fit of the simplified MDS distance
matrix to the observed, complex distance matrix.
- A scree test (a plot of the stress value against the number
of dimensions in the solution) helps you decide how many dimensions
to use. Pick the point where the graph begins to level off.
- Interpret the dimensions through scatterplots of the objects
in three dimensions; rotate the space to aid visualization
- Use multiple regression techniques to regress variables
on the coordinates in different dimensions (MDS scores)
- Factor rotation: The actual orientation of axes in
the final solution is arbitrary. For example, one can rotate
a map, yet the distances between locations on it remain the same.
You can rotate the factor space for improved interpretation by
minimizing the number of variables with high loadings on each
axis.
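
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of any particular statistics package: it fits metric MDS on raw distances (not the rank-order, non-metric variant described above), it moves the points with the SMACOF/Guttman-transform update, and the five-object data set and all function names (distances, raw_stress, smacof_step, fit_mds, rotate_2d) are made up for the example. It computes stress for 1, 2 and 3 dimensions (the values a scree plot would show) and checks that rotating a 2-D solution leaves stress unchanged.

```python
import math
import random


def distances(config):
    """Euclidean distance matrix of a configuration (list of coordinate lists)."""
    n = len(config)
    return [[math.sqrt(sum((a - b) ** 2 for a, b in zip(config[i], config[j])))
             for j in range(n)] for i in range(n)]


def raw_stress(observed, config):
    """Raw stress: sum over pairs i < j of (d_ij - delta_ij)^2."""
    d = distances(config)
    n = len(config)
    return sum((d[i][j] - observed[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))


def smacof_step(observed, config):
    """One Guttman-transform update; it does not increase raw stress."""
    n, k = len(config), len(config[0])
    d = distances(config)
    # B has off-diagonal entries -delta_ij / d_ij (0 where d_ij == 0)
    # and a diagonal chosen so that every row sums to zero.
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and d[i][j] > 0.0:
                b[i][j] = -observed[i][j] / d[i][j]
        b[i][i] = -sum(b[i])
    return [[sum(b[i][j] * config[j][c] for j in range(n)) / n for c in range(k)]
            for i in range(n)]


def fit_mds(observed, n_dims, n_iter=300, seed=0):
    """Start from random positions and move the points for n_iter updates."""
    rng = random.Random(seed)
    config = [[rng.uniform(-1.0, 1.0) for _ in range(n_dims)]
              for _ in range(len(observed))]
    for _ in range(n_iter):
        config = smacof_step(observed, config)
    return config


def rotate_2d(config, angle):
    """Rotate a 2-D configuration about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c * x - s * y, s * x + c * y] for x, y in config]


# Hypothetical data: five objects whose "true" positions lie in a plane.
true_points = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.5]]
observed = distances(true_points)

# Stress for 1-, 2- and 3-dimensional solutions (scree-plot values).
stress_by_dim = {k: raw_stress(observed, fit_mds(observed, k)) for k in (1, 2, 3)}
for k, s in stress_by_dim.items():
    print(f"{k} dimension(s): stress = {s:.6f}")

# Rotating the solution changes the axes but not the inter-point distances.
config_2d = fit_mds(observed, 2)
rotated = rotate_2d(config_2d, math.pi / 3)
print("stress unchanged by rotation:",
      abs(raw_stress(observed, config_2d) - raw_stress(observed, rotated)) < 1e-9)
```

Because the example data are truly two-dimensional, one would typically expect stress to drop sharply from one to two dimensions and then level off, although a single random start can end in a local minimum. In practice several starting configurations are usually tried.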
last modified: 4/14/10