Lab Exercise for R: Multivariate Descriptive Statistics

Exercise 1: Calculate the distance of each row from the multivariate mean (centroid), taking the correlational structure of the variables into account

In a dataset with multiple continuous variables, we want to calculate how far each individual lies from the centroid in multivariate space. Instead of measuring distance along the original axes, we derive new axes as linear combinations of the original variables, based on their correlational structure.

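Use the file "BodyMeasures.txt" and optimize the coordinate space based on the correlations between the original variables. Read the data, keep the numeric columns (2 through 12), and compute the covariance and correlation matrices.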

> bodyMeasures <- read.table("http://caspar.bgsu.edu/~courses/stats/Labs/Datasets/BodyMeasures.txt", header=TRUE)
> bodyMeasures_numeric <- bodyMeasures[, 2:12]
> covMat <- cov(bodyMeasures_numeric)
> covMat
> corMat <- cor(bodyMeasures_numeric)
> corMat
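
The two matrices are closely related: the correlation matrix is the covariance matrix rescaled by the standard deviations of the variables. The following is a minimal sketch of that relationship. It assumes R's built-in iris data as a stand-in for BodyMeasures.txt, and the name corFromCov is made up for illustration.

```r
# Stand-in data: the four numeric columns of R's built-in iris data set
# (an assumption for illustration; this is not the BodyMeasures file).
x <- iris[, 1:4]

covMat <- cov(x)
corMat <- cor(x)

# Standard deviations are the square roots of the diagonal of covMat.
s <- sqrt(diag(covMat))

# Divide each covariance by the product of the two standard deviations.
corFromCov <- covMat / outer(s, s)

# Both comparisons should report TRUE.
all.equal(corFromCov, corMat)
all.equal(cov2cor(covMat), corMat)
```

The base function cov2cor() performs the same rescaling in one step.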

Now calculate the centroid as the vector of column means of the data frame. Then calculate the Mahalanobis distance of each row from the centroid using the variance-covariance matrix. Note that mahalanobis() returns the squared distance, hence the name D2.

> means <- colMeans(bodyMeasures_numeric)
> means
> D2 <- mahalanobis(bodyMeasures_numeric,means,covMat)
> D2
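
To see what mahalanobis() computes, the squared distance for each row can be written out by hand as (x - mean)' S^-1 (x - mean), where S is the covariance matrix. The sketch below again assumes the built-in iris data as a stand-in for BodyMeasures.txt. The names centered and D2manual are hypothetical.

```r
# Stand-in data (an assumption; not the BodyMeasures file).
x <- as.matrix(iris[, 1:4])

means  <- colMeans(x)   # centroid
covMat <- cov(x)        # variance-covariance matrix

# Squared Mahalanobis distance of every row from the centroid.
D2 <- mahalanobis(x, means, covMat)

# The same quantity by hand: center each column, then compute
# (x - mean)' S^-1 (x - mean) row by row.
centered <- sweep(x, 2, means)
D2manual <- rowSums((centered %*% solve(covMat)) * centered)

# Should report TRUE.
all.equal(unname(D2), unname(D2manual))

# Sanity check: when the sample mean and sample covariance are used,
# the squared distances sum to (n - 1) * p. Should report TRUE.
all.equal(sum(D2), (nrow(x) - 1) * ncol(x))

# Take the square root for distances on the original scale.
D <- sqrt(D2)
```

Rows with large values of D2 lie far from the centroid once the correlations among the variables are taken into account.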



last modified: 3/23/15