Advanced Statistics - Biology 6030

Bowling Green State University, Fall 2017

Lab Exercise for R: Multiple Linear Regression Analysis

Exercise 1: Create a model that explains body mass through a combination of independent measures.

When several independent measures are recorded for each individual, we can ask how well a linear combination of them explains a dependent variable, here body mass.


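Download and read the contents of the file "BodyMeasures.txt". Fit a model that explains Mass as a function of all other variables, then report the indicators listed below: coefficients, confidence intervals, fitted values, residuals, ANOVA table, covariance matrix, and regression diagnostics.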

> bodyMeasures <- read.table("http://caspar.bgsu.edu/~courses/stats/Labs/Datasets/BodyMeasures.txt", header=TRUE)
> fit1 <- lm(Mass~Fore+Bicep+Chest+Neck+Shoulder+Waist+Height+Calf+Thigh+Head, data=bodyMeasures)

> coefficients(fit1)           # model coefficients
> confint(fit1, level=0.95)    # CIs for model parameters
> fitted(fit1)                 # predicted values
> residuals(fit1)              # residuals
> anova(fit1)                  # anova table
> vcov(fit1)                   # covariance matrix for model parameters
> influence(fit1)              # regression diagnostics
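
The overall fit indicators (R-squared, adjusted R-squared, the F statistic and the coefficient table) can also be pulled out of summary(). The following is a minimal sketch only: it assumes R's built-in mtcars data as a stand-in so that it runs without the download, and the names fit_demo, s, r2, adj_r2, f_stat, p_overall and coef_table are illustrative, not part of the lab.

```r
# Stand-in data: built-in mtcars, NOT the BodyMeasures file (assumption for illustration)
fit_demo <- lm(mpg ~ wt + hp + disp, data = mtcars)

s <- summary(fit_demo)

r2     <- s$r.squared        # proportion of variance explained
adj_r2 <- s$adj.r.squared    # R-squared penalized for the number of predictors
f_stat <- s$fstatistic       # named vector: value, numdf, dendf

# p-value of the overall F test (all slopes equal to zero)
p_overall <- pf(f_stat["value"], f_stat["numdf"], f_stat["dendf"],
                lower.tail = FALSE)

# coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- s$coefficients

print(round(coef_table, 4))
cat("R2:", r2, " adjusted R2:", adj_r2, " overall p:", p_overall, "\n")
```

For the lab itself, the same calls should apply to summary(fit1) once bodyMeasures has been read in.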

> # diagnostic plots
> layout(matrix(c(1,2,3,4),2,2)) # optional 4 graphs/page
> plot(fit1)

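Now obtain different models by manually including or excluding subsets of the independent variables, and compare their explanatory power using ANOVA. For this comparison the models must be nested, i.e. the terms of one model are a subset of the terms of the other.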

> # compare models
> # fit1 <- lm(Mass~Fore+Bicep+Chest+Neck+Shoulder+Waist+Height+Calf+Thigh+Head, data=bodyMeasures)
> fit2 <- lm(Mass~Fore+Bicep+Chest+Neck+Shoulder, data=bodyMeasures)
> anova(fit1, fit2)
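
To see what anova() reports when comparing two nested models, the partial F statistic can be recomputed by hand from the residual sums of squares. This is a sketch under assumptions: it uses the built-in mtcars data as a stand-in for BodyMeasures, and all object names (full, reduced, cmp, f_manual, ...) are hypothetical.

```r
# Stand-in data: built-in mtcars, NOT the BodyMeasures file (assumption for illustration)
full    <- lm(mpg ~ wt + hp + disp + drat, data = mtcars)
reduced <- lm(mpg ~ wt + hp, data = mtcars)   # nested within 'full'

cmp <- anova(reduced, full)
print(cmp)

# Partial F test by hand:
#   F = ((RSS_reduced - RSS_full) / (df_reduced - df_full)) / (RSS_full / df_full)
rss_r <- sum(residuals(reduced)^2)
rss_f <- sum(residuals(full)^2)
df_r  <- df.residual(reduced)
df_f  <- df.residual(full)

f_manual <- ((rss_r - rss_f) / (df_r - df_f)) / (rss_f / df_f)
p_manual <- pf(f_manual, df_r - df_f, df_f, lower.tail = FALSE)

cat("F by hand:", f_manual, " p by hand:", p_manual, "\n")
```

A small p-value suggests that the extra terms in the larger model explain a significant share of the remaining variance; a large one suggests the reduced model does about as well.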

The best-suited model can also be constructed via stepwise regression. The direction argument accepts "forward", "backward" or "both".

> library(MASS)
> # fit1 <- lm(Mass~Fore+Bicep+Chest+Neck+Shoulder+Waist+Height+Calf+Thigh+Head,data=bodyMeasures)
> step <- stepAIC(fit1, direction="both")
> step$anova # display results
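
The directions differ in their starting point: backward elimination starts from the full model, whereas forward selection starts from a small model and needs a scope that tells stepAIC which terms it may add. A minimal sketch, assuming the built-in mtcars data as a stand-in and with hypothetical object names (null_fit, full_fit, fwd, bwd):

```r
library(MASS)

# Stand-in data: built-in mtcars, NOT the BodyMeasures file (assumption for illustration)
null_fit <- lm(mpg ~ 1, data = mtcars)                            # intercept only
full_fit <- lm(mpg ~ wt + hp + disp + drat + qsec, data = mtcars) # all candidate terms

# Forward selection: start from the null model, allow terms up to the 'upper' formula
fwd <- stepAIC(null_fit,
               scope = list(lower = ~ 1, upper = ~ wt + hp + disp + drat + qsec),
               direction = "forward", trace = FALSE)

# Backward elimination: start from the full model and drop terms
bwd <- stepAIC(full_fit, direction = "backward", trace = FALSE)

print(formula(fwd))
print(formula(bwd))
print(fwd$anova)   # record of the steps taken
```

The two directions are not guaranteed to arrive at the same model, so it can be worth comparing their results.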

Alternatively, you can obtain regression information on all possible subsets of the independent variables.

> # All Subsets Regression
> library(leaps)
> subsets <- regsubsets(Mass~Fore+Bicep+Chest+Neck+Shoulder+Waist+Height+Calf+Thigh+Head, data=bodyMeasures, nbest=3)
> summary(subsets)
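
The summary object also carries the selection statistics for each subset, which makes it possible to pick a model programmatically. The sketch below is an illustration under assumptions: it uses the built-in mtcars data as a stand-in, keeps only the single best model per size (nbest=1), and the names all_fits, s and best are hypothetical.

```r
library(leaps)

# Stand-in data: built-in mtcars, NOT the BodyMeasures file (assumption for illustration)
all_fits <- regsubsets(mpg ~ wt + hp + disp + drat + qsec,
                       data = mtcars, nbest = 1)

s <- summary(all_fits)

# one entry per model size: adjusted R-squared, Mallows' Cp and BIC
print(data.frame(size = seq_along(s$adjr2),
                 adjr2 = s$adjr2, cp = s$cp, bic = s$bic))

# index of the model with the highest adjusted R-squared
best <- which.max(s$adjr2)

# which variables that model contains, and its coefficients
print(s$which[best, ])
print(coef(all_fits, best))
```

Different statistics (adjusted R-squared, Cp, BIC) may favor different subsets, so the choice of criterion is itself a modeling decision.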

Obtain a variety of useful plots.

> # plot a table of models showing variables in each model, ordered by the selection statistic.
> plot(subsets,scale="r2")
> # plot statistic by subset size
> library(car)
> subsets(subsets, statistic="rsq")


last modified: 2/2/15
This material is copyrighted and MAY NOT be used for commercial purposes, 2001-2017 lobsterman.