Labs for Advanced Stats

Lab Exercise for R: Linear Regression

Exercise 1: Characterize a linear relationship

If two quantitative variables are measured on the same observations, then we can ask whether the response y changes linearly with the predictor x, and estimate the intercept and slope of that line.

First, generate a data set

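n = 50                              # sample size
x = sample(40:70,n,replace=TRUE)    # predictor: integers from 40 to 70, drawn with replacement
y = .7*x+rnorm(n,sd=5)              # response: true slope 0.7, intercept 0, normal noise with sd 5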

Now plot the data and add a horizontal line at the mean of y

plot(x,y)
ybar = mean(y)
abline(a=ybar,b=0,col="black")

Fit the least-squares line, add it to the plot, and mark the fitted values

fit.lm = lm(y~x)
abline(fit.lm,col="red")
points(x,fitted(fit.lm),col='red',pch=20)
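
The numbers behind the red line can be checked by hand. The following is a minimal sketch, not part of the original lab: it regenerates data in the same way (the set.seed call and the names b0, b1, sst, sse and r2 are assumptions added here), computes the slope and intercept from the closed-form least-squares formulas, and computes R-squared as the share of the variation around the black mean line that the red line removes.

```r
# Regenerate data like the lab's; the seed is an added assumption for reproducibility
set.seed(1)
n = 50
x = sample(40:70, n, replace=TRUE)
y = .7*x + rnorm(n, sd=5)

fit.lm = lm(y~x)

# Closed-form least-squares estimates
b1 = sum((x-mean(x))*(y-mean(y))) / sum((x-mean(x))^2)   # slope
b0 = mean(y) - b1*mean(x)                                 # intercept

# R-squared from the two lines on the plot
sst = sum((y-mean(y))^2)     # squared vertical distances to the horizontal mean line
sse = sum(resid(fit.lm)^2)   # squared vertical distances to the fitted line
r2 = 1 - sse/sst

print(c(b0, b1))
print(coef(fit.lm))          # should agree with the hand calculation
print(r2)
```

The value of r2 should agree with summary(fit.lm)$r.squared.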

Predict y-hats and draw the 95% confidence band for the mean response

newx = seq(min(x),max(x))
prd = predict(fit.lm,data.frame(x=newx),interval="confidence",level=0.95,type="response")
lines(newx,prd[,2],col="red",lty=2)
lines(newx,prd[,3],col="red",lty=2)
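
The dashed lines describe uncertainty in the fitted mean of y at each x. An interval for the slope itself comes from confint. The sketch below is an illustration under added assumptions (the seed and the names ci, est, se, tcrit and slope.ci are not from the lab): it gets the interval from confint and then rebuilds the slope interval as estimate plus or minus a t quantile times the standard error.

```r
# Regenerate data like the lab's; the seed is an added assumption for reproducibility
set.seed(1)
n = 50
x = sample(40:70, n, replace=TRUE)
y = .7*x + rnorm(n, sd=5)
fit.lm = lm(y~x)

# Built-in 95% confidence intervals for the intercept and the slope
ci = confint(fit.lm, level=0.95)

# The slope interval by hand: estimate +/- t quantile * standard error
est = coef(summary(fit.lm))["x","Estimate"]
se = coef(summary(fit.lm))["x","Std. Error"]
tcrit = qt(0.975, df=df.residual(fit.lm))    # n - 2 residual degrees of freedom
slope.ci = c(est - tcrit*se, est + tcrit*se)

print(ci)
print(slope.ci)
```

The row of ci labelled x should match slope.ci. Because the data were simulated with a slope of 0.7, one can also check whether 0.7 falls inside the interval; over many repeated simulations it should do so about 95% of the time.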

Plot the diagnostics: residuals vs. fitted values, residuals vs. the independent variable, and a normal Q-Q plot of the residuals

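plot(fitted(fit.lm),resid(fit.lm),xlab="fitted values",ylab="residuals")
abline(h=0,lty=2)    # reference line: residuals should scatter evenly around 0
plot(x,resid(fit.lm),ylab="residuals")
abline(h=0,lty=2)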
qqnorm(resid(fit.lm))
qqline(resid(fit.lm),col="red")
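
When reading these plots it helps to know what least squares guarantees in advance. A minimal sketch, with the seed and the names res, res.sum, res.x and res.fit.cor added here as assumptions: for a model with an intercept, the residuals sum to zero and are uncorrelated with both x and the fitted values, up to rounding error.

```r
# Regenerate data like the lab's; the seed is an added assumption for reproducibility
set.seed(1)
n = 50
x = sample(40:70, n, replace=TRUE)
y = .7*x + rnorm(n, sd=5)
fit.lm = lm(y~x)
res = resid(fit.lm)

res.sum = sum(res)                        # zero up to rounding error
res.x = sum(x*res)                        # zero up to rounding error
res.fit.cor = cor(fitted(fit.lm), res)    # zero up to rounding error

print(c(res.sum, res.x, res.fit.cor))
```

Because these properties hold by construction, they say nothing about whether the model is adequate. A flat trend in the residual plots is therefore expected; what the plots can reveal is curvature, changing spread, or outliers, and the Q-Q plot shows departures from normality.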

last modified: 2/2/15