Lectures for Advanced Statistics

Nested and Multi-factorial Analysis of Variance

Uses

to ascertain the magnitude of variation at the various hierarchical levels of an experiment, when the individual data points within each major group are further grouped into randomly chosen subgroups - nested design
when multiple independent variables (factors) are considered simultaneously - two-way or multi-way designs; here we assume that each factor contributes a certain amount and that the factors add their effects without influencing each other

How this is done

Sum of Square formulas for different designs
use a version of our trusted trout dataset "Trout2.txt", which contains the standard length of trout that came from a pond of a particular Size {1, 2, 3 and 4 acres} and Depth {1, 2 and 3 m}
for nested designs consider size as 4 random replicates {1, 2, 3, and 4} within Depth {1, 2 and 3 m} or depth as 3 random replicates {1, 2, and 3} within Size {1, 2, 3 and 4 acres}
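For a balanced design, the sums of squares used in the designs that follow can be written out explicitly. The notation here is an assumption introduced for illustration and is not taken from the lecture: y_{ijk} is the standard length of fish k in the pond with depth i and size j, a = 3 is the number of depths, b = 4 the number of sizes, n the (equal) number of fish per pond, and a dot marks an index that has been averaged over.

```latex
\begin{align*}
SS_{\text{Total}} &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}
      \left(y_{ijk}-\bar{y}_{\cdot\cdot\cdot}\right)^{2},
      & df &= abn-1 \\
SS_{\text{Ponds}} &= n\sum_{i=1}^{a}\sum_{j=1}^{b}
      \left(\bar{y}_{ij\cdot}-\bar{y}_{\cdot\cdot\cdot}\right)^{2},
      & df &= ab-1 \\
SS_{\text{Error}} &= SS_{\text{Total}}-SS_{\text{Ponds}}
      = \sum_{i}\sum_{j}\sum_{k}\left(y_{ijk}-\bar{y}_{ij\cdot}\right)^{2},
      & df &= ab(n-1) \\
SS_{\text{Depth}} &= bn\sum_{i=1}^{a}
      \left(\bar{y}_{i\cdot\cdot}-\bar{y}_{\cdot\cdot\cdot}\right)^{2},
      & df &= a-1 \\
SS_{\text{Size}} &= an\sum_{j=1}^{b}
      \left(\bar{y}_{\cdot j\cdot}-\bar{y}_{\cdot\cdot\cdot}\right)^{2},
      & df &= b-1 \\[4pt]
\text{nested:}\quad
SS_{\text{Pond[Depth]}} &= SS_{\text{Ponds}}-SS_{\text{Depth}}
      = n\sum_{i}\sum_{j}\left(\bar{y}_{ij\cdot}-\bar{y}_{i\cdot\cdot}\right)^{2},
      & df &= a(b-1) \\
\text{two-way:}\quad
SS_{\text{Depth}\times\text{Size}} &= SS_{\text{Ponds}}-SS_{\text{Depth}}-SS_{\text{Size}}
      = n\sum_{i}\sum_{j}\left(\bar{y}_{ij\cdot}-\bar{y}_{i\cdot\cdot}
        -\bar{y}_{\cdot j\cdot}+\bar{y}_{\cdot\cdot\cdot}\right)^{2},
      & df &= (a-1)(b-1)
\end{align*}
```

With a = 3 and b = 4 the 11 df among ponds split into 2 + 9 for the nested design and into 2 + 3 + 6 for the two-way design.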

Nested Design:

perform an ANOVA on the nested effect alone (Pond# - 12 different ponds, named 11 to 34); the fact that these actually represent four replicates each of deep, medium and shallow ponds is ignored at that level of analysis
subdivide the SS between into a term that describes the variance among ponds within a pond type (i.e., Depth) and a term that describes the variance among pond types. Towards this goal,
perform an ANOVA on the treatment effect (Depth) alone; the fact that each depth is actually represented by four different ponds is ignored at that level of analysis
subtract the SS associated with the analysis of depth from the SS among all 12 ponds to get the SS associated with the variance among ponds nested within pond types
complete ANOVA table as usual
Note: keep sample sizes equal or you have a major mathematical hassle on your hands
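
The nested steps above can be sketched in R. This is a minimal illustration on simulated data, not on the trout file: the data frame `trout`, the five fish per pond and the effect sizes are all made up for the example, and `ss_between` is a hypothetical helper. Depth is tested over the mean square of ponds within depth, which is the usual choice when the ponds are randomly chosen subgroups.

```r
# Simulated stand-in for the trout data (made-up values, NOT Trout2.txt)
set.seed(1)
n_fish <- 5                                      # assumed number of fish per pond
ponds <- expand.grid(Size = 1:4, Depth = 1:3)    # 12 ponds
ponds$Pond <- paste0(ponds$Depth, ponds$Size)    # pond names "11" ... "34"
ponds$pond_effect <- rnorm(nrow(ponds), sd = 2)  # random pond-to-pond variation
trout <- ponds[rep(seq_len(nrow(ponds)), each = n_fish), ]
trout$Depth <- factor(trout$Depth)
trout$Pond  <- factor(trout$Pond)
trout$stLength <- 20 + 1.5 * as.integer(trout$Depth) +
  trout$pond_effect + rnorm(nrow(trout))

# Hypothetical helper: SS and df of the single term in a one-way ANOVA
ss_between <- function(formula, data) {
  tab <- anova(lm(formula, data = data))
  c(SS = tab[1, "Sum Sq"], df = tab[1, "Df"])
}

among_ponds <- ss_between(stLength ~ Pond,  trout)  # all 12 ponds, depth ignored
among_depth <- ss_between(stLength ~ Depth, trout)  # 3 depths, ponds ignored
nested      <- among_ponds - among_depth            # ponds within depth

# The error term comes from the one-way ANOVA on Pond
err <- anova(lm(stLength ~ Pond, data = trout))["Residuals", ]
ms_depth  <- among_depth[["SS"]] / among_depth[["df"]]
ms_nested <- nested[["SS"]] / nested[["df"]]
ms_error  <- err[["Sum Sq"]] / err[["Df"]]

# Depth is tested over ponds within depth; ponds within depth over error
F_depth <- ms_depth / ms_nested
F_ponds <- ms_nested / ms_error
p_depth <- pf(F_depth, among_depth[["df"]], nested[["df"]], lower.tail = FALSE)
p_ponds <- pf(F_ponds, nested[["df"]], err[["Df"]], lower.tail = FALSE)

# Cross-check: fitting Depth first, the Pond row holds what ponds add beyond depth
direct <- anova(lm(stLength ~ Depth + Pond, data = trout))
print(rbind(by_subtraction = nested,
            direct = c(SS = direct["Pond", "Sum Sq"], df = direct["Pond", "Df"])))
```

The two rows printed at the end should agree, since the subtraction and the sequential fit partition the same among-pond SS.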

Multi-way design:

perform an ANOVA on the pond effect alone (Pond# - the 12 different combinations of Depth and Size); the fact that each depth actually occurs at four sizes, and each size at three depths, is hidden at that level of analysis
subdivide the SS between into terms associated with pond type (i.e., Depth, Size, and their interaction). Towards this goal,
perform an ANOVA on the treatment effect (Depth) alone; the fact that each depth actually comprises four different sizes is ignored at that level of analysis
perform an ANOVA on the treatment effect (Size) alone; the fact that each size actually comprises three different depths is ignored at that level of analysis
subtract the SS explained by depth and the SS explained by size from the SS among all 12 ponds to get the SS associated with the interaction term
complete ANOVA table as usual
Note: keep sample sizes equal or you have a major mathematical hassle on your hands
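
The multi-way steps above can also be carried out directly from group means. A minimal sketch in R, assuming a balanced design and simulated data: the data frame `trout`, the five fish per pond and the effect sizes are invented for illustration, and `ss_among` is a hypothetical helper.

```r
# Simulated balanced 3 x 4 design (made-up values, NOT the trout file)
set.seed(2)
n_fish <- 5                                       # assumed fish per pond
trout <- expand.grid(fish = seq_len(n_fish),
                     Depth = factor(1:3),
                     Size  = factor(1:4))
trout$Pond <- interaction(trout$Depth, trout$Size, drop = TRUE)  # 12 ponds
trout$stLength <- 20 + 1.5 * as.integer(trout$Depth) +
  0.8 * as.integer(trout$Size) + rnorm(nrow(trout), sd = 1.5)

grand <- mean(trout$stLength)

# Hypothetical helper: between-group SS for a grouping factor g,
# i.e. the sum over all fish of (group mean - grand mean)^2
ss_among <- function(g) sum((ave(trout$stLength, g) - grand)^2)

ss_pond  <- ss_among(trout$Pond)    # among all 12 ponds
ss_depth <- ss_among(trout$Depth)   # depth alone, size ignored
ss_size  <- ss_among(trout$Size)    # size alone, depth ignored

# Interaction by subtraction, with the matching df
ss_inter <- ss_pond - ss_depth - ss_size
df_inter <- (12 - 1) - (3 - 1) - (4 - 1)

# Compare with the full factorial fit
fit <- anova(lm(stLength ~ Depth * Size, data = trout))
print(c(by_subtraction = ss_inter, factorial_fit = fit["Depth:Size", "Sum Sq"]))
```

With unequal sample sizes the subtraction generally no longer matches the fitted interaction term, which is the hassle the note above refers to.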

Challenges:

Consider two analyses, one with model terms {Size, Depth[Size]}, the other with model term {Pond#}. Phrase your prediction as to the size of the two SS_Model and the two SS_Error terms. Test that prediction and understand why you get that result.
Consider two analyses, one with full factorial model terms {Size, Depth, Size*Depth}, the other with model terms {Size, Depth[Size]}. Phrase your prediction as to the size of the two SS_Model and the two SS_Error terms. Test that prediction and understand why you get that result.
Consider two analyses, one with full factorial model terms {Size, Depth, Size*Depth}, the other with partial factorial model terms {Size, Depth}. Phrase your prediction as to the size of the two SS_Model and the two SS_Error terms. Test that prediction and understand why you get that result.
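
One way to set up these comparisons in R is sketched below. It is only a scaffold for testing your predictions: the data are simulated (invented values, five fish per pond assumed), `ss_split` is a hypothetical helper, and writing Depth[Size] as `Size/Depth` is this sketch's choice of R formula notation for "Depth nested within Size".

```r
# Simulated balanced data (made-up values, NOT the trout file)
set.seed(3)
trout <- expand.grid(fish = 1:5, Depth = factor(1:3), Size = factor(1:4))
trout$Pond <- interaction(trout$Depth, trout$Size, drop = TRUE)   # 12 ponds
pond_means <- 20 + rnorm(12, sd = 2)                              # one mean per pond
trout$stLength <- pond_means[as.integer(trout$Pond)] + rnorm(nrow(trout))

# Hypothetical helper: pool all model rows of the ANOVA table into SS_Model
# and report the residual row as SS_Error
ss_split <- function(formula, data) {
  tab <- anova(lm(formula, data = data))
  is_error <- rownames(tab) == "Residuals"
  c(SS_Model = sum(tab[!is_error, "Sum Sq"]),
    df_Model = sum(tab[!is_error, "Df"]),
    SS_Error = tab[is_error, "Sum Sq"],
    df_Error = tab[is_error, "Df"])
}

models <- list(
  pond      = stLength ~ Pond,           # {Pond#}
  nested    = stLength ~ Size / Depth,   # {Size, Depth[Size]}
  factorial = stLength ~ Size * Depth,   # {Size, Depth, Size*Depth}
  additive  = stLength ~ Size + Depth    # {Size, Depth}
)

# One row per model, ready to compare against your predictions
comparison <- t(sapply(models, ss_split, data = trout))
print(round(comparison, 3))
```

Write your predictions down before running the last line, then compare them with the printed table row by row.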

To perform a two-way complete model ANOVA in R, first import the data file "Trout3.txt", then make sure that pond size and pond depth are treated as categorical variables

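> # read the whitespace-delimited file; the first row holds the column names
> Trout3.txt <- read.table("/Trout3.txt", header=TRUE, sep="", na.strings="NA", dec=".", strip.white=TRUE)
> # recode the numeric pond descriptors as factors so that lm() treats them as categories
> Trout3.txt$PondSize_cat <- as.factor(Trout3.txt$PondSize)
> Trout3.txt$PondDepth_cat <- as.factor(Trout3.txt$PondDepth)
> Trout3.txt$PDepth_PSize_cat <- as.factor(Trout3.txt$PDepth_PSize)
> # full factorial model: both main effects plus their interaction
> AnovaModel.1 <- (lm(stLength ~ PondDepth_cat*PondSize_cat, data=Trout3.txt))
> anova(AnovaModel.1)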

With equal sample sizes, this is equivalent to piecing the terms together from a series of one-way ANOVAs. First create the model for the consistent size effects using a one-way ANOVA on PondSize, then report the results

> AnovaModel.2 <- (lm(stLength ~ PondSize_cat, data=Trout3.txt))
> anova(AnovaModel.2)

then you create the model for the consistent depth effects using a one-way ANOVA on PondDepth, then report the results

> AnovaModel.3 <- (lm(stLength ~ PondDepth_cat, data=Trout3.txt))
> anova(AnovaModel.3)

then you create the model for the one-way ANOVA on the specific combination of PondDepth and PondSize, then report the results

> AnovaModel.4 <- (lm(stLength ~ PDepth_PSize_cat, data=Trout3.txt))
> anova(AnovaModel.4)

the SS and df for AnovaModel.4, minus those for AnovaModel.2 and AnovaModel.3, are the SS and df associated with the interaction between depth and size
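
The piecing-together can be carried through to a complete table. The sketch below is a hedged illustration, not output from the real file: it builds a simulated data frame `Trout3.sim` (a hypothetical name) that reuses the column names from the commands above, with six fish per pond and made-up effect sizes, and it tests every term over the residual mean square, as `anova()` does for the full model.

```r
# Simulated stand-in with the same column names (made-up values, NOT Trout3.txt)
set.seed(4)
Trout3.sim <- expand.grid(fish = 1:6, PondDepth = 1:3, PondSize = 1:4)
Trout3.sim$PDepth_PSize <- 10 * Trout3.sim$PondDepth + Trout3.sim$PondSize  # 11 ... 34
Trout3.sim$stLength <- 18 + 2 * Trout3.sim$PondDepth + Trout3.sim$PondSize +
  rnorm(nrow(Trout3.sim), sd = 2)
Trout3.sim$PondSize_cat     <- as.factor(Trout3.sim$PondSize)
Trout3.sim$PondDepth_cat    <- as.factor(Trout3.sim$PondDepth)
Trout3.sim$PDepth_PSize_cat <- as.factor(Trout3.sim$PDepth_PSize)

# The three one-way ANOVAs
a2 <- anova(lm(stLength ~ PondSize_cat,     data = Trout3.sim))  # size
a3 <- anova(lm(stLength ~ PondDepth_cat,    data = Trout3.sim))  # depth
a4 <- anova(lm(stLength ~ PDepth_PSize_cat, data = Trout3.sim))  # 12 combinations

# Assemble the two-way table: interaction = combinations - size - depth,
# residual row taken from the one-way ANOVA on the 12 combinations
pieced <- data.frame(
  row.names = c("PondDepth_cat", "PondSize_cat", "Interaction", "Residuals"),
  Df    = c(a3[1, "Df"], a2[1, "Df"],
            a4[1, "Df"] - a2[1, "Df"] - a3[1, "Df"], a4[2, "Df"]),
  SumSq = c(a3[1, "Sum Sq"], a2[1, "Sum Sq"],
            a4[1, "Sum Sq"] - a2[1, "Sum Sq"] - a3[1, "Sum Sq"], a4[2, "Sum Sq"])
)
pieced$MeanSq <- pieced$SumSq / pieced$Df
pieced$Fvalue <- c(pieced$MeanSq[1:3] / pieced$MeanSq[4], NA)
pieced$p      <- pf(pieced$Fvalue, pieced$Df, pieced$Df[4], lower.tail = FALSE)

# Full factorial fit for comparison
direct <- anova(lm(stLength ~ PondDepth_cat * PondSize_cat, data = Trout3.sim))
print(pieced)
print(direct)
```

Because the simulated design is balanced, the two printed tables should agree term by term.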

last modified: 2/18/13