Advanced Statistics - Biology 6030
Bowling Green State University, Fall 2019
Curve Fitting with Parametric Regression Analysis
- Is a straight line the model best suited to describe our data, or is
there something to be gained by including higher-order polynomial terms in
our equation? Each additional polynomial term of the predictor variable
adds one more curve to the model. Y modeled on X produces a straight-line
model, Y modeled on X + X^2 tests for a model with one curve, and Y modeled
on X + X^2 + X^3 gives a model with two curves.
- The fit of the model (i.e., the SSM) can never decrease, and will almost
always increase, with each additional polynomial term. What is not clear is
whether the new equation is significantly better. Remember that each term
added to our equation costs one df (the model df goes up by one and the
residual df goes down by one), so the MSM, the model term in the numerator
of the F-statistic, may or may not increase.
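The point about SSM and MSM can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the data are simulated (a truly linear relationship plus random noise, with an arbitrary seed), and the helper names `least_squares` and `sums_of_squares` are made up for this example rather than taken from any statistics package.

```python
import random

def least_squares(columns, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(columns)
    # Augmented matrix [X'X | X'y]
    a = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(u * v for u, v in zip(columns[i], y))] for i in range(k)]
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(a[r][p]))  # partial pivoting
        a[p], a[best] = a[best], a[p]
        pivot = a[p][p]
        a[p] = [v / pivot for v in a[p]]
        for r in range(k):
            if r != p:
                factor = a[r][p]
                a[r] = [vr - factor * vp for vr, vp in zip(a[r], a[p])]
    return [row[k] for row in a]

def sums_of_squares(x, y, degree):
    """Fit Y on X, X^2, ..., X^degree and return (SSM, SSE)."""
    columns = [[xi ** d for xi in x] for d in range(degree + 1)]  # d = 0: intercept
    b = least_squares(columns, y)
    fitted = [sum(bj * col[i] for bj, col in zip(b, columns)) for i in range(len(y))]
    y_bar = sum(y) / len(y)
    ssm = sum((f - y_bar) ** 2 for f in fitted)
    sse = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
    return ssm, sse

# Simulated data (assumption for illustration): a straight line plus noise.
rng = random.Random(1)
x = [float(v) for v in range(-7, 8)]
y = [3.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

results = {}
for degree in (1, 2, 3):
    ssm, sse = sums_of_squares(x, y, degree)
    msm = ssm / degree  # model df = number of polynomial terms
    results[degree] = (ssm, msm)
    print(f"degree {degree}: SSM = {ssm:.3f}, MSM = {msm:.3f}")
```

Because the simulated relationship really is linear, one would expect SSM to creep up only slightly with each added term while MSM falls, since the small gain in SSM is divided by a larger model df.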
How is this done
- Build a linear model of the data by regressing Y on X
- Center the independent variable X by subtracting the mean of X from each of its values
-> Xc. The main purpose of this is to reduce collinearity between the polynomial terms used as predictors.
- Create the polynomial terms by multiplying each value
in Xc by itself once (quadratic term, Xc^2), twice (cubic term, Xc^3), etc.
- Build regression models of increasing complexity by including additional polynomial terms as predictor variables
- Test whether a higher degree model significantly improves the model's fit
  - calculate F = (SSM for higher degree model - SSM for lower degree model) / (MSE for higher degree model)
  - compare to F-tables with numerator df = 1 (the two models differ by one term) and denominator
    df = residual df of the higher degree model
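The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the data are simulated with a deliberately strong quadratic component, the seed is arbitrary, and the helper names `least_squares` and `sums_of_squares` are hypothetical names chosen for this example.

```python
import random

def least_squares(columns, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(columns)
    a = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(u * v for u, v in zip(columns[i], y))] for i in range(k)]
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(a[r][p]))  # partial pivoting
        a[p], a[best] = a[best], a[p]
        pivot = a[p][p]
        a[p] = [v / pivot for v in a[p]]
        for r in range(k):
            if r != p:
                factor = a[r][p]
                a[r] = [vr - factor * vp for vr, vp in zip(a[r], a[p])]
    return [row[k] for row in a]

def sums_of_squares(x, y, degree):
    """Fit Y on X, X^2, ..., X^degree and return (SSM, SSE)."""
    columns = [[xi ** d for xi in x] for d in range(degree + 1)]  # d = 0: intercept
    b = least_squares(columns, y)
    fitted = [sum(bj * col[i] for bj, col in zip(b, columns)) for i in range(len(y))]
    y_bar = sum(y) / len(y)
    ssm = sum((f - y_bar) ** 2 for f in fitted)
    sse = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
    return ssm, sse

# Simulated data (assumption for illustration): a curved relationship plus noise.
rng = random.Random(42)
x_raw = [float(v) for v in range(1, 16)]
y = [1.0 + 0.5 * xi + 0.8 * (xi - 8.0) ** 2 + rng.gauss(0.0, 1.0) for xi in x_raw]

# Center X by subtracting its mean -> Xc
x_mean = sum(x_raw) / len(x_raw)
xc = [xi - x_mean for xi in x_raw]

# Fit models of increasing complexity: linear, quadratic, cubic
n = len(y)
fits = {degree: sums_of_squares(xc, y, degree) for degree in (1, 2, 3)}

# Test each higher degree model against the next lower one
comparisons = []
for high in (2, 3):
    ssm_low = fits[high - 1][0]
    ssm_high, sse_high = fits[high]
    df_resid = n - (high + 1)          # residual df of the higher degree model
    mse_high = sse_high / df_resid
    f_stat = (ssm_high - ssm_low) / mse_high
    comparisons.append((high, f_stat, 1, df_resid))
    print(f"degree {high} vs {high - 1}: F = {f_stat:.2f}, df = 1 and {df_resid}")
```

Each printed F would then be compared to the F-table value for the listed degrees of freedom. If SciPy happens to be installed, `scipy.stats.f.sf(f_stat, 1, df_resid)` should give the corresponding p-value instead of a table lookup.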
Things to consider
- You can always run this analysis on raw, standardized
(i.e., z-transformed), centered (i.e., mean-subtracted), or ranked data to
make sure they give you essentially similar results.
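A quick way to see this is to run the same quadratic test on differently scaled versions of one predictor. The sketch below assumes simulated data and uses hypothetical helper names (`least_squares`, `sums_of_squares`, `quadratic_f`, `correlation`). Raw, centered, and z-transformed X are linear rescalings of one another, so the F for the quadratic term should agree up to rounding, while the correlation between X and X^2 should shrink once X is centered. Ranked data are not a linear rescaling in general, so their results would only be expected to be similar, not identical.

```python
import random
import statistics

def least_squares(columns, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(columns)
    a = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(u * v for u, v in zip(columns[i], y))] for i in range(k)]
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(a[r][p]))  # partial pivoting
        a[p], a[best] = a[best], a[p]
        pivot = a[p][p]
        a[p] = [v / pivot for v in a[p]]
        for r in range(k):
            if r != p:
                factor = a[r][p]
                a[r] = [vr - factor * vp for vr, vp in zip(a[r], a[p])]
    return [row[k] for row in a]

def sums_of_squares(x, y, degree):
    """Fit Y on X, X^2, ..., X^degree and return (SSM, SSE)."""
    columns = [[xi ** d for xi in x] for d in range(degree + 1)]
    b = least_squares(columns, y)
    fitted = [sum(bj * col[i] for bj, col in zip(b, columns)) for i in range(len(y))]
    y_bar = sum(y) / len(y)
    return (sum((f - y_bar) ** 2 for f in fitted),
            sum((yi - f) ** 2 for yi, f in zip(y, fitted)))

def quadratic_f(x, y):
    """F for adding the quadratic term to the linear model."""
    ssm1, _ = sums_of_squares(x, y, 1)
    ssm2, sse2 = sums_of_squares(x, y, 2)
    return (ssm2 - ssm1) / (sse2 / (len(y) - 3))

def correlation(u, v):
    """Pearson correlation coefficient."""
    mu, mv = statistics.mean(u), statistics.mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

# Simulated data (assumption for illustration only)
x_raw = [1.0, 1.5, 2.2, 3.1, 4.0, 4.8, 5.5, 6.3, 7.4, 8.0, 9.1, 10.0]
rng = random.Random(7)
y = [2.0 + 1.5 * xi - 0.2 * xi ** 2 + rng.gauss(0.0, 0.5) for xi in x_raw]

m, s = statistics.mean(x_raw), statistics.stdev(x_raw)
versions = {
    "raw": x_raw,
    "centered": [xi - m for xi in x_raw],
    "z-transformed": [(xi - m) / s for xi in x_raw],
}

f_values = {name: quadratic_f(x, y) for name, x in versions.items()}
corr_x_x2 = {name: correlation(x, [xi ** 2 for xi in x]) for name, x in versions.items()}

for name in versions:
    print(f"{name}: F = {f_values[name]:.4f}, r(X, X^2) = {corr_x_x2[name]:.3f}")
```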
last modified: 2/1/08
This material is copyrighted and MAY NOT be used for commercial purposes, © 2001-2019 lobsterman.