Curve Fitting with Parametric Regression Analysis
Uses
- Is a straight line the model best suited to describe our data, or is there something to be gained by including curves in our equation? An additional curve is added to the model each time an additional polynomial term of the predictor variable is included in the equation: Y modeled on X produces a straight-line model, Y modeled on X + X² tests for a model with one curve, and Y modeled on X + X² + X³ gives a model with two curves. These curves are represented by adding higher-order polynomial terms to the equation.
- The fit of the model (i.e., the SSM) will certainly increase with each additional polynomial term; however, it is not clear whether the new equation is significantly better. Remember that we lose one df for each term added to our equation, so the MSM term in the numerator of the F-statistic may or may not increase.
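This point can be illustrated with a small sketch (my own example with simulated data, not part of the original notes): SSM never decreases as polynomial terms are added to a nested model, which is exactly why a formal test is needed.

```python
import numpy as np

# Simulated data with a genuine quadratic trend (values are made up)
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 40)
y = 1.0 + 0.5 * x + 0.8 * x**2 + rng.normal(0, 1.0, x.size)

def ssm(x, y, degree):
    """SSM = SST - SSE for a polynomial fit of the given degree."""
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    sse = np.sum(resid**2)
    sst = np.sum((y - y.mean())**2)
    return sst - sse

# SSM for straight-line, quadratic, and cubic models
ssms = [ssm(x, y, d) for d in (1, 2, 3)]
print(ssms)  # never decreases as terms are added
```

The quadratic SSM jumps well above the linear SSM here because the data really are curved; the cubic term adds almost nothing, which is the situation the F-test is meant to detect.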
How is this done
- Build a linear model of the data by regressing Y on X
- Center the independent variable X by subtracting the mean of X from each value -> Xc. The main purpose of this is to reduce collinearity between the independent variables (e.g., between Xc and Xc²).
- Create the polynomial terms by multiplying each value in Xc by itself: once for the quadratic term (Xc²), twice for the cubic term (Xc³), etc.
- Build regression models of increasing complexity by including additional polynomial terms as predictor variables
- Test whether a higher-degree model significantly improves the fit
- calculate F = (SSM for higher-degree model - SSM for lower-degree model) / (MSE for higher-degree model)
- compare to the F-tables with numerator df = 1 and denominator df = residual df of the higher-degree model
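The steps above can be sketched in Python using only numpy (a hedged illustration with simulated data; the variable names and dataset are mine, not from the original notes):

```python
import numpy as np

# Simulated data with a real quadratic component (for illustration only)
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 30)
y = 2.0 + 0.3 * x + 0.15 * (x - x.mean())**2 + rng.normal(0, 0.5, x.size)

xc = x - x.mean()                                        # center X -> Xc
X_lin = np.column_stack([np.ones_like(xc), xc])          # Y ~ Xc
X_quad = np.column_stack([np.ones_like(xc), xc, xc**2])  # Y ~ Xc + Xc^2

def fit_sse_ssm(X, y):
    """Least-squares fit; return (SSE, SSM) where SSM = SST - SSE."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = resid @ resid
    sst = np.sum((y - y.mean())**2)
    return sse, sst - sse

sse_lo, ssm_lo = fit_sse_ssm(X_lin, y)
sse_hi, ssm_hi = fit_sse_ssm(X_quad, y)

df_resid_hi = y.size - X_quad.shape[1]   # n - number of fitted parameters
mse_hi = sse_hi / df_resid_hi
F = (ssm_hi - ssm_lo) / mse_hi           # numerator df = 1 (one term added)
print(F, df_resid_hi)
```

The resulting F would be compared against an F-table (or software-computed critical value) with df = (1, df_resid_hi); an F larger than the critical value indicates that the quadratic term significantly improves the fit.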
Things to consider
- you can always run this analysis on raw data, standardized data (i.e., z-transformed), centered data (i.e., mean subtracted), or ranked data to make sure they give you essentially similar results.
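One such check can be sketched quickly (my own illustration, assuming simulated data): centering X changes the fitted coefficients but not the overall fit, so the raw and centered analyses agree on SSE.

```python
import numpy as np

# Simulated curved data (values are made up for the demonstration)
rng = np.random.default_rng(2)
x = np.linspace(1, 9, 25)
y = 0.5 * x**2 - 2.0 * x + rng.normal(0, 1.0, x.size)

def sse(xvals, y):
    """SSE of a quadratic regression of y on xvals."""
    X = np.column_stack([np.ones_like(xvals), xvals, xvals**2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

sse_raw = sse(x, y)               # analysis on raw X
sse_centered = sse(x - x.mean(), y)  # analysis on centered Xc
print(sse_raw, sse_centered)      # identical up to rounding
```

Standardized or ranked versions of X will not match exactly (ranking in particular changes the model), but they should lead to the same substantive conclusion about curvature.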
last modified: 2/1/08