Abstract
The selection of an optimal regression model comprising linear combinations of various integer powers of an independent variable (explanatory variables) is considered. The optimal model is defined as the most accurate (minimal variance) stable model, where all parameter estimates of the orthogonalized explanatory variables are significantly different from zero. The potential causes that limit the number of terms that can be included in a stable regression model are investigated using two indicators, which measure signal-to-noise ratios in the variables. The truncation-to-noise ratio indicator is used to measure the extent of collinearity between the explanatory variables and the correlation-to-noise ratio indicator to evaluate the significance of the correlation between an explanatory variable and the dependent variable. It is shown that the number of terms that can be included in a stable polynomial model (and its accuracy) depend on the range and precision of the data, the rate of the error propagation during computations, and the algorithm used to calculate the regression parameters. It is demonstrated that it can often be advantageous to include nonconsecutive powers of the independent variable in an optimal polynomial model. An orthogonalized-variable-based stepwise regression procedure is presented, which enables identifying the optimal model in polynomial regression.
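As a rough illustration of the general idea of stepwise selection over orthogonalized power terms, the following Python sketch may help. It is not the procedure from the paper: the function name `orthogonal_stepwise_poly`, the fixed cutoff `t_crit`, and the relative-norm threshold `collin_tol` are assumptions made for this example, and they stand in for the paper's correlation-to-noise and truncation-to-noise ratio indicators, which the abstract does not define.

```python
import numpy as np


def orthogonal_stepwise_poly(x, y, max_power, t_crit=2.0, collin_tol=1e-8):
    """Forward selection of integer powers of x using orthogonalized columns.

    Illustrative sketch only. `t_crit` (a fixed t-statistic cutoff) and
    `collin_tol` (a relative-norm collinearity threshold) are hypothetical
    stand-ins, not the indicators defined in the paper.
    Returns the selected powers in the order they were added.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    # Orthonormal basis of accepted terms; it starts with the intercept.
    basis = [np.ones(n) / np.sqrt(n)]
    resid = y - basis[0] * (basis[0] @ y)
    remaining = list(range(1, max_power + 1))
    selected = []

    while remaining:
        # Stop if the data are already explained to rounding precision.
        if resid @ resid <= 1e-24 * (y @ y):
            break
        # Residual degrees of freedom after adding one more term.
        dof = n - len(basis) - 1
        if dof <= 0:
            break

        Q = np.column_stack(basis)
        best = None
        for p in remaining:
            col = x ** p
            # Gram-Schmidt against the accepted basis, applied twice
            # to limit loss of orthogonality from rounding.
            v = col - Q @ (Q.T @ col)
            v = v - Q @ (Q.T @ v)
            norm = np.linalg.norm(v)
            # Skip powers that are numerically collinear with the basis.
            if norm <= collin_tol * np.linalg.norm(col):
                continue
            q = v / norm
            beta = q @ resid
            # Keep the candidate that removes the most residual variance.
            if best is None or abs(beta) > abs(best[1]):
                best = (p, beta, q)

        if best is None:
            break

        p, beta, q = best
        new_resid = resid - beta * q
        # For an orthonormal column, the standard error of its coefficient
        # equals the residual standard deviation.
        s = np.sqrt((new_resid @ new_resid) / dof)
        if s > 0 and abs(beta) / s <= t_crit:
            break  # coefficient not significantly different from zero

        basis.append(q)
        resid = new_resid
        selected.append(p)
        remaining.remove(p)

    return selected
```

For example, with `x = np.linspace(-1.0, 1.0, 41)` and noise-free data `y = 3.0 + 2.0 * x**2`, calling `orthogonal_stepwise_poly(x, y, max_power=3)` selects only the second power and skips the first, a small instance of a model with nonconsecutive powers.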
Original language | English |
---|---|
Pages (from-to) | 4477-4485 |
Number of pages | 9 |
Journal | Industrial and Engineering Chemistry Research |
Volume | 38 |
Issue number | 11 |
State | Published - 1999 |