There are many issues with the concept of predictor importance in a multiple regression.
- Regression is typically based on observational data, where there is no guarantee that any relationship between a predictor and the outcome variable is causal. In fact, on theoretical grounds, in most applications I have seen in psychological contexts, a third variable or a reciprocal relationship seems more likely.
- Measures of variable importance that depend on the other predictors in the model (e.g., the semi-partial correlation, the standardised beta), well, they depend on the other predictors in the model. This problem becomes increasingly important as the correlations between predictors (i.e., multicollinearity) increase.
- The importance of a predictor from a policy perspective may depend on other factors, such as the cost of manipulating it.
- If predictors vary in their reliability and validity, better prediction by one variable may be partly due to superior measurement rather than to a stronger relationship between the underlying phenomena.
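The dependence of semi-partial correlations on the other predictors is easy to see in a small simulation: hold the data-generating model fixed and only increase the correlation between two predictors, and each predictor's semi-partial correlation shrinks while its zero-order correlation grows. Below is a minimal sketch in Python; the 0.5/0.5 predictor weights and variable names are illustrative assumptions, not taken from any real data set.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # large n so sampling noise is negligible

def semi_partial_r(y, x1, x2):
    """Semi-partial correlation of x1 with y: partial x2 out of x1 only."""
    X = np.column_stack([np.ones_like(x2), x2])
    beta, *_ = np.linalg.lstsq(X, x1, rcond=None)
    e = x1 - X @ beta          # the part of x1 independent of x2
    return np.corrcoef(y, e)[0, 1]

results = {}
for r12 in (0.0, 0.4, 0.8):
    # two standard-normal predictors with population correlation r12
    z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
    x1 = z1
    x2 = r12 * z1 + np.sqrt(1 - r12**2) * z2
    # the structural model never changes: y depends equally on both predictors
    y = 0.5 * x1 + 0.5 * x2 + rng.standard_normal(n)
    results[r12] = (np.corrcoef(y, x1)[0, 1], semi_partial_r(y, x1, x2))
    print(f"r12={r12:.1f}  zero-order={results[r12][0]:.3f}  "
          f"semi-partial={results[r12][1]:.3f}")
```

Note that nothing about the relationship between x1 and y changes across the three runs; only the predictor intercorrelation does, which is exactly why importance measures of this kind need to be interpreted relative to the specific set of predictors in the model.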
When I consult with researchers who want to say something about variable importance in their data, I tend to give the following advice:
- Consider the issues mentioned above. That is: 1) consider alternative causal explanations; 2) also report a measure of variable importance that is independent of the other predictors in the model (e.g., the zero-order correlation); 3) consider other factors that policy makers might weigh when judging variable importance; 4) examine the reliability of the measures using standard indices, and consider how well the variables actually measure the latent theoretical concepts. If reliabilities differ substantially, SEM-based approaches that adjust for reliability may be better. Having predictors that are all of high reliability and validity in the first place is better still.
- A quick and easy way to examine variable importance, which I find reasonable, involves examining and reporting the zero-order correlation and the semi-partial correlation for each predictor. The zero-order correlation (i.e., the standard correlation) tells you the degree to which the predictor is related to the outcome variable, ignoring the other predictors. The squared semi-partial correlation tells you the unique percentage of variance in the outcome variable explained by the target predictor over and above the other predictors. If you want to test whether the differences between the zero-order correlations of the different predictors are statistically significant, check out my posts on examining the difference between non-independent correlations. Andy Field describes how to read this information from SPSS output. The zero-order and semi-partial correlations may rank-order the predictors in the same way; if they do not, consider the role of multicollinearity in influencing the semi-partial correlations.
- For more information about the literature on relative importance, have a look at some of the links on Ulrike Grömping's site.
- For further information on multiple regression, one online reference is this book by Cohen and colleagues. My own material on multiple regression is here.
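To make the quick-and-easy approach above concrete, here is a short sketch on simulated data (the predictor weights and correlation structure are arbitrary illustrations; in practice you would read these quantities off SPSS or R output). The squared semi-partial correlation for each predictor is computed as the drop in R-squared when that predictor is removed from the model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000
# hypothetical data: three intercorrelated predictors and one outcome
L = np.array([[1.0, 0.0, 0.0],
              [0.4, 1.0, 0.0],
              [0.3, 0.4, 1.0]])
X = rng.standard_normal((n, 3)) @ L.T
y = 0.4 * X[:, 0] + 0.3 * X[:, 1] + 0.1 * X[:, 2] + rng.standard_normal(n)

def r_squared(X, y):
    """R-squared from an OLS fit with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

r2_full = r_squared(X, y)
report = []
for j in range(X.shape[1]):
    r_zero = np.corrcoef(y, X[:, j])[0, 1]
    # squared semi-partial = unique variance explained by predictor j
    sr2 = r2_full - r_squared(np.delete(X, j, axis=1), y)
    report.append((r_zero, sr2))
    print(f"x{j+1}: zero-order r = {r_zero:.3f}, sr^2 = {sr2:.3f}")
```

Reporting both columns side by side makes any divergence between the two rank orderings, and hence any multicollinearity story, immediately visible.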
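On the reliability point raised earlier, Spearman's classic correction for attenuation gives a quick sense of how much measurement error can dilute an observed correlation: the estimated true-score correlation is the observed correlation divided by the square root of the product of the two reliabilities. A minimal sketch (the example reliabilities are made up):

```python
import math

def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Spearman's correction for attenuation:
    estimated true-score r = observed r / sqrt(rel_x * rel_y)."""
    return r_xy / math.sqrt(rel_x * rel_y)

# e.g., an observed r of .30 between scales with reliabilities .70 and .80
print(round(disattenuate(0.30, 0.70, 0.80), 3))  # 0.401
```

This is only a back-of-the-envelope check; for a serious analysis with unreliable measures, the SEM-based approaches mentioned above are preferable.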