This scenario comes up in many settings: items of a psychological scale, or multiple scales measuring similar constructs (e.g., personality, symptoms, performance, or ability). The example I discuss here involves a set of ability tests. For example, if you have scores for participants on ten ability tests, it may be useful to form one or more composites, which can then be used in subsequent analyses.
THE PROCEDURE
Step 1: Decide which variables should form composites
Step 2: Compute composites
Step 3: Use composites in subsequent analyses.
Step 1: Decide which variables should form the composites
Three major sources of information inform which variables should be grouped together to form composites: (1) data, (2) aims, and (3) theory.
1.1) Data: All else being equal, it makes more sense to combine variables that are correlated with each other. Thus, if you examine the correlation matrix for the set of tests and see that a subset of tests correlate highly with each other (e.g., r greater than around .4, with higher thresholds implying a tighter cluster), this suggests that this subset is measuring something in common. A more sophisticated approach to this task involves running a factor analysis or principal components analysis.
In the following link I provide notes for a lecture on Factor Analysis and PCA with practice questions. The example in the lecture is based on my own data, where I assessed whether nine ability tests could be reduced to three abilities. Much can be said about factor analysis, and I don't wish to discuss all the issues here (see books like Tabachnick and Fidell or Hair et al.). However, after completing your factor analysis, you should have worked out how many components you want to extract and which variables will be included in each.
1.2) AIMS: It is important to think about the purpose of forming composites in relation to your analyses. For example, if you have 10 ability tests, you might only be interested in having a general measure of intelligence. In which case, it might be sufficient to create a single composite based on all tests. In other cases, such as in neuropsychology, where particular deficits are theorised, or in settings where you are interested in the differential prediction of classes of ability tests, a fine grained split may be of interest. In general, there is a trade-off between complexity and parsimony.
1.3) THEORY: It is also useful to think about the theory of how the individual tests relate. This is particularly important if you have a small sample (e.g., n less than 50), such that factor analysis might not be possible, or even if it is possible, results might not be especially reliable. Theory and past research may suggest that the tests should be grouped in particular ways.
After thinking about the data, your aims, and theory, you should have decided which tests will be combined to form composites.
Step 2: Compute composites
Two main options for forming composites are ‘factor saved scores’ and creating your own weighted composites.
2a) Factor Saved Scores: In the case of factor saved scores, you let the factor analytic procedure compute its own composites based on the results of the factor analysis. SPSS has a button called “Scores…” which lets you save scores. See Andy Field's Factor Analysis notes for more information.
2b) Your own weighted composite: This typically involves creating a linear composite of the component variables. For example, assume you have three tests called “(EV) everyday vocabulary”, “(AV) advanced vocabulary”, and “(C) comprehension”. As a result of the factor analysis you have decided to combine these three tests into a composite. A simple procedure would be to say that: composite = EV + AV + C. The problem with this approach is that the three tests often have different metrics. One may be percentage correct, another may be the number solved, and so on. The result is that the tests with larger standard deviations will be weighted more in the composite. Generally, we want to weight all the tests equally. At the very least we want to be in control of the weighting; we don’t want to leave the weighting up to some arbitrary consequence of the metrics of the variables.
Thus, a common procedure is to:
2b.1) convert the raw test scores to z-scores, and
2b.2) add up the z-scores.
2b.1) Convert each raw score to a z-score:
The formula for a z-score is (score – mean) / standard deviation.
a) SPSS will do this for you using Analyze - Descriptive Statistics - Descriptives, then ticking "Save standardized values as variables"
b) Alternatively you can get the descriptive statistics for your variable (mean and standard deviation) and then use Transform - Compute to create a new variable.
In syntax it would look like this:
compute zvocab = (vocab - 10) / 2.
execute.
This assumes that your raw variable for the test is vocab, that the mean of the raw scores is 10, and that the standard deviation of the raw scores is 2.
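The same calculation, sketched in Python with hypothetical raw scores (the variable name and numbers are invented; here the mean and SD are computed from the sample rather than entered by hand):

```python
from statistics import mean, stdev

# Hypothetical raw vocabulary scores for six participants
vocab = [8, 12, 10, 14, 6, 10]

m, sd = mean(vocab), stdev(vocab)       # sample mean and SD
zvocab = [(x - m) / sd for x in vocab]  # z = (score - mean) / SD
```

After standardising, the new variable has a mean of 0 and a standard deviation of 1, so tests on different metrics can be added together on an equal footing.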
Option b above is essential when you have two or more time points. This is because you will want to use a common mean and standard deviation to standardise the variables at both time points. If you standardise within each time point separately, you will remove any change in scores over time.
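To make this concrete, here is a sketch (again with invented data) of standardising two time points against a common baseline mean and SD, so that change over time survives in the z-scores:

```python
from statistics import mean, stdev

# Hypothetical vocabulary scores for the same six participants
# at two time points; everyone improved a little at time 2.
vocab_t1 = [8, 12, 10, 14, 6, 10]
vocab_t2 = [10, 13, 12, 15, 9, 13]

# Standardise BOTH waves against the time-1 mean and SD
m1, sd1 = mean(vocab_t1), stdev(vocab_t1)
z_t1 = [(x - m1) / sd1 for x in vocab_t1]
z_t2 = [(x - m1) / sd1 for x in vocab_t2]

# mean(z_t2) is above 0, reflecting the improvement; standardising
# each wave separately would force both means to 0 and hide the change.
```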
2b.2) add-up the z-scores:
In SPSS this can be done using Transform >> Compute
In syntax it might look like this:
compute verbaltot = zvocaba + zvocabe + zcompr.
This assumes that you want your new variable to be called verbaltot, and that you have already created z-score versions of your three tests.
Note that you will have to adjust the above approach if you have any reversed tests. For example, on measures of reaction time or error counts, low scores indicate more ability, whereas on measures based on number of items answered correctly, high scores indicate more ability. In these situations, you will need to either reverse the z-scores before forming the composite or place a minus sign, instead of a plus sign, before the test in the compute statement.
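Putting steps 2b.1 and 2b.2 together, here is a sketch in Python of forming a unit-weighted composite that includes one reversed test (all test names and scores are hypothetical; I use a reaction-time measure as the reversed test, since lower times indicate more ability):

```python
from statistics import mean, stdev

def z(xs):
    """Convert a list of raw scores to z-scores."""
    m, sd = mean(xs), stdev(xs)
    return [(x - m) / sd for x in xs]

# Hypothetical z-scored tests for six participants
z_everyday = z([12, 15, 9, 14, 11, 16])         # higher = better
z_advanced = z([10, 14, 8, 13, 10, 15])         # higher = better
z_rt       = z([450, 380, 520, 400, 470, 360])  # reaction time: LOWER = better

# Subtract the reversed test instead of adding it
composite = [ev + av - rt
             for ev, av, rt in zip(z_everyday, z_advanced, z_rt)]
```

The participant who scores best on all three tests (highest vocabulary scores, fastest reaction time) ends up with the highest composite, which is the sanity check worth running whenever reversed tests are involved.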
Step 3: Use composites in subsequent analyses.
The composites can then be used in subsequent analyses: as predictors in regressions, as dependent variables in group comparisons, and so on. The benefit is that you have reduced the complexity of your data and are able to present a more parsimonious explanation.
When it comes to reporting your decision to combine scales, you will want to give a justification and a description. The justification should make reference to data, theory, and your aims. The description can be as simple as this one from Ackerman and Cianciolo (2000): "To provide more stable measures of the underlying abilities, composites were formed with unit-weighted z scores of constituent tests" (p. 264).
- Ackerman and Cianciolo (2000, Cognitive, Perceptual-Speed, and Psychomotor Determinants of Individual)
- Further elaboration of use of factor analysis in this context can be seen in: Goff, M., & Ackerman, P. L. (1992). Personality-intelligence relations: Assessment of typical intellectual engagement. Journal of Educational Psychology, 84, 537-552.