Jeromy Anglim's Blog: Psychology and Statistics


Thursday, June 17, 2010

Canonical Correlation: Getting Started with R or SPSS

Canonical correlation is a method of modelling the relationship between two sets of variables. This post provides: (a) Examples of when canonical correlation can be useful; (b) Links to good online resources where you can learn about the technique; (c) Links to examples of running the analysis in R or SPSS; and (d) Examples of articles showing how to report a canonical correlation analysis.

General Thoughts

You might want to consider a canonical correlation analysis in situations where you have:
  • a set of predictors of task/job performance (e.g., ability, personality, demographics) and a set of performance measures (e.g., speed, accuracy)
  • two tests each with multiple scales that are meant to measure similar things (e.g., two measures of the Big 5 personality factors). This can be useful if you are trying to validate a newer measure against a pre-existing measure.
  • a set of self-report measures and a set of behavioural measures (e.g., performance on a task)
  • a set of brain scan measures and a set of behavioural measures
In essence, canonical correlation tends to be interesting when you have two classes of variables and you are interested in what, if any, relationship might exist between the sets. It is a parsimonious way of dealing with the challenges of multivariate data. However, it can sometimes be more exploratory than you might want.
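
As a rough illustration of what the technique computes, here is a minimal Python sketch (not R or SPSS, and not a substitute for either). It assumes exactly two variables per set so that the matrix algebra can be written out by hand; the squared canonical correlations are the eigenvalues of inv(Sxx) Sxy inv(Syy) Syx. The data are simulated, and the variable names (ability, conscientiousness, speed, accuracy) and the function canonical_correlations_2x2 are made up for the example.

```python
import math
import random

def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

def cov_block(cols_a, cols_b):
    """Covariances between every variable in cols_a and every variable in cols_b."""
    return [[covariance(a, b) for b in cols_b] for a in cols_a]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def multiply(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def canonical_correlations_2x2(set1, set2):
    """Canonical correlations for two sets of exactly two variables each."""
    sxx, syy = cov_block(set1, set1), cov_block(set2, set2)
    sxy, syx = cov_block(set1, set2), cov_block(set2, set1)
    m = multiply(multiply(inverse_2x2(sxx), sxy),
                 multiply(inverse_2x2(syy), syx))
    # Eigenvalues of a 2x2 matrix from its trace and determinant.
    half_trace = (m[0][0] + m[1][1]) / 2
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(max(half_trace ** 2 - det, 0.0))
    eigenvalues = [half_trace + root, half_trace - root]
    # Clamp tiny floating-point overshoots before taking square roots.
    return [math.sqrt(min(max(e, 0.0), 1.0)) for e in eigenvalues]

# Simulated (made-up) data: two predictors and two performance measures.
random.seed(1)
n = 200
ability = [random.gauss(0, 1) for _ in range(n)]
conscientiousness = [random.gauss(0, 1) for _ in range(n)]
speed = [a + random.gauss(0, 1) for a in ability]
accuracy = [0.5 * a + 0.5 * c + random.gauss(0, 1)
            for a, c in zip(ability, conscientiousness)]

predictors = [ability, conscientiousness]
performance = [speed, accuracy]
print(canonical_correlations_2x2(predictors, performance))
```

The first value returned is the largest correlation obtainable between a weighted composite of the predictors and a weighted composite of the performance measures; the second is the largest correlation obtainable from a further pair of composites that is uncorrelated with the first pair. With more than two variables per set you would use a general-purpose routine rather than this hand-rolled version.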

General References

The following references provide an overview of the technique.

R

If you want to implement your analyses in R, the following may provide a useful starting point.

SPSS

If you want to use SPSS to run a canonical correlation, these tutorials may be useful. Note that the location of the script required to run CANCORR changes between versions and installations of SPSS. David Garson sets out the following code template:
INCLUDE 'c:\Program Files\SPSS\Canonical correlation.sps'.
CANCORR SET1=varlist/
SET2=varlist/.
In contrast, on my installation, the script line needs to be:
INCLUDE 'C:\Program Files\SPSSInc\PASWStatistics18\Samples\English\Canonical correlation.sps'.
To find the script on your installation, go to your SPSS installation and search for "canonical".
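
If you would rather search from a script than through the file browser, the following Python sketch walks a folder tree and prints an INCLUDE line for each match. The helper name find_scripts and the starting folder are assumptions made for this example; point it at wherever SPSS is installed on your machine.

```python
import os

def find_scripts(root, keyword="canonical", extension=".sps"):
    """Return paths under root whose file name contains keyword and ends with extension."""
    matches = []
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            lowered = name.lower()
            if keyword in lowered and lowered.endswith(extension):
                matches.append(os.path.join(folder, name))
    return sorted(matches)

if __name__ == "__main__":
    # Placeholder starting folder; change it to match your own installation.
    for path in find_scripts(r"C:\Program Files"):
        print(f"INCLUDE '{path}'.")
```

The match is case-insensitive, so it finds "Canonical correlation.sps" regardless of how the file name is capitalised. If nothing is printed, the starting folder is probably wrong for your installation.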

Journal Articles reporting Canonical Correlation Analysis

The following references provide examples of how to justify, present, and interpret the results of a canonical correlation analysis.
  • Rossier, Meyer and Berthoud (2004) explore the commonalities between the NEO and 16PF personality inventories.
  • Satterfield, Buelow, Lyddon, and Johnson (1995) look at the relationship between client expectations and attitude to change in a clinical psychology setting.
  • Luthans, Welsh, and Taylor (1988) present a canonical correlation relating a set of managerial effectiveness measures to a set of managerial activities.

References

  • Luthans, F.; Welsh, D. & Taylor III, L. A descriptive model of managerial effectiveness. Group & Organization Management, East Acad Manage, 1988, 13, 148.
  • Satterfield, W.; Buelow, S.; Lyddon, W. & Johnson, J. Client stages of change and expectations about counseling. Journal of Counseling Psychology, 1995, 42, 476-478.
  • Sherry, A. & Henson, R. Conducting and interpreting canonical correlation analysis in personality research: A user-friendly primer. Journal of Personality Assessment, Routledge, 2005, 84, 37-48.
  • Tabachnick, B. G. & Fidell, L. S. Using Multivariate Statistics. 3rd Edition. HarperCollins College Publishers, 1996.

12 comments:

  1. Hello Jeromy,

    Thanks for posting the link to UCLA's tutorial on performing canonical correlation analysis in R. I'm taking a multivariate stats course in which our professor expects us to teach ourselves how to use R so that we may complete the homework assignments, and the UCLA tutorial was invaluable for our latest assignment in CCA.

    Cheers,
    -Morgan
    Atlanta, GA

  2. Thanks a tonne. I was struggling with the location of the file in SPSS 17.
    omkumar krishnan

  3. I conducted a canonical correlation using the SPSS syntax
    MANOVA DVs with IVs
    /DISCRIM = COR
    /PRINT = SIGNIF(EIGEN DIMENR)

    The IVs were the 4 Myers-Briggs scales with continuous scores, and the DVs were the Modified Instructional Perspectives Inventory Total Score and seven factors.


    I received the following warning: “The WITHIN CELLS error matrix is SINGULAR. These variables are LINEARLY DEPENDENT on preceding ones .. F7 Multivariate tests will be skipped.”

    A table with Eigenvalues and Canonical Correlations did not appear, nor did I find a Dimension Reduction Analysis. I do have t-values and sig. of t, as well as a summary with F and sig. of F. Factorial MANOVA indicates that no one scale serves as a predictor; however, the interaction of the 4 scales does.

    How should I interpret the warning? Should I rerun the Canonical Correlation? If so, what adjustments would you suggest?

    Thanks

    moehlp@umsl.edu

  4. @Anon
    A common cause is including an overall scale and subscales (i.e., the overall scale is probably a composite of the subscales; and thus provides no additional information).

    I'd have a look and see whether one or more of your variables is a composite of the other variables.

    Ipsative scales can also sometimes cause problems.

    Small samples relative to the number of variables can also occasionally cause problems.

    For a more comprehensive answer, ask the question on stats.stackexchange.com

  5. Thanks for the valuable information. I used the same procedure for my study. I have two sets of variables: 6 are from the assets side of the balance sheet and 6 are from the liabilities side. Unfortunately, I could not calculate the redundancy coefficient. Can I get more information?

  6. @anonymous sounds like a good question for http://stats.stackexchange.com/

  7. Thanks Jeromy

    When I use this command:
    INCLUDE 'c:\Program Files\SPSS\Canonical correlation.sps'.
    CANCORR SET1=varlist/
    SET2=varlist/.
    I got the following information at the end of the output about the redundancy analysis:

    The canonical scores have been written to the active file.
    Also, a file containing an SPSS Scoring program has been written
    To use this file GET a system file with the SAME variables
    Which were used in the present analysis. Then use an INCLUDE command
    to run the scoring program.
    For example :

    GET FILE anotherfilename
    INCLUDE FILE "CC__.INC".
    EXECUTE.

    Please clearly explain it.

  8. @Anonymous It's been a while since I've used the SPSS canonical correlation procedure; I'm not sure what it means.

    I'd take a guess and say that the "CC__.INC" file contains a set of compute statements or something similar for computing scores on the canonical variates. Thus, if you had another SPSS data file with the same variables, you could open this data file (with GET FILE) and run these compute statements (using INCLUDE FILE and then EXECUTE).

  9. Would this be appropriate to use when you are analyzing 2 very different types of variables (MACI personality scales and DSM-IV screener symptom counts) against change outcomes for a behavioral measure with multiple dimensions/factors? MACI scores have a 0 to 150 range. The range of screener scores varies depending on clinical diagnosis. I have no idea what to do with the change scores in this analysis. I have run the analysis in MR with ANCOVA, residualized change scores, and gain scores approaches, but I'm not sure if I could use the CC approach with this and, if so, what to do.

    Replies
    1. It seems like there are multiple levels to your question. You might want to post to stats.stackexchange.com and post a link back here. On that site, we'll have a better opportunity to work through the issues. Also, I imagine you'd have to provide more details and context, and there might be more than one question there.

  10. Hi Jeromy
    First of all, thanks for your information. I want to ask a question. I'm using canonical correlation for my doctoral work and I've run into a problem. I wrote the syntax in SPSS 20, but the program gives this error:
    "Error # 1. Command name: CANCORR
    The first word in the line is not recognized as an SPSS Statistics command.
    Execution of this command stops."

    What should I do in this case?

    Replies
    1. Possibly the line has not been updated on your computer to refer to the correct location of the following file.

      INCLUDE 'c:\Program Files\SPSS\Canonical correlation.sps'.

      Find it on your hard drive and include the appropriate path.
