Jeromy Anglim's Blog: Psychology and Statistics


Thursday, February 19, 2009

Single Group Correlational Study: Basic Analyses

This post sets out a basic procedure for analysing a single group observational study in psychology. It aims to provide a starting point, particularly for researchers who analyse data infrequently.

The Typical Study
I speak to many researchers with the following study design: a single group of participants from whom a number of measures have been obtained. Commonly there are between 50 and 200 participants. The variables tend to involve self-report psychological tests, demographics, and possibly some ability tests. Data is obtained at a single time point.

Students analysing such data for the first time are often looking for a basic framework that can help them structure the process.

This post assumes that the reader:
  • Is looking for a basic framework to get started
  • Is using SPSS to analyse their data
Basic Resources
  • The SPSS Survival Manual (see the Unimelb bookstore) is an excellent reference for psychology coursework students who need to analyse their thesis data.
  • Google also has many resources. A simple search for the statistical technique, with the possible addition of the search term SPSS, should bring up many useful results.
  • I also have a basic set of notes on SPSS.
A BASIC PROCEDURE
1. Get the raw data into SPSS

2. Incorporate metadata 
This includes adding value labels and variable labels, changing variable names, and so on.
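If you prefer syntax to the menus, metadata can be added with commands like the following (the variable name, labels, and codes here are hypothetical placeholders):

  * Hypothetical example: rename a default variable and add labels.
  RENAME VARIABLES (var00001 = sex).
  VARIABLE LABELS sex 'Participant sex'.
  VALUE LABELS sex 1 'Male' 2 'Female'.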

3. Data Checking
Check that the data has been entered accurately.
Look at frequencies; look at distributions; look for impossible values.
The main task is to compare the data with your expectations. You will want to do more of this in relation to particular analyses. See my workshop or search the web for more information about each of these checks.
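For example, one quick way to scan for impossible values is to run frequencies with minimum and maximum statistics (the variable names below are placeholders for your own, and item1 TO item10 assumes the items sit consecutively in the file):

  * Scan for out-of-range values and odd distributions.
  FREQUENCIES VARIABLES=age sex item1 TO item10
    /STATISTICS=MINIMUM MAXIMUM
    /HISTOGRAM.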


4. Compute Variables
Once you have your raw data, there are several new variables that you may want to create. Now is the time to create them, although as you proceed with your analyses you may need to return to this phase if you find you want variables that you did not anticipate.

The most common are:
Reversing items and scales: 
This is necessary when you have psychological tests where some items are negatively worded. You will need to reverse the negatively worded items before forming composites.

Other times you may want to reverse whole scales.

In SPSS, use either Transform - Recode or Transform - Compute.

The standard formula to reverse a variable while maintaining the same scale as the original variable is:  (max + min) – (score); where max is the maximum possible score, min is the minimum possible score, and score is the variable itself.
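For example, for a hypothetical negatively worded item (item3) scored on a 1-to-5 response scale, the formula gives (5 + 1) − score:

  * Reverse item3, assumed to be scored 1 to 5: (max + min) - score.
  COMPUTE item3r = 6 - item3.
  EXECUTE.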

Totals and means: 
Totals and means are used when computing scale scores for psychological tests, whether they are personality-style tests or ability tests based on a set of correct and incorrect answers.
Transform - Compute using the Mean or Sum function is the most common approach.
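As a sketch, assuming a hypothetical five-item scale (with item3 already reversed as item3r):

  * Scale score as a mean; MEAN.4 requires at least 4 valid responses.
  COMPUTE extrav_mean = MEAN.4(item1, item2, item3r, item4, item5).
  COMPUTE extrav_total = SUM(item1, item2, item3r, item4, item5).
  EXECUTE.

Means have the advantage of staying on the original response scale, which makes them easier to interpret.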

Collapsing categories: 
You may wish to reduce the number of categories in a variable. For example, if you have a nationality variable, you may wish to reduce it to Australian versus Not Australian.
Transform - Recode is one common tool.
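For instance, assuming nationality is coded with 1 = Australian, a recode into a new binary variable might look like this:

  * Collapse nationality (assumed: 1 = Australian) into a binary variable.
  RECODE nationality (1=1) (ELSE=0) INTO australian.
  VALUE LABELS australian 1 'Australian' 0 'Not Australian'.
  EXECUTE.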


Transformations: 
Transformations are commonly used to adjust non-normal data. The SPSS Survival Manual and Tabachnick and Fidell list typical functions (square root, log, inverse) that can be used. Osborne discusses this further.
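As a sketch, for a hypothetical positively skewed variable (reaction_time), the typical candidates look like this; the + 1 guards against taking the log or inverse of zero:

  * Common transformations for positive skew (reaction_time is hypothetical).
  COMPUTE rt_sqrt = SQRT(reaction_time).
  COMPUTE rt_log = LG10(reaction_time + 1).
  COMPUTE rt_inv = 1 / (reaction_time + 1).
  EXECUTE.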

5. Describing scale properties
When forming scales from test items, two common steps are: a) checking reliability, and b) checking factor structure.

5.a Reliability Analysis
Cronbach's alpha is typically reported for each scale (although there are other indices of reliability).
To obtain alpha in SPSS, you need to have entered your data at the item level.
Enter the items into Analyze - Scale - Reliability Analysis.

Andy Field provides information on interpretation.
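In syntax, a minimal sketch for a hypothetical five-item scale looks like this; /SUMMARY=TOTAL adds the item-total statistics, including alpha if item deleted:

  * Cronbach's alpha for a hypothetical five-item scale.
  RELIABILITY
    /VARIABLES=item1 item2 item3r item4 item5
    /SCALE('Extraversion') ALL
    /MODEL=ALPHA
    /SUMMARY=TOTAL.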

5.b Factor Analysis
If your psychological test includes multiple factors, it can be worthwhile to run an exploratory factor analysis to check that the items load on the factors they are meant to.

The importance of this step and the amount of information you provide about it depend on a number of considerations.

If the scale has never been used, is rarely used, or has rarely been used with samples like yours, you should run a factor analysis. There are also occasions where you may not be able to place much confidence in a factor analysis: e.g., the sample size is small, or the item-factor correlations are not particularly distinct.

Additional resources include the UCLA SPSS Output pages and Andy Field's notes on factor analysis.
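A minimal syntax sketch, assuming ten hypothetical items and an oblique rotation (principal axis factoring is one common extraction choice):

  * Exploratory factor analysis of ten hypothetical items.
  FACTOR
    /VARIABLES item1 TO item10
    /PRINT INITIAL EXTRACTION ROTATION
    /CRITERIA MINEIGEN(1)
    /EXTRACTION PAF
    /ROTATION OBLIMIN.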


6. Univariate descriptive statistics
You will want to report basic descriptive statistics for all of the main variables in your study. In the case of metric variables, this includes at least means, standard deviations, and some indication of the distribution (skewness and kurtosis, outliers). You might also report the median, minimum, maximum, and more. In the case of nominal variables, this includes percentages for each of the categories.

SPSS Frequencies is one tool among many for the job; most options are found under the Analyze - Descriptive Statistics menu.
You may also wish to compare your means and standard deviations to any available norms.
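For metric variables, one approach is to suppress the frequency tables and request just the statistics (the variable names are placeholders):

  * Descriptives for metric variables; /FORMAT=NOTABLE suppresses the tables.
  FREQUENCIES VARIABLES=age extrav_mean
    /FORMAT=NOTABLE
    /STATISTICS=MEAN STDDEV MEDIAN MINIMUM MAXIMUM SKEWNESS KURTOSIS.
  * Percentages for nominal variables.
  FREQUENCIES VARIABLES=sex australian.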

7. Correlation Matrix
It is useful to know the correlations between the main variables in the study. This matrix is usually presented in your report and allows the reader to explore the relationships in your data. A separate post sets out how to format the correlation matrix.
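A minimal sketch (the variables listed are placeholders for your own main variables):

  * Correlation matrix for the main study variables (hypothetical names).
  CORRELATIONS
    /VARIABLES=age extrav_mean wellbeing
    /PRINT=TWOTAIL NOSIG
    /MISSING=PAIRWISE.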


8. Model Testing
There are many options at this point:

Multiple Regression:
  • Standard multiple regression: predicting a DV from two or more other variables.
  • Hierarchical regression: seeing whether one or more predictors predict the DV over and above other variables (see the syntax sketch below).
  • Moderator regression: simply multiple regression with interaction terms.
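As a sketch of hierarchical regression with hypothetical variables, entering demographics in block 1 and the predictor of interest in block 2; /STATISTICS CHANGE reports the R-squared change between blocks:

  * Hierarchical regression: does extraversion predict wellbeing
  * over and above age and sex? (All variable names are hypothetical.)
  REGRESSION
    /STATISTICS COEFF R ANOVA CHANGE
    /DEPENDENT wellbeing
    /METHOD=ENTER age sex
    /METHOD=ENTER extrav_mean.

For moderator regression, you would compute a product term (e.g., COMPUTE int_term = cent_x * cent_z, after centring the predictors) and enter it in a final block.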


Mediation: testing whether the effect of a predictor on the DV operates through one or more intervening variables.

Structural Equation Modelling: a more general framework for testing models, including those involving latent variables and multiple structural paths.


There are many other more advanced techniques, including multidimensional scaling, cluster analysis, and partial correlation, just to name a few. If you have categorical data, you may also want to look into techniques such as chi-square, loglinear modelling, logistic regression, and discriminant function analysis.

CONCLUSION
The above reflects a very basic set of analyses. It aims to be a starting point for student researchers who are running analyses on such a dataset for the first time and are looking for some structure. It should be read alongside a good statistics book that touches on some of the many subtler issues involved.