Jeromy Anglim's Blog: Psychology and Statistics


Saturday, September 19, 2009

Introduction to Journal Article Deconstruction

One of the most powerful strategies that I use to learn how to write journal articles is to consciously study the writing conventions of good journal articles. I often want to communicate this strategy to other researchers who are battling the process of writing research. This is particularly the case with results sections. Thus, I plan to post various case studies using this approach to demonstrate how it works. Perhaps the principles generated from the deconstruction will also be relevant to others. I'll post all such instances with the label Article Deconstruction.



Case Study of Article Deconstruction: Ackerman (1992)
Writing Needs Analysis: I'm currently in the process of writing up the results of a study in skill acquisition. The study involved examining performance on a text editing task over a period of practice trials. The data can be viewed at multiple temporal levels: key strokes; trials; blocks of trials; sets of blocks; and overall experiment. As well as block performance there are many other measures of interest such as whether a wide range of strategies have been used, whether the individual accessed additional help, and so on. The data can also be analysed at the group-level or at the individual-level. Given the complexity in this data I have been trying to get inspiration on how to organise and sequence both my analyses in R and the actual write-up of the results.

Selecting an Article for Deconstruction: My strategy was to find some good examples of articles in the literature and deconstruct their writing style to extract information about the genre of writing up a results chapter for such a study. I figured that providing an example of this process might be of benefit to other researchers who are working out how to write up their own results.

I selected an article by Phillip L. Ackerman. The article explores effects of practice and ability on a performance task of similar complexity to my text editing task. My prior knowledge of the high quality of the author's work and the journal's standard also guided my selection. The aim of Article Deconstruction is to identify principles that guided the creation of the work, and then use these principles to guide your own future work. My analysis focuses exclusively on the results section.

Reference: Ackerman, P. L. (1992). Predicting Individual Differences in Complex Skill Acquisition: Dynamic Ability Determinants. Journal of Applied Psychology, 77, 598-614.

Observations - Structure and sequence: The overall results section is divided into several main sections: 1) Introductory paragraph; 2) Practice on performance; 3) The moderating role of practice on the ability-performance relationship; 4) Gender differences; 5) Self report measures and the effect of practice on self-report measures.

Principles - Structure and sequence: Results can be divided into sections. Sections can be defined by various criteria including: the dependent variable; the independent variable; the relationship of variables.
Sequencing of sections should be logical. A correlation matrix and descriptive statistics come before model testing. Discussion of main effects generally comes before discussion of interactions. All else being equal, results that pertain to more central research questions come before results that pertain to secondary research questions.

Observations - Introductory paragraph: The introductory paragraph introduces a correlation matrix for the ability tests measured in the study. It includes means, standard deviations, and reliability. Ackerman comments generally on the correlation matrix. He explains how the variables were used to form composites and presents the correlations between these composites.

Principles - Table of correlation matrix: The correlation matrix is an excellent way to show the pattern of relationships in the data. It can be readily supplemented with additional information, such as means, standard deviations, and reliabilities. The convention of numbering each of the variables in the correlation matrix and then presenting only the numbers in the column headings allows for a concise presentation. Presenting correlations to three decimal places gives future readers who want to do structural equation modelling on the matrix reasonable precision (even if two decimals are sufficient for general interpretation). All means and standard deviations have the same number of digits after the decimal point and are right-aligned. The numeric index of variables (i.e., the first column) is right-aligned. Variable names are left-aligned. Superscripts of a, b, c, etc. can be used to comment on particular cells. It is generally cleaner to provide a general note at the bottom of the table stating that correlations greater than a certain absolute value are statistically significant than to use the star notation. Table titles are typically a noun phrase, which in the case of correlation matrices includes the summary statistics presented and a broad description of the category of variables used. Correlation matrices involving composite measures should generally be presented separately from their component variables because composite measures are correlated with their components by virtue of their construction.
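To make these layout conventions concrete, here is a minimal sketch in Python that prints a lower-triangle correlation table with a numeric index, left-aligned variable names, right-aligned means and standard deviations, and correlations to three decimals. The variable names and scores are hypothetical and invented purely for illustration; they are not Ackerman's data, and this is only one way such a table might be assembled.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def sd(xs):
    # Sample standard deviation (n - 1 in the denominator).
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def pearson(xs, ys):
    # Pearson product-moment correlation.
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def correlation_table(data):
    """Return the table as a list of text lines (header first)."""
    names = list(data)
    width = max(8, max(len(n) for n in names))
    # Column headings use the numeric index rather than the variable names.
    header = (f"{'':>2}  {'Variable':<{width}}  {'M':>7}  {'SD':>7}  "
              + "  ".join(f"{j:>6}" for j in range(1, len(names) + 1)))
    lines = [header]
    for i, name in enumerate(names, start=1):
        # Lower triangle only: correlations with earlier variables, then a dash.
        cells = [f"{pearson(data[name], data[names[j]]):>6.3f}"
                 for j in range(i - 1)]
        cells.append(f"{'-':>6}")
        lines.append(f"{i:>2}  {name:<{width}}  {mean(data[name]):>7.2f}  "
                     f"{sd(data[name]):>7.2f}  " + "  ".join(cells))
    return lines

# Hypothetical scores for six participants on three ability tests.
scores = {
    "Verbal": [10, 12, 9, 15, 11, 14],
    "Numerical": [8, 11, 7, 13, 10, 12],
    "Perceptual speed": [20, 18, 25, 17, 22, 19],
}

table = correlation_table(scores)
print("\n".join(table))
```

Reliabilities and a general note about the significance threshold could be added as an extra column and a footer line in the same way.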

Principles - Text accompanying a correlation matrix: After presenting a correlation matrix, comment on the general pattern of the results, particularly with regard to the degree to which the pattern conforms to expectations. Results are generally written in the passive voice (e.g., "are provided", "were formed", "are shown").

Observations - A section on practice and performance: In the study a series of dependent variables are presented, including performance, information requests, and various component task performance scores. Each varied over practice and the aim of the section was to report results on this relationship. Sub headings are given for each of these sets of dependent variables, and there is an initial introduction to the section. Figures, model tests, context, and description are presented to elaborate on each. This section could be seen as a general instance of examining the effect of time on a range of dependent variables.

Principles - Sequencing a section with a common independent variable and multiple dependent variables: An opening short paragraph can be used to set out the purpose of the section in relation to the article's aims and outline the sequence of subsections to come. Dependent variables may be grouped into sets if this is appropriate. The sequence of the dependent variables should be logical. Some examples of logical grounds include important to less important; overall to component; etc.

Principles - One subsection of time on a dependent variable: Information to be considered for inclusion includes: a figure showing the relationship between time and the dependent variable; a description of the dependent variable; a context outlining the relevance of the dependent variable to the research; a model outlining the functional form of the relationship including parameters and fit; a test of statistical significance of any discussed relationship; a comment on interpretation. A general ordering of this information is often: 1) description; 2) context; 3) figure; 4) commentary on figure; 5) model fit; 6) statistical significance.
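As an illustration of the "model fit" step, here is a minimal sketch in Python that fits a power function, y = a * x^b, to mean task completion time over practice trials by ordinary least squares on the log-log scale. The power function is my assumption as one commonly used functional form in the skill acquisition literature; the source does not say which model Ackerman fitted. The trial means are hypothetical values made up for the example.

```python
import math

def fit_power_law(trials, times):
    """Fit times = a * trials**b via least squares on log-transformed data.

    Returns (a, b, r_squared), where r_squared is computed on the log scale.
    """
    lx = [math.log(t) for t in trials]
    ly = [math.log(y) for y in times]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((x - mx) ** 2 for x in lx)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    b = sxy / sxx                      # slope = power-law exponent
    log_a = my - b * mx                # intercept = log of the scale parameter
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(lx, ly))
    ss_tot = sum((y - my) ** 2 for y in ly)
    return math.exp(log_a), b, 1 - ss_res / ss_tot

# Hypothetical mean completion times (seconds) over eight practice trials.
trials = [1, 2, 3, 4, 5, 6, 7, 8]
times = [60.0, 46.1, 38.5, 34.9, 31.2, 29.6, 27.3, 26.4]

a, b, r2 = fit_power_law(trials, times)
print(f"time = {a:.1f} * trial^{b:.2f}, R^2 (log scale) = {r2:.3f}")
```

In a write-up, the fitted parameters and the fit index would sit alongside the figure, followed by the relevant significance test.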

Principles - figures of time on a dependent variable: A figure of the mean level of a dependent variable over time may or may not include: a) a trend line; b) distributional information in terms of percentiles. The y-axis label is vertically oriented. The y-axis and x-axis marker labels are horizontally oriented. One way of showing the data is with dots for the data points and lines between the dots. Axis labels and marker labels use a font like Arial at a slightly smaller size than the main text. The title captures the core relationship: [function] of [Y] over [grouping factor] by [X], e.g., "mean performance on each trial for each level of practice". Figure titles are presented below the figure with "Figure x." in italics and on the same line as the title.

Observations - discussing a small problem with the study: When presenting results for components of performance, Ackerman notes that some component measures that would have been interesting to examine were not measured by the experimental software. He states the reason for this and then continues to discuss what was done.

Principles - discussing a small problem with the study: Studies often have aspects which would have been changed in hindsight. Or there may have been things that a researcher would have liked to have done, but were not feasible for whatever reason. If relevant, acknowledge these considerations, state the reasons for them, and continue on in the knowledge that the study has enough going for it to justify its publication.

I skip over the ability-performance correlations section because it is of less general relevance.


Observations - group differences on dependent variables in observational studies: Ackerman presents a set of analyses examining sex differences on the ability tests and task performance. The analyses are introduced in one paragraph. A rationale is provided and it is made explicit that they are post hoc. The section is then divided into subsections on ability and task performance. Differences on the 14 ability tests and 5 composite ability tests are presented in a table with variables grouped into sets and separate columns for the mean for men, the mean for women, and the t-test (note that these days, given concerns about effect sizes, standard deviations and Cohen's d with a star for significance would most likely be required). A figure is presented for differences between males and females on performance.

Principles - post hoc analyses: Post hoc analyses are generally presented after main analyses. An exception is that they might be presented after a set of analyses, but within an earlier section. Post hoc analyses need a justification. Possible justifications include: a) the data on the analyses suggested something interesting; b) initial analyses on other variables suggested the need for the analyses; c) the literature suggests such a relationship (it's acceptable to cite articles); etc. Post hoc analyses should be identified in writing for what they are (as opposed to pretending that they were a priori hypotheses).

Principles - table of differences between group means: A table is an efficient means of presenting the relationship between a grouping variable and many dependent variables (i.e., perhaps 3 or more). If there are many dependent variables and they can be grouped, this can aid table interpretation. Statistical significance can be flagged using stars (* .05; ** .01; etc.). Values tend to be aligned on the decimal.
Interpretation of such a table focuses first on the general pattern of results conscious of type 1 error issues associated with running a series of significance tests.
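The following minimal sketch in Python shows how the entries of such a table might be computed for one dependent variable: group means, standard deviations, an independent-samples t statistic, and Cohen's d based on the pooled standard deviation. It also computes the family-wise Type I error rate for a series of tests under the simplifying assumption that the tests are independent. The function names and the scores are hypothetical; this is not the analysis reported in the article.

```python
import math

def describe(xs):
    # Mean and sample standard deviation of one group.
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, math.sqrt(var)

def compare_groups(a, b):
    """Means, SDs, pooled-variance t statistic, and Cohen's d for two groups."""
    (ma, sa), (mb, sb) = describe(a), describe(b)
    na, nb = len(a), len(b)
    pooled_sd = math.sqrt(((na - 1) * sa ** 2 + (nb - 1) * sb ** 2)
                          / (na + nb - 2))
    d = (ma - mb) / pooled_sd
    t = (ma - mb) / (pooled_sd * math.sqrt(1 / na + 1 / nb))
    return {"mean_a": ma, "sd_a": sa, "mean_b": mb, "sd_b": sb,
            "t": t, "df": na + nb - 2, "d": d}

def familywise_error(alpha, k):
    # Probability of at least one Type I error across k independent tests.
    return 1 - (1 - alpha) ** k

# Hypothetical scores on one ability test for two groups.
men = [2, 4, 6]
women = [1, 3, 5]
result = compare_groups(men, women)
print(f"M = {result['mean_a']:.2f} vs {result['mean_b']:.2f}, "
      f"t({result['df']}) = {result['t']:.2f}, d = {result['d']:.2f}")

# 14 ability tests plus 5 composites gives 19 tests at alpha = .05.
fwer = familywise_error(0.05, 19)
print(f"Family-wise error rate across 19 tests: {fwer:.2f}")
```

With 19 tests the family-wise rate is well above the nominal .05, which is why interpretation should start with the general pattern rather than any single starred cell.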


Principles - effect of time on dependent variable separately for two groups: Discussion of group differences on a dependent variable that is measured over time goes from overall to interactions and from overall to component. Presentation of group differences includes: a statement of the direction of the effect, the group means and standard deviations, a test of statistical significance, and an effect size measure.

Principles - figure of effect of time on dependent variable separately for two groups: Interactions between time and group on a dependent variable can be presented in figures, as a statistical model, and as a significance test related to the statistical model. Data for each group are presented on the same graph using different plotting characters (e.g., filled circles and non-filled triangles) and different line markers (e.g., complete lines and dotted lines). A legend is used with the plotting character and short name for each group.
When multiple dependent variables are presented in a composite figure (e.g., a matrix of xy-plots), additional captions are used on each figure to indicate the dependent variable.
If the variable on the y-axis is a ratio variable (e.g., time, counts, etc.) and the y-axis is abbreviated to not show part of the y-axis between zero and the focal range, double diagonal lines are presented at the base of the y-axis to show this abbreviation.

Observations - dependent variables measured at a small number of time points: Ackerman discusses changes in various self-report measures related to self-monitoring and mood measured at the end of each of the five sessions. Repeated measures ANOVAs are presented and the general pattern is discussed. No means or indications of effect size are given. An interpretation of the results is given.

Principles - analyses of peripheral importance: Analyses vary in importance with regard to the aims of the study. A simple scale might be: 1) Core; 2) Secondary; 3) Peripheral; 4) Unnecessary.
1) Core analyses relate to the main research questions of the study and tend to link in with the important original contribution of the research.
2) Secondary analyses are interesting, but fall short of core for one of several reasons including: a) they partially duplicate the main analyses (e.g., looking at effects on component performance, subscales, and so on); b) they pertain to research questions of lesser importance.
3) Peripheral analyses are of even less interest than secondary analyses, but are deemed sufficiently important to still report. They are often presented in less detail.
4) Unnecessary analyses are deemed to be of insufficient importance to be included in a report. In some settings they may still be placed in an appendix. The degree to which any analyses are deemed necessary varies based on the aims of the study and the constraints placed by word counts on the article.
This scale of analysis importance can inform placement of analyses: i.e., going from core to peripheral. It also can guide the degree of detail provided regarding analyses. Given a certain word count allocation to a section, the important information should be provided first.

Review
The above analysis has endeavoured to illustrate the process that I call Journal Article Deconstruction. The aim is to make explicit the principles that govern good scientific writing. By making them explicit, they can more readily enter into your own writing style. I find the strategy works best when done iteratively with an actual writing project, where you are required to write in the style that you are deconstructing.