## OverviewGrowth curve analysis (GCA) is a multilevel regression technique designed for analysis of time course or longitudinal data. A major advantage of this approach is that it can be used to simultaneously analyze both group-level effects (e.g., experimental manipulations) and individual-level effects (i.e., individual differences). We have been using this method for several years, particularly in the context of visual world paradigm (VWP) eye tracking data and learning curves, though it can be applied to any time course data. Our first article describing this method was:
Our original GCA page provides the example data sets and analysis code used for that article, along with (old) tutorial materials. Since that original report, we have improved the method in various ways. This "new and improved" version of growth curve analysis is described in a much more detailed way in:
The basics are summarized below. All of the analyses are conducted using the R package
`lme4` and the graphs generated using the `ggplot2 ` package. For more information about R, check out our R Resources page.## Key challenges in analyzing time series data- Using separate analyses for individual time bins or time windows creates a trade-off between power (more data in each bin) and temporal resolution (smaller time bins) and introduces experimenter bias in selection of time bins/windows.
- Statistical thresholding (i.e.,
*p*< 0.05 is significant but*p*> 0.05 is not) creates false discretization of continuous processes. - There is no clear way to quantify individual differences, which are an important source of constraints for theories of cognition.
## The big pictureGrowth curve analysis provides a way to address those challenges by explicitly modeling change over time and quantifying both group-level and individual-level differences. To specify a growth curve model, you need to decide on three key components:
**The functional form**: the overall shape of the data. What family of mathematical functions are you going to use to model the data? For growth curve analysis (and multilevel regression in general) the functional form needs to be "dynamically consistent", meaning that the model of the average is equal to the average of the individual models (see our technical report on dynamic consistency for more information). Polynomial models (starting from linear and going up through quadratic, cubic, quartic, etc.) satisfy this property and are able to capture any data shape, so they are a good option. Unfortunately, polynomials are not very good at capturing flat asymptotes and extrapolation (i.e., predicting what will happen outside the observed time window) is generally not possible. If you are not very interested in asymptotic portions of your data and don't plan on making out-of-time-window prediction, then a polynomial functional form is probably best for you.**The fixed effects**: What are your group-level predictors? Usually these are the experimental manipulations like word frequency, stimulus relatedness, etc. They can be continuous or categorical and they can refer to either the stimuli (e.g., word frequency) or the participants (e.g., age, short-term memory span, lesion location, etc.).**The random effects**: the randomly sampled observations over which you plan to generalize. Typically these are either participants or items. By capturing individual variability, the random effects provide another way to quantify individual differences.
## A simple exampleTo see how these components come together, consider the`ChickWeight ` data set (part of the default R installation), which has data from an experiment on the effect of diet on early growth of chicks. We start out with just a "base" model of chick growth allowing for individual variability in weight (in technical terms, a random intercept for each Chick: `(1 | Chick)` ):`> m.base <- lmer(weight ~ Time + (1 | Chick), data=ChickWeight, REML=F)` To this, we can add fixed effects of diet on the intercept (i.e., a constant difference in weights among chicks randomly assigned to different diets):
`> m.0 <- lmer(weight ~ Time + Diet + (1 | Chick), data=ChickWeight, REML=F)` and on the slope (i.e., effects of diet on the rate of growth):
`> m.1 <- lmer(weight ~ Time * Diet + (1 | Chick), data=ChickWeight, REML=F)` We can visually examine the effect of adding these terms by plotting the model fits against the observed data. The observed data are on the left, the m.0 (intercept model) fit is in the middle, and the m.1 (linear model) fit is on the right:
To statistically evaluate the effects of adding these terms, we can examine the change in the goodness of fit (log likelihood) through model comparisons:
`> anova(m.base, m.0, m.1)`
Of course, in addition to different baseline weights, the chicks might have other (unmeasured) individual properties that affect their growth rate. To capture this, we can add a linear effect on time of Chick to the random effects, and we can use model comparisons to examine whether this effect improved model fit:
`> ` `m.t1 <- lmer(weight ~ Time * Diet + (1 + Time | Chick), data=ChickWeight, REML=F)` `> anova(m.1, m.t1)`
`> coef(summary(m.t1))`
Adding the random slopes did improve model fit. As a general rule, the model should include all random effects that are licensed by the design, that is, all the ones that could vary across participants (Barr et al., 2013). Notice that adding the random slope effect caused slight changes in the fixed effect parameter estimates because some of the variance was now captured by that random effect. This is why Barr et al. recommend using the "maximal" random effect structure -- even when a random effect does not improve model fit, it can still affect the fixed effect estimates and excluding it can elevate the false positive rate.
## Applying GCA to VWP data## Orthogonal polynomialsOften, our data are not perfectly straight, so we want to capture that curvature with higher-order polynomial terms (time squared, time cubed, etc.). Because our time variable is usually only positive, natural polynomials are correlated, so the estimated parameters will be interdependent. Given a particular time range and a maximum polynomial order, we can transform the polynomial time vectors to make them independent, that is, "orthogonal". This is illustrated in the figure below for linear and quadratic time in the range 1-10. Since orthogonal polynomial time terms are independent, the parameter estimates can be interpreted independently.
With orthogonal polynomials, the intercept term reflects the average overall curve height, rather than the height at the left edge of the time window, so if you are interested in differences at the very beginning of the time window, you may be better off sticking with natural polynomials.
## A complete exampleFor this example, we will consider the simple target fixation data shown on the right from a VWP experiment with a word frequency manipulation (High frequency words recognized faster than Low frequency words). Note that since Condition (High vs. Low frequency) was manipulated within-participants, we will want to include nested random effects of Condition with Subject.
The starting data frame is shown below. Subject is the the subject ID, Time is time (in ms) from word onset, timeBin identifies the 100ms time bin corresponding to this time point (convenient for aligning orthogonal time), Condition is the word frequency condition, and meanFix is the mean proportion of fixations to the target object.
The first step is to create a third-order polynomial in the range of
`timeBin` .`> t <- poly((unique(TargetFix$timeBin)), 3)` `> TargetFix[,paste("ot", 1:3, sep="")] <- t[TargetFix$timeBin, 1:3]` Since this is a simple case with just one within-subjects fixed effect that has only two levels, we can skip to the full model and examine its parameter estimates:
`> ` `m.full <- lmer(meanFix ~ (ot1+ot2+ot3)*Condition + ` ` (ot1+ot2+ot3 | Subject) + ` ` (ot1+ot2 | Subject:Condition), ` ` control = lmerControl(optimizer="bobyqa"),` ` data=TargetFix, REML=F)`
`> coef(summary(m.full))`
Notice that the parameter estimates do not have p-values. There are good reasons for that (see this FAQ for more information), but this is cold comfort to most experimental psychologists, who need to report
p-values. The quick and easy solution is to assume that, because we have relatively many observations, the t-distribution converges to the z-distribution, so we can use a normal approximation:`> coefs <- data.frame(coef(summary(m.full))) ` `> coefs$p <- format.pval(2*(1-pnorm(abs(coefs$t.value))), digits=2, eps=0.0001) #also make the p-values a bit more readable` `> coefs` ` Estimate Std..Error t.value p` `(Intercept) 0.4773227513 0.01385240 34.457775306 <1e-04` `ot1 0.6385603705 0.05993519 10.654181583 <1e-04` `ot2 -0.1095979256 0.03848819 -2.847573180 0.0044` `ot3 -0.0932611870 0.02041640 -4.567955536 <1e-04` `ConditionLow -0.0581122429 0.01901291 -3.056462582 0.0022` `ot1:ConditionLow 0.0003188189 0.06330556 0.005036191 0.9960` `ot2:ConditionLow 0.1635455113 0.05426498 3.013831365 0.0026` `ot3:ConditionLow -0.0020869051 0.02014728 -0.103582452 0.9175` Another approach is to selectively remove individual effects of Condition and use model comparisons to evaluate each effect. This method would have produced very similar results:
`Chisq` `Df` `Pr(>Chisq)` `ConditionLow` `4.1481` `1` `0.0417 ` `*` `ot1:ConditionLow` `3.4292` `1` `0.0641 .` `ot2:ConditionLow` `7.4845` `1` `0.0062 ` `**` `ot3:ConditionLow` `0.0107` `1` `0.9175` The results confirm that fixation proportion was lower overall for the Low condition than the High condition (Condition effect on the intercept - ConditionLow - had a statistically significant negative estimate) and there was a difference in curvature between the conditions (statistically significant effect of condition on the quadratic term:
`ot2:ConditionLow` )## Reporting resultsThere are two rules of thumb for reporting growth curve analysis results:
**Clearly describe each of the three key components of the model**: the functional form (third-order orthogonal polynomial), the fixed effects (effect of Condition on all time terms), and the random effects (effect of Subject on each of the time terms and nested effects of Subject-by-Condition on each of the time terms except the cubic). Depending on the circumstances and complexity of the model, you may want to include additional information about the factors and why they were included or not. It's also a good idea to report which method was used for computing p-values.**For key findings, report parameter estimates and standard errors along with significance tests**. In some cases the model comparison is going to be enough, but for key findings, the readers should want to see the parameter estimates. The parameter estimate standard errors are critical for interpreting the estimates, so those should be reported as well. The t-values are not critical to report (they are just Estimate divided by the Std Error, so they can always be computed from the reported estimates and standard errors). If there are many estimated parameters, it may be a good idea to focus the main text discussion on the most important ones and report the full set in a table or appendix.
## Another exampleHere is another example based on data from an experiment examining the link between statistical learning and word learning (Mirman et al., 2008). In this experiment the critical manipulation was between-participants and the data shape is a little different (requiring only a second-order polynomial).
## Analyzing individual differencesHere we will consider cohort and rhyme competition data from 5 participants with Broca's aphasia, 3 participants with Wernicke's aphasia, and 12 age-matched controls (Mirman et al., 2011). The group means are shown below, with cohort competition in the top row and rhyme competition in the bottom row. Visually, it seems that the participants with Wernicke's exhibited more cohort competition and less rhyme competition, whereas participants with Broca's aphasia exhibited the opposite pattern.
`> summary(CohortRhyme)`
`> t <- poly(unique(CohortRhyme$timeBin), 4)` `> CohortRhyme[,paste("ot", 1:4, sep="")] <- t[CohortRhyme$timeBin, 1:4]` When the individual differences are "experiment external", such as aphasia subtype (or continuous variables like age), the best option is to add the individual difference variable as a fixed effect to the model. So we start with a base model of the overall competition effect (Object fixed effect) and add a fixed effect of Group. It would be a good idea to build up the Group effects gradually, but we'll skip to the full model to keep this example more brief:
`> cohort.base <- lmer(FixProp ~ (ot1+ot2+ot3+ot4)*Object + (1+ot1+ot2+ot3+ot4 | subjID) + (1+ot1+ot2 | subjID:Object), ` `control = lmerControl(optimizer="bobyqa"),` ` data=subset(CohortRhyme, Type == "Cohort"), REML=F)` `> cohort.group <- lmer(FixProp ~ (ot1+ot2+ot3+ot4)*Object*Group + (1+ot1+ot2+ot3+ot4 | subjID) + (1+ot1+ot2 | subjID:Object), ` `control = lmerControl(optimizer="bobyqa"), ` `data=subset(CohortRhyme, Type == "Cohort"), REML=F)` `> anova(cohort.base, cohort.group)`
ObjectUnrelated:GroupBroca 0.007168833 0.016841402 0.4256673 0.67035
ObjectUnrelated:GroupWernicke -0.049674167 0.020423200 -2.4322421 0.01501
ot1:ObjectUnrelated:GroupBroca -0.047473215 0.059871004 -0.7929250 0.42782
ot1:ObjectUnrelated:GroupWernicke 0.009277717 0.072604257 0.1277848 0.89832
ot2:ObjectUnrelated:GroupBroca 0.044092440 0.058847796 0.7492624 0.45370
The same steps for the rhyme competition data reveal that only the Broca's group differed from controls:
`> rhyme.base <- lmer(FixProp ~ (ot1+ot2+ot3+ot4)*Object + (1+ot1+ot2+ot3+ot4 | subjID) + (1+ot1+ot2 | subjID:Object), control = lmerControl(optimizer="bobyqa"), data=subset(CohortRhyme, Type == "Rhyme"), REML=F)` `> rhyme.group <- lmer(FixProp ~ (ot1+ot2+ot3+ot4)*Object*Group + (1+ot1+ot2+ot3+ot4 | subjID) + (1+ot1+ot2 | subjID:Object), control = lmerControl(optimizer="bobyqa"), data=subset(CohortRhyme, Type == "Rhyme"), REML=F)` `> ` `anova(rhyme.base, rhyme.group)` ```
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
rhyme.base 32 -2269.6 -2119.7 1166.8 -2333.6
rhyme.group 52 -2265.2 -2021.6 1184.6 -2369.2 35.597 20 0.01715 *
```
`> coefs.rhyme <- as.data.frame(coef(summary(rhyme.group)))` `> coefs.rhyme$p <- format.pval(2*(1-pnorm(abs(coefs.rhyme[,"t value"]))))`
Estimate Std. Error t value p ```
ObjectUnrelated:GroupBroca -0.035564333 0.015761359 -2.25642560 0.0240440
ObjectUnrelated:GroupWernicke 0.017755000 0.019113455 0.92892677 0.3529270
ot1:ObjectUnrelated:GroupBroca 0.122036460 0.064618493 1.88856865 0.0589497
ot1:ObjectUnrelated:GroupWernicke -0.128382147 0.078361433 -1.63833333 0.1013522
ot2:ObjectUnrelated:GroupBroca -0.006877810 0.055995864 -0.12282711 0.9022440
ot2:ObjectUnrelated:GroupWernicke -0.011219074 0.067904959 -0.16521729 0.8687730
``` The random effects are the systematic deviations from the "mean" pattern predicted by the fixed effects. So we'll use the random effects from the base model, which did not distinguish between participant groups so we can get individual effect sizes relative to the overall mean (otherwise we'd be looking at individual effect sizes relative to the sub-group mean). The key function for extracting random effects is
`ranef` and in our case it will be a list with two elements, because we had a random effect by `subjID ` and a random effect by `subjID:Object` . We're interested in the latter, because it is the one that will capture individual differences in competition effect size. In particular, we'll want to subtract the random effect estimate for the Unrelated object from the estimate for the Competitor for each subject to get that subject's individual competition effect size. Extracting the random effects is relatively easy (e.g., `ranef(cohort.base)$'subjID:Object'` ), but there is a moderate amount of data manipulation involved in getting them into a convenient form (requiring the `reshape2` and `plyr ` packages). There are many ways to do the data manipulation, here is my way:`#get cohort effect sizes` `> blup.cohort <- data.frame(colsplit(row.names(ranef(cohort.base)$'subjID:Object'), ":", c("Subject", "Object")), ranef(cohort.base)$'subjID:Object')` `> ES.coh <- ddply(blup.cohort, .(Subject), summarize, Intercept = X.Intercept.[Object=="Competitor"] - X.Intercept.[Object=="Unrelated"], Quadratic = ot2[Object=="Competitor"] - ot2[Object=="Unrelated"])`
`#get rhyme effect sizes` `> blup.rhyme <- data.frame(colsplit(row.names(ranef(rhyme.base)$'subjID:Object'), ":", c("Subject", "Object")), ranef(rhyme.base)$'subjID:Object')` `> ES.rhy <- ddply(blup.rhyme, .(Subject), summarize, Rhyme.Intercept = X.Intercept.[Object=="Competitor"] - X.Intercept.[Object=="Unrelated"])`
`#combine` `> ES <- merge(ES.coh, ES.rhy, by="Subject")` `> group <- unique(subset(CohortRhyme, select=c(subjID, Group))) #get group assignments from original data frame` `> ES <- merge(ES, group, by.x="Subject", by.y="subjID")` Now we can examine a scatterplot of the individual cohort and rhyme competition effect sizes (effects on the intercept, shown on the right) and compute correlations across the complete sample and just for the participants with aphasia:
`> cor.test(ES$Intercept, ES$Rhyme.Intercept)` `> ` `cor.test(ES$Intercept[ES$Group != "Control"], ES$Rhyme.Intercept[ES$Group != "Control"])` `> cor.test(ES$Intercept[ES$Group == "Control"], ES$Rhyme.Intercept[ES$Group == "Control"])` The correlation is significant for participants with aphasia (r = -0.86, p = 0.006), but not for control participants (r = 0.34, p > 0.25). This suggests that there may be a single mechanism behind this pattern, in contrast to the standard separable components interpretation of a double dissociation.
## Another exampleHere is another example based on data from an experiment that examined the time course of activation of function and thematic semantic knowledge in 17 participants with left hemisphere stroke (Kalenine, Mirman, & Buxbaum, 2012).
I highly recommend the ggplot2 package for plotting (if you are not already familiar with it, you may want to start with a tutorial and the online documentation is a useful reference). To start, plot the target fixation data discussed above:
`ggplot(TargetFix, aes(Time, meanFix, color=Condition)) + ` ` stat_summary(fun.data=mean_se, geom="pointrange") + ` ` labs(y="Fixation Proportion", x="Time since word onset (ms)") + ` ` theme_bw() + scale_color_manual(values=c("red", "blue"))` You can now add the model fits by adding another stat_summary with the y-variable mapped to the predicted values in the lmer model object, which can be accessed using the
`fitted()` function:`last_plot() + stat_summary(aes(y=fitted(m.full)), fun.y=mean, geom="line")`
The other aesthetics (mappings) are automatically inherited (e.g., mapping of color to Condition), so the plot will automatically remain consistent. This is particularly useful for multi-panel plots as in the cohort and rhyme competition example from above. When the model was fit to a data subset, it's not a good idea to rely on ggplot to align the data and the model-predicted values. Instead, create a new data frame that has just the data subset and add the model-predicted values to that data frame:
`dat <- subset(CohortRhyme, Type == "Cohort")` `dat$mfit <- fitted(cohort.group)` `ggplot(dat, aes(Time, FixProp, color=Object)) + ` ` facet_wrap(~ Group) + theme_bw() + ` ` stat_summary(fun.y=mean, geom="point") +` ` stat_summary(aes(y=mfit), fun.y=mean, geom="line") +` ` labs(y="Fixation Proportion", x="Time since word onset (ms)") + ` ` scale_color_manual(values=c("red", "blue"))` |