Minding the Brain: regression

Showing posts with label regression. Show all posts

Wednesday, August 13, 2014

Plotting mixed-effects model results with effects package

As separate by-subjects and by-items analyses have been replaced by mixed-effects models with crossed random effects of subjects and items, I've often found myself wondering about the best way to plot data. The simple-minded means and SE from trial-level data will be inaccurate because they won't take the nesting into account. If I compute subject means and plot those with by-subject SE, then I'm plotting something different from what I analyzed, which is not always terrible, but definitely not ideal. It seems intuitive that the condition means and SE's are computable from the model's parameter estimates, but that computation is not trivial, particularly when you're dealing with interactions. Or, rather, that computation was not trivial until I discovered the effects package.

Guidebook for growth curve analysis

I don't usually like to use complex statistical methods, but every once in a while I encounter a method that is so useful that I can't avoid using it. Around the time I started doing eye-tracking research (as a post-doc with Jim Magnuson), people were starting recognize the value of using longitudinal data analysis techniques to analyze fixation time course data. Jim was ahead of most in this regard (Magnuson et al., 2007) and a special issue of the Journal of Memory and Language on data analysis methods gave as a great opportunity to describe how to apply "Growth Curve Analysis" (GCA) - a type of multilevel regression - to fixation time course data (Mirman, Dixon, & Magnuson, 2008). Unbeknownst to us, Dale Barr was working on very similar methods, though for somewhat different reasons, and our articles ended up neighbors in the special issue (Barr, 2008).

Growth Curve Analysis and Visualization Using R

In the several years since those papers came out, it has become clear to me that other researchers would like to use GCA, but reading our paper and downloading our code examples was often not enough for them to be able to apply GCA to their own data. There are excellent multilevel regression textbooks out there, but I think it is safe to say that it's a rare cognitive or behavioral scientist who has the time and inclination to work through a 600-page advanced regression textbook. It seemed like a more practical guidebook to implementing GCA was needed, so I wrote one and it has just been published by Chapman & Hall / CRC Press as part of their R Series.

My idea was to write a relatively easy-to-understand book that dealt with the practical issues of implementing GCA using R. I assumed basic knowledge of behavioral statistics (standard coursework in graduate behavioral science programs) and minimal familiarity with R, but no expertise in computer programming or the specific R packages required for implementation (primarily lme4 and ggplot2). In addition to the core issues of fitting growth curve models and interpreting the results, the book covers plotting time course data and model fits and analyzing individual differences. Example data sets and solutions to the exercises in the book are available on my GCA website.

Obviously, the main point of this book is to help other cognitive and behavioral scientists to use GCA, but I hope it will also encourage them to make better graphs and to analyze individual differences. I think individual differences are very important to cognitive science, but most statistical methods treat them as just noise, so maybe having better methods will lead to better science, though this might be a subject for a different post. Comments and feedback about the book are, of course, most welcome.

Tuesday, February 11, 2014

Three ways to get parameter-specific p-values from lmer

How to get parameter-specific p-values is one of the most commonly asked questions about multilevel regression. The key issue is that the degrees of freedom are not trivial to compute for multilevel regression. Various detailed discussions can be found on the R-wiki and R-help mailing list post by Doug Bates. I have experimented with three methods that I think are reasonable.

Two ways that correlation and stepwise regression can give different results

[Expanding on my recent answer on Cross Validated, aka stats.stackexchange.com]

In general, a correlation test is used to test the association between two variables (y and z). However, if there is a third variable (x) that might be related to z or y, it makes sense to use stepwise regression (or partial correlation). There are two quite different situations where the correlation and stepwise regression will produce different results. Here are some examples using made up data.

Minding the Brain