Friday, April 4, 2014

Flip the script, or, the joys of coord_flip()

Has this ever happened to you?

I hate it when the labels on the x-axis overlap, but this can be hard to avoid. I can stretch the figure out, but then the data points become spread farther apart and the space where I want to put the figure (either in a talk or a paper) may not accommodate that. I've never liked turning the labels diagonally, so recently I've started using coord_flip() to switch the x- and y-axes:
ggplot(chickwts, aes(feed, weight)) + stat_summary(fun.data=mean_se, geom="pointrange") + coord_flip()

It took a little getting used to, but I think this works well. It's especially good for factor analyses (where you have many labeled items):
library(psych) # for principal()
pc <- principal(Harman74.cor$cov, 4, rotate="varimax")
loadings <-$loadings[, 1:ncol(pc$loadings)])
loadings$Test <- rownames(loadings)

ggplot(loadings, aes(Test, RC1)) + geom_bar(stat="identity") + coord_flip() + theme_bw(base_size=10)
It also works well if you want to plot parameter estimates from a regression model (where the parameter names can get long):
library(lme4)
m <- lmer(weight ~ Time * Diet + (Time | Chick), data=ChickWeight, REML=FALSE)
coefs <-
colnames(coefs) <- c("Estimate", "SE", "tval")
coefs$Label <- rownames(coefs)

ggplot(coefs, aes(Label, Estimate)) + geom_pointrange(aes(ymin = Estimate - SE, ymax = Estimate + SE)) + geom_hline(yintercept=0) + coord_flip() + theme_bw(base_size=10)

Monday, March 3, 2014

Guidebook for growth curve analysis

I don't usually like to use complex statistical methods, but every once in a while I encounter a method that is so useful that I can't avoid using it. Around the time I started doing eye-tracking research (as a post-doc with Jim Magnuson), people were starting to recognize the value of using longitudinal data analysis techniques to analyze fixation time course data. Jim was ahead of most in this regard (Magnuson et al., 2007), and a special issue of the Journal of Memory and Language on data analysis methods gave us a great opportunity to describe how to apply "Growth Curve Analysis" (GCA) - a type of multilevel regression - to fixation time course data (Mirman, Dixon, & Magnuson, 2008). Unbeknownst to us, Dale Barr was working on very similar methods, though for somewhat different reasons, and our articles ended up as neighbors in the special issue (Barr, 2008).

Growth Curve Analysis and Visualization Using R
In the several years since those papers came out, it has become clear to me that other researchers would like to use GCA, but reading our paper and downloading our code examples was often not enough for them to be able to apply GCA to their own data. There are excellent multilevel regression textbooks out there, but I think it is safe to say that it's a rare cognitive or behavioral scientist who has the time and inclination to work through a 600-page advanced regression textbook. It seemed like a more practical guidebook to implementing GCA was needed, so I wrote one and it has just been published by Chapman & Hall / CRC Press as part of their R Series.

My idea was to write a relatively easy-to-understand book that dealt with the practical issues of implementing GCA using R. I assumed basic knowledge of behavioral statistics (standard coursework in graduate behavioral science programs) and minimal familiarity with R, but no expertise in computer programming or the specific R packages required for implementation (primarily lme4 and ggplot2). In addition to the core issues of fitting growth curve models and interpreting the results, the book covers plotting time course data and model fits and analyzing individual differences. Example data sets and solutions to the exercises in the book are available on my GCA website.

Obviously, the main point of this book is to help other cognitive and behavioral scientists use GCA, but I hope it will also encourage them to make better graphs and to analyze individual differences. I think individual differences are very important to cognitive science, but most statistical methods treat them as just noise, so maybe having better methods will lead to better science - though this might be a subject for a different post. Comments and feedback about the book are, of course, most welcome.

Tuesday, February 11, 2014

Three ways to get parameter-specific p-values from lmer

How to get parameter-specific p-values is one of the most commonly asked questions about multilevel regression. The key issue is that the degrees of freedom are not trivial to compute for multilevel regression models. Detailed discussions can be found on the R wiki and in an R-help mailing list post by Doug Bates. I have experimented with three methods that I think are reasonable.
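As a concrete starting point, here is a sketch of one widely used option (not necessarily one of the three methods alluded to above): the normal approximation, which treats the t-values from lmer as z-values. It is anti-conservative for small samples, but reasonable when the number of observations is large relative to the number of parameters. The model is the same ChickWeight example used in the coord_flip() post above.

```r
library(lme4)

# fit a multilevel model to the built-in ChickWeight data
m <- lmer(weight ~ Time * Diet + (Time | Chick),
          data = ChickWeight, REML = FALSE)

# extract the fixed-effect estimates, SEs, and t-values
coefs <-
# "t value" becomes "t.value" after

# normal approximation: two-tailed p-value treating t as z
coefs$p.z <- 2 * (1 - pnorm(abs(coefs$t.value)))
coefs
```

The other common options (e.g., Satterthwaite or Kenward-Roger approximations via the lmerTest or pbkrtest packages) yield proper denominator degrees of freedom, at some computational cost.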

Monday, January 27, 2014

Graduate school advice

Since it is the season for graduate school recruitment interviews, I thought I would share some of my thoughts. This is also partly prompted by two recent articles in the journal Neuron. If you're unfamiliar with it, Neuron is a very high-profile neuroscience journal, so the advice is aimed at graduate students in neuroscience, though I think the advice broadly applies to students in the cognitive sciences (and perhaps other sciences as well). The first of these articles deals with what makes a good graduate mentor and how to pick a graduate advisor; the second article has some good advice on how to be a good graduate advisee.

I broadly agree with the advice in those articles and here are a few things I would add:

Monday, December 9, 2013

Language in developmental and acquired disorders

As I mentioned in an earlier post, last June I had the great pleasure and honor of participating in a discussion meeting on Language in Developmental and Acquired Disorders hosted by the Royal Society and organized by Dorothy Bishop, Kate Nation, and Karalyn Patterson. Among the many wonderful things about this meeting was that it brought together people who study similar kinds of language deficit issues but in very different populations -- children with developmental language deficits such as dyslexia and older adults with acquired language deficits such as aphasia. Today, the special issue of Philosophical Transactions of the Royal Society B: Biological Sciences containing articles written by the meeting's speakers was published online (Table of Contents).

Monday, November 25, 2013

Does Malcolm Gladwell write science or science fiction?

Malcolm Gladwell is great at writing anecdotes, but he dangerously masquerades these as science. Case studies can be incredibly informative -- they form the historical foundation of cognitive neuroscience and continue to be an important part of cutting-edge research. But there is an important distinction between science, which relies on structured data collection and analysis, and anecdotes, which rely on an entertaining narrative structure. His claim that dyslexia might be a "desirable difficulty" is perhaps the most egregious example of this. Mark Seidenberg, who is a leading scientist studying dyslexia and an active advocate, has written an excellent commentary about Gladwell's misrepresentation of dyslexia. The short version is that dyslexia is a serious problem that, for the vast majority of people, leads to various negative outcomes. The existence of a few super-successful self-identified dyslexics may be encouraging, maybe even inspirational, but it absolutely cannot be taken to mean that dyslexia might be good for you.

In various responses to his critics, Gladwell has basically said that people who know enough about the topic to recognize that (some of) his conclusions are wrong shouldn't be reading his books ("If my books appear to a reader to be oversimplified, then you shouldn't read them: you're not the audience!"). This is extremely dangerous: readers who don't know about dyslexia, about its prevalence or about its outcomes, would be led to the false conclusion that dyslexia is good for you. The problem is not that his books are oversimplified; the problem is that his conclusions are (sometimes) wrong because they are based on a few convenient anecdotes that do not represent the general pattern.

Another line of defense is that Gladwell's books are only meant to raise interesting ideas and stimulate new ways of thinking in a wide audience, not to be a scholarly summary of the research. Writing about science in a broadly accessible way is a perfectly good goal -- my own interest in cognitive neuroscience was partly inspired by the popular science writing of people like Oliver Sacks and V.S. Ramachandran. The problem is when the author rejects scientific accuracy in favor of just talking about "interesting ideas". Neal Stephenson once said that what makes a book "science fiction" is that it is fundamentally about ideas. It is great to propose new ideas and explore what they might mean. But if we follow that logic, then Malcolm Gladwell is not a science writer, he is a science fiction writer.

Monday, October 21, 2013

The mind is not a (digital) computer

The "mind as computer" has been a dominant and powerful metaphor in cognitive science at least since the middle of the 20th century. Throughout this time, many of us have chafed against this metaphor because it has a tendency to be taken too literally. Framing mental and neural processes in terms of computation or information processing can be extremely useful, but this approach can turn into the extremely misleading notion that our minds work kind of like our desktop or laptop computers. There are two particular notions that have continued to hold sway despite mountains of evidence against them and I think their perseverance might be, at least in part, due to the computer analogy.

The first is modularity or autonomy: the idea that the mind/brain is made up of (semi-)independent components. Decades of research on interactive processing (including my own) and emergence have shown that this is not the case (e.g., McClelland, Mirman, & Holt, 2006; McClelland, 2010; Dixon, Holden, Mirman, & Stephen, 2012), but components remain a key part of the default description of cognitive systems, perhaps with some caveat that these components interact.

The second is the idea that the mind engages in symbolic or rule-based computation, much like the if-then procedures that form the core of computer programs. This idea is widely associated with the popular science writing of Steven Pinker and is a central feature of classic models of cognition, such as ACT-R. In a new paper just published in the journal Cognition, Gary Lupyan reports 13 experiments showing just how bad human minds are at executing simple rule-based algorithms (full disclosure: Gary and I are friends and have collaborated on a few projects). In particular, he tested parity judgments (is a number odd or even?), triangle judgments (is a figure a triangle?), and grandmother judgments (is a person a grandmother?). Each of these is a simple, rule-based judgment, and the participants knew the rule (last digit is even; polygon with three sides; has at least one grandchild), but they were nevertheless biased by typicality: numbers with more even digits were judged to be more even, equilateral triangles were judged to be more triangular, and older women with more grandchildren were judged to be more grandmotherly. A variety of control conditions and experiments ruled out various alternative explanations of these results. The bottom line is that, as he puts it, "human algorithms, unlike conventional computer algorithms, only approximate rule-based classification and never fully abstract from the specifics of the input."

It's probably too much to hope that this paper will end the misuse of the computer metaphor, but I think it will be a nice reminder of the limitations of this metaphor.

Dixon, J.A., Holden, J.G., Mirman, D., & Stephen, D.G. (2012). Multifractal dynamics in the emergence of cognitive structure. Topics in Cognitive Science, 4(1), 51-62. PMID: 22253177
Lupyan, G. (2013). The difficulties of executing simple algorithms: Why brains make mistakes computers don’t. Cognition, 129(3), 615-636. DOI: 10.1016/j.cognition.2013.08.015
McClelland, J.L. (2010). Emergence in Cognitive Science. Topics in Cognitive Science, 2(4), 751-770. DOI: 10.1111/j.1756-8765.2010.01116.x
McClelland, J.L., Mirman, D., & Holt, L.L. (2006). Are there interactive processes in speech perception? Trends in Cognitive Sciences, 10(8), 363-369. PMID: 16843037