Tuesday, September 18, 2012

Aggregating data across trials of different durations

Note: This post is a summary of a more detailed technical report.

In a typical “visual world paradigm” (VWP) eye-tracking study, a trial ends when the participant responds, so some trials are naturally shorter than others. When computing fixation proportions at later time points, we therefore need to decide whether terminated trials should be included. Based on informal discussions with other VWP researchers, I think three approaches are currently in use: (1) for each time bin, include all trials and count post-response frames as non-object fixations (i.e., the participant is done fixating all objects from this trial); (2) include all trials and count post-response frames as target fixations (i.e., if the participant selected the correct object, then consider all subsequent fixations to be on that object; note that, typically, any trials on which the participant made an incorrect response are excluded from analysis); (3) include only trials that are currently on-going and ignore any terminated trials, since there are no data for those trials.
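To make the three approaches concrete, here is a minimal sketch in Python. The data layout (one list of 0/1 target-fixation codes per trial, ending at the response) and the names `fixation_proportions`, `non_object`, `target`, and `ongoing` are assumptions made for illustration; they are not taken from the technical report.

```python
def fixation_proportions(trials, n_bins, method):
    """Proportion of target fixations per time bin.

    trials: list of lists; each inner list holds 1 (fixating the target)
            or 0 (not fixating it) for every bin up to the response.
    method: how to treat bins after a trial has ended:
            "non_object" -> count as 0 (approach 1)
            "target"     -> count as 1 (approach 2)
            "ongoing"    -> drop the trial from that bin (approach 3)
    """
    if method not in ("non_object", "target", "ongoing"):
        raise ValueError("unknown method: " + method)
    proportions = []
    for t in range(n_bins):
        looks = []
        for trial in trials:
            if t < len(trial):
                looks.append(trial[t])      # trial still running: use the data
            elif method == "non_object":
                looks.append(0)             # pad with non-object fixations
            elif method == "target":
                looks.append(1)             # pad with target fixations
            # "ongoing": terminated trials contribute nothing to this bin
        proportions.append(sum(looks) / len(looks) if looks else float("nan"))
    return proportions


# Three hypothetical correct-response trials of different durations.
trials = [[0, 1], [0, 0, 1, 1], [0, 0, 0, 1]]
by_method = {m: fixation_proportions(trials, 4, m)
             for m in ("non_object", "target", "ongoing")}
```

In the third bin the first trial has already ended, so the three methods divide by different denominators or pad with different values, and the resulting proportions diverge even though the underlying fixations are identical.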

The problem with the third approach is that it introduces selection bias: trials do not terminate at random, so as the time series progresses through the time window, the data move further and further from the complete, unbiased set of trials toward a biased subset of only those trials that required additional processing time. This bias operates both between conditions (i.e., more of the remaining trials come from a condition with difficult stimuli than from a condition with easy stimuli) and within conditions (i.e., within a condition, more of the difficult trials remain than the easy ones).

Here's an analogy to clarify this selection bias: imagine that we want to evaluate the response rate over time to a drug for a deadly disease. We enroll 100 participants in the trial and administer the drug. At first, only 50% of the participants respond to the drug. As the trial progresses, the non-responders begin to, unfortunately, die. After 6 months, only 75 participants are alive and participating in the trial and the same 50 are responding to the treatment. At this point, is the response rate the same 50% or has it risen to 67%? Would it be accurate to conclude that responsiveness to the treatment increases after 6 months?
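The arithmetic behind the analogy is short enough to write out. The variable names below are just labels for the numbers in the paragraph above.

```python
enrolled = 100           # participants who started the trial
responders = 50          # participants responding to the drug (unchanged)
alive_at_6_months = 75   # participants still in the trial after 6 months

# Denominator = everyone who was enrolled.
rate_all_enrolled = responders / enrolled

# Denominator = only those still participating (the "on-going only" logic).
rate_survivors_only = responders / alive_at_6_months

print(f"all enrolled:   {rate_all_enrolled:.0%}")
print(f"survivors only: {rate_survivors_only:.0%}")
```

The number of responders never changes; only the denominator shrinks, which is exactly what happens when terminated trials are dropped from later time bins.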

Returning to eye-tracking data, the effect of this selection bias is to make differences appear more static. So, for target fixation data, you get the pattern below: considering only on-going trials makes it look like there is an asymptote difference between conditions, but "padding" the post-response frames with Target fixations correctly captures the processing speed difference. (These data are from a Monte Carlo simulation, so we know that the Target method is correct.)
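The same qualitative pattern can be reproduced with a toy simulation. This is not the simulation from the technical report: it is a sketch under assumed settings, in which two conditions differ only in how quickly the target is fixated (exponentially distributed latencies with hypothetical means of 300 ms and 600 ms) and the response follows the first target fixation by a fixed, hypothetical 200 ms.

```python
import random


def simulate(mean_latency_ms, n_trials, response_delay_ms, rng):
    """Return (target_onset, response_time) pairs for one condition."""
    trials = []
    for _ in range(n_trials):
        onset = rng.expovariate(1.0 / mean_latency_ms)  # assumed latency model
        trials.append((onset, onset + response_delay_ms))
    return trials


def target_proportion(trials, t_ms, method):
    """Target fixation proportion at time t_ms.

    method "target": terminated trials count as target fixations.
    method "ongoing": terminated trials are ignored.
    """
    looks = 0
    counted = 0
    for onset, response in trials:
        if t_ms < response:                  # trial still on-going
            counted += 1
            looks += 1 if t_ms >= onset else 0
        elif method == "target":             # pad post-response frames
            counted += 1
            looks += 1
    return looks / counted if counted else float("nan")


rng = random.Random(1)
fast = simulate(300, 50000, 200, rng)   # easy condition (assumed parameters)
slow = simulate(600, 50000, 200, rng)   # hard condition (assumed parameters)

late = 1500  # a late time point, in ms
padded_gap = (target_proportion(fast, late, "target")
              - target_proportion(slow, late, "target"))
ongoing_gap = (target_proportion(fast, late, "ongoing")
               - target_proportion(slow, late, "ongoing"))
print("condition difference at 1500 ms, padded: ", round(padded_gap, 3))
print("condition difference at 1500 ms, ongoing:", round(ongoing_gap, 3))
```

In this toy model every trial eventually ends on the target, so the conditions differ only in speed. With padding, the two curves converge as time goes on; with on-going trials only, a sizeable gap persists at late time points and looks like a difference in asymptote.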

For competitor fixations, ignoring terminated trials makes the competition effects look longer-lasting, as in the figure on the left. These data come from our recent study of taxonomic and thematic semantic competition, so you can see the selection bias play out in real VWP data. We also randomly dropped 10% and 20% of the data points to show that the effect of ignoring terminated trials is not just a matter of having fewer data points.

Whether post-response data are considered "Target" or "Non-object" fixations does not seem to have biasing effects, though it does affect how the data look, in the same way that probability distribution curves and cumulative distribution curves show the same underlying data in different ways. More details on all of this are available in our technical report.
