I spent last week at a two-part meeting on language in developmental and acquired disorders, hosted by the Royal Society. The organizers (Dorothy Bishop, Kate Nation, and Karalyn Patterson) devised a meeting structure that stimulated – and made room for – a lot of discussion, and one of the major topics throughout the meeting was computational modeling. A major highlight for me was David Plaut’s aphorism “Models are experiments”. The idea is that models are sometimes taken to be the theory itself, but they are better thought of as experiments designed to test the theory. In other words, just as a theory predicts certain behavioral phenomena, it also predicts that a model implementing that theory should exhibit those phenomena. This point of view has several important, and I think useful, consequences.
Just like in behavioral experiments, the observed model behavior could be due to a multitude of causes. Pretty much all models have some design features that reflect the theoretical commitments of the proposed account, some that are convenient simplifications (they help make the model tractable but are neutral with respect to the theory), and some that are implementational requirements (details needed simply to build the model and run simulations). Ideally, the critical model behavior is due to the features that reflect theoretical commitments, but this is not always the case. Even when it is, this is often not obvious to the average reader of a modeling paper. Conveniently, models have a ready-made mechanism both for testing whether the theory-relevant features are driving model behavior and for demonstrating this to the reader: one can simply remove or alter the theory-relevant features (ideally in ways that reflect a competing theory) and test whether the model still exhibits the phenomena of interest.
Oppenheim, Dell, and Schwartz (2010) is a good example of this approach: they propose that incremental learning is the basis of cumulative semantic interference effects, implement a model of that account, and run simulations to show that it can capture the data. Critically, they also present simulations that examine whether the effect is driven by facilitative or inhibitory learning (Simulation 5) and whether their model's competitive learning mechanism is sufficient without additional competitive selection (Simulation 6). These additional simulations are very revealing about the critical principles underlying their model. In a recent paper, we took a similar approach by implementing both a “failure to activate” deficit and a “failure to inhibit” deficit and showing that only the former produced the pattern we observed in the behavioral data (Mirman, Britt, & Chen, 2013).
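To make the logic of these control simulations concrete, here is a deliberately tiny toy sketch in Python. It is not a reimplementation of the Oppenheim et al. model (or of ours); the items, features, learning rate, and latency proxy are all invented for illustration. The point is just the comparison: with an error-driven rule that strengthens the named word and weakens its competitors, the latency proxy climbs across same-category naming trials, whereas with the competitor-weakening removed (a purely facilitative rule), it stays flat.

```python
import numpy as np

# Toy "control simulation" sketch: compare a learning rule that strengthens the
# named word and weakens its competitors (the theory-relevant feature) against a
# purely facilitative rule, and see whether same-category naming latencies climb.
# Everything here (items, features, learning rate, latency proxy) is made up.

# Three same-category items: one shared "animal" feature plus one unique feature each.
SEMANTICS = {
    "cat":   np.array([1.0, 1.0, 0.0, 0.0]),
    "dog":   np.array([1.0, 0.0, 1.0, 0.0]),
    "horse": np.array([1.0, 0.0, 0.0, 1.0]),
}
WORDS = list(SEMANTICS)


def naming_latencies(competitive: bool, rate: float = 0.2) -> list[float]:
    """Name each word once, in order, applying one learning step after each trial.

    The latency proxy is the summed closeness of the competitors' activations
    to the target's activation (bigger = more competition = slower).
    """
    W = np.full((4, len(WORDS)), 0.5)            # semantic-feature-to-word weights
    latencies = []
    for target in WORDS:
        s = SEMANTICS[target]
        act = s @ W                              # lexical activations
        t = WORDS.index(target)
        latencies.append(float(np.sum(np.exp(act - act[t])) - 1.0))
        error = np.zeros(len(WORDS))
        error[t] = 1.0 - act[t]                  # strengthen the named word...
        if competitive:                          # ...and, in the full model, weaken competitors
            error -= np.where(np.arange(len(WORDS)) == t, 0.0, act)
        W += rate * np.outer(s, error)           # delta-rule weight update
    return latencies


# Full model: latencies climb across same-category trials (cumulative interference).
# Control model: with competitor-weakening removed, this toy shows no such climb.
print("competitive learning:", [round(x, 3) for x in naming_latencies(competitive=True)])
print("facilitative only:   ", [round(x, 3) for x in naming_latencies(competitive=False)])
```

In a real modeling paper, each variant would of course be run over many item sets, orderings, and parameter settings and compared against the relevant behavioral effect; the toy is only meant to show the shape of the argument.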
In fact, modelers almost always run such alternative simulations in the course of model development and testing, but we rarely report them because we want to be concise and worry that discussing multiple models will confuse the reader. Although these are legitimate concerns, I think such “control simulations” can play an important role by testing whether the theory-relevant features really are driving model behavior and by making explicit which features those are. The latter point matters because even the best-written modeling papers seem to leave many non-modeler readers uncertain about which aspects of a model are critical and which are ancillary. Once the main model has been described, these secondary simulations can often be reported very briefly, with the details provided in an appendix or as supplementary materials posted online.
Treating models as experiments also has implications for attempts to falsify models. Just like claims of a model's success, evidence of a model's failure should be accompanied by an assessment of whether it was due to the theoretical principles underlying the model or to implementational details. As on the success side, follow-up simulations that concretely attribute the key aspects of model performance to theoretical principles make for a much stronger argument for or against a particular model.
Mirman, D., Britt, A. E., & Chen, Q. (2013). Effects of phonological and semantic deficits on facilitative and inhibitory consequences of item repetition in spoken word comprehension. Neuropsychologia. PMID: 23770302
Oppenheim, G. M., Dell, G. S., & Schwartz, M. F. (2010). The dark side of incremental learning: A model of cumulative semantic interference during lexical access in speech production. Cognition, 114(2), 227-252. PMID: 19854436
Largely agree. Except on semantics (which is just that): I think of a model as a theoretical construct, but running it by simulation as an experiment.
Thanks for your comment. In this case, by "model" I mean an implemented computational model that can be tested by simulation (experiment), so I think we agree.
Excellent post; this is essentially the credit/blame assignment problem discussed in McCloskey (1991), and a good thing to be reminded of. Since the word "model" means different things to different researchers, sometimes it's simplest to refer to "theories" and "simulations" (which have less fuzzy definitions) and leave the ambiguities of "model" out entirely!