Monday, January 21, 2013

Science edition: Does parsimony make sense in an inherently complicated world?

Parsimony (aka Ockham's Razor) is one of the major heuristic (general guiding rule or observation) frameworks in theoretical ecology when it comes to constructing models that attempt to describe the processes underlying ecological communities, broadly defined here as all species in an area. Of course, I'm glossing over the debate about what exactly is a "species" and what constitutes this mysterious "area". My interest in using theoretical methods in ecology, a logical consequence of my romantic pre-PhD-student notions about the perceived coolness of community over population ecology, combined with the fact that I (sometimes) happened to be in close proximity to Peter Abrams, has brought me face to face with parsimony in all of its forms.

Perhaps it's the pervasiveness of all of the ecological bandwagons in our midst (cough, IDH, cough, phylogenetic community ecology, cough, neutral theory), or the fact that it is becoming increasingly difficult to find an overarching theory that explains all (or even most) patterns of biodiversity, even though we all secretly want to be the person to theorize it. Whatever the reason, the notion that Hypothesis A better explains Phenomenon A than Hypothesis B does, simply because it holds fewer assumptions, feels hollow from a logical/philosophical perspective. After all, why should the simplicity of a hypothesis be any indicator of how true it is?

From a theoretical modelling perspective, the ideal model involves the smallest number of variables, along with a certain set of assumptions that may or may not be realistic for the system that the model is hoping to explain. Simon Levin (1975) emphasizes:

    "Most models (...) are not meant as literal descriptions of particular situations, and thus generate predictions which are not to be tested against precise population patterns. On the other hand, such models, if handled skillfully, can lead to robust qualitative predictions which do not depend critically on parameter values and can stimulate new directions of investigation."

There is an inherent trade-off between the number of variables utilized and the extent of the explanatory power of the model. Fewer variables mean a simpler model, but less explanatory power for the entire system. More variables make a model more complex, allowing for greater explanatory power, but lead to difficulty in identifying which variables/processes are important within the system of interest. The optimal model is one which minimizes the number of variables while maximizing the explanatory power of the model.

While I agree that the balance between the number of variables included in a model and its explanatory power is important, I am not convinced that, when choosing between two models with similar explanatory power, the model with fewer variables is necessarily the better model.

In her work "Feminist epistemology as local epistemology", philosopher of science Helen Longino points out three criticisms of Ockham's razor:

i. This formulation begs the question what counts as an adequate explanation.  Is an adequate explanation an account sufficient to generate predictions or an account of underlying processes, and, if explanation is just retrospective prediction, then must it be successful at individual or population levels?  Either the meaning of simplicity will be relative to one’s account of explanation, thus undermining the capacity of simplicity to function as an independent epistemic value, or the insistence on simplicity will dictate what gets explained and how.
ii. We have no a priori reason to think the universe simple, i.e. composed of very few kinds of thing (as few as the kinds of elementary particles, for example) rather than of many different kinds of thing.  Nor is there or could there be empirical evidence for such a view.
iii. The degree of simplicity or variety in one’s theoretical ontology may be dependent on the degree of variety one admits into one’s description of the phenomena.  If one imposes uniformity on the data by rejecting anomalies, then one is making a choice for a certain kind of account. If the view that the boundaries of our descriptive categories are conventional is correct, then there is no epistemological fault in this, but neither is there virtue.
Oreskes, Shrader-Frechette and Belitz (1994) contend that:
    "Ockham's razor is perhaps the most widely accepted example of an extra-evidential consideration. Many scientists accept and apply the principle in their work, even though it is an entirely metaphysical assumption. There is scant empirical evidence that the world is actually simple or that simple accounts are more likely than complex ones to be true. Our commitment to simplicity is largely an inheritance of 17th-century theology."
Despite the lack of evidence suggesting that the universe is inherently simple (in fact, we have evidence of the opposite), parsimony has been a very useful tool for examining natural phenomena. Statistical methods such as Principal Component Analysis (PCA) describe complex phenomena in terms of a few simple components that account for the variation observed in the data. In Bayesian inference, parsimony is an underlying principle for model selection (see Jeffreys and Berger (1992), "Ockham's Razor and Bayesian Analysis").
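To make the PCA point concrete, here is a minimal sketch in Python (using synthetic, made-up data; none of it comes from a real study) of how a couple of principal components can capture most of the variation in a ten-dimensional dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "community" data: 100 sites measured on 10 correlated
# variables. They are correlated because they are generated from only
# two underlying gradients -- the kind of hidden simplicity PCA exposes.
gradients = rng.normal(size=(100, 2))
loadings = rng.normal(size=(2, 10))
data = gradients @ loadings + 0.1 * rng.normal(size=(100, 10))

# PCA via the eigendecomposition of the covariance matrix.
centered = data - data.mean(axis=0)
eigenvalues = np.linalg.eigvalsh(np.cov(centered, rowvar=False))[::-1]

explained = eigenvalues / eigenvalues.sum()
print("Variance explained by the first 2 of 10 components:",
      round(explained[:2].sum(), 3))  # close to 1, by construction
```

Of course, two components summarize nearly everything here only because the data were built that way; with real ecological data the picture is rarely so clean.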

From the perspective of pragmatism, simplicity is ideal for a number of reasons. As Hoffmann, Minkin and Carpenter (1997) point out, simple hypotheses are more vulnerable to falsification than complex ones, as fewer components in a model result in less flexibility. If we subscribe to a Popperian view, where a good scientific hypothesis is one that can be falsified, then the model that is more falsifiable is the better model. Simpler models, by virtue of their simplicity, also tend to be more readily comprehensible than complex models, although this comprehensibility has nothing to do with accuracy. Simpler models are also more intuitive, as in the example provided below by Hoffmann et al. (1997).

[Figure: a scatterplot of data points, from Hoffmann et al. (1997).]

There are a number of ways to describe the relationship between these data points.

[Figure: the same data fit with both a simple linear function and a complex functional form.]

The two relationships presented above are among the many possible relationships that could describe these data. Most of us would agree that the linear relationship provides a better fit to the data than the complex functional form, and goodness-of-fit tests would confirm this intuitive choice.
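As a rough sketch of that intuition (with made-up data standing in for the figure), one can fit both a straight line and a needlessly flexible polynomial to the same points and compare them with a criterion such as AIC, which rewards fit but penalizes extra parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up data: an underlying linear trend plus noise.
x = np.linspace(0, 10, 20)
y = 2.0 * x + 1.0 + rng.normal(scale=1.5, size=x.size)

def aic(y_obs, y_fit, k):
    """Gaussian AIC up to a constant: n*log(RSS/n) + 2k. Lower is better."""
    n = y_obs.size
    rss = np.sum((y_obs - y_fit) ** 2)
    return n * np.log(rss / n) + 2 * k

for degree in (1, 9):  # a simple line vs. a wiggly degree-9 polynomial
    fitted = np.polyval(np.polyfit(x, y, degree), x)
    print(f"degree {degree}: RSS = {np.sum((y - fitted) ** 2):.1f}, "
          f"AIC = {aic(y, fitted, degree + 1):.1f}")

# The degree-9 fit hugs the data more tightly (smaller RSS), but the
# AIC penalty for its ten parameters typically leaves the line on top.
```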

There seems to be a dichotomy between the usefulness of parsimony as a tool for understanding complex phenomena and its (potentially disturbing) philosophical implications about the nature of reality. Is it possible to utilize parsimony without buying into these problematic assumptions? Is it intellectually dishonest to ignore the philosophical failings of this technique, given that it plays an important role in the scientific method? Or is it merely a pragmatic admission of the limits of our intellectual capabilities as a species in understanding an infinitely complicated universe?

5 comments:

  1. Putting philosophy aside, I like parsimony for technical reasons.

    For an example in statistics, complex models require lots of data in order to keep estimation variance down. This is the 'bias-variance tradeoff' (see the sketch at the end of this comment).

    For another example, exploring the predictions of complex models can take too much computer time. This is the 'curse of dimensionality'.

    I think everyone would rather just use complex models -- because, as you say, the world is complex -- but there are some serious technical challenges to complex modelling that are kind of insurmountable.
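    A minimal sketch of that first point (simulated, made-up numbers; the degrees and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

def prediction_variance(degree, n_reps=500, n_points=15):
    """Variance of a fitted polynomial's prediction at x = 5 across
    many small simulated datasets (the true curve is linear)."""
    preds = []
    for _ in range(n_reps):
        x = rng.uniform(0, 10, n_points)
        y = 2.0 * x + 1.0 + rng.normal(scale=1.5, size=n_points)
        preds.append(np.polyval(np.polyfit(x, y, degree), 5.0))
    return np.var(preds)

print("degree 1:", round(prediction_variance(1), 3))
print("degree 9:", round(prediction_variance(9), 3))
# The complex model's predictions swing far more from sample to
# sample, so it needs much more data to pin them down.
```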

  2. This comment has been removed by the author.

  3. I like parsimony for technical reasons as well, especially for the two reasons that you mentioned above. However, I have some reservations when people consider parsimony to be the only underlying reason for choosing which models ought to be applied to a dataset. It's important to admit that parsimony is used for model selection for pragmatic reasons, such as technical limitations (e.g. insufficient data, lack of computing power). This is especially true in undergrad or grad stats classes, where parsimony is taught as a general rule to be applied across the board without any sort of discussion about it.

  4. I think the key is understandability, not only simplicity. Your theory or model may not be the best, but if you manage to make it understandable to the scientific community, it will have a place. This is why science usually advances by building on the known: we already understand most of it.

  5. @Leonardo Saravia

    Understandability is very important in science. But understandability is not a proxy for how well your model reflects the nature of reality. So while simple models are more understandable, that doesn't necessarily mean that they correctly represent a phenomenon. That said, parsimony is a very useful approach to model building and selection from a technical perspective.
