Wednesday, January 30, 2013

Some good news from Timbuktu

 Credit: Wikipedia

I have an update on the status of the damage to the Timbuktu manuscript collections. The New Yorker reports that a large number of manuscripts from these collections were transported out of the city when it fell to Ansar-e-Dine. However, the status of the majority of the manuscripts remains unknown.

More details available here

Tuesday, January 29, 2013

A destruction of our shared history

Credit: World Atlas

Credit: Lonely Planet

As many of you might be aware, France is leading a coalition of forces in a nationwide offensive in the West African nation of Mali. Last year, following a military coup that overthrew then-president Amadou Toumani Touré, the Al-Qaeda-linked Islamist rebels Ansar-e-Dine and the Tuareg rebel group MNLA took control of Northern Mali. This alliance broke down when Ansar-e-Dine called for the imposition of Sharia law in the region.

After months of back-and-forth negotiations and truces between Ansar-e-Dine and the Malian army, as well as a number of UN resolutions, the Islamist rebels launched an offensive toward southern Mali, capturing the city of Konna on January 10th. On January 11th, at the behest of the Malian military regime, France launched Operation Serval, an air and ground military intervention in Northern and Central Mali, which has been successful in pushing back the rebels (for more details see this comprehensive timeline of the conflict by France 24).

As the Islamist rebels retreat from Northern Mali, which includes the famous city of Timbuktu, it has been reported that they set fire to a historic library housing thousands of important manuscripts, a literary heritage that dates back to the 15th and 16th centuries.

Written in Arabic, as well as the local languages of Songhai, Tamashek and Bambara, these collections contained accounts of the history and laws of the region, as well as poetry and stories of North Africa. At a time when Africa as a continent was considered by European colonists to be devoid of civilization, lacking literature, history and art, the discovery of these collections was instrumental in creating an African narrative and dispelling the myth that African history consisted solely of oral tradition.

As Essop Pahad, former director of the Timbuktu Manuscripts project points out:

The manuscripts gave you such a fantastic feeling of the history of this continent. They made you proud to be African. Especially in a context where you're told that Africa has no history because of colonialism and all that....The writings are so forward-looking on marriage, on trade, on all sorts of things. If the libraries are destroyed then a very important part of African and world history are gone.

There have been no further details as to the extent of the damage, but many (myself included) hope that some, if not most, of the collections have been preserved. If the reports of the destruction of these collections are true, then we have lost an important portion of our shared history.

In a similar vein, it should be noted that as part of the invasion of Iraq, American and Polish military presence in the ancient city of Babylon resulted in major damage to one of the oldest archaeological sites in the world. More details here.

Guest Post on the EEB and flow

I have a guest post on the rather awesome EEB and flow discussing my experiences as a minority in ecology. Check it out and let me know what you think. Also relevant is this piece in Science Careers on the difficulties minority women scientists face in Europe.

P.S. I owe you one, Caroline

Monday, January 21, 2013

Science edition: Does parsimony make sense in an inherently complicated world?

Parsimony (aka Ockham's Razor) is one of the major heuristic (general guiding rule or observation) frameworks in theoretical ecology when it comes to constructing models that attempt to describe the processes underlying ecological communities, broadly defined here as all species in an area. Of course, I'm glossing over the debate about what exactly is a "species" and what constitutes this mysterious "area". My interest in using theoretical methods in ecology, a logical consequence of my romantic pre-PhD notions about the perceived coolness of community over population ecology, combined with the fact that I (sometimes) happened to be in close proximity to Peter Abrams, has brought me face to face with parsimony in all of its forms.

Perhaps it's the pervasiveness of ecological bandwagons in our midst (cough, IDH, cough, phylogenetic community ecology, cough, neutral theory), or the fact that it is becoming increasingly difficult to find an overarching theory that explains all (or even most) patterns of biodiversity, even though we all secretly want to be the person to theorize it. Either way, the notion that Hypothesis A better explains Phenomenon A than Hypothesis B simply because it makes fewer assumptions feels hollow from a logical/philosophical perspective. After all, why should the simplicity of a hypothesis be any indicator of how true it is?

From a theoretical modelling perspective, the ideal model involves the smallest number of variables, with a certain set of assumptions that may or may not be realistic for the system that the said model hopes to explain. Simon Levin (1975) emphasizes:

    "Most models (...) are not meant as literal descriptions of particular situations, and thus generate predictions which are not to be tested against precise population patterns. On the other hand, such models, if handled skillfully, can lead to robust qualitative predictions which do not depend critically on parameter values and can stimulate new directions of investigation."

There is an inherent trade-off between the number of variables utilized and the extent of the explanatory power of the model. Fewer variables mean a simpler model, but less explanatory power for the entire system. More variables make a model complex, allowing for greater explanatory power, but lead to difficulty in identifying which variables/processes are important within the system of interest. The optimal model is one which minimizes the number of variables while maximizing the explanatory power of the model.
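This trade-off can be made concrete with an information criterion such as AIC, which scores a model by its fit minus a penalty for each parameter. Below is a minimal sketch (in Python with numpy; the data and polynomial models are invented for illustration and are not from any ecological system):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50
x = np.linspace(0, 10, n)
y = 2.0 * x + 1.0 + rng.normal(0, 1.0, n)  # data generated by a simple linear process

def fit_poly(degree):
    """Least-squares polynomial fit; returns (RSS, AIC) with k = degree + 1 parameters."""
    coeffs = np.polyfit(x, y, degree)
    rss = float(np.sum((y - np.polyval(coeffs, x)) ** 2))
    aic = n * np.log(rss / n) + 2 * (degree + 1)
    return rss, aic

rss_simple, aic_simple = fit_poly(1)    # two parameters
rss_complex, aic_complex = fit_poly(9)  # ten parameters

# The complex model never fits the sample worse (its RSS is at most the
# simple model's), so raw fit alone would always pick it; AIC's penalty
# of 2 per parameter is what lets the simpler model compete.
```

The point is not that AIC settles the philosophical question, only that it turns "balance variables against explanatory power" into an explicit calculation.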

While I agree that the balance between the number of variables included in the model and the explanatory power is important, I am not very convinced that when choosing between two models with similar explanatory power, the model with the fewer variables is the better model.

In her work "Feminist epistemology as local epistemology", the philosopher of science Helen Longino points out three criticisms of Ockham's razor:

i. This formulation begs the question what counts as an adequate explanation.  Is an adequate explanation an account sufficient to generate predictions or an account of underlying processes, and, if explanation is just retrospective prediction, then must it be successful at individual or population levels?  Either the meaning of simplicity will be relative to one’s account of explanation, thus undermining the capacity of simplicity to function as an independent epistemic value, or the insistence on simplicity will dictate what gets explained and how.
ii. We have no a priori reason to think the universe simple, i.e. composed of very few kinds of thing (as few as the kinds of elementary particles, for example) rather than of many different kinds of thing.  Nor is there or could there be empirical evidence for such a view.
iii. The degree of simplicity or variety in one’s theoretical ontology may be dependent on the degree of variety one admits into one’s description of the phenomena.  If one imposes uniformity on the data by rejecting anomalies, then one is making a choice for a certain kind of account. If the view that the boundaries of our descriptive categories are conventional is correct, then there is no epistemological fault in this, but neither is there virtue.
Oreskes, Shrader-Frechette and Belitz (1994) contend that:
Ockham's razor is perhaps the most widely accepted example of an extra-evidential consideration. Many scientists accept and apply the principle in their work, even though it is an entirely metaphysical assumption. There is scant empirical evidence that the world is actually simple or that simple accounts are more likely than complex ones to be true. Our commitment to simplicity is largely an inheritance of 17th-century theology.
Despite the lack of evidence suggesting that the universe is inherently simple (in fact, we have evidence of the opposite), parsimony has been a very useful tool for examining natural phenomena. Statistical methods such as Principal Component Analysis (PCA) describe complex phenomena in terms of simple components that account for the variation observed in data. In Bayesian inference, parsimony is an underlying principle for model selection (see Jefferys and Berger (1992), Ockham's Razor and Bayesian Analysis).
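To make the PCA point concrete, here is a small sketch (Python with numpy; the data are simulated for illustration): three noisy measurements that all track a single underlying factor collapse onto essentially one principal component.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=n)                 # one real underlying factor
data = np.column_stack([
    latent + 0.1 * rng.normal(size=n),      # three noisy measurements
    2 * latent + 0.1 * rng.normal(size=n),  # of that same factor
    -latent + 0.1 * rng.normal(size=n),
])

# Principal components are the eigenvectors of the covariance matrix;
# their eigenvalues give the variance carried along each component.
cov = np.cov(data, rowvar=False)
eigvals = np.linalg.eigvalsh(cov)[::-1]     # sorted largest first
explained = eigvals / eigvals.sum()
# The first component should account for nearly all of the variance,
# a "simple" one-dimensional description of three observed variables.
```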

From the perspective of pragmatism, simplicity is ideal for a number of reasons. As Hoffmann, Minkin and Carpenter (1997) point out, simple hypotheses are more vulnerable to falsification than complex ones, as fewer components in a model result in less flexibility. If we subscribe to a Popperian view where a good scientific hypothesis is one that can be falsified, then a model which is more falsifiable than another is the better model. Simpler models also tend to be more readily comprehensible than complex ones, although comprehensibility has nothing to do with accuracy. Simpler models are also more intuitive, as in the example provided below by Hoffmann et al. (1997).

There are a number of ways to describe the relationship between these data points.

The two relationships presented above are among the many possible relationships that could describe the data. Most of us would agree that the linear relationship provides a better fit than the complex functional form, and goodness-of-fit tests would confirm this intuitive choice.
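The same intuition can be checked numerically: fit both a straight line and a wiggly high-degree polynomial to one sample, then score each on points held out of the fit. A sketch (Python with numpy; the data here are simulated and are not the figure's actual points):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 10, 40))
y = 3.0 * x - 2.0 + rng.normal(0, 1.0, x.size)  # the true relationship is linear

train, test = slice(0, 30), slice(30, 40)       # hold out the last 10 points

def holdout_mse(degree):
    """Fit on the training points, score prediction error on the held-out points."""
    coeffs = np.polyfit(x[train], y[train], degree)
    pred = np.polyval(coeffs, x[test])
    return float(np.mean((y[test] - pred) ** 2))

mse_line = holdout_mse(1)
mse_wiggly = holdout_mse(9)
# The degree-9 curve threads the training points more closely, but its
# predictions swing wildly beyond them; the straight line generalizes.
```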

There seems to be a dichotomy between the usefulness of parsimony as a tool for understanding complex phenomena and its (potentially disturbing) philosophical implications about the nature of reality. Is it possible to utilize parsimony without buying into these problematic assumptions? Is it intellectually dishonest to ignore the philosophical failings of this technique if it plays an important role in the scientific method? Or is this merely a pragmatic admission of the limits of our intellectual capabilities as a species in understanding an infinitely complicated universe?

Tuesday, January 1, 2013

Happy New YeaR!

I apologize to my (few) readers in advance. Be prepared to indulge in some serious statistical nerdiness.

Considering that I just spent New Year's Eve making changes to a phylogeny (a family tree of organisms based on shared evolutionary characteristics) of North American freshwater fishes using the rather brilliant statistical package "R", I thought that it would be very appropriate to ring in 2013 with a pResent.

Disclaimer: I did not write the code myself (I wish I had though). All credit goes to Weicheng Zhu and Yihui Xie, who are much cooler people (as demonstrated below) than I am.

Set Fireworks in R 2011

Here is the code for this animation:

library(animation)  # provides ani.options(); install.packages("animation") if needed

fire <- function(centre = c(0, 0), r = 1:5, theta = seq(0,
    2 * pi, length = 100), l.col = rgb(1, 1, 0), lwd = 5,
    ...) {
    # draw concentric circles of "sparks" around the centre point
    x <- centre[1] + outer(r, theta, function(r, theta) r * sin(theta))
    y <- centre[2] + outer(r, theta, function(r, theta) r * cos(theta))
    matplot(x, y, type = "l", lty = 1, col = l.col, add = TRUE,
        lwd = lwd, ...)
}
f <- function(centre = rbind(c(-7, 7), c(7, 6)), n = c(7,
    5), N = 20, l.col = c("rainbow", "green"), p.col = "red",
    lwd = 5, ...) {
    ani.options(interval = 0.1)
    lwd = lwd
    if (is.vector(centre) && length(n) == 1) {
        r = 1:n
        l = seq(0.1, 0.6, length = n)
        matplot(centre[1], centre[2], col = p.col, ...)
        for (r in r) {
            fire(centre = centre, r = seq(r - l[r], r + l[r],
              length = 10), theta = seq(0, 2 * pi, length = 10 *
              r) + 1, l.col = rainbow(n)[r], lwd = lwd, ...)
        }
    } else {
        matplot(centre[, 1], centre[, 2], col = p.col, ...)
        l = list()
        for (i in 1:length(n)) l[i] = list(seq(0.1, 0.6,
            length = n[i]))
        if (length(l.col) == 1)
            l.col = rep(l.col, length(n))
        r = 1:N
        for (r in r) {
            for (j in 1:length(n)) {
              if (r%%(n[j] + 1) == 0) {
                r1 = 1:n[j]
                l1 = seq(0.1, 0.6, length = n[j])
                for (r1 in r1) {
                  fire(centre = centre[j, ], r = seq(r1 -
                    l1[r1], r1 + l1[r1], length = 10), theta = seq(0,
                    2 * pi, length = 10 * r1) + 1, l.col = par("bg"),
                    lwd = lwd + 2)
                }
              } else {
                if (l.col[j] == "red")
                  fire(centre = centre[j, ], r = seq(r%%(n[j] +
                    1) - l[[j]][r%%(n[j] + 1)], r%%(n[j] +
                    1) + l[[j]][r%%(n[j] + 1)], length = 10),
                    theta = seq(0, 2 * pi, length = 10 *
                      r%%(n[j] + 1)) + 1, l.col = rgb(1,
                      r%%(n[j] + 1)/n[j], 0), lwd = lwd, ...)
                else if (l.col[j] == "green")
                  fire(centre = centre[j, ], r = seq(r%%(n[j] +
                    1) - l[[j]][r%%(n[j] + 1)], r%%(n[j] +
                    1) + l[[j]][r%%(n[j] + 1)], length = 10),
                    theta = seq(0, 2 * pi, length = 10 *
                      r%%(n[j] + 1)) + 1, l.col = rgb(1 -
                      r%%(n[j] + 1)/n[j], 1, 0), lwd = lwd, ...)
                else if (l.col[j] == "blue")
                  fire(centre = centre[j, ], r = seq(r%%(n[j] +
                    1) - l[[j]][r%%(n[j] + 1)], r%%(n[j] +
                    1) + l[[j]][r%%(n[j] + 1)], length = 10),
                    theta = seq(0, 2 * pi, length = 10 *
                      r%%(n[j] + 1)) + 1, l.col = rgb(r%%(n[j] +
                      1)/n[j], 0, 1), lwd = lwd, ...)
                else fire(centre = centre[j, ], r = seq(r%%(n[j] +
                  1) - l[[j]][r%%(n[j] + 1)], r%%(n[j] +
                  1) + l[[j]][r%%(n[j] + 1)], length = 10),
                  theta = seq(0, 2 * pi, length = 10 * r%%(n[j] +
                    1)) + 1, l.col = rainbow(n[j])[r%%(n[j] +
                    1)], lwd = lwd, ...)
                }
            }
        }
    }
}

card <- function(N = 20, p.col = "green", bgcolour = "black",
    lwd = 5, ...) {
    ani.options(interval = 1)
    for (i in 1:N) {
        par(ann = F, bg = bgcolour, mar = rep(0, 4), pty = "s")
        f(N = i, lwd = lwd, ...)
        text(0, 0, "Happy New Year", srt = 360 * i/N, col = rainbow(N)[i],
            cex = 4.5 * i/N)
    }
}

ani.options(interval = 0.2)
card(N = 30, centre = rbind(c(-8, 8), c(8, 10), c(5, 0)),
    n = c(9, 5, 6), pch = 8, p.col = "green", l.col = c("rainbow",
        "red", "green"), xlim = c(-12, 12), ylim = c(-12,