*James G Sanderson. American Scientist. Volume 88, Issue 4. Jul/Aug 2000.*

Tradition holds that Charles Darwin glimpsed the signature of natural selection quite early in his career, after observing the finches of the Galapagos Islands. He visited these teeming shores of the tropical East Pacific in 1835, during his famous circumnavigation of the globe. One passage in his Journal of Researches into the Geology and Natural History of the Various Countries Visited by H. M. S. Beagle (a work usually published under the more compact title The Voyage of the Beagle) describes his reaction to the markedly different beaks of the six species of Galapagos ground finches: “Seeing this gradation and diversity of structure in one small, intimately related group of birds, one might really fancy that from an original paucity of birds in this archipelago, one species had been taken and modified for different ends.” A woodcut showing four finch heads in profile appears next to this statement, further suggesting that these birds were key to the development of Darwin’s ideas about biological evolution.

But as Frank Sulloway of Harvard University has shown, the familiar story of “Darwin’s finches” that many people learned in school is mostly just that: a story. In actuality, Darwin gathered few examples of these supposedly crucial birds. He failed to recognize the importance of the specimens that he did collect and neglected to so much as tag each one with the name of the island from which it came. Indeed, Darwin did not even realize that some of these birds were finches until six years later, when John Gould, an eminent British ornithologist, set him straight. One reads gushing descriptions in The Voyage of the Beagle only because Darwin revised the text of his journal in 1845 to reflect what he had pieced together in the intervening years. His original account says very little about the finches, reflecting the minimal attention he paid to these birds when he first saw them.

Only in more recent times have “Darwin’s finches” received the careful scrutiny they deserve. Shortly after World War II, David Lack, of Oxford’s Edward Grey Institute of Field Ornithology, reported that two species of finches with similarly sized beaks do not live together on any of the 17 islands of the Galapagos. Knowing that the configuration of its beak indicates a lot about what a bird is eating, he concluded that competing species were forced to develop different feeding habits if they were to survive together in one place. More recently, Peter Grant and Dolph Schluter argued that the pattern of Galapagos ground finches (genus Geospiza) is structured in just this way: Similarly endowed species tend not to be found living together on the same islands.

This statement builds on an intellectual foundation constructed a quarter century ago, largely through the work of the polymath Jared Diamond. Diamond, a medical doctor, became a professor of physiology at the University of California, Los Angeles in 1966. And in 1998 he also became a Pulitzer Prize-winning author, with the publication of Guns, Germs and Steel. But starting in the 1970s, Diamond began devoting some of his considerable energies to the study of tropical birds. After doing fieldwork on more than 50 islands of the Bismarck Archipelago (located east of New Guinea), he concluded that the bird species living there and in other island communities obey what he dubbed assembly rules, which describe why only certain combinations of species can be found together.

According to Diamond, one can deduce the relevant assembly rules for any type of animal by observing which species inhabit which islands. But his thesis sparked a heated controversy, one that has persisted now for more than two decades. The commotion began in 1979, when the ecologist Daniel Simberloff and his student at Florida State University Edward Connor (who later moved to the University of Virginia) wrote an influential article that questioned whether the patterns commonly seen in island communities truly reflect such things as competition between species or are simply the result of happenstance.

The problem is akin to throwing a handful of marbles on a checkerboard and trying to evaluate the resulting pattern. Clearly, if all the marbles end up clustered on one corner square, some underlying mechanism has affected the distribution. The board may be tilted, for example. If the marbles are randomly scattered all over, one can safely conclude that the board is flat and true and that the reasons the glass balls stopped where they did involve a multitude of unknowable (and rather trivial) factors. But what if, say, four marbles come to rest on the same square? Is a subtle effect (perhaps a rough patch) influencing the process or can four closely poised marbles just randomly turn up? In this case, it is easy to perform the experiment repeatedly and see whether the curious outcome proves consistent. But something as grand as the distribution of species on a collection of islands cannot be replicated in such a simple manner. Ecologists must evaluate the singular pattern they find.

Connor and Simberloff reasoned that if the observed distribution of birds (or bats or beetles or bears) is in any way special, it should differ in a fundamental fashion from the random placement of an equal number of species on an equal number of islands, something they could simulate and test mathematically. This exercise requires two steps: First, one must determine chance expectations by forming what have become known as random, or null, communities. One must then gauge the discrepancy between typical null communities and the actual community using proper statistical tests. If the difference is large, ecologists can try to ferret out the underlying mechanism. If the distributions resemble each other, scientists need not waste their time with further investigations.

After applying statistical tests to the bird species found on the islands of the Vanuatu Archipelago of the East Indies and to both birds and bats living in the West Indies, Connor and Simberloff concluded that Diamond was seeing patterns where none existed. Later they also criticized Grant and Schluter for the manner in which those authors had formulated null communities in their statistical study of the Galapagos finches. Was that analysis so flawed that the conclusions are invalidated? Does the distribution of Darwin’s finches in the Galapagos really reflect competition, or is this seemingly reasonable view most ecologists fully accept just another myth?

## Lies, Damned Lies and …

To answer these questions requires a foray into statistics, a discipline that has long suffered from a tarnished reputation. Yes, statistics can easily be manipulated to mislead. And scholarly debates between investigators about the statistical tests put forth for evaluating ecological patterns do little to bolster one’s confidence in this branch of mathematics. The fundamental problem in island biogeography is the difficulty that arises in generating null communities to compare with the actual observations. Connor and Simberloff realized that the simulated distributions should resemble the real world in two key respects: The total number of different species on each island should correspond, as should the total number of islands harboring each given species. But the proper way to construct such null communities eluded them.

This failure led other ecologists to relax these two requirements, permitting explosive growth of the number of null communities that could be generated for a statistical analysis, something Connor and Simberloff carefully avoided. Indeed, when these restrictions are dropped, many more patterns in the actual community appear unusual, because (in the lingo of mathematicians) the null space is large. The situation is akin to observing a number that is known to fall, for instance, between 1 and 10 and then testing this result against a particular hypothesis. Suppose, for example, that the observed value is 7, and the hypothesis is that this number represents a random sample. If one takes as the null space the set of integers between 1 and 10, then it is perfectly reasonable to suppose that 7 could have been chosen randomly. But if the null space is the set of all real numbers in that interval, the likelihood of choosing an exact integer is vanishingly small. Clearly, selection of the proper null space is crucial. But for the past two decades, ecologists have been at a loss for the correct procedure.

I became intrigued by this problem of mathematical ecology in 1992. At the time, I was a student in the Biology Department at the University of New Mexico, where I was introduced to the work of Connor and Simberloff during a classroom lecture. Even without reading their paper, I could see from a single overhead viewgraph that the similarity between their null communities and Diamond’s actual observations was dubious: The correspondence was just too good.

Delving into the literature helped little, because, as far as I could see, no one had devised a legitimate way to construct null communities. But I saw that the problem of finding matrix elements that matched the desired row and column sums was akin to the task of finding pixel values in computer axial tomography, a subject that I had worked on previously. Because the “projections” ran only vertically and horizontally in this case (rather than at the many angles of a real CAT scan), I figured that there should be many possible solutions, not just a few.

This problem receded to the back of my mind as I completed my training in wildlife ecology. Only much later did I realize that computer scientists had the answer already in hand. The solution comes from a well-known algorithm called the knight’s tour, which addresses the following illustrative problem: Imagine a single knight placed anywhere on an otherwise empty chess board. This piece moves in the normal manner, that is, it advances or retreats by a single square in one direction and two squares in the perpendicular direction. The problem of the knight’s tour is to figure out how to move the piece so that it lands just once on each of the 64 squares of the chess board.

It turns out that there are a multitude of solutions. Finding them requires nothing more than perseverance, or rather, programming. In fact, solving the knight’s tour is a common exercise for beginning students of computer science. One begins by moving the knight randomly to any square that it can reach. You then move the piece, again at random, to another square, and so forth. If the knight lands on a square that it has already occupied, bring it back to the previous position and randomly pick a different move from the remaining possibilities. If you exhaust all options, just back up one more position and continue anew. Such a trial-and-error solution would be tedious to carry out by hand, but computers can follow the prescription easily, and it always renders an answer.
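The trial-and-error procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code the author or his students used: the function name `knights_tour`, its parameters and the default 5×5 board are assumptions chosen for the example. A small board is used because plain random backtracking, with no move-ordering heuristic, can take a long time on a full 8×8 board.

```python
import random

# The eight L-shaped displacements available to a knight.
KNIGHT_MOVES = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def knights_tour(size=5, start=(0, 0), rng=None):
    """Return a list of squares, each visited exactly once, or None if no tour exists."""
    rng = rng or random.Random()
    visited = [[False] * size for _ in range(size)]
    path = [start]
    visited[start[0]][start[1]] = True

    def extend():
        if len(path) == size * size:
            return True  # every square has been visited once
        row, col = path[-1]
        moves = KNIGHT_MOVES[:]
        rng.shuffle(moves)  # try the possible moves in random order
        for d_row, d_col in moves:
            r, c = row + d_row, col + d_col
            if 0 <= r < size and 0 <= c < size and not visited[r][c]:
                visited[r][c] = True
                path.append((r, c))
                if extend():
                    return True
                # Dead end: undo the move and try the next option.
                path.pop()
                visited[r][c] = False
        return False  # all options exhausted; the caller backs up one more step

    return path if extend() else None


if __name__ == "__main__":
    print(knights_tour(size=5, start=(0, 0)))
```

Because the moves are shuffled, repeated calls generally return different tours, which is the property that makes the same idea attractive for generating many random solutions to a constrained problem.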

The same tactic also solves the so-called eight queens problem, for which eight queens must be placed on a chess board so that no two are attacking each other, and each open square is attacked by at least one piece. I knew of this algorithm for years and had assigned students to code it many times when I was teaching introductory computer science at the University of New Mexico. But it took me two years to show that the same procedure could be used to construct the null communities needed to test for patterns in island biogeography. With the help of Michael Moulton (then a colleague in the Department of Wildlife Ecology and Conservation at the University of Florida) and Ralph Selfridge (who worked just across campus from us in the Department of Computer Science), I published a scholarly description of the method in 1998.

One begins by setting up a rectangular array with one row for each species and one column for each island. The cells in this array (that is, the elements in this matrix) record which species live on which island. Normally, ecologists use “1” to denote that a species is present and “0” to indicate that it is absent. So the problem of finding null communities (null matrices) boils down to filling in this array randomly with ones and zeros so that each of the rows and columns sums to the appropriate number. As with the knight’s tour, the solution is straightforward: One starts with a blank slate of zeros and then randomly assigns values of 1 to various matrix elements until a prescribed row or column sum has been exceeded. In that case, you reverse the manipulation that caused the violation and try something else. In this rather simpleminded way, the algorithm moves through a progression of favorable and unfavorable states, successively advancing and backtracking until a solution emerges.
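The advance-and-backtrack idea can be illustrated with a short Python sketch. It is not the published implementation: the function name `null_matrix`, the cell-by-cell filling order and the pruning test are all assumptions made for this example. The sketch visits cells in order, tries 0 or 1 in random order, and backs up whenever a row or column sum would be exceeded or could no longer be reached. No claim is made that it samples the possible matrices uniformly, and large matrices may need stronger pruning to run quickly.

```python
import random


def null_matrix(row_sums, col_sums, rng=None):
    """Return a random 0/1 matrix with the given row and column sums, or None."""
    rng = rng or random.Random()
    n_rows, n_cols = len(row_sums), len(col_sums)
    if sum(row_sums) != sum(col_sums):
        return None  # the two sets of sums are inconsistent
    grid = [[0] * n_cols for _ in range(n_rows)]
    row_left = list(row_sums)  # ones still to be placed in each row
    col_left = list(col_sums)  # ones still to be placed in each column

    def fill(k):
        if k == n_rows * n_cols:
            return True  # the pruning below guarantees every sum is met
        r, c = divmod(k, n_cols)
        cells_after_in_row = n_cols - c - 1
        rows_after_in_col = n_rows - r - 1
        choices = [0, 1]
        rng.shuffle(choices)
        for value in choices:
            new_row = row_left[r] - value
            new_col = col_left[c] - value
            # Reject a value that exceeds a sum or leaves one unreachable.
            if not (0 <= new_row <= cells_after_in_row
                    and 0 <= new_col <= rows_after_in_col):
                continue
            grid[r][c] = value
            row_left[r], col_left[c] = new_row, new_col
            if fill(k + 1):
                return True
            # Backtrack: reverse the assignment and try the other value.
            row_left[r] += value
            col_left[c] += value
            grid[r][c] = 0
        return False

    return grid if fill(0) else None


if __name__ == "__main__":
    # Hypothetical example: 3 species (rows) on 4 islands (columns).
    for row in null_matrix([2, 1, 2], [1, 2, 1, 1]):
        print(row)
```

Calling the function repeatedly yields a collection of candidate null matrices; duplicates would still have to be filtered out before the collection is used in a statistical test.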

It is easy to generate matrices that satisfy the required row and column sums using this strategy: My personal computer, which is by no means the fastest machine in the neighborhood, can usually pop out a solution in about a second. One has to be somewhat cautious, because matrices that differ only because rows or columns have been shuffled are not distinct solutions (just as interposing the names of two islands or two species before you have filled in the matrix of observations would not constitute a different problem). Indeed, when I first told Simberloff that I had generated 5,000 null matrices for the birds of the Vanuatu Archipelago, a situation for which he was only able to construct 10 examples, he suspected that most of my solutions were not unique and set a graduate student to the task of uncovering the duplication. Having checked these matrices for myself, I was confident of the outcome. But I could certainly understand his reluctance to accept uncritically that the cozy null space he thought he knew so well had suddenly expanded to cosmic proportions.

## Will the Real Pattern Please Stand Out

Having solved the problem of creating null communities, I needed only to find the proper statistical yardstick, or metric, with which to gauge whether the observed pattern is distinct. Previous treatments of this subject described several ways to formulate this test, but knowing how misleading some of this work had been, I decided to look at the problem afresh. Ultimately, I chose to count instances of two species living on the same island. That is, I could consider each possible combination of two species and add up the number of times such pairings are found. And I could examine each of the null communities I generated in the same manner. Results would, of course, differ from one null configuration to the next, but they would provide a well-defined distribution with a clear mean and standard deviation against which to compare tallies from the set of actual observations.
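The pairwise tally can be sketched in Python as follows. The names `pair_counts` and `unusual_pairs` are hypothetical, and the decision rule (flagging a pair whose observed count lies more than 1.96 standard deviations from the null mean, which corresponds to a two-sided 95 percent threshold only if the null counts are roughly normal) is an assumption of this sketch rather than a description of the author's exact test.

```python
from itertools import combinations
from statistics import mean, pstdev


def pair_counts(matrix):
    """Map each species pair (i, j) to the number of islands the two share."""
    return {
        (i, j): sum(a & b for a, b in zip(matrix[i], matrix[j]))
        for i, j in combinations(range(len(matrix)), 2)
    }


def unusual_pairs(observed, null_matrices, z_cutoff=1.96):
    """Return {pair: z-score} for pairs whose observed count is far from the null mean."""
    observed_counts = pair_counts(observed)
    null_counts = [pair_counts(m) for m in null_matrices]
    flagged = {}
    for pair, count in observed_counts.items():
        values = [counts[pair] for counts in null_counts]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # no spread among the null communities; z-score undefined
        z = (count - mu) / sigma
        if abs(z) > z_cutoff:
            flagged[pair] = z  # positive: together more often than chance
    return flagged
```

Here each matrix is a list of rows (species) of 0/1 values (islands), and the null matrices are expected to share the observed row and column sums.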

Darwin’s finches can help illustrate the steps involved. There are 13 species of Galapagos finches inhabiting 17 major islands, a pattern that can be described fully by a 13×17 matrix of observations. Using the knight’s tour algorithm, my desktop computer can readily generate 10,000 unique null communities that have the same values for the row and column sums. That is, my 10,000 simulated archipelagos all have the same number of species inhabiting each island and the same number of islands harboring each species.

Because I consider pairs of species, there are quite a few combinations of birds to evaluate, 78 to be exact (the number of ways to choose 2 species from 13). Counting the number of times each of these possible groupings occurs indicates that 13 of the pairings found in the Galapagos deviate significantly from chance expectations. A few spurious indications (roughly 4 of the 78 pairings, if each test has a 5 percent false-positive rate) might be anticipated, just because I set my threshold for significance at 95 percent. But finding that 13 pairings occur significantly more or less frequently than chance would predict indicates that Darwin’s finches indeed have an ecological tale to tell.

What might have created these patterns? To answer that question, one has to reach beyond the statistics of abundances and consider the biology of the species involved. Grant had noted that similar birds tend not to live in the same place, but there are many ways in which birds can be similar. Does size matter? It would seem not. Grant and his colleagues have estimated that adults of three of the ground finch species, G. fortis, G. scandens and G. difficilis, each weigh about 20 grams. Yet G. fortis and G. scandens seem to be birds of a feather, being found on the same island more frequently than chance would have it, whereas the opposite result holds for the combinations of G. fortis and G. difficilis and for G. difficilis and G. scandens.

If not size, how about pedigree? Because Grant and his coworkers have analyzed evolutionary relationships among the Galapagos finches using DNA, this question, too, is easy to address. These genetic investigations revealed that G. fuliginosa and G. fortis are closely related, as are G. scandens and G. conirostris. Although the first two live alongside each other more often than chance would predict, the latter pair are found less often than expected. So evolutionary divergence seems not to govern the present-day pattern.

Might the birds’ highly variable beaks give the real explanation? After all, the configuration of the beak tells a lot about the ecological niche a bird occupies. Following in the footsteps of Darwin, Lack, Grant and others, I recently tested whether the dimensions of their bills had something to do with the distribution of these species. And it was straightforward to show that one simple principle governs the makeup of the Galapagos ground finches, Geospiza (this genus accounts for about half of the species present on the islands, and all of the anomalous pairings involve at least one of them). The rule is this: Both the widths and the lengths of the beaks differ by more than 16 percent for species found together more often than chance would predict. Stated another way, the beak widths or lengths differ by less than 16 percent for species that live together less often than chance expectations. So it would seem that the longstanding notion that competition for food has structured the community of Galapagos finches has a sound statistical footing.
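The 16 percent rule stated above can be written as a small Python check. This is a sketch under stated assumptions: the function names are hypothetical, the percentage difference is taken relative to the smaller of the two measurements (the text does not say which baseline was used), and the numbers in the usage example are invented for illustration, not real beak measurements.

```python
def percent_difference(a, b):
    """Difference between two measurements, relative to the smaller one (an assumption)."""
    return abs(a - b) / min(a, b) * 100.0


def beaks_differ_enough(beak_a, beak_b, threshold=16.0):
    """True when BOTH beak width and beak length differ by more than the threshold percent."""
    return all(
        percent_difference(beak_a[key], beak_b[key]) > threshold
        for key in ("width", "length")
    )


if __name__ == "__main__":
    # Invented measurements in millimeters, for illustration only.
    species_x = {"width": 10.0, "length": 10.0}
    species_y = {"width": 12.0, "length": 12.0}  # 20 percent larger in both dimensions
    species_z = {"width": 11.0, "length": 13.0}  # width differs by only 10 percent
    print(beaks_differ_enough(species_x, species_y))  # both exceed 16 percent
    print(beaks_differ_enough(species_x, species_z))  # width falls short
```

Under the rule as stated, a pair for which the function returns `True` would be expected among the species found together more often than chance predicts, and a pair for which it returns `False` among those found together less often.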

## Community Relations

The question of whether assembly rules are fact or fiction has troubled ecologists for decades. My recent contributions to this field have perhaps helped bring the opposing camps somewhat closer together. I have shown, for example, that bird species on the islands of the Vanuatu Archipelago are not randomly distributed, just as Diamond proclaimed long ago. But my work has also demonstrated that judgments about perceived patterns must be reserved until they can be subjected to the appropriate statistical tests, as Connor and Simberloff first argued.

This methodology also proves useful in situations that do not involve islands. Ecologists have, for example, charted the distribution of some species along a range of elevations (or along some other environmental gradient) and have sorely needed statistical tools to evaluate the patterns they see. Because these observations can be arranged in an array where each row corresponds to one species and each column corresponds to one elevation, the steps are essentially the same: Create a large number of null communities using the knight’s tour algorithm and then test the matrix of observations against this set with an appropriate metric. I have convinced myself that this technique works using observations of the amphibians living on the flanks of Mount Kupe in Cameroon (a study that Ulrich Hofer and his colleagues published last year), and I suspect it would be useful in many other gradational environments as well.

The knight’s tour algorithm also proves valuable for quantifying what ecologists call nestedness, a term applied when the species present in one place are a subset of the species found in another, which in turn are a subset of the species that inhabit a third locale, and so forth. This situation commonly arises when the sites considered are of different sizes, because larger areas (be they islands, mountain tops or wildlife reserves) generally support more species than smaller ones. So ecologists are not surprised when they discover nestedness; they have, however, struggled to determine the proper way to gauge deviations from complete nestedness. Richard Brualdi of the University of Wisconsin and I proposed last year that the touring chess knight and his associated statistical entourage can again come to the rescue by showing whether any observed discrepancy from pure nestedness is statistically significant.

I would hope that analyses of other ecological patterns, too, might eventually benefit from this general approach to building null communities. Indeed, I would be surprised if others did not suggest applications that are beyond the few I have considered. Such testing is, after all, central to the work of many dedicated and imaginative scientists, people who usually toil just as long and hard to evaluate their observations as they do to collect them in the first place. Although that division of labor might not fit one’s romanticized picture of the life of a field ecologist, it is typical, and quite fitting. As the late Robert Helmer MacArthur wrote in the opening sentence to his acclaimed book Geographical Ecology, “To do science is to search for repeated patterns, not simply to accumulate facts.”

## Aiding Conservation Management

The analysis of species patterns is of more than academic interest. One application arises when people decide to restore a wild animal to a place where it has become extinct. (Conservation managers might, for example, want to reintroduce certain species of Galapagos finches to islands where they had once lived but are now lost.) Efforts of this kind, including the current campaigns to restore the lynx to its former ranges in Europe and the United States, often prove to be difficult and expensive.

Conservationists involved in these programs do not want to waste their energies trying to establish a species where it will not survive. The reintroduced species may, for example, face an inordinate amount of competition from other animals within its current (often restricted) habitat. So the author’s statistical tools for gauging competition between similar species should help managers to evaluate the prospects for success.