The behavior of admixed populations in neighbor-joining inference of population trees

Naama M. Kopelman, Lewi Stone, Olivier Gascuel, Noah A. Rosenberg

Research output: Contribution to journal > Conference article > peer-review

15 Scopus citations


Neighbor-joining is one of the most widely used methods for constructing evolutionary trees. This approach from phylogenetics is often employed in population genetics, where distance matrices obtained from allele frequencies are used to produce a representation of population relationships in the form of a tree. In phylogenetics, the utility of neighbor-joining derives partly from a result that for a class of distance matrices including those that are additive or tree-like - generated by summing weights over the edges connecting pairs of taxa in a tree to obtain pairwise distances - application of neighbor-joining recovers exactly the underlying tree. For populations within a species, however, migration and admixture can produce distance matrices that reflect more complex processes than those obtained from the bifurcating trees typical in the multispecies context. Admixed populations - populations descended from recent mixture of groups that have long been separated - have been observed to be located centrally in inferred neighbor-joining trees, with short external branches incident to the path connecting their source populations. Here, using a simple model, we explore mathematically the behavior of an admixed population under neighbor-joining. We show that with an additive distance matrix, a population admixed among two source populations necessarily lies on the path between the sources. Relaxing the additivity requirement, we examine the smallest nontrivial case - four populations, one of which is admixed between two of the other three - showing that the two source populations never merge with each other before one of them merges with the admixed population. Furthermore, the distance on the constructed tree between the admixed population and either source population is always smaller than the distance between the source populations, and the external branch for the admixed population is always incident to the path connecting the sources. 
We define three properties that hold for four taxa and that we hypothesize are satisfied under more general conditions: antecedence of clustering, intermediacy of distances, and intermediacy of path lengths. Our findings can inform interpretations of neighbor-joining trees with admixed groups, and they provide an explanation for patterns observed in trees of human populations.
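The additive case described above can be checked on a small hypothetical example. The sketch below is not the authors' model or code: it assumes a quartet tree ((A, M), (B, C)) with made-up edge weights, where A and B stand in for the source populations, M for the admixed population, and C for a third population. It builds the additive distance matrix by summing edge weights, verifies the four-point condition, and evaluates the standard neighbor-joining selection criterion Q(i, j) = (n - 2) d(i, j) - r_i - r_j, where r_i is the sum of distances from i to all other taxa. All names (EDGES, ATTACH, tree_distance, q_criterion) are illustrative.

```python
from itertools import combinations

# Hypothetical edge weights for the quartet tree ((A, M), (B, C)).
# A and B play the role of source populations, M the admixed population,
# and C a third population; u and v are the two internal nodes.
EDGES = {("A", "u"): 3.0, ("M", "u"): 1.0, ("u", "v"): 2.0,
         ("B", "v"): 3.0, ("C", "v"): 4.0}
TAXA = ["A", "B", "C", "M"]
ATTACH = {"A": "u", "M": "u", "B": "v", "C": "v"}  # leaf -> internal node


def tree_distance(x, y):
    """Sum edge weights along the path between leaves x and y."""
    total = EDGES[(x, ATTACH[x])] + EDGES[(y, ATTACH[y])]
    if ATTACH[x] != ATTACH[y]:
        total += EDGES[("u", "v")]  # path crosses the internal edge
    return total


def distance_matrix():
    """Additive (tree-like) distances for every pair of taxa."""
    return {frozenset(p): tree_distance(*p) for p in combinations(TAXA, 2)}


def is_additive_quartet(d, tol=1e-9):
    """Four-point condition: the two largest pairwise sums are equal."""
    a, b, c, m = TAXA
    sums = sorted([
        d[frozenset((a, b))] + d[frozenset((c, m))],
        d[frozenset((a, c))] + d[frozenset((b, m))],
        d[frozenset((a, m))] + d[frozenset((b, c))],
    ])
    return abs(sums[2] - sums[1]) <= tol


def q_criterion(d):
    """Neighbor-joining Q value for every pair; a minimal pair is joined first."""
    n = len(TAXA)
    row = {x: sum(d[frozenset((x, y))] for y in TAXA if y != x) for x in TAXA}
    return {p: (n - 2) * d[frozenset(p)] - row[p[0]] - row[p[1]]
            for p in combinations(TAXA, 2)}


d = distance_matrix()
q = q_criterion(d)
first_pair = min(q, key=q.get)  # ties are broken by iteration order

print("additive:", is_additive_quartet(d))
print("first pair joined:", first_pair)
print("Q(A, B) =", q[("A", "B")], " Q(A, M) =", q[("A", "M")])
```

With these assumed weights, the pair of sources (A, B) does not attain the minimum Q value, and the tree distances from M to A and from M to B are both smaller than the distance from A to B, consistent with the properties stated in the abstract. A different choice of weights or topology would need to be checked separately.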

Original language: English
Pages (from-to): 273-284
Number of pages: 12
Journal: Pacific Symposium on Biocomputing
State: Published - 2013
Event: 18th Pacific Symposium on Biocomputing, PSB 2013 - Kohala Coast, United States
Duration: 3 Jan 2013 to 7 Jan 2013


Funders (with funder numbers):
Burroughs Wellcome Fund
National Institutes of Health: R01 GM081441
National Science Foundation: BCS-1024627, DBI-1146722
National Institute of General Medical Sciences: R01GM081441


    • Admixture
    • Neighbor-joining
    • Phylogenetics
    • Population genetics

