5.2. Tree Construction: Distance Methods – Basics
Notice that one edge has turned out to have negative length. Because
this cannot really be meaningful, many practitioners would simply
reassign the length as 0. In that case, however, we should at least check
that the negative length was close to 0; if it was not, we should worry
about the quality of the data.
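The check just described can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout, the tolerance of 0.05, and the edge lengths are all assumptions made here, since the text does not say how close to 0 is close enough.

```python
def clamp_negative_edges(edge_lengths, tolerance=0.05):
    """Reset negative edge lengths to 0 and list those not close to 0.

    edge_lengths maps an edge (a pair of node names) to its fitted length.
    tolerance is an assumed threshold; the text does not specify one.
    """
    cleaned = {}
    suspicious = []
    for edge, length in edge_lengths.items():
        if length < 0:
            # A negative length is not meaningful, so it is reset to 0,
            # but a large negative value is recorded as a warning sign.
            if abs(length) > tolerance:
                suspicious.append(edge)
            cleaned[edge] = 0.0
        else:
            cleaned[edge] = length
    return cleaned, suspicious


# Hypothetical fitted lengths, not the values from the example in the text.
lengths = {("A", "v"): 0.31, ("B", "v"): -0.01, ("C", "w"): -0.20}
cleaned, suspicious = clamp_negative_edges(lengths)
```

With these made-up numbers, both negative edges are reset to 0, but only the edge to C is flagged as a possible sign of poor data.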
Although it may seem surprising at first, the Fitch-Margoliash algorithm
and UPGMA will produce exactly the same tree topology when applied to the
same data set. The reason for this is that, when deciding which taxa or
groups to join at each step, both methods consider exactly the same collapsed
data table and both choose the pair corresponding to the smallest entry in the
table. It is only the metric features of the resulting trees that will differ.
This somewhat undermines the hope that the Fitch-Margoliash algorithm is much
better than UPGMA. Although it may produce a better metric tree, topologically
it never differs.
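The shared joining step can be sketched as follows. This is a minimal illustration rather than a full implementation of either method: it assumes that the collapsed table holds the average distance between the original taxa of two groups, it breaks ties by taking the first minimum found, and it omits edge lengths entirely, since that is where the two methods differ. The function name and the distances are hypothetical.

```python
from itertools import combinations


def joining_topology(taxa, dist):
    """Return the nested-tuple topology from repeatedly joining the closest pair.

    dist maps frozenset({a, b}) to the distance between taxa a and b.
    Each cluster is stored as (topology, tuple_of_member_taxa).
    """
    clusters = [(t, (t,)) for t in taxa]

    def avg(c1, c2):
        # Entry of the collapsed table: mean distance over original taxa
        # (an assumption about how the table is collapsed).
        total = sum(dist[frozenset((a, b))] for a in c1[1] for b in c2[1])
        return total / (len(c1[1]) * len(c2[1]))

    while len(clusters) > 1:
        # Both methods pick the pair with the smallest table entry.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg(clusters[p[0]], clusters[p[1]]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical distances among four taxa.
dist = {
    frozenset(("A", "B")): 0.2,
    frozenset(("A", "C")): 0.6,
    frozenset(("A", "D")): 0.7,
    frozenset(("B", "C")): 0.6,
    frozenset(("B", "D")): 0.7,
    frozenset(("C", "D")): 0.4,
}
topology = joining_topology(["A", "B", "C", "D"], dist)
```

Here A and B are joined first, then C and D, giving the grouping (("A", "B"), ("C", "D")). Because no edge lengths enter into the choice of pairs, any method that selects pairs this way arrives at the same grouping.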
Fitch and Margoliash (1967) actually proposed their algorithm not as an
end in itself, but rather as a heuristic method for producing a tree
likely to have a certain optimality property (see the Problems
section). We are viewing it here, like UPGMA, as a step toward the
Neighbor Joining algorithm of the next section. Familiarity with UPGMA and
the Fitch-Margoliash algorithm will aid us in understanding that more
elaborate method.
Of course, both UPGMA and the Fitch-Margoliash algorithm are better
done by computer programs than by hand. However, a few hand calculations
are necessary to understand fully how the methods function and what
assumptions go into them.
Rooting a tree. Although the Fitch-Margoliash algorithm has allowed us
to obtain unequal branch lengths in our trees, we have paid a price – the trees
it constructs are unrooted. However, since finding a root is often desirable, a
clever idea can get around this deficiency.
When applying any phylogenetic tree method that produces an unrooted
tree, we can include an additional taxon, known as an outgroup. This extra
taxon is chosen so that it is known to be more distantly related to each of
the taxa of interest than they are to each other. For instance, if we are trying
to relate species of ducks to one another, we might include a different type of
bird as the outgroup. Once an unrooted tree is constructed, we locate the root
where the edge to the outgroup joins the rest of the tree. Biological knowledge
that the outgroup must have diverged from the other taxa before they split
from one another gives us the location in the tree of the common ancestor.
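As a rough sketch of this rooting step, suppose the unrooted tree is stored as an adjacency dictionary. The representation, the function name, the node labels, and the edge lengths below are all assumptions made for illustration. The root is taken to be the node at which the outgroup's edge attaches, and the remaining edges are then oriented away from it.

```python
def root_with_outgroup(adjacency, outgroup):
    """Root an unrooted tree where the outgroup's edge joins the other taxa.

    adjacency maps each node to a dict {neighbor: edge_length}.
    Returns (root, children), where children maps each node of the rooted
    ingroup tree to the list of its child nodes. The outgroup is dropped.
    """
    neighbors = adjacency[outgroup]
    if len(neighbors) != 1:
        raise ValueError("the outgroup must be a leaf of the tree")
    root = next(iter(neighbors))

    children = {}
    # Walk outward from the root, never stepping back toward the parent.
    stack = [(root, outgroup)]
    while stack:
        node, parent = stack.pop()
        children[node] = [n for n in adjacency[node] if n != parent]
        for child in children[node]:
            stack.append((child, node))
    return root, children


# Hypothetical unrooted tree: three ducks, one outgroup, internal nodes u and v.
edges = [
    ("duck1", "u", 0.10),
    ("duck2", "u", 0.12),
    ("u", "v", 0.05),
    ("duck3", "v", 0.20),
    ("outgroup", "v", 0.60),
]
adjacency = {}
for a, b, length in edges:
    adjacency.setdefault(a, {})[b] = length
    adjacency.setdefault(b, {})[a] = length

root, children = root_with_outgroup(adjacency, "outgroup")
```

In this made-up tree the root is the internal node v, whose children are u and duck3, and u in turn has children duck1 and duck2.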