15. Trees

CAUTION - CHAPTER UNDER CONSTRUCTION!

This chapter was last updated on May 7, 2025. Contents locked until 11:59 p.m. Pacific Standard Time on May 23, 2025.

A tree is a connected simple graph that contains no cycles. Trees are used to model decisions, to sort data, and to optimize networks.

Key terms and concepts covered in this chapter:

  • Trees

    • Properties

    • trees (binary, spanning)

    • expression trees

  • Traversal strategies

  • Spanning trees/forests

  • Spanning trees

  • tree traversals

15.1. Definitions, Examples, and Properties of Trees and Forests

Recall that a simple graph has no loops and no parallel edges, that is, no edge connects a vertex to itself, and no two edges can connect the same pair of vertices. In a simple graph, each edge is determined by its two vertices.

A tree is an undirected simple graph that is connected and has no cycles, which you may recall are paths that start and end at the same vertex. Some sources use the term acyclic to mean "has no cycles."

A forest is the union of several trees. A forest is a simple graph that has one or more connected components, where each connected component is a tree.

Trees1v1

The image shows a forest composed of three trees.

Notice that if you choose any pair of vertices in one of the trees in the image, there is only one path that joins that pair of vertices. In fact, this property is True for any tree.

Theorem

Suppose that \(G\) is a undirected simple graph.

\(G\) is a tree if and only for every pair of vertices of \(G\) there is exactly one path between the vertices.

Proof

First, assume that \(G\) is a tree; we will prove, using proof by contradiction, that for every pair of vertices of \(G\) there is exactly one path between the vertices. By assumption, \(G\) is connected and for every pair of vertices in \(G\) there is a least one path joining those two vertices, so we only need to show that there cannot be two different paths that connect the same pair of vertices. So suppose that for some pair of vertices \(u\) and \(v\) there are two different paths between \(u\) and \(v,\) then we can start to go from \(u\) to \(v\) along a first path and then "turn around" and go back to \(u\) along a second path. This means that there must be a cycle in \(G\) that starts and ends at \(u.\)

Why must there be a cycle?
If we can go all the way from \(u\) to \(v\) along the first path and then go all the way back, in reverse order, from \(v\) to \(u\) along the second path without repeating any edges or vertices (except \(u\)) then we have found a cycle that starts and ends at \(u.\)
If the two paths have some common edges or vertices, we can use only part of each path to create a cycle that does not repeat any edges or vertices. To do this, write the first path as \(v_{0}v_{1}...v_{n-1}v_{n}\) where \(u = v_{0}\) and \(v = v_{n}\) and let \(j\) stand for the smallest positive index such that vertex \(v_{j}\) appears in both the first and second paths. Now create a cycle by using the beginning of the first path to go from \(u\) to \(v_{j}\) and then using the beginning part of the second path that goes from \(u\) to \(v_j\), but in reverse order, to go back from \(v_{j}\) to \(u.\) Notice that the index \(j\) is chosen so that that no edge or vertex (except \(u\) and \(v_j\)) that we use can belong to both of these shorter paths from \(u\) to \(v_{j},\) so no edges are repeated when we use this path to go from \(u\) back to \(u\) - we have created a cycle.

To continue with the proof by contradiction, we’ve shown that there is a cycle in \(G,\) but this contradicts the assumption that \(G\) is a tree that has no cycles. This means that the assumption "there is a pair of vertices in \(G\) that are connected by two different paths" must be False. We have proven that for any pair of vertices in \(G\) there is exactly one path joining that pair of vertices.

Secondly, assume that for every pair of vertices of \(G\) there is exactly one path between the vertices. We will prove, using proof by contradiction, that \(G\) must be a tree. By assumption, \(G\) is connected because for any pair of vertices there is a path between those vertices, so we only need to prove that \(G\) has no cycles. So suppose that \(G\) does have a cycle \(v_{0}v_{1} \cdots v_{n-1}v_{n}\) where \(v_{0} = v_{n}\) (that is, \(v_{0}\) and \(v_{n}\) are the same vertex, but no other vertex is repeated in the cycle.) Notice that, because \(G\) is a simple graph, the integer \(n\) must be greater than or equal to 3, so we have two different trails, \(v_{0}v_{1}\) and \(v_{1} \cdots v_{n},\) between \(v_0\) and \(v_1\) (Recall that \(v_{0}\) and \(v_{n}\) are the same vertex.) Now reverse the order of vertices in the second path to get two different paths between the vertices \(v_{0}\) and \(v_{1}\) - but we assumed that for every pair of vertices of \(G\) there is exactly one trail between those vertices. We have gotten a contradiction, which means that \(G\) cannot have a cycle. Therefore, \(G\) is connected and has no cycles, which by definition means that \(G\) is a tree.

Q.E.D.

Here is another characterization of trees.

Theorem

For every positive integer \(n,\) if a tree has \(n\) vertices then the tree has \(n-1\) edges.

Proof

We use mathematical induction on the number of vertices \(n.\)

Let \(P(n)\) be the predicate \[P(n): \text{If a tree has } n \text{ vertices then the tree has } n-1 \text{ edges.}\] We will prove that the proposition \((\forall n \in \mathbb{N}_{>0})P(n)\) is True.

Basis Step: \(P(1)\) is True since a tree that has \(1\) vertex has \(0\) edges (otherwise, the edge would have to be a loop, but trees are simple graphs that don’t have loops.)
We can also prove that \(P(2)\) is True in case the proof of \(P(1)\) feels unsatisfying. If a tree has \(2\) vertices, then since a tree is a connected simple graph, the \(2\) vertices must be connected by a path, and since a tree cannot have parallel edges, the vertices are the endpoints of exactly \(1\) edge. This proves that \(P(2)\) is True.

Induction Step: First, we assume that the induction hypothesis \(P(k)\) is True for some positive natural number \(k\).

Secondly, we will prove that the conditional \(P(k) \rightarrow P(k+1)\) must be True, which means we can use modus ponens (or the equivalent tautology \(( P(k) \land ( P(k) \rightarrow P(k+1) ) ) \rightarrow P(k+1)\)) to show that \(P(k+1)\) is also True.

Assume that \(P(k)\) is True for the positive integer \(k,\) that is, if a tree has \(k\) vertices then the tree has \(k-1\) edges. We can assume \(k \geq 2\) since the cases when \(k \in \{ 1, 2 \}\) were proved in the Basis Step. Suppose that we have a tree \(T\) that has \(k+1\) vertices; we will prove that the tree must have \(k\) edges. First, there must be at least one vertex \(v\) in \(T\) such that the degree of \(v\) is \(1:\) If every vertex had at least degree \(2,\) then we could find a cycle in \(T\) which cannot be True since \(T\) is a tree. Remove one vertex \(v\) that has degree \(1\) along with the edge that has \(v\) as an endpoint to obtain the subgraph \(T-v.\) Notice that for every pair of vertices of \(T-v\) there is exactly one trail between the vertices, so from the previous theorem, \(T-v\) is a tree. Also, \(T-v\) has \(k\) vertices because we removed only \(v,\) and we can apply the Induction Hypothesis to conclude that \(T-v\) has \(k-1\) edges. Now reinsert vertex \(v\) and the edge that was removed to obtain the tree \(T\) that has \(k+1\) vertices and \(k\) edges. Therefore, if \(P(k)\) is True then \(P(k+1)\) is True, too. That is, \(P(k) \rightarrow P(k+1)\).

Conclusion Step: We have proven both the Basis Step and the Induction Step. Therefore, we can use universal generalization to conclude that for all positive integers \(n,\) if a tree has \(n\) vertices then the tree has \(n-1\) edges.

Q.E.D.

15.2. Additional Topics

Students in CSC 230 Spring 2025 can refer to the slide decks (from Spring 2024 and Fall 2024) that are posted in Canvas.

  • Spanning Trees and Spanning Forests

    • Kruskal’s Algorithm

  • Binary Trees

    • Tree Traversal Strategies

    • Expression Trees

  • Algorithms for Binary search trees

    • Algorithms for Depth- and breadth-first traversals