Mathematica 9 is now available


Among the most pervasive data structures in computer science are trees. A tree consists of a root containing some atomic value, and children which are themselves either atomic values or subtrees. For example, the expression (a+b)*c*Sin[x] can be regarded as a tree with Times as its root and children that are respectively the subtree Plus[a,b], the atom c, and the subtree Sin[x] .

Trees are used in a variety of algorithms both for reasons of efficiency and because many data abstractions are best modelled via trees. Indeed, all Mathematica expressions fall into the class of structures known as "directed acyclic graphs (DAGs), and these are a small generalization of trees.

A particularly simple flavor is known as a binary tree. Each subtree has at most two children, referred to by the terms "left" and "right." Such trees may be used to implement fast average-time sorting algorithms. The simplest example is shown below. We build a tree by placing new elements to the left or right of a given node according to ordering between node and new element. Once all are in the tree we flatten it; the result is a sorted version of the original set. If we have n elements then the average-case complexity is O(n*log(n)). This is because each atom (typically) will be at depth not much more than log(n), hence each of n insertions averages that many steps.

Below we implement an ordered binary tree. We use Mathematica canonical ordering to compare elements. We also use List as our head for each triple {left,val,right}. This effectively precludes use of lists for element values. One can use a different head to get around this drawback, but for our purposes this suffices.


Now we run a few examples and demonstrate that this does indeed sort the original list of elements. We gather timings to check algorithmic complexity.


With this data we can now see that the algorithmic complexity is roughly as billed, O(n*log(n)).


It should be noted that this sorting algorithm takes a minor shortcut. In ordering elements in a tree node as {left,val,right} we may simply recover the sorted list using Flatten. A different internal node element ordering would not support this. We could more generally use a stack and walk the tree, as was done in the code to emulate Flatten, and this would still give the appropriate algorithmic complexity.