08 Apr 2025
– Tags:
sage
Another day, another blog post that starts with
“I was on mse the other day…”. This time, someone asked
an interesting question amounting to “how many unordered rooted
ternary trees with $n$
nodes are there, up to isomorphism?”. I’m a sucker for these kinds of
combinatorial problems, and after finding a generating function solution
I wanted to push myself to get an asymptotic approximation using
Flajolet–Sedgewick style analytic combinatorics! I’ve never actually
done this before, so I learned a lot, and I want to share some of the things
I learned – especially how to do this stuff in sage!
Now, you might be thinking – didn’t you write a very similar blog post
three years ago? Yes. Yes I did. Did I also completely forget what
was in that post? Yes. Yes I did, haha. For some reason I was getting it mixed
up with this other post from even more years ago, which isn’t nearly
as relevant. Thankfully it didn’t matter much,
since I’m fairly sure what I wanted to do wouldn’t work in the framework of
that post anyways, and now I have a better idea of what this machinery is
actually doing which is really exciting (since I’ve been
wanting to understand this stuff for years).
Let’s start with a warmup problem! How might we count the number of
rooted ordered ternary trees with $n$ nodes?
These are the kinds of “ternary trees” that you might remember from a
first class on algorithms and data structures. Such a tree is either
a leaf or an internal node with 1,2, or 3 children. Crucially these
children come with an order, because the datatype keeps track of which
child is on the left/right (in the case of 2 children) or on the
left/middle/right (in the case of 3 children). After all, from a CS
perspective, we need to remember this in order to traverse the tree!
This means that the following two trees are distinct for our purposes, even
though they’re isomorphic as graphs:
In a functional programming language, you might describe this datatype by
T z = Leaf z
| Unary z (T z)
| Binary z (T z) (T z)
| Ternary z (T z) (T z) (T z)
and this translates immediately to give a functional equation for the
generating function of $t_n$, which counts the number of rooted ordered
ternary trees with $n$ nodes:
\[T(z) = z + zT + zT^2 + zT^3\]
We can rearrange this as
\[z = \frac{T}{1 + T + T^2 + T^3}\]
so that we can compute the power series expansion of $T$ by
lagrange inversion:
Incredibly, this agrees with A036765, which is the
“number of ordered rooted trees with n non-root nodes and all outdegrees
<= three”, as we hoped!
You might reasonably ask if there’s a closed form for these numbers, and
this is too much to ask for (indeed, it’s already too much to ask for a
closed form for fibonacci numbers, and this is more complicated). But like
the fibonacci numbers, we can come up with an excellent approximation:
\[t_n = [z^n] T \approx
C \frac{(3.6107\ldots)^n}{n^{3/2}}
\left (1 \pm O \left ( \frac{1}{n} \right ) \right )\]
where $C = \frac{(0.8936\ldots)}{2 \sqrt{\pi}}$ is a constant.
Indeed, this gives fantastic results! Let’s plot the ratio of this
approximation to the true value so we can see just how good this approximation
gets as $n$ gets large. Note that it respects the promised big-oh error bound
too!
This is extremely cool, but where the hell did this approximation come from?
The answer is called Singularity Analysis, and
can be found in Chapter 2 Section 3 of Melczer’s excellent
An Invitation to Analytic Combinatorics, or Chapters VI and VII
of Flajolet and Sedgewick’s tome. See especially Theorem
VII.3 on pg 468.
Like seemingly every theorem in complex analysis,
this is basically an application of the Residue Theorem. I won’t say
too much about why it works, but I’ll at least gesture at a proof. You
can read the above references if you want something more precise.
First, we recall
\[t_n = \frac{1}{2\pi i} \int_C \frac{T}{z^{n+1}} \ \mathrm{d}z\]
where $C$ is any contour containing the origin inside a region where $T$
is holomorphic.
Then we draw the most important picture in complex analysis:
Here the obvious marked point is our singularity $\omega$, and we’ve
chosen a branch cut for $T$ (shown in blue) so that $T$ is
holomorphic in the region where the pink curve lives. We’ll estimate
the value of this integral by estimating the contribution from the big
circle, the little circle, and the two horizontal pieces.
It turns out that the two horizontal pieces basically contribute $O(1)$
amount to our integral, so we ignore them. Since the big circle is compact,
$T$ will attain a maximal value on it, say $M$. Then the integral along
the big circle (of radius $\omega + \epsilon$, say) is bounded by
$2 \pi (\omega + \epsilon) \frac{M}{(\omega + \epsilon)^{n+1}} = O((\omega+\epsilon)^{-n})$
To estimate the integral around the little circle, it would be really
helpful to have a series expansion at $\omega$ since we’re staying
really close to it… Unfortunately, $\omega$ is a singular point so we
don’t have a taylor series, but fortunately there’s another tool
for exactly this job: a Puiseux Series!
I won’t say much about what these are, especially since
Richard Borcherds already put out such a great video on the topic.
What matters is is that Sage can compute them for us, so we can actually
get our hands on the approximation!
We compute the integral around the little circle to be roughly:
\[\begin{align}
\frac{1}{2 \pi i} \int \frac{T}{z^{n+1}}
\ \mathrm{d}z
&\overset{(1)}{\approx}
\frac{1}{2 \pi i} \int \frac{c_\alpha (1-\frac{z}{\omega})^\alpha}{z^{n+1}}
\ \mathrm{d}z \\
&\overset{(2)}{=}
\frac{1}{2 \pi i} \int \frac{c_\alpha \sum_k \binom{\alpha}{k} (\frac{-z}{\omega})^k}{z^{n+1}}
\ \mathrm{d}z \\
&\overset{(3)}{=}
c_\alpha \binom{\alpha}{n} \frac{(-1)^n}{\omega^n}
\end{align}\]
In step $(1)$ we approximate $T$ by the first term of its puiseux series,
in step $(2)$ we apply the generalized binomial theorem, so that in step
$(3)$ we can apply the residue theorem to realize this integral as the
coefficient of $z^{-1}$ in our laurent expansion.
This gives us something which grows like $\widetilde{O}(\omega^{-n})$,
dominating the contribution $O((\omega + \epsilon)^{-n})$ from the big circle,
so that asymptotically this is the only term that matters. If we were
more careful to keep track of the big-oh error for the puiseux series for $T$
we could easily sharpen this bound to see
\[t_n =
c_\alpha \binom{\alpha}{n} \frac{(-1)^n}{\omega^n}
\left ( 1 + O \left ( \frac{1}{n} \right ) \right )\]
This looks a bit weird with the $(-1)^n$, but remember that $\binom{\alpha}{n}$
also alternates sign. Indeed, asymptotics for $\binom{\alpha}{n}$ are
well known so that we can rewrite this as
\[t_n =
\frac{c_\alpha}{\Gamma(-\alpha)} \frac{\omega^{-n}}{n^{\alpha + 1}}
\left ( 1 + O \left ( \frac{1}{n} \right ) \right )\]
Which finally shows us where our magic numbers came from!
Sage happily tells us that the dominant singularity for $T$ is at
$\omega = (0.2769\ldots)$ and that a puiseux expansion for $T$ at $\omega$
is
\[T \approx (0.65\ldots) – (0.8936\ldots) (1-z/\omega)^{1/2} + \ldots\]
so that for us $\alpha = 1/2$, and $C_{1/2} = -(0.8936\ldots)$.
Finally, since $\Gamma(-1/2) = -2 \sqrt{\pi}$ and
$(0.2769\ldots)^{-1} = (3.6107\ldots)$ we see
\[t_n =
\frac{(0.8936\ldots)}{2 \sqrt{\pi}} \frac{(3.6107\ldots)^{n}}{n^{3/2}}
\left ( 1 + O \left ( \frac{1}{n} \right ) \right )\]
as promised!
Indeed, here’s how you could actually get sage to do all this for you!
As a fun exercise, you might modify this code (using s.add_term()
) to
compute a longer puiseux series and get asymptotics valid up to a
multiplicative error of $O \left ( \frac{1}{n^2} \right )$
for the number of rooted ternary trees.
Try modifying the previous code block to see that our current approximation
is not accurate to $O \left ( \frac{1}{n^2} \right )$, and then
check that your approximation is!
Ok, now that we understand the warmup, let’s get to the actual problem!
How many unordered rooted ternary trees are there, up to isomorphism?
Now we’re counting up to graph isomorphism, so that our two trees
are now considered isomorphic. It’s actually much less obvious how one
might pin down a generating function for something like this, but the
answer, serendipitously, comes from Pólya-Redfield counting! If this is
new to you, you might be interested in my recent blog post where
I talk a bit about one of its corollaries. Today, though, we’ll be using
it in a much more sophisticated way.
Say you have a structure $X$ acted on by a group $G$ and a collection $C$
of “colors”. How can we count the number of ways to give each $x \in X$
a color from $C$, up to the action of $G$?
We start by building the cycle index polynomial of the action
$G \curvearrowright X$, which we call $P_G(x_1, \ldots, x_n)$. Then we can
plug all sorts of things into the variables $x_i$ in order to solve various
counting problems. For example, if $C$ is literally just a finite set of colors,
we can plug in $x_1 = x_2 = \ldots = x_n = |C|$ to recover the expression
from my recent blog post. But we can also do much more!
Say that $C$ is a possibly infinite family of allowable “colors”, each
of a fixed “weight”. (For us, our “colors” will be trees, and the “weight”
will be the number of nodes). Then we can arrange them into a generating
function $F(t) = \sum c_i t^i$, where $c_i$ counts the number of colors
of weight $i$. Then an easy argument (given in full on the
wikipedia page) shows that $P_G(F(t), F(t^2), \ldots, F(t^n))$
is a generating function counting the number of ways to color $X$
by colors from $C$, counted up to the $G$ action, and sorted by
their total weight. Check out the wikipedia page for a bunch of
great examples!
For us, we realize an unordered rooted ternary tree is either empty,
or a root with 3 recursive children. In the recursive case we
also want to count up to the obvious $\mathfrak{S}_3$ action permuting
the children, so by the previous discussion we learn that
\[T(z) = 1 + z P_{\mathfrak{S}_3}(T(z), T(z^2), T(z^3))\]
where
\(P_{\mathfrak{S}_3}(x_1, x_2, x_3) = \frac{1}{6}(x_1^3 + 3x_1 x_2 + 2 x_3)\)
is the cycle index polynomial for the symmetric group on three letters.
Plugging this in we see
\[T(z) = 1 + \frac{z}{6} \left ( T(z)^3 + 3 T(z) T(z^2) + 2 T(z^3) \right )\]
Now, how might we get asymptotics for $t_n$ using this functional equation?
First let’s think about how our solution to the warmup worked. We
wrote $F(z,T)=0$ for a polynomial $F$, used the implicit function
theorem to get a taylor series for $T$ at the origin, then got a
puiseux series near the dominant singularity $\omega$ which let us accurately
estimate the taylor coefficients $t_n$.
We’re going to play the same game here, except we’ll assume that $F$ is
merely holomorphic rather than a polynomial. Because we’re no longer
working with a polynomial, this really seems to require an infinite amount
of data, so I’m not sure how one might get an exact solution for the relevant
constants… But following Section VII.5 in Flajolet and Sedgewick we can get
as precise a numerical solution as we like!
We’ll assume that the functions $T(z^2)$ and $T(z^3)$ are already known
analytic functions, so that we can write
$F(z,w) = -w + 1 + \frac{z}{6} \left ( w^3 + 3 T(z^2) w + 2 T(z^3) \right )$.
This is an analytic function satisfying $F(z,T) = 0$.
Now for a touch of magic. Say we can find a pair $(r,s)$ with both
$F(r,s)=0$ and $\left . \frac{\partial F}{\partial w} \right |_{(r,s)} = 0$…
Then $F$ is singular in the $w$ direction at $(r,s)$ so this is a branch
point for $T$. Since both $F$ and $F_w = \frac{\partial F}{\partial w}$
vanish at $(r,s)$ the taylor series for $F$ at $(r,s)$ starts
\[F_z(r,s) (z-r) + \frac{1}{2} F_{ww}(r,s) (w-s)^2\]
Since we know $F(z,T)$ vanishes, we estimate up to smaller order terms
\[F_z(r,s) (z-r) + \frac{1}{2} F_{ww}(r,s) (T-s)^2 = 0\]
so that a puiseux series for $T$ at $(r,s)$ begins
\[T = s \pm \sqrt{\frac{2r F_z(r,s)}{F_{ww}(r,s)}} \left ( 1 – \frac{z}{r} \right )^{1/2}\]
If we write $\gamma = \sqrt{\frac{2r F_z(r,s)}{F_{ww}(r,s)}}$ and are
slightly more careful with our error terms, the same technique from the
warmup shows
\[t_n = \frac{\gamma}{\Gamma(-1/2)} \frac{r^{-n}}{n^{3/2}}
\left ( 1 + O \left ( \frac{1}{n} \right ) \right )\]
See Flajolet and Sedgewick Theorem VII.3 on page 468 for a more careful
proof of this theorem.
Now how do we use this? We can approximate $T$ by its taylor series
at the origin, then numerically solve for the unique positive real solution
to the system $F(r,s) = F_w(r,s) = 0$ using this approximation:
If we run this code locally with $N=20$, we get the approximation
\[\begin{align}
r &\approx 0.3551817478731632 \\
s &\approx 2.1174205967276225 \\
\gamma &\approx -1.8358222405236164
\end{align}\]
so that we expect
\[t_n \approx \frac{(1.835\ldots)}{2 \sqrt{\pi}} \frac{(0.355\ldots)^{-n}}{n^{3/2}}\]
and indeed this seems to work really well!
Our taylor expansion for $T$ agrees with A000598, as we expected,
and comparing our approximation to our taylor expansion gives:
I’m pretty proud of this approximation, so I think this is a good
place to stop ^_^.
As a fun exercise, can you write a program that outputs the number
of cyclic rooted ternary trees on $n$ vertices? For these we consider
two trees the same if they’re related by cyclicly permuting their children.
Compare your solution to A000625
For bonus points, can you check that the number of such trees is,
asymptotically,
\[t_n = (0.34630\ldots) \frac{(3.2871\ldots)^n}{n^{3/2}}\]
Wow! It’s been super nice to be writing up so many posts lately! Like
I said in one of my other recent posts, I’ve had a lot of time to think
about more bite sized problems and mse stuff while waiting for my DOI
generating code to run, so I’ve had more things that felt quick to write up
on my mind.
My research is actually going quite well too! I have a few interesting
directions to explore, and at least one project that might be wrapping
up soon. Of course when that happens I’ll be sure to talk about it here,
and I’m still planning out a series on fukaya categories, hall algebras,
skein algebras, and more! That’s a pretty long one, though, so it’s easy
for me to deprioritize, haha.
It’s already heating up in Riverside, consistently in the 80s (Fahrenheit),
so I can really feel the summer coming. I’m enjoying the sunny days, though,
and it’s been nice to spend time working outside and under trees.
Thanks for hanging out, all! Take care of each other, and I can’t wait
to chat soon ^_^.