Enumerating the rationals

Share on:

The rational numbers are well known to be countable, and one standard method of counting them is to put the positive rationals into an infinite matrix \(M=m_{ij}\), where \(m_{ij}=i/j\) so that you end up with something that looks like this:

\[ \left[\begin{array}{ccccc} \frac{1}{1}&\frac{1}{2}&\frac{1}{3}&\frac{1}{4}&\dots\\\[1ex] \frac{2}{1}&\frac{2}{2}&\frac{2}{3}&\frac{2}{4}&\dots\\\[1ex] \frac{3}{1}&\frac{3}{2}&\frac{3}{3}&\frac{3}{4}&\dots\\\[1ex] \frac{4}{1}&\frac{4}{2}&\frac{4}{3}&\frac{4}{4}&\dots\\\[1ex] \vdots&\vdots&\vdots&\vdots&\ddots \end{array}\right] \]

It is clear that not only will each positive rational appear somewhere in this matrix, but its value will appear an infinite number of times. For example \(2 / 3\) will appear also as \(4 / 6\), as \(6 / 9\) and so on.

Then we can enumerate all the elements of this matrix by traversing all the SW--NE diagonals:

This provides an enumeration of all the positive rationals: \[ \frac{1}{1}, \frac{1}{2}, \frac{2}{1}, \frac{3}{1}, \frac{2}{2}, \frac{1}{3}, \frac{1}{4}, \frac{2}{3},\ldots \] To enumerate all rationals (positive and negative), we simply place the negative of each value immediately after it: \[ \frac{1}{1}, -\frac{1}{1}, \frac{1}{2}, -\frac{1}{2}, \frac{2}{1}, -\frac{2}{1}, \frac{3}{1}, -\frac{3}{1}, \frac{2}{2}, -\frac{2}{2}, \frac{1}{3}, -\frac{1}{3}, \frac{1}{4}, \\frac{1}{4}, \frac{2}{3}, -\frac{2}{3}\ldots \] This is all standard, well-known stuff, and as far as countability goes, pretty trivial.

One might reasonably ask: is there a way of enumerating all rationals in such a way that no rational is repeated, and that every rational appears naturally in its lowest form?

Indeed there is; in fact there are several, of which one of the newest, most elegant, and simplest, is using the Calkin-Wilf tree. This is named for its discoverers (or creators, depending on which philosophy of mathematics you espouse), who described it in an article happily available on the archived web site of the second author. Herbert Wilf died in 2012, but the Mathematics Department at the University of Pennsylvania have maintained the page as he left it, as an online memorial to him.

The Calkin-Wilf tree is a binary tree with root \(a / b = 1 / 1\). From each node \(a / b\) the left child is \(a / (a+b)\) and the right child is \((a+b) / b\). From each node \(a / b\), the path back to the root contains the fractions which encode, as it were, the Euclidean algorithm for determining the greatest common divisor of \(a\) and \(b\). It is not hard to show that every fraction in the tree is in its lowest terms, and appears only once; also that every rational appears in the tree.

The enumeration of the rationals can thus be made by a breadth-first transversal of the tree; in other words listing each level of the tree one after the other:

\[ \underbrace{\frac{1}{1}}_{\text{The root}},\; \underbrace{\frac{1}{2},\; \frac{2}{1}}_{\text{first level}},\; \underbrace{\frac{1}{3},\; \frac{3}{2},\; \frac{2}{3},\; \frac{3}{1}}_{\text{second level}},\; \underbrace{\frac{1}{4},\; \frac{4}{3},\; \frac{3}{5},\; \frac{5}{2},\; \frac{2}{5},\; \frac{5}{3},\; \frac{3}{4},\; \frac{4}{1}}_{\text{third level}}\;\ldots \]

Note that the denominator of each fraction is the numerator of its successor (again, this is not hard to prove in general); thus given the sequence

\[ b_i=0,1,1,2,1,3,2,3,1,4,3,5,2,5,3,4,\ldots \]

(indexed from zero), the rationals are enumerated by \(b_i/b_{i+1}\). This sequence pre-dates Calkin and Wilf; is goes back to an older enumeration now called the Stern-Brocot tree named for the mathematician Moritz Stern and the clock-maker Achille Brocot (who was investigating gear ratios), who discovered this tree independently in the early 1860's.

The sequence \(b_i\) is called Stern's diatomic sequence and can be generated recursively:

\[ b_i=\left\{\begin{array}{ll} i,&\text{if $i\le 1$}\\ b_{i/2},&\text{if $i$ is even}\\ b_{(i-1)/2}+b_{(i+1)/2},&\text{if $i$ is odd} \end{array} \right. \]

Alternatively: \[ b_0=0,\;b_1=1,\;b_{2i}=b_i,b_{2i+1}=b_i+b_{i+1}\text{ for }i\ge 1. \] This is the form in which it appears as sequence 2487 in the OEIS.

So we can generate Stern's diatomic sequence \(b_i\), and then the successive fractions \(b_i/b_{i+1}\) will generate each rational exactly once.

If that isn't remarkable enough, sometime prior to 2003, Moshe Newman showed that the Calkin-Wilf enumeration of the rationals can in fact be done directly: \[ x_0 = 1,\quad x_{i+1}=\frac{1}{2\lfloor x_i\rfloor -x_i +1}\;\text{for}\;i\ge 1 \] will generate all the rationals. I can't find anything at all about Moshe Newman; he is always just mentioned as having "shown" this result. Never where, or to whom. There is a proof for this in an article "New Looks at Old Number Theory" by Aimeric Malter, Dierk Schleicher and Don Zagier, published in The American Mathematical Monthly , Vol. 120, No. 3 (March 2013), pp. 243-264. The part of the article relating to enumeration of rationals is based on a prize-winning mathematical essay by the first author (who at the time was a high school student in Bremen, Germany), when he was only 13. Here is the skeleton of Malter's proof:

If \(x\) is any node, then its left and right hand children are \(L = x / (x+1)\) and \(R = 1+x = 1 / (1-L)\) respectively. And clearly \(R = 1/(2\lfloor L\rfloor -L +1)\). Suppose now that \(A\) is a right child, and \(B\) is its successor rational. Then \(A\) and \(B\) will have a common ancestor \(z=p/q\), say \(k\) generations ago. To get from \(z\) to \(A\) will require one left step and \(k-1\) right steps. It is easy to show (by induction if you like), that

\[ A = k-1+\frac{p}{p+q} \] and for its successor \(B\), obtained by one right step from \(z\) and \(k-1\) left steps: \[ B = \frac{1}{\frac{q}{p+q}+k-1}. \] Since \(k-1=\lfloor A\rfloor\), and since \[ \frac{p}{p+q} = A-\lfloor A\rfloor \] it follows that \[ B=\frac{1}{1-(A-\lfloor A\rfloor))+\lfloor A\rfloor}=\frac{1}{2\lfloor A\rfloor-A+1}. \] The remaining case is moving from the end of one row to the beginning of the next, that is, from \(n\) to \(1 / (n+1)\). And this is trivial.

What's more, we can write down the isomorphisms between this sequence of positive rationals and in positive integers. Define \(N:\Bbb{Q}\to\Bbb{Z}\) as follows:

\[ N(p/q)=\left\{\begin{array}{ll} 1,&\text{if $p=q$}\\ 2 N(p/(q-p)),&\text{if $p\lt q$}\\ 2 N((p-q)/q)+1,&\text{if $p\gt q$} \end{array} \right. \]

Without going through a formal proof, what this does is simply count the number of steps taken to perform the Euclidean algorithm on \(p\) and \(q\). The extra factors of 2 ensure that rationals in level \(k\) have values between \(2^k\) and \(2^{k+1}\), and the final "\(+1\)" differentiates left and right children. This function assumes that \(p\) and \(q\) are relatively prime; that is, that the fraction \(p/q\) is in its lowest terms.

(The isomorphism in the other direction is given by \(k\mapsto b_k/b_{k+1}\) where \(b_k\) are the elements of Stern's diatomic sequence discussed above.)

This is just the sort of mathematics I like: simple, but surprising, and with depth. What's not to like?