Pretty much every speaker reads the words on their slides, as though the audience were illiterate. I went to a talk once which consisted of 60 – yes, sixty – slides of very dense text, and the presenter read through each one. I think people were gnawing their own limbs off out of sheer boredom by the end. Andy Warhol’s “Empire” would have been a welcome relief.

Since most of my talks are technical and full of mathematics, I have naturally gravitated to the LaTeX presentation tool Beamer. Now Beamer is a lovely thing for LaTeX: as part of the LaTeX ecosystem you get all of LaTeX loveliness along with elegant slide layouts, transitions, etc. My only issue with Beamer (and this is not a new observation by any means) is that all Beamer presentations have a certain sameness to them. I suspect that this is because most Beamer users are mathematicians, who are rightly more interested in content than appearance. It is quite possible of course to make Beamer look like something new and different, but hardly anybody does.

However, I am not a mathematician, I am a mathematics educator, and I do like my presentations to look good, and if possible to stand out a little. I also have a minor issue in that I use Linux on my laptop, which sometimes means my computer won’t talk to an external projector system. Or my USB thumb drive won’t be recognized by the computer I’ll be using, and so on. One way round all this is to use an online system; maybe one which can be displayed in a browser, and which can be placed on a web server somewhere. There are of course plenty of such tools, and I have had a brief dalliance with prezi, but for me prezi was not the answer: yes it was fun and provided a new paradigm for organizing slides, but really, when you took the whizz-bang aspect out, what was left? The few prezis I’ve seen in the wild showed that you can be as dull with prezi as with any other software. Also, at the time it didn’t support mathematics.

In fact I have an abiding distrust of the whole concept of “presentations”. Most are a colossal waste of time – people can read so there’s no need for wordiness, and most of the graphs and charts that make up the rest of most slides are dreary and lacklustre. Hardly anybody knows how to present information graphically in a way that really grabs people’s attention. It’s lazy and insulting to your audience to simply copy a chart from your spreadsheet and assume they’ll be delighted by it. Then you have the large class of people who fill their blank spaces with cute cartoons and clip art. This sort of thing annoys me probably more than it should – when I’m in an audience I don’t want to be entertained with cute irrelevant additions, I want to *learn*. This comes to the heart of presenting. A presenter is acting as a teacher; the audience the learners. So presenting should be about engaging the audience. What’s in your slides comes a distant second. I don’t want new technology with clever animations and transitions, bookmarks, non-linear slide shows; I want presenters to be themselves interesting. (As an aside, some of the very worst presentations have been at education conferences.)

For a superb example of attention-grabbing graphics, check out the TED talk by the late Hans Rosling. Or you can admire the work of David McCandless.

I seem to have digressed, from talking about presentation software to banging on about the awfulness of presentations generally. So, back to the topic.

For a recent conference I determined to do just that: use an online presentation tool, and I chose reveal.js. I reckon reveal.js is presentations done right: elegant, customizable, making the best use of html for content and css for design; and with nicely chosen defaults so that even if you just put a few words on your slides the result will still look good. Even better, you can take your final slides and put them up on github pages so that you can access them from anywhere in the world with a web browser. And if you’re going somewhere which is not networked, you can always take your slides on some sort of portable media. And it has access to almost all of LaTeX via MathJax.

One minor problem with reveal.js is that the slides are built up with raw html code, and so can be somewhat verbose and hard to read (at least for me). However, there is a companion package for emacs org mode called org-reveal, which enables you to structure your reveal.js presentation as an org file. This is presentation heaven. The org file gives you structure, and reveal.js gives you a lovely presentation.
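For a flavour of what this looks like (a minimal sketch of my own, not one of my actual talks; the `#+REVEAL_ROOT` URL is just one possible CDN location), an org-reveal presentation is simply an org file with one top-level heading per slide:

```org
#+Title: Example talk
#+REVEAL_ROOT: https://cdn.jsdelivr.net/npm/reveal.js

* A first slide
  Some introductory text.

* Some mathematics
  Displayed with MathJax: \(e^{i\pi} + 1 = 0\).
```

Exporting this through the org export dispatcher produces a standalone reveal.js HTML file.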

To make it available, you upload all your presentations to GitHub Pages, and you can present from anywhere in the world with an internet connection! You can see an example of one of my short presentations at

https://amca01.github.io/ATCM_talks/lindenmayer.html

Of course the presentation (the software and what you do with it), is in fact the least part of your talk. By far the most important part is the presenter. The best software in the world won’t overcome a boring speaker who can’t engage an audience.

I like my presentations to be simple and effect-free; I don’t want the audience to be distracted from my leaping and capering about.

As you know, the Vigenère cipher works using a plaintext and a keyword, which is repeated as often as need be:

T H I S I S T H E P L A I N T E X T

K E Y K E Y K E Y K E Y K E Y K E Y

The corresponding letters are added modulo 26 (using the values A=0, B=1, C=2, and on up to Z=25), then converted back to letters again. So for the example above, we have these corresponding values:

19 7 8 18 8 18 19 7 4 15 11 0 8 13 19 4 23 19

10 4 24 10 4 24 10 4 24 10 4 24 10 4 24 10 4 24

Adding modulo 26 and converting back to letters:

3 11 6 2 12 16 3 11 2 25 15 24 18 17 17 14 1 17

D L G C M Q D L C Z P Y S R R O B R

gives us the ciphertext.

The Vigenère cipher is historically important as it is one of the first cryptosystems where a single letter may be encrypted to different characters in the ciphertext. For example, the two “S”s are encrypted to “C” and “Q”; the first and last “T”s are encrypted to “D” and “R”. For this reason the cipher was considered unbreakable – as indeed it was for a long time – and was known to the French as *le chiffre indéchiffrable* – the unbreakable cipher. It was broken in 1863. See the Wikipedia page for more history.

Suppose the length of the keyword is $m$. Then the $i$-th character of the plaintext will correspond to the $(i \bmod m)$-th character of the keyword (assuming a zero-based indexing). Thus the encryption can be defined as

$$c_i = (p_i + k_{i \bmod m}) \bmod 26.$$
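A quick sketch of this definition in Python (my own illustration, not part of the post; the function name is arbitrary):

```python
def vigenere(plain, key):
    # c_i = (p_i + k_(i mod m)) mod 26, with A = 0, ..., Z = 25
    m = len(key)
    return "".join(
        chr(ord("A") + (ord(p) + ord(key[i % m]) - 2 * ord("A")) % 26)
        for i, p in enumerate(plain)
    )

print(vigenere("THISISTHEPLAINTEXT", "KEY"))  # DLGCMQDLCZPYSRROBR
```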

However, encryption can also be done without knowing the length of the keyword, but by shifting the keyword each time – first letter to the end – and simply taking the left-most letter. Like this:

T H I S I S T H E P L A I N T E X T

K E Y

so “T”+”K” (modulo 26) is the first encryption. Then we shift the keyword:

T H I S I S T H E P L A I N T E X T

E Y K

and “H”+”E” (modulo 26) is the second encrypted letter. Shift again:

T H I S I S T H E P L A I N T E X T

Y K E

for “I”+”Y”; shift again:

T H I S I S T H E P L A I N T E X T

K E Y

for “S”+”K”. And so on.

This is almost trivial in Haskell. We need two extra functions from the module `Data.Char`: `chr`, which gives the character corresponding to an ASCII value, and `ord`, which gives the ASCII value of a character:

λ> ord 'G'
71
λ> chr 88
'X'

So here’s what might go into a little file called `vigenere.hs`:

import Data.Char (ord,chr)

vige :: [Char] -> [Char] -> [Char]
vige [] k = []
vige p [] = []
vige (p:ps) (k:ks) = (encode p k):(vige ps (ks++[k]))
  where
    encode a b = chr $ 65 + mod (ord a + ord b) 26

vigd :: [Char] -> [Char] -> [Char]
vigd [] k = []
vigd p [] = []
vigd (p:ps) (k:ks) = (decode p k):(vigd ps (ks++[k]))
  where
    decode a b = chr $ 65 + mod (ord a - ord b) 26


And a couple of tests: the example from above, and the one on the Wikipedia page:

λ> vige "THISISTHEPLAINTEXT" "KEY"
"DLGCMQDLCZPYSRROBR"
λ> vige "ATTACKATDAWN" "LEMON"
"LXFOPVEFRNHR"

In Australia, the voting method used for almost all lower house elections (state and federal) is Instant Runoff Voting, also known as the Alternative Vote, and known locally as the “preferential method”. Each voter must number the candidates sequentially starting from 1. All boxes must be filled in (except the last); no numbers can be repeated or missed. In Northcote there were 12 candidates, and so each voter had to number the boxes from 1 to 12 (or 1 to 11); any vote without those numbers is invalid and can’t be counted. Such votes are known as “informal”.

Ballots are distributed according to first preferences. If no candidate has obtained an absolute majority, then the candidate with the lowest count is eliminated, and all those ballots are distributed according to their second preferences. This continues through as many redistributions as necessary until one candidate ends up with an absolute majority of ballots. So at any stage the candidate with the lowest number of ballots is eliminated, and those ballots are redistributed to the remaining candidates on the basis of the highest preferences.

As voting systems go it’s not the worst, although it has many faults. However, it is too entrenched in Australian political life for change to be likely.

Each candidate had prepared a How to Vote card, listing the order of candidates they saw as being most likely to ensure a good result for themselves. In fact there is no requirement for any voter to follow a How to Vote card, but most voters do. For this reason the ordering of candidates on these cards is taken very seriously, and one of the less savoury aspects of Australian politics is backroom “preference deals”, where parties will wheel and deal to ensure best possible preference positions on other How to Vote cards.

Here are the 12 candidates and their political parties, in the order as listed on the ballots:

Candidate | Party |
---|---|
Russell HAYWARD | Independent |
Brian SANAGHAN | Independent |
Lidia THORPE | The Greens |
Nina LENK | Animal Justice Party |
Laura CHIPP | Reason Party |
Philip COOPER | Independent |
Dean ROSSITER | Liberal Democrats |
Clare BURNS | Labor |
Joseph TOSCANO | Independent |
Bryony EDWARDS | Independent |
Nevena SPIROVSKA | Independent |
Vince FONTANA | Independent |

For this election the How to Vote cards can be seen at the ABC news site. The only candidate not to provide a full ordered list was Joseph Toscano, who simply advised people to number his square 1 and the other squares in any order they liked, along with a recommendation for people to number Lidia Thorpe 2.

As I don’t have a complete list of all possible ballots with their orderings and numbers, I’m going to make the following assumptions:

- Every voter followed the How to Vote card of their preferred candidate exactly.
- Joseph Toscano’s preference ordering is: 3,4,2,5,6,7,8,9,1,10,11,12 (This gives Toscano 1; Thorpe 2; and puts the numbers 3 – 12 in order in the remaining spaces).

These assumptions are necessarily crude, and don’t reflect the nuances of the election. But as we’ll see they end up providing a remarkably close fit with the final results.

For the exploration of the voting data I’ll use Python, and so here is all the How to Vote information as a dictionary:

In [ ]: htv = dict()
   ...: htv['Hayward']   = [1,10,7,6,8,5,12,11,3,2,4,9]
   ...: htv['Sanaghan']  = [3,1,2,5,6,7,8,9,10,11,12,4]
   ...: htv['Thorpe']    = [6,9,1,3,10,8,12,2,7,4,5,11]
   ...: htv['Lenk']      = [7,8,3,1,5,11,12,2,9,4,6,10]
   ...: htv['Chipp']     = [10,12,4,5,1,6,7,3,11,9,2,8]
   ...: htv['Cooper']    = [5,12,8,6,2,1,7,3,11,9,10,4]
   ...: htv['Rossiter']  = [6,12,9,11,2,7,1,5,8,10,3,4]
   ...: htv['Burns']     = [10,12,5,3,2,4,6,1,11,9,8,7]
   ...: htv['Toscano']   = [3,4,2,5,6,7,8,9,1,10,11,12]
   ...: htv['Edwards']   = [2,10,4,3,8,9,12,6,5,1,7,11]
   ...: htv['Spirovska'] = [2,12,3,7,4,5,6,8,10,9,1,11]
   ...: htv['Fontana']   = [2,3,4,5,6,7,8,9,10,11,12,1]

In [ ]: cands = list(htv.keys())

Voting took place at different voting centres (also known as “booths”), and the first preferences for each candidate at each booth can be found at the Victorian Electoral Commission. I copied this information into a spreadsheet and saved it as a CSV file. I then used the data analysis library pandas to read it in as a DataFrame:

In [ ]: import pandas as pd
   ...: firstprefs = pd.read_csv('northcote_results.csv')
   ...: firsts = firstprefs.loc[:,'Hayward':'Fontana'].sum(axis=0)
   ...: firsts
Out[ ]: Hayward        354
        Sanaghan       208
        Thorpe       16254
        Lenk           770
        Chipp         1149
        Cooper         433
        Rossiter      1493
        Burns        12721
        Toscano        329
        Edwards        154
        Spirovska      214
        Fontana       1857
        dtype: int64

As Thorpe has more votes than any other candidate, then by the voting system of simple plurality (or First Past The Post) she would win. This system is used in the USA, and is possibly the worst of all systems for more than two candidates.

So let’s first check how IRV works, with a little program that starts with a dictionary of the first preferences for each candidate. Recall our simplifying assumption that all voters vote according to the How to Vote cards, which means that when a candidate is eliminated, all those votes will go to just one other remaining candidate. In practice, of course, those ballots would be redistributed across a number of candidates.

Here’s a simple program to manage this version of IRV:

def IRV(votes):
    # performs an IRV simulation on a list of first preferences: at each stage
    # deleting the candidate with the lowest current score, and distributing
    # that candidate's votes to the highest remaining candidate
    vote_counts = votes.copy()
    for i in range(10):
        m = min(vote_counts.items(), key = lambda x: x[1])
        ind = next(j for j in range(2,13) if cands[htv[m[0]].index(j)] in vote_counts)
        c = cands[htv[m[0]].index(ind)]
        vote_counts[c] += m[1]
        del(vote_counts[m[0]])
    return(vote_counts)

We could make this code a little more efficient by stopping when any candidate has amassed over 50% of the votes. But for simplicity we’ll eliminate 10 of the 12 candidates, so it will be perfectly clear who has won. Let’s try it out:

In [ ]: IRV(firsts)
Out[ ]: Thorpe    18648
        Burns     17288
        dtype: int64

Note that this is very close to the results listed on the VEC site:

Thorpe:  18380
Burns:   14410
Fontana:  3298

At this stage it doesn’t matter where Fontana’s votes go (in fact they would go to Burns), as Thorpe already has a majority. But the result we obtained above with our simplifying assumptions gives very similar values.
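As an aside, the early-stopping variant mentioned above might be sketched like this (standalone code of mine, using a made-up three-candidate contest rather than the Northcote data; `irv_early` and its card format are my own):

```python
def irv_early(first_prefs, cards):
    # first_prefs: {candidate: first-preference votes}
    # cards: {candidate: that candidate's How to Vote ordering, best first}
    counts = dict(first_prefs)
    majority = sum(counts.values()) // 2 + 1
    while max(counts.values()) < majority:
        loser = min(counts, key=counts.get)
        # pass the loser's ballots to the highest remaining candidate on their card
        heir = next(c for c in cards[loser] if c != loser and c in counts)
        counts[heir] += counts.pop(loser)
    return counts

# made-up contest: C is eliminated, and C's card sends those ballots to B
cards = {'A': ['A', 'B', 'C'], 'B': ['B', 'C', 'A'], 'C': ['C', 'B', 'A']}
print(irv_early({'A': 40, 'B': 35, 'C': 25}, cards))  # {'A': 40, 'B': 60}
```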

Now let’s see what happens if we work through each booth independently:

In [ ]: finals = {'Thorpe':0,'Burns':0}

In [ ]: for i in firstprefs.index:
   ...:     booth = dict(firstprefs.loc[i,'Hayward':'Fontana'])
   ...:     f = IRV(booth)
   ...:     finals['Thorpe'] += f['Thorpe']
   ...:     finals['Burns'] += f['Burns']
   ...:     print(firstprefs.loc[i,'Booth'],': ',f)
   ...:
Alphington :  {'Thorpe': 524, 'Burns': 545}
Alphington North :  {'Thorpe': 408, 'Burns': 485}
Bell :  {'Thorpe': 1263, 'Burns': 893}
Croxton :  {'Thorpe': 950, 'Burns': 668}
Darebin Parklands :  {'Thorpe': 180, 'Burns': 204}
Fairfield :  {'Thorpe': 925, 'Burns': 742}
Northcote :  {'Thorpe': 1043, 'Burns': 875}
Northcote North :  {'Thorpe': 1044, 'Burns': 1012}
Northcote South :  {'Thorpe': 1392, 'Burns': 1137}
Preston South :  {'Thorpe': 677, 'Burns': 639}
Thornbury :  {'Thorpe': 1158, 'Burns': 864}
Thornbury East :  {'Thorpe': 1052, 'Burns': 804}
Thornbury South :  {'Thorpe': 1310, 'Burns': 1052}
Westgarth :  {'Thorpe': 969, 'Burns': 536}
Postal Votes :  {'Thorpe': 1509, 'Burns': 2262}
Early Votes :  {'Thorpe': 5282, 'Burns': 3532}

In [ ]: finals
Out[ ]: {'Burns': 16250, 'Thorpe': 19686}

Note again that the results are surprisingly close to the “two-party preferred” results as reported again on the VEC site. This adds weight to the notion that our assumptions, although crude, do in fact provide a reasonable way of experimenting with the election results.

These are named for Jean Charles de Borda (1733 – 1799), an early voting theorist. The idea is to weight all the preferences, so that a preference of 1 has a higher weighting than a preference of 2, and so on. All the weights are added, and the candidate with the greatest total is deemed to be the winner. With $n$ candidates, there are different methods of determining the weighting; probably the most popular is a simple linear weighting, so that a preference of $i$ is weighted as $n-i$. This gives weightings from $n-1$ down to zero. Alternatively a weighting of $n-i+1$ can be used, which gives weights of $n$ down to 1. Both are equivalent in determining a winner. Another possible weighting is $1/i$.
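As a quick illustration of that equivalence (my own sketch, with made-up four-candidate ballots rather than the election data):

```python
def borda_rank(ballots, weight):
    # ballots: list of (count, ranking best-first); weight(i, n) scores 1-based position i
    scores = {}
    for count, ranking in ballots:
        n = len(ranking)
        for i, cand in enumerate(ranking, start=1):
            scores[cand] = scores.get(cand, 0) + count * weight(i, n)
    return sorted(scores, key=scores.get, reverse=True)

ballots = [(5, ['A', 'B', 'C', 'D']),
           (4, ['B', 'C', 'D', 'A']),
           (2, ['C', 'D', 'B', 'A'])]
print(borda_rank(ballots, lambda i, n: n - i))      # ['B', 'C', 'A', 'D']
print(borda_rank(ballots, lambda i, n: n - i + 1))  # the same ranking
```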

Here’s a program to compute Borda counts, again with our simplification:

def borda(x):
    # x is 0 or 1, selecting the 1/i or the linear weighting
    borda_count = dict()
    for c in cands:
        borda_count[c] = 0.0
    for c in cands:
        v = firsts[c]   # number of 1st pref votes for candidate c
        for i in range(1,13):
            appr = cands[htv[c].index(i)]   # the candidate against position i on c's htv card
            if x==0:
                borda_count[appr] += v/i
            else:
                borda_count[appr] += v*(11-i)
    if x==0:
        for k, val in borda_count.items():
            borda_count[k] = float("{:.2f}".format(val))
    else:
        for k, val in borda_count.items():
            borda_count[k] = int(val)
    return(borda_count)

Now we can run this, and to make our lives easier we’ll sort the results:

In [ ]: sorted(borda(1).items(), key = lambda x: x[1], reverse = True)
Out[ ]: [('Burns', 308240),
         ('Thorpe', 279392),
         ('Lenk', 266781),
         ('Chipp', 179179),
         ('Cooper', 167148),
         ('Spirovska', 165424),
         ('Edwards', 154750),
         ('Hayward', 136144),
         ('Fontana', 88988),
         ('Toscano', 80360),
         ('Rossiter', 75583),
         ('Sanaghan', 38555)]

In [ ]: sorted(borda(0).items(), key = lambda x: x[1], reverse = True)
Out[ ]: [('Burns', 22409.53),
         ('Thorpe', 20455.29),
         ('Lenk', 11485.73),
         ('Chipp', 10767.9),
         ('Spirovska', 6611.22),
         ('Cooper', 6592.5),
         ('Edwards', 6569.93),
         ('Hayward', 6186.93),
         ('Fontana', 6006.25),
         ('Rossiter', 5635.08),
         ('Toscano', 4600.15),
         ('Sanaghan', 4196.47)]

Note that in both cases Burns has the highest total, even though Thorpe won the election. This is a known feature of Borda counts: the candidate with the highest Borda score is not necessarily the one preferred overall by the electorate. For this reason Borda counts are rarely used in modern systems, although they can be used to give a general picture of an electorate.

There are a vast number of voting systems which treat the vote as simultaneous pairwise contests. For example, in a three-way contest between Alice, Bob, and Charlie, the system considers the contest between Alice and Bob, between Alice and Charlie, and between Bob and Charlie. Each of these contests will produce a winner, and the outcome of all the pairwise contests is used to determine the overall winner. If there is a single person who is preferred, by a majority of voters, in each of their pairwise contests, then that person is called a *Condorcet winner*. This is named for the Marquis de Condorcet (1743 – 1794), another early voting theorist. The *Condorcet criterion* is one of many criteria considered appropriate for a voting system; it says that if the ballots return a Condorcet winner, then that winner should be chosen by the system. This is one of the faults of IRV: that it does not necessarily return a Condorcet winner.
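A small sketch of the pairwise idea (my own code, with a made-up Alice/Bob/Charlie electorate rather than the Northcote ballots):

```python
def condorcet_winner(ballots):
    # ballots: list of (count, ranking), each ranking listing all candidates best-first
    candidates = ballots[0][1]
    total = sum(count for count, _ in ballots)
    def prefers(a, b):
        # number of voters ranking a above b
        return sum(count for count, r in ballots if r.index(a) < r.index(b))
    for c in candidates:
        if all(prefers(c, d) * 2 > total for d in candidates if d != c):
            return c
    return None  # the pairwise contests may be cyclic, with no Condorcet winner

# made-up electorate: Bob beats Alice 5-4 and Charlie 7-2 head-to-head
ballots = [(4, ['Alice', 'Bob', 'Charlie']),
           (3, ['Bob', 'Alice', 'Charlie']),
           (2, ['Charlie', 'Bob', 'Alice'])]
print(condorcet_winner(ballots))  # Bob
```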

Let’s look again at the How to Vote preferences, and the numbers of voters of each:

In [ ]: htvd = pd.DataFrame(list(htv.values()),index=htv.keys(),columns=htv.keys()).transpose()

In [ ]: htvd.loc['Firsts']=list(firsts.values)

In [ ]: htvd
Out[ ]:
           Hayward  Sanaghan  Thorpe  Lenk  Chipp  Cooper  Rossiter  Burns  Toscano  Edwards  Spirovska  Fontana
Hayward          1         3       6     7     10       5         6     10        3        2          2        2
Sanaghan        10         1       9     8     12      12        12     12        4       10         12        3
Thorpe           7         2       1     3      4       8         9      5        2        4          3        4
Lenk             6         5       3     1      5       6        11      3        5        3          7        5
Chipp            8         6      10     5      1       2         2      2        6        8          4        6
Cooper           5         7       8    11      6       1         7      4        7        9          5        7
Rossiter        12         8      12    12      7       7         1      6        8       12          6        8
Burns           11         9       2     2      3       3         5      1        9        6          8        9
Toscano          3        10       7     9     11      11         8     11        1        5         10       10
Edwards          2        11       4     4      9       9        10      9       10        1          9       11
Spirovska        4        12       5     6      2      10         3      8       11        7          1       12
Fontana          9         4      11    10      8       4         4      7       12       11         11        1
Firsts         354       208   16254   770   1149     433      1493  12721      329      154        214     1857

Here the How to Vote information is in the columns. If we look at just the first two candidates, we see that Hayward is preferred to Sanaghan by all voters except for those who voted for Sanaghan. Thus a majority (in fact, nearly all) of voters preferred Hayward to Sanaghan.

For each pair of candidates, the number of voters preferring one to the other can be computed by this program:

def condorcet():
    condorcet_table = pd.DataFrame(columns=cands,index=cands).fillna(0)
    for c in cands:
        hc = htv[c]
        for i in range(12):
            for j in range(12):
                if hc[i] < hc[j]:
                    condorcet_table.loc[cands[i],cands[j]] += firsts[c]
    return(condorcet_table)

We can see the results of this program:

In [ ]: ct = condorcet(); ct
Out[ ]:
           Hayward  Sanaghan  Thorpe   Lenk  Chipp  Cooper  Rossiter  Burns  Toscano  Edwards  Spirovska  Fontana
Hayward          0     35728    4505   5042  19370   21633     20573   3116    35607     4888       3335    18283
Sanaghan       208         0    2065   2394  18648    3164     19926   2748     2835     2394       2394    17715
Thorpe       31431     33871       0  21504  20140   20935     34010  19370    33760    35428      32726    32153
Lenk         30894     33542   14432      0  19926   33442     34229   3886    33760    33935      32726    31945
Chipp        16566     17288   15796  16010      0   18895     34443   6037    18845    18404      18960    33871
Cooper       14303     32772   15001   2494  17041       0     34443   3395    18075    18404      15548    31608
Rossiter     15363     16010    1926   1707   1493    1493         0   4101    18075    18404      17041    15906
Burns        32820     33188   16566  32050  29899   32541     31835      0    35099    35428      32726    32024
Toscano        329     33101    2176   2176  17091   17861     17861    837        0     3887       2902    18075
Edwards      31048     33542     508   2001  17532   17532     17532    508    32049        0      20359    18075
Spirovska    32601     33542    3210   3210  16976   20388     18895   3210    33034    15577          0    20717
Fontana      17653     18221    3783   3991   2065    4328     20030   3912    17861    17861      15219        0

What we want to see, of course, is whether anybody has obtained a majority of preferences against everybody else. To do this we can find all the values greater than or equal to the majority, and add up their number for each candidate. A value of 11 indicates a Condorcet winner:

In [ ]: maj = firsts.sum()//2 + 1; maj
Out[ ]: 17969

In [ ]: ((ct >= maj)*1).sum(axis = 1)
Out[ ]: Hayward       6.0
        Sanaghan      2.0
        Thorpe       11.0
        Lenk          9.0
        Chipp         6.0
        Cooper        5.0
        Rossiter      2.0
        Burns        10.0
        Toscano       2.0
        Edwards       5.0
        Spirovska     6.0
        Fontana       2.0
        dtype: float64

So in this case we do indeed have a Condorcet winner in Thorpe, and this election (at least with our simplifying assumptions) is also one in which IRV returned the Condorcet winner.

If you go to rangevoting.org you’ll find a spirited defense of a system called *range voting*. To vote in such a system, each voter gives an “approval weight” for each candidate. For example, the voter may mark off a value between 0 and 10 against each candidate, indicating their level of approval. There is no requirement for a voter to mark candidates differently: a voter might give all candidates a value of 10, or of zero, or give one candidate 10 and all the others zero. One simplified version of range voting is approval voting, where the voter simply indicates as many or as few candidates as she or he approves of. A voter may approve of just one candidate, or all of them. As with range voting, the winner is the one with the maximum number of approvals. A system where each voter approves of just one candidate is the First Past the Post system, and as we have seen previously, this is equivalent to simply counting only the first preferences of our ballots.

We can’t possibly know how voters may have approved of the candidates, but we can run a simple simulation: given a number $n$ between 1 and 12, suppose that each voter approves of their first $n$ preferences. Given the preferences and numbers, we can easily tally the approvals for each candidate:

def approvals(n):
    # Determines the approvals result if voters took their
    # first n preferences as approvals
    approvals_result = dict()
    for c in cands:
        approvals_result[c] = 0
    firsts = firstprefs.loc[:,'Hayward':'Fontana'].sum(axis=0)
    for c in cands:
        v = firsts[c]   # number of 1st pref votes for candidate c
        for i in range(1,n+1):
            appr = cands[htv[c].index(i)]   # the candidate against position i on c's htv card
            approvals_result[appr] += v
    return(approvals_result)

Now we can see what happens with approvals for $n = 1, 2, \ldots, 6$:

In [ ]: for i in range(1,7):
   ...:     si = sorted(approvals(i).items(),key = lambda x: x[1],reverse=True)
   ...:     print([i]+[s[0] for s in si])
   ...:
[1, 'Thorpe', 'Burns', 'Fontana', 'Rossiter', 'Chipp', 'Lenk', 'Cooper', 'Hayward', 'Toscano', 'Spirovska', 'Sanaghan', 'Edwards']
[2, 'Burns', 'Thorpe', 'Chipp', 'Hayward', 'Fontana', 'Rossiter', 'Spirovska', 'Lenk', 'Edwards', 'Cooper', 'Toscano', 'Sanaghan']
[3, 'Burns', 'Lenk', 'Thorpe', 'Chipp', 'Hayward', 'Spirovska', 'Sanaghan', 'Fontana', 'Rossiter', 'Toscano', 'Edwards', 'Cooper']
[4, 'Burns', 'Lenk', 'Thorpe', 'Edwards', 'Chipp', 'Cooper', 'Fontana', 'Spirovska', 'Hayward', 'Sanaghan', 'Rossiter', 'Toscano']
[5, 'Thorpe', 'Lenk', 'Burns', 'Spirovska', 'Edwards', 'Chipp', 'Cooper', 'Fontana', 'Hayward', 'Sanaghan', 'Rossiter', 'Toscano']
[6, 'Lenk', 'Thorpe', 'Burns', 'Hayward', 'Spirovska', 'Chipp', 'Edwards', 'Cooper', 'Rossiter', 'Fontana', 'Sanaghan', 'Toscano']

It’s remarkable that, after $n = 1$, the first number of approvals for which Thorpe again comes out on top is $n = 5$.

There are of course many, many other methods of selecting a winning candidate from ordered ballots, and each of them has advantages and disadvantages. Some of the disadvantages are subtle (although important); others have glaring inadequacies, such as first past the post for more than two candidates. One such comparison table lists voting methods against standard criteria. Note that IRV – the Australian preferential system – is one of the very few methods to fail monotonicity. This is seen as one of the system’s worst failings. You can see an example of this in an old blog post.

Rather than write our own programs, we shall simply dump our information into the Ranked-ballot voting calculator page and see what happens. First the data needs to be massaged into an appropriate form:

In [ ]: for c in cands:
   ...:     st = str(firsts[c])+":"+c
   ...:     for i in range(2,13):
   ...:         st += ">"+cands[htv[c].index(i)]
   ...:     print(st)
   ...:
354:Hayward>Edwards>Toscano>Spirovska>Cooper>Lenk>Thorpe>Chipp>Fontana>Sanaghan>Burns>Rossiter
208:Sanaghan>Thorpe>Hayward>Fontana>Lenk>Chipp>Cooper>Rossiter>Burns>Toscano>Edwards>Spirovska
16254:Thorpe>Burns>Lenk>Edwards>Spirovska>Hayward>Toscano>Cooper>Sanaghan>Chipp>Fontana>Rossiter
770:Lenk>Burns>Thorpe>Edwards>Chipp>Spirovska>Hayward>Sanaghan>Toscano>Fontana>Cooper>Rossiter
1149:Chipp>Spirovska>Burns>Thorpe>Lenk>Cooper>Rossiter>Fontana>Edwards>Hayward>Toscano>Sanaghan
433:Cooper>Chipp>Burns>Fontana>Hayward>Lenk>Rossiter>Thorpe>Edwards>Spirovska>Toscano>Sanaghan
1493:Rossiter>Chipp>Spirovska>Fontana>Burns>Hayward>Cooper>Toscano>Thorpe>Edwards>Lenk>Sanaghan
12721:Burns>Chipp>Lenk>Cooper>Thorpe>Rossiter>Fontana>Spirovska>Edwards>Hayward>Toscano>Sanaghan
329:Toscano>Thorpe>Hayward>Sanaghan>Lenk>Chipp>Cooper>Rossiter>Burns>Edwards>Spirovska>Fontana
154:Edwards>Hayward>Lenk>Thorpe>Toscano>Burns>Spirovska>Chipp>Cooper>Sanaghan>Fontana>Rossiter
214:Spirovska>Hayward>Thorpe>Chipp>Cooper>Rossiter>Lenk>Burns>Edwards>Toscano>Fontana>Sanaghan
1857:Fontana>Hayward>Sanaghan>Thorpe>Lenk>Chipp>Cooper>Rossiter>Burns>Toscano>Edwards>Spirovska

The above can be copied and pasted into the given text box. Then the page returns:

winner | method(s) |
---|---|
Burns | Borda, Bucklin |
Thorpe | Baldwin, Black, Carey, Coombs, Copeland, Dodgson, Hare, Nanson, Raynaud, Schulze, Simpson, Small, Tideman* |

You can see that Thorpe would be the winner under almost every other voting system. This indicates that Thorpe’s being returned by IRV is not just an artifact of the system, but represents the genuine wishes of the electorate.

I then discovered that there are lots of different CAD “programming languages”; or more properly scripting languages, where the user describes how the figure is to be built in the system’s language. Then the system builds it from the script. In this sense these systems are descendants of the venerable VRML, of which you can see some examples here, and its modern version X3D.

Some of the systems that I looked at were:

- OpenSCAD, which uses its own scripting language
- OpenJSCAD, based on JavaScript
- implicitCAD, based on Haskell.

No doubt there are others. All of these systems have primitive shapes (spheres, cubes, cylinders etc) and operations on shapes (shifting, stretching, rotating, extruding etc), so a vast array of different forms can be generated. Some systems allow for a great deal of flexibility, so that a cylinder with a radius of zero at one end will be a cone, or with different radii at each end a frustum.

I ended up choosing OpenJSCAD, which is being actively developed, is based on a well known and robust language, and is also great fun to use. Here is a simple example, to construct a tetrahedron whose vertices are chosen from the vertices of a cube with vertices $(\pm 1, \pm 1, \pm 1)$. The four vertices whose coordinates have product 1 will be the vertices of a regular tetrahedron. We can make a nice tetrahedral shape by putting a small sphere at each vertex, and joining the spheres by cylinders of the same radius:

// vertices of tetrahedron at (1,1,1), (1,-1,-1), (-1,1,-1), (-1,-1,1)
var rad = 0.1; // radius of sphere at vertex and cylinders
var v0 = [1,1,1];
var v1 = [1,-1,-1];
var v2 = [-1,1,-1];
var v3 = [-1,-1,1];
var vertices = [v0,v1,v2,v3];
// adjacency lists:
var adj = [[1,2,3],[0,2,3],[0,1,3],[0,1,2]];

function main() {
    var t = [];
    for(var i = 0; i < 4; i++) { // loop through the list of vertices
        var here = vertices[i];
        t.push(translate(here,sphere({r:rad})));
        for(var j = 0; j < 3; j++) { // join each vertex to the others in its adjacency list
            var there = vertices[adj[i][j]];
            t.push(cylinder({start:here,end:there,r:rad}));
        }
    }
    return union(t);
}

The code should be fairly self-explanatory. And here is the tetrahedron:

I won’t put these models in this post, as one of them is slow to render: but look at a coloured tetrahedron, and an icosahedron.
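As a quick cross-check of the vertex choice (a Python snippet of mine, nothing to do with the OpenJSCAD script): the four cube vertices whose coordinates multiply to 1 are mutually equidistant, so they really do form a regular tetrahedron:

```python
from itertools import product, combinations
import math

# the cube vertices (±1, ±1, ±1) whose coordinate product is 1
verts = [v for v in product([1, -1], repeat=3) if v[0] * v[1] * v[2] == 1]
# the six pairwise distances, rounded; a regular tetrahedron gives a single value
dists = {round(math.dist(a, b), 6) for a, b in combinations(verts, 2)}
print(verts)   # [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
print(dists)   # a single edge length, 2*sqrt(2)
```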

Note that CAD design of this sort is not so much for animated media as for precise designs for 3D printing. But I like it for exploring 3D geometry.

Suppose we have a list of incomes, and the number of people earning that income (where we may assume that each number here refers to thousands), for example:

Population   Income
----------   ------
       741        0
       381      200
       692      400
       778      600
        20      800
       662     1000
       228     1200
       796     1400
       221     1600
        51     1800
       361     2000

The only restriction is that the incomes must be listed in increasing order. Start by forming the cumulative sums of both incomes and populations, and scaling each sum to between 0 and 1:

Scaled cumulative population   Scaled cumulative income
----------------------------   ------------------------
0.15027                        0.00000
0.22754                        0.01818
0.36788                        0.05455
0.52565                        0.10909
0.52971                        0.18182
0.66396                        0.27273
0.71020                        0.38182
0.87163                        0.50909
0.91645                        0.65455
0.92679                        0.81818
1.00000                        1.00000

The right column is now a fraction of the total income earned, and the left column is the fraction of the population who earn up to that income.

What we have now is the fraction of population which earns a fraction of the total income. Plot income against population; the result will be a convex curve known as a *Lorenz curve*.

If income is perfectly equal, then for any fraction between 0 and 1, that fraction of the population will earn that fraction of total income, and the Lorenz curve will be the straight line $y = x$. The Gini coefficient is defined to be the fraction

$$G = \frac{A}{A+B}$$

where $A$ is the area between the line $y = x$ and the Lorenz curve, and $B$ is the area under the Lorenz curve. Since $A + B = 1/2$, this is more simply

$$G = 2A = 1 - 2B.$$

Thus the larger the Gini coefficient, the more unequal the income. A population which enjoys perfectly equal incomes will have a Gini coefficient of zero.

Given a discrete list of scaled cumulative sums of incomes and population, the integral of the Lorenz curve can be approximated by trapezoidal sums. If the cumulative income values are

$$y_0, y_1, \ldots, y_n$$

and the cumulative population values are

$$x_0, x_1, \ldots, x_n,$$

then the area of the trapezoid between the values of $x_i$ and $x_{i+1}$ is

$$\frac{1}{2}(x_{i+1} - x_i)(y_i + y_{i+1}).$$

Thus the Gini coefficient can be computed as

$$G = 1 - \sum_{i=0}^{n-1}(x_{i+1} - x_i)(y_i + y_{i+1})$$

where the sum is twice the area of trapezoids which form the area under the Lorenz curve.

This can be computed easily in Matlab or any other matrix-oriented language such as GNU Octave or Scilab. Suppose `cp` and `ci` are the lists from above corresponding to population and income. Then the Gini coefficient can be computed as:

```
1-sum(diff(cp).*(ci(1:end-1)+ci(2:end)))
ans = 0.52579
```
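The same computation is easy in Python with NumPy; here is a sketch using the example data from the tables above:

```python
import numpy as np

# Populations and incomes from the example table above.
pop = np.array([741, 381, 692, 778, 20, 662, 228, 796, 221, 51, 361], dtype=float)
inc = np.array([0, 200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2000], dtype=float)

# Scaled cumulative sums, as in the worked example.
cp = np.cumsum(pop) / pop.sum()
ci = np.cumsum(inc) / inc.sum()

# Gini coefficient via trapezoidal sums: G = 1 - sum (x[i+1]-x[i])(y[i]+y[i+1]).
gini = 1 - np.sum(np.diff(cp) * (ci[:-1] + ci[1:]))
print(gini)  # about 0.52579
```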

We shall look at Australian incomes at ten-year intervals – 1995-1996, 2005-2006, and 2015-2016 – as obtained from the Australian Bureau of Statistics: http://www.abs.gov.au/ausstats/abs@.nsf/mf/6302.0

The raw data is given in the following table, where each value represents thousands of people earning that particular income:

```
Income    1995-1996   2005-2006   2015-2016
-------   ---------   ---------   ---------
0.00      28.70       19.80       0.00
99.00     57.00       67.10       92.40
199.00    58.20       36.30       48.30
299.00    502.10      195.00      87.80
399.00    393.10      601.90      159.20
499.00    511.70      272.60      598.90
599.00    363.30      456.10      359.40
699.00    340.60      402.20      401.60
799.00    315.00      328.70      398.70
899.00    291.00      312.80      341.40
999.00    307.50      274.00      317.80
1099.00   264.60      266.50      315.30
1199.00   244.30      282.40      255.60
1299.00   204.60      295.40      271.40
1399.00   239.70      262.50      282.50
1499.00   220.80      236.20      238.30
1599.00   211.70      250.70      236.60
1699.00   200.30      216.80      251.80
1799.00   190.10      202.50      228.60
1899.00   174.20      213.80      230.90
1999.00   185.50      204.60      221.00
2199.00   263.00      369.40      426.30
2399.00   214.90      367.60      369.00
2599.00   165.70      271.60      334.80
2799.00   144.40      251.50      293.50
2999.00   120.10      219.50      248.70
3499.00   161.40      338.70      547.80
3999.00   90.00       243.20      371.40
4999.00   79.40       217.90      487.60
5000.00   77.10       227.60      509.90
```

Suppose the data is read into GNU Octave as a matrix `H`. Then we can determine the scaled cumulative sums, and hence the Gini coefficients:

```
> tmp = H(:,1);
> inc = cumsum(tmp)/sum(tmp);
> tmp = H(:,2);
> pop1 = cumsum(tmp)/sum(tmp);
> tmp = H(:,3);
> pop2 = cumsum(tmp)/sum(tmp);
> tmp = H(:,4);
> pop3 = cumsum(tmp)/sum(tmp);
> gini1 = 1-sum(diff(pop1).*(inc(1:end-1)+inc(2:end)))
gini1 = 0.57696
> gini2 = 1-sum(diff(pop2).*(inc(1:end-1)+inc(2:end)))
gini2 = 0.42627
> gini3 = 1-sum(diff(pop3).*(inc(1:end-1)+inc(2:end)))
gini3 = 0.29479
```

This means we have:

Year | Gini coefficient
---|---
1995 – 1996 | 0.57696
2005 – 2006 | 0.42627
2015 – 2016 | 0.29479

which seems to indicate that inequality is in fact *decreasing*. We can also plot the Lorenz curves for each population set in turn, and for comparison also show the line of maximum income equality:

The Lorenz curves become closer to the straight line, which indicates a decrease in inequality over this sequence of measurements, at least as measured by the Gini coefficient.

One of the outcomes of this theorem is that any attempt to map the earth’s surface onto a plane – to create a map projection – will require some compromise. You have to give up at least one of shape, size, or angles. This hasn’t stopped cartographers trying for several thousand years to find the best compromise, and there are now many, many different projections.

Note: for simplicity I will speak of the earth as a sphere, even though it’s not: it’s flattened slightly north-south, and bulges slightly at the equator, making it an oblate spheroid. But it’s pretty *close* to a sphere: its flattening, the value of $f = (a-b)/a$ with $b$ and $a$ being the minor and major axes of an ellipse from a cross-section through the poles, is only about $1/298 \approx 0.0034$.

Standard projections, seen in atlases, on classroom walls, and on the web, include the ancient equirectangular projection, where the earth’s surface is flattened onto a cylinder, thus vastly distorting the regions away from the equator, and the Mercator projection, where as well as flattening onto a cylinder, it is also stretched upwards – so preserving angles. This makes the Mercator projection excellent for navigation, which is what it was designed for. Mercator himself realized that there were unavoidable distortions of shape and size, and that his map was not useful for depicting landmasses. By a curious quirk of fate, this is what it is now mostly used for!

Efforts to decrease the distortions near the poles include the Robinson and Winkel tripel projections (which look superficially similar) – the latter is now the projection of choice by the National Geographic Society.

All of those projections, and many others, project the earth’s surface onto a single unbroken map. Other projections split the map to reduce distortions. One of these is the Goode Homolosine projection, which has been described as “abominable”, and a “travesty”: not only are land masses distorted, but lines of longitude bend all over the place.

Here’s a picture taken from geoawesomeness.com showing some of these standard projections:

Although you can’t map the entire sphere onto a plane, you can map small bits of it with manageable distortions. So one approach to mapping the earth was to project the sphere onto a polyhedron, and then flatten the polyhedron. Buckminster Fuller had a go at this with his Dymaxion world map, using an icosahedron. The result is certainly excellent for reducing distortion, but has an ugly, jagged look to it:

Also, many of the landmasses are (unavoidably) in curious places and at odd angles: Australia and South America are at opposite ends of the map, as the oceans have been chopped into bits to preserve the landmasses. I don’t believe this map has ever got much love.

Although the icosahedron would seem to be the best choice of polyhedron because of its large number of faces (and no doubt this was Buckminster Fuller’s reasoning), the best results seem to have been obtained using an octahedron. Here is Waterman’s “butterfly projection”, first developed in 1996, which is in many ways a magnificent example of good cartography:

The green circles here are Tissot indicatrices: they show the local distortions by means of small circles. As you can see, the distortions are very small indeed. And of course you can break the world up into an octahedron in such a way as to minimize distortions over particular regions. You can see more projections at the map’s own page.

Most Waterman maps show Antarctica as a separate entity, and another approach was provided many years earlier, in 1909, by Bernard Cahill; Cahill’s map has been redeveloped, starting in 1975, by Gene Keyes, and Keyes’s own website is a treasure house of cartographic information, as well as stern critiques of many standard projections. (This is where you’ll find Goode’s homolosine projection soundly trashed.) Keyes’s version of Cahill’s map, the Cahill-Keyes projection, is as of now the best projection available:

Keyes has discussed Waterman’s map against his own, and provides various reasons why he believes the Cahill-Keyes projection is the better of the two. Remarkably, the landmasses are placed in positions and at angles not vastly different from the standard (Mercator, Robinson) projections we are all used to, so it doesn’t appear too strange. As with all maps, there are compromises: I’d love it if Australia and New Zealand were next to each other rather than at opposite edges. But that’s just the way the octahedron has been placed.

I believe that properly constructed polyhedral projections are the way to go, and of the several in existence, the Cahill-Keyes is the best. Even if you don’t care about the mathematics and the cartography, it simply looks terrific.

Gene Keyes very kindly responded to this post, and recommended – in view of my comments about the placements of Australia and New Zealand – that I check out another projection against a “starry night” background, available at http://www.genekeyes.com/DW-STARRY/C-K-DW-starry.html. However, the map’s designer – Duncan Webb – was dissatisfied with this map and asked that it not be published. So I won’t include the picture here, but invite you to view it on Gene Keyes’s site. But notice Australia and New Zealand in chummy proximity!

One of the more powerful uses of turtle graphics is for investigating Lindenmayer Systems, named for Aristid Lindenmayer, who developed them for modelling plant growth. (As an aside, note his first name: *Aristid*. At least once I’ve seen it misspelled “Astrid”.)

To start, we can draw a simple Y-shaped tree, with a trunk, and two branches at 45 degrees from the vertical, with these commands:

```
fd 100
rt 45
fd 100
bk 100
lt 90
fd 100
bk 100
rt 45
bk 100
```

which produces something like this:

Of course we would want to be able to change the size of the tree, so we could write the procedure

```
to tree :size
  fd :size
  rt 45
  fd :size
  bk :size
  lt 90
  fd :size
  bk :size
  rt 45
  bk :size
end
```

And now with the command `tree` we can draw Y-shaped trees of any size. We have carefully designed our tree so that the turtle ends up back where we started. This means we can replace branches of the tree with smaller versions of itself:

```
to tree2 :size
  fd :size
  rt 45
  tree :size/2
  lt 90
  tree :size/2
  rt 45
  bk :size
end
```

Then “`cs tree2 100`” produces this:

To replace all branches with smaller copies of the tree we can perform a recursion, stopping only when the branch size reaches a certain lower limit. Like this:

```
to tree3 :size
  if :size < 2 [stop]
  fd :size
  rt 45
  tree3 :size/2
  lt 90
  tree3 :size/2
  rt 45
  bk :size
end
```

If we now enter

cs bk 150 tree3 200

(with the "`bk 150`" simply to give the tree room to grow in our graphics window), we obtain:

Now this looks like no tree found in nature. But we can easily manipulate it. All we require is for our basic, fundamental, tree, to be so designed that the turtle ends up at the start. Then we can recursively plot the branches with smaller copies of itself.

For example, here is a new basic tree:

```
to tree :size
  fd :size
  rt 30
  fd :size/1.5
  bk :size/1.5
  lt 45
  fd :size
  bk :size
  rt 15
  bk :size
end
```

so that "`tree 100`" produces this:

and a recursive version (which I'll now call "ltree"):

```
to ltree :size
  if :size < 2 [stop]
  fd :size
  rt 30
  ltree :size/2
  lt 45
  ltree :size/1.5
  rt 15
  bk :size
end
```

so that "`cs bk 150 ltree 150`" produces:

Already we are getting some vague semblance of a "natural" plant. Here's an example I pinched from the Ruby pages linked to above, rewritten in Logo, and with colors and widths taken out:

```
to ltree2 :size
  if :size < 5 [stop]
  fd :size/3
  lt 30
  ltree2 :size*2/3
  rt 30
  fd :size/6
  rt 25
  ltree2 :size/2
  lt 25
  fd :size/3
  rt 25
  ltree2 :size/2
  lt 25
  bk :size*5/6
end
```

And here's the result of "`cs bk 150 ltree2 250`":

(I had to do this in the online Logo interpreter at http://www.calormen.com/jslogo/, as for some reason my ucblogo kept crashing with core dumps.) But it's remarkable that with a very simple procedure we can construct a picture which already looks more like a real plant than ever.

In fact, our plants aren't Lindenmayer systems themselves, but the output of such systems. Formally, a Lindenmayer System is a grammar, or set of rules, for creating recursively defined structures. Our initial plant, the highly unlife-like one, can be expressed in the L-systems language as

```
angle = 45
1 -> [1 1]
0 -> [1 [0] 0]
```

where 1 and 0 may be interpreted as drawing a line segment, and a line segment ending in a leaf, respectively. The brackets [ and ] may be interpreted as turning left and right (by 45 degrees) respectively.

This might seem a little opaque, and an equivalent method is given by Chris Jennings, who defines this tree as

```
angle = 45
axiom = FX
X -> S[-FX]+FX
```

Here F and X mean drawing the trunk and branches respectively, and S means "draw everything smaller from now on". And the term "axiom" simply means the starting value. Here it's the + and - which refer to the turns, and the brackets indicate pushing and popping objects off a stack which ensure that we maintain the current position. This means that the last line here, which is this system's "rule" can be read as:

- Make everything shorter, and
- draw a tree to the left
- draw a tree to the right

Using these symbols, another plant can be described (this is example 7 from https://en.wikipedia.org/wiki/L-system) as:

```
angle = 25
axiom = X
X -> F[-X][X]F[-X]+FX
F -> FF
```

In this example, the shortening is assumed. This can be turned into a Logo procedure as follows:

```
to ltree7 :size
  if :size < 5 [stop]
  fd :size
  lt 25
  ltree7 :size/2
  rt 25
  ltree7 :size/2
  fd :size/2
  lt 25
  ltree7 :size/2
  rt 25
  rt 25
  fd :size/2
  ltree7 :size/2
  bk :size/2
  lt 25
  bk :size/2
  bk :size
end
```

and when invoked with

cs pu bk 250 ht pd ltree7 200 st

produces this noble plant:

Remarkable that such simple instruction sets can yield results of such beauty! It's quite possible to add colors and thicknesses to these plants to make them even more lifelike. But you get the idea.
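Underneath it all, an L-system is just repeated string rewriting, and that part can be sketched in a few lines of Python (my own sketch, not taken from any of the pages linked above), using the rules of the Wikipedia example:

```python
def expand(axiom, rules, n):
    """Apply the production rules of an L-system n times to the axiom."""
    s = axiom
    for _ in range(n):
        # Symbols without a rule (here +, -, [ and ]) are copied unchanged.
        s = "".join(rules.get(ch, ch) for ch in s)
    return s

# Example 7 from the Wikipedia page on L-systems (ASCII "-" for the turn symbol).
rules = {"X": "F[-X][X]F[-X]+FX", "F": "FF"}
print(expand("X", rules, 1))  # F[-X][X]F[-X]+FX
```

A turtle interpreter then only has to walk the expanded string, treating F as a move, + and - as turns, and [ and ] as saving and restoring the turtle’s state.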

- If P contains all the vertices of G without repeats, and the end of P is adjacent to the start, then return TRUE.
- For all vertices v adjacent to the end of P and which are not contained in P, output the result of **isHamilton**(P + v, G).
- Return FALSE.

For a nice introduction to backtracking, see https://www.cis.upenn.edu/~matuszek/cit594-2012/Pages/backtracking.html.

We will enter our graphs as *adjacency lists*: this is a list of lists, where each member list consists of a vertex followed by all its adjacent vertices. For example, the adjacency list of the wheel graph on 6 vertices, with 1 as the central vertex and vertices 2 – 6 in clockwise order:

would look like this:

`[[1 2 3 4 5 6] [2 1 3 6] [3 1 2 4] [4 1 3 5] [5 1 4 6] [6 1 2 5]]`

We start with two helper programs, one which tests the adjacency of two vertices, and one which tests whether a vertex can be added to an existing path:

```
to adjacentp :vert1 :vert2 :graph
  output memberp :vert1 bf item :vert2 :graph
end

to allowable :vert :path :graph
  make "bool1 memberp :vert :path
  make "bool2 adjacentp :vert (last :path) :graph
  output (and (not :bool1) :bool2)
end
```

The Hamiltonian code now simply copies the above algorithm description:

```
to hamilton :path :graph
  print :path
  make "count :count + 1
  if equalp (count :path) (count :graph) ~
     [ifelse adjacentp (first :path) (last :path) :graph ~
        [output "TRUE] [output "FALSE]]
  foreach firsts :graph ~
     [if allowable # :path :graph ~
        [if hamilton lput # :path :graph [output "TRUE]]]
  output "FALSE
end
```

The `print :path` statement does exactly that; the idea is that when the program finishes its run with a positive result, the last path printed before output TRUE will be the cycle we want. Here’s an example:

? make "wheel6 [[1 2 3 4 5 6] [2 1 3 6] [3 1 2 4] [4 1 3 5] [5 1 4 6] [6 1 2 5]] ? show hamilton [1] :wheel6 1 1 2 1 2 3 1 2 3 4 1 2 3 4 5 1 2 3 4 5 6 TRUE

As an example of a non-Hamiltonian graph, consider this simple graph on 5 vertices:

```
? show hamilton [1] [[1 2 4 5] [2 1 3] [3 2 4 5] [4 1 3] [5 1 3]]
1
1 2
1 2 3
1 2 3 4
1 2 3 5
1 4
1 4 3
1 4 3 2
1 4 3 5
1 5
1 5 3
1 5 3 2
1 5 3 4
FALSE
```

Note how in this last example, the routine explored all possible paths before giving up and returning FALSE. This is a very inefficient program; there are ways to speed up such a program, some of which are listed in the references here. When I ran this program on the Barnette-Bosák-Lederberg graph, it took several hours, and 4,173,760 calls of the program before it returned FALSE.

Clearly there are many possible improvements which could be made: for example there should be a “calling” program for which you simply enter a graph, with the above program used as a driver. Also, printing out all paths as they are formed is unnecessary (although it provides insight into the backtracking approach). The `allowable` program fails if the path is empty: this is why in its current form the `hamilton` program needs a non-empty path at the start. We could also more conventionally enter a graph as a list of edges (for an undirected graph, each edge would be an unordered list of two elements – the vertices at its ends), and construct an adjacency list from that. But I was only interested in a proof of concept, and in how relatively easy it was to use Logo for it.
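For comparison, the same backtracking search is just as short in Python (a sketch of my own, not a translation endorsed by anyone; I use a dict of neighbour lists, which carries the same information as the adjacency lists above):

```python
def hamilton(path, graph):
    """Return True if path can be extended to a Hamiltonian cycle of graph.

    graph maps each vertex to the list of its neighbours; path is a
    non-empty list of distinct vertices, e.g. [1] to start the search.
    """
    if len(path) == len(graph):
        # All vertices used: we have a cycle if the two ends are adjacent.
        return path[0] in graph[path[-1]]
    for v in graph[path[-1]]:
        if v not in path and hamilton(path + [v], graph):
            return True
    return False

# The wheel graph on 6 vertices from above.
wheel6 = {1: [2, 3, 4, 5, 6], 2: [1, 3, 6], 3: [1, 2, 4],
          4: [1, 3, 5], 5: [1, 4, 6], 6: [1, 2, 5]}
print(hamilton([1], wheel6))  # True
```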

Logo was developed at MIT in 1967 (which makes it 50 this year) by the late Seymour Papert and associates. It was designed as a “language for learning” and it’s one of the unfair twists of fate that most people’s conception of Logo (if they have one at all), is that it’s programming “for kids”; not a serious language; not worth considering by adults. Such a view is very wrong. Logo was designed according to the principle of “low threshold, high ceiling” which means that even very young children can drive a turtle around with simple commands; more adept programmers can write high-level compilers.

In many ways Logo has been superseded by more colourful, whizz-bang environments such as Scratch (also from MIT), and certainly Scratch is more immediately exciting and interesting. But Scratch is more of a kids’ environment. I don’t know of anybody who would seriously suggest teaching tertiary computing with Scratch. But Logo, even as a beginner language, has had a very persuasive advocate in Brian Harvey, whose three volumes of “Computer Science, Logo Style” – still in print but also available as downloads from his website – show how Logo can be used to teach computing at a deep level.

Anyway, the apparent simplicity of Logo is one of its great strengths.

Logo is a dialect of Lisp, more by way of Scheme than Common Lisp, which should immediately give it some street cred. Although most people’s concept of Logo starts and finishes with turtle graphics, it has powerful text and list handling abilities, lots of control structures, and of course being a Lisp-like language means you can write your own control structures for it.

There are many versions of Logo, but probably the closest to a “standard” would be the version developed by Brian Harvey and his students at UC Berkeley, known as Berkeley Logo, or just as `ucblogo`.

Here’s a little example of Logo in action: an implementation of merge sort. As you may recall, this is a general purpose sorting algorithm that works by dividing the list into two roughly equal halves, recursively merge-sorting each half, and then merging the two sorted halves into one list. One of the many nice attributes of Logo is the ease of using recursion – so much so, in fact, that recursion is a more natural way of looping or repeating than standard for or while loops. So merge sort (or really, any other recursive procedure) is a doddle in Logo.

```
;; halve breaks up a list into two halves: [1 2 3 4 5] -> [[1 2] [3 4 5]]
to halve :list
  localmake "n count :list
  localmake "h int :n/2
  output list (cascade :h [lput item # :list ?] []) ~
              (cascade (:n-:h) [lput item (# + :h) :list ?] [])
end

;; merge joins two sorted lists
to merge :list1 :list2
  if emptyp :list1 [output :list2]
  if emptyp :list2 [output :list1]
  ifelse (first :list1) < (first :list2) ~
     [output se (first :list1) (merge (bf :list1) :list2)] ~
     [output se (first :list2) (merge :list1 (bf :list2))]
end

;; This is the mergesort procedure
to mergesort :list
  if emptyp :list [output []]
  if emptyp bf :list [output :list]   ; a single element is already sorted
  localmake "halves halve :list
  output merge (mergesort first :halves) (mergesort last :halves)
end
```

This is a pretty brain-dead implementation; all I’ve done really is copy down the definition. No doubt a better Logo programmer could use more of its clever commands to write a smaller, and no doubt faster, program. You can see another version at the Rosetta Code site; also check out their Logo version of Quicksort. And there are also many other algorithms implemented in Logo.
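For comparison, here is the same brain-dead transcription in Python (a sketch mirroring the Logo procedures above, not an attempt at an efficient sort):

```python
def halve(lst):
    """Break a list into two roughly equal halves."""
    h = len(lst) // 2
    return lst[:h], lst[h:]

def merge(a, b):
    """Merge two sorted lists into one sorted list."""
    if not a:
        return b
    if not b:
        return a
    if a[0] < b[0]:
        return [a[0]] + merge(a[1:], b)
    return [b[0]] + merge(a, b[1:])

def mergesort(lst):
    if len(lst) <= 1:   # empty and one-element lists are already sorted
        return lst
    left, right = halve(lst)
    return merge(mergesort(left), mergesort(right))

print(mergesort([5, 3, 8, 1, 9, 2]))  # [1, 2, 3, 5, 8, 9]
```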


As before, we invoke our numerical software:

```
>> N = 13; n = 1:N; a = 1./(n.^2); s = cumsum(a);
>> s.'
ans =
   1.000000000000000
   1.250000000000000
   1.361111111111111
   1.423611111111111
   1.463611111111111
   1.491388888888889
   1.511797052154195
   1.527422052154195
   1.539767731166541
   1.549767731166541
   1.558032193976458
   1.564976638420903
   1.570893798184216
>> atk = aitken(s);
>> atk(end,:).'
ans =
   1.570893798184216
   1.604976638420905
   1.623249742431601
   1.633102742772414
   1.638466686494426
   1.641588484246542
   1.641583981554060
```

We are a little better off here; the final cumulative sum is in error by about 0.074, and using Aitken’s process gives us a final value which is in error by about 0.003. But this is nowhere near as close as we achieved earlier. And this is because Aitken’s process doesn’t work for a sequence whose convergence is *logarithmic*, that is, one for which

$$\lim_{n\to\infty}\frac{s_{n+1} - L}{s_n - L} = 1$$

where $L$ is the limit of the sequence.

The transformation we shall use – Lubkin’s W-transformation – is named for Samuel Lubkin, who wrote extensively about series acceleration in the 1950s. In fact, his 1952 paper is available online. In the 1980s J. E. Drummond at the Australian National University took up Lubkin’s cudgels and further extended and developed his methods. Drummond noticed that Aitken’s and Lubkin’s processes were very closely linked.

First, Aitken’s process can be written as

$$A_n = \frac{\Delta(s_n/\Delta s_n)}{\Delta(1/\Delta s_n)}.$$

We can check this using any symbolic package, or working it out by hand:

$$\frac{\Delta(s_n/\Delta s_n)}{\Delta(1/\Delta s_n)} = \frac{\dfrac{s_{n+1}}{s_{n+2}-s_{n+1}} - \dfrac{s_n}{s_{n+1}-s_n}}{\dfrac{1}{s_{n+2}-s_{n+1}} - \dfrac{1}{s_{n+1}-s_n}}.$$

This last expression can be expanded to:

$$\frac{s_{n+1}(s_{n+1}-s_n) - s_n(s_{n+2}-s_{n+1})}{(s_{n+1}-s_n) - (s_{n+2}-s_{n+1})} = \frac{s_{n+1}^2 - s_n s_{n+2}}{2s_{n+1} - s_n - s_{n+2}}.$$

After some algebraic fiddlin’, we end up with

$$s_n - \frac{(s_{n+1}-s_n)^2}{s_{n+2} - 2s_{n+1} + s_n}$$

which is the initial formula for Aitken’s process.
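This classical formula is easy to check numerically; here is a small Python sketch (a single application of the formula, not the `aitken` routine used earlier, which builds a whole table of such transforms) applied to a geometric series, where Aitken’s process is exact:

```python
import numpy as np

def aitken_step(s):
    """One application of Aitken's process: s[n] - (Ds[n])^2 / (D^2 s[n])."""
    s = np.asarray(s, dtype=float)
    s0, s1, s2 = s[:-2], s[1:-1], s[2:]
    return s0 - (s1 - s0)**2 / (s2 - 2*s1 + s0)

# Partial sums of the geometric series 1 + 1/2 + 1/4 + ... -> 2.
s = np.cumsum(1.0 / 2**np.arange(8))
print(aitken_step(s))  # every entry is the limit, 2
```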

Lubkin’s process can be written as

$$W_n = \frac{\Delta^2(s_n/\Delta s_n)}{\Delta^2(1/\Delta s_n)}.$$

Given that for a sequence $s_n$ we have $\Delta s_n = s_{n+1} - s_n$, then $W_n$ can be expanded as

$$W_n = \frac{\dfrac{s_{n+2}}{\Delta s_{n+2}} - \dfrac{2s_{n+1}}{\Delta s_{n+1}} + \dfrac{s_n}{\Delta s_n}}{\dfrac{1}{\Delta s_{n+2}} - \dfrac{2}{\Delta s_{n+1}} + \dfrac{1}{\Delta s_n}}.$$

It turns out to be more numerically stable to write $W_n$ in the form

$$W_n = s_{n+1} + \frac{P_n}{Q_n},$$

of which the numerator of the fraction, on clearing the common denominator $\Delta s_n\,\Delta s_{n+1}\,\Delta s_{n+2}$ from the quotient of second differences above, is equal to $P_n = -\Delta s_n\,\Delta s_{n+1}\,\Delta^2 s_{n+1}$. A little bit of algebra shows that the denominator is equal to $Q_n = \Delta s_n\,\Delta s_{n+1} - 2\,\Delta s_n\,\Delta s_{n+2} + \Delta s_{n+1}\,\Delta s_{n+2}$. We can thus write the W-transformation as

$$W_n = s_{n+1} - \frac{\Delta s_n\,\Delta s_{n+1}\,\Delta^2 s_{n+1}}{\Delta s_n\,\Delta s_{n+1} - 2\,\Delta s_n\,\Delta s_{n+2} + \Delta s_{n+1}\,\Delta s_{n+2}}.$$

So let’s experiment with

```
>> N = 13; n = 1:N; a = 1./n.^2;
>> s = cumsum(a); s(end)
ans = 1.570893798184216
>> abs(s(end)-pi^2/6)
ans = 0.074040268664010
```

So far, not a particularly good approximation, as we’d expect. So we’ll try the W-transformation, in its original form as the quotient of two second differences:

```
>> s0 = s(1:N-3); s1 = s(2:N-2); s2 = s(3:N-1); s3 = s(4:N);
>> w = (s2./(s3-s2)-2*s1./(s2-s1)+s0./(s1-s0))./(1./(s3-s2)-2./(s2-s1)+1./(s1-s0));
>> w(end)
ans = 1.644837749532052
>> abs(w(end)-pi^2/6)
ans = 9.631731617454342e-05
```

which is a great improvement. We can do this again, simply by

>> s = w; N = N-3;

and repeating the above commands. The new values of the final result, and error, are:

```
>> w(end)
ans = 1.644933894309522
>> abs(w(end)-pi^2/6)
ans = 1.725387044348992e-07
```
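For readers without Octave or Matlab, the first W-step can be sketched in Python with NumPy, mirroring the session above:

```python
import numpy as np

N = 13
s = np.cumsum(1.0 / np.arange(1, N + 1)**2)   # partial sums of sum 1/n^2

# One application of the W-transformation as a quotient of second differences.
s0, s1, s2, s3 = s[:-3], s[1:-2], s[2:-1], s[3:]
w = (s2/(s3-s2) - 2*s1/(s2-s1) + s0/(s1-s0)) / \
    (1/(s3-s2) - 2/(s2-s1) + 1/(s1-s0))
print(abs(w[-1] - np.pi**2 / 6))  # about 9.6e-05, as in the session above
```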

Just as with Aitken’s process, we can whip up a little program to perform the W-transformation iteratively:

```
function out = lubkin(c)
% Applies Lubkin's W-process to a vector c, which we suppose to
% represent a sequence converging to some limit L
N = length(c);
M = N;                % length of current vector
s = reshape(c,N,1);   % ensures we are working with column vectors
out = s;
for i = 1:floor(N/3)
  s0 = s(1:M-3); s1 = s(2:M-2); s2 = s(3:M-1); s3 = s(4:M);
  t = s2./(s3-s2)-2*s1./(s2-s1)+s0./(s1-s0);
  t = t./(1./(s3-s2)-2./(s2-s1)+1./(s1-s0));
  tcol = zeros(N,1);
  tcol(3*i+1:N) = t;
  out = [out tcol];
  M = M-3;
  s = t;
end
```

Note that the generalization of these processes, using

$$\frac{\Delta^k(s_n/\Delta s_n)}{\Delta^k(1/\Delta s_n)}$$

for integers $k \ge 1$ (with $k = 1$ giving Aitken’s process and $k = 2$ Lubkin’s), has been explored by J. E. Drummond, and you can read his 1972 paper online here.
