[{"body":"","link":"https://numbersandshapes.net/posts/","section":"posts","tags":null,"title":"Posts"},{"body":"A self avoiding walk is a path through a graph where no vertex is visited more than once. One problem is to consider the cartesian lattice, and the paths from \\((0,0)\\) to \\((m,n)\\). If we only allow paths in the \u0026ldquo;positive direction\u0026rdquo;, so that from \\((i,j)\\) the path can only proceed to \\((i+1,j)\\) or \\((i,j+1)\\) we have what are sometimes called \u0026ldquo;staircase paths\u0026rdquo;:\nThe number of such paths can be easily shown to be\n\\[ \\dbinom{m+n}{n}. \\]\nSuppose that the number of ways of reaching \\((x,y)\\) is \\(N(x,y)\\). There is only one way to reach either of \\((i,0)\\) or \\((0,j)\\), that is \\(N(i,0) = N(0,j) = 1\\). Also, the number of ways of reaching \\((x,y)\\) is the sum of the numbers of ways of reaching \\((x-1,y)\\) and \\((x,y-1)\\); that is \\(N(x,y) = N(x-1,y)+N(x,y-1)\\).\nWe thus have the following numbers:\n\\[\\begin{array}{rrrrrrr} \\vdots \\\\ 1\u0026amp;6\u0026amp;21\u0026amp;56\u0026amp;126\u0026amp;252\\\\ 1\u0026amp;5\u0026amp;15\u0026amp;35\u0026amp;70\u0026amp;126\\\\ 1\u0026amp;4\u0026amp;10\u0026amp;20\u0026amp;35\u0026amp;56\\\\ 1\u0026amp;3\u0026amp;6\u0026amp;10\u0026amp;15\u0026amp;21\\\\ 1\u0026amp;2\u0026amp;3\u0026amp;4\u0026amp;5\u0026amp;6\\\\ 1\u0026amp;1\u0026amp;1\u0026amp;1\u0026amp;1\u0026amp;1\u0026amp;\\ldots \\end{array}\\]\nwhich are obviously the binomial coefficients, and with\n\\[ N(x,y) = \\dbinom{x+y}{x}. \\]\nSo for the \\(10\\times 10\\) grid above, the number of staircase walks is\n\\[ \\dbinom{20}{10}=184{,}756. 
\\]\nIf we allow walks that can go in any direction (with the only restriction being that they stay within the grid, and never visit a lattice point more than once), the number of self avoiding walks becomes very large.\nThere is in fact no known formula for the number of such paths on an \\(n\\times n\\) grid, but the numbers are given as sequence A007764 of the OEIS, and for a \\(10\\times 10\\) grid there are\n\\[ 1{,}568{,}758{,}030{,}464{,}750{,}013{,}214{,}100 \\]\nsuch paths. That\u0026rsquo;s about 1.6 septillion.\n","link":"https://numbersandshapes.net/posts/self_avoiding_walks/","section":"posts","tags":null,"title":"Self avoiding walks"},{"body":"","link":"https://numbersandshapes.net/","section":"","tags":null,"title":"What's all this, then?"},{"body":"The Asian Technology Conference in Mathematics is a friendly, genial conference which has been held almost every year since 1996. All the proceedings are freely available on that website, but there is no search method. If you want to find out if anybody has presented a paper about, say, teaching linear algebra with Python, you\u0026rsquo;re stuck.\nTo help, I created a database of all papers and authors since 1997. The 1996 proceedings only seem to exist as a PDF file for which each page is an image; to create a database of authors, affiliations, papers, titles and abstracts would require more copying by hand than I want. Maybe some day\u0026hellip;\nThis database has been put online using the excellent open-source self-hosted system Mathesar - named, you\u0026rsquo;ll be delighted to know, for Enrico Colantoni\u0026rsquo;s brilliant character in the utterly brilliant film Galaxy Quest.\nMy implementation is at https://mathesar.numbersandshapes.net, and if you want to play around you can log in with the username guest and password g.u.e.s.t.. 
This is currently off-line, but will be made available again soon (I hope).\nThe database has two tables: one consists of all papers (well, all presentations really), which includes papers, workshops, posters and panel sessions. There is one record for each paper. The other table lists all the authors: there is a record for every author/paper combination. So if a paper has four authors, there will be four records in this table.\nThere are many faults with this database:\nAs much of the information has been scraped from the ATCM website using the Python library BeautifulSoup, there is text which hasn\u0026rsquo;t survived the scraping. A single author might exist in different forms, depending on whether initials are used, or various accented letters. For example, \u0026ldquo;Nguyễn Ngọc Trường Sơn\u0026rdquo; might appear as \u0026ldquo;Ngọc Trường Sơn Nguyễn\u0026rdquo;, or without any of the accents as \u0026ldquo;Truong Son Ngoc Nguyen\u0026rdquo; or as \u0026ldquo;Truong Son N Nguyen\u0026rdquo; or as \u0026ldquo;T S N Nguyen\u0026rdquo;, and so on. This partly depends on the spelling in the proceedings. Ensuring that all authors appear with only one spelling of their name is nearly impossible - at least for me. This in turn makes it impossible to determine precisely the number of different authors and speakers. Some titles and abstracts include mathematics typeset with LaTeX. This generally becomes garbled by the time it makes it into the database. A single author might appear several times with a different affiliation, if they\u0026rsquo;ve changed jobs, or maybe were on a sabbatical. Thus \u0026ldquo;Wolfgang A. Mozart, University of Vienna\u0026rdquo;, and \u0026ldquo;Wolfgang A. Mozart, University of Salzburg\u0026rdquo;, will appear as 2 different authors. Then again, two different people might share the same name. Are \u0026ldquo;Li Chen, Normal University of Beijing\u0026rdquo;, and \u0026ldquo;Li Chen, University of Shanghai\u0026rdquo; different people or not? 
However, even with those caveats, we can obtain a pretty good idea of the conference statistics since 1997.\nSome numbers There have been 2027 papers published, by about 2013 different authors. (The exact number of authors will be a bit less than this, but I\u0026rsquo;m hoping not by too much.)\nThe country which has provided the largest number of different authors is China; the following table shows the top ten:\nCountry Number of authors China 244 Malaysia 228 Japan 201 USA 175 Philippines 149 Singapore 90 Australia 81 Taiwan 75 South Korea 72 India 69 The most published author has appeared 37 times. (I\u0026rsquo;ve got a measly 19). Authors have come from 75 different countries. By region:\nRegion Number of authors Asia 1233 Europe 204 North America 192 Oceania 101 Middle East 77 Africa 17 Independent States 12 The \u0026ldquo;Independent states\u0026rdquo; includes Russia and previous members of the USSR.\n","link":"https://numbersandshapes.net/posts/30_years_atcm/","section":"posts","tags":null,"title":"Thirty years of the ATCM conference"},{"body":"Recall from the previous post that the Weierstrass-Durand-Kerner (WDK) method finds all roots of a polynomial equation in a manner similar to Newton\u0026rsquo;s method, and so converges quadratically. The Aberth-Ehrlich method (more commonly called Aberth\u0026rsquo;s method), is similar, but converges cubically, and as we will see, is similar to Halley\u0026rsquo;s method for root finding.\nDerivation from Halley\u0026rsquo;s method Halley\u0026rsquo;s root finding method is sometimes called the most often rediscovered method in numerical analysis. It is an iterative method defined by\n\\[ x \\leftarrow x - \\frac{f(x)f\u0026rsquo;(x)}{f\u0026rsquo;(x)^2 - \\dfrac{1}{2}f(x)f\u0026rsquo;\u0026rsquo;(x)}. 
\\]\nSee the Wikipedia page for the derivation.\nWe can divide through by \\(f(x)f\u0026rsquo;(x)\\) to obtain the formulation\n\\[ x \\leftarrow x - \\frac{1}{\\dfrac{f\u0026rsquo;(x)}{f(x)} - \\dfrac{1}{2}\\dfrac{f\u0026rsquo;\u0026rsquo;(x)}{f\u0026rsquo;(x)}} = x - \\left[\\frac{f\u0026rsquo;(x)}{f(x)} - \\frac{1}{2}\\frac{f\u0026rsquo;\u0026rsquo;(x)}{f\u0026rsquo;(x)}\\right]^{-1}. \\]\nwhich is very slightly easier to write.\nSuppose now we are trying to find all roots simultaneously of a monic quartic polynomial \\(p(x)\\), and that the current approximations are \\(a,b,c,d\\). And as before, define the temporary function\n\\[ t(x) = (x-a)(x-b)(x-c)(x-d). \\]\nThen we have\n\\[\\begin{aligned} t\u0026rsquo;(x) \u0026amp;= (x-a)(x-b)(x-c)+(x-a)(x-b)(x-d)\\\\ \u0026amp;\\qquad +(x-a)(x-c)(x-d)+(x-b)(x-c)(x-d) \\end{aligned}\\]\nand\n\\[\\begin{aligned} t\u0026rsquo;\u0026rsquo;(x) \u0026amp;= 2\\bigl((x-a)(x-b)+(x-a)(x-c)+(x-a)(x-d)\\bigr.\\\\ \u0026amp;\\qquad +\\bigl.(x-b)(x-c)+(x-b)(x-d)+(x-c)(x-d)\\bigr). \\end{aligned}\\]\nEvaluating both at \\(x=a\\), every term containing the factor \\((x-a)\\) vanishes, so that\n\\[\\begin{aligned} \\frac{t\u0026rsquo;\u0026rsquo;(a)}{t\u0026rsquo;(a)} \u0026amp;= \\frac{2\\left((a-b)(a-c)+(a-b)(a-d)+(a-c)(a-d)\\right)}{(a-b)(a-c)(a-d)}\\\\ \u0026amp; = 2\\left(\\frac{1}{a-b}+\\frac{1}{a-c}+\\frac{1}{a-d}\\right). \\end{aligned}\\]\nSubstituting this into the second fraction in the denominator above produces the iteration:\n\\[ a\\leftarrow a - \\left[\\frac{p\u0026rsquo;(a)}{p(a)}-\\left(\\frac{1}{a-b}+\\frac{1}{a-c}+\\frac{1}{a-d}\\right)\\right]^{-1}. \\]\nMore generally, if the current approximations are \\(a_1,a_2,\\ldots,a_n\\), then\n\\[ a_k \\leftarrow a_k - \\left[\\frac{p\u0026rsquo;(a_k)}{p(a_k)}-\\sum\\limits_{\\substack{i=1\\\\i\\ne k}}^n\\frac{1}{a_k-a_i}\\right]^{-1}. \\]\nThis is the Aberth-Ehrlich method.\nIt will be seen that our derivation is quite general, and will work for polynomials of any degree. 
If the polynomial is not monic, or if we don\u0026rsquo;t want to divide through by the leading coefficient, the only difference is that we define\n\\[ t(x) = a_0(x-a)(x-b)(x-c)(x-d) \\]\nwhere \\(a_0\\) is the leading coefficient; in this case the coefficient of \\(x^4\\).\nImplementation As previously, we\u0026rsquo;ll use PARI/GP, and write the method in the form\n\\[ a_k \\leftarrow a_k - \\left[\\frac{p\u0026rsquo;(a_k)}{p(a_k)}-\\frac{1}{2}\\frac{t\u0026rsquo;\u0026rsquo;(a_k)}{t\u0026rsquo;(a_k)}\\right]^{-1} \\]\nwhere \\(t(x)\\) is the product of \\(x - a_k\\) for all current root approximations \\(a_k\\).\nThe function definition is very similar to that for WDK (note that we\u0026rsquo;ve included the use of the leading coefficient):\naberth(f,eps,N = 200) = { local(deg,xs,df,ys,dt,dfrac,single_diffs,mean_diffs_vector,mean_diffs,count); deg = poldegree(f); lead = pollead(f); xs = vector(deg,k,(0.6 + 0.8*I)^k); \\\\ starting values df = deriv(f); \\\\ derivative of f ys = vector(deg); \\\\ next computed values single_diffs = vector(deg); mean_diffs = 1.0; mean_diffs_vector = Vec([1.0]); count = 1; while(mean_diffs \u0026gt; eps \u0026amp;\u0026amp; count \u0026lt; N, t = lead*prod(k = 1,deg,x - xs[k]); \\\\ create temporary function t(x) dt = deriv(t,x); ddt = deriv(dt,x); \\\\ compute its first two derivatives for(k = 1, deg, dtk = subst(dt,\u0026#39;x,xs[k]); ddtk = subst(ddt,\u0026#39;x,xs[k]); fk = subst(f,\u0026#39;x,xs[k]); dfk = subst(df,\u0026#39;x,xs[k]); ys[k] = xs[k] - 1/(dfk/fk - 0.5*ddtk/dtk); single_diffs[k] = abs(ys[k] - xs[k]); xs[k] = ys[k] ); mean_diffs = vecsum(single_diffs)/deg; mean_diffs_vector = concat(mean_diffs_vector,[mean_diffs]); count = count + 1; ); return([count,mean_diffs_vector,xs]); } To see how fast it converges, we\u0026rsquo;ll try it out on a fifth degree polynomial:\n\\p2000 s = aberth(x^5 + x^2 - 7,0.1^1500); printoutput(s) which produces:\nnumber of iterations: 12 1.188550919 0.9079189478 0.09240861551 
0.001667886271 7.282133417 e-9 6.098846060 e-25 3.582735741 e-73 7.263000773 e-218 6.050900797 e-652 3.498902657 e-1954 1.384316604 + 0.e-3032I 0.5300508632 + 1.457706660I -1.222209165 + 0.7797478348I -1.222209165 - 0.7797478348I 0.5300508632 - 1.457706660I You can see that the accuracy basically triples each time.\n","link":"https://numbersandshapes.net/posts/aberth-ehrlich/","section":"posts","tags":null,"title":"The Aberth-Ehrlich method"},{"body":"This is a method for finding all the roots of a polynomial simultaneously, by applying a sort of Newton-Raphson method. It gets its name from everybody associated with it: Weierstrass published a version of the algorithm in 1891, and it was then rediscovered (independently) by Durand in 1960 and Kerner in 1966, as you can see at its Wikipedia page.\nIt\u0026rsquo;s easier to describe by an example:\n\\[ x^4 - 26x^2 - 75x -56 = 0 \\]\nwhich has roots \\((5\\pm\\sqrt{57})/2, (-5\\pm\\sqrt{3} i)/2\\). Numerically, the roots are\n\\[ 6.27491721763537,\\; -1.27491721763537,\\; -2.5 + 0.866025403784439i,\\; -2.5 - 0.866025403784439i \\]\nFor a fourth degree equation such as above, if one approximation to the roots is given by \\(x_0, x_1,x_2,x_3\\), then the next approximation is given by\n\\[\\begin{aligned} y_0 \u0026amp;= x_0 - \\frac{f(x_0)}{(x_0-x_1)(x_0-x_2)(x_0-x_3)}\\\\ y_1 \u0026amp;= x_1 - \\frac{f(x_1)}{(x_1-x_2)(x_1-x_3)(x_1-x_0)}\\\\ y_2 \u0026amp;= x_2 - \\frac{f(x_2)}{(x_2-x_3)(x_2-x_0)(x_2-x_1)}\\\\ y_3 \u0026amp;= x_3 - \\frac{f(x_3)}{(x_3-x_0)(x_3-x_1)(x_3-x_2)} \\end{aligned}\\]\nSuppose however we create the temporary polynomial (for this to work, we make the assumption that \\(f\\) is a monic polynomial; that is, that its leading coefficient is 1):\n\\[ t(x) = (x-x_0)(x-x_1)(x-x_2)(x-x_3). 
\\]\nSince\n\\[\\begin{aligned} t\u0026rsquo;(x) \u0026amp;= (x-x_1)(x-x_2)(x-x_3) + (x-x_0)(x-x_2)(x-x_3) \\\\ \u0026amp; \\qquad + (x-x_0)(x-x_1)(x-x_3) + (x-x_0)(x-x_1)(x-x_2) \\end{aligned}\\]\nwe can write the above expressions for \\(y_k\\) more simply as\n\\[ y_k = x_k - \\frac{f(x_k)}{t\u0026rsquo;(x_k)}. \\]\nwhich shows the connection with the Newton-Raphson method. Here\u0026rsquo;s how it might be done in PARI/GP - more specifically, in the scripting language gp:\ndurandkerner(f,eps,N = 200) = { local(pd,xs,ys,dsa,dss,count,dt); pd = poldegree(f); xs = vector(pd,k,(0.6 + 0.8*I)^k); ys = vector(pd); ds = vector(pd); dsa = 1.0; dss = Vec([1.0]); count = 1; while(dsa \u0026gt; eps \u0026amp;\u0026amp; count \u0026lt; N, dt = deriv(prod(k = 1,pd,x - xs[k]),x); for(k = 1,pd, ys[k] = xs[k] - subst(f,\u0026#39;x,xs[k])/subst(dt,\u0026#39;x,xs[k]); ds[k] = abs(ys[k] - xs[k]); xs[k] = ys[k] ); dsa = vecsum(ds)/pd; dss = concat(dss,[dsa]); count = count + 1; ); return([count,dss,xs]); } We\u0026rsquo;ve chosen as our beginning values \\((0.6 + 0.8i)^k\\) - one of the requirements of the method is that all computations must be in complex numbers.\nHere goes:\n\\p20 f = x^4 - 26*x^2 - 75*x - 56 s = durandkerner(f,0.1^20); print(\u0026#34;number of iterations: \u0026#34;,s[1]); print() vp = vecextract(s[2],\u0026#34;-10..-1\u0026#34;); \\\\ extracts the last ten differences {for(k = 1,10, printf(\u0026#34;%.10g\\n\u0026#34;,vp[k]))} \\\\ and prints them print() { for(k = 1,poldegree(f), \\\\ prints out the solutions printf(\u0026#34;%.15g\\n\u0026#34;,s[3][k]) ) } which produces the output:\nnumber of iterations: 17 1.123079176 0.6846519849 0.3550410771 0.1757081770 0.04978125947 0.003740061706 2.016207001 e-5 5.729914134 e-10 4.478174636 e-19 2.615741385 e-37 6.27491721763537-1.45113252725757 e-89I -2.50000000000000+0.866025403784439I -1.27491721763537+3.98066040407694 e-74I -2.50000000000000-0.866025403784439I As you see, the average absolute difference between 
successive iterates is about \\(2.6\\times 10^{-37}\\), obtained in 17 iterations. And the values are certainly correct. You\u0026rsquo;ll also notice that the imaginary parts of the first and third solutions are vanishingly small; in effect, zero.\nFor a bit of fun, let\u0026rsquo;s try with a high precision:\n\\p1000 f = x^4 - 26*x^2 - 75*x - 56 s = durandkerner(f,0.1^1000); the result of which (using the same printing script as above) is\nnumber of iterations: 22 0.003740061706 2.016207001 e-5 5.729914134 e-10 4.478174636 e-19 2.615741385 e-37 8.409722008 e-74 8.032655807 e-147 6.583139811 e-293 3.794211385 e-585 9.907223519 e-1170 6.27491721763537+2.96119399627067 e-2361I -2.50000000000000+0.866025403784439I -1.27491721763537+1.01326544935096 e-2339I -2.50000000000000-0.866025403784439I It\u0026rsquo;s only taken 5 more steps to get over 1000 place precision! This is because, once it hits its stride, so to speak, this method converges quadratically, which you can see in that the accuracy of the iteration basically doubles each step.\nWe can also experiment with a large polynomial:\ndeg = 50; f = x^deg + sum(k=0,deg-1,(random(20)-10)*x^k); s = durandkerner(f,0.1^1000); print(\u0026#34;number of iterations: \u0026#34;,s[1]); print() vp = vecextract(s[2],\u0026#34;-10..-1\u0026#34;); \\\\ extracts the last ten differences {for(k = 1,10, printf(\u0026#34;%.10g\\n\u0026#34;,vp[k]))} which produces the output\nnumber of iterations: 56 4.520878142 e-6 7.817132307 e-9 9.072720843 e-14 1.471330591 e-23 3.873798136 e-43 2.685287774 e-82 1.290323278 e-160 2.979297878 e-317 1.588344551 e-630 4.514465082 e-1257 (If you try this you\u0026rsquo;ll get different numbers, seeing as you\u0026rsquo;ll be starting with a different random polynomial.) 
We won\u0026rsquo;t print out all the solutions, as there are too many, but we can print out the absolute values \\(\\|f(x_k)\\|\\) of the function at each solution:\n{ for(k=1,deg, printf(\u0026#34;%.10g\\n\u0026#34;,abs(subst(f,\u0026#39;x,s[3][k]))) ) } which provides a list of 50 values beginning (in my case) with:\n2.395686462 e-2332 9.801085659 e-2386 2.362011441 e-2291 4.227497886 e-2284 4.335737935 e-2330 These are vanishingly small; in effect zero.\nPlaying with the algorithm To try this algorithm, open up the PARI/GP site, and go to \u0026ldquo;Try GP in your browser\u0026rdquo;, or just go here. Then you can cut and paste the Durand-Kerner function into a cell and play with it.\nFor ease of playing, here are a couple of \u0026ldquo;helper\u0026rdquo; functions for printing the output. The function printcomplex simply adds more space around the + or - in a complex number, and the function printoutput does just that.\nprintcomplex(z) = { local(rz,iz); rz = real(z); iz = imag(z); if(imag(z) \u0026lt; 0, printf(\u0026#34;%.10g - %.10gI\\n\u0026#34;,rz,-iz), printf(\u0026#34;%.10g + %.10gI\\n\u0026#34;,rz,iz)); } printoutput(s) = { print(\u0026#34;number of iterations: \u0026#34;,s[1]); print(); if(length(s[2]) \u0026lt; 11, foreach(s[2],z,printf(\u0026#34;%.10g\\n\u0026#34;,z)), foreach(vecextract(s[2],\u0026#34;-10..-1\u0026#34;),z,printf(\u0026#34;%.10g\\n\u0026#34;,z))); print(); foreach(s[3],z,printcomplex(z)); } So, open up GP \u0026ldquo;in your browser\u0026rdquo;, and in a cell enter the durandkerner, printcomplex, and printoutput functions. You can enlarge the cell from the bottom right. 
Press the \u0026ldquo;Evaluate with PARI\u0026rdquo; function, or simply Shift+Enter.\nOpen up a new cell, and enter something like\n\\p50 f = x^4 + x^3 - 19 sol = durandkerner(f,0.1^50); printoutput(sol) Enjoy!\n","link":"https://numbersandshapes.net/posts/weierstrass-durand-kerner/","section":"posts","tags":null,"title":"The Weierstrass-Durand-Kerner method"},{"body":"Recap, and background Two posts ago we showed how, given four points in the plane in general position, but with a few restrictions, it was possible to find two parabolas through those points. We used computer algebra.\nThe steps were:\nCreate four equations\n\\[ (Ax_i+By_i)^2+Cx_i+Dy_i+E=0 \\]\nfor each of the four \\((x_i,y_i)\\) coordinates.\nSolve the first three equations for \\(C\\), \\(D\\) and \\(E\\): the results will be expressions in \\(A\\) and \\(B\\).\nSubstitute the values from the previous step into the last equation and solve for \\(A\\) and \\(B\\) - there will in general be two solutions.\nSubstitute those \\(A\\) and \\(B\\) values into the expressions for \\(C\\), \\(D\\) and \\(E\\) to obtain the parabola equations.\nAn example As an example, with the four points:\n\\[ (2,3),\\quad (2,1),\\quad (-4,1),\\quad (1,0). 
\\]\nwe have the equations (which we note are linear in \\(C\\), \\(D\\) and \\(E\\)):\n\\[\\begin{gathered} (2A+3B)^2+2C+3D+E=0\\\\ (2A+B)^2+2C+D+E=0\\\\ (-4A+B)^2-4C+D+E=0\\\\ (A)^2+C+E=0\\end{gathered}\\]\nfor which we find from the first three that\n\\[\\begin{aligned} C\u0026amp;=2A^2-2AB\\\\ D\u0026amp;=-4AB-4B^2\\\\ E\u0026amp;=-8A^2+4AB+3B^2 \\end{aligned}\\]\nor alternatively, that\n\\[\\begin{bmatrix}C\\\\ D\\\\ E\\end{bmatrix}=\\begin{bmatrix*}[r]2\u0026amp;-1\u0026amp;0\\\\ 0\u0026amp;-2\u0026amp;-4\\\\ -8\u0026amp;2\u0026amp;3\\end{bmatrix*}\\begin{bmatrix}A^2\\\\ 2AB\\\\ B^2\\end{bmatrix}\\]\nSubstituting these expressions into the last equation produces (after a bit of algebraic simplification):\n\\[ -5A^2+2AB+3B^2=0 \\]\nwhich has the two solutions\n\\[ A = s,\\; B = -\\frac{5}{3}s\\qquad\\mathrm{and}\\qquad A=t,\\;B=t \\]\nfor arbitrary \\(s\\) and \\(t\\). Substituting these back into the expressions for \\(C\\), \\(D\\) and \\(E\\):\n\\[\\begin{aligned} C\u0026amp;=\\frac{16}{3}s^2\u0026amp;D\u0026amp;=-\\frac{40}{9}s^2\u0026amp;E\u0026amp;=-\\frac{19}{3}s^2\\\\[2mm] C\u0026amp;=0\u0026amp;D\u0026amp;=-8t^2\u0026amp;E\u0026amp;=-t^2 \\end{aligned}\\]\nThese expressions can now be used for the parabola equations. In each equation, the squared parameter \\(s^2\\) or \\(t^2\\) can be factored out.\nNumerical approach Here we show how to do the same thing numerically.\nRather than explain in full algebraic detail, we\u0026rsquo;ll simply work through an example, from which the general method will be obvious.\nWe start with the four points \\((x_i,y_i)\\):\n\\[ (2,3),\\quad (2,1),\\quad (-4,1),\\quad (1,0). 
\\]\nWe first create a \\(3\\times 3\\) matrix \\(M\\) for which the first two columns are the first three \\(x\\) and \\(y\\) values, and the last column is all ones:\n\\[ M=\\begin{bmatrix*}[r]2\u0026amp;3\u0026amp;1\\\\ 2\u0026amp;1\u0026amp;1\\\\ -4\u0026amp;1\u0026amp;1\\end{bmatrix*} \\]\nNext is a \\(4\\times 3\\) matrix \\(N\\) whose rows consist of the values \\(x^2,\\;2xy,\\; y^2\\) for each \\((x,y)\\) coordinates:\n\\[ N = \\begin{bmatrix*}[r]4\u0026amp;12\u0026amp;9\\\\ 4\u0026amp;4\u0026amp;1\\\\ 16\u0026amp;-8\u0026amp;1\\\\ 1\u0026amp;0\u0026amp;0\\end{bmatrix*} \\]\nNext multiply the negative of the inverse of \\(M\\) by the top three rows of \\(N\\) (the minus sign comes from moving the quadratic terms to the other side of each equation):\n\\[ P=-\\begin{bmatrix*}[r]2\u0026amp;3\u0026amp;1\\\\ 2\u0026amp;1\u0026amp;1\\\\ -4\u0026amp;1\u0026amp;1\\end{bmatrix*}^{-1}\\begin{bmatrix*}[r]4\u0026amp;12\u0026amp;9\\\\ 4\u0026amp;4\u0026amp;1\\\\16\u0026amp;-8\u0026amp;1\\end{bmatrix*}=\\begin{bmatrix*}[r]2\u0026amp;-2\u0026amp;0\\\\ 0\u0026amp;-4\u0026amp;-4\\\\ -8\u0026amp;4\u0026amp;3\\end{bmatrix*} \\]\nWhat this matrix contains, of course, are the coefficients of \\(A^2\\), \\(AB\\) and \\(B^2\\) in the initial expressions for \\(C\\), \\(D\\) and \\(E\\).\nThen compute\n\\[ Q=\\begin{bmatrix}1\u0026amp;0\u0026amp;1\\end{bmatrix}\\begin{bmatrix*}[r]2\u0026amp;-2\u0026amp;0\\\\ 0\u0026amp;-4\u0026amp;-4\\\\ -8\u0026amp;4\u0026amp;3\\end{bmatrix*}+\\begin{bmatrix}1\u0026amp;0\u0026amp;0\\end{bmatrix}=\\begin{bmatrix*}[r]-5\u0026amp;2\u0026amp;3\\end{bmatrix*} \\]\nHere, the first matrix on the right consists of the last (so far unused) coordinate values and a 1; the central matrix is \\(P\\) (just computed) and the last matrix is the bottom row of \\(N\\). 
Thus:\n\\[ Q=\\begin{bmatrix}x_4\u0026amp;y_4\u0026amp;1\\end{bmatrix}P+\\begin{bmatrix}n_{41}\u0026amp; n_{42}\u0026amp; n_{43}\\end{bmatrix} \\]\nThe values of \\(Q\\) are the coefficients \\(a,2b,c\\) for the final equation \\(aA^2+2bAB+cB^2=0\\) for \\(A\\) and \\(B\\).\nNow, it is easy to show that the solutions of the equation\n\\[ aA^2+2bAB+cB^2=0 \\]\nare\n\\[ A=s,\\quad B = \\frac{-b+\\sqrt{b^2-ac}}{c}s\\qquad\\mathrm{and}\\qquad A=t,\\quad B=\\frac{-b-\\sqrt{b^2-ac}}{c}t \\]\nfor arbitrary values of \\(s\\) and \\(t\\). Since in our case we have \\(a=-5\\), \\(b=1\\) and \\(c=3\\):\n\\[ A=s,\\quad B = \\frac{-1+\\sqrt{1+15}}{3}s=s\\qquad\\mathrm{and}\\qquad A=t,\\quad B=\\frac{-1-\\sqrt{1+15}}{3}t=-\\frac{5}{3}t \\]\nWe can set \\(s=t=1\\), or we can set \\(s=1\\) and \\(t=3\\) (which eliminates fractions) and substitute those values for \\(A\\) and \\(B\\) into the equations for \\(C\\), \\(D\\) and \\(E\\). So putting \\(s=1\\):\n\\[\\begin{bmatrix}C\\\\D\\\\ E\\end{bmatrix}=\\begin{bmatrix*}[r]2\u0026amp;-2\u0026amp;0\\\\ 0\u0026amp;-4\u0026amp;-4\\\\ -8\u0026amp;4\u0026amp;3\\end{bmatrix*}\\begin{bmatrix*}[r]1\\\\ 1\\\\ 1\\end{bmatrix*}=\\begin{bmatrix*}[r]0\\\\ -8\\\\ -1\\end{bmatrix*}\\]\nIf we put \\(t=1\\), in the second solution, then \\(A = 1\\) and \\(B = -5/3\\), from which \\(C\\), \\(D\\) and \\(E\\) can be easily obtained.\nFinishing off From above, we have the two solutions, for\n\\[\\begin{aligned} A,B,C,D,E\u0026amp;=1,1,0,-8,-1\\\\ \u0026amp; = 1,-\\dfrac{5}{3},\\dfrac{16}{3},-\\dfrac{40}{9},-\\dfrac{19}{3} \\end{aligned}\\]\nThis produces the two parabolas:\n\\[\\begin{aligned} \u0026amp;(x+y)^2-8y-1=0\\\\ \u0026amp;(3x-5y)^2+48x-40y-57=0 \\end{aligned}\\]\nwhere in the last equation every coefficient has been multiplied by 9 to clear all the fractions.\nUsing Python With NumPy:\n[] import numpy as np [] import numpy.linalg as la [] xs = np.array([2,2,-4,1]) [] ys = np.array([3,1,1,0]) [] M = np.matrix([xs[:-1],ys[:-1],[1,1,1]]).T [] N = 
np.matrix([xs**2,2*xs*ys,ys**2]).T [] P = -la.inv(M)*N[:-1,:] [] Q = np.matrix([[xs[-1],ys[-1],1]])*P+N[-1,:] [] [a,b,c] = Q.tolist()[0] [] b/=2 [] b0 = (-b+np.sqrt(b*b-a*c))/c [] b1 = (-b-np.sqrt(b*b-a*c))/c [] AB = np.matrix([[1,b0,b0*b0],[1,b1,b1*b1]]).T [] R = P*AB [] coeffs = np.vstack([AB[:-1,:],R]).T [] display(coeffs) matrix([[ 1.00000000e+00, 1.00000000e+00, 2.77555756e-17, -8.00000000e+00, -1.00000000e+00], [ 1.00000000e+00, -1.66666667e+00, 5.33333333e+00, -4.44444444e+00, -6.33333333e+00]]) The parabolas can then be plotted using the parameterization given in the previous post.\n","link":"https://numbersandshapes.net/posts/parabola_numeric/","section":"posts","tags":null,"title":"Parabolas, numerically"},{"body":"It is (well?) known that if \\(x = at^2+bt+c\\) and \\(y=pt^2+qt+r\\), then\n\\[ (Ax+By)^2+Cx+Dy+E=0 \\]\nwhere\n\\[\\begin{aligned} A\u0026amp;=p\\\\ B\u0026amp;=-a\\\\ C\u0026amp;=qv_2-2pv_1\\\\ D\u0026amp;=-bv_2+2av_1\\\\ E\u0026amp;=v_1^2-v_2v_3 \\end{aligned}\\]\nwith \\(\\langle v_1, v_2, v_3\\rangle =\\langle a,b,c\\rangle \\times \\langle p,q,r\\rangle\\); that is, the \\(v_i\\) values are the elements of the cross product of the vectors of the coefficients.\nIn other words, two quadratic functions parameterize a parabola.\nFinding the equations which parameterise a given parabola Suppose we have a general parabola given by \\((Ax+By)^2+Cx+Dy+E=0\\). 
Clearly we need to choose coefficients of \\(t^2\\) in \\(x\\) and \\(y\\) in such a way that they cancel out in the first bracket.\nWe can start with, say\n\\[\\begin{aligned} x\u0026amp;=Bt^2+bt+c\\\\ y\u0026amp;=-(At^2+qt+r) \\end{aligned}\\]\nThen\n\\[\\begin{aligned} (Ax+By)^2\u0026amp;=((ABt^2+Abt+Ac)-(BAt^2+Bqt+Br))^2\\\\ \u0026amp;=((Ab-Bq)t+(Ac-Br))^2\\\\ \u0026amp;=(Ab-Bq)^2t^2+2(Ab-Bq)(Ac-Br)t+(Ac-Br)^2 \\end{aligned}\\]\nAdding this to the rest of the expression \\(Cx+Dy+E\\) and collecting \u0026ldquo;like terms\u0026rdquo;, we end up with:\n\\[\\begin{aligned} t^2:\u0026amp;\\quad (Ab-Bq)^2+CB-DA=0\\\\ t:\u0026amp;\\quad 2(Ab-Bq)(Ac-Br)+Cb-Dq = 0\\\\ 1:\u0026amp;\\quad (Ac-Br)^2+Cc-Dr+E=0 \\end{aligned}\\]\nFrom the first equation, if \\(b=s\\), say, then (taking one of the two square roots)\n\\[ q = \\frac{As-\\sqrt{DA-CB}}{B}. \\]\nBut we don\u0026rsquo;t want square roots if we can avoid them. One thing we can do is to introduce an extra multiplicative variable into the original equations:\n\\[\\begin{aligned} x\u0026amp;=k(Bt^2+bt+c)\\\\ y\u0026amp;=-k(At^2+qt+r) \\end{aligned}\\]\nThen the three equations corresponding to the coefficients \\(t^2,t,1\\) can be written as\n\\[\\begin{aligned} \u0026amp;(Ab-Bq)^2k^2+(BC-AD)k=0\\\\ \u0026amp;2(A b - B q)(A c - B r)k^2 + (Cb-Dq)k=0\\\\ \u0026amp;(Ac-Br)^2k^2+(Cc-Dr)k+E=0 \\end{aligned}\\]\nThe first equation can be solved to produce two solutions:\n\\[ b=s,\\quad q = \\frac{Aks+\\sqrt{(AD-BC)k}}{Bk} \\]\nand\n\\[ b=t,\\quad q = \\frac{Akt-\\sqrt{(AD-BC)k}}{Bk} \\]\nClearly to eliminate the square root we can set \\(k=1/(AD-BC)\\). This produces the solutions\n\\[ b=s,\\quad q = \\frac{As+(AD-BC)}{B} \\]\nand\n\\[ b=t,\\quad q = \\frac{At-(AD-BC)}{B} \\]\nSince \\(s\\) and \\(t\\) are arbitrary values, and we only want one solution to our equations, we can choose \\(s=-D\\) and \\(t=D\\). Then the two solutions reduce to (taking the same sign in both):\n\\[ b=\\mp D,\\quad q = \\mp C. 
\\]\nIn fact, we can do the entire solution using a computer algebra system (in this case SageMath). We start by creating the variables we need, finishing with a polynomial in \\(t\\):\n\u0026lt;Sage\u0026gt;: var(\u0026#39;A,B,C,D,E,x,y,t,b,c,q,r,k\u0026#39;) \u0026lt;Sage\u0026gt;: x = k*(B*t^2 + b*t +c) \u0026lt;Sage\u0026gt;: y = -k*(A*t^2 + q*t + r) \u0026lt;Sage\u0026gt;: p = (A*x + B*y)^2 + C*x + D*y + E \u0026lt;Sage\u0026gt;: pc = p.expand().poly(t) The equations are the coefficients of this polynomial, which we require to be equal to zero:\n\u0026lt;Sage\u0026gt;: eqns = [pc.coefficient(t,i).subs({k:1/(A*D-B*C)}).factor().numerator() for i in range(3)] \u0026lt;Sage\u0026gt;: sols = solve(eqns,[b,c,q,r],solution_dict=True) Now we substitute \\(-D\\) for the free variable that appears in the solution, and simplify:\n\u0026lt;Sage\u0026gt;: s = sols[0][b].free_variables()[0] \u0026lt;Sage\u0026gt;: [sols[0][z].subs({s:-D}).full_simplify() for z in [b,c,q,r]] This produces the output\n\\[ [-D, BE, -C, AE] \\]\nwhich means that the general parabola \\((Ax+By)^2+Cx+Dy+E=0\\) can be parameterized by\n\\[ x = \\frac{Bt^2 - Dt +BE}{AD-BC},\\qquad y = \\frac{-At^2 + Ct -AE}{AD-BC} \\]\n","link":"https://numbersandshapes.net/posts/parabola_parameterization/","section":"posts","tags":null,"title":"Parameterization of the parabola"},{"body":"Introduction It is (or should be) well known that a parabola has the cartesian form\n\\[ (Ax+By)^2+Cx+Dy+E = 0. \\]\nThis looks as though there are five values needed, but we can divide through in such a way as to make any of the coefficients we like equal to 1:\n\\[ (Px+Qy)^2+Rx+Sy+1 = 0. \\]\nand so we see that only four values are needed to define a parabola. A parabola thus has one more degree of freedom than a circle, for which only three values are needed:\n\\[ (x-A)^2+(y-B)^2 = C^2. \\]\nFor any four points \u0026ldquo;in general position\u0026rdquo; in the plane, there will be two parabolas that pass through those points. 
The purpose here is to explore how that can be done.\nThere are some explanations at mathpages and also here but the confusion of algebra and the old-fashioned typesetting makes these articles hard to read.\nIt\u0026rsquo;s easiest to do it by an example first, then discuss the general method afterwards.\nOne way - a bit more complicated than necessary Suppose we start with four points\n\\[ (0,-3),\\quad (-3,3),\\quad (1,-3),\\quad (-3,-1) \\]\nBy substituting each of these values into the first equation above, we obtain four equations:\n\\begin{gather} (-3B)^2 - 3D + E = 0\\\\ (-3A+3B)^2-3C+3D+E=0\\\\ (A-3B)^2+C-3D+E=0\\\\ (-3A-B)^2-3C-D+E=0 \\end{gather}\nThe plan of attack is this:\nSolve the first three equations for \\(C\\), \\(D\\) and \\(E\\). The results will be expressions in \\(A\\) and \\(B\\). Substitute the results from step 1 into the last equation, and solve for \\(A\\) and \\(B\\). There will in general be two solutions. Substitute the newly found values of \\(A\\) and \\(B\\) into the expressions for \\(C\\), \\(D\\) and \\(E\\) to obtain the parabola equations. And of course we can do all of this in Sage. Here\u0026rsquo;s step 1:\nvar(\u0026#39;A,B,C,D,E,x,y\u0026#39;) xs,ys = [0,-3,1,-3],[-3,3,-3,-1] eqns = [(A*x+B*y)^2+C*x+D*y+E for x,y in zip(xs,ys)] sols1 = solve(eqns[:-1],[C,D,E],solution_dict=True) sols1 This produces:\n\\[ \\left[\\left\\{C : -A^{2} + 6 \\, A B, D : -2 \\, A^{2} + 6 \\, A B, E : -6 \\, A^{2} + 18 \\, A B - 9 \\, B^{2}\\right\\}\\right] \\]\nNow for step 2:\nsols2 = solve(eqns[-1].subs(sols1[0]),[A,B], solution_dict=True) This produces:\n\\[ \\left[\\left\\{A : r_{1}, B : -r_{1}\\right\\}, \\left\\{A : r_{2}, B : r_{2}\\right\\}\\right] \\]\nNote that the parameters in which \\(A\\) and \\(B\\) are given may well depend on how many such computations you\u0026rsquo;ve done. Sage will simply keep increasing the parameter index each time parameters are needed. 
(When I performed this particular calculation, the indices were in fact 925 and 926.) You can however re-set the indices by prefixing the above command as follows:\nmaxima_calculus.reset() sols2 = solve(eqns[-1].subs(sols1[0]),[A,B], solution_dict=True) Before we substitute back into the expressions for \\(C\\), \\(D\\), \\(E\\), we turn the parameters into variables:\np = sols2[0][A].free_variables()[0] q = sols2[1][A].free_variables()[0] Now for step 3:\ns0 = sols2[0] s1 = sols2[1] para0 = (s0[A]*x+s0[B]*y)^2+sols1[0][C].subs(s0)*x+sols1[0][D].subs(s0)*y+sols1[0][E].subs(s0) parab0 = (para0.factor()/p^2).numerator().expand() para1 = (s1[A]*x+s1[B]*y)^2+sols1[0][C].subs(s1)*x+sols1[0][D].subs(s1)*y+sols1[0][E].subs(s1) parab1 = (para1.factor()/q^2).numerator().expand() At this point, the equations of the two parabolas are\n\\[ x^{2} - 2 x y + y^{2} - 7 x - 8 y - 33 = 0,\\qquad x^{2} + 2 x y + y^{2} + 5 x + 4 y + 3 = 0 \\]\nor, to conform with the general equation from above:\n\\[ (x-y)^2-7x-8y-33=0,\\quad (x+y)^2+5x+4y+3=0. \\]\nThey can be displayed, along with the initial points, like this:\nb = 3 xmin,xmax = min(xs)-b,max(xs)+b ymin,ymax = min(ys)-b,max(ys)+b p1 = implicit_plot(parab0,(x,xmin,xmax),(y,ymin,ymax)) p2 = implicit_plot(parab1,(x,xmin,xmax),(y,ymin,ymax),color=\u0026#39;green\u0026#39;) p3 = list_plot([(s,t) for s,t in zip(xs,ys)],plotjoined=False,marker=\u0026#34;o\u0026#34;,size=60,color=\u0026#39;red\u0026#39;,faceted=True) p1+p2+p3 A slightly simpler way Shift the points so that one of them is at the origin. For example, with the points above, we could shift by \\((0,3)\\) to obtain\n\\[ (0,0),\\quad (-3,6),\\quad (1,0),\\quad (-3,2) \\]\nSince the parabolas must pass through the origin, we must have \\(E=0\\), so that we are looking for expressions of the form\n\\[ (Ax+By)^2+Cx+Dy=0 \\]\nand we only need to use the three points away from the origin. 
Then we work very similarly to above, except that we will only use three points and three equations.\na,b = xs[0],ys[0] xs,ys = [x-a for x in xs[1:]],[y-b for y in ys[1:]] eqns = [(A*x+B*y)^2+C*x+D*y for x,y in zip(xs,ys)] sols1 = solve(eqns[:-1],[C,D],solution_dict=True)[0] sols1 The outcome of all this is\n\\[ \\left\\{C : -A^{2}, D : -2 \\, A^{2} + 6 \\, A B - 6 \\, B^{2}\\right\\} \\]\nNow we can substitute and solve for \\(A\\) and \\(B\\) as above:\nmaxima_calculus.reset() sols2 = solve(eqns[-1].subs(sols1),[A,B], solution_dict=True) Continuing as before:\np = sols2[0][A].free_variables()[0] q = sols2[1][A].free_variables()[0] s0 = sols2[0] s1 = sols2[1] para0 = (s0[A]*x+s0[B]*y)^2+sols1[C].subs(s0)*x+sols1[D].subs(s0)*y parab0 = (para0.factor()/p^2).numerator().expand() para1 = (s1[A]*x+s1[B]*y)^2+sols1[C].subs(s1)*x+sols1[D].subs(s1)*y parab1 = (para1.factor()/q^2).numerator().expand() The two parabolas here are\n\\[ x^{2} - 2 x y + y^{2} - x - 14y = 0,\\quad x^{2} + 2 x y + y^{2} - x - 2 y = 0 \\]\nTo get back to the parabolas we want, simply shift back:\nparab0.subs({x:x-a,y:y-b}).expand() parab1.subs({x:x-a,y:y-b}).expand() Another example \\[ (x,y) = (2,3),\\quad (2,1),\\quad (-4,1),\\quad (1,0). 
\\]\nRepeat the above commands with the points shifted to include the origin:\nxs0, ys0 = [2,2,-4,1], [3,1,1,0] a,b = xs0[0],ys0[0] xs,ys = [x-a for x in xs0[1:]],[y-b for y in ys0[1:]] eqns = [(A*x+B*y)^2+C*x+D*y for x,y in zip(xs,ys)] sols1 = solve(eqns[:-1],[C,D],solution_dict=True)[0] Now substitute into the last equation and solve for \\(A\\) and \\(B\\):\nsols2 = solve(eqns[-1].subs(sols1),[A,B], solution_dict=True) Finally, substitute both solutions back to obtain \\(C\\) and \\(D\\), and factor out the parameter:\np = sols2[0][A].free_variables()[0] q = sols2[1][A].free_variables()[0] s0 = sols2[0] s1 = sols2[1] para0 = (s0[A]*x+s0[B]*y)^2+sols1[C].subs(s0)*x+sols1[D].subs(s0)*y parab0 = (para0.factor()/p^2).numerator().expand() para1 = (s1[A]*x+s1[B]*y)^2+sols1[C].subs(s1)*x+sols1[D].subs(s1)*y parab1 = (para1.factor()/q^2).numerator().expand() We could obtain the same result a bit more easily by letting the variables p and q be in a tuple, similarly for s0 and s1, and for the parabolas. This would cut the commands down to four instead of eight.\nBefore plotting, we need to substitute back in the original \\(x\\) and \\(y\\) values by another shift:\nparab0 = parab0.subs({x:x-a,y:y-b}).expand() parab1 = parab1.subs({x:x-a,y:y-b}).expand() Now plot them; we need to refer to the original xs0, ys0 values:\nb = 3 xmin,xmax = min(xs0)-b,max(xs0)+b ymin,ymax = min(ys0)-b,max(ys0)+b p1 = implicit_plot(parab0,(x,xmin,xmax),(y,ymin,ymax)) p2 = implicit_plot(parab1,(x,xmin,xmax),(y,ymin,ymax),color=\u0026#39;green\u0026#39;) p3 = list_plot([(s,t) for s,t in zip(xs0,ys0)],plotjoined=False,marker=\u0026#34;o\u0026#34;,size=60,color=\u0026#39;red\u0026#39;,faceted=True) p1+p2+p3 ","link":"https://numbersandshapes.net/posts/four_point_parabolas/","section":"posts","tags":null,"title":"Four point parabolas"},{"body":"Although the method is simple to describe, the algebra becomes messy when written in full generality. 
For example, suppose we use the second method, with three points \\((x_1,y_1)\\), \\((x_2,y_2)\\), \\((x_3,y_3)\\) none of which is at the origin.\nThe three equations are\n\\begin{gather} (Ax_1+By_1)^2+Cx_1+Dy_1=0\\\\ (Ax_2+By_2)^2+Cx_2+Dy_2=0\\\\ (Ax_3+By_3)^2+Cx_3+Dy_3=0 \\end{gather}\nSolving the first two for \\(C\\) and \\(D\\) produces:\n\\[ C = \\frac{(Ax_2+By_2)^2y_1-(Ax_1+By_1)^2y_2}{x_1y_2-x_2y_1},\\quad D = \\frac{(Ax_1+By_1)^2x_2-(Ax_2+By_2)^2x_1}{x_1y_2-x_2y_1} \\]\nIt will simplify matters to introduce the notation\n\\[ v_{ij}=x_iy_j-x_jy_i. \\]\nThe discussion at mathpages does much the same thing, but treats the \\(v\\) values as the elements of the cross product of the vectors \\([x_1,x_2,x_3]\\) and \\([y_1,y_2,y_3]\\).\nNow, substituting into the last equation produces an equation of the form\n\\[ aA^2+2bAB+cB^2=0 \\]\nwhere\n\\begin{aligned} a \u0026amp; = -v_{23}x_1^2+v_{13}x_2^2-v_{12}x_3^2\\\\ b \u0026amp; = -v_{23}x_1y_1+v_{13}x_2y_2-v_{12}x_3y_3\\\\ c \u0026amp; = -v_{23}y_1^2+v_{13}y_2^2-v_{12}y_3^2 \\end{aligned}\nThe solutions are then\n\\begin{aligned} A\u0026amp;=r, \u0026amp; B \u0026amp;= \\frac{(-b+\\sqrt{b^2-ac})\\,r}{c}\\\\ A\u0026amp;=s, \u0026amp; B \u0026amp;= \\frac{(-b-\\sqrt{b^2-ac})\\,s}{c} \\end{aligned}\nThe values \\(a\\), \\(b\\) and \\(c\\) can all be expressed as the negative determinants:\n\\[ a = -\\begin{vmatrix}x_1^2\u0026amp;x_2^2\u0026amp;x_3^2\\\\ x_1\u0026amp;x_2\u0026amp;x_3\\\\ y_1\u0026amp;y_2\u0026amp;y_3\\end{vmatrix},\\qquad b = -\\begin{vmatrix}x_1y_1\u0026amp;x_2y_2\u0026amp;x_3y_3\\\\ x_1\u0026amp;x_2\u0026amp;x_3\\\\ y_1\u0026amp;y_2\u0026amp;y_3\\end{vmatrix},\\qquad c = -\\begin{vmatrix}y_1^2\u0026amp;y_2^2\u0026amp;y_3^2\\\\ x_1\u0026amp;x_2\u0026amp;x_3\\\\ y_1\u0026amp;y_2\u0026amp;y_3\\end{vmatrix}. 
\\]\nThe next step would be to substitute these values into the expressions above for \\(C\\) and \\(D\\), but as you see we\u0026rsquo;re already getting to the reasonable limit of complexity for algebraic expressions. Substituting the first pair of values for \\(A\\) and \\(B\\) into the equation for \\(C\\) produces an utterly hideous expression!\n","link":"https://numbersandshapes.net/posts/general-four_point_parabolas/","section":"posts","tags":null,"title":"General expressions"},{"body":"A bicentric heptagon is one for which all vertices lie on a circle, and for which all edges are tangential to another circle. If \\(R\\) and \\(r\\) are the radii of the outer and inner circles respectively, and \\(d\\) is the distance between their centres, there is an expression which relates the three values when a bicentric heptagon can be formed.\nTo start, define\n\\[ a = \\frac{1}{R+d},\\quad b = \\frac{1}{R-d},\\quad c = \\frac{1}{r} \\]\nand then:\n\\[ E_1 = -a^2+b^2+c^2,\\quad E_2 = a^2-b^2+c^2,\\quad E_3 = a^2+b^2-c^2 \\]\nThe expression we want is:\n\\[ E_1E_2E_3+2abE_1E_2 -2bcE_2E_3-2acE_1E_3=0. \\]\nSee the page at Wolfram Mathworld for details.\nHowever, a bicentric heptagon can exist in three forms: a convex polygon, and two stars.\nThe above expression, impressive though it is (even more so when it is rewritten in terms of \\(R\\), \\(r\\) and \\(d\\)), doesn\u0026rsquo;t give any hint as to which values give rise to which form of polygon.\nHowever, suppose we scale the heptagon by setting \\(R=1\\). We can then rewrite the above expression as a polynomial in \\(r\\), whose coefficients are functions of \\(d\\):\n\\begin{multline*} 64d^2r^6-32d^2(d^2+1)(d^4-1)r^5-16d^2(d^2-1)^2r^4+8(d^2-1)^3(3d^2+1)r^3\\\\ -4(d^2-1)^4r^2-4(d^2-1)^5r+(d^2-1)^6=0. \\end{multline*}\nand this can be simplified with the substitutions \\(u=d^2-1\\) and \\(x=2r\\):\n\\[ (u+1)x^6-u(u+1)(u+2)^2x^5-u^2(u+1)x^4+u^3(3u+4)x^3-u^4x^2-2u^5x+u^6=0. 
\\]\nSince \\(R=1\\), it follows that \\(d\\) is between 0 and 1 (and so \\(u\\) is between \\(-1\\) and 0), and it turns out that in this range the sextic polynomial equation above has four real roots, of which only three can be used. For the other root \\(d+r\u0026gt;1\\), which would indicate the inner circle not fully contained in the outer circle.\nYou can play with this polynomial here:\nThen the different forms of the bicentric heptagon correspond with the different roots; the root with the largest absolute value produces a convex polygon, the root with the smallest absolute value produces the star with Schläfli symbol \\({7:3}\\) (which is the \u0026ldquo;pointiest\u0026rdquo; star), and the other root to the star with symbol \\({7:2}\\). Look at the table on the Wikipedia page just linked, and the column for heptagons.\nHere are the heptagons, which because of Poncelet\u0026rsquo;s Porism, can be dragged around (if the diagram doesn\u0026rsquo;t update, refresh the page; it should work):\n","link":"https://numbersandshapes.net/posts/bicentric_heptagons/","section":"posts","tags":null,"title":"Bicentric heptagons"},{"body":"","link":"https://numbersandshapes.net/tags/geometry/","section":"tags","tags":null,"title":"Geometry"},{"body":"","link":"https://numbersandshapes.net/tags/jsxgraph/","section":"tags","tags":null,"title":"Jsxgraph"},{"body":"","link":"https://numbersandshapes.net/tags/mathematics/","section":"tags","tags":null,"title":"Mathematics"},{"body":"Introduction Poncelet\u0026rsquo;s porism or Poncelet\u0026rsquo;s closure theorem is one of the most remarkable results in plane geometry. It is most easily described in terms of circles: suppose we have two circles \\(C\\) and \\(D\\), with \\(D\\) lying entirely inside \\(C\\). Pick a point \\(p_0\\) on \\(C\\), and find the tangent from \\(p_0\\) to \\(D\\). Let \\(p_1\\) be the other intersection of the tangent line with \\(C\\). So the line \\(p_0 - p_1\\) is a chord of \\(C\\) which is tangential to \\(D\\). 
Continue creating \\(p_2\\), \\(p_3\\) and so on. The porism claims that: If at some stage these tangents \u0026ldquo;join up\u0026rdquo;; that is, if there is a point \\(p_k\\) equal to \\(p_0\\), then the tangents will join up for any initial choice of \\(p_0\\) on \\(C\\).\nThe polygon so created from the vertices \\(p_0,\\, p_1,\\,\\cdots,p_k\\) is called a bicentric polygon: all its vertices lie on one circle \\(C\\), and all its edges are tangential to another circle \\(D\\).\nIf \\(r\\) and \\(R\\) are the radii of \\(D\\) and \\(C\\) respectively, and \\(d\\) is the distance between their centres, much effort has been expended over the past two centuries determining conditions on these three values for an \\(n\\) sided bicentric polygon to exist. Euler established that for triangles:\n\\[ \\frac{1}{R+d}+\\frac{1}{R-d}=\\frac{1}{r} \\]\nor that\n\\[ R^2-2Rr-d^2=0. \\]\nEuler\u0026rsquo;s amanuensis, Nicholas Fuss (who would marry one of Euler\u0026rsquo;s granddaughters) determined that for bicentric quadrilaterals:\n\\[ \\frac{1}{(R+d)^2}+\\frac{1}{(R-d)^2}=\\frac{1}{r^2} \\]\nor that\n\\[ (R^2-d^2)^2=2r^2(R^2+d^2). \\]\nLooking at the first expressions, you might hope that \\(n\\) sided polygons might have similarly nice expressions. Unfortunately, the expressions get considerably more complicated as \\(n\\) increases, and the only way to write them succinctly is with a sequence of substitutions.\nThere is a good demonstration and explanation at Wolfram Mathworld which has examples of some further expressions.\nAn example with two circles Here\u0026rsquo;s an example with a quadrilateral. To use it, move the point \\(A\\) along the \\(x\\) axis. You\u0026rsquo;ll see that the inner circle changes size according to Fuss' formula. Then you can drag the circled point around the outer circle to demonstrate the porism.\nAn example with non-circular conic sections Poncelet\u0026rsquo;s porism is in fact a result for conic sections, not just circles. 
However, circles are easy to work with and define - as seen above, just three parameters are needed to define two circles. This means that nobody has tried to develop similar formulas to Euler and Fuss for general conic sections: the complexity is simply too great. In the most general form, five points are needed to fix a conic section. That is: given any five points in general position, there will be a unique conic section passing through all of them.\nHere\u0026rsquo;s how this figure works:\nThe green dots define the interior ellipse (two foci and a point on the ellipse). They can be moved any way you like.\nThe red points on the ellipse: \\(p_0\\), \\(p_1\\), \\(p_2\\), \\(p_3\\) and \\(p_4\\) can be slid around the ellipse.\nThe tangents to these points and their intersections define a pentagon, whose vertices define a larger ellipse.\nWhen you have a nice shape that you like, use the button \u0026ldquo;Hide initial pentagon\u0026rdquo;. All current labels will vanish, and you\u0026rsquo;ll have one circled point which can be dragged around the outer ellipse to demonstrate the porism.\nWhat happens if you allow two of the points \\(p_i\\) to \u0026ldquo;cross over\u0026rdquo;?\nA note on the diagrams These were created with the amazing JavaScript library JSXGraph which is a very powerful tool for creating interactive diagrams. 
I am indebted to the many answers I\u0026rsquo;ve received to questions on its Google group, and in particular to its lead developer, Professor Dr Albert Wassermann from the University of Bayreuth, who has been unfailingly generous with his time and detail in answering my many queries.\n","link":"https://numbersandshapes.net/posts/non_circular_poncelet/","section":"posts","tags":["mathematics","geometry","jsxgraph"],"title":"Poncelet's porism on non-circular conic sections"},{"body":"","link":"https://numbersandshapes.net/tags/","section":"tags","tags":null,"title":"Tags"},{"body":"A totally different approach to dithering is error diffusion. Here, the image is scanned pixel by pixel. Each pixel is thresholded to 1 or 0 depending on whether the pixel value is greater than 0.5 or not, and the error - the difference between the pixel value and its thresholded value - is diffused across neighbouring pixels.\nThe first method was developed by Floyd and Steinberg, who proposed the following diffusion:\n\\[\\frac{1}{16}\\begin{bmatrix} \u0026amp;*\u0026amp;7\\\\ 3\u0026amp;5\u0026amp;1 \\end{bmatrix}\\]\nWhat this means is that if the current pixel\u0026rsquo;s value is \\(m_{ij}\\), it will be thresholded to 1 or 0 depending on whether its value is greater than 0.5 or not. We set\n\\[t_{ij}=\\left\\{\\begin{array}{ll}1\u0026amp;\\mbox{if $m_{ij}\u003e0.5$}\\\\ 0\u0026amp;\\mbox{otherwise}\\end{array}\\right.\\]\nAlternatively,\n\\[t_{ij} = m_{ij}\u0026gt;0.5\\]\nassuming that the right hand expression returns 1 for true, 0 for false. 
Then the error is\n\\[e_{ij}=m_{ij}-t_{ij}.\\]\nThe surrounding pixels are then updated by fractions of this error:\n\\[\\begin{aligned} m_{i,j+1}\u0026amp;=m_{i,j+1}+\\frac{7}{16}e_{ij}\\\\ m_{i+1,j-1}\u0026amp;=m_{i+1,j-1}+\\frac{3}{16}e_{ij}\\\\ m_{i+1,j}\u0026amp;=m_{i+1,j}+\\frac{5}{16}e_{ij}\\\\ m_{i+1,j+1}\u0026amp;=m_{i+1,j+1}+\\frac{1}{16}e_{ij} \\end{aligned}\\]\nThere is a Julia package for computing error diffusion called DitherPunk, but in fact the basic logic can be easily managed:\nJulia\u0026gt; r,c = size(img) Julia\u0026gt; out = Float64.(img) Julia\u0026gt; fs = [0 0 7;3 5 1]/16 Julia\u0026gt; for i = 1:r-1 for j = 2:c-1 old = out[i,j] new = round(old) out[i,j] = new error = old - new out[i:i+1,j-1:j+1] += error * fs end end To ensure that the final image is the same size as the original, we can just take the central, changed pixels, and pad them by replication:\nJulia\u0026gt; padding = Pad(:replicate, (0,1),(1,1)) Julia\u0026gt; out = padarray(out[1:r-1, 2:c-1], padding) Julia\u0026gt; out = Gray.(abs.(out)) Here\u0026rsquo;s the result applied to the bridge image from before, again with the original image for comparison:\nIf you compare this dithered image with the halftoned image from the last blog post, you\u0026rsquo;ll notice some slight cross-hatching in the halftoned image; this is an artefact of the repetition of the B4 Bayer matrix. Such an artefact doesn\u0026rsquo;t exist in the error-diffused image. 
To make that comparison easier, here they both are, with the half-tone image on the left, and the image with error diffusion on the right:\nError diffusion has the added advantage of being able to produce an output of more than two levels; this allows the number of colours in an image to be reduced while at the same time reducing the degradation of the image.\nIf we want \\(k\\) output grey values, from 0 to \\(k-1\\), all we do is change the definition of new in the loop given above to\nnew = round(old * (k-1)) / (k-1) Here, for example, is the same image with 4 and with 8 grey levels:\nThe above process for dithering can easily be adjusted for any other dither matrices, of which there are many, for example:\nJarvis-Judice-Ninke:\n\\[\\frac{1}{48}\\begin{bmatrix} 0\u0026amp; 0\u0026amp; *\u0026amp; 7\u0026amp; 5\\\\3\u0026amp; 5\u0026amp; 7\u0026amp; 5\u0026amp; 3\\\\1\u0026amp; 3\u0026amp; 5\u0026amp; 3\u0026amp; 1 \\end{bmatrix}\\]\nStucki:\n\\[\\frac{1}{42}\\begin{bmatrix} 0\u0026amp; 0\u0026amp; *\u0026amp; 8\u0026amp; 4\\\\2\u0026amp; 4\u0026amp; 8\u0026amp; 4\u0026amp; 2\\\\1\u0026amp; 2\u0026amp; 4\u0026amp; 2\u0026amp; 1 \\end{bmatrix}\\]\nAtkinson:\n\\[\\frac{1}{8}\\begin{bmatrix} 0\u0026amp; *\u0026amp; 1\u0026amp; 1\\\\1\u0026amp; 1\u0026amp; 1\u0026amp; 0\\\\0\u0026amp; 1\u0026amp; 0\u0026amp; 0 \\end{bmatrix}\\]\nBurkes:\n\\[\\frac{1}{32}\\begin{bmatrix} 0\u0026amp; 0\u0026amp; *\u0026amp; 8\u0026amp; 4\\\\2\u0026amp; 4\u0026amp; 8\u0026amp; 4\u0026amp; 2 \\end{bmatrix}\\]\nSierra:\n\\[\\frac{1}{32}\\begin{bmatrix} 0\u0026amp; 0\u0026amp; *\u0026amp; 5\u0026amp; 3\\\\2\u0026amp; 4\u0026amp; 5\u0026amp; 4\u0026amp; 2\\\\0\u0026amp; 2\u0026amp; 3\u0026amp; 2\u0026amp; 0 \\end{bmatrix}\\]\nTwo row Sierra:\n\\[\\frac{1}{16}\\begin{bmatrix} 0\u0026amp; 0\u0026amp; *\u0026amp; 4\u0026amp; 3\\\\1\u0026amp; 2\u0026amp; 3\u0026amp; 2\u0026amp; 1 \\end{bmatrix}\\]\nSierra Lite:\n\\[\\frac{1}{4}\\begin{bmatrix} 0\u0026amp; *\u0026amp; 2\\\\ 1\u0026amp; 
1\u0026amp; 0 \\end{bmatrix}\\]\nNote that the DitherPunk package is named after a very nice blog post discussing dithering. This post makes reference to another post which discusses dithering in the context of the game Return of the Obra Dinn. This game is set in the year 1807, and to obtain a sort of \u0026ldquo;antique\u0026rdquo; look, its developer used dithering techniques extensively to render all scenes using \u0026ldquo;1-bit graphics\u0026rdquo;. A glimpse at its trailer will show you how well this has been done. (Note: I haven\u0026rsquo;t played the game; I\u0026rsquo;m not a game player. But as with all games there are plenty of videos - some many hours long - about the game and playing it.)\nColour images Dithering of colour images is very simple: simply dither each of the red, green and blue colour planes, then put them all back together. Assuming the above process for grey level dithering has been encapsulated in a function called dither, we can write:\nfunction cdither(image;nlevels = 2) image_rgb = channelview(image) rd = dither(image_rgb[1,:,:],nlevels) gd = dither(image_rgb[2,:,:],nlevels) bd = dither(image_rgb[3,:,:],nlevels) image_d = colorview(RGB, rd,gd,bd) return(image_d) end Here, for example, are the results of dithering at two and four levels of the \u0026ldquo;lighthouse\u0026rdquo; image from the test images database:\nThe first image shows some artefacts as noise, but is still remarkably clear; the second image is surprisingly good. 
If the three images (original, two-level dither, four-level dither) are saved into variables img, img_d2, img_d4, then:\nJulia\u0026gt; [length(unique(x)) for x in [img,img_d2,img_d4]]\u0026#39; 1×3 adjoint(::Vector{Int64}) with eltype Int64: 29317 8 34 So the second image, clear as it is, uses only 34 distinct colours as opposed to the nearly 30000 of the original image.\n","link":"https://numbersandshapes.net/posts/error_diffusion/","section":"posts","tags":["julia","image-processing"],"title":"Image dithering (2): error diffusion"},{"body":"","link":"https://numbersandshapes.net/tags/image-processing/","section":"tags","tags":null,"title":"Image-Processing"},{"body":"","link":"https://numbersandshapes.net/tags/julia/","section":"tags","tags":null,"title":"Julia"},{"body":"Image dithering, also known as half-toning, is a method for reducing the number of colours in an image, while at the same time trying to retain as much of its \u0026ldquo;look and feel\u0026rdquo; as possible. Originally this was required for newspaper printing, where no shades of grey were possible, and only black and white could be printed. So a light grey area would be printed as a few dots of black, but mostly white, and a dark grey area with mostly black, with some white spots. 
One of the obvious problems is that the image resolution would be decreased, but in fact the human visual system can interpret an image even after a loss of information.\nAssuming an image to have grey scales between 0.0 (black) and 1.0 (white), one way of half-toning is to threshold the image against copies of the so-called \u0026ldquo;Bayer\u0026rdquo; matrices, of which the first two are:\n\\[B_2 = \\frac{1}{4} \\begin{bmatrix} 0\u0026amp;2\\\\ 3\u0026amp;1 \\end{bmatrix}, \\qquad B_4 = \\frac{1}{16} \\begin{bmatrix} 0\u0026amp;8\u0026amp;2\u0026amp;10\\\\ 12\u0026amp;4\u0026amp;14\u0026amp;6\\\\ 3\u0026amp;11\u0026amp;1\u0026amp;9\\\\ 15\u0026amp;7\u0026amp;13\u0026amp;5 \\end{bmatrix}\\]\nSee the Wikipedia page for discussions and derivation.\nAnd here\u0026rsquo;s a quick example in Julia:\nJulia\u0026gt; using Images, FileIO, TestImages Julia\u0026gt; img = Gray.(testimage(\u0026#34;walkbridge.tif\u0026#34;)); Julia\u0026gt; B4 = Gray.(N0f8.(1/16*[0 8 2 10;12 4 14 6;3 11 1 9;15 7 13 5])); Julia\u0026gt; B512 = repeat(B4,128,128); Julia\u0026gt; img_halftone = Gray.(img .\u0026gt; B512); Julia\u0026gt; mosaic(img,img_halftone,nrow=1,npad=10,fillcolour=1) Note that\nJulia\u0026gt; length(unique(img)), length(unique(img_halftone)) 256, 2 The original image had 256 different grey levels, the new image has only 2 - yet, even if it is a much poorer image, it still retains a lot of the pictorial aspects of the original.\n","link":"https://numbersandshapes.net/posts/image_halftoning/","section":"posts","tags":["julia","image-processing"],"title":"Image dithering (1): half toning"},{"body":"","link":"https://numbersandshapes.net/tags/computation/","section":"tags","tags":null,"title":"Computation"},{"body":"In the previous post, we saw that a small change to the method of false position provided much faster convergence, while retaining its bracketing.\nThis was the Illinois method which is only one of a whole host of similar methods, some of which converge even 
faster.\nAnd as a reminder, here\u0026rsquo;s its definition, with a very slight change:\nGiven \\(x_{i-1}\\) and \\(x_i\\) that bracket a root and their function values \\(f_{i-1}\\), \\(f_i\\), first compute the secant value\n\\[ x_{i+1}=\\frac{x_{i-1} f_i - x_i f_{i-1}}{f_i - f_{i-1}}. \\]\nand let \\(f_{i+1}=f(x_{i+1})\\). Then:\nif \\(f_if_{i+1}\u0026lt;0\\), replace \\((x_{i-1},f_{i-1})\\) with \\((x_i,f_i)\\) if \\(f_if_{i+1}\u0026gt;0\\), replace \\((x_{i-1},f_{i-1})\\) with \\((x_{i-1},\\gamma f_{i-1})\\) with \\(\\gamma=0.5\\). In each case we replace \\((x_i,f_i)\\) by \\((x_{i+1},f_{i+1})\\).\nMuch research since then has investigated possible scaling values for \\(\\gamma\\). If \\(\\gamma\\) is to be constant, then it can be shown that \\(\\gamma=0.5\\) is optimal. But \\(\\gamma\\) need not be constant.\nThe Pegasus method This was defined by Dowell \u0026amp; Jarratt, whose form of the Illinois method we used in the last post; see their article \u0026ldquo;The \u0026lsquo;Pegasus\u0026rsquo; method for computing the root of an equation.\u0026rdquo; BIT Numerical Mathematics 12 (1972),pp503-508.\nHere we use\n\\[ \\gamma = \\frac{f_i}{f_i+f_{i+1}} \\]\nAnd here\u0026rsquo;s the Julia function for defining it; basically the same as the Illinois function of the previous post, with the differences both of \\(\\gamma\\) and of showing the absolute difference between successive iterations:\nfunction pegasus(f,a,b;num_iter = 20) if f(a)*f(b) \u0026gt; 0 error(\u0026#34;Values given are not guaranteed to bracket a root\u0026#34;) else fa, fb = f(a), f(b) c = b for k in 1:num_iter c_old = c c = b - fb*(b-a)/(fb-fa) fc = f(c) if fb*fc \u0026lt; 0 a, fa = b, fb else fa = fa*fb/(fb+fc) end b, fb = c, fc @printf(\u0026#34;%2d: %1.60f, %1.15e\\n\u0026#34;,k,c,abs(c_old-c)) #println(c,\u0026#34;, \u0026#34;,abs(c_old-c)) end end end And with the same function \\(f(x)=x^5-2\\):\nJulia\u0026gt; 
pegasus(f,BigFloat(\u0026#34;1\u0026#34;),BigFloat(\u0026#34;2\u0026#34;)) 1: 1.032258064516129032258064516129032258064516129032258064516128, 9.677419354838710e-01 2: 1.058249216160286723536401978566436704093370062081736874593804, 2.599115164415769e-02 3: 1.095035652659330505424147240084976534578418952240763525441535, 3.678643649904378e-02 4: 1.131485704080638653175790037904708402277455906284414289399986, 3.645005142130815e-02 5: 1.147884687198048718506398066361222614002745776909137282797727, 1.639898311741007e-02 6: 1.148720321893174344989720370927796480725850707839146637042703, 8.356346951256265e-04 7: 1.148698323855563082475350143443841164951825177665336110156093, 2.199803761126251e-05 8: 1.148698354995843974508573437584337692570858791644071608835544, 3.114028089203322e-08 9: 1.148698354997035006796886004382924209539326506468986782249386, 1.191032288312567e-12 10: 1.148698354997035006798626946777931199507739956238584846007635, 1.740942395006990e-21 11: 1.148698354997035006798626946777927589443850889097797494571041, 3.610063889067141e-33 12: 1.148698354997035006798626946777927589443850889097797505513712, 1.094267079108742e-53 13: 1.148698354997035006798626946777927589443850889097797505513712, 0.000000000000000e+00 14: 1.148698354997035006798626946777927589443850889097797505513711, 1.244603055572228e-60 15: 1.148698354997035006798626946777927589443850889097797505513711, 0.000000000000000e+00 16: 1.148698354997035006798626946777927589443850889097797505513712, 1.244603055572228e-60 17: 1.148698354997035006798626946777927589443850889097797505513711, 1.244603055572228e-60 18: 1.148698354997035006798626946777927589443850889097797505513711, 0.000000000000000e+00 19: 1.148698354997035006798626946777927589443850889097797505513712, 1.244603055572228e-60 20: 1.148698354997035006798626946777927589443850889097797505513711, 1.244603055572228e-60 This is indeed faster than the Illinois method, with an efficiency index of \\(E\\approx 1.64232\\).\nFor another example, compute 
that value of Lambert\u0026rsquo;s W function \\(W(100)\\); this is the solution of \\(xe^x-100=0\\); Lambert\u0026rsquo;s function is the inverse of \\(y=xe^x\\).\nJulia\u0026gt; pegasus(x-\u0026gt;x*exp(x)-100,BigFloat(\u0026#34;3\u0026#34;),BigFloat(\u0026#34;4\u0026#34;),num_iter=11) 1: 3.251324125460162218273541775021703846514592591436893826064677, 7.486758745398378e-01 2: 3.340634428196726809051441645629843725138212599102446141079158, 8.931030273656459e-02 3: 3.380778785367380815168698736373250080216344738301354721341292, 4.014435717065401e-02 4: 3.385665875268764090984779722929318950325131118803147296896820, 4.887089901383276e-03 5: 3.385630033731458485133511320033990390684414498737973972888242, 3.584153730560585e-05 6: 3.385630140287712138407043176029520651934410763166067578057173, 1.065562536532735e-07 7: 3.385630140290050184887119964591950428421045978948047751759305, 2.338046480076789e-12 8: 3.385630140290050184888244364529728481623112988246686294800948, 1.124399937778053e-21 9: 3.385630140290050184888244364529726867491694170157806679271792, 1.614131418818089e-33 10: 3.385630140290050184888244364529726867491694170157806680386175, 1.114382727073817e-54 11: 3.385630140290050184888244364529726867491694170157806680386175, 0.000000000000000e+00 Other similar methods These differ only in the definition of the scaling value \\(\\gamma\\); several are given by J. A. Ford in \u0026ldquo;Improved Algorithms of Illinois-Type for the Numerical Solution of Nonlinear Equations\u0026rdquo; University of Essex, Department of Computer Science (1995). 
This article is happily available online.\nFollowing Ford, define\n\\[ \\phi_k=\\frac{f_{k+1}}{f_k} \\]\nand then:\nMethod | \\(\\gamma\\) | Efficiency Index\nAnderson \u0026amp; Björck | if \\(f_i\u0026gt;f_{i+1}\\) then \\(1-\\phi_i\\) else 0.5 | 1.70998 or 1.68179\nFord method 1 | \\((1-\\phi_i-\\phi_{i-1})/(1+\\phi_i-\\phi_{i-1})\\) | 1.55113\nFord method 2 | \\((1-\\phi_i)/(1-\\phi_{i-1})\\) | 1.61803\nFord method 3 | \\(1-(\\phi_i/(1-\\phi_{i-1}))\\) | 1.70998\nFord method 4 | \\(1-\\phi_i-\\phi_{i-1}\\) | 1.68179\nFord method 5 | \\((1-\\phi_i)/(1+\\phi_i-\\phi_{i-1})\\) | not given\nThe efficiency of the Anderson-Björck method depends on whether the sign of\n\\[ K = \\left(\\frac{c_2}{c_1}\\right)^{\\!2}-\\frac{c_3}{c_1} \\]\nis positive or negative, where\n\\[ c_k = \\frac{f^{(k)}(x^*)}{k!},\\;k\\ge 1 \\]\nand \\(x^*\\) is the solution. Note that the \\(c_k\\) values are simply the coefficients of the Taylor series expansion of \\(f(x)\\) about the root \\(x^*\\); that is\n\\[ f(x) = c_1(x-x^*)+c_2(x-x^*)^2+c_3(x-x^*)^3+\\cdots \\]\n","link":"https://numbersandshapes.net/posts/pegasus_method/","section":"posts","tags":["mathematics","julia","computation"],"title":"The Pegasus and related methods for solving equations"},{"body":"Such a long time since a last post! Well, that\u0026rsquo;s academic life for you \u0026hellip;\nIf you look at pretty much any modern textbook on numerical methods, of which there are many, you\u0026rsquo;ll find that the following methods will be given for the solution of a single non-linear equation \\(f(x)=0\\):\ndirect iteration, also known as fixed-point iteration bisection method method of false position, also known as regula falsi secant method Newton\u0026rsquo;s method, also known as the Newton-Raphson method Occasionally a text might specify, or mention, one or two more, but these five seem to be the \u0026ldquo;classic\u0026rdquo; methods. 
All of the above have their advantages and disadvantages:\ndirect iteration is easy, but can\u0026rsquo;t be guaranteed to converge, nor converge to a particular solution bisection is guaranteed to work (assuming the function \\(f(x)\\) is continuous in a neighbourhood of the solution), but is very slow method of false position is supposed to improve on bisection, but has its own problems, including also slow convergence the secant method is quite fast (order of convergence is about 1.6) but is not guaranteed to converge always Newton\u0026rsquo;s method is fast (quadratic convergence), theoretically very straightforward, but does require the computation of the derivative \\(f\u0026rsquo;(x)\\) and is not guaranteed to converge. Methods such as the Brent-Dekker-van Wijngaarden method (also known as Brent\u0026rsquo;s method) and Ridders\u0026rsquo; method, both of which may be considered as a sort of amalgam of bisection and inverse quadratic interpolation, are generally not covered in introductory texts, although some of the newer methods are both simple and guaranteed to converge quickly. All these methods have the advantage of not requiring the computation of the derivative.\nThis blog post is about a variation of the method of false position, which is amazingly simple, and yet extremely fast.\nThe Illinois method This method seems to go back to 1953 when it was published in an internal memo at the University of Illinois Computer Laboratory by J. N. 
Snyder, \u0026ldquo;Inverse interpolation, a real root of \\(f(x)=0\\)\u0026rdquo;, University of Illinois Digital Computer Laboratory, ILLIAC I Library Routine H1-71 4\nSince then it seems to have been called the \u0026ldquo;Illinois method\u0026rdquo; by almost everybody, although a few writers are now trying to name it \u0026ldquo;Snyder\u0026rsquo;s method\u0026rdquo;.\nTo start, note a well-known problem with false position: if the function is concave (or convex) in a neighbourhood of the root including the bracketing interval, then the values will converge from one side only:\nThis slows down convergence. Snyder\u0026rsquo;s insight was that if this behaviour started, that is, if there were two consecutive iterations \\(x_i\\), \\(x_{i+1}\\) on one side of the root, then for the next iteration the secant would be computed not with the function value \\(f(x_n)\\) on the other side of the root, but with half that function value, \\(f(x_n)/2\\).\nThe algorithm for making this work has been described thus by M. Dowell and P. Jarratt (\u0026ldquo;A modified regula falsi method for computing the root of an equation\u0026rdquo;, BIT 11, 1971, pp. 168-174). At each stage we keep track of the \\(x\\) values \\(x_i\\) and the corresponding function values \\(f_i=f(x_i)\\). As usual \\(x_{i-1}\\) and \\(x_i\\) bracket the root:\nStart by performing the usual secant operation\n\\[ x_{i+1}=x_i-f(x_i)\\frac{x_i-x_{i-1}}{f(x_i)-f(x_{i-1})}=\\frac{x_{i-1}f(x_i)-x_if(x_{i-1})}{f(x_i)-f(x_{i-1})} \\] and set \\(f_{i+1}=f(x_{i+1})\\). 
Then:\nif \\(f_if_{i+1}\u0026lt;0\\), replace \\((x_{i-1},f_{i-1})\\) with \\((x_i,f_i)\\)\nif \\(f_if_{i+1}\u0026gt;0\\), replace \\((x_{i-1},f_{i-1})\\) with \\((x_{i-1},f_{i-1}/2)\\)\nIn each case we replace \\((x_i,f_i)\\) by \\((x_{i+1},f_{i+1})\\).\nBefore we show how fast this can be, here\u0026rsquo;s a function to perform false position, in Julia:\nusing Printf   # provides the @printf macro\n\nfunction false_pos(f,a,b;niter = 20)\n    if f(a)*f(b)\u0026gt;0\n        error(\u0026#34;Values are not guaranteed to bracket a root\u0026#34;)\n    else\n        for k in 1:niter\n            c = b - f(b)*(b-a)/(f(b)-f(a))\n            if f(b)*f(c)\u0026lt;0\n                a = b\n            end\n            b = c\n            @printf(\u0026#34;%2d: %.15f, %.15f, %.15f\\n\u0026#34;,k,a,b,abs(a-b))\n        end\n    end\nend\nNote that we have adopted the Dowell-Jarratt logic, so that if \\(f_if_{i+1}\u0026gt;0\\), then we do nothing. And here \\(a\\), \\(b\\), \\(c\\) correspond to \\(x_{i-1}\\), \\(x_i\\), and \\(x_{i+1}\\).\nThis function, as you see, doesn\u0026rsquo;t so much return a value as simply print out the current bracketing values, along with their difference. 
Here\u0026rsquo;s an example:\nJulia\u0026gt; f(x) = x^5 - 2 Julia\u0026gt; false_pos(f,0.5,1.5) 1: 1.500000000000000, 0.760330578512397, 0.739669421487603 2: 1.500000000000000, 0.936277160385007, 0.563722839614993 3: 1.500000000000000, 1.041285513445667, 0.458714486554333 4: 1.500000000000000, 1.097156710176020, 0.402843289823980 5: 1.500000000000000, 1.124679454971997, 0.375320545028003 6: 1.500000000000000, 1.137668857062543, 0.362331142937457 7: 1.500000000000000, 1.143668984638562, 0.356331015361438 8: 1.500000000000000, 1.146412444361109, 0.353587555638891 9: 1.500000000000000, 1.147660927013766, 0.352339072986234 10: 1.500000000000000, 1.148227852400553, 0.351772147599447 11: 1.500000000000000, 1.148485034668113, 0.351514965331887 12: 1.500000000000000, 1.148601651598731, 0.351398348401269 13: 1.500000000000000, 1.148654519726769, 0.351345480273231 14: 1.500000000000000, 1.148678485212590, 0.351321514787410 15: 1.500000000000000, 1.148689348478166, 0.351310651521834 16: 1.500000000000000, 1.148694272572126, 0.351305727427874 17: 1.500000000000000, 1.148696504543074, 0.351303495456926 18: 1.500000000000000, 1.148697516236793, 0.351302483763207 19: 1.500000000000000, 1.148697974810134, 0.351302025189866 20: 1.500000000000000, 1.148698182668834, 0.351301817331166 In fact, because of the problem we showed earlier, which this example exemplifies, the distance between bracketing values converges to a non-zero value (hence is more-or-less irrelevant as a measure of convergence), and it\u0026rsquo;s the values in the second column which converge to the root.\nSince \\(2^{1/5}=1.148698354997035\\), we have got only 6 decimal places at 20 iterations, which is no faster than bisection.\nHere\u0026rsquo;s the Illinois method. 
Note that it is very similar to the above false position function, except that we are also keeping track of function values:\nfunction illinois(f,a,b;num_iter = 20)\n    if f(a)*f(b) \u0026gt; 0\n        error(\u0026#34;Values given are not guaranteed to bracket a root\u0026#34;)\n    else\n        fa, fb = f(a), f(b)\n        for k in 1:num_iter\n            c = b - fb*(b-a)/(fb-fa)\n            fc = f(c)\n            if fb*fc \u0026lt; 0\n                a, fa = b, fb\n            else\n                fa = fa/2\n            end\n            b, fb = c, fc\n            @printf(\u0026#34;%2d: %.15f, %.15f, %.15f\\n\u0026#34;,k,a,b,abs(a-b))\n        end\n    end\nend\nAnd with the same function and initial bracketing values as before:\nJulia\u0026gt; illinois(f,0.5,1.5)\n1: 1.500000000000000, 0.760330578512397, 0.739669421487603 2: 1.500000000000000, 0.936277160385007, 0.563722839614993 3: 1.500000000000000, 1.113315730198992, 0.386684269801008 4: 1.113315730198992, 1.179659804462764, 0.066344074263773 5: 1.179659804462764, 1.146786019205345, 0.032873785257419 6: 1.179659804462764, 1.148597847114352, 0.031061957348412 7: 1.148597847114352, 1.148787731780184, 0.000189884665832 8: 1.148787731780184, 1.148698339356448, 0.000089392423736 9: 1.148787731780184, 1.148698354994601, 0.000089376785583 10: 1.148698354994601, 1.148698354999468, 0.000000000004867 11: 1.148698354994601, 1.148698354997035, 0.000000000002434 12: 1.148698354994601, 1.148698354997035, 0.000000000002434 13: 1.148698354997035, 1.148698354997035, 0.000000000000000 14: 1.148698354997035, 1.148698354997035, 0.000000000000000 15: 1.148698354997035, 1.148698354997035, 0.000000000000000 16: 1.148698354997035, 1.148698354997035, 0.000000000000000 17: 1.148698354997035, 1.148698354997035, 0.000000000000000 18: 1.148698354997035, 1.148698354997035, 0.000000000000000 19: 1.148698354997035, 1.148698354997035, 0.000000000000000 20: 1.148698354997035, 1.148698354997035, 0.000000000000000\nHere the bracketing values do indeed get close together so that their difference converges to zero (as you\u0026rsquo;d expect), and also by 13 iterations we have obtained 15 decimal place accuracy. 
Dowell and Jarratt show that this method has an efficiency index \\(E = 3^{1/3}\\approx 1.442\\); here \\(E=p^{1/C}\\) where \\(p\\) is the order of convergence and \\(C\\) is the \u0026ldquo;cost\u0026rdquo; (measured in function evaluations). As we see in the example below, the number of correct significant figures roughly triples every three iterations.\nA more useful output can be obtained by slightly adjusting the above function so that it prints out the most recent values, and their successive absolute differences. Even better, we can use BigFloat to see a few more digits:\nJulia\u0026gt; setprecision(200)\nJulia\u0026gt; illinois(f,BigFloat(\u0026#34;0.5\u0026#34;),BigFloat(\u0026#34;1.5\u0026#34;))\n1: 0.760330578512396694214876033057851239669421487603305785123967, 7.396694214876033e-01 2: 0.936277160385007078769511771881822637553598228912035496092550, 1.759465818726104e-01 3: 1.113315730198991546263163399473256828664229343275663554233298, 1.770385698139845e-01 4: 1.179659804462764330162645293397126720145901264006900448944524, 6.634407426377278e-02 5: 1.146786019205344856070782528108967774498747386650573905592277, 3.287378525741947e-02 6: 1.148597847114352112643459413647963815311973435038986609382286, 1.811827909007257e-03 7: 1.148787731780183703868308039549154052483519742061776232119911, 1.898846658315912e-04 8: 1.148698339356448159061880000968862518745749828816440530467845, 8.939242373554481e-05 9: 1.148698354994601301571564951679026859974778940076712488922696, 1.563815314250968e-08 10: 1.148698354999467954514826379562077090994393405782639396087335, 4.866652943261428e-12 11: 1.148698354997035006798616637583092713250045478960431687721718, 2.432947716209742e-12 12: 1.148698354997035006798626946777927545774018995974486715443598, 1.030919483483252e-23 13: 1.148698354997035006798626946777927633113682781851136780227430, 8.733966378587665e-35 14: 1.148698354997035006798626946777927589443850889097797505513711, 4.366983189275334e-35 15: 
1.148698354997035006798626946777927589443850889097797505513711, 0.000000000000000e+00 16: 1.148698354997035006798626946777927589443850889097797505513712, 1.244603055572228e-60 17: 1.148698354997035006798626946777927589443850889097797505513711, 1.244603055572228e-60 18: 1.148698354997035006798626946777927589443850889097797505513711, 0.000000000000000e+00 19: 1.148698354997035006798626946777927589443850889097797505513712, 1.244603055572228e-60 20: 1.148698354997035006798626946777927589443850889097797505513711, 1.244603055572228e-60\nThe speed of reaching 60 decimal place accuracy is very much in keeping with the efficiency index being about 1.44. Put another way, we\u0026rsquo;d expect the number of correct significant figures to roughly triple every three iterations.\nThe Illinois method is disarmingly simple, produces excellent results, and since it\u0026rsquo;s a bracketing method, is guaranteed to converge. What\u0026rsquo;s not to like? Time to get it back in the textbooks!\n","link":"https://numbersandshapes.net/posts/illinois_method/","section":"posts","tags":["mathematics","julia","computation"],"title":"The Illinois method for solving equations"},{"body":"Carroll originally invented his Doublets in 1877; they were published in \u0026ldquo;Vanity Fair\u0026rdquo; (the magazine, not the Thackeray novel) in 1879. Some years later, in an 1892 letter, Carroll added another rule: that permutations were allowed. This allows very neat chains such as:\nroses, noses, notes, steno, stent, scent\nBecause the words stay the same length here, but more connectivity is allowed, we would expect that not only would the largest connected component of the graph be bigger than before, but that chains would be shorter. 
And this time we can connect \u0026ldquo;chair\u0026rdquo; with \u0026ldquo;table\u0026rdquo;:\nchair, chain, china, chink, clink, blink, blind, blend, blent, bleat, table\nTo test if two words are permutations, we can create two small functions:\nJulia\u0026gt; ssort(x) = join(sort(collect(x)))\nJulia\u0026gt; scmp(x,y) = cmp(ssort(x),ssort(y))\nThe first function ssort simply alphabetises a string; the second function scmp compares two sorted strings. The function returns zero if the sorted strings are identical.\nWe can then create the graph. As before, we\u0026rsquo;ll start with the \u0026ldquo;medium\u0026rdquo; word list, and its sublist of five-letter words.\nJulia\u0026gt; nw = length(words5)\nJulia\u0026gt; G5 = Graph(nw)\nJulia\u0026gt; for i in 1:nw\n           for j in i+1:nw\n               wi = words5[i]\n               wj = words5[j]\n               if (Hamming()(wi,wj) == 1) || (scmp(wi,wj) == 0)\n                   add_edge!(G5,i,j)\n               end\n           end\n       end\nThis graph G5 has 4388 vertices, 11107 edges. As before, find the largest connected component:\nJulia\u0026gt; CC = connected_components(G5)\nJulia\u0026gt; CL = map(length, CC)\nJulia\u0026gt; mx, indx = findmax(CL)\nJulia\u0026gt; C1, vmap1 = induced_subgraph(G5,CC[indx])\nThis new graph has 3665 vertices and 10946 edges. This is larger than the graph using only the Hamming distance, which had 3315 vertices.\nRather than just find aloof words, we\u0026rsquo;ll find the number of connected components of all sizes; it turns out that there are only a small number of different such sizes:\nJulia\u0026gt; u = sort(unique(CL), rev = true)\nJulia\u0026gt; show(u)\n[3665, 12, 6, 5, 4, 3, 2, 1]\nJulia\u0026gt; freqs = zeros(Int16,2,length(u))\nJulia\u0026gt; for i in 1:length(u)\n           freqs[1,i] = u[i]\n           freqs[2,i] = count(x -\u0026gt; x == u[i], CL)\n       end\nJulia\u0026gt; display(freqs)\n2×8 Matrix{Int16}:\n 3665 12 6 5 4 3 2 1\n 1 1 2 4 4 18 62 485\nWe see that there are 485 aloof words (fewer than before), and various small components. 
The components between 4 and 12 words are:\nalive, voice, alike, olive, voile\nallow, local, loyal, royal, vocal, aglow, allay, alley, allot, alloy, focal, atoll\ngroup, tutor, trout, croup, grout\nradio, ratio, patio, radii\namber, embed, ebbed, ember, umbel, umber\nchief, thief, fiche, niche\nfizzy, fuzzy, dizzy, tizzy\nmoron, bacon, baron, baton, boron\nocean, canoe, canon, capon\npupil, papal, papas, pupae, pupal\nbuddy, giddy, muddy, ruddy, biddy, middy\nAnd now for the longest ladder:\nJulia\u0026gt; eccs = eccentricity(C1);\nJulia\u0026gt; mx = maximum(eccs)\nJulia\u0026gt; inds = findall(x -\u0026gt; x==mx, eccs)\nJulia\u0026gt; show(inds)\n[83, 2984, 3024]\nThese correspond to the words \u0026ldquo;court\u0026rdquo;, \u0026ldquo;cabby\u0026rdquo;, \u0026ldquo;gabby\u0026rdquo;. The last two words are adjacent in the graph, but each of the other pairs produces a ladder of maximum length 27. Here\u0026rsquo;s one of them:\nJulia\u0026gt; ld = ladder(subwords[inds[1]], subwords[inds[2]])\nJulia\u0026gt; println(join(ld,\u0026#34;, \u0026#34;),\u0026#34;\\nlength is \u0026#34;,length(ld))\ncourt, count, mount, mound, wound, would, could, cloud, clout, flout, flour, floor, flood, blood, brood, broad, board, boars, boors, boobs, booby, bobby, hobby, hubby, tubby, tabby, cabby length is 27 ","link":"https://numbersandshapes.net/posts/doublets_with_permutations/","section":"posts","tags":["programming","julia"],"title":"  Carroll's \"improved\" Doublets: allowing permutations\n  "},{"body":"","link":"https://numbersandshapes.net/tags/programming/","section":"tags","tags":null,"title":"Programming"},{"body":"Apparently there\u0026rsquo;s a version of Doublets (see previous post) which allows you to add or delete a letter each turn. Thus we can go from WHEAT to BREAD as\nWHEAT, HEAT, HEAD, READ, BREAD\nwhich is shorter than the ladder given in that previous post. However, we can easily adjust the material from that post to implement this new version. 
There are two major differences:\nWe have to use all the words in the list, since with additions and deletions, all words are potential elements in a ladder. That is, we can\u0026rsquo;t restrict words by length.\nThe distance between words is no longer the Hamming distance. For this version we need the Levenshtein distance. This counts the number of additions, deletions, and replacements needed to go from one string to another. The new version of Doublets thus requires that the Levenshtein distance between two words is 1.\nOther than that it\u0026rsquo;s all the same as previously. Starting with the list medium.wds, constructing the graph (which has 59577 vertices and 78888 edges), and determining the eccentricities of the largest connected component now take a longer time: constructing the graph took over 24 minutes, and the eccentricities took a bit under 4 1/2 minutes on my machine. The largest connected component has 23801 words; the next largest has 41 words (see below). There are also 14688 aloof words; here\u0026rsquo;s a random 10 of them:\nevading, foliage, irksome, thalami, discrediting, embodiment, absorption, persisted, supplementing, dispatch\nWe do see some reductions of ladder lengths. Getting \u0026ldquo;scent\u0026rdquo; from \u0026ldquo;roses\u0026rdquo; took 11 words, but in this new version it\u0026rsquo;s quicker:\nroses, roes, res, rest, rent, cent, scent\nAnd the longest word ladder has 42 words:\nhammerings, hammering, hampering, pampering, papering, capering, catering, cantering, bantering, battering, bettering, fettering, festering, pestering, petering, peering, peeing, pieing, piing, ping, pine, pane, pale, paled, pealed, peeled, peered, petered, pestered, festered, fettered, bettered, battered, bantered, cantered, catered, capered, tapered, tampered, hampered, hammered, yammered\nInterestingly, we might expect that most words are connected, but in fact there are 25337 connected components. 
One of them consists of\nmetabolise, metabolised, metabolises, metabolism, metabolisms, metabolite, metabolites, metabolize, metabolized, metabolizes\nMany of the connected components seem to be like this: all based around one particular word with its various grammatical forms. The two second-biggest connected components contain 41 words each. Here\u0026rsquo;s one:\ncomplete, completed, completes, compose, composed, composes, compute, computed, computer, computers, computes, comfort, compete, competed, competes, composer, composers, comforted, comforts, commune, communed, communes, commute, commuted, commuter, commuters, commutes, completer, completest, complexes, compost, composted, composts, comforter, comforters, complected, comport, comported, comports, compote, compotes ","link":"https://numbersandshapes.net/posts/super_doublets_more_word_ladders_with_julia/","section":"posts","tags":["programming","julia"],"title":"Super Doublets: more word ladders with Julia"},{"body":"Lewis Carroll\u0026rsquo;s game of Doublets\nSuch a long time since my last post! Well, that\u0026rsquo;s the working life for you.\nAnyway, recently I was reading about Lewis Carroll - always one of my favourite people - and was reminded of his word game \u0026ldquo;Doublets\u0026rdquo; in which one word is turned into another by changing one letter at a time, each new word being English.\nYou can read Carroll\u0026rsquo;s original description here. Note his last sentence:\n\u0026ldquo;It is, perhaps, needless to state that it is de rigueur that the links should be English words, such as might be used in good society.\u0026rdquo;\nCarroll, it seemed, frowned on slang words, or \u0026ldquo;low\u0026rdquo; words - very much in keeping with his personality and with his social and professional positions. 
One of his examples was\n\u0026ldquo;Change WHEAT into BREAD\u0026rdquo;\nwhich has an answer: WHEAT CHEAT CHEAP CHEEP CREEP CREED BREED BREAD.\nClearly we would want the length of the chain of words to be as short as possible, with a lower bound being the Hamming distance between the words: the number of places in which letters are different. This distance is 3 for WHEAT and BREAD and so the minimum number of changes is 3. But in practice chains are longer, as with the one above. English simply doesn\u0026rsquo;t contain all possible words of 5 letters, and so we can\u0026rsquo;t have, for example:\nWHEAT WHEAD WREAD BREAD\nThis form of word puzzle, so simple and addictive, has been resurrected many times, often under such names as \u0026ldquo;word ladders\u0026rdquo;, or \u0026ldquo;laddergrams\u0026rdquo;.\nObtaining word lists\nEvery computer system will have a spell-check list on it; on a Linux system these are usually found under /usr/share/dict. My system, running Arch Linux, has these lists:\n$ wc -l /usr/share/dict/*\n123115 /usr/share/dict/american-english 127466 /usr/share/dict/british-english 189057 /usr/share/dict/catalan 54763 /usr/share/dict/cracklib-small 88328 /usr/share/dict/finnish 221377 /usr/share/dict/french 304736 /usr/share/dict/german 92034 /usr/share/dict/italian 76258 /usr/share/dict/ogerman 56329 /usr/share/dict/spanish 123115 /usr/share/dict/usa\nThese all come from the package words, which is a collection of spell-check dictionaries.\nAlthough the sizes of the English word lists may seem impressive, there are bigger lists available. One of the best is at SCOWL (Spell Checker Oriented Word Lists). You can download the compressed SCOWL file, and when uncompressed you\u0026rsquo;ll find it contains a directory called final. 
In this directory, the largest files are those of the sort english-words.*, and here\u0026rsquo;s how big they are:\n$ wc -l english-words.*\n4373 english-words.10 7951 english-words.20 36103 english-words.35 6391 english-words.40 23796 english-words.50 6233 english-words.55 13438 english-words.60 33270 english-words.70 139209 english-words.80 219489 english-words.95 490253 total\nThese lists contain increasingly abstruse and unusual words. Thus english-words.10 contains words that most adept speakers of English would know:\n$ shuf -n 10 english-words.10\nerrors hints green connections still mouth category\u0026#39;s pi won\u0026#39;s varied\nAt the other end, english-words.95 consists of unusual, obscure words unlikely to be in a common vocabulary; many of them are seldom used, have very specific meanings, or are technical:\n$ shuf -n 10 english-words.95\ndeutschemark\u0026#39;s disingenious retanner advancer\u0026#39;s shlimazl unpontifical nonrequirement peccancy\u0026#39;s photozinco nonuniting\nThis is the list which contains some splendid biological terms: \u0026ldquo;bdelloid\u0026rdquo;, which is a class of microscopic water animals called rotifers; \u0026ldquo;ctenizidae\u0026rdquo;, a small class of spiders (but the list does not contain \u0026ldquo;ctenidae\u0026rdquo;, the much larger class of wandering spiders, including the infamous Brazilian wandering spider); \u0026ldquo;cnidocyst\u0026rdquo;, which forms the stinging mechanism in the cells of the tentacles of jellyfish.\nPutting the lists 10, 20, 35 together makes a \u0026ldquo;small\u0026rdquo; list; adding in 40 and 50 gives a \u0026ldquo;medium\u0026rdquo; list; next include 55, 60, 70 for \u0026ldquo;large\u0026rdquo;; \u0026ldquo;80\u0026rdquo; for \u0026ldquo;huge\u0026rdquo;, and 95 for \u0026ldquo;insane\u0026rdquo;. The author of SCOWL claims that 60 is the right level for spell checking, while: \u0026ldquo;The 95 contains just about every English word in existence and then some. 
Many of the words at the 95 level will probably not be considered valid English words by most people.\u0026rdquo; (This is not true. I\u0026rsquo;ve discovered some words not in any list. One such is \u0026ldquo;buckleys\u0026rdquo;, as in \u0026ldquo;He hasn\u0026rsquo;t got buckleys chance\u0026rdquo;, sometimes spelled with a capital B, and meaning \u0026ldquo;He hasn\u0026rsquo;t got a chance\u0026rdquo;; that is, no chance at all. \u0026ldquo;Buckley\u0026rdquo; is only in the list of proper names, but given this usage it should be at least in the Australian words list. Which it isn\u0026rsquo;t.)\nYou will also see from above that some words contain apostrophes: these words will need to be weeded out. Here\u0026rsquo;s how to make a list and clean it up:\n$ cat english-words.10 english-words.20 english-words.35 \u0026gt; english-words-small.txt\n$ grep -v \u0026#34;\u0026#39;\u0026#34; english-words-small.txt \u0026gt; small.wds\nThis approach will yield five lists with the following numbers of words:\n$ wc -l *.wds\n234563 huge.wds 414365 insane.wds 107729 large.wds 59577 medium.wds 38013 small.wds 854247 total\nThese can be read into Julia. These lists are unsorted, but that won\u0026rsquo;t be an issue for our use of them. But you can certainly include a sort along the way.\nAnother list is available at https://github.com/dwyl/english-words which claims to have \u0026ldquo;over 466k English words\u0026rdquo;. However, this list is not as carefully curated as is SCOWL.\nFinally note that our lists are not disjoint, as are the original lists. Each list includes its predecessor, so that insane.wds contains all of the words in all of the lists.\nUsing graph theory\nThe computation of word ladders can easily be managed using the tools of graph theory, with vertices being the words, and two vertices being adjacent if their Hamming distance is 1. 
Then finding a word ladder is easily done by finding a shortest path.\nThere is a problem though, as Donald Knuth discovered when he launched the first computerization of this puzzle, of which an explanation is available in his 1994 book The Stanford GraphBase. This page, you\u0026rsquo;ll notice, contains \u0026ldquo;the 5757 five-letter words of English\u0026rdquo;. However, deciding what is and what isn\u0026rsquo;t an English word can be tricky: American versus English spellings, dialect words, newly created words and so on. I touched on this in an earlier post.\nKnuth also found that there were 671 words which were connected to no others; he called these \u0026ldquo;aloof words\u0026rdquo;, with ALOOF, of course, being one of them.\nUsing Julia\nAlthough Python has its mature and powerful NetworkX package for graph theory and network analysis, Python is too slow for this application: we are looking at very large graphs of many thousands of vertices, and computing the edges is a non-trivial task. So our choice of language is Julia. Julia\u0026rsquo;s graph theory packages are in a bit of a state of flux: an old package Graphs.jl is unmaintained, as is LightGraphs.jl. However, this latter package is receiving a new lease of life with the unfortunately confusing name of Graphs.jl, and is designed to be \u0026ldquo;functionally equivalent\u0026rdquo; to LightGraphs.jl.\nThis is the package I\u0026rsquo;ll be using.\nSetting up the graph\nWe\u0026rsquo;ll use the medium.wds dictionary since it\u0026rsquo;s relatively small, and we\u0026rsquo;ll look at five letter words. Using six-letter words or the larger list will then be a simple matter of changing a few parameters.\nWe start by opening the list in Julia:\nJulia\u0026gt; f = open(\u0026#34;medium.wds\u0026#34;, \u0026#34;r\u0026#34;)\nJulia\u0026gt; words = readlines(f)\nJulia\u0026gt; length(words)\n59577\nNow we can easily extract the five letter words, and set up the graph, first of all loading the Graphs package. 
We also need the StringDistances package to find the Hamming distance.\nJulia\u0026gt; using Graphs, StringDistances\nJulia\u0026gt; words5 = filter(x -\u0026gt; length(x)==5, words)\nJulia\u0026gt; w5 = length(words5)\n4388\nJulia\u0026gt; G5 = Graph()\nJulia\u0026gt; add_vertices!(G5, w5);\nNow the edges:\nJulia\u0026gt; for i in 1:w5\n           for j in i+1:w5\n               if Hamming()(words5[i],words5[j]) == 1\n                   add_edge!(G5,i,j)\n               end\n           end\n       end\nNote that there is a Julia package MetaGraphs.jl which allows you to add labels to edges. However, it\u0026rsquo;s just as easy to use the vertex numbers as indices into the list of words.\nWe can\u0026rsquo;t use the graph G5 directly, as it is not a connected graph (remember Donald Knuth\u0026rsquo;s \u0026ldquo;aloof\u0026rdquo; words?) We\u0026rsquo;ll do two things: find the aloof words, and choose from G5 the largest connected component. First the aloof words:\nJulia\u0026gt; aloofs = map(x-\u0026gt;words5[x],findall(iszero, degree(G5)))\nJulia\u0026gt; show(aloofs)\nI won\u0026rsquo;t include this list, as it\u0026rsquo;s too long - it contains 616 words. But if you do so, you\u0026rsquo;ll see some surprises here: who\u0026rsquo;d have thought that such innocuous words as \u0026ldquo;opera\u0026rdquo;, \u0026ldquo;salad\u0026rdquo; or \u0026ldquo;wagon\u0026rdquo; were aloof? But they most certainly are, at least within this set of words.\nAnd now for the connected component:\nJulia\u0026gt; CC = connected_components(G5)\nJulia\u0026gt; CL = map(length,CC) # size of each component\nJulia\u0026gt; mx, idx = findmax(CL)\nJulia\u0026gt; C5,vmap = induced_subgraph(G5, CC[idx])\nYou will find that the maximum connected component C5 has 3315 vertices, and the value of idx above is 3. Here vmap is a list of length 3315 which is the set of indices into the original list. Or, if the original list of vertices consisted of the numbers 1,2, up to 4388, then vmap is a sublist of length 3315. 
And we can consider all those numbers as indices into the words5 list.\nNow we can write a little function to produce word ladders; here using the Julia function a_star to find a shortest path:\nJulia\u0026gt; subwords = words5[vmap]\nJulia\u0026gt; function ladder(w1,w2)\n           i = findfirst(x -\u0026gt; words5[x] == w1, vmap)\n           j = findfirst(x -\u0026gt; words5[x] == w2, vmap)\n           P = a_star(C5,i,j)\n           verts = append!([i],map(dst,P))\n           subwords[verts]\n       end\nAnd of course try it out:\nJulia\u0026gt; show(ladder(\u0026#34;wheat\u0026#34;,\u0026#34;bread\u0026#34;))\n[\u0026#34;wheat\u0026#34;, \u0026#34;cheat\u0026#34;, \u0026#34;cleat\u0026#34;, \u0026#34;bleat\u0026#34;, \u0026#34;bleak\u0026#34;, \u0026#34;break\u0026#34;, \u0026#34;bread\u0026#34;]\nNote that a word ladder can only exist when both words have indices in the chosen largest connected component. For example:\nJulia\u0026gt; show(ladder(\u0026#34;roses\u0026#34;,\u0026#34;scent\u0026#34;))\n[\u0026#34;roses\u0026#34;, \u0026#34;ruses\u0026#34;, \u0026#34;rusts\u0026#34;, \u0026#34;rests\u0026#34;, \u0026#34;tests\u0026#34;, \u0026#34;teats\u0026#34;, \u0026#34;seats\u0026#34;, \u0026#34;scats\u0026#34;, \u0026#34;scans\u0026#34;, \u0026#34;scant\u0026#34;, \u0026#34;scent\u0026#34;]\nbut\nJulia\u0026gt; show(ladder(\u0026#34;chair\u0026#34;,\u0026#34;table\u0026#34;))\nwill produce an error, as neither of those words has an index in the largest connected component. In fact \u0026ldquo;chair\u0026rdquo; sits in a small connected component of 3 words, and \u0026ldquo;table\u0026rdquo; is in another connected component of 5 words.\nThe longest possible ladder\nWe have seen that the length of a word ladder will often exceed the Hamming distance between the start and ending words. But what is the maximum length of such a ladder?\nHere the tool to use is the eccentricity of each vertex of the graph: for each vertex, find the shortest path to every other vertex. 
The length of the longest such path is the eccentricity of that vertex.\nJulia\u0026gt; eccs = eccentricity(C5)\nThis command will take a noticeable amount of time - only a few seconds, it\u0026rsquo;s true, but it is far from instantaneous. The intense computation needed here is one of the reasons that I prefer Julia to Python for this experiment.\nNow we can use the eccentricities to find the longest path. Since this is an undirected graph, there must be at least two vertices with equal largest eccentricities.\nJulia\u0026gt; mx = maximum(eccs)\nJulia\u0026gt; inds = findall(x -\u0026gt; x == mx, eccs)\nJulia\u0026gt; ld = ladder(subwords[inds[1]], subwords[inds[2]])\nJulia\u0026gt; print(join(ld,\u0026#34;, \u0026#34;),\u0026#34;\\nwhich has length \u0026#34;,length(ld))\naloud, cloud, clout, flout, float, bloat, bleat, cleat, cleft, clefs, clews, slews, sleds, seeds, feeds, feels, fuels, furls, curls, curly, curvy, curve, carve, calve, valve, value, vague, vogue, rogue\nwhich has length 29\nWho\u0026rsquo;d have guessed?\nResults from other lists\nWith the 12478 five-letter words from the \u0026ldquo;large\u0026rdquo; list, there are 748 aloof words, and the longest ladder is\nrayon, racon, bacon, baron, boron, boson, bosom, besom, besot, beset, beret, buret, curet, cures, curds, surds, suras, auras, arras, arias, arils, anils, anile, anole, anode, abode, abide, amide, amine, amino, amigo\nwhich has length 31. 
We might in general expect a shorter \u0026ldquo;longest ladder\u0026rdquo; than when using a smaller list: the \u0026ldquo;large\u0026rdquo; list has far more words, hence greater connectivity, which would lead in many cases to shorter paths.\nThe \u0026ldquo;huge\u0026rdquo; and \u0026ldquo;insane\u0026rdquo; lists have longest ladders of length 28 and 27 respectively:\nessay, assay, asway, alway, allay, alley, agley, aglet, ablet, abled, ailed, aired, sired, sered, seres, serrs, sears, scars, scary, snary, unary, unarm, inarm, inerm, inert, inept, inapt, unapt\nand\nentry, entsy, antsy, artsy, artly, aptly, apply, apple, ample, amole, amoke, smoke, smote, smite, suite, quite, quote, quott, quoit, qubit, oubit, orbit, orbic, urbic, ureic, ureid, ursid\nFor six-letter words, here are longest ladders in each of the lists:\nSmall, length 32:\nsteady, steamy, steams, steals, steels, steers, sheers, cheers, cheeks, checks, chicks, clicks, slicks, slices, spices, spites, smites, smiles, smiled, sailed, tailed, tabled, tabbed, dabbed, dubbed, rubbed, rubber, rubier, rubies, rabies, babies, babied\nMedium, length 44:\ntrusty, trusts, crusts, crests, chests, cheats, cleats, bleats, bloats, floats, flouts, flours, floors, floods, bloods, broods, brooks, crooks, crocks, cracks, cranks, cranes, crates, crated, coated, boated, bolted, belted, belied, belies, bevies, levies, levees, levers, lovers, hovers, hovels, hotels, motels, models, modals, morals, morays, forays\nLarge, length 43:\nuneasy, unease, urease, crease, creese, cheese, cheesy, cheeky, cheeks, creeks, breeks, breeds, breads, broads, broods, bloods, bloops, sloops, stoops, strops, strips, stripe, strive, shrive, shrine, serine, ferine, feline, reline, repine, rapine, raping, raring, haring, hiring, siring, spring, sprint, splint, spline, saline, salina, saliva\nHuge, length 63:\naneath, sneath, smeath, smeeth, smeech, sleech, fleech, flench, clench, clunch, clutch, crutch, crotch, crouch, grouch, grough, trough, though, 
shough, slough, slouch, smouch, smooch, smooth, smoots, smolts, smalts, spalts, spaits, spains, spaing, spring, sprint, splint, spline, upline, uplink, unlink, unkink, unkind, unkend, unkent, unsent, unseat, unseal, unseel, unseen, unsewn, unsews, unmews, enmews, emmews, emmers, embers, embars, embark, imbark, impark, impart, import, impost, impose, impone Insane, length 49:\nambach, ambash, ambush, embush, embusk, embulk, embull, emball, empall, emparl, empark, embark, embars, embers, emmers, emmews, enmews, endews, enders, eiders, ciders, coders, cooers, cooees, cooeed, coomed, cromed, cromes, crimes, crises, crisis, crisic, critic, iritic, iridic, imidic, amidic, amidin, amidon, amydon, amylon, amylin, amyrin, amarin, asarin, asaron, usaron, uzaron, uzarin The length of time taken for the computations increases with the size of the lists and, to a certain extent, with the length of the words. On my somewhat mature laptop (Lenovo X1 Carbon 3rd Generation), computing the eccentricities for six-letter words and the \u0026ldquo;insane\u0026rdquo; list took over 6 minutes.\nA previous experiment with Mathematica You can see this (with some nice graph diagrams) at https://blog.wolfram.com/2012/01/11/the-longest-word-ladder-puzzle-ever/ from about 10 years ago. This experiment used Mathematica\u0026rsquo;s English language list, about which I can find nothing other than that it exists. However, in the comments, somebody shows that Mathematica 8 had 92518 words in its English dictionary. And this number is dwarfed by Finnish, with 728498 words. But that might just as much be an artifact of Finnish lexicography. Deciding whether or not a word is English is very hard indeed, and any lexicographer will decide on whether two words need separate definitions, or whether one can be defined within the other, so to speak - that is, under the same headword. 
MOUSE, MOUSY, MOUSELIKE - do these require three definitions, or two, or just one?\n","link":"https://numbersandshapes.net/posts/word_ladders_with_julia/","section":"posts","tags":["programming","julia"],"title":"Word ladders with Julia"},{"body":"","link":"https://numbersandshapes.net/tags/education/","section":"tags","tags":null,"title":"Education"},{"body":"Plagiarism, text matching, and academic integrity Every modern academic teacher is in thrall to giant text-matching systems such as Ouriginal or Turnitin. These systems are sold as \u0026ldquo;plagiarism detectors\u0026rdquo;, which they are not - they are text matching systems, and they generally work by providing a report showing how much of a student\u0026rsquo;s submitted work matches text from other sources. It is up to the academic to decide if the level of text matching constitutes plagiarism.\nAlthough Turnitin sells itself as a plagiarism detector, or at any rate a tool for supporting academic integrity, its software is closed source, so, paradoxically, there\u0026rsquo;s no way of knowing if any of its source code has been plagiarized from another source.\nSuch systems work by having access to a giant corpus of material: published articles, reports, text on websites, blogs, previous student work obtained from all over, and so on. The more texts a system can try to match a submission against, the more confidence an academic is supposed to have in its findings. (And the more likely an administration will see fit to pay the yearly licence costs.)\nOf course in the arms-race of academic integrity, you\u0026rsquo;ll find plenty of websites offering advice on \u0026ldquo;how to beat Turnitin\u0026rdquo;; in the interests of integrity I\u0026rsquo;m not going to link to any, but they\u0026rsquo;re not hard to find. 
And of course Turnitin will presumably up its game to counter these methods, and the sites will be rewritten, and so on.\nMy problem I have been teaching a fully online class; although my university is slowly trying to move back (at least partially) into on-campus delivery after 2 1/2 years of Covid remote learning, some classes will still run online.\nMy students were completing an online \u0026ldquo;exam\u0026rdquo;: a timed test (un-invigilated) in which the questions were randomized so that no students got the same set of questions. They were all \u0026ldquo;Long Answer\u0026rdquo; questions in the parlance of our learning management system; at any rate for each question a text box was given for the student to enter their answer.\nThe test was to be marked \u0026ldquo;by hand\u0026rdquo;. That is, by me.\nMany of my students speak English as a second language, and although they are supposed to have a basic competency sufficient for tertiary study, many of them struggle. And if a question asks them to define, for example, \u0026ldquo;layering\u0026rdquo; in the context of cybersecurity, I have not the slightest problem with them searching for information online, finding it, and copying it into the textbox. If they can search for the correct information and find it, that\u0026rsquo;s good enough for me. This exam is also open book. As far as I\u0026rsquo;m concerned, finding correct information is a useful and valuable skill; testing for the use of what they might remember, and \u0026ldquo;in their own words\u0026rdquo; is pedagogically indefensible.\nSo, working my way grimly through these exams, I had a \u0026ldquo;this seems familiar\u0026hellip;\u0026rdquo; moment. And indeed, searching through some previous submissions I found exactly the same answer submitted by another student. Well, that can happen. What is less likely to happen, at least by chance, is for almost all of the 16 questions to have the same submissions as other students. 
People working in the area of academic integrity sometimes speak of a \u0026ldquo;spidey sense\u0026rdquo;, a sort of sixth sense that alerts you that something\u0026rsquo;s not right, even if you can\u0026rsquo;t quite yet pinpoint the issue. This was that sense, and more.\nIt turned out that the entire test and all answers could be downloaded and saved as a CSV file, and hence loaded into Python as a Pandas DataFrame.\nMy first attempt had me looking at all pairs of students and their test answers, to see if any of the answer text strings matched. And some indeed did. Because of the randomized nature of the test, one student might receive as question 7, say, the same question that another student might see as question 5, or question 8.\nThe data I had to work with consisted of two DataFrames. One contained all the exam information:\nexamdata.dtypes Username object FirstName object LastName object Q # int64 Q Text object Answer object Score float64 Out Of float64 dtype: object This DataFrame was ordered by student, and then by question number. This meant that every student had up to 16 rows of the DataFrame. I had another DataFrame containing just the names and cohorts (there were two distinct cohorts, and this information was not given in the dump of exam data to the CSV file).\nnames.dtypes Username object FirstName object LastName object Cohort object dtype: object I added the cohorts by hand. This could then be merged with the exam data:\ndata = examdata.merge(names,on=[\u0026#34;Username\u0026#34;,\u0026#34;FirstName\u0026#34;,\u0026#34;LastName\u0026#34;],how=\u0026#39;left\u0026#39;).reset_index(drop=True) String similarity Since the exam answers in my DataFrame were text strings, any formatting that the student might have given in an answer, such as bullet points, a numbered list, a table, or font changes, was ignored. All I had to work with were ASCII strings.\nHowever, exact string matching led to very few results. 
This is because there might have been a difference in starting or ending whitespace or other characters, or because one student\u0026rsquo;s submission included another student\u0026rsquo;s submission as a substring. Consider for example these two (synthetic) examples:\n\u0026ldquo;A man-in-the-middle attack is a cyberattack where the attacker secretly relays and possibly alters the communications between two parties who believe that they are directly communicating with each other, as the attacker has inserted themselves between the two parties.\u0026rdquo; (from the Wikipedia page on the Man-In-The-Middle attack.)\n\u0026ldquo;I think it\u0026rsquo;s this: A man-in-the-middle attack is a cyberattack where Mallory secretly relays and possibly alters the communications between Alice and Bob who believe that they are directly communicating with each other, as Mallory has inserted himself between them.\u0026rdquo;\nThere are various ways of measuring the distance between strings, or alternatively of their similarity. Two much used methods are the Jaro similarity measure (named for Matthew Jaro, who introduced it in 1989), and the Jaro-Winkler measure, a version named also for William Winkler, who discussed it in 1990. Both of these are defined on their Wikipedia page. Winkler\u0026rsquo;s measure adds to the original Jaro measure a factor based on the equality of any beginning substring.\nIt turns out that the Jaro-Winkler similarity of the two strings above is about 0.78. If the initial \u0026ldquo;I think it\u0026rsquo;s this: \u0026rdquo; is removed from the second string, then the similarity increases to 0.89.\nBoth the Jaro and Jaro-Winkler measures are happily implemented in the Python jellyfish package. This package also includes some other standard measurements of the closeness of two strings.\nMy approach was to find the number of submissions whose Jaro-Winkler similarity exceeded 0.85. 
And I found this number empirically, by checking a number of submissions that appeared to me to be very similar, and computing their similarities.\nSome results In this class there were 39 students, divided into two cohorts: 12 were taught by me, and the rest by another teacher. I was only concerned with mine. There were 16 questions, but not every student answered every question, and so the maximum size of my DataFrame would be \\(12\\times 16=192\\); in fact I had a total of 171 different answers. The numbers of questions submitted by the students were:\n11, 16, 14, 16, 16, 16, 15, 13, 12, 12, 16, 14\nand so (to avoid comparing pairs of submissions twice) I aimed to compare every student\u0026rsquo;s submission to the submissions of all students below them in the DataFrame. This makes for 13,383 comparisons. In fact, because I\u0026rsquo;m a lazy programmer, I simply compared every submission to every submission below it in the DataFrame (which meant that I was also comparing submissions from a single student), for a total of 14,535 comparisons.\nThis is how (assuming that the jellyfish package has been loaded as jf):\nmatch_list = [] N = my_data.shape[0] for i in range(N): for j in range(i+1,N): jfs = jf.jaro_winkler_similarity(my_data.at[i,\u0026#34;Answer\u0026#34;],my_data.at[j,\u0026#34;Answer\u0026#34;]) if jfs \u0026gt; 0.85: match_list += [[my_data.at[i,\u0026#34;Username\u0026#34;],my_data.at[j,\u0026#34;Username\u0026#34;],my_data.at[i,\u0026#34;Q #\u0026#34;],my_data.at[j,\u0026#34;Q #\u0026#34;],jfs]] I ended up with 33 matches, which I put into a DataFrame:\nmatches = pd.DataFrame(match_list,columns=[\u0026#34;ID 1\u0026#34;,\u0026#34;ID 2\u0026#34;,\u0026#34;Q# 1\u0026#34;,\u0026#34;Q# 2\u0026#34;,\u0026#34;Similarity\u0026#34;]) As you see, each row of the DataFrame contained the two student ID numbers, the relevant question numbers, and the similarity measure. 
Because of the randomisation of the exam, two students might get the same question but with a different number (as I mentioned earlier).\nTo see if any pair of students appeared more than once, I grouped the DataFrame by their ID numbers:\ndg = matches.groupby([\u0026#34;ID 1\u0026#34;,\u0026#34;ID 2\u0026#34;]).size() dg.values array([ 1, 1, 1, 1, 1, 1, 1, 11, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, 1]) Notice something? There\u0026rsquo;s a pair of students who submitted very similar answers to 11 questions! Now this pair can be isolated:\nmaxdg = max(dg.values) cheats = dg.loc[dg.values==maxdg].index[0] c0, c1 = cheats The matches can now be listed:\ncollusion = matches.loc[(matches[\u0026#34;ID 1\u0026#34;]==c0) \u0026amp; (matches[\u0026#34;ID 2\u0026#34;]==c1)].reset_index(drop=True) and we can print off these matches as evidence.\n","link":"https://numbersandshapes.net/posts/academic_text_matching/","section":"posts","tags":["Python","Education"],"title":"Every academic their own text-matcher"},{"body":"","link":"https://numbersandshapes.net/tags/python/","section":"tags","tags":null,"title":"Python"},{"body":"What this post is about In the previous post we showed how to set up a simple interactive map using Python and its folium package. As the example, we used a Federal electorate situated within the city of Melbourne, Australia, and the various voting places, or polling places (also known as polling \u0026ldquo;booths\u0026rdquo;) associated with it.\nThis post takes the map a little further, and we show how to use Python\u0026rsquo;s geovoronoi package to create Voronoi regions around each booth. This (we hope) will give a map of where voting for a particular party might be strongest. (We make the assumption that every voter will vote at the booth closest to their home.)\nBecause some voting booths are outside the electorate - this includes early voting centres - we first need to reduce the booths to only those within the boundary of the electorate. 
In this case, as the boundary is a simply connected region, this is straightforward. Then we can create the Voronoi regions and map them.\nMessage about the underlying software NOTE: much of the material and discussion here uses the Python package \"folium\", which is a front end to the Javascript package \"leaflet.js\". The lead developer of leaflet.js is Volodymyr Agafonkin, a Ukrainian up until recently living and working in Kyiv. Leaflet version 1.80 was released on April 18, \u0026ldquo;in the middle of war\u0026rdquo;, with \u0026ldquo;air raid sirens sounding outside\u0026rdquo;. See the full statement here.\n(This banner is included here with the kind permission of Mr Agafonkin.) Please consider the Ukrainian people who are suffering under an unjust aggressor, and help them.\nObtaining interior points and Voronoi regions We start by simplifying a few of the variable names:\nhbooths = higgins_booths.copy() We will also need the boundary of the electorate:\nhiggins_crs=higgins.to_crs(epsg=4326) higgins_poly = higgins_crs[\u0026#34;geometry\u0026#34;].iat[0] Now finding the interior points is easy(ish) using GeoPandas:\nhiggins_gpd = gpd.GeoDataFrame( hbooths, geometry=gpd.points_from_xy(hbooths.Longitude, hbooths.Latitude)) higgins_crs = higgins_gpd.set_crs(epsg=4326) interior = higgins_crs[higgins_crs.geometry.within(higgins_poly)].reset_index(drop=True) We should also check if any of the interior points double up. This might happen if one location is used, say, for both an early voting centre and a voting booth on election day. 
The geovoronoi package will throw an error if a location is repeated.\nig = interior.groupby([\u0026#34;Latitude\u0026#34;,\u0026#34;Longitude\u0026#34;]).size() ig.loc[ig.values\u0026gt;1] Latitude Longitude -37.846104 144.998383 2 dtype: int64 The geographical coordinates can be obtained with, say\ndouble = ig.loc[ig.values\u0026gt;1].index[0] This means we can find the offending booths in the interior:\ninterior.loc[(interior[\u0026#34;Latitude\u0026#34;]==double[0]) \u0026amp; (interior[\u0026#34;Longitude\u0026#34;]==double[1])] PollingPlaceNm Latitude Longitude geometry 28 South Yarra HIGGINS PPVC -37.846104 144.998383 POINT (144.99838 -37.84610) 29 South Yarra South -37.846104 144.998383 POINT (144.99838 -37.84610) We\u0026rsquo;ll remove the pre-polling voting centre at row 28:\ninterior = interior.drop(28) Now we\u0026rsquo;re in a position to create the Voronoi regions. We have the external polygon (the boundary of the electorate), and all the internal points we need, with no duplications.\nfrom geovoronoi import voronoi_regions_from_coords, points_to_region interior_coords = np.array(interior[[\u0026#34;Longitude\u0026#34;,\u0026#34;Latitude\u0026#34;]]) polys, pts = voronoi_regions_from_coords(interior_coords, higgins_poly) Each of the variables polys and pts is given as a dictionary. We want to associate each interior voting booth with a given region. 
This can be done by creating an index between the points and regions, and then adding the regions to the interior dataframe:\nindex = points_to_region(pts) N = len(polys) geometries = [polys[index[k]] for k in range(N)] interior[\u0026#34;geometry\u0026#34;]=geometries What this has done is replace the original geometry data (which were just the coordinates of each voting booth, given as \u0026ldquo;POINT\u0026rdquo; datatypes) with regions given as \u0026ldquo;POLYGON\u0026rdquo; datatypes.\nAdding some voting data We could simply map the Voronoi regions now:\nfrom geovoronoi.plotting import subplot_for_map, plot_voronoi_polys_with_points_in_area fig, ax = subplot_for_map(figsize=(10,10)) plot_voronoi_polys_with_points_in_area(ax, higgins_poly, polys, interior_coords, pts) plt.savefig(\u0026#39;higgins_voronoi.png\u0026#39;) plt.show() What we want, though, is a little more control. First, we\u0026rsquo;ll add some more information to the DataFrame. The simplest information is the \u0026ldquo;two candidate preferred\u0026rdquo; data: these are the number of votes allocated to the two final candidates after the preferential counting. The files are available on the AEC website; they can be downloaded and used:\ntcp = pd.read_csv(\u0026#39;HouseTcpByCandidateByPollingPlaceDownload-27966_2.csv\u0026#39;) higgins_tcp = tcp.loc[tcp[\u0026#34;DivisionNm\u0026#34;]==\u0026#34;Higgins\u0026#34;] Each candidate gets their own row, which means we have to copy the votes from each candidate into the interior DataFrame. In the case of the 2022 election, the Higgins final candidates represented the Australian Labor Party (ALP), and the Liberal Party (LP). The party is given in the column \u0026ldquo;PartyAb\u0026rdquo; in the higgins_tcp data frame. 
Adding them to the interior data frame is only a tiny bit fiddly:\nlp_votes = [] alp_votes = [] for index,row in interior.iterrows(): place = row[\u0026#34;PollingPlaceNm\u0026#34;] alp_votes += [higgins_tcp.loc[(higgins_tcp[\u0026#34;PollingPlace\u0026#34;]==place) \u0026amp; (higgins_tcp[\u0026#34;PartyAb\u0026#34;]==\u0026#34;ALP\u0026#34;),\u0026#34;OrdinaryVotes\u0026#34;].iat[0]] lp_votes += [higgins_tcp.loc[(higgins_tcp[\u0026#34;PollingPlace\u0026#34;]==place) \u0026amp; (higgins_tcp[\u0026#34;PartyAb\u0026#34;]==\u0026#34;LP\u0026#34;),\u0026#34;OrdinaryVotes\u0026#34;].iat[0]] interior[\u0026#34;ALP Votes\u0026#34;] = alp_votes interior[\u0026#34;LP Votes\u0026#34;] = lp_votes Creating the map The base map is the same as before:\nhmap2 = folium.Map(location=centre,# crs=\u0026#39;EPSG4283\u0026#39;, tiles=\u0026#39;OpenStreetMap\u0026#39;, min_lat=b1-extent, max_lat=b3+extent, min_long=b0-extent, max_long=b2+extent, width=800,height=800,zoom_start=13,scrollWheelZoom=False) We don\u0026rsquo;t need to draw the boundary or interior as the Voronoi regions will cover it. What we\u0026rsquo;ll do instead is draw each Voronoi region, colouring it red (for a Labor majority) or blue (for a Liberal majority). 
Like this:\nfor index,row in interior.iterrows(): rloc = [row[\u0026#34;Latitude\u0026#34;],row[\u0026#34;Longitude\u0026#34;]] row_json = gpd.GeoSeries([row[\u0026#34;geometry\u0026#34;]]).to_json() tooltip = (\u0026#34;\u0026lt;b\u0026gt;{s1}\u0026lt;/b\u0026gt;\u0026#34;).format(s1 = row[\u0026#34;PollingPlaceNm\u0026#34;]) if row[\u0026#34;ALP Votes\u0026#34;] \u0026gt; row[\u0026#34;LP Votes\u0026#34;]: folium.GeoJson(data=row_json,style_function=lambda x: {\u0026#39;fillColor\u0026#39;: \u0026#39;red\u0026#39;}).add_to(hmap2) folium.CircleMarker(radius=5,color=\u0026#34;black\u0026#34;,fill=True,location=rloc,tooltip=tooltip).add_to(hmap2) else: folium.GeoJson(data=row_json,style_function=lambda x: {\u0026#39;fillColor\u0026#39;: \u0026#39;blue\u0026#39;}).add_to(hmap2) folium.CircleMarker(radius=5,color=\u0026#34;black\u0026#34;,fill=True,location=rloc,tooltip=tooltip).add_to(hmap2) And to view the map:\nhmap2 ","link":"https://numbersandshapes.net/posts/more_mapping_howto/","section":"posts","tags":["Python","GIS"],"title":"  More mapping \"not quite how-to\" - Voronoi regions\n  "},{"body":"","link":"https://numbersandshapes.net/tags/gis/","section":"tags","tags":null,"title":"GIS"},{"body":"Message about the underlying software NOTE: much of the material and discussion here uses the Python package \"folium\", which is a front end to the Javascript package \"leaflet.js\". The lead developer of leaflet.js is Volodymyr Agafonkin, a Ukrainian up until recently living and working in Kyiv. Leaflet version 1.80 was released on April 18, \u0026ldquo;in the middle of war\u0026rdquo;, with \u0026ldquo;air raid sirens sounding outside\u0026rdquo;. See the full statement here.\n(This banner is included here with the kind permission of Mr Agafonkin.) Please consider the Ukrainian people who are suffering under an unjust aggressor, and help them.\nThe software libraries and data we need The idea of this post is to give a little insight into how my maps were made. 
All software is of course open-source, and all data is freely available.\nThe language used is Python, along with the libraries:\nPandas for data manipulation and analysis GeoPandas which adds geospatial capabilities to Pandas folium for map creation - this is basically a Python front-end to creating interactive maps with the powerful leaflet.js JavaScript library. geovoronoi for adding Voronoi diagrams to maps Other standard libraries such as numpy and matplotlib are also used.\nThe standard mapping element is a shapefile which encodes a map element: for example the shape of a country or state; the position of a city. In order to use them, they have to be downloaded from somewhere. For Australian Federal elections, the AEC makes available much relevant geospatial information.\nVictorian geospatial information can be obtained from Vicmap Admin.\nCoordinates of polling booths can be obtained again from the AEC for each election. For the recent 2022 election, data is available at their Tallyroom. You\u0026rsquo;ll see that this page contains geospatial data as well as election results. 
Polling booth locations, using latitude and longitude, are available here.\nBuilding a basic map We can download the shapefiles and polling booth information (unzip any zip files to extract the documents as needed), and read them into Python:\nvic = gpd.read_file(\u0026#34;E_VIC21_region.shp\u0026#34;) booths = pd.read_csv(\u0026#34;GeneralPollingPlacesDownload-27966.csv\u0026#34;) Any Division in Victoria can be obtained and quickly plotted; for example Higgins:\nhiggins = vic.loc[vic[\u0026#34;Elect_div\u0026#34;]==\u0026#34;Higgins\u0026#34;] higgins.plot() We can also get a list of the Polling Places in the electorate, including their locations:\nhiggins_booths = booths.loc[booths[\u0026#34;DivisionNm\u0026#34;]==\u0026#34;Higgins\u0026#34;][[\u0026#34;PollingPlaceNm\u0026#34;,\u0026#34;Latitude\u0026#34;,\u0026#34;Longitude\u0026#34;]] With this shape, we can create an interactive map, showing for example the names of each polling booth.\nTo plot the electorate on a background map we need to first turn the shapefile into a GeoJSON file:\nhiggins_json = folium.GeoJson(data = higgins.to_json()) And to plot it, we can find its bounds (in latitude and longitude) and place it in a map made just a little bigger:\nb0,b1,b2,b3 = higgins.total_bounds extent = 0.1 centre = [(b1+b3)/2,(b0+b2)/2] hmap = folium.Map(location=centre, min_lat=b1-extent, max_lat=b3+extent, min_long=b0-extent, max_long=b2+extent, width=800,height=800, zoom_start=13, scrollWheelZoom=False ) higgins_json.add_to(hmap) hmap The various commands and parameters above should be straightforward: from the latitude and longitude given as bounds, the variable centre is exactly that. Because our area is relatively small, we can treat the earth\u0026rsquo;s surface as effectively flat, and treat geographical coordinates as though they were Cartesian coordinates. Thus for this simple map we don\u0026rsquo;t have to worry about map projections. The defaults will work fine. 
The variables min_lat and the others define the extent of our map; width and height are given in pixels; and an initial zoom factor is given. The final setting scrollWheelZoom=False stops the map from being inadvertently zoomed in or out by the mouse scrolling on it (very easy to do). The map can be zoomed by the controls in the upper left:\nWe can color the electorate by adding a style to the JSON variable:\nhiggins_json = folium.GeoJson(data = higgins.to_json(), style_function = lambda feature: { \u0026#39;color\u0026#39;: \u0026#39;green\u0026#39;, \u0026#39;weight\u0026#39;:6, \u0026#39;fillColor\u0026#39; : \u0026#39;yellow\u0026#39;, \u0026#39;fill\u0026#39;: True, \u0026#39;fillOpacity\u0026#39;: 0.4} ) Because folium is a front end to the Javascript leaflet.js package, much information is available on that site. For instance, all the parameters available to change the colors, border etc of the electorate are listed in the description of the leaflet Path.\nAdding interactivity So far the map is pretty static; we can zoom in and out, but that\u0026rsquo;s about it. Let\u0026rsquo;s add the voting booths as circles, each one with a \u0026ldquo;tooltip\u0026rdquo; giving its name. A tooltip is like a popup which automatically appears when the cursor hovers over the relevant marker on the map. A popup, on the other hand, requires a mouse click to be seen.\nWe can create the points and tooltips from the list of booths in Higgins.\nfor index, row in higgins_booths.iterrows(): loc = [row[\u0026#34;Latitude\u0026#34;],row[\u0026#34;Longitude\u0026#34;]] tooltip = (\u0026#34;\u0026lt;b\u0026gt;{s1}\u0026lt;/b\u0026gt;\u0026#34;).format(s1 = row[\u0026#34;PollingPlaceNm\u0026#34;]) folium.CircleMarker(radius=5,color=\u0026#34;black\u0026#34;,fill=True,location=loc,tooltip=tooltip).add_to(hmap) The map can now be saved:\nhmap.save(\u0026#34;higgins_booths.html\u0026#34;) and viewed in a web browser:\nThis is a very simple example of an interactive map. 
We can do lots more: display markers as large circles with numbers in them; divide the map into regions and make the regions respond to hovering or selection, add all sorts of text (even html iframes) as popups or tooltips, and so on.\n","link":"https://numbersandshapes.net/posts/mapping_howto/","section":"posts","tags":["Python","GIS"],"title":"  A mapping \"not quite how-to\"\n  "},{"body":"In this post we look at two Divisions from the recent Federal election: the inner city seat of Melbourne, and the bayside seat of Macnamara. Up until the recent election, Melbourne was the only Division to have a Greens representative. Macnamara, previously known as \u0026ldquo;Melbourne Ports\u0026rdquo; has been a Labor stronghold for all of its existence.\nThe near miss: Macnamara The contest in Macnamara was curious, and the vote counting took a very long time. Unlike almost all other Divisions, in Macnamara it was a three way contest, with Labor, Liberals and Greens polling very similar numbers at each polling booth. In fact the Greens pulled more votes individually than either Labor or the Liberals, but none of these parties had enough first preference votes to win. The decision thus came down to preferences, which knocked out the Greens and in the end put Labor back as the winner. Whether this is a good thing or not depends on your perspective, but it does show that a relatively high first preference count may not necessarily transfer to a win; one of the other parties might pick up more votes through preferences, and enough to push them over an absolute majority of 50% + 1.\nBut what I decided to do was, similar to my previous mapping post, show the Division of Macnamara with coloured Voronoi regions depending on first preferences. In this map, green shows support for the Greens; red for the Australian Labor Party, and blue for the Liberal Party. 
(Note that the Liberal Party is not \u0026ldquo;liberal\u0026rdquo; in any dictionary sense; this is a right-wing party which, once the support of business and economic interests, has steadily become more conservative over the years.)\nThis map seems to show that statistically, Greens support is fairly widespread across the Division. There is also a lot of detail left out: for example, many votes were cast at pre-polling centres, and the numbers are large enough to significantly affect the result. This table compares votes cast in the Division on the day, which is what the map shows, against votes cast at the pre-polling centres:\nType GRN ALP Lib Others On the day 12110 11499 8737 4857 PPVC 8014 8791 8085 3262 Totals 20124 20290 16822 7849 Clearly first preferences show the Greens votes exceed votes both for the Labor and Liberal parties on the day. At pre-polling, Labor did better than the Greens. It does seem though that the Liberal first preferences were very much smaller, so it seems a bit paradoxical that the Greens were knocked out first, giving a final TCP of Labor and Liberal. But that\u0026rsquo;s preferences for you!\nThe win The current leader of the Australian Greens is Adam Bandt, who is the Federal MP for the Division of Melbourne. He has held this seat, increasing his majority at each election, since first winning it in 2010.\nThe next map shows the TCP results from the most recent election, which shows that Bandt has a TCP majority at every polling place except two:\nAs with all such maps, the amount of information here is not huge - but it looks nice. In particular, it shows that statistically, Greens support is fairly widespread across the Division.\nOther similar informational sites There are other sites, showing results at various booths. One is the excellently named PollBludger from the political analyst William Bowe, who, according to the site, \u0026ldquo;is a Perth-based election analyst and occasional teacher of political science. 
His blog, The Poll Bludger, has existed in one form or another since 2004, and is one of the most heavily trafficked websites on Australian politics.\u0026rdquo;\nAnother site is The Tally Room from Ben Raue, who has an adjunct appointment at the University of Sydney. A very nice addition to this site is a tutorial on how to create your own maps, in this case using Google Earth. (I don\u0026rsquo;t know how recent this is, though.)\n(The above site is not to be confused with the AEC\u0026rsquo;s own Tallyroom which also has a lot of results for the downloading.)\nHowever, I don\u0026rsquo;t know of any sites which add Voronoi diagrams around the polling booths so as to give a picture of the voting characteristics of an electorate. Whether this is of any use is of course a moot point.\n","link":"https://numbersandshapes.net/posts/further_mapping/","section":"posts","tags":["GIS","voting"],"title":"Further mapping: a win and a near miss"},{"body":"","link":"https://numbersandshapes.net/tags/voting/","section":"tags","tags":null,"title":"Voting"},{"body":"This continues on from the previous post, trying to make some sense of the voting in my electorate of Wills and the neighbouring electorate of Cooper. Both these electorates (or more formally \u0026ldquo;Divisions\u0026rdquo;), as I mentioned in the previous post, are very similar in their geography, demography, and history.\nLast post I simply showed a map of voting booths, using a circle roughly proportional to the size of the ratio of votes between the two major candidates. This used a local system called Two Candidate Preferred, which indicates the results after all preferences have been distributed. Australian lower house elections use a preferential system formally called instant run-off voting, in which each voter numbers all candidates in order of preference. 
The candidates with lowest counts are successively removed from the count, their ballots being passed on to other candidates using the highest available preference on a ballot. This continues until only two candidates remain; the one with the largest number of votes is the winner.\nAlthough the full count can take some weeks - this must include all the pre-poll votes, postal votes, and absentee votes - an indicative TCP is usually available on the evening of an election. There are sometimes a handful of electorates for which an outcome may not be known for some time, especially if the count is very close. In this most recent election, the division of Macnamara took a long time to be counted: it was a three way contest between Liberal, Labor, and the Greens, with very similar first preference counts for all three parties. It was thus a count which relied very heavily on preferences.\nFor this post I was interested in overlaying the electorate with a Voronoi diagram based on the booths. This is a subdivision of the electorate into regions around each booth; each such region consists of the points in the plane which are closer to that particular booth than to any other. If we make the simplifying (and not unreasonable) assumption that everybody votes in the booth closest to where they live, we can thus subdivide the electorate into Greens/Labor regions.\nThe idea is to colour each region by its TCP: a booth that favours Labor will have its corresponding region red, and a booth that favours the Greens will have its corresponding region green.\nTo obtain the Voronoi diagram we make use of the Python library geovoronoi which returns regions as shapefiles. These can then be easily converted to JSON files for including on a folium map.\nHere are the results, first for Wills:\nand for Cooper:\nNaturally these maps cannot tell us everything, and their limitations must be noted: there is no attempt to provide shades of colour for the size of the ratio. 
That is, a booth with a Labor to Greens voting ratio of 3.5 gets the same shade of red as a booth with a ratio of 1.01. However, the popups show the ratio at each booth.\nThe numbers of votes cast at the booths are not equal. For instance, if you go to the AEC page for Wills and check out the TCP numbers by polling place, numbers of votes cast range from 71 at Strathmore North to 2739 at Brunswick North, to even larger numbers at the pre-poll voting centres, Northcote and Pascoe Vale, with 5210 and 15141 total votes cast respectively. A better map may make adjustments both for the value of the ratio, and the total number of votes cast.\n","link":"https://numbersandshapes.net/posts/post_election_mapping/","section":"posts","tags":["GIS","voting"],"title":"Post-election mapping"},{"body":"So the Australian federal election of 2022 is over as far as the public is concerned; all votes have been cast and now it\u0026rsquo;s a matter of waiting while the Australian Electoral Commission tallies the numbers, sorts all the preferences, and arrives at a result. Because of the complications of the voting system, and of all the checks and balances within it, a final complete result may not be known for some days or even weeks. What is known, though, is that the sitting government has been ousted, and that the Australian Labor Party (ALP) will lead the new government. Whether the ALP amasses enough wins for it to govern with a complete majority is not known; they may have a \u0026ldquo;minority government\u0026rdquo; in coalition with either independent candidates or the Greens.\nIn Australia, there are 151 federal electorates or \u0026ldquo;Divisions\u0026rdquo;; each one corresponds to seat in the House of Representatives; the Lower House of the Federal government. The winner of an election is whichever party or coalition wins the majority of seats; the Prime Minister is simply the leader of the major party in that coalition. 
Australians thus have no say whatsoever in the Prime Minister; that is entirely a party matter, which is why Prime Ministers have sometimes been replaced in the middle of a term.\nMy concern is the neighbouring electorates of Cooper and Wills. Both are very similar both geographically and politically; both are Labor strongholds, in each of which the Greens have made considerable inroads. Indeed Cooper (called Batman until a few years ago) used to be one of the safest Labor seats in the country; it has now become far less so, and in each election the battle is now between Labor and the Greens. Both are urban seats, in Melbourne; in each of them the southern portion is more gentrified, diverse, and left-leaning, and the Northern part is more solidly working-class, and Labor-leaning. In each of them the dividing line is Bell St, known as the \u0026ldquo;tofu curtain\u0026rdquo;. (Also as the \u0026ldquo;Latte Line\u0026rdquo; or the \u0026ldquo;Hipster-Proof Fence\u0026rdquo;.)\nThus Greens campaigning consists of letting the southerners know they haven\u0026rsquo;t been forgotten, and attempting to reach out to the northerners. This is mainly done with door-knocking by volunteers, and there is never enough time, or enough volunteers, to reach every household.\nAnyway, here are some maps showing the Greens/Labor result at each polling booth. The size of the circle represents the ratio of votes: red for a Labor majority; green for a Greens majority. And the popup tooltip gives the name of the polling booth, and the swing either to Labor or to the Greens.\nI couldn\u0026rsquo;t decide the best way of displaying the swings, so in the end I just displayed the swing to Labor in all booths with a Labor majority, even if that swing was sometimes negative. And similarly for the Greens. 
Note that a large swing may correspond to a relatively small number of votes being cast.\nThe Division of Cooper The Division of Wills ","link":"https://numbersandshapes.net/posts/post_election_swings/","section":"posts","tags":["GIS","voting"],"title":"Post-election swings"},{"body":"This post illustrates the working of Ramanujan\u0026rsquo;s generating functions for solving Euler\u0026rsquo;s diophantine equation \\(a^3+b^3=c^3+d^3\\) as described by Andrews and Berndt in \u0026ldquo;Ramanujan\u0026rsquo;s Lost Notebook, Part IV\u0026rdquo;, pp 199 - 205 (Section 8.5). The text is available from Springer.\nRamanujan\u0026rsquo;s result is that if\n\\[ f_1(x) = \\frac{1+53x+9x^2}{1-82x-82x^2+x^3} = a_0+a_1x+a_2x^2+a_3x^3+\\cdots = \\alpha_0+\\frac{\\alpha_1}{x}+\\frac{\\alpha_2}{x^2}+\\frac{\\alpha_3}{x^3}+\\cdots\\]\n\\[ f_2(x) = \\frac{2-26x-12x^2}{1-82x-82x^2+x^3} = b_0+b_1x+b_2x^2+b_3x^3+\\cdots = \\beta_0+\\frac{\\beta_1}{x}+\\frac{\\beta_2}{x^2}+\\frac{\\beta_3}{x^3}+\\cdots\\]\n\\[ f_3(x) = \\frac{2+8x-10x^2}{1-82x-82x^2+x^3} = c_0+c_1x+c_2x^2+c_3x^3+\\cdots = \\gamma_0+\\frac{\\gamma_1}{x}+\\frac{\\gamma_2}{x^2}+\\frac{\\gamma_3}{x^3}+\\cdots\\]\nthen for every value of \\(n\\) we have:\n\\(a_n^3+b_n^3=c_n^3+(-1)^n\\)\nand\n\\(\\alpha_n^3+\\beta_n^3=\\gamma_n^3-(-1)^n\\)\nthus providing infinite sequences of solutions to Euler\u0026rsquo;s equation. 
The values \\(\\alpha_1,\\beta_1,\\gamma_1\\) are \\(9, -12, -10\\) giving rise to\n\\((9)^3+(-12)^3=(-10)^3-(-1)^1\\)\nwhich can be rewritten as\n\\(9^3+10^3=12^3+1^3\\).\nAndrews and Berndt comment that:\nThis is another of those many results of Ramanujan for which one wonders, “How did he ever think of this?”\nAnd we all know the story from Hardy about Ramanujan\u0026rsquo;s comment about the number 1729.\nimport sympy as sy\nsy.init_printing(num_columns=120)\nx,a,b,c = sy.var(\u0026#39;x,a,b,c\u0026#39;)\nn = sy.Symbol(\u0026#39;n\u0026#39;, positive=True, integer=True)\nStart by entering the three rational functions.\ng = 1-82*x-82*x**2+x**3\nf1 = (1+53*x+9*x**2)/g\nf2 = (2-26*x-12*x**2)/g\nf3 = (2+8*x-10*x**2)/g\ndisplay(f1)\ndisplay(f2)\ndisplay(f3)\n\\[ \\frac{9x^2+53x+1}{x^3-82x^2-82x+1} \\]\n\\[ \\frac{-12x^2-26x+2}{x^3-82x^2-82x+1} \\]\n\\[ \\frac{-10x^2+8x+2}{x^3-82x^2-82x+1} \\]\nfp1 = f1.apart(x,full=True).doit()\nfp2 = f2.apart(x,full=True).doit()\nfp3 = f3.apart(x,full=True).doit()\nfs1 = [z.simplify() for z in fp1.args]\nfs2 = [z.simplify() for z in fp2.args]\nfs3 = [z.simplify() for z in fp3.args]\ndisplay(fs1)\ndisplay(fs2)\ndisplay(fs3)\n\\[ \\left[ -\\frac{43}{85x+85}, \\frac{8(101+11\\sqrt{85})}{85(2x-83-9\\sqrt{85})}, \\frac{8(101-11\\sqrt{85})}{85(2x-83+9\\sqrt{85})}\\right] \\]\n\\[ \\left[ \\frac{16}{85x+85}, \\frac{28(-37-4\\sqrt{85})}{85(2x-83-9\\sqrt{85})}, \\frac{28(-37+4\\sqrt{85})}{85(2x-83+9\\sqrt{85})}\\right] \\]\n\\[ \\left[ -\\frac{16}{85x+85}, \\frac{6(-139-15\\sqrt{85})}{85(2x-83-9\\sqrt{85})}, \\frac{6(-139+15\\sqrt{85})}{85(2x-83+9\\sqrt{85})}\\right] \\]\nNote that the denominators are the same in each list (as we\u0026rsquo;d expect).\nNow we use the fact that\n\\[\\frac{a}{bx+c}\\]\nhas the infinite series expansion\n\\[\\frac{a}{c}\\left(1- \\frac{b}{c}x+\\left(\\frac{b}{c}\\right)^2x^2-\\left(\\frac{b}{c}\\right)^3x^3+\\cdots\\right)\\]\nThis means that the coefficient of \\(x^n\\) 
is\n\\[\\frac{a}{c}\\left(-\\frac{b}{c}\\right)^n\\]\nBecause the denominators are shared, the values of \\(b\\) and \\(c\\) are the same for all three functions. We start by considering fs1, which consists of the partial fraction summands of f1.\na1_s = [sy.numer(z) for z in fs1]\nb1_s = [sy.denom(z).coeff(x) for z in fs1]\nc1_s = [sy.denom(z).coeff(x,0) for z in fs1]\nac1_s = [sy.simplify(s/t) for s,t in zip(a1_s,c1_s)]\nbc1_s = [sy.simplify(s/t) for s,t in zip(b1_s,c1_s)]\ndisplay(ac1_s)\ndisplay(bc1_s)\n\\[ \\left[-\\frac{43}{85},\\frac{64}{85}-\\frac{8\\sqrt{85}}{85}, \\frac{64}{85}+\\frac{8\\sqrt{85}}{85}\\right] \\]\n\\[ \\left[1, -\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}, -\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right] \\]\nNow we can determine the coefficient of \\(x^n\\) in the power series expansion of \\(f_1(x)\\):\na_n = sum(s*(-t)**n for s,t in zip(ac1_s,bc1_s))\ndisplay(a_n)\n\\[ -\\frac{43(-1)^n}{85} + \\left(\\frac{64}{85}-\\frac{8\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(\\frac{64}{85}+\\frac{8\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\]\nAnd repeat all of the above for \\(f_2(x)\\) and its partial fractions fs2.\na2_s = [sy.numer(z) for z in fs2]\nb2_s = [sy.denom(z).coeff(x) for z in fs2]\nc2_s = [sy.denom(z).coeff(x,0) for z in fs2]\nac2_s = [sy.simplify(s/t) for s,t in zip(a2_s,c2_s)]\nbc2_s = [sy.simplify(s/t) for s,t in zip(b2_s,c2_s)]\nb_n = sum(s*(-t)**n for s,t in zip(ac2_s,bc2_s))\ndisplay(b_n)\n\\[ \\frac{16(-1)^n}{85} + \\left(\\frac{77}{85}-\\frac{7\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(\\frac{77}{85}+\\frac{7\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\]\nContinuing for \\(f_3(x)\\):\na3_s = [sy.numer(z) for z in fs3]\nb3_s = [sy.denom(z).coeff(x) for z in fs3]\nc3_s = [sy.denom(z).coeff(x,0) for z in fs3]\nac3_s = [sy.simplify(s/t) for s,t in zip(a3_s,c3_s)]\nbc3_s = [sy.simplify(s/t) for s,t in zip(b3_s,c3_s)]\nc_n = 
sum(s*(-t)**n for s,t in zip(ac3_s,bc3_s))\ndisplay(c_n)\n\\[ -\\frac{16(-1)^n}{85} + \\left(\\frac{93}{85}-\\frac{9\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(\\frac{93}{85}+\\frac{9\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\]\nIn order to see their similarities and differences, we now show them together:\ndisplay(a_n)\ndisplay(b_n)\ndisplay(c_n)\n\\[ -\\frac{43(-1)^n}{85} + \\left(\\frac{64}{85}-\\frac{8\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(\\frac{64}{85}+\\frac{8\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\]\n\\[ \\frac{16(-1)^n}{85} + \\left(\\frac{77}{85}-\\frac{7\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(\\frac{77}{85}+\\frac{7\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\]\n\\[ -\\frac{16(-1)^n}{85} + \\left(\\frac{93}{85}-\\frac{9\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(\\frac{93}{85}+\\frac{9\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\]\nThe first few coefficients can be checked against a Taylor series expansion:\ns1 = f1.series(n=6)\ndisplay([s1.coeff(x,k) for k in range(6)])\na_s = [a_n.subs(n,k).simplify() for k in range(6)]\ndisplay(a_s)\n\\begin{aligned} \u0026amp;[1,\\quad 135,\\quad 11161,\\quad 926271,\\quad 76869289,\\quad 6379224759]\\\\ \u0026amp;\\\\ \u0026amp;[1,\\quad 135,\\quad 11161,\\quad 926271,\\quad 76869289,\\quad 6379224759] \\end{aligned}\ns2 = f2.series(n=6)\ndisplay([s2.coeff(x,k) for k in range(6)])\nb_s = [b_n.subs(n,k).simplify() for k in range(6)]\ndisplay(b_s)\n\\begin{aligned} \u0026amp;[2,\\quad 138,\\quad 11468,\\quad 951690,\\quad 78978818,\\quad 6554290188]\\\\ \u0026amp;\\\\ \u0026amp;[2,\\quad 138,\\quad 11468,\\quad 951690,\\quad 78978818,\\quad 6554290188] \\end{aligned}\ns3 = f3.series(n=6) 
display([s3.coeff(x,k) for k in range(6)])\nc_s = [c_n.subs(n,k).simplify() for k in range(6)]\ndisplay(c_s)\n\\begin{aligned} \u0026amp;[2,\\quad 172,\\quad 14258,\\quad 1183258,\\quad 98196140,\\quad 8149006378]\\\\ \u0026amp;\\\\ \u0026amp;[2,\\quad 172,\\quad 14258,\\quad 1183258,\\quad 98196140,\\quad 8149006378] \\end{aligned}\nIf everything has behaved properly, we should now have\n\\[ a_n^3 + b_n^3 = c_n^3 + (-1)^n \\]\nand we can check the first few values:\n[s**3+t**3-u**3 for s,t,u in zip(a_s,b_s,c_s)]\n\\[ [1,\\quad -1,\\quad 1,\\quad -1,\\quad 1,\\quad -1] \\]\nAnd now for the general result:\nsy.powsimp(sy.expand(a_n**3 + b_n**3 - c_n**3),combine=\u0026#39;all\u0026#39;,force=True).factor()\n\\[ (-1)^n \\]\nWoo hoo!\nNow for the other expansions, in negative powers of \\(x\\); in other words based on the functions \\(f_k(1/x)\\). We\u0026rsquo;ll rename these functions: \\(g_k(x) = f_k(1/x)\\). After that it\u0026rsquo;s pretty much a carbon copy of the preceding computations.\ng1 = f1.subs(x,1/x).simplify()\ng2 = f2.subs(x,1/x).simplify()\ng3 = f3.subs(x,1/x).simplify()\ndisplay(g1)\ndisplay(g2)\ndisplay(g3)\n\\[ \\frac{x(x^2+53x+9)}{x^3-82x^2-82x+1} \\]\n\\[ \\frac{2x(x^2 - 13x - 6)}{x^3 - 82x^2 - 82x + 1} \\]\n\\[ \\frac{2x(x^2 + 4x - 5)}{x^3 - 82x^2 - 82x + 1} \\]\ngp1 = g1.apart(x,full=True).doit()\ngp2 = g2.apart(x,full=True).doit()\ngp3 = g3.apart(x,full=True).doit()\ngs1 = [z.simplify() for z in gp1.args]\ngs2 = [z.simplify() for z in gp2.args]\ngs3 = [z.simplify() for z in gp3.args]\ndisplay(gs1)\ndisplay(gs2)\ndisplay(gs3)\n\\[ \\left[1, \\frac{43}{85x+85}, \\frac{8(1429+155\\sqrt{85})}{85(2x-83-9\\sqrt{85})}, \\frac{8(1429-155\\sqrt{85})}{85(2x-83+9\\sqrt{85})}\\right] \\]\n\\[ \\left[2, -\\frac{16}{85x+85}, \\frac{14(839+91\\sqrt{85})}{85(2x-83-9\\sqrt{85})}, \\frac{14(839-91\\sqrt{85})}{85(2x-83+9\\sqrt{85})}\\right] \\]\n\\[ \\left[2, \\frac{16}{85x+85}, \\frac{12(1217+132\\sqrt{85})}{85(2x-83-9\\sqrt{85})}, 
\\frac{12(1217-132\\sqrt{85})}{85(2x-83+9\\sqrt{85})}\\right] \\]\nFor ease of writing in Python, we\u0026rsquo;ll use d, e and f instead of \\(\\alpha\\), \\(\\beta\\) and \\(\\gamma\\).\nd1_s = [sy.numer(z) for z in gs1] e1_s = [sy.denom(z).coeff(x) for z in gs1] f1_s = [sy.denom(z).coeff(x,0) for z in gs1] df1_s = [sy.simplify(s/t) for s,t in zip(d1_s,f1_s)] ef1_s = [sy.simplify(s/t) for s,t in zip(e1_s,f1_s)] d_n = sum(s*(-t)**n for s,t in zip(df1_s,ef1_s)) d2_s = [sy.numer(z) for z in gs2] e2_s = [sy.denom(z).coeff(x) for z in gs2] f2_s = [sy.denom(z).coeff(x,0) for z in gs2] df2_s = [sy.simplify(s/t) for s,t in zip(d2_s,f2_s)] ef2_s = [sy.simplify(s/t) for s,t in zip(e2_s,f2_s)] e_n = sum(s*(-t)**n for s,t in zip(df2_s,ef2_s)) d3_s = [sy.numer(z) for z in gs3] e3_s = [sy.denom(z).coeff(x) for z in gs3] f3_s = [sy.denom(z).coeff(x,0) for z in gs3] df3_s = [sy.simplify(s/t) for s,t in zip(d3_s,f3_s)] ef3_s = [sy.simplify(s/t) for s,t in zip(e3_s,f3_s)] f_n = sum(s*(-t)**n for s,t in zip(df3_s,ef3_s)) display(d_n) display(e_n) display(f_n) \\[ -\\frac{43(-1)^n}{85} + \\left(-\\frac{64}{85}-\\frac{8\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(-\\frac{64}{85}+\\frac{8\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\]\n\\[ -\\frac{16(-1)^n}{85} + \\left(-\\frac{77}{85}-\\frac{7\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(-\\frac{77}{85}+\\frac{7\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\]\n\\[ -\\frac{16(-1)^n}{85} + \\left(-\\frac{93}{85}-\\frac{9\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(-\\frac{93}{85}+\\frac{7\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\]\nAs before, a quick check:\nds = [d_n.subs(n,k).simplify() for k in range(6)] es = [e_n.subs(n,k).simplify() for k in range(6)] fs = [f_n.subs(n,k).simplify() for k in range(6)] 
display(ds) display(es) display(fs) \\begin{aligned} \u0026amp;[-1, 9, 791, 65601, 5444135, 451797561]\\\\ \u0026amp;\\\\ \u0026amp;[-2, -12, -1010, -83802, -6954572, -577145658]\\\\ \u0026amp;\\\\ \u0026amp;[-2, -10, -812, -67402, -5593538, -464196268] \\end{aligned}\n[s**3+t**3-v**3 for s,t,v in zip(ds,es,fs)] \\[ [-1,\\quad 1,\\quad -1,\\quad 1,\\quad -1,\\quad 1] \\]\nAnd finally, confirming the general result:\nsy.powsimp(sy.expand(d_n**3 + e_n**3 - f_n**3),combine=\u0026#39;all\u0026#39;,force=True).factor() \\[ -(-1)^n \\]\nAgain, woo hoo!\n","link":"https://numbersandshapes.net/posts/ramanujans_cubes/","section":"posts","tags":["mathematics","computation"],"title":"Ramanujan's cubes"},{"body":"Wordle is a pleasant game, basically Mastermind with words. You choose an English word (although it can also be played in other languages), and then you\u0026rsquo;re told if your letters are incorrect, correct but in the wrong place, or correct and in the right place. These are shown by the colours grey, yellow, and green.\nThe genius is that you can share your result: the grid of coloured squares which shows how quickly or slowly you\u0026rsquo;ve guessed the word. The shared grid shows only the colours, and not the letters, so is not helpful for anybody else. At the time of writing, Twitter seems awash with Wordle grids.\nFor example, today my attempt looked like this:\nbut the grid I could share looked like this:\nSince all words are English, each turn considerably lessens the pool of words from which to choose. Since you have no idea what the hidden word might be, you just need to make a guess at each stage. I don\u0026rsquo;t know what pool of words is used to create the daily wordle, whether its mostly simple English, or whether it includes some unusual words.\nSo the plan was to create a Wordle \u0026ldquo;helper\u0026rdquo; program which would provide the list of allowable words at each stage. 
I used Julia for speed.\nTo start, read in the word list (I used the all5.txt list from the previous post):\nJulia\u0026gt; wds = readlines(open(\u0026#34;all5.txt\u0026#34;))\nWhat we\u0026rsquo;re going to do is to create a simple function which will take a word list, a chosen word, and its Wordle colours, and produce a new word list containing all possible usable words. It will look like:\nwordle_guess(word_list, word, colours)\nand so for the third word down above, we would use something like:\nwordle_guess(current_words,\u0026quot;pylon\u0026quot;,\u0026quot;nnyyy\u0026quot;)\nwhere the characters n, y, g will be used for grey, yellow, and green. The use of n can be taken to stand for \u0026ldquo;no\u0026rdquo; or \u0026ldquo;nil\u0026rdquo; or \u0026ldquo;not\u0026rdquo; or \u0026ldquo;never\u0026rdquo;. (For an entertaining riff on words beginning with \u0026ldquo;n\u0026rdquo; I can\u0026rsquo;t recommend strongly enough the first story in Stanisław Lem\u0026rsquo;s \u0026ldquo;The Cyberiad\u0026rdquo; which is quite possibly the most brilliantly inventive book ever - and certainly in the realm of science fiction.)\nThen the logic of the program will be very simple; we walk through the current word one letter at a time, and reduce the word-list according to the colour:\nfor grey, find all words which do not use that particular letter\nfor yellow, find all words with that letter but in another position\nfor green, find all words with that letter in that position\nNo doubt there are cleverer ways of doing this, but here\u0026rsquo;s what I whipped up:\nfunction wordle_guess(ws,wd,cs)\n    wlist = copy(ws)\n    for i in 1:5\n        if cs[i] == \u0026#39;n\u0026#39;\n            wlist = [w for w in wlist if isnothing(findfirst(wd[i],w))]\n        elseif cs[i] == \u0026#39;y\u0026#39;\n            wlist = [w for w in wlist if !isnothing(findfirst(wd[i],w)) \u0026amp;\u0026amp; w[i] != wd[i]]\n        else\n            wlist = [w for w in wlist if w[i]==wd[i]]\n        end\n    end\n    return(wlist)\nend\nAnd here\u0026rsquo;s how it was used:\nJulia\u0026gt; 
wds2 = wordle_guess(wds,\u0026#34;cream\u0026#34;,\u0026#34;nnnnn\u0026#34;)\nJulia\u0026gt; wds3 = wordle_guess(wds2,\u0026#34;shunt\u0026#34;,\u0026#34;nnnyn\u0026#34;)\nJulia\u0026gt; wds4 = wordle_guess(wds3,\u0026#34;pylon\u0026#34;,\u0026#34;nnyyy\u0026#34;)\n2-element Vector{String}:\n \u0026#34;knoll\u0026#34;\n \u0026#34;lingo\u0026#34;\nAt this stage I simply had to flip a coin, and as you see from above, I chose the wrong one!\nThe remarkable thing is how much the word lists are reduced at each turn:\nJulia\u0026gt; [length(x) for x in [wds,wds2,wds3,wds4]]\n4-element Vector{Int64}:\n 6196\n 937\n 37\n 2\nAn interesting question might be: if you choose a random word at each stage from the current list of usable words, what is the expected number of trials to find the word? And does that value differ between different hidden words?\nAnd also: given a hidden word, and choosing a random word each time, what is the maximum number of trials that could be used?\nThese will need to wait for another day (or maybe they\u0026rsquo;ve been answered already.)\n","link":"https://numbersandshapes.net/posts/wordle/","section":"posts","tags":["Julia"],"title":"Wordle"},{"body":"I was going to make a little post about Wordle, but I got sidetracked exploring five letter words. At the same time, I had a bit of fun with regular expressions and some simple scripting with ZSH.\nThe start was to obtain lists of 5-letter words. One is available at the Stanford GraphBase site; the file sgb-words.txt contains \u0026ldquo;the 5757 five-letter words of English\u0026rdquo;. Others are available through English-language wordlists, such as those in the Linux directory /usr/share/dict/. 
There are two such files for British and American English.\nSo we can start by gathering all these different lists of words, and also sorting Knuth\u0026rsquo;s file so that it is in alphabetical order.\nHere\u0026rsquo;s how (assuming that we\u0026rsquo;re in a writable directory that will contain these files):\ngrep -E \u0026#39;^[a-z]{5}$\u0026#39; /usr/share/dict/american-english \u0026gt; ./usa5.txt\ngrep -E \u0026#39;^[a-z]{5}$\u0026#39; /usr/share/dict/british-english \u0026gt; ./brit5.txt\nsort sgb-words.txt \u0026gt; ./knuth5.txt\nThe regular expressions in the first two lines simply ask for words of exactly five letters made from lower-case characters. This eliminates proper names and words with apostrophes.\nNow let\u0026rsquo;s see how many words each file contains:\n$ wc -l *5.txt\n5300 brit5.txt\n5757 knuth5.txt\n5150 usa5.txt\n16207 total\nNote the nice use of ZSH\u0026rsquo;s powerful globbing features - one area in which it is more powerful than BASH.\nNow there are too many words to see exactly the differences between them, but to start let\u0026rsquo;s list the words in brit5.txt which are not in usa5.txt, and also the words in usa5.txt which are not in brit5.txt:\n$ grep -f usa5.txt -v brit5.txt \u0026gt; brit-usa_d.txt\n$ grep -f brit5.txt -v usa5.txt \u0026gt; usa-brit_d.txt\nI\u0026rsquo;m using a debased version of set difference for the output file, so that brit-usa_d.txt contains those words in brit5.txt which are not in usa5.txt. 
I\u0026rsquo;ve added a _d to make a handle for globbing:\n$ wc -l *_d.txt 188 brit-usa_d.txt 38 usa-brit_d.txt 226 total And now we can look at the contents of these files, using the handy Linux `column` command to print the output with fewer lines:\n$ cat usa-brit_d.txt | column arbor\tchili\tfagot\tfeces\thonor\tmiter\tniter\trigor\tsavor\tvapor ardor\tcolor\tfavor\tfiber\thumor\tmolds\tocher\truble\tslier\tvigor armor\tdolor\tfayer\tfuror\tlabor\tmoldy\todors\trumor\ttumor calks\tedema\tfecal\tgrays\tliter\tmolts\tplows\tsaber\tvalor Notice, as you may expect, that this file contains American spellings: \u0026ldquo;rigor\u0026rdquo; instead of British \u0026ldquo;rigour\u0026rdquo;, \u0026ldquo;liter\u0026rdquo; instead of British \u0026ldquo;litre\u0026rdquo;, and so on. However, the other file difference contains only a few spelling differences, and quite a lot of words not in the American wordlist:\n$ cat brit-usa_d.txt | column abaci\tblent\tcroci\tflyer\thollo\tliras\tnitre\tpupas\tslily\ttogae\twrapt aeons\tblest\tcurst\tfogey\thomie\tlitre\tnosey\trajas\tslyer\ttopis\twrier agism\tblubs\tdados\tfondu\thooka\tloxes\tochre\tranis\tsnuck\ttorah\tyacks ameer\tbocce\tdeers\tfrier\thorsy\tlupin\todour\trecta\tspacy\ttorsi\tyocks amirs\tbocci\tdicky\tgamey\thuzza\tmacks\toecus\trelit\tspelt\ttsars\tyogin amnia\tboney\tdidos\tgaols\tidyls\tmaths\tpanty\tropey\tspick\ttyres\tyucks ampul\tbosun\tditzy\tgayly\tikons\tmatts\tpapaw\tsabre\tspilt\ttzars\tyuppy amuck\tbriar\tdjinn\tgipsy\timbed\tmavin\tpease\tsaree\tstogy\tulnas\tzombi appal\tbrusk\tdrily\tgismo\tindue\tmetre\tpenes\tsheik\tstyes\tvacua aquae\tbunko\tenrol\tgnawn\tjehad\tmiaow\tpigmy\tsherd\tswops\tveldt arses\tburqa\tenure\tgreys\tjinns\tmicra\tpilau\tshlep\tsynch\tvitas aunty\tcaddy\teying\tgybed\tjunky\tmitre\tpilaw\tshoed\ttabus\tvizir aurae\tcalfs\teyrie\tgybes\tkabob\tmomma\tpinky\tshoon\ttempi\tvizor baddy\tcalif\tfaery\thadji\tkebob\tmould\tpodgy\tshorn\tthymi\twelch 
bassi\tcelli\tfayre\thallo\tkerbs\tmoult\tpodia\tshtik\ttikes\twhirr baulk\tchapt\tfezes\thanky\tkiddy\tmynah\tpricy\tsiree\ttipis\twhizz beaux\tclipt\tfibre\theros\tkopek\tnarks\tprise\tsitup\ttiros\twizes bided\tconey\tfiord\thoagy\tleapt\tnetts\tpryer\tskyed\ttoffy\twooly Of course, some of these words are spelled with a different number of letters in American English: for example the British \u0026ldquo;djinn\u0026rdquo; is the American \u0026ldquo;jinn\u0026rdquo;; the British \u0026ldquo;saree\u0026rdquo; is the American \u0026ldquo;sari\u0026rdquo;.\nNow of course we want to see how the Knuth file differs, as it\u0026rsquo;s the file with the largest number of words:\n$ grep -f usa5.txt -v knuth5.txt \u0026gt; knuth-usa_d.txt $ grep -f brit5.txt -v knuth5.txt \u0026gt; knuth-brit_d.txt $ wc -l knuth*_d.txt 895 knuth-brit_d.txt 980 knuth-usa_d.txt 1875 total Remarkably enough, there are also words in both the original files which are not in Knuth\u0026rsquo;s list:\n$ grep -f knuth5.txt -v usa5.txt \u0026gt; usa-knuth_d.txt $ grep -f knuth5.txt -v brit5.txt \u0026gt; brit-knuth_d.txt $ wc -l *knuth_d.txt 438 brit-knuth_d.txt 373 usa-knuth_d.txt 811 total So maybe our best bet would be to concatenate all the files, and take the all the words, leaving out any duplicates. Something like this:\n$ cat usa5.txt brit5.txt knuth5.txt | sort | uniq -u \u0026gt; allu5.txt $ cat usa5.txt brit5.txt knuth5.txt | sort | uniq -d \u0026gt; alld5.txt $ cat allu5.txt alld5.txt | sort \u0026gt; all5.txt The first line finds all the words which are unique - that is, that appear only once in the concatenated file, and the second line finds all the words which are repeated. These two lists are disjoint, and so may then be concatenated to form a master list, which can be found to contain 6196 words.\nSurely this file is complete? Well, the English language is a great collector of words, and every year we find new words being used, many from other languages and cultures. 
Here are some words that are not in the all5.txt file:\nAustralian words: galah, dunny, smoko, durry, bogan, chook (there are almost certainly others)\nIndian words: crore, lakhs, dosai, iddli, baati, chaat, kheer, kofta, kulfi, rasam, poori (the first two are numbers, the others are foods)\nScots words: canty, curch, flang, kythe, plack, routh, saugh, teugh, wadna - these are all used by Burns in his poems, which are written in English (admittedly a dialectical form of it).\nNew words: qango, fubar, crunk, noobs, vlogs, rando, vaper (the first two are excellent acronyms; the others are new words)\nAs with the Australian words, none of these lists are exhaustive; the full list of five-letter English words not in the file all5.txt would run probably into the many hundreds, maybe even thousands.\nA note on word structures I was curious about the numbers of vowels and consonants in words. To start, here\u0026rsquo;s a little Julia function which encodes the positions of consonants as an integer between 0 and 31. For example, take the word \u0026ldquo;drive\u0026rdquo;. We can encode this as [1,1,0,1,0] where the 1\u0026rsquo;s are at the positions of the consonants. Then this can be considered as binary digits representing the number 27.\njulia\u0026gt; function cvs(word) vs = indexin(collect(word),collect(\u0026#34;aeiou\u0026#34;)) vs2 = replace(x -\u0026gt; isnothing(x) ? 
1 : 0,vs) return(sum(vs2 .* [16,8,4,2,1])) end Now we simply walk through the words in all5.txt determining their values as we go, and keeping a running total:\njulia\u0026gt; wds = readlines(open(\u0026#34;all5.txt\u0026#34;)) julia\u0026gt; cv = zeros(Int16,32) julia\u0026gt; for w in wds c = cvs(w) cv[c+1] = cv[c+1]+1 end julia\u0026gt; hcat(0:31,cv) 32×2 Matrix{Int64}: 0 0 1 0 2 2 3 1 4 4 5 48 6 10 7 19 8 2 9 61 10 96 11 156 12 24 13 262 14 21 15 24 16 1 17 34 18 105 19 585 20 97 21 1514 22 432 23 1158 24 5 25 301 26 249 27 832 28 16 29 96 30 15 31 26 We see that the most common patterns are 21 = 10101, and 23 = 10111. But what about some of the smaller values?\njulia\u0026gt; for w in wds if cvs(w) == 24 println(w) end end stoae wheee whooo xviii xxiii Yes, there are some Roman numerals hanging about, and probably they should be removed. And one more, 30 = 11110:\njulia\u0026gt; for w in wds if cvs(w) == 30 println(w) end end chyme clxvi cycle hydra hydro lycra phyla rhyme schmo schwa style styli thyme thymi xxxvi Again a few Roman numerals. These may need to be removed by hand. One way to do this is by using regular expressions again:\n$ grep -E \u0026#39;[xlcvi]{5}\u0026#39; all5.txt civic civil clvii clxii clxiv clxix clxvi lxvii villi xcvii xviii xxiii xxvii xxxii xxxiv xxxix xxxvi and we see that we have 3 English words, and the rest Roman numerals. These can be deleted.\n","link":"https://numbersandshapes.net/posts/five_letter_words_in_english/","section":"posts","tags":null,"title":"Five letter words in English"},{"body":"What does one do on the first day of a new year but write a blog post, and in it clearly delineate all plans for the coming year? Well, I\u0026rsquo;m doing the first part, but not the second, as I know that any plans will not be fulfilled - something always gets in the way.\nYou will notice a complete absence of posts last year after April. 
This was due to pressures both at work and outside, and left me with little mental energy to blog. I don\u0026rsquo;t know that this year will be much different.\nHowever, a few things to note (in no particular order):\nI decided to upgrade my VPS, starting with my installation of Nextcloud. I\u0026rsquo;ve been using Nextcloud as a replacement for Dropbox for years. But I managed to stuff-up an upgrade. While it was unavailable to me I used Syncthing to keep my various computers in sync with each other, and I must say this works extremely well. So well, in fact, that it seems hardly worth going back to Nextcloud. And it seems that there are people who are ditching Nextcloud for simpler solutions: Syncthing for syncing, and something else for backups. Rsync could be used here; I\u0026rsquo;ve also been recommended to look at Duplicati.\nThe upgrade of both Nextcloud and my VPS system (Ubuntu 18.04 to Ubuntu 20.04) both went awry, and I ended up with a non-working system. I could ssh into it, but none of the services would work. So I decided to ditch the lot, and start from scratch by re-imaging my VPS. This \u0026ldquo;scorched earth\u0026rdquo; approach meant at one stroke I got rid of years of accumulated rubbish: files and apps I\u0026rsquo;d downloaded, experimented and discarded; any number of old unused docker containers and images. (Although I had made an effort to clean up those.)\nAnd I\u0026rsquo;ve been slowly building everything back up again, with much external help in particular for managing traefik, which I like very much. But it has a configuration which is tricky, at least for me.\nI did manage to attach a USB drive to my home wireless router, and make that drive visible to both Linux and Windows - which took longer than it should have, as I kept misunderstanding descriptions and instructions. But it\u0026rsquo;s working now. So in one sense I have at least local backups. 
However, I also need \u0026ldquo;off-site\u0026rdquo; backups.\nIn my teaching this last year I used, as for previous years, a mixture of Excel and Geogebra for my numerical methods class, and Excel with its Solver add-in for my linear programming class. These students are all pre-service teachers, and so I use the software they will be most likely to encounter in their professional lives. I have come to quite like Excel, and my students even get to do a bit of VBA programming. (Well, I write the programs, and then they edit them slightly.)\nI have a love-hate relationship with Geogebra. It does many things well, but there\u0026rsquo;s always an annoying limit, or things you can\u0026rsquo;t change. I hope to write up about this. But here\u0026rsquo;s one: Geogebra\u0026rsquo;s default variable names for lists (if you don\u0026rsquo;t give them names yourself) are l1, l2, l3 and so on. But Geogebra uses a sans-serif font in which a lower-case L is indistinguishable from an upper-case I. And you can\u0026rsquo;t change the font. So if you\u0026rsquo;re seeing \u0026ldquo;l1\u0026rdquo; for the first time, you can\u0026rsquo;t distinguish the first character. This is a very poor GUI decision. And it annoys me a lot, because it would be trivial to fix: use an upper-case L instead, or allow users to change the font!\nI taught for the first time a data analytics subject, based around R, which I\u0026rsquo;d never before used. Well, there\u0026rsquo;s nothing like having to teach something to learn it quickly, and I learned it well enough to teach it to a beginning class, and also to enjoy it. Like all languages, R comes in for plenty of criticism, and much of its functionality can be managed now with Python, but R has been a sort of standard for a decade or more, and that alone is a very good reason for learning it. What\u0026rsquo;s more, it seems to be getting a new lease of life with the tidyverse suite of packages by Hadley Wickham. 
And these come with excellent documentation.\nI finished the year looking again at bicentric polygons, which fascinate me. Some years ago, Phil Todd, the creator of the Saltire geometry application, found a bicentric pentagon whose vertices were a subset of the vertices of a regular nonagon. You can see his PDF file here.\nI was wondering if there are other bicentric polygons whose vertices are subsets of a regular \\(n\\)-gon (other than triangles or regularly-spaced vertices), and this led me on a bit of a hunt. Using a very inefficient program (and in Python), I found no other bicentric pentagons, and the only bicentric quadrilaterals were right-angled kites; that is, whose vertices are at\n\\[ (\\pm 1,0),\\quad (\\cos(x),\\pm\\sin(x)) \\]\nfor \\(0\u0026lt;x\u0026lt;\\pi/2\\). This either means there are no others, or there are others I haven\u0026rsquo;t found. I don\u0026rsquo;t know. It would be nice to discover a symmetric but non-regular bicentric hexagon (vertices a subset of an \\(n\\)-gon, for \\(n\u0026gt;7\\)).\nSo - software and mathematics - plenty going on!\n","link":"https://numbersandshapes.net/posts/new_year_2022/","section":"posts","tags":null,"title":"A new year (2022)"},{"body":"Steffensen\u0026rsquo;s method is based on Newton\u0026rsquo;s iteration for solving a non-linear equation \\(f(x)=0\\):\n\\[ x\\leftarrow x-\\frac{f(x)}{f\u0026rsquo;(x)} \\]\nNewton\u0026rsquo;s method can fail to work in a number of ways, but when it does work it displays quadratic convergence; the number of correct significant figures roughly doubling at each step. However, it also has the disadvantage of needing to compute the derivative as well as the function. 
This may be difficult for some functions.\nSteffensen\u0026rsquo;s idea was to use the quotient approximation of the derivative:\n\\[ f\u0026rsquo;(x)\\approx\\frac{f(x+h)-f(x)}{h} \\]\nwhen \\(h\\) is small, and since we are trying to solve \\(f(x)=0\\), we may assume that in the neighbourhood of the solution \\(f(x)\\) is itself small, so can be used for \\(h\\). This means we can write\n\\[ f\u0026rsquo;(x)\\approx\\frac{f(x+f(x))-f(x)}{f(x)} \\]\nwhich leads to the following version of Newton\u0026rsquo;s method:\n\\[ x\\leftarrow x-\\frac{f(x)^2}{f(x+f(x))-f(x)}. \\]\nThis is a neat idea, and in fact when it works it converges almost as fast as Newton\u0026rsquo;s method. However, it is very sensitive to the starting value. For example, suppose we want to find the value of \\(W(10)\\), where \\(W(x)\\) is Lambert\u0026rsquo;s \\(W\\) function; the inverse of \\(y=xe^x\\). Finding \\(W(10)\\) then means solving the equation\n\\[ xe^x-10=0. \\]\nNewton\u0026rsquo;s method uses the iteration\n\\[ x\\leftarrow x-\\frac{xe^x-10}{e^x(x+1)} \\]\nand with a positive starting value not too big will converge; the first 50 places of the solution are:\n\\[ 1.74552800274069938307430126487538991153528812908093 \\]\nStarting with \\(x=2\\) will produce over 1000 correct decimal places in 12 steps.\nIf we apply Steffensen\u0026rsquo;s method starting with \\(x=2\\) we\u0026rsquo;ll see values that wobble about 1.9 for ages before converging to the wrong value. 
Newton\u0026rsquo;s method will work for almost any value (although the larger the initial value, the longer the iterations take to \u0026ldquo;settle down\u0026rdquo;); Steffensen\u0026rsquo;s method will only work when the initial value is close to 1.7.\nA slight improvement Using the \u0026ldquo;central\u0026rdquo; approximation of the derivative:\n\\[ f\u0026rsquo;(x)\\approx\\frac{f(x+h)-f(x-h)}{2h} \\]\nmakes a considerable difference; this leads to the iteration\n\\[ x\\leftarrow x-\\frac{2f(x)^2}{f(x+f(x))-f(x-f(x))} \\]\nThis does however require the computation of three function values, rather than just the original two. A slightly faster version of the above is\n\\[ x\\leftarrow x-\\frac{f(x)^2}{f(x+\\frac{1}{2}f(x))-f(x-\\frac{1}{2}f(x))}. \\]\n","link":"https://numbersandshapes.net/posts/steffensen/","section":"posts","tags":["mathematics","computation"],"title":"A note on Steffensen's method for solving equations"},{"body":"As is well known, tanh-sinh quadrature takes an integral\n\\[ \\int_{-1}^1f(x)dx \\]\nand uses the substitution\n\\[ x = g(t) = \\tanh\\left(\\frac{\\pi}{2}\\sinh t\\right) \\]\nto transform the integral into\n\\[ \\int_{-\\infty}^{\\infty}f(g(t))g\u0026rsquo;(t)dt. \\]\nThe reason this works so well is that the derivative \\(g\u0026rsquo;(t)\\) dies away at a double exponential rate; that is, at the rate of\n\\[ e^{-e^t} \\]\nIn fact,\n\\[ g\u0026rsquo;(t)=\\frac{\\frac{\\pi}{2}\\cosh t}{\\cosh^2(\\frac{\\pi}{2}\\sinh t)} \\]\nand for example, \\(g\u0026rsquo;(10)\\approx 4.4\\times 10^{-15022}\\).\nTo apply this technique, David Bailey, one of the strongest proponents for it, recommends choosing first a small value \\(h\\), some integer \\(N\\) and a working precision so that \\(|g\u0026rsquo;(Nh)f(g(Nh))|\\) is less than the precision you want. So if you want 1000 decimal place accuracy, you need to be sure that your working precision allows you to determine that\n\\[ \\|g\u0026rsquo;(Nh)f(g(Nh))\\| \u0026lt; 10^{-1000}. 
\\]\nIf you start with say \\(h=0.5\\) you can then halve this value at each step until you have approximations that agree to your precision. Bailey claims that \\(h=2^{-12}\\) is \u0026ldquo;sufficient to evaluate most integrals to 1000 place accuracy\u0026rdquo;.\nThe computation then requires calculating the value of\n\\[ h\\sum_{j=-N}^N g\u0026rsquo;(hj)f(g(hj)). \\]\n(I\u0026rsquo;m using exactly the same notation as in Bailey\u0026rsquo;s description.)\nThe elegant thing is that the nodes or abscissae (values at which the function is evaluated) are based on values of \\(hj\\), each step of which includes all values from the previous step. So at each stage we only have to compute the \u0026ldquo;new\u0026rdquo; values.\nFor example, if \\(Nh=2\\), the first positive values of \\(hj\\) are\n\\[ 0,\\frac{1}{2},1,\\frac{3}{2},2 \\]\nand the next step will have\n\\[ 0,\\frac{1}{4},\\frac{1}{2},\\frac{3}{4},1,\\frac{5}{4},\\frac{3}{2},\\frac{7}{4},2 \\]\nof which the new values are\n\\[ \\frac{1}{4},\\frac{3}{4},\\frac{5}{4},\\frac{7}{4}. \\]\nAt the next step, the new values will be\n\\[ \\frac{1}{8},\\frac{3}{8},\\frac{5}{8},\\frac{7}{8},\\frac{9}{8},\\frac{11}{8},\\frac{13}{8},\\frac{15}{8} \\]\nand so on. In fact at each step, with \\(h=2^{-k}\\) we only have to perform computations at the values\n\\[ \\frac{1}{2^k},\\frac{3}{2^k},\\frac{5}{2^k},\\ldots,\\frac{2N-1}{2^k}. \\]\nWe can express this sequence as\n\\[ h+2kh,\\quad k = 0,1,2,\\ldots, N-1 \\]\nand as the value of \\(h\\) halves each step, the value of \\(N\\) doubles each step.\nA test in Julia I decided to see how easy this might be by attempting to evaluate\n\\[ \\int_0^1\\frac{\\arctan x}{x}dx \\]\nThis is a good test integral, as the integrand has a removable singularity at one end. 
By integrating the Taylor series term by term (and assuming convergence), we obtain:\n\\[ 1-\\frac{1}{3^2}+\\frac{1}{5^2}-\\frac{1}{7^2}+\\frac{1}{9^2}-\\cdots \\]\nand this particular sum is known as Catalan\u0026rsquo;s constant and denoted \\(G\\). It is unknown whether \\(G\\) is irrational, let alone transcendental. Because the integral is equal to a known value, the results can be checked against published values including one publication providing 1,500,000 decimal places.\nOver the interval \\([-1,1]\\) the integral transforms to\n\\[ \\int_{-1}^1\\frac{\\arctan((x+1)/2)}{x+1}dx \\]\nI decided to try for 1000 decimal places, and go up to \\(Nh=8\\), given that \\(g\u0026rsquo;(8)\\approx 2.5\\times 10^{-2030}\\). (This is overkill, of course.) We then need enough precision so that near the left endpoint, the function value is never given as -1. For example, in Julia:\nsetprecision(4000) function g(t)::BigFloat bt = big(t) bp = big(pi) return tanh(bp/2*sinh(bt)) end function gd(t)::BigFloat bt = big(t) bp = big(pi) return bp/2*cosh(bt)/cosh(bp/2*sinh(bt))^2 end function at(x)::BigFloat y = big(x) return atan((y+1)/2)/(y+1) end The trouble is at this precision, the value of \\(g(-8)\\) is given as \\(-1.0\\), at which the function is undefined. If we increase the precision to 7000, we find that\nsetprecision(7000) using Printf @printf \u0026#34;%.8e\u0026#34; 1+g(-8) 5.33290917e-2034 This small difference can\u0026rsquo;t be managed at only 4000 bits. Now we can have a crack at the first iteration of computation:\nh = big\u0026#34;0.5\u0026#34; N = 16 inds = [k*h for k=1:N] xs = [g(z) for z in inds] ws = [gd(z) for z in inds] ts = h*gd(0)*at(g(0)) ts += h*sum(ws[i]*at(xs[i]) for i = 1:N) ts += h*sum(ws[i]*at(-xs[i]) for i = 1:N) @printf \u0026#34;%.50e\u0026#34; ts 9.15969525022017573265491207994328001754668713901325e-01 and this is correct to about 5 decimal places. 
We are here exploiting the fact that the weight function \\(g\u0026rsquo;(t)\\) is even, and the substitution function \\(g(t)\\) is odd, so we only have to compute values for positive \\(t\\).\nThe promise of tanh-sinh quadrature is that the accuracy roughly doubles for each halving of the step size. We can test this, by repeatedly iterating the current value with new ordinates, and compare each to a downloaded value of a few thousand decimal places of the constant:\nfor j = 2:10 h = h/2 inds = [h+2*k*h for k in 0:N-1] xs = [g(z) for z in inds] ws = [gd(z) for z in inds] ts = ts/2 + h*sum(ws[i]*at(xs[i]) for i in 1:N) ts = ts + h*sum(ws[i]*at(-xs[i]) for i in 1:N) print(j,\u0026#34; \u0026#34;) @printf \u0026#34;%.8e\u0026#34; abs(ts-catalan) println() N = N*2 end 2 6.01994061e-10 3 6.03834702e-20 4 8.07587315e-38 5 1.15722093e-74 6 9.05835440e-148 7 7.95770023e-294 8 2.44238219e-585 9 4.47198995e-1167 10 1.24233864e-1355 At this stage we have \\(h=0.5^{10}\\) and a result that is well over our intended accuracy of 1000 decimal places.\nIf we were instead to compare the absolute differences of successive results, we would see:\n2 3.93144679e-06 3 6.01994061e-10 4 6.03834702e-20 5 8.07587315e-38 6 1.15722093e-74 7 9.05835440e-148 8 7.95770023e-294 9 2.44238219e-585 10 4.47198995e-1167 and we could infer that we have at least 1167 decimal places accuracy. 
(If we were to go one more step, the difference becomes about \\(1.004\\times 10^{-2304}\\) which is the limit of the set precision.)\n","link":"https://numbersandshapes.net/posts/tanh-sinh-quadrature/","section":"posts","tags":["mathematics","computation"],"title":"Exploring Tanh-Sinh quadrature"},{"body":"An article by Bailey, Jeyabalan and Li, \u0026ldquo;A comparison of three high-precision quadrature schemes\u0026rdquo;, and available online here, compares Gauss-Legendre quadrature, tanh-sinh quadrature, and a rule where the nodes and weights are given by the error function and its integrand respectively.\nHowever, Nick Trefethen of Oxford has shown experimentally that Clenshaw-Curtis quadrature is generally no worse than Gaussian quadrature, with the added advantage that the nodes and weights are easier to obtain.\nI\u0026rsquo;m using here the slightly sloppy convention of taking \u0026ldquo;Clenshaw-Curtis quadrature\u0026rdquo; to be the generic name for any integration rule over the interval \\([-1,1]\\) whose nodes are given by cosine values (more particularly, the zeros of Chebyshev polynomials). However, rules very similar to Clenshaw-Curtis were described by the Hungarian mathematician Lipót Fejér in the 1930s; some authors like to be very clear about the distinctions between the \u0026ldquo;Fejér I\u0026rdquo;, \u0026ldquo;Fejér II\u0026rdquo;, and \u0026ldquo;Clenshaw-Curtis\u0026rdquo; rules, all of which have very slightly different nodes.\nThe quadrature rule The particular quadrature rule may be considered to be an \u0026ldquo;open rule\u0026rdquo; in that, like Gauss-Legendre quadrature, it doesn\u0026rsquo;t use the endpoints. An \\(n\\)-th order rule will have the nodes\n\\[ x_k = \\cos\\left(\\frac{2k+1}{2n}\\pi\\right), k= 0,1,\\ldots,n-1. 
\\]\nThe idea is that we create an interpolating polynomial \\(p(x)\\) through the points \\((x_k,f(x_k))\\), and use that polynomial to obtain the integral approximation\n\\[ \\int_{-1}^1f(x)dx\\approx\\int_{-1}^1p(x)dx. \\]\nHowever, such a polynomial can be written in Lagrange form as\n\\[ p(x) = \\sum_{k=0}^{n-1}f(x_k)p_k(x) \\]\nwith\n\\[ p_k(x)=\\frac{\\prod(x-x_i)}{\\prod (x_k-x_i)} \\]\nwhere the products are taken over \\(0\\le i\\le n-1\\) excluding \\(k\\). This means that the integral approximation can be written as\n\\[ \\int_{-1}^1\\sum_{k=0}^{n-1}f(x_k)p_k(x)dx=\\sum_{k=0}^{n-1}\\left(\\int_{-1}^1p_k(x)dx\\right)f(x_k). \\]\nThus writing\n\\[ w_k = \\int_{-1}^1p_k(x)dx \\]\nwe have an integration rule\n\\[ \\int_{-1}^1f(x)dx\\approx\\sum w_kf(x_k). \\]\nAlternatively, as for a Newton-Cotes rule, we can determine the weights as being the unique values for which\n\\[ \\int_{-1}^1x^mdx = \\sum w_k(x_k)^m \\]\nfor all \\(m=0,1,\\ldots,n-1\\). The integral \\(I_m\\) on the left is equal to 0 for odd \\(m\\) and equal to \\(2/(m+1)\\) for even \\(m\\), so we have an \\(n\\times n\\) linear system consisting of equations\n\\[ w_0x_0^m+w_1x_1^m+\\cdots+w_{n-1}x_{n-1}^m=I_m \\]\nFor example, here is how the weights could be constructed in Python:\nimport numpy as np N = 9 xs = [np.cos((2*(N-k)-1)*np.pi/(2*N)) for k in range(N)] A = np.array([[xs[i]**j for i in range(N)] for j in range(N)]) b = [(1+(-1)**k)/(k+1) for k in range(N)] ws = np.linalg.solve(A,b) ws[:,None] array([[0.05273665], [0.17918871], [0.26403722], [0.33084518], [0.34638448], [0.33084518], [0.26403722], [0.17918871], [0.05273665]]) Then we can approximate an integral, say\n\\[ \\int_{-1}^1e^{-x^2}dx \\]\nby\nf = lambda x: np.exp(-x*x) sum(ws * np.vectorize(f)(xs)) 1.4936477751634403 and this is correct to six decimal places.\nUse of the DCT The above method for obtaining the weights is conceptually easy, but computationally expensive. 
A far neater method, described in an article by Alvise Sommariva, is to use the discrete cosine transform - in particular the DCT III, which is available in most standard implementations as idct.\nDefine a vector \\(m\\) (called by Sommariva the weighted modified moments) of length \\(N\\) by\n\\begin{aligned} m_0\u0026amp;=\\sqrt{2}\\\\ m_k\u0026amp;=2/(1-k^2)\\text{ if \\(k\\ge 2\\) is even}\\\\ m_k\u0026amp;=0\\text{ if \\(k\\) is odd} \\end{aligned}\nThen the weights are given by the DCT of \\(m\\). Again in Python:\nfrom scipy.fftpack import idct m = [np.sqrt(2),0]+[(1+(-1)**k)/(1-k**2) for k in range(2,N)] ws = np.sqrt(2/N)*idct(m,norm=\u0026#39;ortho\u0026#39;) ws[:,None] array([[0.05273665], [0.17918871], [0.26403722], [0.33084518], [0.34638448], [0.33084518], [0.26403722], [0.17918871], [0.05273665]]) which is exactly the same as before.\nArbitrary precision Our choice here (in Python) will be the mpmath library, in spite of some criticisms against it. Our use of it though will be confined to simple arithmetic (sometimes of matrices) rather than more complicated routines such as quadrature. Note also that the criticisms were more in the nature of comparisons; the writer is in fact the principal author of mpmath.\nAs a start, here\u0026rsquo;s how we might develop the above nodes and weights for 30 decimal places. First the nodes:\nfrom mpmath import mp mp.dps = 30 xs = [mp.cos((2*(N-k)-1)*mp.pi/(2*N)) for k in range(N)] for i in range(N): print(xs[i]) -0.984807753012208059366743024589 -0.866025403784438646763723170753 -0.642787609686539326322643409907 -0.342020143325668733044099614682 8.47842766036889964395870146939e-32 0.342020143325668733044099614682 0.642787609686539326322643409907 0.866025403784438646763723170753 0.984807753012208059366743024589 Next the weights. 
As mpmath doesn\u0026rsquo;t have its own DCT routine, we can construct a DCT matrix and multiply by it:\ndct = mp.matrix(N,N) dct[:,0] = mp.sqrt(1/N) for i in range(N): for j in range(1,N): dct[i,j] = mp.sqrt(2/N)*mp.cos(mp.pi/2/N*j*(2*i+1)) m = mp.matrix(N,1) m[0] = mp.sqrt(2) for k in range(2,N,2): m[k] = mp.mpf(2)/(1-k**2) ws = dct*m*mp.sqrt(2/N) ws matrix( [[\u0026#39;0.0527366499099067816401650387891\u0026#39;], [\u0026#39;0.179188712522045851600122685138\u0026#39;], [\u0026#39;0.264037222541004397180813776173\u0026#39;], [\u0026#39;0.330845175168136422780278417119\u0026#39;], [\u0026#39;0.346384479717813038086088934304\u0026#39;], [\u0026#39;0.330845175168136422780278417119\u0026#39;], [\u0026#39;0.264037222541004397180813776173\u0026#39;], [\u0026#39;0.179188712522045851600122685138\u0026#39;], [\u0026#39;0.0527366499099067816401650387892\u0026#39;]]) Given the nodes and the weights, evaluating the integral is straightforward:\nf = lambda x: mp.exp(-x*x) fs = [f(x) for x in xs] ia = sum([a*b for a,b in zip(ws,fs)]) print(ia) 1.49364777516344031419344222023 This is no more accurate than the previous result as we are still using only 9 nodes and weights. But suppose we increase both the precision and the number of nodes:\nmp.dps = 100 N = 128 If we carry out all the above computations with these values, we can check the accuracy of the approximate result against the exact value which is \\(\\sqrt{\\pi}\\phantom{.}\\text{erf}(1)\\):\nie = mp.sqrt(mp.pi)*mp.erf(1) mp.nprint(abs(ia-ie),10) 2.857468478e-101 and we see that we have accuracy to 100 decimal places.\nOnwards and upwards With the current integral\n\\[ \\int_{-1}^1e^{-x^2}dx \\]\nhere are some values with different precisions and values of N, starting with the two we have already:\ndps N absolute difference 30 9 4.904614138e-7 100 128 2.857468478e-101 500 256 8.262799923e-298 1000 512 8.033083996e-667 Note that the code above is in no way optimized; there is no hint of a fast DCT, for example. 
Thus for large values of N and for high precision there is a time-cost; the last computation in the table took 44 seconds. (I have an IBM ThinkPad X1 Carbon 3rd Generation laptop running Arch Linux. It\u0026rsquo;s several years old, but it\u0026rsquo;s proved to be a terrific workhorse.)\nBut this shows that a Clenshaw-Curtis type quadrature approach is quite appropriate for high precision integration.\n","link":"https://numbersandshapes.net/posts/high_precision_clenshaw_curtis/","section":"posts","tags":["mathematics","computation"],"title":"High precision quadrature with Clenshaw-Curtis"},{"body":"Note: This blog post is mainly computational, with a hint of proof-oriented mathematics here and there. For a more in-depth analysis, read the excellent article \u0026ldquo;Gauss, Landen, Ramanujan, the Arithmetic-Geometric Mean, Ellipses, pi, and the Ladies Diary\u0026rdquo; by Gert Almkvist and Bruce Berndt, in The American Mathematical Monthly, vol 95 no. 7 (August-September 1988), pages 585-608, and happily made available online for free at https://www.maa.org/sites/default/files/pdf/upload_library/22/Ford/Almkvist-Berndt585-608.pdf\nThis article was the 1989 winner of the MAA Halmos-Ford award \u0026ldquo;For outstanding expository papers in The American Mathematical Monthly\u0026rdquo; and you should certainly read it.\nIntroduction It\u0026rsquo;s one of the delights or annoyances of mathematics, whichever way you look at it, that there\u0026rsquo;s no simple formula for the circumference of an ellipse comparable to \\(2\\pi r\\) for a circle.\nIndeed, for an ellipse with semi axes \\(a\\) and \\(b\\), the circumference can be expressed as the integral\n\\[ 4\\int_0^{\\pi/2}\\sqrt{a^2\\cos^2\\theta + b^2\\sin^2\\theta}\\,d\\theta \\]\nwhich is one of a class of integrals called elliptic integrals, and which cannot be expressed using algebraic or standard transcendental functions.\nHowever, it turns out that there are ways of very quickly computing the circumference of an 
ellipse to any desired accuracy, using methods which originate with Carl Friedrich Gauss.\nThe arithmetic-geometric mean Given \\(a\u0026gt;b\u0026gt;0\\), we can define two sequences by:\n\\[ a_0 = a,\\qquad a_{k+1}=(a_k+b_k)/2 \\]\nand\n\\[ b_0=b,\\qquad b_{k+1}=\\sqrt{a_kb_k}. \\]\nThus the \\(a\\) values are the arithmetic means of the previous pair; the \\(b\\) values the geometric mean of the pair.\nSince\n\\[ b\u0026lt;\\sqrt{ab}\u0026lt;\\frac{a+b}{2}\u0026lt;a \\]\nand since \\(a_{k+1}\u0026lt;a_k\\) and \\(b_{k+1}\u0026gt;b_k\\), it follows that the \\(a\\) values are decreasing and bounded below, and the \\(b\\) values are increasing and bounded above, so they both converge. Also, if \\(c_k=\\sqrt{a_k^2-b_k^2}\\) then (see Almkvist \\\u0026amp; Berndt pp 587-588):\n\\begin{aligned} c_{k+1}\u0026amp;=\\sqrt{a_{k+1}^2-b_{k+1}^2}\\\\ \u0026amp;=\\sqrt{\\frac{1}{4}(a_k+b_k)^2-a_kb_k}\\\\ \u0026amp;=\\frac{1}{2}(a_k-b_k)\\\\ \u0026amp;=\\frac{c_k^2}{4a_{k+1}}\\\\ \u0026amp;\u0026lt;\\frac{c_k^2}{4M(a,b)}. 
\\end{aligned}\nThis shows that not only do the sequences converge to the same limit, but that the sequences converge quadratically; each iteration being double the precision of the previous.\nThe common limit is called the arithmetic-geometric mean of \\(a\\) and \\(b\\) and will be denoted here as \\(M(a,b)\\).\nTo give an indication of this speed, in Python:\nfrom mpmath import mp mp.dps = 50 a,b = mp.mpf(\u0026#39;3\u0026#39;), mp.mpf(\u0026#39;2\u0026#39;) for i in range(10): a,b = (a+b)/2, mp.sqrt(a*b) print(\u0026#39;{0:52s} {1:52s}\u0026#39;.format(str(a),str(b))) 2.5 2.4494897427831780981972840747058913919659474806567 2.4747448713915890490986420373529456959829737403283 2.4746160019198827700554874647235766528956885806854 2.4746804366557359095770647510382611744393311605069 2.474680435816873015671798992747485357556612143505 2.4746804362363044626244318718928732659979716520059 2.4746804362363044625888873352899297125054403531594 2.4746804362363044626066596035914014892517060025827 2.4746804362363044626066596035914014892516421855508 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 You see that we have reached the limit of precision in six steps. 
To better demonstrate the speed of convergence, use greater precision and display the difference \\(a_k-b_k\\):\nmp.dps = 2000 a,b = mp.mpf(\u0026#39;3\u0026#39;), mp.mpf(\u0026#39;2\u0026#39;) for i in range(10): a,b = (a+b)/2, mp.sqrt(a*b) mp.nprint(a-b,10) 0.05051025722 0.0001288694717 8.388628939e-10 3.55445366e-20 6.381703188e-41 2.057141145e-82 2.137563719e-165 2.307963982e-331 2.690598786e-663 3.656695286e-1327 You can see that the precision does indeed roughly double at each step.\nElliptic integrals and the AGM This integral:\n\\[ K(k) = \\int_0^{\\pi/2}\\frac{1}{\\sqrt{1-k^2\\sin^2\\theta}}d\\theta \\]\nis called a complete elliptic integral of the first kind. If\n\\[ k^2=1-\\frac{b^2}{a^2} \\]\nthen\n\\[ \\int_0^{\\pi/2}\\frac{1}{\\sqrt{1-k^2\\sin^2\\theta}} d\\theta = a\\int_0^{\\pi/2} \\frac{1}{\\sqrt{a^2\\cos^2\\theta +b^2\\sin^2\\theta}} d\\theta \\]\nand the integral on the right is denoted \\(I(a,b)\\). Gauss observed (and proved) that\n\\[ I(a,b)=\\frac{\\pi}{2M(a,b)}. \\]\nThis is in fact equivalent to the assertion that\n\\[ K(x) = \\frac{\\pi}{2M(1-x,1+x)}. \\]\nGauss started with assuming that \\(M(1-x,1+x)\\) was an even function, so could be expressed as a power series in even powers of \\(x\\). He then compared the power series representing \\(K(x)\\) with the integral computed by expanding the integrand as a power series using the binomial theorem and integrating term by term, to obtain\n\\[ \\frac{1}{M(1-x,1+x)}=\\frac{2}{\\pi}K(x)=1+\\left(\\frac{1}{2}\\right)^{2}x^2+ \\left(\\frac{1\\cdot 3}{2\\cdot 4}\\right)^{2}x^4+ \\left(\\frac{1\\cdot 3\\cdot 5}{2\\cdot 4\\cdot 6}\\right)^{2}x^6+\\cdots \\]\nand this power series can be written as\n\\[ \\sum_{n=0}^{\\infty}\\frac{(2n)!^2}{2^{4n}(n!)^4}x^{2n}. 
\\]\nAnother proof, apparently originally due to Newman, uses first the substitution \\(t=b\\tan\\theta\\) to rewrite the integral:\n\\[ I(a,b)=\\int_0^{\\pi/2}\\frac{1}{\\sqrt{a^2\\cos^2\\theta+b^2\\sin^2\\theta}} d\\theta = \\int^{\\infty}_{0}\\frac{1}{\\sqrt{(a^2+t^2)(b^2+t^2)}} dt \\]\nand make the substitution\n\\[ u = \\frac{1}{2}\\left(t-\\frac{ab}{t}\\right) \\]\nwhich produces (after \u0026ldquo;some care\u0026rdquo;, according to the Borwein brothers in their book Pi and the AGM):\n\\[ I(a,b) = \\frac{1}{2}\\int^{\\infty}_{-\\infty}\\frac{1}{\\sqrt{(\\left(\\frac{a+b}{2}\\right)^2+u^2)(ab+u^2)}} du =I(\\frac{a+b}{2},\\sqrt{ab}) = I(a_1,b_1). \\]\nContinuing this process we see that \\(I(a,b) = I(a_k,b_k)\\) for any \\(k\\), and taking the limit, that \\(I(a,b)=I(M,M)\\). Finally\n\\[ I(M,M) = \\int_0^{\\pi/2}\\frac{1}{\\sqrt{M^2\\cos^2\\theta+M^2\\sin^2\\theta}} d\\theta = \\int_0^{\\pi/2}\\frac{1}{M} d\\theta = \\frac{\\pi}{2M}. \\]\nSo we now know that the complete elliptic integral of the first kind is computable by means of the AGM. But what about the complete elliptic integral of the second kind?\nComplete elliptic integrals of the second kind These are defined as\n\\begin{aligned} E(k)\u0026amp;=\\int_0^{\\pi/2}\\sqrt{1-k^2\\sin^2\\theta} d\\theta\\\\ \u0026amp;=\\int_0^1\\sqrt{\\frac{1-k^2t^2}{1-t^2}}\\,dt. \\end{aligned}\nNote: An alternative convention is to write:\n\\begin{aligned} E(m)\u0026amp;=\\int_0^{\\pi/2}\\sqrt{1-m\\sin^2\\theta} d\\theta\\\\ \u0026amp;=\\int_0^1\\sqrt{\\frac{1-mt^2}{1-t^2}}\\,dt. \\end{aligned}\nThis elliptic integral is the one we want, since the perimeter of an ellipse given in cartesian form as\n\\[ \\frac{x^2}{a^2}+\\frac{y^2}{b^2}=1 \\]\nis equal to\n\\[ 4aE\\left(\\sqrt{1-\\frac{b^2}{a^2}}\\right) \\]\n(using the first definition). 
This can be written as\n\\[ 4aE(e) \\]\nwhere \\(e\\) is the eccentricity of the ellipse.\nThe computation of elliptic integrals goes back as far as Euler, taking in Gauss and then Legendre along the way. Adrien-Marie Legendre (1752-1833) is better known now than he was in his lifetime; his contributions to elliptic integrals are now seen as fundamental, and in paving the way for the greater work of Abel and Jacobi.\nIn particular, Legendre showed that\n\\[ \\frac{K(k)-E(k)}{K(k)}=\\frac{1}{2}(c_0^2+2c_1^2+4c_2^2+\\cdots + 2^nc_n^2+\\cdots ) \\]\nwhere\n\\[ c_n^2=a_n^2-b_n^2. \\]\nThe above equation can be rewritten to provide a fast-converging series for \\(E(k)\\):\n\\begin{aligned} E(k)\u0026amp;=K(k)(1-\\frac{1}{2}(c_0^2+2c_1^2+4c_2^2+\\cdots + 2^nc_n^2+\\cdots))\\\\ \u0026amp;=\\frac{\\pi}{2M(a,b)}(1-\\frac{1}{2}(c_0^2+2c_1^2+4c_2^2+\\cdots + 2^nc_n^2+\\cdots)). \\end{aligned}\nIn the sequences for the AGM, let \\(M(a,b)\\) be approximated by \\(a_n\\). This produces a sequence that converges very quickly to \\(E(k)\\):\n\\begin{aligned} e_0\u0026amp;=\\frac{\\pi}{2a}(1-\\frac{1}{2}c_0^2)\\\\ e_1\u0026amp;=\\frac{\\pi}{2a_1}(1-\\frac{1}{2}(c_0^2+2c_1^2))\\\\ e_2\u0026amp;=\\frac{\\pi}{2a_2}(1-\\frac{1}{2}(c_0^2+2c_1^2+4c_2^2)) \\end{aligned}\nThis can easily be managed recursively, given \\(a\\) and \\(b\\), by setting\n\\[ a_0=a,\\qquad b_0=b,\\qquad p_0=1,\\qquad s_0=a^2-b^2 \\]\nand then iterating by\n\\begin{aligned} a_{k+1}\u0026amp;=(a_k+b_k)/2\\\\ b_{k+1}\u0026amp;=\\sqrt{a_kb_k}\\\\ p_{k+1}\u0026amp;=2p_k\\\\ s_{k+1}\u0026amp;=s_k+p_{k+1}(a_{k+1}^2-b_{k+1}^2) \\end{aligned}\nThen the values\n\\[ e_k = \\frac{2\\pi}{a_k}(a_0^2-\\frac{1}{2}s_k) \\]\napproach the perimeter of the ellipse.\nWe can demonstrate this in Python for \\(a,b=3,2\\), again using the multiprecision library mpmath:\nmp.dps = 100 a = mp.mpf(3) b = mp.mpf(2) p = 1 s = a**2-b**2 a0 = a e = 2*mp.pi/a*(a0*a0-1/2*s) for i in range(11): a, b, p = (a+b)/2, mp.sqrt(a*b), p*2 s += p*(a**2-b**2) 
print(2*mp.pi/a*(a0*a0-1/2*s)) 15.70796326794896619231321691639751442098584699687552910487472296153908203143104499314017412671058534 15.86502654157467714117218441586859215641446367453641805450510958042565274839788072998452444741042018 15.86543958660157016021207778974106592713197464956329054113604818310136964968401110893643543265751673 15.86543958929058979121772312468572564968472056328417484804713380278017532700185422580941088121856672 15.86543958929058979133166302778307249672987827943566144626073834574026314097232621543195947183210587 15.86543958929058979133166302778307249673008284832650068966726311774248223910968899525488214058871233 15.86543958929058979133166302778307249673008284832650068966726311774248223910968899591430967903912196 15.865439589290589791331663027783072496730082848326500689667263117742482239109688995914309679039122 15.86543958929058979133166302778307249673008284832650068966726311774248223910968899591430967903912207 15.86543958929058979133166302778307249673008284832650068966726311774248223910968899591430967903912222 15.86543958929058979133166302778307249673008284832650068966726311774248223910968899591430967903912252 We see that we have reached the limits of this precision very quickly. 
And as above, we can look instead at the differences between successive approximations:\nmp.dps=2000 a = mp.mpf(3) b = mp.mpf(2) p = 1 s = a**2-b**2 a0 = a e = 2*mp.pi/a*(a0*a0-1/2*s) for i in range(11): a, b, p = (a+b)/2, mp.sqrt(a*b), p*2 s += p*(a**2-b**2) e1 = 2*mp.pi/a*(a0*a0-1/2*s) mp.nprint(e1-e,10) e = e1 2.094395102 0.1570632736 0.0004130450269 2.689019631e-9 1.139399031e-19 2.045688908e-40 6.594275385e-82 6.852074221e-165 7.398301331e-331 8.624857552e-663 1.172173128e-1326 The convergence is seen to be quadratic.\nUsing Excel It might seem utterly regressive to use Excel - or indeed, any spreadsheet - for computations such as this, but in fact Excel can be used very easily for recursive computations, as long as you\u0026rsquo;re happy to use only IEEE double precision.\nIn Cells A1-E2, enter:\nA B C D E 1 3 2 1 =A1^2-B1^2 2 =(A1+B1)/2 =SQRT(A1*B1) =2*C1 =D1+C2*(A2^2-B2^2) =2*PI()/A2*($A$1^2-1/2*D2) and then copy A2-E2 down a few rows. Because of rounding errors, if you show 14 decimal places in column E, they\u0026rsquo;ll never quite settle down. But in fact you reach the limits of double precision by row 5.\n","link":"https://numbersandshapes.net/posts/circumference_ellipse/","section":"posts","tags":["mathematics","computation"],"title":"The circumference of an ellipse"},{"body":"","link":"https://numbersandshapes.net/tags/algebra/","section":"tags","tags":null,"title":"Algebra"},{"body":"In all the previous discussions of voting power, we have assumed that all winning coalitions are equally likely. But in practice that is not necessarily the case. 
Two or more voters may be opposed on so many issues that they would never vote the same way on any issue: such a pair of voters may be said to be quarrelling.\nTo see how this may make a difference, consider the voting game \\[ [51; 49,48,3] \\] The winning coalitions for which a party is critical are \\[ [49,48],\\; [49,3],\\; [48,3] \\] and since each party is critical to the same number of winning coalitions, the Banzhaf indices are equal.\nBut suppose that the two largest parties are quarrelling, and so the only winning coalitions are then \\[ [49,3],\\; [48,3]. \\] This means that the smaller party has twice the power of each of the larger ones!\nWe can set this up in Sage, using polynomial rings, with an ideal that represents the quarrel. In this case:\nsage: q = 51 sage: w = [49,48,3] sage: R.\u0026lt;x,y,z\u0026gt; = PolynomialRing(QQ) sage: qu = R.ideal(x*y) sage: p = ((1+x^49)*(1+y^48)*(1+z^3)).reduce(qu) sage: p \\[ x^{49}z^3+y^{48}z^3+x^{49}+y^{48}+z^3+1 \\] From this polynomial, we choose the monomials whose degree sums are not less than the quota:\nsage: wcs = [m.degrees() for m in p.monomials() if sum(m.degrees()) \u0026gt;= q] sage: wcs \\[ [(49,0,3),(0,48,3)] \\] To determine the initial (not normalized) Banzhaf values, we traverse this list of tuples, checking in each tuple if the value is critical to its coalition:\nsage: beta = [0,0,0] sage: for c in wcs: sc = sum(c) for i in range(3): if sc-c[i] \u0026lt; q: beta[i] += 1 sage: beta \\[ [1,1,2] \\] or \\[ [0.25, 0.25, 0.5] \\] for normalized values.\nGeneralizing Of course the above can be generalized to any number of voters, and any number of quarrelling pairs. 
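As a cross-check that does not need Sage, the same count can be sketched in plain Python by enumerating coalitions directly. This is only an illustrative brute-force version, not the polynomial-ring method of this post: the function name quarrel_banzhaf and the representation of quarrels as pairs of voter indices are assumptions made for the example, and the enumeration is exponential in the number of voters.

```python
from itertools import combinations

def quarrel_banzhaf(q, w, quarrels):
    # Raw (not normalized) Banzhaf values when some pairs of voters
    # never sit in the same coalition.
    # q: quota, w: list of weights, quarrels: list of index pairs (a, b).
    n = len(w)
    beta = [0] * n
    for r in range(1, n + 1):
        for S in combinations(range(n), r):
            members = set(S)
            # Skip any coalition that contains a quarrelling pair.
            if any(a in members and b in members for a, b in quarrels):
                continue
            total = sum(w[i] for i in S)
            if total < q:
                continue  # not a winning coalition
            # A member is critical if the coalition loses without it.
            for i in S:
                if total - w[i] < q:
                    beta[i] += 1
    return beta

print(quarrel_banzhaf(51, [49, 48, 3], [(0, 1)]))  # prints [1, 1, 2]
```

For the three-party game above this reproduces the raw values [1, 1, 2]; passing an empty list of quarrels gives the ordinary Banzhaf counts.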
For example, consider the Australian Senate, of which the current party representation is:\nAlignment Party Seats Government Liberal 31 National 5 Opposition Labor 26 Crossbench Greens 9 One Nation 2 Centre Alliance 1 Lambie Alliance 1 Patrick Team 1 Although the current government consists of two separate parties, they act in effect as one party with a weight of 36. (This is known formally as the \u0026ldquo;Liberal-National Coalition\u0026rdquo;.) With no quarrels then, we have \\[ [39;36,26,9,2,1,1,1] \\] of which the Banzhaf values have been found to be 52, 12, 12, 10, 4, 4, 4 and the Banzhaf power indices \\[ 0.5306, 0.1224, 0.1224, 0.102, 0.0408, 0.0408, 0.0408. \\] Suppose that the Government and Opposition are quarrelling, as are also the Greens and One Nation. (This is a reasonable assumption, given current politics and the platforms of the respective parties.)\nsage: q = 39 sage: w = [36,26,9,2,1,1,1] sage: n = len(w) sage: R = PolynomialRing(QQ,\u0026#39;x\u0026#39;,n) sage: xs = R.gens() sage: qu = R.ideal([xs[0]*xs[1],xs[2]*xs[3]]) sage: pr = prod(1+xs[j]^w[j] for j in range(n)).reduce(qu) As before, we extract the degrees of those monomials whose sum is at least \\(q\\), and determine which party in each is critical:\nsage: wcs = [m.degrees() for m in pr.monomials() if sum(m.degrees()) \u0026gt;= q] sage: beta = [0]*n sage: for c in wcs: sc = sum(c) for i in range(n): if sc-c[i] \u0026lt; q: beta[i] += 1 sage: beta \\([16,0,7,6,2,2,2]\\)\nwhich can be normalized to\n\\([0.4571, 0.0, 0.2, 0.1714, 0.0571, 0.0571, 0.0571]\\).\nThe remarkable result is that Labor - the Opposition party, with the second largest number of seats in the Senate - loses all its power! 
This means that Labor cannot afford to be in perpetual quarrel with the government parties.\nA simple program And of course, all of the above can be written into a simple Sage program:\ndef qbanzhaf(q,w,r): n = len(w) R = PolynomialRing(QQ,\u0026#39;x\u0026#39;,n) xs = R.gens() qu = R.ideal([xs[y[0]]*xs[y[1]] for y in r]) pr = prod(1+xs[j]^w[j] for j in range(n)).reduce(qu) wcs = [m.degrees() for m in pr.monomials() if sum(m.degrees()) \u0026gt;= q] beta = [0]*n for c in wcs: sc = sum(c) for i in range(n): if sc-c[i] \u0026lt; q: beta[i] += 1 return(beta) All the quarrelling pairs are given as a list, so that the Australian Senate computation could be entered as\nsage: qbanzhaf(q,w,[[0,1],[2,3]]) ","link":"https://numbersandshapes.net/posts/voting_power_quarreling/","section":"posts","tags":["voting","algebra","python"],"title":"Voting power (7): Quarreling voters"},{"body":"As we have seen previously, it\u0026rsquo;s possible to compute power indices by means of polynomial generating functions. We shall extend previous examples to include the Deegan-Packel index, in a way somewhat different to that of Alonso-Meijide et al (see previous post for reference).\nAgain, suppose we consider the voting game\n\\[ [30;28,16,5,4,3,3] \\]\nHere, though, rather than using just one variable, we\u0026rsquo;ll have a variable for each voter. We\u0026rsquo;ll use Sage for this, as it\u0026rsquo;s open source, and provides a very rich environment for computing with polynomial rings.\nWe first create the polynomial\n\\[ p = \\prod_{k=0}^5(1+x_k^{w_k}) \\]\nwhere \\(w_k\\) are the weights given above. 
We are using the Python (and hence Sage) convention that indices are numbered starting at zero.\nsage: q = 30 sage: w = [28,16,5,4,3,3] sage: n = len(w) sage: R = PolynomialRing(QQ,\u0026#39;x\u0026#39;,n) sage: xs = R.gens() sage: pr = prod(1+xs[i]^w[i] for i in range(n)) Now we can extract all the monomials, and consider only those for which the degree sum is not less than the quota \\(q\\):\nsage: pm = pr.monomials() sage: pw = [x for x in pm if sum(x.degrees()) \u0026gt;= q] sage: pw = pw[::-1]; pw \\begin{aligned} \u0026amp;\\left[x_{1}^{16} x_{2}^{5} x_{3}^{4} x_{4}^{3} x_{5}^{3},\\; x_{0}^{28} x_{5}^{3},\\; x_{0}^{28} x_{4}^{3},\\; x_{0}^{28} x_{3}^{4},\\; x_{0}^{28} x_{2}^{5},\\; x_{0}^{28} x_{4}^{3} x_{5}^{3},\\; x_{0}^{28} x_{3}^{4} x_{5}^{3},\\; x_{0}^{28} x_{3}^{4} x_{4}^{3},\\right.\\cr \u0026amp;\\qquad x_{0}^{28} x_{2}^{5} x_{5}^{3},\\; x_{0}^{28} x_{2}^{5} x_{4}^{3},\\; x_{0}^{28} x_{2}^{5} x_{3}^{4},\\; x_{0}^{28} x_{3}^{4} x_{4}^{3} x_{5}^{3},\\; x_{0}^{28} x_{2}^{5} x_{4}^{3} x_{5}^{3},\\; x_{0}^{28} x_{2}^{5} x_{3}^{4} x_{5}^{3},\\cr \u0026amp;\\qquad x_{0}^{28} x_{2}^{5} x_{3}^{4} x_{4}^{3},\\; x_{0}^{28} x_{2}^{5} x_{3}^{4} x_{4}^{3} x_{5}^{3},\\; x_{0}^{28} x_{1}^{16},\\; x_{0}^{28} x_{1}^{16} x_{5}^{3},\\; x_{0}^{28} x_{1}^{16} x_{4}^{3},\\; x_{0}^{28} x_{1}^{16} x_{3}^{4},\\cr \u0026amp;\\qquad x_{0}^{28} x_{1}^{16} x_{2}^{5},\\; x_{0}^{28} x_{1}^{16} x_{4}^{3} x_{5}^{3},\\; x_{0}^{28} x_{1}^{16} x_{3}^{4} x_{5}^{3},\\; x_{0}^{28} x_{1}^{16} x_{3}^{4} x_{4}^{3},\\; x_{0}^{28} x_{1}^{16} x_{2}^{5} x_{5}^{3},\\cr \u0026amp;\\qquad x_{0}^{28} x_{1}^{16} x_{2}^{5} x_{4}^{3},\\; x_{0}^{28} x_{1}^{16} x_{2}^{5} x_{3}^{4},\\; x_{0}^{28} x_{1}^{16} x_{3}^{4} x_{4}^{3} x_{5}^{3},\\; x_{0}^{28} x_{1}^{16} x_{2}^{5} x_{4}^{3} x_{5}^{3},\\cr \u0026amp;\\qquad \\left. 
x_{0}^{28} x_{1}^{16} x_{2}^{5} x_{3}^{4} x_{5}^{3},\\; x_{0}^{28} x_{1}^{16} x_{2}^{5} x_{3}^{4} x_{4}^{3},\\; x_{0}^{28} x_{1}^{16} x_{2}^{5} x_{3}^{4} x_{4}^{3} x_{5}^{3}\\right] \\end{aligned}\nAs we did with subsets, we can winnow out the monomials which are multiples of others, by writing a recursive function:\ndef mwc(q,t,p): if len(p)==0: return(t) else: for x in p[1:]: if p[0].divides(x): p.remove(x) return(mwc(q,t+[p[0]],p[1:])) We can apply it:\nsage: pw2 = pw.copy() sage: mwc1 = mwc(q,[],pw2) sage: mwc1 \\[ \\left[x_{1}^{16} x_{2}^{5} x_{3}^{4} x_{4}^{3} x_{5}^{3},\\; x_{0}^{28} x_{5}^{3},\\; x_{0}^{28} x_{4}^{3},\\; x_{0}^{28} x_{3}^{4},\\; x_{0}^{28} x_{2}^{5},\\; x_{0}^{28} x_{1}^{16}\\right] \\]\nNow it\u0026rsquo;s just a matter of working out the indices from the variables, and this is most easily done in Python with a dictionary, with the variables as keys and their indices as values.\nsage: dp = {xs[i]:0 for i in range(n)} sage: for m in mwc1: mv = m.variables() nm = len(mv) for x in mv: dp[x] += 1/nm sage: dp \\[ \\left\\{x_{0} : \\frac{5}{2}, x_{1} : \\frac{7}{10}, x_{2} : \\frac{7}{10}, x_{3} : \\frac{7}{10}, x_{4} : \\frac{7}{10}, x_{5} : \\frac{7}{10}\\right\\} \\]\nAnd this of course can be normalized:\nsage: s = sum(dp.values()) sage: for x in dp.keys(): dp[x] = dp[x]/s sage: dp \\[ \\left\\{x_{0} : \\frac{5}{12}, x_{1} : \\frac{7}{60}, x_{2} : \\frac{7}{60}, x_{3} : \\frac{7}{60}, x_{4} : \\frac{7}{60}, x_{5} : \\frac{7}{60}\\right\\} \\]\nAnd these are the values we found in the previous post, using Julia and working with subsets.\nBack to Banzhaf Recall that the Banzhaf power indices could be computed from polynomials in two steps. For voter \\(i\\), define\n\\[ f_i(x) = \\prod_{j\\ne i}(1+x^{w_j}) \\]\nand suppose that the coefficient of \\(x^k\\) is \\(a_k\\). 
Then define\n\\[ \\beta_i=\\sum_{j=q-w_i}^{q-1}a_j \\]\nand these values are normalized for the Banzhaf power indices.\nSuppose we define, as for the Deegan-Packel indices,\n\\[ p = \\prod_{k=1}^n(1+x_k^{w_k}). \\]\nThen reduce \\(p\\) modulo \\(x_i^{w_i}\\). This has the effect of setting \\(x_i^{w_i}=0\\) in \\(p\\) so simply removes all monomials containing \\(x_i^{w_i}\\). Then \\(\\beta_i\\) is computed by adding the coefficients of all monomials whose degree sum lies between \\(q-w_i\\) and \\(q-1\\).\nFirst define the polynomial ring and the polynomial \\(p\\):\nsage: w = [28,16,5,4,3,3] sage: q = 30 sage: n = len(w) sage: R = PolynomialRing(QQbar,\u0026#39;x\u0026#39;,n) sage: xs = R.gens() sage: p = prod(1+xs[i]^w[i] for i in range(n)) Now for the reduction and computation:\nsage: ids = [R.ideal(xs[i]^w[i]) for i in range(n)] sage: beta = [] sage: for i in range(n): pri = p.reduce(ids[i]) ms = pri.monomials() cs = pri.coefficients() cds = [(x,sum(y.degrees())) for x,y in zip(cs,ms)] beta += [sum(x for (x,y) in cds if y\u0026gt;=(q-w[i]) and (y\u0026lt;q))] sage: print(beta) \\[ [30,2,2,2,2,2] \\]\nfrom which the Banzhaf power indices can be computed as\n\\[ \\left[\\frac{3}{4},\\;\\frac{1}{20},\\;\\frac{1}{20},\\;\\frac{1}{20},\\;\\frac{1}{20},\\;\\frac{1}{20}\\right]. \\]\nNote that this is of course unnecessarily clumsy, as the Banzhaf indices can be easily and readily computed using univariate polynomials. 
We may thus consider this approach as a proof of concept, rather than a genuine alternative.\nIn Sage, Banzhaf indices can be computed with polynomials very neatly, given weights and a quota:\nsage: def banzhaf(q,w): n = len(w) beta = [] for i in range(n): pp = prod(1+x^w[j] for j in range(n) if j != i) cc = pp.list() beta += [sum(cc[j] for j in range(len(cc)) if j\u0026gt;=(q-w[i]) and (j\u0026lt;q))] return(beta) These values can then be easily normalized to produce the Banzhaf power indices.\n","link":"https://numbersandshapes.net/posts/voting_power_polynomial_rings/","section":"posts","tags":["voting","algebra","python"],"title":"Voting power (6): Polynomial rings"},{"body":"We have explored the Banzhaf and Shapley-Shubik power indices, which both consider the ways in which any voter can be pivotal, or critical, or necessary, to a winning coalition.\nA more recent power index, which takes a different approach, was defined by Deegan and Packel in 1976, and considers only minimal winning coalitions. A winning coalition \\(S\\) is minimal if every member of \\(S\\) is critical to it, or alternatively, if \\(S\\) does not contain any winning coalition as a proper subset. It is easy to see that these are equivalent, for if \\(i\\in S\\) was not critical, then \\(S-\\{i\\}\\) would be a winning coalition which is a proper subset.\nGiven \\(N=\\{1,2,\\ldots,n\\}\\), let \\(W\\subset 2^N\\) be the set of all minimal winning coalitions, and let \\(W_i\\subset W\\) be those that contain the voter \\(i\\). Then we define \\[ DP_i=\\sum_{S\\in W_i}\\frac{1}{|S|} \\] where \\(|S|\\) is the cardinality of \\(S\\).\nFor example, consider the voting game \\[ [16;10,9,6,5] \\] Using the indices 1 to 4 for the voters, the minimal winning coalitions are \\[ \\{1,2\\},\\{1,3\\},\\{2,3,4\\}. 
\\] and hence\n\\begin{aligned} DP_1 \u0026amp;= \\frac{1}{2}+\\frac{1}{2} = 1 \\\\cr DP_2 \u0026amp;= \\frac{1}{2}+\\frac{1}{3} = \\frac{5}{6} \\\\cr DP_3 \u0026amp;= \\frac{1}{2}+\\frac{1}{3} = \\frac{5}{6} \\\\cr DP_4 \u0026amp;= \\frac{1}{3} \\end{aligned}\nand these values can be normalized so that their sum is unity: \\[ [1/3,5/18,5/18,1/9]\\approx [0.3333, 0.2778,0.2778, 0.1111]. \\] In comparison, both the Banzhaf and Shapley-Shubik indices return \\[ [0.4167, 0.25, 0.25, 0.0833]. \\]\nAllied to the Deegan-Packel index is Holler\u0026rsquo;s public-good index, also called the Holler-Packel index, defined as \\[ H_i=\\frac{|W_i|}{\\sum_{j\\in N}|W_j|}. \\] In other words, this index first counts the number of minimal winning coalitions that contain \\(i\\), and then normalises those values for the sum to be unity.\nIn the example above, we have voters 1, 2, 3 and 4 being members of 2, 2, 2, 1 minimal winning coalitions respectively, and so the power indices are\n\\[ [2/7, 2/7, 2/7, 1/7] \\approx [0.2857,0.2857,0.2857,0.1429]. \\]\nImplementation (1): Python We can implement the Deegan-Packel index in Python, either by using itertools, or simply rolling our own little functions:\ndef all_subsets(X): T = [[]] for x in X: T += [t+[x] for t in T] return(T) def is_subset(A,B): out = True for a in A: if B.count(a) == 0: out = False break return(out) def mwc(q,S,T): if len(T) == 0: return(S) else: if sum(T[0]) \u0026gt;= q: S += [T[0]] temp = [t for t in T if not is_subset(T[0],t)] return(mwc(q,S,temp)) else: return(mwc(q,S,T[1:])) def prod(A): m = len(A) n = len(A[0]) p = 1 for i in range(m): for j in range(n): p *= A[i][j] return(p) Of the first three functions above, the first simply returns all subsets (as lists); the second tests whether one list is a subset of another (treating both as sets), and the third returns all minimal winning coalitions using an elementary recursion. 
The function starts off considering all subsets of the set of weights, and goes through the list until it finds one whose sum is at least equal to the quota. Then it removes all other subsets which are supersets of the found one. Then it calls the routine on this smaller list.\nFor example:\n\u0026gt;\u0026gt;\u0026gt; wts = [10,9,6,5] \u0026gt;\u0026gt;\u0026gt; T = all_subsets(wts)[1:] \u0026gt;\u0026gt;\u0026gt; q = 16 \u0026gt;\u0026gt;\u0026gt; mwc(q,[],T) [[10, 9], [10, 6], [9, 6, 5]] It is an easy matter now to obtain the Deegan-Packel indices:\ndef dpi(q,wts): m = mwc(q,[],all_subsets(wts)[1:]) dp = [] for w in wts: d = 0 for x in m: d += x.count(w)/len(x) dp += [d] return(dp) And as an example:\n\u0026gt;\u0026gt;\u0026gt; wts = [10,9,6,5] \u0026gt;\u0026gt;\u0026gt; q = 16 \u0026gt;\u0026gt;\u0026gt; dpi(q,wts) [1.0, 0.8333333333333333, 0.8333333333333333, 0.3333333333333333] and of course these can be normalized so that their sum is unity.\nImplementation (2): Julia Now we\u0026rsquo;ll use Julia, and its Combinatorics library. 
Because Julia implements JIT compiling, its speed is generally faster than that of Python.\nJust to be different, we\u0026rsquo;ll develop two functions, one which first produces all winning coalitions, and the second which winnows that set to just the minimal winning coalitions:\nusing Combinatorics function wcs(q,w) S = powerset(w) out = [] for s in S if sum(s) \u0026gt;= q append!(out,[s]) end end return(out) end function mwc(q,out,wc) if isempty(wc) return(out) else f = wc[1] popfirst!(wc) temp = [] # temp finds all supersets of f = wc[1] for w in wc if issubset(f,w) append!(temp,[w]) end end return(mwc(q,append!(out,[f]),setdiff(wc,temp))) end end Now we can try it out:\njulia\u0026gt; q = 16; julia\u0026gt; w = [10,9,6,5]; julia\u0026gt; cs = wcs(q,w) 7-element Array{Any,1}: [10, 9] [10, 6] [10, 9, 6] [10, 9, 5] [10, 6, 5] [9, 6, 5] [10, 9, 6, 5] julia\u0026gt; mwc(q,[],cs) 3-element Array{Any,1}: [10, 9] [10, 6] [9, 6, 5] Repeated elements Although both Julia and Python work with multisets, this becomes tricky in terms of the power indices. A simple expedient is to change repeated indices by small amounts so that they are all different, but that the sum will not affect any quota. If we have for example four indices which are the same, we can add 0.1, 0.2, 0.3 to three of them.\nSo we consider the example\n\\[ [30;28,16,5,4,3,3] \\]\ngiven as an example of a polynomial method in the article \u0026ldquo;Computation of several power indices by generating functions\u0026rdquo; by J. M. 
Alonso-Meijide et al; you can find the article on Science Direct.\nSo:\njulia\u0026gt; q = 30; julia\u0026gt; w = [28,16,5,4,3.1,3]; julia\u0026gt; cs = wcs(q,w); julia\u0026gt; ms = mwc(q,[],cs) 6-element Array{Any,1}: [28.0, 16.0] [28.0, 5.0] [28.0, 4.0] [28.0, 3.1] [28.0, 3.0] [16.0, 5.0, 4.0, 3.1, 3.0] From here it\u0026rsquo;s an easy matter to compute the Deegan-Packel power indices:\njulia\u0026gt; dp = [] for i = 1:6 x = 0//1 for m in ms x = x + count(j -\u0026gt; j == w[i],m)//length(m) end append!(dp, [x]) end julia\u0026gt; print(dp) Any[5//2, 7//10, 7//10, 7//10, 7//10, 7//10] julia\u0026gt; print([x/sum(dp) for x in dp]) Rational{Int64}[5//12, 7//60, 7//60, 7//60, 7//60, 7//60] and these are the values obtained by the authors (but with a lot less work).\n","link":"https://numbersandshapes.net/posts/voting_power_deegan-packel_holler/","section":"posts","tags":["voting","algebra","python","julia"],"title":"Voting power (5): The Deegan-Packel and Holler power indices"},{"body":"","link":"https://numbersandshapes.net/tags/cad/","section":"tags","tags":null,"title":"CAD"},{"body":"Recently a friend and I wrote a semi-serious paper called \u0026ldquo;The geometry of impossible objects\u0026rdquo; to be delivered at a mathematics technology conference. The reviewer was not hugely complimentary, saying that there was nothing new in the paper. 
Well, maybe not, but we had fun pulling together some information about impossible shapes and how to draw them.\nYou can see some of our programs hosted at repl.it.\nBut my interest here is to look at the three-dimensional versions of such objects; how to use a programmable CAD to create 3D objects which, from a particular perspective, look impossible.\nHere\u0026rsquo;s an example in Perth, Western Australia:\n(I think the sides are a bit too thin to give a proper feeling for the shape, though.)\nAnother image, neatly showing how such a Penrose triangle can be created, is this of a \u0026ldquo;Unique vase\u0026rdquo;:\nWith this in mind, we can easily create such a shape in any 3D CAD program we like. I\u0026rsquo;ve written before about programmable CAD, but then using OpenJSCAD. However, I decided to switch to Python\u0026rsquo;s CadQuery, because it offered the option of orthographic viewing instead of just perspective viewing. This meant that all objects retained their size even if further away from the viewpoint, which made cutting out from the foreground objects much easier.\nFor making shapes available online, the best solution I found was the online and free viewer and widget provided by LAI4D and its online Laboratory. This provides not only a nice environment to explore 3D shapes, but the facility to export the shape as an html widget for viewing in an iframe.\nPenrose triangle In fact, this first shape is created with OpenJSCAD which I was using before discovering CadQuery. The only problem with OpenJSCAD - of which there\u0026rsquo;s a new version called simply JSCAD - is that it doesn\u0026rsquo;t seem to allow orthographic viewing. Orthographic viewing is preferable to perspective viewing for impossible figures, as it\u0026rsquo;s much easier to work out how to line everything up as necessary.\nYou can see the version of my OpenJSCAD Penrose triangle by opening up this link:\nhttp://bit.ly/3q5OAer\nThis should open up OpenJSCAD with the shape in it. 
You should be able to move it around until you get the impossible effect. You\u0026rsquo;ll see the JavaScript file that produced it on the right.\nThis shows the starting shape and what you should aim to produce:\nAnd here is the same construction in CadQuery, but working with an orthographic projection. As before, we create the cut-away beam as a two-dimensional shape given by the coordinates of its vertices, then \u0026ldquo;extrude\u0026rdquo; it perpendicular to its plane. The remaining beam is a rectangular box.\npts = [ (0,0), (50,0), (50,10), (49.9,10), (49.9,0.1), (39.9,0.1), (30,10), (10,10), (10,40), (0,40) ] L_shape = cq.Workplane(\u0026#34;front\u0026#34;).polyline(pts).close().extrude(10) upright = cq.Workplane(\u0026#34;front\u0026#34;).transformed(offset=(5,45,-15)).box(10,10,50) pt = Assembly( [ Part(L_shape, \u0026#34;L shape\u0026#34;), Part(upright, \u0026#34;Upright\u0026#34;), ], \u0026#34;Penrose triangle\u0026#34;) exportSTL(pt,\u0026#34;penrose_triangle.stl\u0026#34;, linear_deflection=0.01, angular_deflection=0.1) show(pt) The saved STL file can then be imported into the LAI4D online widget, and saved as an iframe for viewing.\nThe Reutersvärd triangle, which is named for the Swedish graphics designer Oscar Reutersvärd, seems to predate the Penrose triangle, and is a thing of great beauty.\nAnyway, to see it in LAI4D, go to:\nSTL Reutersvard triangle here\nThere are many hundreds of clever and witty impossible figures at the Impossible Figure Library, which I strongly recommend you visit!\nPenrose Staircase There\u0026rsquo;s a nice video on YouTube here which I used as the basis for my model, but since then I\u0026rsquo;ve discovered other CAD models, for example on yeggi.com. 
Anyway, the model is made up of square prisms of different heights and bases, with a final shape jutting out at the top:\nThe Python code is:\n{{\u0026lt; highlight python \u0026gt;}} transforms = [(0,0,0),(0,-1,0),(0,-2,5),(0,-3,5),(1,-3,5),(2,-3,5),(3,-3,5),(4,-3,5), (4,-2,5),(4,-1,5),(3,-1,5),(2,-1,20)] vscale = 0.2 vtransforms = [(t[0],t[1],t[2]*vscale) for t in transforms]\nheights = [11,12,8,9,10,11,12,13,14,15,16,2]\nboxes = [cq.Workplane(\u0026#34;front\u0026#34;).transformed(offset = vtransforms[i]).box(1,1,heights[i]*vscale,centered=(True,True,False)) for i in range(11)]\npts1 = [(0,0),(0,1),(-1,1),(-1,0.2),(-0.2,0.2)] box12 = cq.Workplane(\u0026#34;front\u0026#34;).polyline(pts1).close().extrude(0.4)\npts2 = [(0.2,0),(0.6,0),(0.2,0.4)] prism = cq.Workplane(\u0026#34;YZ\u0026#34;).transformed(offset=(0,0,-1)).polyline(pts2).close().extrude(0.8) shape = box12.cut(prism).translate((2.5,-1.5,4))\nboxes += [shape]\nps = Assembly( [Part(boxes[i],\u0026#34;Box \u0026#34;+str(i),\u0026#34;#daa520\u0026#34;) for i in range(12)], \u0026#34;Penrose staircase\u0026#34;)\nexportSTL(ps,\u0026#34;penrose_staircase2.stl\u0026#34;, linear_deflection=0.01, angular_deflection=0.1)\n{{\u0026lt; /highlight \u0026gt;}}\nBelow is an iframe containing the LAI4D widget; it may take a few seconds to load, and you may need to refresh the page:\nIf you get the shape into a state from which you can\u0026rsquo;t get the stair effect working, click on the little circular arrows in the bottom left, which will reset the object to its initial orientation.\nImpossible box This was just a matter of creating the edges, finding a nice view, and using some trial and error to slice through the frontmost beams so that it seemed that the rear beams were at the front.\nHere is another widget, again you may have to wait for it to 
load.\n","link":"https://numbersandshapes.net/posts/three_d_impossible_cad/","section":"posts","tags":["geometry","CAD"],"title":"Three-dimensional impossible CAD"},{"body":"Introduction and recapitulation Recall from previous posts that we have considered two power indices for computing the power of a voter in a weighted system; that is, the ability of a voter to influence the outcome of a vote. Such systems occur when the voting body is made up of a number of \u0026ldquo;blocs\u0026rdquo;: these may be political parties, countries, states, or any other groupings of people, and it is assumed that within every bloc all members will vote the same way. Indeed, in some legislatures, voting according to \u0026ldquo;the party line\u0026rdquo; is a requirement of membership.\nExamples include the American Electoral College, in which the \u0026ldquo;voters\u0026rdquo; are the states; the Australian Senate, the European Union, the International Monetary Fund, and many others.\nGiven a set \\(N=\\{1,2,\\ldots,n\\}\\) of voters and their weights \\(w_i\\), and a quota \\(q\\) required to pass any motion, we have represented this as\n\\[ [q;w_1,w_2,\\ldots,w_n] \\]\nand we define a winning coalition as any subset \\(S\\subset N\\) of voters for which\n\\[ \\sum_{i\\in S}w_i\\ge q. \\]\nIt is convenient to define a characteristic function \\(v\\) on all subsets of \\(N\\) so that \\(v(S)=1\\) if \\(S\\) is a winning coalition, and 0 otherwise. Given a winning coalition \\(S\\), a voter \\(i\\in S\\) is necessary if \\(v(S)-v(S-\\{i\\})=1\\).\nFor any voter \\(i\\), the number of winning coalitions for which that voter is necessary is\n\\[ \\beta_i = \\sum_S\\left(v(S)-v(S-\\{i\\})\\right) \\]\nwhere the sum is taken over all winning coalitions. Then the Banzhaf power index is this value normalized so that the sum of all indices is unity:\n\\[ B_i=\\frac{\\beta_i}{\\sum_{i=1}^n \\beta_i}. 
\\]\nThe Shapley-Shubik power index is defined by considering all permutations \\(p\\) of \\(N\\). Taking cumulative sums from the left, a voter \\(p_k\\) is pivotal if this is the first voter for which the cumulative sum is at least \\(q\\). For each voter \\(i\\), let \\(\\sigma_i\\) be the number of permutations for which \\(i\\) is pivotal. Then\n\\[ S_i=\\frac{1}{n!}\\sigma_i \\]\nwhich ensures that the sum of all indices is unity.\nAlthough these two indices seem very different, there is in fact a deep connection. Consider any permutation \\(p\\) and suppose that \\(i=p_k\\) is the pivotal voter. This voter will also be pivotal in all permutations for which \\(i=p_k\\) and the values to the right and left of \\(i\\) stay there: there will be \\((k-1)!(n-k)!\\) such permutations. However, we can consider the values up to and including \\(i=p_k\\) as a winning coalition \\(S\\) of size \\(k=|S|\\) for which \\(i\\) is necessary, which means we can write\n\\[ S_i=\\sum_S\\frac{(n-k)!(k-1)!}{n!}\\left(v(S)-v(S-\\{i\\})\\right) \\]\nwhich can be compared to the Banzhaf index above as being similar and with a different weighting function. Note that the above expression can be written as\n\\[ S_i=\\sum_S\\left(k{\\dbinom n k}\\right)^{-1}\\left(v(S)-v(S-\\{i\\})\\right) \\]\nwhich uses smaller numbers. 
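The agreement between the permutation definition and the coalition formula can be checked numerically on a small game. The following Python sketch is an illustration only: the function names are made up for the example, the game [16; 10, 9, 6, 5] is chosen purely because it is small, and both routines are brute-force enumerations rather than anything efficient.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb, factorial

def shapley_by_permutations(q, w):
    # Count, for each voter, the permutations in which that voter is pivotal.
    n = len(w)
    sigma = [0] * n
    for p in permutations(range(n)):
        total = 0
        for i in p:
            total += w[i]
            if total >= q:
                sigma[i] += 1  # first voter to reach the quota
                break
    return [Fraction(s, factorial(n)) for s in sigma]

def shapley_by_coalitions(q, w):
    # Sum 1/(k*C(n,k)) over winning coalitions of size k
    # in which the voter is critical.
    n = len(w)
    index = [Fraction(0)] * n
    for k in range(1, n + 1):
        weight = Fraction(1, k * comb(n, k))
        for S in combinations(range(n), k):
            total = sum(w[i] for i in S)
            if total >= q:
                for i in S:
                    if total - w[i] < q:
                        index[i] += weight
    return index

a = shapley_by_permutations(16, [10, 9, 6, 5])
b = shapley_by_coalitions(16, [10, 9, 6, 5])
print([str(x) for x in a])  # prints ['5/12', '1/4', '1/4', '1/12']
print([str(x) for x in b])  # prints ['5/12', '1/4', '1/4', '1/12']
```

The second routine only ever forms the weight 1/(k*C(n,k)), so it never needs the value of n! at all.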
For example, if \\(n=50\\) then \\(n!\\approx 3.0414\\times 10^{64}\\) but the largest binomial value is only approximately \\(1.264\\times 10^{14}\\).\nComputing with polynomials We have also seen that if we define\n\\[ f_i(x) = \\prod_{m\\ne i}(1+x^{w_m}) \\]\nthen\n\\[ \\beta_i = \\sum_{j=q-w_i}^{q-1}a_j \\]\nwhere \\(a_j\\) is the coefficient of \\(x^j\\) in \\(f_i(x)\\).\nSimilarly, if\n\\[ f_i(x,y) = \\prod_{m\\ne i}(1+yx^{w_m}) \\]\nthen\n\\[ S_i=\\sum_{k=0}^{n-1}\\frac{k!(n-1-k)!}{n!}\\sum_{j=q-w_i}^{q-1}c_{jk} \\]\nwhere \\(c_{jk}\\) is the coefficient of \\(x^jy^k\\) in the expansion of \\(f_i(x,y)\\).\nIn a previous post we have shown how to implement these in Python, using the Sympy library. However, Python can be slow, and using Cython is not trivial. We thus here show how to use Julia and its Polynomials package.\nUsing Julia for speed The Banzhaf power indices can be computed almost directly from the definition:\nusing Polynomials function px(n) return(Polynomial([0,1])^n) end function banzhaf(q,w) n = length(w) inds = vec(zeros(Int64,1,n)) for i in 1:n p = Polynomial([1]) for j in 1:i-1 p = p * (1 + px(w[j])) end for j in i+1:n p = p * (1 + px(w[j])) end inds[i] = sum(coeffs(p)[k] for k in q-w[i]+1:q) end return(inds) end This will actually return the vector of \\(\\beta_i\\) values which can then be easily normalized.\nThe function px is a \u0026ldquo;helper function\u0026rdquo; that simply returns the polynomial \\(x^n\\), so that each factor \\(1+x^{w_j}\\) is formed as 1 + px(w[j]).\nFor the Shapley-Shubik indices, the situation is a bit trickier. There are indeed some Julia libraries for multivariate polynomials, but they seem (at the time of writing) to be not fully functional. However, consider the polynomials\n\\[ f_i(x,y)=\\prod_{m\\ne i}(1+yx^{w_m}) \\]\nfrom above. We can consider this as a polynomial of degree \\(n-1\\) in \\(y\\), whose coefficients are polynomials in \\(x\\). 
So if\n\\[ f_i(x,y) = 1 + p_1(x)y + p_2(x)y^2 +\\cdots + p_{n-1}(x)y^{n-1} \\]\nthen \\(f_i(x,y)\\) can be represented as a vector of polynomials\n\\[ [1,p_1(x),p_2(x),\\ldots,p_{n-1}(x)]. \\]\nWith this representation, we need to perform a multiplication by \\(1+yx^p\\) and also determine coefficients.\nMultiplication is easy, noting at once that \\(1+yx^p\\) is linear in \\(y\\), and so we use the expansion of the product\n\\[ (1+ay)(1 + b_1y + b_2y^2 + \\cdots + b_{n-1}y^{n-1}) \\]\nto\n\\[ 1 + (a+b_1)y + (ab_1+b_2)y^2 + \\cdots + (ab_{n-2}+b_{n-1})y^{n-1} + ab_{n-1}y^n. \\]\nThis can be readily programmed as:\nfunction mulp1(n,p) p0 = Polynomial(0) px = Polynomial([0,1]) c1 = cat(p,p0,dims=1) c2 = cat(p0,p,dims=1) return(c1 + px^n .* c2) end The first two lines aren\u0026rsquo;t really necessary, but they do make the procedure easier to read.\nAnd we need a little program for extracting coefficients, with a result of zero if the power is greater than the degree of the polynomial (Julia\u0026rsquo;s coeffs function simply produces a list of all the coefficients in the polynomial.)\nfunction one_coeff(p,n) d = degree(p) pc = coeffs(p) if n \u0026lt;= d return(pc[n+1]) else return(0) end end Now we can put all of this together in a function very similar to the Python function for computing the Shapley-Shubik indices with polynomials:\nfunction shapley(q,w) n = length(w) inds = vec(zeros(Float64,1,n)) for i in 1:n p = Polynomial(1) for j in 1:i-1 p = mulp1(w[j],p) end for j in i+1:n p = mulp1(w[j],p) end B = vec(zeros(Float64,1,n)) for j in 1:n B[j] = sum(one_coeff(p[j],k) for k in q-w[i]:q-1) end inds[i] = sum(B[j+1]/binomial(n,j)/(n-j) for j in 0:n-1) end return(inds) end And a quick test (with timing) of the powers of the states in the Electoral College; here ecv is the number of electors of all the states, in alphabetical order (the first states are Alabama, Alaska, Arizona, and the last states are West Virginia, Wisconsin, Wyoming):\necv = [9, 3, 11, 6, 55, 
9, 7, 3, 3, 29, 16, 4, 4, 20, 11, 6, 6, 8, 8, 4, 10, 11, 16, 10, 6, 10, 3, 5, 6, 4, 14, 5, 29, 15, 3, 18, 7, 7, 20, 4, 9, 3, 11, 38, 6, 3, 13, 12, 5, 10, 3] @time(s = shapley(270,ecv)); 0.722626 seconds (605.50 k allocations: 713.619 MiB, 7.95% gc time) This is running on a Lenovo X1 Carbon, 3rd generation, using Julia 1.5.3. The operating system is a very recently upgraded version of Arch Linux, and currently using kernel 5.10.3.\n","link":"https://numbersandshapes.net/posts/voting_power_speeding/","section":"posts","tags":["voting","algebra","julia"],"title":"Voting power (4): Speeding up the computation"},{"body":"As we all know, American Presidential elections are done with a two-stage process: first the public votes, and then the Electoral College votes. It is the Electoral College that actually votes for the President; but they vote (in their respective states) in accordance with the plurality determined by the public vote. This unusual system was devised by the Founding Fathers as a compromise between mob rule and autocracy, both of which they were determined to guard against. The Electoral College is not now an independent body: in all states but two all electoral college votes are given to the winner in that state. This means that the Electoral College may \u0026ldquo;amplify\u0026rdquo; the public vote; or it may return a vote which differs from the public vote, in that a candidate may receive a majority of public votes, and yet still lose the Electoral College vote. This means that there are periodic calls for the Electoral College to be disbanded, but in reality that seems unlikely. And in fact as far back as 1834 the then President, Andrew Jackson, was demanding its disbanding: a President, according to Jackson, should be a \u0026ldquo;man of the people\u0026rdquo; and hence elected by the people, rather than by an elite \u0026ldquo;College\u0026rdquo;. 
This is one of the few instances where Jackson didn\u0026rsquo;t get his way.\nThe initial idea of the Electoral College was that voters in their respective states would vote for Electors who would best represent their interests in a Presidential vote: these Electors were supposed to be wise and understanding men who could be relied on to vote in a principled manner. Article II, Section 1 of the US Constitution describes how this was to be done. When it became clear that electors were not in fact acting impartially, but only at the behest of the voters, some of the Founding Fathers were horrified. And like so many political institutions the world over, the Electoral College does not now live up to its original expectations, but is also too entrenched in the political process to be removed.\nThe purpose of this post is to determine the voting power of the \u0026ldquo;swing states\u0026rdquo;, in which most of a Presidential campaign is conducted. It has been estimated that something like 75% of Americans are ignored in a campaign; this might be true, but that\u0026rsquo;s just plain politics. For example California (with 55 Electoral College votes) is so likely to return a Democrat candidate that it may be considered a \u0026ldquo;safe state\u0026rdquo; (at least, for the Democrats); it would be a waste of time for a candidate to spend too much time there. Instead, a candidate should stump in Florida, for example, which is considered a swing state, and may go either way: we have seen how close votes in Florida can be.\nFor discussion about measuring voting power using power indices check out the previous two blog posts.\nThe American Electoral College According to the excellent site 270 to win and their very useful election histories, we can determine which states have voted \u0026ldquo;the same\u0026rdquo; for any election post 1964. Taking 2000 as a reasonable starting point, we have the following results. 
Some states have voted the same in every election from 2000 onwards; others have not.\nSafe Democrat Safe Republican Swing California 55 Alabama 9 Colorado 9 Connecticut 7 Alaska 3 Florida 29 Delaware 3 Arizona 11 Idaho 4 DC 3 Arkansas 6 Indiana 11 Hawaii 4 Georgia 16 Iowa 6 Illinois 20 Kansas 6 Michigan 16 Maine 3 Kentucky 8 Nevada 6 Maryland 10 Louisiana 8 New Hampshire 4 Massachusetts 11 Mississippi 6 New Mexico 5 Minnesota 10 Missouri 10 North Carolina 15 New Jersey 14 Montana 3 Ohio 18 New York 29 Nebraska 4 Pennsylvania 20 Oregon 7 North Dakota 3 Virginia 13 Rhode Island 4 Oklahoma 7 Wisconsin 10 Vermont 3 South Carolina 9 Maine CD 2 1 Washington 12 South Dakota 3 Nebraska CD 2 1 Tennessee 11 Texas 38 Utah 6 West Virginia 5 Wyoming 3 195 175 168 From the table, we see that since 2000, we can count on 195 \u0026ldquo;safe\u0026rdquo; Electoral College votes for the Democrats, and 175 \u0026ldquo;safe\u0026rdquo; Electoral College votes for the Republicans. Thus of the 168 undecided votes, for a Democrat win the party must obtain at least 75 votes, and for a Republican win, the party needs to amass 95 votes.\nNote that according to the site, of the votes in Maine and Nebraska, all but one are considered safe - remember that these are the only two states to apportion votes by Congressional district. 
Of Maine\u0026rsquo;s 4 Electoral College votes, 3 are safe Democrat and one is a swing vote; for Nebraska, 4 of its votes are safe Republican, and 1 is a swing vote.\nAll this means is that a Democrat candidate should be campaigning considering the power given by\n\\[ [75; 9,29,4,11,6,16,6,4,5,15,18,20,13,10,1,1] \\]\nand a Republican candidate will be working with\n\\[ [95; 9,29,4,11,6,16,6,4,5,15,18,20,13,10,1,1] \\]\nA Democrat campaign So let\u0026rsquo;s imagine a Democrat candidate who wishes to maximize the efforts of the campaign by concentrating more on states with the greatest power to influence the election.\nIn [1]: q = 75; w = [9,29,4,11,6,16,6,4,5,15,18,20,13,10,1,1] In [2]: b = banzhaf(q,w); bn = [sy.Float(x/sum(b)) for x in b]; [sy.N(x,4) for x in bn] Out[2]: [0.05192, 0.1867, 0.02271, 0.06426, 0.03478, 0.09467, 0.03478, 0.02271, 0.02870, 0.08801, 0.1060, 0.1196, 0.07515, 0.05800, 0.005994, 0.005994] In [3]: s = shapley(q,w); [sy.N(x/sum(s),4) for x in s] out[3]: [0.05102, 0.1902, 0.02188, 0.06375, 0.03367, 0.09531, 0.03367, 0.02188, 0.02770, 0.08833, 0.1073, 0.1217, 0.07506, 0.05723, 0.005662, 0.005662] The values are not the same, but they are in fact quite close, and in this case they are comparable to the numbers of Electoral votes in each state. To compare values, it will be most efficient to set up a DataFrame using Python\u0026rsquo;s data analysis library pandas. 
We shall also convert the Banzhaf and Shapley-Shubik values from sympy floats into ordinary Python floats.\nIn [4]: import pandas as pd In [5]: bf = [float(x) for x in bn] In [6]: sf = [float(x) for x in s] In [7]: d = {\u0026#34;States\u0026#34;:states, \u0026#34;EC Votes\u0026#34;:ec_votes, \u0026#34;Banzhaf indices\u0026#34;:bf, \u0026#34;Shapley-Shubik indices\u0026#34;:sf} In [8]: swings = pd.DataFrame(d) In [9]: swings.sort_values(by = \u0026#34;EC Votes\u0026#34;, ascending = False) In [10]: swings.sort_values(by = \u0026#34;Banzhaf indices\u0026#34;, ascending = False) In [11]: swings.sort_values(by = \u0026#34;Shapley-Shubik indices\u0026#34;, ascending = False) We won\u0026rsquo;t show the results of the last three expressions, but they all give rise to the same ordering.\nWe can still get some information by looking not so much at the values of the power indices themselves as at their values relative to the number of Electoral votes. To do this we need a new column which normalizes the Electoral votes so that their sum is unity:\nIn [12]: swings[\u0026#34;Normalized EC Votes\u0026#34;] = swings[\u0026#34;EC Votes\u0026#34;]/168.0 In [13]: swings[\u0026#34;Ratio B to N\u0026#34;] = swings[\u0026#34;Banzhaf indices\u0026#34;]/swings[\u0026#34;Normalized EC Votes\u0026#34;] In [14]: swings[\u0026#34;Ratio S to N\u0026#34;] = swings[\u0026#34;Shapley-Shubik indices\u0026#34;]/swings[\u0026#34;Normalized EC Votes\u0026#34;] In [15]: swings.sort_values(by = \u0026#34;EC Votes\u0026#34;, ascending = False) The following table shows the result.\nStates EC Votes Banzhaf indices Shapley-Shubik indices EC Votes Normalized Ratio B to N Ratio S to N Florida 29 0.186702 0.190164 0.172619 1.081585 1.101640 Pennsylvania 20 0.119575 0.121732 0.119048 1.004430 1.022552 Ohio 18 0.106034 0.107289 0.107143 0.989655 1.001360 Michigan 16 0.094671 0.095309 0.095238 0.994049 1.000743 North Carolina 15 0.088014 0.088330 0.089286 0.985754 0.989293 Virginia 13 0.075149 0.075057 0.077381 0.971155 0.969966 
Indiana 11 0.064261 0.063752 0.065476 0.981447 0.973660 Wisconsin 10 0.058004 0.057227 0.059524 0.974471 0.961422 Colorado 9 0.051922 0.051017 0.053571 0.969215 0.952318 Iowa 6 0.034777 0.033670 0.035714 0.973770 0.942774 Nevada 6 0.034777 0.033670 0.035714 0.973770 0.942774 New Mexico 5 0.028695 0.027704 0.029762 0.964169 0.930862 Idaho 4 0.022714 0.021877 0.023810 0.953972 0.918823 New Hampshire 4 0.022714 0.021877 0.023810 0.953972 0.918823 Maine CD 2 1 0.005994 0.005662 0.005952 1.007058 0.951282 Nebraska CD 2 1 0.005994 0.005662 0.005952 1.007058 0.951282 We can thus infer that a Democrat candidate should indeed campaign most vigorously in the states with the largest number of Electoral votes. This might seem to be obvious, but as we have shown in previous posts, there is not always a correlation between voting weight and voting power, and that a voter with a low weight might end up having considerable power.\nA Republican candidate Going through all of the above, but with a quota of 95, produces in the end the following:\nStates EC Votes Banzhaf indices Shapley-Shubik indices EC Votes Normalized Ratio B to N Ratio S to N Florida 29 0.186024 0.190086 0.172619 1.077658 1.101190 Pennsylvania 20 0.119789 0.121871 0.119048 1.006230 1.023718 Ohio 18 0.106258 0.107258 0.107143 0.991741 1.001075 Michigan 16 0.094453 0.095156 0.095238 0.991756 0.999140 North Carolina 15 0.088106 0.088410 0.089286 0.986789 0.990194 Virginia 13 0.075362 0.074940 0.077381 0.973906 0.968460 Indiana 11 0.064064 0.063568 0.065476 0.978439 0.970862 Wisconsin 10 0.058073 0.057394 0.059524 0.975628 0.964219 Colorado 9 0.052133 0.051209 0.053571 0.973140 0.955892 Iowa 6 0.034692 0.033612 0.035714 0.971363 0.941142 Nevada 6 0.034692 0.033612 0.035714 0.971363 0.941142 New Mexico 5 0.028776 0.027715 0.029762 0.966885 0.931235 Idaho 4 0.022912 0.021963 0.023810 0.962300 0.922436 New Hampshire 4 0.022912 0.021963 0.023810 0.962300 0.922436 Maine CD 2 1 0.005877 0.005621 0.005952 0.987357 0.944289 
Nebraska CD 2 1 0.005877 0.005621 0.005952 0.987357 0.944289 and we see a similar result to the Democrat version, an obvious difference being that Michigan has decreased its relative power, at least as measured using the Shapley-Shubik index.\n","link":"https://numbersandshapes.net/posts/voting_power_swing_states/","section":"posts","tags":["voting","algebra"],"title":"Voting power (3): The American swing states"},{"body":"Naive implementation of Banzhaf power indices As we saw in the previous post, computation of the power indices can become unwieldy as the number of voters increases. However, we can very easily write a program to compute the Banzhaf power indices simply by looping over all subsets of the weights:\ndef banzhaf1(q,w): n = len(w) inds = [0]*n P = [[]] # these next three lines create the powerset of 0,1,...,(n-1) for i in range(n): P += [p+[i] for p in P] for S in P[1:]: T = [w[s] for s in S] if sum(T) \u0026gt;= q: for s in S: T = [t for t in S if t != s] if sum(w[j] for j in T)\u0026lt;q: inds[s]+=1 return(inds) And we can test it:\nIn [1]: q = 51; w = [49,49,2] In [2]: banzhaf1(q,w) Out[2]: [2, 2, 2] In [3]: banzhaf1(12,[4,4,4,2,2,1]) Out[3]: [10, 10, 10, 6, 6, 0] The origin of the Banzhaf power indices was when John Banzhaf explored the fairness of a local voting system where six bodies had votes 9, 9, 7, 3, 1, 1 and a majority of 16 was required to pass any motion:\nIn [4]: banzhaf1(16,[9,9,7,3,1,1]) Out[4]: [16, 16, 16, 0, 0, 0] This result led Banzhaf to campaign against this system as being manifestly unfair.\nImplementation using polynomials In 1976, the eminent mathematical political theorist Steven Brams, along with Paul Affuso, in an article \u0026ldquo;Power and Size: A New Paradox\u0026rdquo; (in the journal Theory and Decision) showed how generating functions could be used effectively to compute the Banzhaf power indices.\nFor example, suppose we have\n\\[ [6; 4,3,2,2] \\]\nand we wish to determine the power of the first 
voter. We consider the formal polynomial\n\\[ q_1(x) = (1+x^3)(1+x^2)(1+x^2) = 1 + 2x^2 + x^3 + x^4 + 2x^5 + x^7. \\]\nThe coefficient of \\(x^j\\) is the number of ways all the other voters can combine to form a weight sum equal to \\(j\\). For example, there are two ways voters can join to create a sum of 5: voters 2 and 3, or voters 2 and 4. But there is only one way to create a sum of 4: with voters 3 and 4.\nThen the number of ways in which voter 1 will be necessary can be found by adding all the coefficients of \\(x^{6-4}\\) to \\(x^5\\). This gives a value p(1) = 6. In general, define\n\\[ q_i(x) = \\prod_{j\\ne i}(1+x^{w_j}) = a_0 + a_1x + a_2x^2 +\\cdots + a_kx^k. \\]\nThen it is easily shown that\n\\[ p(i) = \\sum_{j=q-w_i}^{q-1}a_j. \\]\nAs another example, suppose we use this method to compute Luxembourg\u0026rsquo;s power in the EEC:\n\\[ q_6(x) = (1+x^4)^3(1+x^2)^2 = 1 + 2x^2 + 4x^4 + 6x^6 + 6x^8 + 6x^{10} + 4x^{12} + 2x^{14} + x^{16} \\]\nand we find \\(p(6)\\) by adding the coefficients of \\(x^{12-w_6}\\) to \\(x^{12-1}\\), which produces zero.\nThis can be readily implemented in Python, using the sympy library for symbolic computation.\nimport sympy as sy def banzhaf(q,w): sy.var(\u0026#39;x\u0026#39;) n = len(w) inds = [] for i in range(n): p = 1 for j in range(i): p *= (1+x**w[j]) for j in range(i+1,n): p *= (1+x**w[j]) p = p.expand() inds += [sum(p.coeff(x,k) for k in range(q-w[i],q))] return(inds) Computation of Shapley-Shubik index The use of permutations will clearly be too unwieldy. 
Even for say 15 voters, there are \\(2^{15}=32768\\) subsets, but \\(1,307,674,368,000\\) permutations, which is already too big for enumeration (except possibly on a very fast machine, or in parallel, or using a clever algorithm).\nThe use of polynomials for computations in fact precedes the work of Brams and Affuso; it was published by Irwin Mann and Lloyd Shapley in 1962, in a \u0026ldquo;memorandum\u0026rdquo; called Values of Large Games IV: Evaluating the Electoral College Exactly which happily you can find as a PDF file here.\nBuilding on some previous work, they showed that the Shapley-Shubik index corresponding to voter \\(i\\), could be defined as\n\\[ \\Phi_i=\\sum_{k=0}^{n-1}\\frac{k!(n-1-k)!}{n!}\\sum_{j=q-w_i}^{q-1}c_{jk} \\]\nwhere \\(c_{jk}\\) is the coefficient of \\(x^jy^k\\) in the expansion of\n\\[ f_i(x,y)=\\prod_{m\\ne i}(1+x^{w_m}y). \\]\nThis of course has many similarities to the polynomial definition of the Banzhaf power index, and can be computed similarly:\ndef shapley(q,w): sy.var(\u0026#39;x,y\u0026#39;) n = len(w) inds = [] for i in range(n): p = 1 for j in range(i): p *= (1+y*x**w[j]) for j in range(i+1,n): p *= (1+y*x**w[j]) p = p.expand() B = [] for j in range(n): pj = p.coeff(y,j) B += [sum(pj.coeff(x,k) for k in range(q-w[i],q))] inds += [sum(sy.Float(B[j]/sy.binomial(n,j)/(n-j)) for j in range(n))] return(inds) A few simple examples The Australian (federal) Senate consists of 76 members, of which a simple majority is required to pass a bill. It is unusual for the current elected government (which will have a majority in the lower house: the House of Representatives) also to have a majority in the Senate. Thus it is quite possible for a party with small numbers to wield significant power.\nA case in point is that of the \u0026ldquo;Australian Democrats\u0026rdquo; party, founded in 1977 by a disaffected ex-Liberal politician called Don Chipp, with the uniquely Australian slogan \u0026ldquo;Keep the Bastards Honest\u0026rdquo;. 
For nearly two decades they were a vital force in Australian politics; they have pretty much lost all power they once had, although the party still exists.\nHere\u0026rsquo;s a little table showing the Senate composition in various years:\nParty 1985 2000 2020 Government 34 35 36 Opposition 33 29 26 Democrats 7 9 Independent 1 1 1 Nuclear Disarmament 1 Greens 1 9 One Nation 1 2 Centre Alliance 1 Lambie Party 1 This composition in 1985 can be described as\n\\[ [39; 34,33,7,1,1]. \\]\nAnd now:\nIn [1]: b = banzhaf(39,[34,33,7,1,1]) [sy.N(x/sum(b),4) for x in b] Out[1]: [0.3333, 0.3333, 0.3333, 0, 0] In [2]: s = shapley(39,[34,33,7,1,1]) [sy.N(x,4) for x in s] Out[2]: [0.3333, 0.3333, 0.3333, 0, 0] Here we see that both power indices give the same result: that the Democrats had equal power in the Senate to the two major parties, and the other two senate members had no power at all.\nIn 2000, we have \\([39;35,29,9,1,1,1]\\) and:\nIn [1]: b = banzhaf(39,[35,29,9,1,1,1]) [sy.N(x/sum(b),4) for x in b] Out[1]: [0.34, 0.3, 0.3, 0.02, 0.02, 0.02] In [2]: s = shapley(39,[35,29,9,1,1,1]) [sy.N(x,4) for x in s] Out[2]: [0.35, 0.3, 0.3, 0.01667, 0.01667, 0.01667] We see here that the two power indices give two slightly different results, but in each case the power of the Democrats was equal to that of the opposition, and this time the parties with single members had real (if small) power.\nBy 2020 the Democrats have disappeared as a political force, their place being more-or-less taken (at least numerically) by the Greens:\nIn [1]: b = banzhaf(39,[36,26,9,2,1,1,1]) [sy.N(x/sum(b),4) for x in b] Out[1]: [0.5306, 0.1224, 0.1224, 0.102, 0.04082, 0.04082, 0.04082] In [2]: s = shapley(39,[36,26,9,2,1,1,1]) [sy.N(x,4) for x in s] Out[2]: [0.5191, 0.1357, 0.1357, 0.1024, 0.03571, 0.03571, 0.03571] This shows a very different sort of power balance to previously: the Government has much more power in the Senate, partly due to having close to a majority and partly because of the fracturing of 
other Senate members through a host of smaller parties. Note that the Greens, while having more members than the Democrats did in 1985, have far less power. Note also that One Nation, while only having twice as many members as the singleton parties, has far more power: 2.5 times by Banzhaf, 2.8667 times by Shapley-Shubik.\n","link":"https://numbersandshapes.net/posts/voting_power_computation/","section":"posts","tags":["voting","algebra"],"title":"Voting power (2): computation"},{"body":"After the 2020 American Presidential election, with the usual post-election analyses and (in this case) vast numbers of lawsuits, I started looking at the Electoral College, and trying to work out how it worked in terms of power. Although power is often conflated simply with the number of votes, that\u0026rsquo;s not necessarily the case. We consider power as the ability of any state to affect the outcome of an election. Clearly a state with more votes, such as California with 55, will be more powerful than a state with fewer, for example Wyoming with 3. But often power is not directly correlated with size.\nFor example, imagine a version of America with just 3 states, Alpha, Beta, and Gamma, with electoral votes 49, 49, 2 respectively, and 51 votes needed to win.\nThe following table shows the ways that the states can join to reach (or exceed) that majority, and in each case which state is \u0026ldquo;necessary\u0026rdquo; for the win:\nWinning Coalitions Votes won Necessary States Alpha, Beta 98 Alpha, Beta Alpha, Gamma 51 Alpha, Gamma Beta, Gamma 51 Beta, Gamma Alpha, Beta, Gamma 100 No single state By \u0026ldquo;necessary states\u0026rdquo; we mean a state whose votes are necessary for the win. 
And in looking at that table, we see that in terms of influencing the vote, Gamma, with only 2 electors, is just as powerful as the other two states.\nTo give another example, the Treaty Of Rome in the 1950\u0026rsquo;s established the first version of the European Common Market, with six member states, each allocated a number of votes for decision making:\nMember Votes 1 France 4 2 West Germany 4 3 Italy 4 4 The Netherlands 2 5 Belgium 2 6 Luxembourg 1 The treaty determined that a quota of 12 votes was needed to pass any resolution. At first this table might seem manifestly unfair: West Germany, with a population of over 55 million, had something like 160 times the population of Luxembourg (roughly 1/3 of a million), yet got only 4 times the number of votes of Luxembourg.\nBut in fact it\u0026rsquo;s even worse: since 12 votes are required to win, and all the other numbers of votes are even, there is no way that Luxembourg can influence any vote at all: its voting power was zero. If another state joined, also with a vote of 1, then it and Luxembourg together can influence a vote, and so Luxembourg\u0026rsquo;s voting power would increase.\nA power index is some numerical value attached to a weighted vote which describes its power in this sense. Although there are many such indices, there are two which are most widely used. The first was developed by Lloyd Shapley (who would win the Nobel Prize for Economics in 2012) and Martin Shubik in 1954; the second by John Banzhaf in 1965.\nBasic definitions First, some notation. In general we will have \\(n\\) voters each with a weight \\(w_i\\), and a quota \\(q\\) to be reached. For the American Electoral College, the voters are the states, the weights are the numbers of Electoral votes, and \\(q\\) is the number of votes required: 270. This is denoted as\n\\[ [q; w_1, w_2,\\ldots,w_n]. 
\\]\nThe three state example above is thus denoted\n\\[ [51; 49, 49, 2] \\]\nand the EEC votes as\n\\[ [12; 4,4,4,2,2,1]. \\]\nThe Shapley-Shubik index Suppose we have \\(n\\) voters with weights \\(w_1\\), \\(w_2\\) up to \\(w_n\\), and a quota \\(q\\) required. Consider all permutations of \\(1,2, \\ldots,n\\). For each permutation, add up the weights starting at the left, and designate as the pivot voter the first voter who causes the cumulative sum to equal or exceed the quota. For each voter \\(i\\), let \\(s_i\\) be the number of times that voter has been chosen as a pivot. Then its power index is \\(s_i/n!\\). This means that the sum of all power indices is unity.\nConsider the three state example above, where \\(w_1=w_2=49\\) and \\(w_3=2\\), and where we compute cumulative sums only up to reaching or exceeding the quota:\nPermutation Cumulative sum of weights Pivot Voter 1 2 3 49, 98 2 1 3 2 49, 51 3 2 1 3 49, 98 1 2 3 1 49, 51 3 3 1 2 2, 51 1 3 2 1 2, 51 2 We see that \\(s_1=s_2=s_3=2\\) and so the Shapley-Shubik power indices are all \\(1/3\\).\nThe Banzhaf index For the Banzhaf index, we consider the winning coalitions: these are any subset \\(S\\) of voters for which the sum of weights is not less than \\(q\\). It\u0026rsquo;s convenient to define a function for this:\n\\[ v(S) = \\begin{cases} 1 \u0026amp; \\text{if } \\sum_{i\\in S}w_i\\ge q \\cr 0 \u0026amp; \\text{otherwise} \\end{cases} \\]\nA voter \\(i\\) is necessary for a winning coalition \\(S\\) if \\(S-\\{i\\}\\) is not a winning coalition; that is, if \\(v(S)-v(S-\\{i\\})=1\\). If we define\n\\[ p(i) =\\sum_S \\bigl(v(S)-v(S-\\{i\\})\\bigr) \\]\nthen \\(p(i)\\) is a measure of power, and the (normalized) Banzhaf power indices are defined as\n\\[ b(i) = \\frac{p(i)}{\\sum_j p(j)} \\]\nso that the sum of all indices (as for the Shapley-Shubik index) is again unity.\nConsidering the first table above, we see that \\(p(1)=p(2)=p(3)=2\\) and the Banzhaf power indices are all \\(1/3\\). 
For this example the Banzhaf and Shapley-Shubik values agree. This is not always the case.\nFor the EEC example, the winning coalitions are, with necessary voters:\nWinning Coalition Votes Necessary voters 1,2,3 12 1,2,3 1,2,4,5 12 1,2,4,5 1,3,4,5 12 1,3,4,5 2,3,4,5 12 2,3,4,5 1,2,3,6 13 1,2,3 1,2,4,5,6 13 1,2,4,5 1,3,4,5,6 13 1,3,4,5 2,3,4,5,6 13 2,3,4,5 1,2,3,4 14 1,2,3 1,2,3,5 14 1,2,3 1,2,3,4,6 15 1,2,3 1,2,3,5,6 15 1,2,3 1,2,3,4,5 16 No single voter 1,2,3,4,5,6 17 No single voter Counting up the number of times each voter appears in the rightmost column, we see that\n\\[ p(1) = p(2) = p(3) = 10,\\quad p(4) = p(5) = 6,\\quad p(6) = 0 \\]\nand so\n\\[ b(1) = b(2) = b(3) = \\frac{5}{21},\\quad b(4) = b(5) = \\frac{1}{7}. \\]\nNote that the power of the three biggest states is in fact only 5/3 times that of the smaller states, in spite of having twice as many votes. This is a striking example of how power is not proportional to voting weight.\nNote that computing the Shapley-Shubik index could be unwieldy; there are\n\\[ \\frac{6!}{3!2!} = 60 \\]\ndifferent permutations of the weights, and clearly as the number of weights increases, possibly with very few repetitions, the number of permutations will be excessive. For the Electoral College, with 51 members, and a few states with the same numbers of voters, the total number of permutations will be\n\\[ \\frac{51!}{(2!)^4(3!)^3(4!)(5!)(6!)(8!)} = 5368164393879631593058456306349344975896576000000000 \\]\nwhich is clearly far too large for enumeration. But as we shall see, there are other methods.\n","link":"https://numbersandshapes.net/posts/voting_power/","section":"posts","tags":["voting"],"title":"Voting power"},{"body":"Every four years (barring death or some other catastrophe), the USA goes through the periodic madness of a presidential election. Wild behaviour, inaccuracies, mud-slinging from both sides have been central since George Washington\u0026rsquo;s second term. 
And the entire business of voting is muddied by the Electoral College, the 538 members of which do the actual voting: the public, in their own voting, merely instruct the College what to do. Although it has been said that the EC \u0026ldquo;magnifies\u0026rdquo; the popular vote, this is not always the case, and quite often a president will be elected with a majority (270 or more) of Electoral College votes, in spite of losing the popular vote. This dichotomy encourages periodic calls for the College to be disbanded.\nAs you probably know, each of the 50 states and the District of Columbia has Electors allocated to it, roughly proportional to population. Thus California, the most populous state, has 55 electors, and several smaller states (and DC) only 3.\nIn all states except Maine and Nebraska, the votes are allocated on a \u0026ldquo;winner takes all\u0026rdquo; principle: that is, all the Electoral votes will be allocated to whichever candidate has obtained a plurality in that state. For only two candidates then, if a state\u0026rsquo;s voters produce a simple majority of votes for one of them, that candidate gets all the EC votes.\nMaine and Nebraska however, allocate their EC votes by congressional district. In each state, 2 EC votes are allocated to the winner of the popular vote in the state, and for each congressional district (2 in Maine, 3 in Nebraska), the other votes are allocated to the winner in that district.\nIt\u0026rsquo;s been a bit of a mathematical game to determine the theoretical lowest bound on a popular vote for a president to be elected. To show how this works, imagine a miniature system with four states and 14 electoral college votes:\nState Population Electors Abell 100 3 Bisly 100 3 Champ 120 4 Dairy 120 4 Operating on the winner takes all principle in each state, 8 EC votes are required for a win. 
Suppose that in each state, the votes are cast as follows, for the candidates Mr Man and Mr Guy:\nState Mr Man Mr Guy EC Votes to Man EC Votes to Guy Abell 0 100 0 3 Bisly 0 100 0 3 Champ 61 59 4 0 Dairy 61 59 4 0 Total 122 318 8 6 and Mr Man wins with 8 EC votes but only about 27.7% of the popular vote. Now you might reasonably argue that this situation would never occur in practice, and probably you\u0026rsquo;re right. But extreme examples such as this are used to show up inadequacies in voting systems. And sometimes very strange things do happen.\nSo: what is the smallest percentage of the popular vote under which a president could be elected? To experiment, we need to know the number of registered voters in each state (and it appears that the percentage of eligible citizens enrolled to vote differs markedly between the states), and the numbers of electors. The first I ran to ground here and the few states not accounted for I found information on their Attorney Generals' sites. The one state for which I couldn\u0026rsquo;t find statistics was Illinois, so I used the number 7.8 million, which has been bandied about on a few news sites. The numbers of electors per state are easy to find, for example on the wikipedia page.\nI make the following simplifying assumptions: all registered voters will vote; and all states operate on a winner takes all principle. Thus, for simplicity, I am not using the apportionment scheme of Maine and Nebraska. (I suspect that taking this into account wouldn\u0026rsquo;t affect the result much anyway.)\nSuppose that the registered voting population of each state (including DC) is \\(v_i\\) and the number of EC votes is \\(c_i\\). For any state, either the winner will be chosen by a bare majority, or all the votes will go to the loser. This becomes then a simple integer programming problem; in fact a knapsack problem. 
For each state, define\n\\[ m_i = \\lfloor v_i/2\\rfloor +1 \\]\nfor the majority votes needed.\nWe want to minimize\n\\[ V = \\sum_{i=1}^{51}x_im_i \\]\nsubject to the constraint\n\\[ \\sum_{i=1}^{51}c_ix_i \\ge 270 \\]\nand each \\(x_i\\) is zero or one.\nNow all we need to do is set up this problem in a suitable system and solve it! I chose Julia and its JuMP modelling language, and for actually doing the dirty work, GLPK. JuMP in fact can be used with pretty much any optimisation software available, including commercial systems.\nusing JuMP, GLPK states = [\u0026#34;Alabama\u0026#34;,\u0026#34;Alaska\u0026#34;,\u0026#34;Arizona\u0026#34;,\u0026#34;Arkansas\u0026#34;,\u0026#34;California\u0026#34;,\u0026#34;Colorado\u0026#34;,\u0026#34;Connecticut\u0026#34;,\u0026#34;Delaware\u0026#34;,\u0026#34;DC\u0026#34;,\u0026#34;Florida\u0026#34;,\u0026#34;Georgia\u0026#34;,\u0026#34;Hawaii\u0026#34;, \u0026#34;Idaho\u0026#34;,\u0026#34;Illinois\u0026#34;,\u0026#34;Indiana\u0026#34;,\u0026#34;Iowa\u0026#34;,\u0026#34;Kansas\u0026#34;,\u0026#34;Kentucky\u0026#34;,\u0026#34;Louisiana\u0026#34;,\u0026#34;Maine\u0026#34;,\u0026#34;Maryland\u0026#34;,\u0026#34;Massachusetts\u0026#34;,\u0026#34;Michigan\u0026#34;,\u0026#34;Minnesota\u0026#34;, \u0026#34;Mississippi\u0026#34;,\u0026#34;Missouri\u0026#34;,\u0026#34;Montana\u0026#34;,\u0026#34;Nebraska\u0026#34;,\u0026#34;Nevada\u0026#34;,\u0026#34;New Hampshire\u0026#34;,\u0026#34;New Jersey\u0026#34;,\u0026#34;New Mexico\u0026#34;,\u0026#34;New York\u0026#34;,\u0026#34;North Carolina\u0026#34;, \u0026#34;North Dakota\u0026#34;,\u0026#34;Ohio\u0026#34;,\u0026#34;Oklahoma\u0026#34;,\u0026#34;Oregon\u0026#34;,\u0026#34;Pennsylvania\u0026#34;,\u0026#34;Rhode Island\u0026#34;,\u0026#34;South Carolina\u0026#34;,\u0026#34;South Dakota\u0026#34;,\u0026#34;Tennessee\u0026#34;,\u0026#34;Texas\u0026#34;,\u0026#34;Utah\u0026#34;, \u0026#34;Vermont\u0026#34;,\u0026#34;Virginia\u0026#34;,\u0026#34;Washington\u0026#34;,\u0026#34;West 
Virginia\u0026#34;,\u0026#34;Wisconsin\u0026#34;,\u0026#34;Wyoming\u0026#34;] reg_voters = [3560686,597319,4281152,1755775,22047448,4238513,2375537,738563,504043,14065627,7233584,795248,1010984,7800000,4585024, 2245092,1851397,3565428,3091340,1063383,4141498,4812909,8127040,3588563,2262810,4213092,696292,1252089,1821356,913726,6486299, 1350181,13555547,6838231,540302,8080050,2259113,2924292,9091371,809821,3486879,578666,3931248,16211198,1857861,495267,5975696, 4861482,1268460,3684726,268837] majorities = [Int(floor(x/2+1)) for x in reg_voters] ec_votes = [9,3,11,6,55,9,7,3,3,29,16,4,4,20,11,6,6,8,8,4,10,11,16,10,6,10,3,5,6,4,14,5,29,15,3,18,7,7,20,4,9,3,11,38,6,3,13,12,5,10,3] potus = Model(GLPK.Optimizer) @variable(potus, x[i=1:51], Bin) @constraint(potus, sum(ec_votes .* x) \u0026gt;= 270) @objective(potus, Min, sum(majorities .* x)); Solving the problem is now easy:\noptimize!(potus) Now let\u0026rsquo;s see what we\u0026rsquo;ve got:\nvx = value.(x) sum(ec_votes .* vx) 270 votes = Int(objective_value(potus)) 46146767 votes*100/sum(reg_voters) 21.584985938021866 and we see we have elected a president with slightly less than 21.6% of the popular vote.\nDigging a little further, we first find the states in which a bare majority voted for the winner:\nf = findall(x -\u0026gt; x == 1.0, vx) for i in f print(states[i],\u0026#34;, \u0026#34;) end Alabama, Alaska, Arizona, Arkansas, California, Connecticut, Delaware, DC, Hawaii, Idaho, Illinois, Indiana, Iowa, Kansas, Louisiana, Maine, Minnesota, Mississippi, Montana, Nebraska, Nevada, New Hampshire, New Mexico, North Dakota, Oklahoma, Oregon, Rhode Island, South Carolina, South Dakota, Tennessee, Utah, Vermont, West Virginia, Wisconsin, Wyoming, and the other states, in which every voter voted for the loser:\nnf = findall(x -\u0026gt; x == 0.0, vx) for i in nf print(states[i],\u0026#34;, \u0026#34;) end Colorado, Florida, Georgia, Kentucky, Maryland, Massachusetts, Michigan, Missouri, New Jersey, New York, North 
Carolina, Ohio, Pennsylvania, Texas, Virginia, Washington, In point of history, the election in which the president-elect did worst was in 1824, when John Quincy Adams was elected over Andrew Jackson; this was in fact a four-way contest, and the decision was in the end made by the House of Representatives, who elected Adams by one vote. And Jackson, never one to neglect an opportunity for vindictiveness, vowed that he would destroy Adams\u0026rsquo;s presidency, which he did.\nMore recently, since the Electoral College has sat at 538 members, in 2000 George W. Bush won in spite of losing the popular vote by 0.51%, and in 2016 Donald Trump won in spite of losing the popular vote by 2.09%.\nPlenty of numbers can be found on wikipedia and elsewhere.\n","link":"https://numbersandshapes.net/posts/electing_a_president/","section":"posts","tags":["voting","linear-programming","julia"],"title":"Electing a president"},{"body":"","link":"https://numbersandshapes.net/tags/linear-programming/","section":"tags","tags":null,"title":"Linear-Programming"},{"body":"The rational numbers are well known to be countable, and one standard method of counting them is to put the positive rationals into an infinite matrix \\(M=m_{ij}\\), where \\(m_{ij}=i/j\\) so that you end up with something that looks like this:\n\\[ \\left[\\begin{array}{ccccc} \\frac{1}{1}\u0026amp;\\frac{1}{2}\u0026amp;\\frac{1}{3}\u0026amp;\\frac{1}{4}\u0026amp;\\dots\\\\\\[1ex] \\frac{2}{1}\u0026amp;\\frac{2}{2}\u0026amp;\\frac{2}{3}\u0026amp;\\frac{2}{4}\u0026amp;\\dots\\\\\\[1ex] \\frac{3}{1}\u0026amp;\\frac{3}{2}\u0026amp;\\frac{3}{3}\u0026amp;\\frac{3}{4}\u0026amp;\\dots\\\\\\[1ex] \\frac{4}{1}\u0026amp;\\frac{4}{2}\u0026amp;\\frac{4}{3}\u0026amp;\\frac{4}{4}\u0026amp;\\dots\\\\\\[1ex] \\vdots\u0026amp;\\vdots\u0026amp;\\vdots\u0026amp;\\vdots\u0026amp;\\ddots \\end{array}\\right] \\]\nIt is clear that not only will each positive rational appear somewhere in this matrix, but its value will appear an infinite 
number of times. For example, \\(2 / 3\\) will also appear as \\(4 / 6\\), as \\(6 / 9\\) and so on.\nThen we can enumerate all the elements of this matrix by traversing all the SW\u0026ndash;NE diagonals:\nThis provides an enumeration of all the positive rationals: \\[ \\frac{1}{1}, \\frac{1}{2}, \\frac{2}{1}, \\frac{3}{1}, \\frac{2}{2}, \\frac{1}{3}, \\frac{1}{4}, \\frac{2}{3},\\ldots \\] To enumerate all rationals (positive and negative), we simply place the negative of each value immediately after it: \\[ \\frac{1}{1}, -\\frac{1}{1}, \\frac{1}{2}, -\\frac{1}{2}, \\frac{2}{1}, -\\frac{2}{1}, \\frac{3}{1}, -\\frac{3}{1}, \\frac{2}{2}, -\\frac{2}{2}, \\frac{1}{3}, -\\frac{1}{3}, \\frac{1}{4}, -\\frac{1}{4}, \\frac{2}{3}, -\\frac{2}{3},\\ldots \\] This is all standard, well-known stuff, and as far as countability goes, pretty trivial.\nOne might reasonably ask: is there a way of enumerating all rationals in such a way that no rational is repeated, and that every rational appears naturally in its lowest form?\nIndeed there is; in fact there are several, and one of the newest, simplest, and most elegant uses the Calkin-Wilf tree. This is named for its discoverers (or creators, depending on which philosophy of mathematics you espouse), who described it in an article happily available on the archived web site of the second author. Herbert Wilf died in 2012, but the Mathematics Department at the University of Pennsylvania have maintained the page as he left it, as an online memorial to him.\nThe Calkin-Wilf tree is a binary tree with root \\(a / b = 1 / 1\\). From each node \\(a / b\\) the left child is \\(a / (a+b)\\) and the right child is \\((a+b) / b\\). From each node \\(a / b\\), the path back to the root contains the fractions which encode, as it were, the Euclidean algorithm for determining the greatest common divisor of \\(a\\) and \\(b\\). 
It is not hard to show that every fraction in the tree is in its lowest terms, and appears only once; also that every positive rational appears in the tree.\nThe enumeration of the rationals can thus be made by a breadth-first traversal of the tree; in other words listing each level of the tree one after the other:\n\\[ \\underbrace{\\frac{1}{1}}_{\\text{The root}},\\; \\underbrace{\\frac{1}{2},\\; \\frac{2}{1}}_{\\text{first level}},\\; \\underbrace{\\frac{1}{3},\\; \\frac{3}{2},\\; \\frac{2}{3},\\; \\frac{3}{1}}_{\\text{second level}},\\; \\underbrace{\\frac{1}{4},\\; \\frac{4}{3},\\; \\frac{3}{5},\\; \\frac{5}{2},\\; \\frac{2}{5},\\; \\frac{5}{3},\\; \\frac{3}{4},\\; \\frac{4}{1}}_{\\text{third level}}\\;\\ldots \\]\nNote that the denominator of each fraction is the numerator of its successor (again, this is not hard to prove in general); thus given the sequence\n\\[ b_i=0,1,1,2,1,3,2,3,1,4,3,5,2,5,3,4,\\ldots \\]\n(indexed from zero), the positive rationals are enumerated by \\(b_i/b_{i+1}\\) for \\(i\\ge 1\\). This sequence pre-dates Calkin and Wilf; it goes back to an older enumeration now called the Stern-Brocot tree, named for the mathematician Moritz Stern and the clock-maker Achille Brocot (who was investigating gear ratios), who discovered this tree independently in the early 1860s.\nThe sequence \\(b_i\\) is called Stern\u0026rsquo;s diatomic sequence and can be generated recursively:\n\\[ b_i=\\left\\{\\begin{array}{ll} i,\u0026amp;\\text{if $i\\le 1$}\\\\ b_{i/2},\u0026amp;\\text{if $i$ is even}\\\\ b_{(i-1)/2}+b_{(i+1)/2},\u0026amp;\\text{if $i$ is odd} \\end{array} \\right. \\]\nAlternatively: \\[ b_0=0,\\;b_1=1,\\;b_{2i}=b_i,\\;b_{2i+1}=b_i+b_{i+1}\\text{ for }i\\ge 1. 
\\] This is the form in which it appears as sequence A002487 in the OEIS.\nSo we can generate Stern\u0026rsquo;s diatomic sequence \\(b_i\\), and then the successive fractions \\(b_i/b_{i+1}\\) will generate each positive rational exactly once.\nIf that isn\u0026rsquo;t remarkable enough, sometime prior to 2003, Moshe Newman showed that the Calkin-Wilf enumeration of the rationals can in fact be done directly: the sequence \\[ x_0 = 1,\\quad x_{i+1}=\\frac{1}{2\\lfloor x_i\\rfloor -x_i +1}\\;\\text{for}\\;i\\ge 0 \\] will generate all the positive rationals. I can\u0026rsquo;t find anything at all about Moshe Newman; he is always just mentioned as having \u0026ldquo;shown\u0026rdquo; this result. Never where, or to whom. There is a proof of this in an article \u0026ldquo;New Looks at Old Number Theory\u0026rdquo; by Aimeric Malter, Dierk Schleicher and Don Zagier, published in The American Mathematical Monthly, Vol. 120, No. 3 (March 2013), pp. 243-264. The part of the article relating to enumeration of rationals is based on a prize-winning mathematical essay by the first author (who at the time was a high school student in Bremen, Germany), when he was only 13. Here is the skeleton of Malter\u0026rsquo;s proof:\nIf \\(x\\) is any node, then its left and right hand children are \\(L = x / (x+1)\\) and \\(R = 1+x = 1 / (1-L)\\) respectively. Since \\(0\\lt L\\lt 1\\) we have \\(\\lfloor L\\rfloor = 0\\), and so clearly \\(R = 1/(2\\lfloor L\\rfloor -L +1)\\). Suppose now that \\(A\\) is a right child, and \\(B\\) is its successor rational. Then \\(A\\) and \\(B\\) will have a common ancestor \\(z=p/q\\), say \\(k\\) generations ago. To get from \\(z\\) to \\(A\\) will require one left step and \\(k-1\\) right steps. It is easy to show (by induction if you like), that\n\\[ A = k-1+\\frac{p}{p+q} \\] and for its successor \\(B\\), obtained by one right step from \\(z\\) and \\(k-1\\) left steps: \\[ B = \\frac{1}{\\frac{q}{p+q}+k-1}. 
\\] Since \\(k-1=\\lfloor A\\rfloor\\), and since \\[ \\frac{p}{p+q} = A-\\lfloor A\\rfloor \\] it follows that \\[ B=\\frac{1}{1-(A-\\lfloor A\\rfloor)+\\lfloor A\\rfloor}=\\frac{1}{2\\lfloor A\\rfloor-A+1}. \\] The remaining case is moving from the end of one row to the beginning of the next, that is, from \\(n\\) to \\(1 / (n+1)\\). And this is trivial, since \\(\\lfloor n\\rfloor = n\\).\nWhat\u0026rsquo;s more, we can write down the isomorphism between this sequence of positive rationals and the positive integers. Define \\(N:\\Bbb{Q}\\to\\Bbb{Z}\\) as follows:\n\\[ N(p/q)=\\left\\{\\begin{array}{ll} 1,\u0026amp;\\text{if $p=q$}\\\\ 2 N(p/(q-p)),\u0026amp;\\text{if $p\\lt q$}\\\\ 2 N((p-q)/q)+1,\u0026amp;\\text{if $p\\gt q$} \\end{array} \\right. \\]\nWithout going through a formal proof, what this does is simply count the number of steps taken to perform the Euclidean algorithm on \\(p\\) and \\(q\\). The extra factors of 2 ensure that rationals in level \\(k\\) have values between \\(2^k\\) and \\(2^{k+1}\\), and the final \u0026ldquo;\\(+1\\)\u0026rdquo; differentiates left and right children. This function assumes that \\(p\\) and \\(q\\) are relatively prime; that is, that the fraction \\(p/q\\) is in its lowest terms.\n(The isomorphism in the other direction is given by \\(k\\mapsto b_k/b_{k+1}\\) where \\(b_k\\) are the elements of Stern\u0026rsquo;s diatomic sequence discussed above.)\nThis is just the sort of mathematics I like: simple, but surprising, and with depth. What\u0026rsquo;s not to like?\n","link":"https://numbersandshapes.net/posts/enumerating_the_rationals/","section":"posts","tags":["mathematics"],"title":"Enumerating the rationals"},{"body":"A few posts ago I showed how to do this in Python. Now it\u0026rsquo;s Julia\u0026rsquo;s turn. The data is the same: the spread of influenza in a British boarding school with a population of 762. 
This was reported in the British Medical Journal on March 4, 1978, and you can read the original short article here.\nAs before we use the SIR model, with equations\n\\begin{aligned} \\frac{dS}{dt}\u0026amp;=-\\frac{\\beta IS}{N}\\\\ \\frac{dI}{dt}\u0026amp;=\\frac{\\beta IS}{N}-\\gamma I\\\\ \\frac{dR}{dt}\u0026amp;=\\gamma I \\end{aligned}\nwhere \\(S\\), \\(I\\), and \\(R\\) are the numbers of susceptible, infected, and recovered people. This model assumes a constant population - so no births or deaths - and that once recovered, a person is immune. There are more complex models which include a changing population, as well as other disease dynamics.\nThe above equations can be written without the population; since it is constant, we can just write \\(\\beta\\) instead of \\(\\beta/N\\). The values \\(\\beta\\) and \\(\\gamma\\) are the parameters which affect the working of this model, their values and that of their ratio \\(\\beta/\\gamma\\) provide information of the speed of the disease spread.\nAs with Python, our interest will be to see if we can find values of \\(\\beta\\) and \\(\\gamma\\) which model the school outbreak.\nWe will do this in three functions. The first sets up the differential equations:\nusing DifferentialEquations function SIR!(du,u,p,t) S,I,R = u β,γ = p du[1] = dS = -β*I*S du[2] = dI = β*I*S - γ*I du[3] = dR = γ*I end The next function determines the sum of squares between the data and the results of the SIR computations for given values of the parameters. 
Since we will put all our functions into one file, we can create the constant values outside any functions which might need them:\ndata = [1, 3, 6, 25, 73, 222, 294, 258, 237, 191, 125, 69, 27, 11, 4] tspan = (0.0,14.0) u0 = [762.0,1.0,0.0] function ss(x) prob = ODEProblem(SIR!,u0,tspan,(x[1],x[2])) sol = solve(prob) sol_data = sol(0:14)[2,:] return(sum((sol_data - data) .^2)) end Note that we don\u0026rsquo;t have to carefully set up the problem to produce values at each of the data points, which in our case are the integers from 0 to 14. Julia will use a standard numerical technique with a dynamic step size, and values corresponding to the data points can then be found by interpolation. All of this functionality is provided by the DifferentialEquations package. For example, R[10] will return the 10th value of the list of computed R values, but R(10) will produce the interpolated value of R at \\(t=10\\).\nFinally we use the Optim package to minimize the sum of squares, and the Plots package to plot the result:\nusing Optim using Plots function run_optim() opt = optimize(ss,[0.001,0.01],NelderMead()) beta,gamma = opt.minimizer prob = ODEProblem(SIR!,u0,tspan,(beta,gamma)) sol = solve(prob) plot(sol, linewidth=2, xaxis=\u0026#34;Time in days\u0026#34;, label=[\u0026#34;Susceptible\u0026#34; \u0026#34;Infected\u0026#34; \u0026#34;Recovered\u0026#34;]) plot!([0:14],data,linestyle=:dash,marker=:circle,markersize=4,label=\u0026#34;Data\u0026#34;) end Running the last function will produce values\n\\begin{aligned} \\beta\u0026amp;=0.0021806887934782853\\\\ \\gamma\u0026amp;=0.4452595474326912 \\end{aligned}\nand the final plot looks like this:\n!Julia SIR plot\nThis was at least as easy as in Python, and with a few extra bells and whistles, such as interpolation of data points. 
Nice!\n","link":"https://numbersandshapes.net/posts/fitting_sir_to_data_in_julia/","section":"posts","tags":["mathematics","julia"],"title":"Fitting the SIR model of disease to data in Julia"},{"body":"The purpose of this post will be to see if we can implement the algorithm in Julia, and thus leverage Julia\u0026rsquo;s very fast execution time.\nWe are working with polynomials defined on nilpotent variables, which means that the degree of any generator in a polynomial term will be 0 or 1. Assume that our generators are indexed from zero: \\(x_0,x_1,\\ldots,x_{n-1}\\); then any term in a polynomial will have the form \\[ cx_{i_1}x_{i_2}\\cdots x_{i_k} \\] where \\(\\{i_1, i_2,\\ldots, i_k\\}\\subseteq\\{0,1,2,\\ldots,n-1\\}\\). We can then express this term as an element of a dictionary {k =\u0026gt; v} where \\(v=c\\) and \\[ k = 2^{i_1}+2^{i_2}+\\cdots+2^{i_k}. \\] So, for example, the polynomial term \\(7x_2x_3x_5\\) would correspond to the dictionary term\n44 =\u0026gt; 7\nsince \\(44 = 2^2+2^3+2^5\\). Two polynomial terms {k1 =\u0026gt; v1} and {k2 =\u0026gt; v2} with no variables in common can then be multiplied simply by adding the k terms, and multiplying the v values, to obtain {k1+k2 =\u0026gt; v1*v2}. And we can check if k1 and k2 have a common variable easily by evaluating k1 \u0026amp; k2; a non-zero value indicates a common variable. This leads to the following Julia function for multiplying two such dictionaries:\nfunction poly_dict_mul(p1, p2) p3 = Dict{BigInt,BigInt}() for (k1, v1) in p1 for (k2, v2) in p2 if k1 \u0026amp; k2 \u0026gt; 0 continue else if k1 + k2 in keys(p3) p3[k1+k2] += v1 * v2 else p3[k1+k2] = v1 * v2 end end end end return (p3) end As you see, this is a simple double loop over the terms in each polynomial dictionary. If two terms have a non-zero conjunction, we simply move on. If two terms when added already exist in the new dictionary, we add to that term. If the sum of terms is new, we create a new dictionary element. 
The use of BigInt is to ensure that no matter how big the terms and coefficients become, we don\u0026rsquo;t suffer from arithmetic overflow.\nFor example, suppose we consider the product \\[ (x_0+x_1+x_2)(x_1+x_2+x_3). \\] A straightforward expansion produces \\[ x_0x_3 + x_1x_3 + x_1^2 + x_0x_1 + x_0x_2 + 2x_1x_2 + x_2^2 + x_2x_3. \\] which by nilpotency becomes \\[ x_0x_3 + x_1x_3 + x_0x_1 + x_0x_2 + 2x_1x_2 + x_2x_3. \\] The dictionaries corresponding to the two polynomials are\n{1 =\u0026gt; 1, 2 =\u0026gt; 1, 4 =\u0026gt; 1}\nand\n{2 =\u0026gt; 1, 4 =\u0026gt; 1, 8 =\u0026gt; 1}\nThen:\njulia\u0026gt; poly_dict_mul(Dict(1=\u0026gt;1,2=\u0026gt;1,4=\u0026gt;1),Dict(2=\u0026gt;1,4=\u0026gt;1,8=\u0026gt;1)) Dict{BigInt,BigInt} with 6 entries: 9 =\u0026gt; 1 10 =\u0026gt; 1 3 =\u0026gt; 1 5 =\u0026gt; 1 6 =\u0026gt; 2 12 =\u0026gt; 1 If we were to rewrite the keys as binary numbers, we would have\n{1001 =\u0026gt; 1, 1010 =\u0026gt; 1, 11 =\u0026gt; 1, 101 =\u0026gt; 1, 110 =\u0026gt; 2, 1100 =\u0026gt; 1}\nin which you can see that each term corresponds with the term of the product above.\nHaving conquered multiplication, finding the permanent should then require two steps:\nTurning each row of the matrix into a polynomial dictionary. Starting with \\(p=1\\), multiply all rows together, one at a time. For step 1, suppose we have a row \\(i\\) of a matrix \\(M=m_{ij}\\). Then starting with an empty dictionary p, we move along the row, and for each non-zero element \\(m_{ij}\\) we add the term p[BigInt(1)\u0026lt;\u0026lt;j] = M[i,j]. For speed we use bit operations instead of arithmetic operations. 
This means we can create a list of all polynomial dictionaries:\nfunction mat_polys(M) (n,ncols) = size(M) ps = [] for i in 1:n p = Dict{BigInt,BigInt}() for j in 1:n if M[i,j] == 0 continue else p[BigInt(1)\u0026lt;\u0026lt;(j-1)] = M[i,j] end end push!(ps,p) end return(ps) end Step 2 is a simple loop; the permanent will be given as the value in the final step:\nfunction poly_perm(M) (n,ncols) = size(M) mp = mat_polys(M) p = Dict{BigInt,BigInt}(0=\u0026gt;1) for i in 1:n p = poly_dict_mul(p,mp[i]) end return(collect(values(p))[1]) end We don\u0026rsquo;t in fact need two separate functions here; since the polynomial dictionary for each row is only used once, we could simply create each one as we needed it. However, given that none of our matrices will be too large, the saving of time and space would be minimal.\nNow for a few tests:\njulia\u0026gt; n = 10; M = [BigInt(1)*mod(j-i,n) in [1,2,3] for i = 1:n, j = 1:n]; julia\u0026gt; poly_perm(M) 125 julia\u0026gt; n = 20; M = [BigInt(1)*mod(j-i,n) in [1,2,3] for i = 1:n, j = 1:n]; julia\u0026gt; @time poly_perm(M) 0.003214 seconds (30.65 k allocations: 690.875 KiB) 15129 julia\u0026gt; n = 40; M = [BigInt(1)*mod(j-i,n) in [1,2,3] for i = 1:n, j = 1:n]; julia\u0026gt; @time poly_perm(M) 0.014794 seconds (234.01 k allocations: 5.046 MiB) 228826129 julia\u0026gt; n = 100; M = [BigInt(1)*mod(j-i,n) in [1,2,3] for i = 1:n, j = 1:n]; julia\u0026gt; @time poly_perm(M) 0.454841 seconds (3.84 M allocations: 83.730 MiB, 27.98% gc time) 792070839848372253129 julia\u0026gt; lucasnum(n)+2 792070839848372253129 This is extraordinarily fast, especially compared with our previous attempts: naive attempts using all permutations, and using Ryser\u0026rsquo;s algorithm.\nA few comparisons Over the previous blog posts, we have explored various methods of computing the permanent:\npermanent, which is the most naive method, using the formal definition, and summing over all the permutations \\(S_n\\).\nperm1, Ryser\u0026rsquo;s 
algorithm, using the Combinatorics package and iterating over all non-empty subsets of \\(\\{1,2,\\ldots,n\\}\\).\nperm2, the same as perm1, but instead of using subsets, we use all non-zero binary vectors of length n.\nperm3, Ryser\u0026rsquo;s algorithm using Gray codes to speed the transition between subsets, and using a lookup table.\nAll these are completely general, and aside from the first function, which is the least efficient, can be used for any matrix up to size about \\(25\\times 25\\).\nSo consider the \\(n\\times n\\) circulant matrix with three ones in each row, whose permanent is \\(L(n)+2\\), where \\(L(n)\\) is the \\(n\\)th Lucas number. The following table shows times in seconds for each calculation:\nMatrix size \\(n\\): 10 12 15 20 30 40 60 100\npermanent: 9.3 - - - - - - -\nperm1: 0.014 0.18 0.72 47 - - - -\nperm2: 0.03 0.105 2.63 166 - - - -\nperm3: 0.004 0.016 0.15 12.4 - - - -\npoly_perm: 0.0008 0.004 0.001 0.009 0.008 0.02 0.05 0.18\nAssuming that the time taken for permanent is roughly proportional to \\(n!n\\), we would expect that the time for matrices of sizes 23 and 24 would be about \\(1.5\\times 10^{17}\\) and \\(3.8\\times 10^{18}\\) seconds respectively. Note that the age of the universe is approximately \\(4.32\\times 10^{17}\\) seconds, so my laptop would need to run for about a third of the universe\u0026rsquo;s age to compute the permanent of a \\(23\\times 23\\) matrix. That\u0026rsquo;s about the time since the solar system and the Earth were formed.\nNote also that poly_perm will slow down if the number of non-zero values in each row increases. For example, with four consecutive ones in each row, it takes over 10 seconds for a \\(100\\times 100\\) matrix. With five ones in each row, it takes about 2.7 and 21.6 seconds respectively for matrices of size 40 and 60. Extrapolating indicates that it would take about 250 seconds for the \\(100\\times 100\\) matrix. 
In general, an \\(n\\times n\\) matrix with \\(k\\) non-zero elements in each row will have a time complexity approximately of order \\(n^k\\). However, including the extra optimization (which we haven\u0026rsquo;t done) that allows for elements to be set to one before the multiplication, produces an algorithm whose complexity is \\(O(2^{\\text{min}(2w,n)}(w+1)n^2)\\) where \\(n\\) is the size of the matrix, and \\(w\\) its band-width. See the original paper for details.\n","link":"https://numbersandshapes.net/posts/the_butera_pernici_algorithm_2/","section":"posts","tags":["mathematics","computation"],"title":"The Butera-Pernici algorithm (2)"},{"body":"Introduction We know that there is no general sub-exponential algorithm for computing the permanent of a square matrix. But we may very reasonably ask \u0026ndash; might there be a faster, possibly even polynomial-time algorithm, for some specific classes of matrices? For example, a sparse matrix will have most terms of the permanent zero \u0026ndash; can this be somehow leveraged for a better algorithm?\nThe answer seems to be a qualified \u0026ldquo;yes\u0026rdquo;. In particular, if a matrix is banded, so that most diagonals are zero, then a very fast algorithm can be applied. This algorithm is described in an online article by Paolo Butera and Mario Pernici called Sums of permanental minors using Grassmann algebra. Accompanying software (Python programs) is available at github. This software has been rewritten for the SageMath system, and you can read about it in the documentation. The algorithm as described by Butera and Pernici, and as implemented in Sage, actually produces a generating function.\nOur intention here is to investigate a simpler version, which computes the permanent only.\nBasic outline Let \\(M\\) be an \\(n\\times n\\) square matrix, and consider the polynomial ring on \\(n\\) variables \\(x_1,x_2,\\ldots,x_n\\). 
Each row of the matrix will correspond to an element of this ring; in particular row \\(i\\) will correspond to\n\\[ \\sum_{j=1}^nm_{ij}x_j=m_{i1}x_1+m_{i2}x_2+\\cdots+m_{in}x_n. \\]\nSuppose further that all the generating elements \\(x_i\\) are nilpotent of order two, so that \\(x_i^2=0\\).\nNow if we take all the row polynomials and multiply them, each term of the product will have degree \\(n\\). But by nilpotency, all terms which contain a repeated element will vanish. The result will be only those terms which contain each generator exactly once, of which there will be \\(n!\\). To obtain the permanent all that is required is to set \\(x_i=1\\) for each generator.\nHere\u0026rsquo;s an example in Sage.\nsage: R.\u0026lt;a,b,c,d,e,f,g,h,i,x1,x2,x3\u0026gt; = PolynomialRing(QQbar) sage: M = matrix([[a,b,c],[d,e,f],[g,h,i]]) sage: X = matrix([[x1],[x2],[x3]]) sage: MX = M*X; MX [a*x1 + b*x2 + c*x3] [d*x1 + e*x2 + f*x3] [g*x1 + h*x2 + i*x3] To implement nilpotency, it\u0026rsquo;s easiest to reduce modulo the ideal defined by \\(x_i^2=0\\) for all \\(i\\). 
So we take the product of those row elements, and reduce:\nsage: I = R.ideal([x1^2, x2^2, x3^2]) sage: pr = MX[0,0]*MX[1,0]*MX[2,0] sage: pr.reduce(I) c*e*g*x1*x2*x3 + b*f*g*x1*x2*x3 + c*d*h*x1*x2*x3 + a*f*h*x1*x2*x3 + b*d*i*x1*x2*x3 + a*e*i*x1*x2*x3 Finally, set each generator equal to 1:\nsage: pr.reduce(I).subs({x1:1, x2:1, x3:1}) c*e*g + b*f*g + c*d*h + a*f*h + b*d*i + a*e*i and this is indeed the permanent for a general \\(3\\times 3\\) matrix.\nSome experiments Let\u0026rsquo;s experiment now with the matrices we\u0026rsquo;ve seen in a previous post, which contain three consecutive super-diagonals of ones, and the rest zero.\nSuch a matrix is easy to set up in Sage:\nsage: n = 10 sage: v = n*[0]; v[1:4] = [1,1,1] sage: M = matrix.circulant(v) sage: M [0 1 1 1 0 0 0 0 0 0] [0 0 1 1 1 0 0 0 0 0] [0 0 0 1 1 1 0 0 0 0] [0 0 0 0 1 1 1 0 0 0] [0 0 0 0 0 1 1 1 0 0] [0 0 0 0 0 0 1 1 1 0] [0 0 0 0 0 0 0 1 1 1] [1 0 0 0 0 0 0 0 1 1] [1 1 0 0 0 0 0 0 0 1] [1 1 1 0 0 0 0 0 0 0] Similarly we can define the polynomial ring:\nsage: R = PolynomialRing(QQbar,x,n) sage: R.inject_variables() Defining x0, x1, x2, x3, x4, x5, x6, x7, x8, x9 And now the polynomials corresponding to the rows:\nsage: MX = M*matrix(R.gens()).transpose() sage: MX [x1 + x2 + x3] [x2 + x3 + x4] [x3 + x4 + x5] [x4 + x5 + x6] [x5 + x6 + x7] [x6 + x7 + x8] [x7 + x8 + x9] [x0 + x8 + x9] [x0 + x1 + x9] [x0 + x1 + x2] If we multiply them, we will end up with a huge expression, far too long to display:\nsage: pr = prod(MX[i,0] for i in range(n)) sage: len(pr.monomials()) 14103 We could reduce this by the ideal, but that would be slow. Far better to reduce after each separate multiplication:\nsage: I = R.ideal([v^2 for v in R.gens()]) sage: p = R.one() sage: for i in range(n): p = p*MX[i,0] p = p.reduce(I) sage: p.subs({v:1 for v in R.gens()}) 125 The answer is almost instantaneous. 
We can repeat the above list of commands starting with different values of n; for example with n=20 the result is 15129, as we expect.\nThis is not yet optimal; for n=20 on my machine the final loop takes about 7.8 seconds. Butera and Pernici show that the multiplication and setting the variables to one can sometimes be done in the opposite order; that is, some variables can be identified to be set to one before the multiplication. This can speed the entire loop dramatically, and this optimization has been included in the Sage implementation. For details, see their paper.\n","link":"https://numbersandshapes.net/posts/the_butera_pernici_algorithm_1/","section":"posts","tags":["mathematics","computation"],"title":"The Butera-Pernici algorithm (1)"},{"body":"","link":"https://numbersandshapes.net/tags/astronomy/","section":"tags","tags":null,"title":"Astronomy"},{"body":"","link":"https://numbersandshapes.net/tags/science/","section":"tags","tags":null,"title":"Science"},{"body":"As a first blog post for 2020, I\u0026rsquo;m dusting off one from my previous blog, which I\u0026rsquo;ve edited only slightly.\nI\u0026rsquo;ve been looking up at the sky at night recently, and thinking about the sizes of things. Now it\u0026rsquo;s all very well to say something is for example a million kilometres away; that\u0026rsquo;s just a number, and as far as the real numbers go, a pretty small one (all finite numbers are \u0026ldquo;small\u0026rdquo;). The difficulty comes in trying to marry very large distances and times with our own human scale. I suppose if you\u0026rsquo;re a cosmologist or astrophysicist this is trivial, but for the rest of us it\u0026rsquo;s pretty daunting.\nIt\u0026rsquo;s all a problem of scale. You can say the sun has an average distance of 149.6 million kilometres from earth (roughly 93 million miles), but how big, really, is that? 
I don\u0026rsquo;t have any sense of how big such a distance is: my own sense of scale goes down to about 1mm in one direction, and up to about 1000km in the other. This is hopelessly inadequate for cosmological measurements.\nSo let\u0026rsquo;s start with some numbers:\nDiameter of Earth: 12,742 km\nDiameter of the moon: 3,475 km\nDiameter of Sun: 1,391,684 km\nDiameter of Jupiter: 139,822 km\nAverage distance of Earth to the sun: 149,597,870 km\nAverage distance of Jupiter to the sun: 778.5 million km\nAverage distance of Earth to the moon: 384,400 km\nOf course since all orbits are elliptical, distances will both exceed and be less than the average at different times. However, for our purposes of scale, an average is quite sufficient.\nBy doing a bit of division, we find that the moon is about 0.27 times the width of the earth, Jupiter is about 11 times bigger (in linear measurements) and the Sun about 109.2 times bigger than the Earth.\nNow for some scaling. We will scale the earth down to the size of a mustard seed, which is about 1mm in diameter. On this scale, the Sun is about the size of a large grapefruit (which happily is large, round, and yellow), and the moon is about the size of a dust mite.\nOn this new scale, with 12,742 km equal to 1 millimetre, the above distances become:\nDiameter of Earth: 1mm\nDiameter of the moon: 0.27mm\nDiameter of Sun: 109.2mm\nDiameter of Jupiter: 10.97mm\nAverage distance of Earth to the sun: 11740mm = 11.74m\nAverage distance of Jupiter to the sun: 61097.2mm = 61.1m\nAverage distance of Earth to the moon: 30.2mm = 3cm\nSo how far is it from the sun to the Earth? Well, a cricket pitch is 22 yards long, so 11 yards from centre to end, which is about 10.1 metres. So imagine our grapefruit placed at the centre of a cricket pitch. Go to an end of the pitch, and about 1.5 metres (about 5 feet) beyond. Place the mustard seed there. 
What you now have is a scale model of the sun and earth.\nHere\u0026rsquo;s a cricket pitch to give you an idea of its size:\nNote that in this picture, the yellow circle is not drawn to size. If you look just left of centre, you\u0026rsquo;ll see the cricket ball, which has a diameter of about 73mm. Our \u0026ldquo;Sun\u0026rdquo; grapefruit should be about half again as wide as that.\nIf you don\u0026rsquo;t have a sense of a cricket pitch (even with this picture), consider instead a tennis court: the distance from the net to the baseline is 39 feet, or 11.9m. At our scale, this is nearly exact:\n(Note that on this scale the sun is somewhat bigger than a tennis ball, and the Earth would in fact be too small to see on this picture.)\nSo we now have a scale model of the Sun and Earth. If we wanted to include the Moon, start with its average distance from Earth (384,400 km), then we\u0026rsquo;d have a dust mite circling our mustard seed at a distance of 3cm.\nHow about Jupiter? Well, we noted before that it is about 61m away. Continuing with our cricket pitch analogy, imagine three pitches laid end to end, which is 66 yards, or 60.35 metres. Not too far off, really! So place the grapefruit at the end of the first pitch, the mustard seed a little away from centre, and at the end of the third pitch place an 11mm ball for Jupiter: a glass marble will do nicely for this.\nAnd the size of the solar system? Assuming the edge is given by the heliopause (where the Sun\u0026rsquo;s solar wind is slowed down by interstellar particles); this is at a distance of about 18,100,000,000 km from the Sun, which in our scale is about 1.42 km, or a bit less than a mile (0.88 miles). Get that? With Earth the size of a mustard seed, the edge of the solar system is nearly a mile away!\nOnwards and outwards So with this scaling we have got the solar system down to a reasonably manageable size. 
If 149,600,000 km seems too vast a distance to make much sense of, scaling it down to 11.7 metres is a lot easier. But let\u0026rsquo;s get cosmological here, and start with a light year, which is 9,460,730,472,580,800 m, or more simply (and inexactly) \\(9.46\\times 10^{15}\\) m. In our scale, that becomes 742,483,948.562 mm, or about 742 km, which is about 461 miles. That\u0026rsquo;s about the distance from New York City to Greensboro, NC, or from Melbourne to Sydney. The nearest star is Proxima Centauri, which is 4.3 light years away: at our Earth=mustard seed scale, that\u0026rsquo;s about 3192.6 km, or 1983.8 miles. This is the flight distance from Los Angeles to Detroit. Look at that distance on an atlas, imagine our planet home mustard seed at one place and consider getting to the other.\nThe furthest any human has been from the mustard seed Earth is to the dust-mite Moon: 3cm, or 1.2 inches away. To get to the nearest star is, well, a lot further!\nThe nearest galaxy to the Milky Way is about 0.025 mly away (\u0026ldquo;mly\u0026rdquo; = \u0026ldquo;millions of light years\u0026rdquo;). Now we\u0026rsquo;re getting into the big stuff. At our mustard seed scale, this distance is about 18.5 million kilometres. And there are lots of other galaxies, much further away than this. For example, the Andromeda Galaxy is 2,538,000 light years away, which at our scale is 1,884,465,000 km \u0026ndash; nearly two billion kilometres!\nWhat\u0026rsquo;s remarkable is that even scaling the Earth down to a tiny mustard seed speck, we are still up against distances too vast for human scale. We could try scaling the Earth down to a ball whose diameter is the thickness of the finest human hair \u0026ndash; about 0.01 mm \u0026ndash; which is the smallest distance within reach of our own scale. 
But even at this scale distances are only reduced by a factor of 100, so the Andromeda Galaxy is still 18,844,650 km away.\nOne last try: suppose we scale the entire Solar System, out to the heliopause, down to a mustard seed. This means that the diameter of the heliopause, 36,200,000,000 km, is scaled down to 1mm. Note that the heliopause is about three times further away from the sun than the mean distance of Pluto. At this scale, one light year is a happily manageable 261mm, or about ten and a quarter inches. So the nearest star is 1.12m away, or about 44 inches. And the nearest galaxy? Well, it\u0026rsquo;s 25000 light years away, which puts it at about 6.5 km. The Andromeda Galaxy is somewhat over 663 km away. The furthest galaxy, with the enticing name of GN-z11, is said to be about 34 billion light years away. On our heliopause=mustard seed scale, that\u0026rsquo;s about 9.1 million kilometres.\nThere\u0026rsquo;s no escaping it: the Universe is big, and the scales needed to describe it, no matter how you approach them, quickly leap out of our own human scale.\n","link":"https://numbersandshapes.net/posts/the_size_of_the_universe/","section":"posts","tags":["science","astronomy"],"title":"The size of the universe"},{"body":"As I discussed in my last blog post, the permanent of an \\(n\\times n\\) matrix \\(M=m_{ij}\\) is defined as \\[ \\text{per}(M)=\\sum_{\\sigma\\in S_n}\\prod_{i=1}^nm_{i,\\sigma(i)} \\] where the sum is taken over all permutations of the \\(n\\) numbers \\(1,2,\\ldots,n\\). It differs from the better known determinant in having no sign changes. 
For a \\(3\\times 3\\) matrix, for example:\n\\[\\text{per} \\begin{bmatrix} a\u0026amp;b\u0026amp;c\\\\ d\u0026amp;e\u0026amp;f\\\\ g\u0026amp;h\u0026amp;i \\end{bmatrix} =aei+afh+bfg+bdi+cdh+ceg.\\]\nBy comparison, here is the determinant:\n\\[\\text{det} \\begin{bmatrix} a\u0026amp;b\u0026amp;c\\\\ d\u0026amp;e\u0026amp;f\\\\ g\u0026amp;h\u0026amp;i \\end{bmatrix} =aei - afh + bfg - bdi + cdh - ceg.\\]\nThe apparent simplicity of the permanent definition hides the fact that there is no known sub-exponential algorithm to compute it, nor does it satisfy most of the nice properties of determinants. For example, we have\n\\[ \\text{det}(AB)=\\text{det}(A)\\text{det}(B) \\]\nbut in general \\(\\text{per}(AB)\\ne\\text{per}(A)\\text{per}(B)\\). Nor is the permanent zero if two rows are equal, or if any subset of rows is linearly dependent.\nApplying the definition and summing over all the permutations is prohibitively slow: it has \\(O(n!n)\\) complexity, and is unusable except for very small matrices.\nIn the small but excellent textbook \u0026ldquo;Combinatorial Mathematics\u0026rdquo; by Herbert J. Ryser, published in 1963, one chapter is devoted to the inclusion-exclusion principle, of which the computation of permanents is given as an example. The permanent may be considered as a sum of products, where in each product we choose one value from each row and one value from each column.\nSuppose we start by adding the rows together and multiplying the resulting column sums: \\[ P = (a+d+g)(b+e+h)(c+f+i). 
\\] This will certainly contain all elements of the permanent, but it also includes products we don\u0026rsquo;t want, such as for example \\(aef\\) where the elements are chosen from only two rows, and \\(ghi\\) where the elements are all in one row.\nTo eliminate all possible products from only two rows we subtract them:\n\\[\\begin{aligned} P \u0026amp; -(a+d)(b+e)(c+f)\\text{ rows $1$ and $2$}\\\\ \u0026amp; - (a+g)(b+h)(c+i)\\text{ rows $1$ and $3$}\\\\ \u0026amp; - (d+g)(e+h)(f+i)\\text{ rows $2$ and $3$} \\end{aligned}\\]\nThe trouble with that subtraction is that we are subtracting products of each individual row twice; for example \\(abc\\) is in the first and second products. But we only want to subtract those products once from \\(P\\). So we have to add them again:\n\\[\\begin{aligned} P \u0026amp; -(a+d)(b+e)(c+f)\\qquad\\text{ rows $1$ and $2$}\\\\ \u0026amp; - (a+g)(b+h)(c+i)\\qquad\\text{ rows $1$ and $3$}\\\\ \u0026amp; - (d+g)(e+h)(f+i)\\qquad\\text{ rows $2$ and $3$}\\\\ \u0026amp; + abc\\\\ \u0026amp; + def\\\\ \u0026amp; + ghi. \\end{aligned}\\]\nComputing the permanent of a \\(4\\times 4\\) matrix would start by adding all the rows and multiplying the sums. Then we would subtract all products of rows taken three at a time. But this would subtract all products of rows taken two at a time twice for each pair, so we add those products back in again. 
Finally we find that the subtractions and additions of products of a single row have cancelled each other out, leaving those products still present, so we need to subtract them again:\n\\[\\begin{array}{ccccc} \u0026amp;1\u0026amp;2\u0026amp;3\u0026amp;4\\\\ -\u0026amp;1\u0026amp;2\u0026amp;3\u0026amp;\\\\ -\u0026amp;1\u0026amp;2\u0026amp;\u0026amp;4\\\\ -\u0026amp;1\u0026amp;\u0026amp;3\u0026amp;4\\\\ -\u0026amp;\u0026amp;2\u0026amp;3\u0026amp;4\\\\ +\u0026amp;1\u0026amp;2\u0026amp;\u0026amp;\\\\ +\u0026amp;1\u0026amp;\u0026amp;3\u0026amp;\\\\ +\u0026amp;1\u0026amp;\u0026amp;\u0026amp;4\\\\ +\u0026amp;\u0026amp;2\u0026amp;3\u0026amp;\\\\ +\u0026amp;\u0026amp;2\u0026amp;\u0026amp;4\\\\ +\u0026amp;\u0026amp;\u0026amp;3\u0026amp;4\\\\ -\u0026amp;1\\\\ -\u0026amp;\u0026amp;2\\\\ -\u0026amp;\u0026amp;\u0026amp;3\\\\ -\u0026amp;\u0026amp;\u0026amp;\u0026amp;4 \\end{array}\\]\nAfter all of this we are left with only those products which include all four rows and columns. And as you see, this is a standard inclusion-exclusion approach.\nFor an \\(n\\times n\\) matrix, let \\(S=\\{1,2,\\ldots,n\\}\\), and let \\(X\\subseteq S\\). Then define \\(R(X)\\) to be the product, over all columns, of the sums of the elements in the rows indexed by \\(X\\). For example, with \\(S=\\{1,2,3\\}\\) and \\(X=\\{1,3\\}\\), then\n\\[ R(X)=(a+g)(b+h)(c+i). 
\\] We can thus write the above method for obtaining the permanent as:\n\\[\\begin{aligned} \\text{per}(M)\u0026amp;=\\sum_{\\emptyset\\ne X\\subseteq S}(-1)^{n-|X|}R(X)\\\\ \u0026amp;= (-1)^n\\sum_{\\emptyset\\ne X\\subseteq S}(-1)^{|X|}R(X) \\end{aligned}\\]\nThis is Ryser\u0026rsquo;s algorithm.\nNaive Implementation A naive implementation would be to simply iterate through all the non-empty subsets \\(X\\) of \\(S=\\{1,2,\\ldots,n\\}\\), and for each subset add those rows, and multiply the resulting sums.\nHere\u0026rsquo;s one such Julia function:\nusing Combinatorics function perm1(M) n, nc = size(M) S = 1:n P = 0 for X in setdiff(powerset(S),powerset([])) P += (-1)^length(X)*prod(sum(M[i,:] for i in X)) end return((-1)^n * P) end Alternatively, we can manage without any extra packages, and use the fact that the subsets of \\(S\\) correspond to the binary digits of integers between 1 and \\(2^n-1\\):\nfunction perm2(M) n,nc = size(M) P = 0 for i in (1:2^n-1) indx = digits(i,base=2,pad=n) P += (-1)^sum(indx)*prod(sum(M .* indx,dims=1)) end return((-1)^n * P) end Now for some tests. There are very few matrices for which the permanent has a known closed-form value; however there are some circulant matrices of zeros and ones whose permanent is known. One such is the \\(n\\times n\\) matrix \\(M=m_{ij}\\) whose first, second, and third circulant superdiagonals are ones; that is, for which\n\\[ m_{ij}=1 \\Leftrightarrow\\bmod(j-i,n)\\in\\{1,2,3\\}. \\]\n(Note that since the permanent is trivially unchanged by any permutation of the rows, \\(M\\) can be also defined as being a circulant matrix each row of which has three consecutive ones.)\nThen\n\\[ \\text{per}(M)=F_{n-1}+F_{n+1}+2 \\]\nwhere \\(F_k\\) is the \\(k\\)-th Fibonacci number indexed from 1, so that \\(F_1=F_2=1\\). 
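As a quick check of this formula for \\(n=10\\): with \\(F_9=34\\) and \\(F_{11}=89\\) we get \\[ \\text{per}(M)=34+89+2=125. \\] 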
Note that the sums of Fibonacci numbers whose indices differ by two form the Lucas numbers \\(L_n\\).\nAlternatively,\n\\[ \\text{per}(M)=\\text{trace}(C_2^n)+2 \\]\nwhere\n\\[C_2=\\begin{bmatrix} 0\u0026amp;1\\\\ 1\u0026amp;1 \\end{bmatrix}\\]\nThis result, and some others, can be found in the article \u0026ldquo;Permanents\u0026rdquo; by Marvin Marcus and Henryk Minc, in The American Mathematical Monthly, Vol. 72, No. 6 (Jun. - Jul., 1965), pp. 577-591.\njulia\u0026gt; n = 10; julia\u0026gt; M = [1*(mod(j-i,n) in [1,2,3]) for i=1:n, j=1:n] 10×10 Array{Int64,2}: 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 julia\u0026gt; perm1(M) 125 julia\u0026gt; perm2(M) 125 julia\u0026gt; fibonaccinum(n-1)+fibonaccinum(n+1)+2 125 julia\u0026gt; lucasnum(n)+2 125 julia\u0026gt; C2=[0 1; 1 1]; julia\u0026gt; tr(C2^n)+2 125 However, it seems that using the machinery of subsets adds to the time, which becomes noticeable if \\(n\\) is large:\njulia\u0026gt; using BenchmarkTools julia\u0026gt; n = 20; M = [1*(mod(j-i,n) in [1,2,3]) for i=1:n, j=1:n]; julia\u0026gt; @time perm1(M) 6.703187 seconds (31.46 M allocations: 5.097 GiB, 17.93% gc time) 15129 julia\u0026gt; @time perm2(M) 1.677242 seconds (3.21 M allocations: 3.721 GiB, 8.69% gc time) 15129 That is, the perm2 implementation is about four times faster than perm1.\nImplementation with Gray codes Here is where we can show some cleverness. Recall that a Gray code is a listing of all numbers 0 through \\(2^n-1\\) in which the binary expansion changes in only one bit between consecutive values. It is also known that for \\(1\\le k\\le 2^n-1\\) the Gray code corresponding to \\(k\\) is given by \\(k\\oplus (k\\gg 1)\\). 
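Taking \\(k=5\\) as a small worked case of that formula: \\[ 5 = 101_2,\\qquad 5\\gg 1 = 010_2,\\qquad 101_2\\oplus 010_2 = 111_2 = 7, \\] so the Gray code corresponding to 5 is 7. 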
For example, for \\(n=4\\):\njulia\u0026gt; for i in 1:15 j = xor(i,i\u0026gt;\u0026gt;1) println(lpad(i,2),\u0026#34;:\\t\\b\\b\\b\u0026#34;,bin(i,4),\u0026#39;\\t\u0026#39;,lpad(j,2),\u0026#34;:\\t\\b\\b\\b\u0026#34;,bin(j,4)) end 1: [1, 0, 0, 0] 1: [1, 0, 0, 0] 2: [0, 1, 0, 0] 3: [1, 1, 0, 0] 3: [1, 1, 0, 0] 2: [0, 1, 0, 0] 4: [0, 0, 1, 0] 6: [0, 1, 1, 0] 5: [1, 0, 1, 0] 7: [1, 1, 1, 0] 6: [0, 1, 1, 0] 5: [1, 0, 1, 0] 7: [1, 1, 1, 0] 4: [0, 0, 1, 0] 8: [0, 0, 0, 1] 12: [0, 0, 1, 1] 9: [1, 0, 0, 1] 13: [1, 0, 1, 1] 10: [0, 1, 0, 1] 15: [1, 1, 1, 1] 11: [1, 1, 0, 1] 14: [0, 1, 1, 1] 12: [0, 0, 1, 1] 10: [0, 1, 0, 1] 13: [1, 0, 1, 1] 11: [1, 1, 0, 1] 14: [0, 1, 1, 1] 9: [1, 0, 0, 1] 15: [1, 1, 1, 1] 8: [0, 0, 0, 1] Note that in the rightmost column of binary expansions, there is only one bit shift between consecutive values: either a single 1 is added, or removed.\nFor Ryser\u0026rsquo;s algorithm, we can consider this in terms of our sum of rows: if the subsets are given in Gray code order, then moving from one subset to the next is a matter of just adding or subtracting one row from the current sum.\nWe are not so much interested in the codes themselves, as in their successive differences. For example here are the differences for \\(n=4\\):\njulia\u0026gt; [xor(k,k\u0026gt;\u0026gt;1)-xor(k-1,(k-1)\u0026gt;\u0026gt;1) for k in 1:15]\u0026#39; 1×14 Adjoint{Int64,Array{Int64,1}}: 1 2 -1 4 1 -2 -1 8 1 2 -1 -4 1 -2 -1 This sequence is A055975 in the Online Encyclopaedia of Integer Sequences, and can be computed as\n\\[ a[n] = \\begin{cases} n,\u0026amp;\\text{if $n\\le 2$}\\\\ 2a[n/2],\u0026amp;\\text{if $n$ is even}\\\\ (-1)^{(n-1)/2},\u0026amp;\\text{if $n$ is odd} \\end{cases} \\]\nWe can interpret each term in this sequence as the row that should be added or subtracted from the current sum. A value of \\(2^m\\) means adding row \\(m+1\\); a value of \\(-2^m\\) means subtracting row \\(m+1\\). 
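To see this reading in action on a small case, take \\(n=3\\): the differences \\(1, 2, -1, 4, 1, -2, -1\\) say add row 1, add row 2, subtract row 1, add row 3, add row 1, subtract row 2, subtract row 1, so the sets of rows making up the running sum are \\[ \\{1\\},\\ \\{1,2\\},\\ \\{2\\},\\ \\{2,3\\},\\ \\{1,2,3\\},\\ \\{1,3\\},\\ \\{3\\}, \\] which are all seven non-empty subsets of \\(\\{1,2,3\\}\\), each visited exactly once. 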
From any difference, we can obtain the bit position by taking the logarithm to base 2, and whether to add or subtract by its sign. But in fact the number of differences is very small: only \\(2n\\), so it will be much easier to create a lookup table, using a dictionary:\nv = vcat([(2^i,i+1,1) for i in 0:n-1],[(-2^i,i+1,-1) for i in 0:n-1]) lut = Dict(x[1] =\u0026gt; (x[2],x[3]) for x in v) For example, for \\(n=3\\) the lookup dictionary is\n4 =\u0026gt; (3, 1) -4 =\u0026gt; (3, -1) 2 =\u0026gt; (2, 1) -2 =\u0026gt; (2, -1) -1 =\u0026gt; (1, -1) 1 =\u0026gt; (1, 1) Now we can implement the algorithm by first pre-computing the sequences of differences, and for each element of the sequence, use the lookup dictionary to determine what row is to be added or subtracted. We start with the first row.\nPutting all this together gives another implementation of the permanent:\nfunction perm3(M) # Gray code version with lookup tables n,nc = size(M) if n != nc error(\u0026#34;Matrix must be square\u0026#34;) end gd = zeros(Int64,1,2^n-1) gd[1] = 1 v = vcat([(2^i,i+1,1) for i in 0:n-1],[(-2^i,i+1,-1) for i in 0:n-1]) lut = Dict(x[1] =\u0026gt; (x[2],x[3]) for x in v) r = M[1,:] # r will contain the sum of rows s = -1 pm = s*prod(r) for i in (2:2^n-1) if iseven(i) gd[i] = 2*gd[div(i,2)] else gd[i] = (-1)^((i-1)/2) end r += M[lut[gd[i]][1],:]*lut[gd[i]][2] s *= -1 pm += s*prod(r) end return(pm * (-1)^n) end This can be timed as before:\njulia\u0026gt; @time perm3(M) 0.943328 seconds (3.15 M allocations: 728.004 MiB, 15.91% gc time) 15129 This is our best time yet.\n","link":"https://numbersandshapes.net/posts/permanents_and_rysers_algorithm/","section":"posts","tags":null,"title":"Permanents and Ryser's algorithm        :mathematics:computation:julia"},{"body":"Introduction Python is of course one of the world\u0026rsquo;s currently most popular languages, and there are plenty of statistics to show it. 
Of all languages in current use, Python is one of the oldest (in the very quick time-scale of programming languages) dating from 1990 - only C and its variants are older. However, it seems to keep its eternal youth by being re-invented, and by its constantly increasing libraries. Indeed, one of Python\u0026rsquo;s greatest strength is its libraries, and pretty much every Python user will have worked with numpy, scipy, matplotlib, pandas, to name but four. In fact, aside from some specialized applications (mainly involving security, speed, or memory) Python can be happily used for almost everything.\nJulia on the other hand is newer, dating from 2012. (Only Swift is newer.) It was designed to have the speed of C, the power of Matlab, and the ease of use of Python. Note the comparison with Matlab - Julia was designed as a language for technical computing, although it is very much a general purpose language. It can even be used for low-level systems programming.\nLike Python, Julia can be extended through packages, of which there are many: according to Julia\u0026rsquo;s package repository there are 2554 at the time of writing. Some of the packages are big, mature, and robust, others are smaller or represent a niche interest. You can go to Julia Observer to get a sense of which packages are the most popular, largest, have the most commits on github, and so on. Because Julia is still relatively new, packages are still being actively developed. 
However, some such as Plots, JuMP for optimization, Differential Equations, to name but three, are very much ready for the Big Time.\nThe purpose of this post is to do a single comparison of Julia and Python for speed.\nMatrix permanents Given a square matrix, its determinant is a well-known and useful construct (in spite of Sheldon Axler).\nThe determinant of an \\(n\\times n\\) matrix \\(M=m_{ij}\\) can be formally defined as\n\\[ \\det(M)=\\sum_{\\sigma\\in S_n}\\left(\\text{sgn}(\\sigma)\\prod_{i=1}^nm_{i,\\sigma(i)}\\right) \\]\nwhere the sum is taken over all permutations \\(\\sigma\\) of \\(1,2,\\ldots,n\\), and where \\(\\text{sgn}(\\sigma)\\) is the sign of the permutation; which is defined in terms of the number of digit swaps to get to it: an even number of swaps has a sign of 1, and an odd number a sign of \\(-1\\). The determinant can be effectively computed by Gaussian elimination of a matrix into triangular form, which takes in the order of \\(n^3\\) operations; the determinant is then the product of the diagonal elements.\nThe permanent is defined similarly, except for the sign:\n\\[ \\text{per}(M)=\\sum_{\\sigma\\in S_n}\\prod_{i=1}^nm_{i,\\sigma(i)}. \\]\nRemarkably enough, this simple change renders the permanent impossible to be computed effectively; all known algorithms have exponential orders. Computing by expanding each permutation takes \\(O(n!n)\\) operations, some better algorithms (such as Ryser\u0026rsquo;s algorithm) have order \\(O(2^{n-1}n)\\).\nThe permanent has some applications, although not as many as the determinant. An easy and immediate result is that if \\(M\\) is a matrix consisting entirely of ones, except for the main diagonal of zeros (so that it is the \u0026ldquo;ones complement\u0026rdquo; of the identity matrix), its permanent is the number of derangements of \\(n\\) objects; that is, the number of permutations in which there are no fixed points.\nFirst Python. 
Here is a simple program, saved as permanent.py, to compute the permanent from its definition:\nimport itertools as it import numpy as np def permanent(m): nr,nc = np.shape(m) if nr != nc: raise ValueError(\u0026#34;Matrix must be square\u0026#34;) pm = 0 for p in it.permutations(range(nr)): pm += np.prod([m[i,p[i]] for i in range(nr)]) return pm I am not interested in optimizing speed, but simply in implementing the same algorithm in Python and Julia to see what happens. Now let\u0026rsquo;s run this in a Python REPL (I\u0026rsquo;m using IPython here):\nIn [1]: import permanent as pt In [2]: import numpy as np In [3]: M = (1 - np.identity(4)).astype(np.intc) In [4]: pt.permanent(M) Out[4]: 9 and this is correct. This result was practically instantaneous, but it slows down appreciably, as you\u0026rsquo;d expect, for larger matrices:\nIn [5]: from timeit import default_timer as timer In [6]: M = (1 - np.identity(8)).astype(np.intc) In [7]: t = timer();print(pt.permanent(M));timer()-t 14833 Out[7]: 0.7398275199811906 In [8]: M = (1 - np.identity(9)).astype(np.intc) In [9]: t = timer();print(pt.permanent(M));timer()-t 133496 Out[9]: 10.244881154998438 In [10]: M = (1 - np.identity(10)).astype(np.intc) In [11]: t = timer();print(pt.permanent(M));timer()-t 1334961 Out[11]: 86.57762016600464 Now no doubt this could be sped up in numerous ways, but that is not my point: I am simply implementing the same algorithm in each language. At any rate, my elementary program becomes effectively unusable for matrices bigger than about \\(8\\times 8\\).\nNow for Julia. Again, we start with a simple program:\nusing Combinatorics function permanent(m) nr,nc = size(m) if nr != nc error(\u0026#34;Matrix must be square\u0026#34;) end pm = 0 for p in permutations(1:nr) pm += prod(m[i,p[i]] for i in 1:nr) end return(pm) end You can see this program and the Python one above are, to all intents and purposes, identical. 
There are no clever optimizing tricks, it is a raw implementation of the basic definition.\nFirst, a quick test:\njulia\u0026gt; using LinearAlgebra julia\u0026gt; M = 1 .- Matrix(1I,4,4); julia\u0026gt; include(\u0026#34;permanent.jl\u0026#34;) julia\u0026gt; permanent(M) 9 So far, so good. Now for some time trials:\njulia\u0026gt; using BenchmarkTools julia\u0026gt; M = 1 .- Matrix(1I,8,8); julia\u0026gt; @time permanent(M) 0.020514 seconds (201.61 k allocations: 14.766 MiB) 14833 julia\u0026gt; M = 1 .- Matrix(1I,9,9); julia\u0026gt; @time permanent(M) 0.245049 seconds (1.81 M allocations: 143.965 MiB, 33.73% gc time) 133496 julia\u0026gt; M = 1 .- Matrix(1I,10,10); julia\u0026gt; @time permanent(M) 1.336724 seconds (18.14 M allocations: 1.406 GiB, 3.20% gc time) 1334961 You\u0026rsquo;ll see that Julia, thanks to its JIT compiler, is much much faster than Python. The point is that I didn\u0026rsquo;t have to do anything here to access that speed, it\u0026rsquo;s just a splendid part of the language.\nWinner: Julia, by a country mile.\nA few words at the end The timings given above are not absolute - running on a different system or with different versions of Python, Julia, and their libraries, will give different results. But the point is not the exact times taken, but the comparison of time between Julia and Python.\nFor what it\u0026rsquo;s worth, I\u0026rsquo;m running a fairly recently upgraded version of Arch Linux on a Lenovo Thinkpad X1 Carbon, generation 3. I\u0026rsquo;m running Julia 1.3.0 and Python 3.7.4. The machine has 8Gb of memory, of which about 2Gb were free.\n","link":"https://numbersandshapes.net/posts/speeds_of_julia_and_python/","section":"posts","tags":["programming","python","julia"],"title":"Speeds of Julia and Python"},{"body":"Just recently there was a news item about a solo explorer being the first Australian to reach the Antarctic \u0026ldquo;Pole of Inaccessibility\u0026rdquo;. 
Such a Pole is usually defined as that place on a continent that is furthest from the sea. The South Pole is about 1300km from the nearest open sea, and can be reached by specially fitted aircraft, or by tractors and sleds along the 1600km \u0026ldquo;South Pole Highway\u0026rdquo; from McMurdo Base. However, it is only about 500km from the nearest coast line on the Ross Ice Shelf. McMurdo Base is situated on the outside of the Ross Ice Shelf, so that it is accessible from the sea.\nThe Southern Pole of Inaccessibility is about 870km further inland from the South Pole, and is very hard to reach\u0026mdash;indeed the first people there were a Russian party in 1958, whose enduring legacy is a bust of Lenin at that Pole. Unlike at the South Pole, there is no base or habitation there; just a frigid wilderness. The Southern Pole of Inaccessibility is 1300km from the nearest coast.\nA pole of inaccessibility on any landmass can be defined as the centre of the largest circle that can be drawn entirely within it. You can see all of these for the world\u0026rsquo;s continents at an ArcGIS site. If you don\u0026rsquo;t want to wait for the images to load, here is Antarctica:\nYou\u0026rsquo;ll notice that this map of Antarctica is missing the ice shelves, which fill up most of the bays. If the ice shelves are included, then we can draw a larger circle.\nAs an image processing exercise, I decided to experiment, using the distance transform to measure distances from the coasts, and Julia as the language. Although Julia has now been in development for a decade, it\u0026rsquo;s still a \u0026ldquo;new kid on the block\u0026rdquo;. But some of its libraries (known as \u0026ldquo;packages\u0026rdquo;) are remarkably mature and robust. One such is its imaging package Images.\nIn fact, we need to install and use several packages as well as Images: Colors, FileIO, ImageView, Plots. 
These can all be added with Pkg.add(\u0026quot;packagename\u0026quot;) and brought into the namespace with using packagename. We also need an image to use, and for Antarctica I chose this very nice map of the ice surface:\ntaken from a BBC report. The nice thing about this map is that it shows the ice shelves, so that we can experiment with and without them. We start by reading the image, making it gray-scale, and thresholding it so as to remove the ice-shelves:\njulia\u0026gt; ant = load(\u0026#34;antarctica.jpg\u0026#34;); julia\u0026gt; G = Gray.(ant); julia\u0026gt; B = G .\u0026gt; 0.8; julia\u0026gt; imshow(B) which produces this:\nwhich as you see has removed the ice shelves. Now we apply the distance transform and find its maximum:\njulia\u0026gt; D = distance_transform(feature_transform(B)); julia\u0026gt; findmax(D) (116.48175822848829, CartesianIndex(214, 350)) This indicates that the largest distance from all edges is about 116, at pixel location (214,350). To show this circle on the image it\u0026rsquo;s easiest to simply plot the image, and then plot the circle on top of it. We\u0026rsquo;ll also plot the centre, as a smaller circle:\njulia\u0026gt; plot(ant,aspect_ratio = 1) julia\u0026gt; x(t) = 214 + 116*cos(t) julia\u0026gt; y(t) = 350 + 116*sin(t) julia\u0026gt; xc(t) = 214 + 5*cos(t) julia\u0026gt; yc(t) = 350 + 5*sin(t) julia\u0026gt; plot!(y,x,0,2*pi,lw=3,lc=\u0026#34;red\u0026#34;,leg=false) julia\u0026gt; plot!(yc,xc,0,2*pi,lw=2,lc=\u0026#34;white\u0026#34;,leg=false) This shows the Pole of Inaccessibility in terms of the actual Antarctic continent. However, in practical terms the ice shelves, even though not actually part of the landmass, need to be traversed just the same. 
To include the ice shelves, we just threshold at a higher value:\njulia\u0026gt; B1 = opening(G .\u0026gt; 0.95); julia\u0026gt; D1 = distance_transform(feature_transform(B1)); julia\u0026gt; findmax(D1) (160.22796260328596, CartesianIndex(225, 351)) The use of opening in the first line is to fill in any holes in the image: the distance transform is very sensitive to holes. Then we plot the continent again with the two circles as above, but using this new centre and radius. This produces:\nThe position of the Pole of Inaccessibility has not in fact changed all that much from the first one.\n","link":"https://numbersandshapes.net/posts/poles_of_inaccessibility/","section":"posts","tags":["image-processing","julia"],"title":"Poles of inaccessibility"},{"body":"I am not an analyst, so I find the sums of infinite series quite mysterious. For example, here are three. The first one is the value of \\(\\zeta(2)\\), very well known, sometimes called the \u0026ldquo;Basel Problem\u0026rdquo; and first determined by (of course) Euler: \\[ \\sum_{n=1}^\\infty\\frac{1}{n^2}=\\frac{\\pi^2}{6}. \\] Second, subtracting one from the denominator: \\[ \\sum_{n=2}^\\infty\\frac{1}{n^2-1}=\\frac{3}{4} \\] This sum is easily demonstrated by partial fractions: \\[ \\frac{1}{n^2-1}=\\frac{1}{2}\\left(\\frac{1}{n-1}-\\frac{1}{n+1}\\right) \\] and so the series can be expanded as: \\[ \\frac{1}{2}\\left(\\frac{1}{1}-\\frac{1}{3}+\\frac{1}{2}-\\frac{1}{4}+\\frac{1}{3}-\\frac{1}{5}+\\cdots\\right) \\] This is a telescoping series in which every term in the brackets is cancelled except for \\(1+1/2\\), which produces the sum immediately.\nFinally, add one to the denominator: \\[ \\sum_{n=1}^\\infty\\frac{1}{n^2+1}=\\frac{1}{2}(\\pi\\coth(\\pi)-1). 
\\] And this sum is obtained by putting \\(z=\\pi\\) in one of the series representations for \\(\\coth(z)\\):\n\\[ \\coth(z)=\\frac{1}{z}+2z\\sum_{n=1}^\\infty\\frac{1}{\\pi^2n^2+z^2} \\]\n(for all \\(z\\ne 0\\) except for when \\(\\pi^2n^2+z^2=0\\)).\nI was looking around for infinite series to give my numerical methods students to test their powers of approximation, and I came across this beauty: \\[ \\sum_{n=1}^\\infty\\frac{1}{n^2+n-1}=1+\\frac{\\pi}{\\sqrt{5}}\\tan\\left(\\frac{\\sqrt{5}\\pi}{2}\\right). \\] This led me on a mathematical treasure hunt through books and all over the internet, until I had worked it out.\nMy starting place, after googling \u0026ldquo;sum quadratic reciprocal\u0026rdquo;, was a very nice and detailed post on stackexchange. This post then referred to a previous one, which started with the infinite product expression for \\(\\sin(x)\\) and turned it (by taking logarithms and differentiating) into a series for \\(\\cot(x)\\).\nHowever, I want an expression for \\(\\tan(x)\\), which means starting with the infinite product form for \\(\\sec(x)\\), which is:\n\\[ \\sec(x)=\\prod_{n=1}^\\infty\\frac{\\pi^2(2n-1)^2}{\\pi^2(2n-1)^2-4x^2}. \\] Replacing \\(x\\) by \\(\\pi x/2\\) simplifies the expression in the product: \\[ \\sec\\left(\\frac{\\pi x}{2}\\right)=\\prod_{n=1}^\\infty\\frac{(2n-1)^2}{(2n-1)^2-x^2}. \\] Now take logs of both sides:\n\\[ \\log\\left(\\sec\\left(\\frac{\\pi x}{2}\\right)\\right)= \\sum_{n=1}^\\infty\\log\\left(\\frac{(2n-1)^2}{(2n-1)^2-x^2}\\right) \\]\nand differentiate: \\[ \\frac{\\pi}{2}\\tan\\left(\\frac{\\pi x}{2}\\right)= \\sum_{n=1}^\\infty\\frac{2x}{(2n-1)^2-x^2}. \\] Now we have to somehow equate this new sum on the right with our original sum. So let\u0026rsquo;s go back to it.\nFirst of all, a bit of completing the square produces \\[ \\frac{1}{n^2+n-1}=\\frac{1}{\\left(n+\\frac{1}{2}\\right)^2-\\frac{5}{4}}=\\frac{4}{(2n+1)^2-5}. 
\\] This means that \\[ \\sum_{n=1}^\\infty\\frac{1}{n^2+n-1}=\\sum_{n=2}^\\infty\\frac{4}{(2n-1)^2-5}= \\frac{2}{\\sqrt{5}}\\sum_{n=2}^\\infty\\frac{2\\sqrt{5}}{(2n-1)^2-5}. \\] We have changed the index from \\(n=1\\) to \\(n=2\\), which allows the rewriting of \\(2n+1\\) as \\(2n-1\\). This means we are missing a first term. Comparing the final sum with that for \\(\\tan(x)\\) above, taken at \\(x=\\sqrt{5}\\), we have \\[ \\sum_{n=1}^\\infty\\frac{1}{n^2+n-1}=\\frac{2}{\\sqrt{5}}\\left(\\frac{\\pi}{2}\\tan\\left(\\frac{\\pi \\sqrt{5}}{2}\\right)-\\frac{-\\sqrt{5}}{2}\\right) \\] where the last term is the missing first term: the summand for \\(n=1\\). Simplifying the right hand side produces \\[ \\sum_{n=1}^\\infty\\frac{1}{n^2+n-1}=1+\\frac{\\pi}{\\sqrt{5}}\\tan\\left(\\frac{\\sqrt{5}\\pi}{2}\\right). \\] Note that the above series for \\(\\tan(x)\\) can be obtained directly, using a general technique discussed (for example) in that fine old text: \u0026ldquo;A Course of Modern Analysis\u0026rdquo;, by E. T. Whittaker and G. N. Watson. If \\(f(x)\\) has only simple poles \\(a_n\\) with residues \\(b_n\\), then (under suitable boundedness conditions) \\[ f(x) = f(0)+\\sum_{n=1}^\\infty b_n\\left(\\frac{1}{x-a_n}+\\frac{1}{a_n}\\right). \\] Expressing a function as a series of such reciprocals is known as Mittag-Leffler\u0026rsquo;s theorem and in fact the series for \\(\\tan(x)\\) is given there as one of the examples.\n","link":"https://numbersandshapes.net/posts/an_interesting_sum/","section":"posts","tags":["mathematics","analysis"],"title":"An interesting sum"},{"body":"","link":"https://numbersandshapes.net/tags/analysis/","section":"tags","tags":null,"title":"Analysis"},{"body":"","link":"https://numbersandshapes.net/tags/geogebra/","section":"tags","tags":null,"title":"Geogebra"},{"body":"Runge\u0026rsquo;s phenomenon says roughly that a polynomial through equally spaced points over an interval will wobble a lot near the ends. 
Runge demonstrated this by fitting polynomials through equally spaced point in the interval \\([-1,1]\\) on the function \\[ \\frac{1}{1+25x^2} \\] and this function is now known as \u0026ldquo;Runge\u0026rsquo;s function\u0026rdquo;.\nIt turns out that Geogebra can illustrate this extremely well.\nEqually spaced vertices Either open up your local version of Geogebra, or go to http://geogebra.org/graphing. In the boxes on the left, enter the following expressions in turn:\nStart by entering Runge\u0026rsquo;s function \\[ f(x)=\\frac{1}{1+25x^2} \\] You should now either zoom in, or use the graph settings tool to display \\(x\\) between \\(-1.5\\) and \\(1.5\\). Create a list of \\(x\\) values: \\[ x1 = \\frac{\\{-5..5\\}}{5} \\] Use those values to create a set of points on the curve: \\[ p1 = (x1,f(x1)) \\] Now create an interpolating polynomial through them: \\[ \\mathsf{Polynomial}[p1(1)] \\] The resulting graph looks like this:\nChebyshev vertices For the purpose of this post, we\u0026rsquo;ll take the Chebyshev vertices to be those points in the graph whose \\(x\\) coordinates are given by\n\\[ x_k = \\cos\\left(\\frac{k\\pi}{10}\\right) \\] for \\(k = 0,1,2,\\ldots 10\\). These values are more clustered at the ends of the interval.\nIn Geogebra:\nAs before, enter the \\(x\\) values first: \\[ x2 = \\cos(\\frac{\\{0..10\\}\\cdot\\pi}{10}) \\] Then turn them into a sequence of points on the curve \\[ p2 = (x2,f(x2)) \\] Finally create the polynomial through them: \\(\\mathsf{Polynomial}(p2(1))\\). And this graph looks like this:\nYou\u0026rsquo;ll notice how better the second polynomial hugs the curve. The issue is even more pronounced with 21 points, either separated by \\(0.1\\), or with \\(x\\) values given by the cosine function again. 
All we need do is to change the definitions of the \\(x\\) value sequences \\(x1\\) and \\(x2\\) to:\n\\[ \\eqalign{ x1 \u0026amp;= \\frac{\\{-10..10\\}}{10}\\\\ x2 \u0026amp;= \\cos(\\frac{\\{0..20\\}\\pi}{20}) } \\]\nIn fact, you can create a slider \\(1\\le N \\le 20\\), say, and then define\n\\[ \\eqalign{ x1 \u0026amp;= \\frac{\\{-N..N\\}}{N}\\\\ x2 \u0026amp;= \\cos(\\frac{\\{0..2N\\}\\pi}{2N}) } \\]\nand then see how as \\(N\\) increases, the \u0026ldquo;Chebyshev\u0026rdquo; interpolant fits the curve better than the equally spaced interpolant. For \\(N=20\\), the turning points of the equally spaced polynomial have \\(y\\) values as high as \\(59.78\\).\nIntegration Using equally spaced values to create an interpolating polynomial and then integrating that polynomial is Newton-Cotes integration. Runge\u0026rsquo;s phenomenon shows why it is better to partition the interval into small sub-intervals and apply a low-order rule to each one. For example, with 20 points on the curve, we would be better applying Simpson\u0026rsquo;s rule to each pair of two sub-intervals, and adding the result. Using a 21-point polynomial is equivalent to a Newton-Cotes rule of order 20, which is far too inaccurate to use.\nWith our curve \\(f(x)\\), and our equal-spaced polynomial \\(g(x)\\), the integrals are\n\\[ \\eqalign{ \\int^1_{-1}\\frac{1}{1+25x^2}\\,dx\u0026amp;=\\frac{2}{5}\\arctan(5)\\approx 0.5493603067780064\\\\ \\int^1_{-1}g(x)\\,dx\u0026amp;\\approx -5.369910417304622 } \\]\nHowever, using the polynomial through the Chebyshev nodes:\n\\[ \\int^1_{-1}h(x)\\approx 0.5498082303389538. 
\\]\nThe absolute errors between the integral values and the exact values are thus (approximately) \\(5.92\\) and \\(0.00045\\) respectively.\nIntegrating an interpolating polynomial through Chebyshev nodes is one way of implementing Clenshaw-Curtis quadrature.\nNote that using Simpson\u0026rsquo;s rule on our 21 points produces a value of 0.5485816035037206, which has absolute error of about \\(0.0012\\).\n","link":"https://numbersandshapes.net/posts/runges_phenomenon_in_geogebra/","section":"posts","tags":["mathematics","computation","geogebra"],"title":"Runge's phenomenon in Geogebra"},{"body":"Introduction and the problem The SIR model for spread of disease was first proposed in 1927 in a collection of three articles in the Proceedings of the Royal Society by Anderson Gray McKendrick and William Ogilvy Kermack; the resulting theory is known as Kermack–McKendrick theory; now considered a subclass of a more general theory known as compartmental models in epidemiology. The three original articles were republished in 1991, in a special issue of the Bulletin of Mathematical Biology.\nThe SIR model is so named because it assumes a static population, so no births or deaths, divided into three mutually exclusive classes: those susceptible to the disease; those infected with the disease, and those recovered with immunity from the disease. This model is clearly not applicable to all possible epidemics: there may be births and deaths, people may be re-infected, and so on. 
More complex models take these and other factors into account.\nThe SIR model consists of three non-linear ordinary differential equations, parameterized by two growth factors \\(\\beta\\) and \\(\\gamma\\):\n\\begin{eqnarray*} \\frac{dS}{dt}\u0026amp;=\u0026amp;-\\frac{\\beta IS}{N}\\\\ \\frac{dI}{dt}\u0026amp;=\u0026amp;\\frac{\\beta IS}{N}-\\gamma I\\\\ \\frac{dR}{dt}\u0026amp;=\u0026amp;\\gamma I \\end{eqnarray*}\nHere \\(N\\) is the population, and since each of \\(S\\), \\(I\\) and \\(R\\) represent the number of people in mutually exclusive sets, we should have \\(S+I+R=N\\). Note that the right hand sides of the equations sum to zero, hence \\[ \\frac{dS}{dt}+\\frac{dI}{dt}+\\frac{dR}{dt}=\\frac{dN}{dt}=0 \\] which indicates that the population is constant.\nThe problem here is to see if values of \\(\\beta\\) and \\(\\gamma\\) can be found which will provide a close fit to a well-known epidemiological case study: that of influenza in a British boarding school. This was described in a 1978 issue of the British Medical Journal.\nThis should provide a good test of the SIR model, as it satisfies all of the criteria. In this case there was a total population of 763, and an outbreak of 14 days, with infected numbers as:\nDay: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Infections: 1 3 6 25 73 222 294 258 237 191 125 69 27 11 4 Some years ago, on my old (and now taken off line) blog, I explored using Julia for this task, which was managed easily (once I got the hang of Julia syntax). However, that was five years ago, and when I tried to recreate it I found that Julia has changed over the last five years, and my original comments and code no longer worked. 
So I decided to experiment with Python instead, in which I have more expertise, or at least, experience.\nUsing Python Setup and some initial computation We start by importing some of the modules and functions we need, and defining the data from the table above:\nimport matplotlib.pyplot as plt import numpy as np from scipy.integrate import solve_ivp data = [1, 3, 6, 25, 73, 222, 294, 258, 237, 191, 125, 69, 27, 11, 4] Now we enter the SIR system with some (randomly chosen) values of \\(\\beta\\) and \\(\\gamma\\), using syntax compatible with the solver solve_ivp:\nbeta,gamma = [0.01,0.1] def SIR(t,y): S = y[0] I = y[1] R = y[2] return([-beta*S*I, beta*S*I-gamma*I, gamma*I]) We can now solve this system using solve_ivp and plot the results:\nsol = solve_ivp(SIR,[0,14],[762,1,0],t_eval=np.arange(0,14.2,0.2)) fig = plt.figure(figsize=(12,4)) plt.plot(sol.t,sol.y[0]) plt.plot(sol.t,sol.y[1]) plt.plot(sol.t,sol.y[2]) plt.plot(np.arange(0,15),data,\u0026#34;k*:\u0026#34;) plt.grid(True) plt.legend([\u0026#34;Susceptible\u0026#34;,\u0026#34;Infected\u0026#34;,\u0026#34;Removed\u0026#34;,\u0026#34;Original Data\u0026#34;]) Note that we have used the t_eval argument in our call to solve_ivp which allows us to exactly specify the points at which the solution will be given. This will allow us to align points in the computed values of \\(I\\) with the original data.\nThe output is this plot:\n!First SIR plot\nThere is a nicer plot, with more attention paid to setup and colours, here. That use of scipy to solve the SIR equations uses the odeint tool. This works perfectly well, but I believe it is being deprecated in favour of solve_ivp.\nFitting the data To find values \\(\\beta\\) and \\(\\gamma\\) with a better fit to the data, we start by defining a function which gives the sum of squared differences between the data points, and the corresponding values of \\(I\\). 
The problem will then be to minimize that function.\ndef sumsq(p): beta, gamma = p def SIR(t,y): S = y[0] I = y[1] R = y[2] return([-beta*S*I, beta*S*I-gamma*I, gamma*I]) sol = solve_ivp(SIR,[0,14],[762,1,0],t_eval=np.arange(0,14.2,0.2)) return(sum((sol.y[1][::5]-data)**2)) As you see, we have just wrapped the SIR definition and its solution inside a calling function whose variables are the parameters of the SIR equations.\nTo minimize this function we need to import another Python function first:\nfrom scipy.optimize import minimize msol = minimize(sumsq,[0.001,1],method=\u0026#39;Nelder-Mead\u0026#39;) msol.x with output:\narray([0.00218035, 0.44553886]) To see if this does provide a better fit, we can simply run the solver with these values, and plot them, as we did at the beginning:\nbeta,gamma = msol.x def SIR(t,y): S = y[0] I = y[1] R = y[2] return([-beta*S*I, beta*S*I-gamma*I, gamma*I]) sol = solve_ivp(SIR,[0,14],[762,1,0],t_eval=np.arange(0,14.2,0.2)) fig = plt.figure(figsize=(10,4)) plt.plot(sol.t,sol.y[0],\u0026#34;b-\u0026#34;) plt.plot(sol.t,sol.y[1],\u0026#34;r-\u0026#34;) plt.plot(sol.t,sol.y[2],\u0026#34;g-\u0026#34;) plt.plot(np.arange(0,15),data,\u0026#34;k*:\u0026#34;) plt.legend([\u0026#34;Susceptible\u0026#34;,\u0026#34;Infected\u0026#34;,\u0026#34;Removed\u0026#34;,\u0026#34;Original Data\u0026#34;]) with output:\n!Second SIR plot\nand as you see, a remarkably close fit!\n","link":"https://numbersandshapes.net/posts/fitting_sir_to_data_in_python/","section":"posts","tags":["mathematics","computation","python"],"title":"Fitting the SIR model of disease to data in Python"},{"body":"So this goes back quite some time to the recent Australian Federal election on May 18. 
In my own electorate (known formally as a \u0026ldquo;Division\u0026rdquo;) of Cooper, the Greens, who until recently had been showing signs of winning the seat, were pretty well trounced by Labor.\nSome background asides First, \u0026ldquo;Labor\u0026rdquo; as in \u0026ldquo;Australian Labor Party\u0026rdquo; is spelled the American way; that is, without a \u0026ldquo;u\u0026rdquo;, even though \u0026ldquo;labour\u0026rdquo;, meaning work, is so spelled in Australian English. This is because much of Australia\u0026rsquo;s pre-federal political history has a large American influence; indeed one of the loudest political voices in the 19th century was King O\u0026rsquo;Malley who was born in 1858 in Kansas, and didn\u0026rsquo;t come to Australia until he was about 30. He was responsible, amongst other things, for selecting the site for Canberra (the nation\u0026rsquo;s capital) and for selecting Walter Burley Griffin as its architect. As well, the Australian Constitution shows a large American influence; the constitutions of both countries bear a remarkably close resemblance, and Australia\u0026rsquo;s parliament is modelled on the United States Congress.\nSecond, my electorate was formerly known as \u0026ldquo;Batman\u0026rdquo; after John Batman, the supposed founder of Melbourne. However, Batman was known in his lifetime as a contemptible figure, and the historical record well bears out the description of him as \u0026ldquo;the vilest man I have ever known\u0026rdquo;. He was responsible for the slaughter of indigenous peoples, and his so-called \u0026ldquo;treaty\u0026rdquo; with the people of the Kulin Nation in exchange for land on what would become Melbourne is considered invalid. 
Out of respect for the local peoples (\u0026ldquo;who have never ceded sovereignty\u0026rdquo;), the electorate was renamed last year in honour of William Cooper, a Yorta Yorta elder, a tireless activist for Aboriginal rights, and the only individual in the world to lodge a formal protest to a German embassy on the occasion of Kristallnacht.\nBack to mapping All I wanted to do was to map the size of Labor\u0026rsquo;s gains (over the Greens) between the last election in 2016 and this one, at each polling booth in the electorate. For this I used the following Python packages: matplotlib, pandas, geopandas, numpy, cartopy. The nice thing about Python, for me at least, is the ease of prototyping, and the availability of packages for just about everything. Indeed, for an amateur programmer like myself, one of the biggest difficulties is finding the right package for the job. There\u0026rsquo;s a score or more for GIS alone.\nAll information about the election can be downloaded from the Australian Electoral Commission tallyroom. And the GIS information can be obtained also from the AEC. The Victorian shapefile needs to be unzipped before using.\nThen the map set-up looks like this:\nimport matplotlib.pyplot as plt import numpy as np import pandas as pd import geopandas as gpd import cartopy.crs as ccrs from cartopy.io.img_tiles import GoogleTiles shp = gpd.read_file(\u0026#39;VicMaps/E_AUGFN3_region.shp\u0026#39;) cooper = shp.loc[shp[\u0026#39;Elect_div\u0026#39;]==\u0026#39;Cooper\u0026#39;] bounds = cooper.geometry.bounds bg = bounds.values[0] # [minx, miny, maxx, maxy]: longitude and latitude extent of region pad = 0.01 # padding for display extent = [bg[0]-pad,bg[2]+pad,bg[1]-pad,bg[3]+pad] The idea of the padding is simply to ensure that the map, once displayed, extends beyond the area of the electorate. The units are degrees of latitude and longitude. In Melbourne, at about 37.8 degrees south, 0.01 degrees latitude corresponds to about 1.11km, and 0.01 degrees longitude corresponds to about 0.88km.\nNow we need to determine the percentage of votes to Labor (on a two-party preferred computation) which again involves reading material from the AEC site. 
I downloaded it first, but it could also be read directly from the site into pandas.\n# Get all votes by polling booths in Cooper for 2019 b19 = pd.read_csv(\u0026#39;Elections/TCPByCandidateByPollingPlaceVIC-2019.csv\u0026#39;) # b19 = booths, 2019 v19 = b19.loc[b19.DivisionNm==\u0026#39;Cooper\u0026#39;] # v19 = Cooper Booths, 2019 v19 = v19[[\u0026#39;PollingPlace\u0026#39;,\u0026#39;PartyAb\u0026#39;,\u0026#39;OrdinaryVotes\u0026#39;]] v19r = v19.loc[v19.PartyAb==\u0026#39;GRN\u0026#39;] v19k = v19.loc[v19.PartyAb==\u0026#39;ALP\u0026#39;] v19c = v19r.merge(v19k,left_on=\u0026#39;PollingPlace\u0026#39;,right_on=\u0026#39;PollingPlace\u0026#39;) # Complete votes v19c[\u0026#39;Percent_x\u0026#39;] = (v19c[\u0026#39;OrdinaryVotes_x\u0026#39;]*100/(v19c[\u0026#39;OrdinaryVotes_x\u0026#39;]+v19c[\u0026#39;OrdinaryVotes_y\u0026#39;])).round(2) v19c[\u0026#39;Percent_y\u0026#39;] = (v19c[\u0026#39;OrdinaryVotes_y\u0026#39;]*100/(v19c[\u0026#39;OrdinaryVotes_x\u0026#39;]+v19c[\u0026#39;OrdinaryVotes_y\u0026#39;])).round(2) v19c = v19c.dropna() # note: suffix _x is GRN; suffix _y is ALP The next step is to determine the positions of all polling places. 
For simplification, I\u0026rsquo;m only interested in places used in both this most recent election, and the previous federal election in 2016:\nv16 = pd.read_csv(\u0026#39;Elections/Batman2016_TCP.csv\u0026#39;) v_both = v16.merge(v19c,right_on=\u0026#39;PollingPlace\u0026#39;,left_on=\u0026#39;Booth\u0026#39;) c19 = pd.read_csv(\u0026#39;Elections/Cooper_PollingPlaces.csv\u0026#39;) c19 = c19.drop(44).reset_index() # Drop Special Hospital, which has index 44 v_all=v_both.merge(c19,left_on=\u0026#39;PollingPlace\u0026#39;,right_on=\u0026#39;PollingPlaceNm\u0026#39;) lats = np.array(v_all[\u0026#39;Latitude\u0026#39;]) longs = np.array(v_all[\u0026#39;Longitude\u0026#39;]) booths = np.array(v_all[\u0026#39;PollingPlaceNm\u0026#39;]) diffs = np.array(v_all[\u0026#39;Percent_y\u0026#39;]-v_all[\u0026#39;ALP percent\u0026#39;]) # change in ALP percentage Having now got the boundary of the electorate, the positions of each polling booth, the percentage change in votes for Labor, the map can now be created and displayed:\nfig = plt.figure(figsize = (16,16)) tiler = GoogleTiles() ax = plt.axes(projection=tiler.crs) ax.set_extent(extent) ax.add_image(tiler, 13, interpolation=\u0026#39;hanning\u0026#39;) ax.add_geometries(cooper.geometry,crs=ccrs.PlateCarree(),facecolor=\u0026#39;none\u0026#39;,edgecolor=\u0026#39;k\u0026#39;,linewidth=2) for i in range(34): if diffs[i]\u0026gt;0: ax.plot(longs[i],lats[i],marker=\u0026#39;o\u0026#39;,markersize=diffs[i],markerfacecolor=\u0026#39;r\u0026#39;,transform=ccrs.Geodetic()) else: ax.plot(longs[i],lats[i],marker=\u0026#39;o\u0026#39;,markersize=-diffs[i],markerfacecolor=\u0026#39;g\u0026#39;,transform=ccrs.Geodetic()) plt.show() I found by trial and error that Hanning interpolation seemed to give the best results.\n!Swings 2019\nSo this image shows not the size of Labor\u0026rsquo;s vote, but the size of Labor\u0026rsquo;s gain since the previous election. 
The larger gains are in the southern part of the electorate: the northern part has always been more Labor friendly, and so the gains were smaller there.\n","link":"https://numbersandshapes.net/posts/mapping_voting_gains_between_elections/","section":"posts","tags":["voting","GIS","python"],"title":"Mapping voting gains between elections"},{"body":"Here is an interactive version of this diagram:\n[!APF Figure 4](/APF_figure_4.png)\n(click on the image to show a larger version.)\n","link":"https://numbersandshapes.net/posts/educational_disciplines/","section":"posts","tags":null,"title":"Educational disciplines: size against market growth"},{"body":"Here we show how a Tschirnhausen transformation can be used to solve a quartic equation. The steps are:\nEnsure the quartic is missing the cubic term, and its initial coefficient is 1. We can do this by first dividing by the initial coefficient to obtain an equation \\[ x^4+b_3x^3+b_2x^2+b_1x+b_0=0 \\] and then replacing the variable \\(x\\) with \\(y-b_3/ 4\\). This will produce a monic quartic equation missing the cubic term.\nWe can thus take \\[ x^4+bx^2+cx+d=0 \\] as a completely general formulation of the quartic equation.\nNow apply the Tschirnhausen transformation \\[ y = x^2+rx+s \\] using the resultant, which will produce a quartic equation in \\(y\\).\nChoose \\(r\\) and \\(s\\) so that the coefficients of the linear and cubic terms are zero. This will require solving a cubic equation.\nSubstitute those \\(r,s\\) values into the resultant, which will produce a biquadratic equation in \\(y\\): \\[ y^4+Ay^2+B=0. \\] This can be readily solved as it\u0026rsquo;s a quadratic in \\(y^2\\). Finally, for each value of \\(y\\), and using the \\(r,s\\) values, solve \\[ x^2+rx+s-y=0. 
\\] This will in fact produce eight values, of which four are the solution to the original quartic.\nAn example Consider the equation \\[ x^4-94x^2-480x-671=0 \\] which has solutions\n\\begin{aligned} x \u0026amp;= -2 \\, \\sqrt{5} \\pm \\sqrt{3} \\sqrt{9 -4\\, \\sqrt{5}},\\\\ x \u0026amp;= 2 \\, \\sqrt{5} \\pm \\sqrt{3} \\sqrt{9 + 4 \\, \\sqrt{5}} \\end{aligned}\nNote that these relatively nice solutions arise from the polynomial being factorizable in the number field \\(\\mathbb{Q}[\\sqrt{5}]\\). We can show this using Sagemath:\nN.\u0026lt;a\u0026gt; = NumberField(x^2-5) K.\u0026lt;x\u0026gt; = N[] factor(x^4 - 94*x^2 - 480*x - 671) \\[ (x^{2} - 4 a x - 12 a - 7) \\cdot (x^{2} + 4 a x + 12 a - 7) \\]\nWe shall continue to use Sagemath to perform all the dirty work; here\u0026rsquo;s how this solution works:\nvar(\u0026#39;x,y,r,s\u0026#39;) qx = x^4 - 94*x^2 - 480*x - 671 res = qx.resultant(y-x^2-r*x-s,x).poly(y) We now want to find the values of \\(r\\) and \\(s\\) to eliminate the linear and cubic terms. The cubic term is easy:\nres.coefficient(y,3) \\[ -4s-188 \\] and so\ns_sol = solve(res.coefficient(y,3),s,solution_dict=True) and we can substitute this into the linear coefficient:\nres.coefficient(y,1).subs(s_sol[0]).factor() \\[ -480(r+10)(r+8)(r+6). 
\\] In general the coefficient would not be as neatly factorizable as this, but we can still find the values of \\(r\\):\nr_sol = solve(res.coefficient(y,1).subs(s_sol[0]),r,solution_dict=True) We can choose any value we like; here let\u0026rsquo;s choose the first value and substitute it into the resultant from above, first creating a dictionary to hold the \\(r\\) and \\(s\\) values:\nrs = s_sol[0].copy() rs.update(r_sol[0]) rs \\[ \\lbrace s:-47,r:-8\\rbrace \\]\nres.subs(rs) \\[ y^4-256y^2+1024 \\]\ny_sol = solve(res.subs(rs),y,solution_dict=True) This will produce four values of \\(y\\), and for each one we solve the equation \\[ x^2+rx+s-y=0 \\] for \\(x\\):\nfor ys in y_sol: display(solve((x^2+r*x+s-y).subs(rs).subs(ys),x)) \\begin{aligned} x \u0026amp;= -\\sqrt{-4 \\, \\sqrt{2 \\, \\sqrt{15} + 8} + 63} + 4,\u0026amp; x \u0026amp;= \\sqrt{-4 \\, \\sqrt{2 \\, \\sqrt{15} + 8} + 63} + 4\\\\ x \u0026amp;= -\\sqrt{4 \\, \\sqrt{2 \\, \\sqrt{15} + 8} + 63} + 4,\u0026amp; x\u0026amp; = \\sqrt{4 \\,\\sqrt{2 \\, \\sqrt{15} + 8} + 63} + 4\\\\ x\u0026amp; = -\\sqrt{-4 \\, \\sqrt{-2 \\, \\sqrt{15} + 8} + 63} + 4,\u0026amp; x\u0026amp; = \\sqrt{-4 \\,\\sqrt{-2 \\, \\sqrt{15} + 8} + 63} + 4\\\\ x\u0026amp; = -\\sqrt{4 \\, \\sqrt{-2 \\, \\sqrt{15} + 8} + 63} + 4,\u0026amp; x\u0026amp; = \\sqrt{4 \\, \\sqrt{-2 \\, \\sqrt{15} + 8} + 63} + 4 \\end{aligned}\nWe can check these values to see which ones are actually correct. 
But to experiment, we can determine the minimal polynomial of each value given:\nfor ys in y_sol: s1 = solve((x^2+r*x+s-y).subs(rs).subs(ys),x,solution_dict=True) ql = [QQbar(z[x]).minpoly() for z in s1] display(ql) \\begin{aligned} \u0026amp;x^{4} - 94 x^{2} - 480 x - 671,\u0026amp;\u0026amp; x^{4} - 32 x^{3} + 290 x^{2} - 64 x - 6431\u0026amp;\\\\ \u0026amp;x^{4} - 94 x^{2} - 480 x - 671,\u0026amp;\u0026amp; x^{4} - 32 x^{3} + 290 x^{2} - 64 x - 6431\u0026amp;\\\\ \u0026amp;x^{4} - 32 x^{3} + 290 x^{2} - 64 x - 6431,\u0026amp;\u0026amp; x^{4} - 94 x^{2} - 480 x - 671\u0026amp;\\\\ \u0026amp;x^{4} - 94 x^{2} - 480 x - 671,\u0026amp;\u0026amp; x^{4} - 32 x^{3} + 290 x^{2} - 64 x - 6431\u0026amp; \\end{aligned}\nHalf of these are the original equation we tried to solve. And the others?\nqx.subs(x=8-x).expand() \\[ x^{4} - 32 x^{3} + 290 x^{2} - 64 x - 6431 \\] This is in fact what we should expect, from solving the equation \\[ x^2+rx+s-y=0 \\] If the roots are \\(x_1\\) and \\(x_2\\), then by Vieta\u0026rsquo;s formulas \\(x_1+x_2=-r=-(-8)=8\\). So whenever \\(x_1\\) is a root of the original quartic, its partner \\(x_2=8-x_1\\) is a root of the quartic obtained by replacing \\(x\\) with \\(8-x\\).\nFurther comments The trouble with this method is that it only works nicely on some equations. In general, the snarls of square, cube, and fourth roots become unwieldy very quickly. For example, consider the equation \\[ x^4+6x^2-60x+36=0 \\] which according to Cardan in Ars Magna (Chapter XXXIX, Problem V) was first solved by Ferrari.\nTaking the resultant with \\(y-x^2-rx-s\\) as a polynomial in \\(y\\), we find that the coefficient of \\(y^3\\) is \\(-4s+12\\), and so \\(s=3\\). Substituting this in the linear coefficient, we obtain this cubic in \\(r\\): \\[ 5r^3-9r^2-60r+300=0. 
\\] The simplest (real) solution is: \\[ r = \\frac{1}{5} \\, {\\left(50 \\, \\sqrt{3767} - 3273\\right)}^{\\frac{1}{3}} + \\frac{109}{5 \\, {\\left(50 \\, \\sqrt{3767} - 3273\\right)}^{\\frac{1}{3}}} + \\frac{3}{5} \\] Substituting these values of \\(r\\) and \\(s\\) into the resultant, we obtain the equation \\[ y^4+c_2y^2+c_0=0 \\] with \\[ c_2=\\frac{6 \\, {\\left({\\left(50 \\, \\sqrt{3767} - 3273\\right)}^{\\frac{2}{3}} {\\left(50 \\, \\sqrt{3767} - 18969\\right)} - {\\left(7200 \\, \\sqrt{3767} - 483193\\right)} {\\left(50 \\, \\sqrt{3767} - 3273\\right)}^{\\frac{1}{3}} + 100 \\, \\sqrt{3767} - 6546\\right)}}{25 \\, {\\left(50 \\, \\sqrt{3767} - 3273\\right)}} \\] and \\[ c_0=\\frac{27\\left(% \\begin{array}{l} 2\\,(14353657451700 \\, \\sqrt{3767} - 880869586608887)\\, (50 \\, \\sqrt{3767} - 3273)^{\\frac{2}{3}}\\\\ \\quad+109541 \\, (2077754350 \\, \\sqrt{3767} - 127532539917) (50 \\, \\sqrt{3767} - 3273)^{\\frac{1}{3}}\\quad{}\\\\ \\hspace{18ex} - 5262543802068000 \\, \\sqrt{3767} + 322980491997672634 \\end{array}% \\right)} {625 \\, {\\left(2077754350 \\, \\sqrt{3767} - 127532539917\\right)} {\\left(50 \\, \\sqrt{3767} - 3273\\right)}^{\\frac{1}{3}}} \\] Impressed?\nRight, so we solve the equation for \\(y\\), to obtain \\[ y=\\pm\\sqrt{-\\frac{1}{2}c_2\\pm\\frac{1}{2}\\sqrt{c_2^2-4c_0}}. 
\\] For each of those values of \\(y\\), we solve the equation \\[ x^2+rx+s-y=0 \\] to obtain (for example) \\[ x = -\\frac{1}{2} \\, r \\pm \\frac{1}{2} \\, \\sqrt{r^{2} - 4 \\, s + 4 \\, \\sqrt{-\\frac{1}{2} \\, c_{2} + \\frac{1}{2} \\, \\sqrt{c_{2}^{2} - 4 \\, c_{0}}}} \\] With \\(r\\) being the solution of a cubic equation, and \\(c_0\\), \\(c_2\\) being the appalling expressions above, you can see that this solution, while \u0026ldquo;true\u0026rdquo; in a logical sense, is hardly useful or enlightening.\nCardan again: \u0026ldquo;So progresses arithmetic subtlety, the end of which, it is said, is as refined as it is useless.\u0026rdquo;\n","link":"https://numbersandshapes.net/posts/tschirnhausens_transformations_quartic/","section":"posts","tags":["mathematics","algebra"],"title":"Tschirnhausen transformations and the quartic"},{"body":"A general cubic polynomial has the form \\[ ax^3+bx^2+cx+d \\] but a general cubic equation can have the form \\[ x^3+ax^2+bx+c=0. \\] We can always divide through by the coefficient of \\(x^3\\) (assuming it to be non-zero) to obtain a monic equation; that is, with leading coefficient of 1. We can now remove the \\(x^2\\) term by replacing \\(x\\) with \\(y-a/3\\): \\[ \\left(y-\\frac{a}{3}\\right)^{\\negmedspace 3}+a\\left(y-\\frac{a}{3}\\right)^{\\negmedspace 2} +b\\left(y-\\frac{a}{3}\\right)+c=0. \\] Expanding and simplifying produces \\[ y^3+\\left(b-\\frac{a^2}{3}\\right)y+\\frac{2}{27}a^3-\\frac{1}{3}ab+c=0. \\] In fact this can be simplified by writing the initial equation as \\[ x^3+3ax^2+bx+c=0 \\] and then substituting \\(x=y-a\\) to obtain \\[ y^3+(b-3a^2)y+(2a^3-ab+c)=0. \\] This means that in fact an equation of the form \\[ y^3+Ay+B=0 \\] is a completely general form of the cubic equation. 
Such a form of a cubic equation, missing the quadratic term, is known as a depressed cubic.\nWe could go even further by substituting \\[ y=z\\sqrt{A} \\] to obtain\n\\[ A^{3/ 2}z^3+A\\sqrt{A}z+B=0 \\]\nand dividing through by \\(A^{3/ 2}\\) to produce\n\\[ z^3+z+BA^{-3/ 2}=0. \\]\nThis means that\n\\[ z^3+z+W=0 \\]\nis also a perfectly general form for the cubic equation.\nCardan\u0026rsquo;s method Although this is named for Gerolamo Cardano (1501-1576), the method was in fact discovered by Niccolò Fontana (1500-1557), known as Tartaglia (\u0026ldquo;the stammerer\u0026rdquo;) on account of an injury received as a boy, when a soldier slashed his face. In the days before peer review and formal dissemination of ideas, any new mathematics was closely guarded: mathematicians would have public tests of skill, and a new solution method was invaluable. After assuring Tartaglia that his new method was safe with him, Cardan then proceeded to publish it as his own in his magisterial Ars Magna in 1545. A fascinating account of the mix of Cardan, Tartaglia, and several other egotistic mathematicians of the time, can be read here.\nCardan\u0026rsquo;s method solves the equation \\[ x^3-3ax-2b=0 \\] noting from above that this is a perfectly general form for the cubic, and where we have introduced factors of \\(-3\\) and \\(-2\\) to eliminate fractions later on. We start by assuming that the solution will have the form \\[ x=p^{1/ 3}+q^{1/ 3} \\] and so \\[ x^3=(p^{1/ 3}+q^{1/ 3})^3=p+3p^{2/ 3}q^{1/ 3}+3p^{1/ 3}q^{2/ 3}+q. \\] This last can be written as \\[ p+q+3p^{1/ 3}q^{1/ 3}(p^{1/ 3}+q^{1/ 3}). \\] We can thus write \\[ x^3=3p^{1/ 3}q^{1/ 3}x+p+q \\] and comparing with the initial cubic equation we have \\[ 3p^{1/ 3}q^{1/ 3}=3a,\\quad p+q=2b. \\] These can be written as \\[ pq=a^3,\\quad p+q=2b \\] for which the solutions are \\[ p,q=b\\pm\\sqrt{b^2-a^3} \\] and so \\[ x = (b+\\sqrt{b^2-a^3})^{1/ 3}+(b-\\sqrt{b^2-a^3})^{1/ 3}. 
\\] This can be written in various different ways.\nFor example, \\[ x^3-6x-6=0 \\] for which \\(a=2\\) and \\(b=3\\). Here \\(b^2-a^3=1\\) and so one solution is \\[ x=4^{1/ 3}+2^{1/ 3}. \\] Given that a cubic must have three solutions, the other two are \\[ \\omega p^{1/ 3}+\\omega^2 q^{1/ 3},\\quad \\omega^2 p^{1/ 3}+\\omega q^{1/ 3} \\] where \\(\\omega\\) is a non-real cube root of 1, for example \\[ \\omega=-\\frac{1}{2}+i\\frac{\\sqrt{3}}{2}. \\]\nAnd so to Tschirnhausen At the beginning we eliminated the \\(x^2\\) term from a cubic equation by a linear substitution \\(x=y-a/3\\) or \\(y=x+a/3\\). Writing in the year 1680, the German mathematician Ehrenfried Walther von Tschirnhausen (1651-1708) began experimenting with more general polynomial substitutions, believing that it would be possible to eliminate other terms at the same time. Such substitutions are now known as Tschirnhausen transformations and of course the modern general approach places them squarely within field theory.\nTschirnhausen was only partially correct: it is indeed possible to remove some terms from a polynomial equation, and in 1858 the English mathematician George Jerrard (1804-1863) showed that it was possible to remove the terms of degree \\(n-1\\), \\(n-2\\) and \\(n-3\\) from a polynomial of degree \\(n\\). In particular, the general quintic equation can be reduced to \\[ x^5+px+q=0 \\] which is known as the Bring-Jerrard form; also honouring Jerrard\u0026rsquo;s predecessor, the Swedish mathematician Erland Bring (1736-1798). Note that Jerrard was quite well aware of the work of Ruffini, Abel and Galois in proving the general unsolvability by radicals of the quintic equation.\nNeither Bring nor Tschirnhausen had the advantage of this knowledge, and both were working towards a general solution of the quintic.\nHappily, Tschirnhausen\u0026rsquo;s work is available in an English translation, published in the ACM SIGSAM Bulletin by R. F. Green in 2003. 
For further delight, Jerrard\u0026rsquo;s text, with the splendidly formal English title \u0026ldquo;An Essay on the Resolution of Equations\u0026rdquo;, is also available online.\nAfter that history lesson, let\u0026rsquo;s explore how to remove both the quadratic and linear terms from a cubic equation using Tschirnhausen\u0026rsquo;s method, and also using SageMath to do the heavy algebraic work. There is in fact nothing particularly conceptually difficult, but the algebra is quite messy and fiddly.\nWe start with a depressed cubic equation \\[ x^3+3ax+2b=0 \\] and we will use the Tschirnhausen transformation \\[ y=x^2+rx+s. \\]\nThis can be done by hand of course, using a somewhat fiddly argument, but for us the best approach is to compute the resultant of the two polynomials, which is a polynomial expression equal to zero if the two polynomials have a common root. The resultant can be computed as the determinant of the Sylvester matrix (named for its discoverer); but we can simply use SageMath:\nvar(\u0026#39;a,b,c,x,y,r,s\u0026#39;) cb = x^3 + 3*a*x + 2*b res = cb.resultant(y-x^2-r*x-s,x).poly(y) res \\[ \\displaylines{ y^3+3(2a-s)y^{2}+3(ar^{2}+3a^{2}+2br-4as+s^{2})y\\\\ {\\ }\\mspace4em -4b^{2}+2br^{3}-3ar^{2}s+6abr-9a^{2}s-6brs+6as^{2}-s^{3} } \\]\nNow we find values of \\(r\\) and \\(s\\) for which the coefficients of \\(y^2\\) and \\(y\\) will be zero:\nsol = solve([res.coefficient(y,1),res.coefficient(y,2)],[r,s],solution_dict=True) sol \\[ \\left[\\left\\lbrace s : 2 \\, a, r : -\\frac{b + \\sqrt{a^{3} + b^{2}}}{a}\\right\\rbrace, \\left\\lbrace s : 2 \\, a, r : -\\frac{b - \\sqrt{a^{3} + b^{2}}}{a}\\right\\rbrace\\right] \\]\nWe can now substitute say the second solution into the resultant from above, which should produce an expression of the form \\(y^3+A\\):\ncby = res.subs(sol[1]).canonicalize_radical().poly(y) cby \\[ y^3-8 \\, a^{3} - 16 \\, b^{2} + 8 \\, \\sqrt{a^{3} + b^{2}} b - \\frac{8 \\, b^{4}}{a^{3}} + \\frac{8 \\, \\sqrt{a^{3} + b^{2}} 
b^{3}}{a^{3}} \\] We can simply take the cube root of the constant term as our solution:\nsol_y = solve(cby,y,solution_dict=True) sol_y[2] \\[ \\left\\lbrace y : \\frac{2 \\, {\\left(a^{6} + 2 \\, a^{3} b^{2} - \\sqrt{a^{3} + b^{2}} a^{3} b + b^{4} - \\sqrt{a^{3} + b^{2}} b^{3}\\right)}^{\\frac{1}{3}}}{a}\\right\\rbrace \\]\nNow we solve the equation \\(y=x^2+rx+s\\) using the values \\(r\\) and \\(s\\) from above, and the value of \\(y\\) just obtained:\neq = x^2+r*x+s-y eqrs = eq.subs(sol[1]) eqx = eqrs.subs(sol_y[2]) solx = solve(eqx,x,solution_dict=True) solx[0] \\[ \\left\\lbrace x : \\frac{b - \\sqrt{a^{3} + b^{2}} - \\sqrt{-7 \\, a^{3} + 2 \\, b^{2} - 2 \\, \\sqrt{a^{3} + b^{2}} b + 8 \\, {\\left(a^{6} + 2 \\, a^{3} b^{2} + b^{4} - {\\left(a^{3} b + b^{3}\\right)} \\sqrt{a^{3} + b^{2}}\\right)}^{\\frac{1}{3}} a}}{2 \\, a}\\right\\rbrace \\]\nAn equation of, err\u0026hellip; rare beauty, or if not beauty, then something else. It certainly lacks the elegant simplicity of Cardan\u0026rsquo;s solution. On the other hand, the method can be applied to quartic (and quintic) equations, which Cardan\u0026rsquo;s solution can\u0026rsquo;t.\nFinally, let\u0026rsquo;s test this formula, again on the equation \\(x^3-6x-6=0\\), for which \\(a=-2\\) and \\(b=-3\\):\nxs = solx[0][x].subs({a:-2, b:-3}) xs \\[ \\frac{1}{4} \\, \\sqrt{-16 \\cdot 4^{\\frac{1}{3}} + 80} + 1 \\]\nThis can clearly be simplified to\n\\[ 1+\\sqrt{5-4^{1/ 3}} \\] It certainly looks different from Cardan\u0026rsquo;s result, but watch this:\nxt = QQbar(xs) xt.radical_expression() \\[ \\frac{1}{2}4^{2/ 3}+4^{1/ 3} \\]\nwhich is Cardan\u0026rsquo;s result, only very slightly rewritten. And finally:\nxt.minpoly() \\[ x^3-6x-6 \\]\n","link":"https://numbersandshapes.net/posts/tschirnhausens_solution_of_the_cubic/","section":"posts","tags":["mathematics","algebra"],"title":"Tschirnhausen's solution of the cubic"},{"body":"The date January 26 is one of immense current debate in Australia. 
Officially it\u0026rsquo;s the date of Australia Day, which supposedly celebrates the founding of Australia. To Aboriginal peoples it is a day of deep mourning and sadness, as the date commemorates over two centuries of oppression, bloodshed, and dispossession. To them and their many supporters, January 26 is Invasion Day.\nThe date commemorates the landing in 1788 of Arthur Phillip, in charge of the First Fleet and the first Governor of the colony of New South Wales.\nThe trouble is that \u0026ldquo;Australia\u0026rdquo; means two things: the island continent, and the country. The country didn\u0026rsquo;t exist until Federation on January 1, 1901; before which time the land since 1788 was subdivided into independent colonies. Many people believe that Australia Day would be better moved to January 1; the trouble with that is that it\u0026rsquo;s already a public holiday, and apparently you can\u0026rsquo;t have a national day that doesn\u0026rsquo;t have its own public holiday. And many other dates have been proposed.\nMy own preferred date is June 3; this is the date of the High Court \u0026ldquo;Mabo\u0026rdquo; decision in 1992 which formally recognized native title and rejected the doctrine of terra nullius under which the British invaded.\nThat the continent was invaded rather than settled is well established: all serious historians take this view, and it can be backed up with legal arguments. The Aboriginal peoples, numbering maybe close to a million in 1788, had mastered the difficult continent and all of its many ecosystems, and had done so for around 80,000 years. Aboriginal culture is the oldest continually maintained culture on the planet, and by an enormous margin.\nArthur Phillip did in fact arrive with formal instructions to create \u0026ldquo;amity\u0026rdquo; with the \u0026ldquo;natives\u0026rdquo; and indeed to live in \u0026ldquo;kindness\u0026rdquo; with them, but this soon went downhill. 
Although Phillip himself seems to have been a man of rare understanding for his time (when speared in the shoulder, for example, he refused to let his soldiers retaliate), he was no match for the many convicts and soldiers under his rule. When he retired back to England in 1792 the colony was ruled by a series of weak and ineffective governors, and in particular by the military, culminating in the governorship of Lachlan Macquarie who is seen as a mass murderer of Aboriginal peoples, although the evidence is not clear-cut. What is clear is that mass murders of Aboriginal peoples were common and indiscriminate, and often with appalling cruelty. On many occasions large groups were poisoned with strychnine: this works by affecting the nerves which control muscle movement, so that the body goes into agonizing spasms resulting in death by asphyxiation. Strychnine is considered by toxicologists to be one of the most painful acting of all poisons. Even though Macquarie himself ordered retribution only after \u0026ldquo;resistance\u0026rdquo;; groups considered harmless, or consisting only of old men, women and children, were brutally murdered.\nPeople were routinely killed by gunfire, or by being hacked to death; there is at least one report of a crying baby - the only survivor of a massacre - being thrown onto the fire made to burn the victims.\nMany more died of disease: smallpox and tuberculosis were responsible for deaths of over 50% of Aboriginal peoples. Their numbers today are thus tiny, and as in the past they are still marginalized.\nOnly recently has this harrowing part of Australia\u0026rsquo;s past been formally researched; the casual nature of the massacres meant that many were not recorded, and it has taken a great deal of time and work to uncover their details. This work has been headed by Professor Lyndall Ryan at the University of Newcastle. 
The painstaking and careful work by her team has unearthed much detail, and their results are available at their site Colonial Frontier Massacres in Central and Eastern Australia 1788-1930.\nAs a January 26 exercise I decided to rework one of their maps, producing a single map which would show the sites of massacres by markers whose size is proportional to the number of people killed. This turned out to be quite easy using Python and its folium library, but naturally it took me a long time to get it right.\nI started by downloading the timeline from the Newcastle site as a CSV file, and going through each massacre adding its location. The project historians point out that the locations are deliberately vague. Sometimes this is because of the vagueness of the historical record; but also (from the Introduction):\nIn order to protect the sites from desecration, and respect for the wishes of Aboriginal communities to observe the site as a place of mourning, the points have been made purposefully imprecise by rounding coordinates to 3 digits, meaning the point is precise only to around 250m.\nGiven the database, the Python commands were:\nimport folium import pandas as pd mass = pd.read_csv(\u0026#39;massacres.csv\u0026#39;) a = folium.Map(location=[-27,134],width=1000, height=1000,tiles=\u0026#39;OpenStreetMap\u0026#39;,zoom_start=4.5) for i in range(0,len(mass)): number_killed = mass.iloc[i][\u0026#39;Estimated Aboriginal People Killed\u0026#39;] folium.Circle( location=[float(mass.iloc[i][\u0026#39;Lat\u0026#39;]), float(mass.iloc[i][\u0026#39;Long\u0026#39;])], tooltip=mass.iloc[i][\u0026#39;Location\u0026#39;]+\u0026#39;: \u0026#39;+str(number_killed), radius=int(number_killed)*150, color=\u0026#39;goldenrod\u0026#39;, fill=True, fill_color=\u0026#39;gold\u0026#39; ).add_to(a) a.save(\u0026#34;massacres.html\u0026#34;) The result is shown below. 
You can zoom in and out, and hovering over a massacre site will show the location and number of people murdered.\nThe research is ongoing and this data is incomplete.\n\u0026lt;iframe seamless src=\u0026quot;/massacres.html\u0026quot; width=\u0026quot;1000\u0026quot; height=\u0026quot;1000\u0026quot;\u0026gt;\u0026lt;/iframe\u0026gt;\nThe data was extracted from: Ryan, Lyndall; Richards, Jonathan; Pascoe, William; Debenham, Jennifer; Anders, Robert J; Brown, Mark; Smith, Robyn; Price, Daniel; Newley, Jack Colonial Frontier Massacres in Eastern Australia 1788 – 1872, v2.0 Newcastle: University of Newcastle, 2017, http://hdl.handle.net/1959.13/1340762 (accessed 08/02/2019). This project has been funded by the Australian Research Council (ARC).\nNote finally that Professor Ryan and team have defined a massacre to be a killing of at least six people. Thus we can assume there are many other killings of five or fewer people which are not yet properly documented, or more likely shall never be known. A shameful history indeed.\n","link":"https://numbersandshapes.net/posts/colonial_massacres/","section":"posts","tags":["history","GIS","python"],"title":"Colonial massacres, 1794 to 1928"},{"body":"","link":"https://numbersandshapes.net/tags/history/","section":"tags","tags":null,"title":"History"},{"body":"Recently we have seen senators behaving in ways that seem stupid, or contrary to accepted public opinion. And then people will start jumping up and down and complaining that such a senator only got a tiny number of first preference votes. One commentator said that one senator, with 19 first preference votes, \u0026ldquo;couldn’t muster more than 19 members of his extended family to vote for him\u0026rdquo;. This displays an ignorance of how senate counting works. 
In fact first preference votes are almost completely irrelevant; or at least, far less relevant than they are in the lower house.\nSenate counting works on a proportional system, where multiple candidates are elected from the same group of ballots. This is different from the lower house (the House of Representatives federally) where only one person is elected. For the lower house, first preference votes are indeed much more important. As in the lower house, senate voting is preferential: voters number their preferred candidates starting with 1 for their most preferred, and so on (but see below).\nA full explanation is given by the Australian Electoral Commission on their Senate Counting page; this blog post will run through a very simple example to demonstrate how a senator can be elected with a tiny number of first preference votes.\nAn aside on micro parties and voting One problem in Australia is the proliferation of micro parties, many of which hold racist, anti-immigration, or hard-line religious views, or which in some other way represent only a tiny minority of the electorate. The problem is just as bad at State level; in my own state of Victoria we have the Shooters, Fishers and Farmers Party, the Aussie Battlers Party, and the Transport Matters Party (who represent taxi drivers) to name but three. This has the effect that the number of candidates standing for senate election has become huge, and the senate ballot papers absurdly unwieldy:\nInitially the law required voters to number every box starting from 1: on a large paper this would mean numbering carefully from 1 up to at least 96 in one recent election. To save this trouble (and most Australian voters are nothing if not lazy), \u0026ldquo;above the line voting\u0026rdquo; was introduced. 
This gave voters the option to put just a single \u0026ldquo;1\u0026rdquo; in the box representing the party of choice: you will see from the image above that the ballot paper is divided: the columns represent all the parties; the boxes below the line represent all the candidates from that party, and the single box above just the party name. Here is a close up of a NSW senate ballot:\nAlmost all voters willingly took advantage of that and voted above the line. The trouble is then that voters have no control over where their preferences go: that is handled by the parties themselves. By law, all parties must make their preferences available before the election, and they are published on the site of the relevant Electoral Commission. But the only people who carefully check this site and the party\u0026rsquo;s preferences are the sort of people who would carefully number each box below the line anyway. Most people don\u0026rsquo;t care enough to be bothered.\nThis enables all the micro-parties to make \u0026ldquo;preference deals\u0026rdquo;; in effect they act as one large bloc, ensuring that at least some of them get a senate seat. This has been handled by a so-called \u0026ldquo;preference whisperer\u0026rdquo;.\nThe current system in the state of Victoria has been to encourage voting below the line by allowing, instead of all boxes to be numbered, at least six. And there are strong calls for voting above the line to be abolished.\nA simple example To show how senate counting works, we suppose an electorate of 100 people, and three senators to be elected from five candidates. We also suppose that every ballot paper has been numbered from 1 to 5 indicating each voter\u0026rsquo;s preferences.\nBefore the counting begins we need to determine the number of votes each candidate must amass to be elected: this is chosen as the smallest number of votes for which no more candidates can be elected. 
If there are \\(n\\) formal votes cast, and \\(k\\) senators to be elected, that number is clearly\n\\[\\left\\lfloor\\frac{n}{k+1}\\right\\rfloor + 1.\\]\nThis value is known as the Droop quota. In our example, this quota is\n\\[ \\left\\lfloor\\frac{100}{3+1}\\right\\rfloor +1 = 26. \\]\nYou can see that it is not possible for four candidates to obtain this value.\nSuppose that the ballots are distributed as follows, where the numbers under the candidates indicate the preferences cast:\nNumber of votes A B C D E 20 1 2 3 4 5 20 1 5 4 3 2 40 2 1 5 4 3 5 2 3 5 1 4 4 4 3 1 2 5 1 2 3 4 5 1 Counting first preferences produces:\nCandidate First Prefs A 40 B 40 C 4 D 5 E 1 The first step in the counting is to determine if any candidate has amassed first preference votes equal to or greater than the Droop quota. In the example just given, both A and B have 40 first preferences each, so they are both elected.\nSince only 26 votes are needed for election, for each of A and B there are 14 votes remaining which can be passed on to other candidates according to the voting preferences. Which votes are passed on? For B it doesn\u0026rsquo;t matter, but which votes do we deem surplus for A? The Australian choice is to pass on all votes, but at a reduced value known as the transfer value. This value is simply the fraction of surplus votes over total votes; in our case it is\n\\[\\frac{14}{40}=0.35\\]\nfor each of A and B.\nLooking at the first line of votes: the next highest preference from A to a non-elected candidate is C, so C gets 0.35 of those 20 votes. From the second line, E gets 0.35 of those 20 votes. 
From the third line, E gets 0.35 of all 40 votes.\nThe votes now allocated to the remaining candidates are as follows:\nC: \\(4 + 0.35\\times 20 = 11\\)\nD: 5\nE: \\(1 + 0.35\\times 20 + 0.35\\times 40 = 22\\)\nAt this stage no candidate has amassed a quota, so the lowest ranked candidate in the counting is eliminated - in this case D - and all of those votes are passed on to the highest candidate (of those that are left, which are now only C and E) in those preferences, which is E. This produces:\nC: 11\nE: \\(22 + 5 = 27\\)\nwhich means E has achieved the quota and thus is elected.\nThis is of course a very artificial example, but it shows two things:\nHow a candidate with a very small number of first preference votes can still be elected: in this case E had the lowest number of first preference votes. The importance of preferences. So let\u0026rsquo;s have no more complaining about the low number of first preference votes in a senate count. In a lower house count, sure, the candidate with the fewest first preference votes is eliminated, but in a senate count such a candidate might amass votes (or reduced values of votes) in such a way as to achieve the quota.\n","link":"https://numbersandshapes.net/posts/vote_counting_in_australian_senate/","section":"posts","tags":["voting"],"title":"Vote counting in the Australian Senate"},{"body":"This evening I saw the Australian Brandenburg Orchestra with guest soloist Lixsania Fernandez, a virtuoso player of the viola da gamba, from Cuba. (Although she studied, and now lives, in Spain.) Lixsania is quite amazing: tall, statuesque, quite absurdly beautiful, and plays with a technique that encompasses the wildest of baroque extravagances as well as the most delicate and refined tenderness.\nThe trouble with the viol, being a fairly soft instrument, is that it\u0026rsquo;s not well suited to a large concert hall. 
This means that it\u0026rsquo;s almost impossible to get any sort of balance between it and the other instruments. Violins, for example, even if played softly, can overpower it.\nThomas Mace, in his \u0026ldquo;Musick\u0026rsquo;s Monument\u0026rdquo;, published in 1676, complained vigorously about violins:\nMace has been described as a \u0026ldquo;conservative old git\u0026rdquo; which he certainly was, but I do love the idea of this last hold-out against the \u0026ldquo;High-Priz\u0026rsquo;d Noise\u0026rdquo; of the violin. And I can see his point!\nBut back to Lixsania. The concert started with a \u0026ldquo;pastiche\u0026rdquo; of La Folia, taking in parts of Corelli\u0026rsquo;s well known set for solo violin, Vivaldi\u0026rsquo;s for two, Scarlatti for harpsichord, and of course Marin Marais \u0026ldquo;32 couplets de Folies\u0026rdquo; from his second book of viol pieces. The Australian Brandenburgs have a nice line in stagecraft, and this started with a dark stage with only Lixsania lit, playing some wonderful arpeggiated figurations over all the strings, with a bowing of utter perfection. I was sitting side on to her position here, and I could see with what ease she moved over the fingerboard - the mark of a true master of their instrument - being totally at one with it. Little by little other instrumentalists crept in: a violinist here and there, Paul Dyer (leader of the orchestra) to the harpsichord, cellists and a bassist, until there was a sizable group on stage all playing madly. I thought it was just wonderful.\nFor this first piece Lixsania was wearing a black outfit with long and full skirts and sort of halter top which left her arms, sides and back bare. This meant I had an excellent view of her rib-cage, which was a first for me in a concert.\nThe second piece was the 12th concerto, the so called \u0026ldquo;Harmonic Labyrinth\u0026rdquo; from Locatelli\u0026rsquo;s opus 3. 
These concertos contain, in their first and last movements, a wild \u0026ldquo;capriccio\u0026rdquo; for solo violin. This twelfth concerto contains capricci of such superhuman difficulty that even now, nearly 300 years after they were composed, they still stand at the peak of virtuosity. The Orchestra\u0026rsquo;s concertmaster, Shaun Lee-Chen, was however well up to the challenge, and powered his way through both capricci with the audience hardly daring to breathe. Even though conventional concert behaviour does not include applause after individual movements, so excited was the audience that there was an outburst of clapping after the first movement. And quite right too.\nThe final piece of the first half was a Vivaldi concerto for two violins and cello, the cello part being taken by Lixsania on viol. I felt this didn\u0026rsquo;t come across so well; the viol really couldn\u0026rsquo;t be heard much, and you really do need the strength of the cello to make much sense of the music. However, it did give Lixsania some more stage-time.\nAfter interval we were treated to a concerto for viol by Johann Gottlieb Graun, a court composer to Frederick the Great of Prussia. Graun wrote five concertos for the instrument - all monumentally difficult to play - which have been recorded several times. However, a sixth one has recently been unearthed in manuscript - and apparently we were hearing it for the first time in this concert series. The softness of the viol in the largeness of the hall meant that it was not always easy to hear: I solved that by closing my eyes, so I could focus on the sound alone. Lixsania played, as you would imagine, as though she owned it, and its formidable technical difficulties simply melted away under the total assurance of her fingers. 
She\u0026rsquo;d changed into a yellow outfit for this second half, and all the male players were wearing yellow ties.\nThen came a short Vivaldi sinfonia - a quite remarkable piece; very stately and with shifting harmonies that gave it a surprisingly modern feel. Just when you think Vivaldi is mainly about pot-boilers, he gives you something like this. Short, but superb.\nFinally, the fourth movement of a concerto written in 2001 for two viols by \u0026ldquo;Renato Duchiffre\u0026rdquo; (the pen name of René Schiffer, cellist and violist with Apollo\u0026rsquo;s Fire): a Tango. Now my exposure to tangos has mainly been through that arch-bore Astor Piazzolla. But this tango was magnificent. The other violist was Anthea Cottee, of whom I\u0026rsquo;d never heard, but she\u0026rsquo;s no mean player. She and Lixsania made a fine pair, playing like demons, complementing each other and happily grinning at some of the finer passages. One of the many likeable characteristics of Lixsania is that she seems to really enjoy playing, and smiles a lot - I hate that convention of players who adopt a poker-face. And she has a great smile.\nIn fact the whole orchestra has a wonderful enjoyment about them, led by Paul Dyer who displays a lovely dynamism at the harpsichord. Not for him the expressionless sitting still; he will leap up if given half an opportunity and conduct a passage with whichever hand is free; sometimes he would play standing and sort of conduct with his body; between him and Lixsania there was a chemistry of heart and mind, both leaning towards each other, as if inspiring each other to reach higher musical heights. This was one of the most delightful displays of communicative musicianship I\u0026rsquo;ve ever seen.\nNaturally there had to be an encore: and it was Lixsania singing a Cuban lullaby, accompanying herself by plucking the viol - which was stood on a chair for easier access - with Anthea Cottee providing a bowed accompaniment. 
Lixsania told us (of course she speaks English fluently, with a charming Cuban accent) that it was a lullaby of special significance, as it was the first song she\u0026rsquo;d ever sung to her son. There\u0026rsquo;s no reason why instrumentalists should be able to sing well, but in fact Lixsania has a lovely, rich, warm, enveloping sort of voice, and the effect was breathtakingly lovely. Lucky son!\nThis was a great concert.\n","link":"https://numbersandshapes.net/posts/lixsania_and_labyrinth/","section":"posts","tags":["music"],"title":"Concert review: Lixsania and the Labyrinth"},{"body":"","link":"https://numbersandshapes.net/tags/music/","section":"tags","tags":null,"title":"Music"},{"body":"Here\u0026rsquo;s an example of a transportation problem, with information given as a table:\nDemands 300 360 280 340 220 750 100 150 200 140 35 Supplies\u0026nbsp; 400 50 70 80 65 80 350 40 90 100 150 130 This is an example of a balanced, non-degenerate transportation problem. It is balanced since the sum of supplies equals the sum of demands, and it is non-degenerate as there is no proper subset of supplies whose sum is equal to that of a proper subset of demands. That is, there are no balanced \u0026ldquo;sub-problems\u0026rdquo;.\nIn such a problem, the array values may be considered to be the costs of transporting one object from a supplier to a demand. (In the version of the problem I pose to my students it\u0026rsquo;s cars between distributors and car-yards; in another version it\u0026rsquo;s tubs of ice-cream between dairies and supermarkets.) 
The idea of course is to move all objects from supplies to demands while minimizing the total cost.\nThis is a standard linear optimization problem, and it can be solved by any method used to solve such problems, although generally specialized methods are used.\nBut the intention here is to show how easily this problem can be managed using PyMathProg (and with numpy, for the simple use of printing an array):\nimport pymprog as py import numpy as np py.begin(\u0026#39;transport\u0026#39;) M = range(3) # number of rows and columns N = range(5) A = py.iprod(M,N) # Cartesian product x = py.var(\u0026#39;x\u0026#39;, A, kind=int) # all the decision variables are integers costs = [[100,150,200,140,35],[50,70,80,65,80],[40,90,100,150,130]] supplies = [750,400,350] demands = [300,360,280,340,220] py.minimize(sum(costs[i][j]*x[i,j] for i,j in A)) # the total sum in each row must equal the supplies for k in M: sum(x[k,j] for j in N)==supplies[k] # the total sum in each column must equal the demands for k in N: sum(x[i,k] for i in M)==demands[k] py.solve() print(\u0026#39;\\nMinimum cost: \u0026#39;,py.vobj()) A = np.array([[x[i,j].primal for j in N] for i in M]) print(\u0026#39;\\n\u0026#39;) print(A) print(\u0026#39;\\n\u0026#39;) #py.sensitivity() py.end() with solution:\nGLPK Simplex Optimizer, v4.65 8 rows, 15 columns, 30 non-zeros 0: obj = 0.000000000e+00 inf = 3.000e+03 (8) 7: obj = 1.789500000e+05 inf = 0.000e+00 (0) * 12: obj = 1.311000000e+05 inf = 0.000e+00 (0) OPTIMAL LP SOLUTION FOUND GLPK Integer Optimizer, v4.65 8 rows, 15 columns, 30 non-zeros 15 integer variables, none of which are binary Integer optimization begins... Long-step dual simplex will be used + 12: mip = not found yet \u0026gt;= -inf (1; 0) + 12: \u0026gt;\u0026gt;\u0026gt;\u0026gt;\u0026gt; 1.311000000e+05 \u0026gt;= 1.311000000e+05 0.0% (1; 0) + 12: mip = 1.311000000e+05 \u0026gt;= tree is empty 0.0% (0; 1) INTEGER OPTIMAL SOLUTION FOUND Minimum cost: 131100.0 [[ 0. 190. 0. 340. 220.] [ 0. 120. 
280. 0. 0.] [300. 50. 0. 0. 0.]] As you see, the definition of the problem in Python is very straightforward.\n","link":"https://numbersandshapes.net/posts/linear_programming_in_python_2/","section":"posts","tags":["linear-programming","python"],"title":"Linear programming in Python (2)"},{"body":"For my elementary linear programming subject, the students (who are all pre-service teachers) use Excel and its Solver as the computational tool of choice. We do this for several reasons: Excel is software with which they\u0026rsquo;re likely to have had some experience, also it\u0026rsquo;s used in schools; it also means we don\u0026rsquo;t have to spend time and mental energy getting to grips with new and unfamiliar software. And indeed the mandated curriculum includes computer exploration, using either Excel Solver, or the Wolfram Alpha Linear Programming widget.\nThis is all very well, but I balk at the reliance on commercial software, no matter how widely used it may be. And for my own exploration I\u0026rsquo;ve been looking for an open-source equivalent.\nIn fact there are plenty of linear programming tools and libraries; two of the most popular open-source ones are:\nThe GNU Linear Programming Kit, GLPK Coin-or Linear Programming, Clp There\u0026rsquo;s a huge list on wikipedia which includes open-source and proprietary software.\nFor pretty much any language you care to name, somebody has taken either GLPK or Clp (or both) and produced a language API for it. For Python there\u0026rsquo;s PuLP; for Julia there\u0026rsquo;s JuMP; for Octave there\u0026rsquo;s the `glpk` command, and so on. Most of the API\u0026rsquo;s include methods of calling other solvers, if you have them available.\nHowever not all of these are well documented, and in particular some of them don\u0026rsquo;t allow sensitivity analysis: computing shadow prices, or ranges of the objective coefficients. 
I discovered that JuMP doesn\u0026rsquo;t yet support this - although to be fair sensitivity analysis does depend on the problem being solved, and the solver being used.\nBeing a Python aficionado, I thought I\u0026rsquo;d check out some Python packages, of which a list is given at an operations research page.\nHowever, I then discovered the Python package PyMathProg which for my purposes is perfect - it just calls GLPK, but in a nicely \u0026ldquo;pythonic\u0026rdquo; manner, and the design of the package suits me very well.\nA simple example Here\u0026rsquo;s a tiny two-dimensional problem I gave to my students:\nA furniture workshop produces chairs and tables. Each day 30m² of wood board is delivered to the workshop, of which chairs require 0.5m² and tables 1.5m². (We assume, of course, that all wood is used with no wastage.) All furniture needs to be laminated; there is only one machine available for 10 hours per day, and chairs take 15 minutes each, tables 20 minutes. If chairs are sold for $30 and tables for $60, then maximize the daily profit (assuming that all are sold).\nLetting \\(x\\) be the number of chairs, and \\(y\\) be the number of tables, the problem is to maximize \\[ 30x+60y \\] given\n\\[\\begin{aligned} 0.5x+1.5y\u0026amp;\\le 30\\\\ 15x+20y\u0026amp;\\le 600\\\\ x,y\u0026amp;\\ge 0 \\end{aligned}\\]\nProblems don\u0026rsquo;t get much simpler than this. 
In PyMathProg:\nimport pymprog as pm pm.begin(\u0026#39;furniture\u0026#39;) # pm.verbose(True) x, y = pm.var(\u0026#39;x, y\u0026#39;) # variables pm.maximize(30 * x + 60 * y, \u0026#39;profit\u0026#39;) 0.5*x + 1.5*y \u0026lt;= 30 # wood 15*x + 20*y \u0026lt;= 600 # laminate pm.solve() print(\u0026#39;\\nMax profit:\u0026#39;,pm.vobj()) pm.sensitivity() pm.end() with output:\nGLPK Simplex Optimizer, v4.65 2 rows, 2 columns, 4 non-zeros * 0: obj = -0.000000000e+00 inf = 0.000e+00 (2) * 2: obj = 1.440000000e+03 inf = 0.000e+00 (0) OPTIMAL LP SOLUTION FOUND Max profit: 1440.0 PyMathProg 1.0 Sensitivity Report Created: 2018/10/28 Sun 21:42PM ================================================================================ Variable Activity Dual.Value Obj.Coef Range.From Range.Till -------------------------------------------------------------------------------- *x 24 0 30 20 45 *y 12 0 60 40 90 ================================================================================ ================================================================================ Constraint Activity Dual.Value Lower.Bnd Upper.Bnd RangeLower RangeUpper -------------------------------------------------------------------------------- R1 30 24 -inf 30 20 45 R2 600 1.2 -inf 600 400 900 ================================================================================ From that output, we see that the required maximum is $1440, obtained by making 24 chairs and 12 tables. We also see that the shadow prices for the constraints are 24 and 1.2. 
Furthermore, the ranges of objective coefficients which will not affect the results are \\([20,45]\\) for prices for chairs, and \\([40,90]\\) for table prices.\nThis is the simplest API I\u0026rsquo;ve found so far which provides that sensitivity analysis.\nNote that if we just want a solution, we can use the linprog command from scipy:\nfrom scipy.optimize import linprog linprog([-30,-60],A_ub=[[0.5,1.5],[15,20]],b_ub=[30,600]) linprog automatically minimizes a function, so to maximize we use a negative function. The output is\nfun: -1440.0 message: \u0026#39;Optimization terminated successfully.\u0026#39; nit: 2 slack: array([0., 0.]) status: 0 success: True x: array([24., 12.]) The negative value given as fun above simply reflects that we are entering a negative function. In respect of our problem, we simply negate that value to obtain the required maximum of 1440.\n","link":"https://numbersandshapes.net/posts/linear_programming_in_python/","section":"posts","tags":["linear-programming","python"],"title":"Linear programming in Python"},{"body":"Here\u0026rsquo;s an example of a coloured tetrahedron:\nhello \u0026lt;div oncontextmenu=\u0026quot;return false;\u0026quot; id=\u0026quot;viewerContext\u0026quot; style = \u0026quot;width:640px;height:470px;\u0026quot; design-url=\u0026quot;tetrahedron.jscad\u0026quot;\u0026gt;\u0026lt;/div\u0026gt; \u0026lt;div id=\u0026quot;tail\u0026quot; style=\u0026quot;display: none;\u0026quot;\u0026gt; \u0026lt;div id=\u0026quot;statusdiv\u0026quot;\u0026gt;\u0026lt;/div\u0026gt; \u0026lt;/div\u0026gt; ","link":"https://numbersandshapes.net/posts/test_of_openjscad/","section":"posts","tags":["CAD"],"title":"A test of OpenJSCAD"},{"body":"There\u0026rsquo;s a celebrated elementary result which claims that:\nThere are irrational numbers \\(x\\) and \\(y\\) for which \\(x^y\\) is rational.\nThe standard proof goes like this. Now, we know that \\(\\sqrt{2}\\) is irrational, so let\u0026rsquo;s consider \\(r=\\sqrt{2}^\\sqrt{2}\\). 
Either \\(r\\) is rational, or it is not. If it is rational, then we set \\(x=\\sqrt{2}\\), \\(y=\\sqrt{2}\\) and we are done. If \\(r\\) is irrational, then set \\(x=r\\) and \\(y=\\sqrt{2}\\). This means that \\[ x^y=\\left(\\sqrt{2}^\\sqrt{2}\\right)^{\\sqrt{2}}=\\sqrt{2}^2=2 \\] which is rational.\nThis is a perfectly acceptable proof, but highly non-constructive. And for some people, the fact that the proof gives no information about the irrationality of \\(\\sqrt{2}^\\sqrt{2}\\) is a fault.\nSo here\u0026rsquo;s a lovely constructive proof I found on reddit. Set \\(x=\\sqrt{2}\\) and \\(y=2\\log_2{3}\\). The fact that \\(y\\) is irrational follows from the fact that if \\(y=p/q\\) with \\(p\\) and \\(q\\) integers, then \\(2\\log_2{3}=p/q\\) so that \\(2^{p/(2q)}=3\\), or \\(2^p=3^{2q}\\) which contradicts the fundamental theorem of arithmetic. Then:\n\\begin{eqnarray*} x^y\u0026amp;=\u0026amp;\\sqrt{2}^{2\\log_2{3}}\\\\ \u0026amp;=\u0026amp;2^{\\log_2{3}}\\\\ \u0026amp;=\u0026amp;3. \\end{eqnarray*}\n","link":"https://numbersandshapes.net/posts/powers_of_irrationals/","section":"posts","tags":["mathematics"],"title":"The power of two irrational numbers being rational"},{"body":"For years I have been running a blog and other web apps on a VPS running Ubuntu 14.04 and Apache - a standard LAMP system. However, after experimenting with some apps - temporarily installing them and testing them, only to discard them - the system was becoming a total mess. Worst of all, various MySQL files were ballooning out in size: the ibdata1 file in /var/lib/mysql was coming in at a whopping 37GB (39568015360 bytes to be more accurate).\nNow, there are ways of dealing with this, but I don\u0026rsquo;t want to have to become an expert in MySQL; all I wanted to do was to recover my system and make it more manageable.\nI decided to use Docker. 
This is a \u0026ldquo;container system\u0026rdquo; where each app runs in its own container - a sort of mini system which contains all the files required to serve it up to the web. This clearly requires a certain amount of repetition between containers, but that\u0026rsquo;s the price to be paid for independence. The idea is that you can start or stop any container without affecting any of the others. For web apps many containers are based on Alpine Linux which is a system designed to be as tiny as possible, along with the nginx web server.\nThere seems to be a sizable ecosystem of tools to help manage and deploy docker containers. Given my starting position of knowing nothing, I wanted to keep my extra tools to a minimum; I went with just two over and above docker itself: docker-compose, which helps design, configure, and run docker containers, and traefik, a reverse proxy, which handles all requests from the outside world to docker containers - thus managing things like ports - as well as interfacing with the certificate authority Let\u0026rsquo;s Encrypt.\nMy hope was that I should be able to get these all set up so they would work as happily together as they were supposed to do. 
And so indeed it has turned out, although it took many days of fiddling, and innumerable questions to forums and web sites (such as reddit) to make it work.\nSo here\u0026rsquo;s my traefik configuration:\ndefaultEntryPoints = [\u0026#34;http\u0026#34;, \u0026#34;https\u0026#34;] [web] address = \u0026#34;:8080\u0026#34; [web.auth.basic] users = [\u0026#34;admin:$apr1$v7kJtvT7$h0F7kxt.lAzFH4sZ8Z9ik.\u0026#34;] [entryPoints] [entryPoints.http] address = \u0026#34;:80\u0026#34; [entryPoints.http.redirect] entryPoint = \u0026#34;https\u0026#34; [entryPoints.https] address = \u0026#34;:443\u0026#34; [entryPoints.https.tls] [traefikLog] filePath=\u0026#34;./traefik.log\u0026#34; format = \u0026#34;json\u0026#34; # Below here comes from # www.smarthomebeginner.com/traefik-reverse-proxy-tutorial-for-docker/ # with values adjusted for local use, of course # Let\u0026#39;s encrypt configuration [acme] email=\u0026#34;amca01@gmail.com\u0026#34; storage=\u0026#34;./acme.json\u0026#34; acmeLogging=true onHostRule = true entryPoint = \u0026#34;https\u0026#34; # Use a HTTP-01 acme challenge rather than TLS-SNI-01 challenge [acme.httpChallenge] entryPoint = \u0026#34;http\u0026#34; [[acme.domains]] main = \u0026#34;numbersandshapes.net\u0026#34; sans = [\u0026#34;monitor.numbersandshapes.net\u0026#34;, \u0026#34;adminer.numbersandshapes.net\u0026#34;, \u0026#34;portainer.numbersandshapes.net\u0026#34;, \u0026#34;kanboard.numbersandshapes.net\u0026#34;, \u0026#34;webwork.numbersandshapes.net\u0026#34;, \u0026#34;blog.numbersandshapes.net\u0026#34;] # Connection to docker host system (docker.sock) [docker] endpoint = \u0026#34;unix:///var/run/docker.sock\u0026#34; domain = \u0026#34;numbersandshapes.net\u0026#34; watch = true # This will hide all docker containers that don\u0026#39;t have explicitly set label to \u0026#34;enable\u0026#34; exposedbydefault = false and (part of) my docker-compose configuration, the file docker-compose.yml:\nversion: \u0026#34;3\u0026#34; networks: 
proxy: external: true internal: external: false services: traefik: image: traefik:1.6.0-alpine container_name: traefik restart: always command: --web --docker --logLevel=DEBUG volumes: - /var/run/docker.sock:/var/run/docker.sock - $PWD/traefik.toml:/traefik.toml - $PWD/acme.json:/acme.json networks: - proxy ports: - \u0026#34;80:80\u0026#34; - \u0026#34;443:443\u0026#34; labels: - traefik.enable=true - traefik.backend=traefik - traefik.frontend.rule=Host:monitor.numbersandshapes.net - traefik.port=8080 - traefik.docker.network=proxy blog: image: blog volumes: - /home/amca/docker/whats_this/public:/usr/share/nginx/html networks: - internal - proxy labels: - traefik.enable=true - traefik.backend=blog - traefik.docker.network=proxy - traefik.port=80 - traefik.frontend.rule=Host:blog.numbersandshapes.net The way this works, at least in respect of this blog, is that files copied into the directory /home/amca/docker/whats_this/public on my VPS will be automatically served by nginx. So all I now need is a command on my local system (on which I do all my blog writing), which serves up these files. I\u0026rsquo;ve called it docker-deploy:\nhugo -b \u0026#34;https://blog.numbersandshapes.net/\u0026#34; -t \u0026#34;blackburn\u0026#34; \u0026amp;\u0026amp; rsync -avz -e \u0026#34;ssh\u0026#34; --delete public/ amca@numbersandshapes.net:~/docker/whats_this/public Remarkably enough, it all works!\nOne issue I had at the beginning was that my original blog was served up at the URL https://numberdsandshapes.net/blog and for some reason these links were still appearing in my new blog. It turned out (after a lot of anguished messages) that it was my mis-handling of rsync. 
I just ended up deleting everything except for the blog source files, and re-created everything from scratch.\n","link":"https://numbersandshapes.net/posts/wrestling_with_docker/","section":"posts","tags":null,"title":"Wrestling with Docker"},{"body":"These are a class of root-finding methods, that is, methods for the numerical solution of a single nonlinear equation, developed by Alston Scott Householder in 1970. They may be considered a generalisation of the well known Newton-Raphson method (also known more simply as Newton\u0026rsquo;s method) defined by\n\\[ x\\leftarrow x-\\frac{f(x)}{f\u0026#39;(x)}. \\]\nwhere the equation to be solved is \\(f(x)=0\\).\nFrom a starting value \\(x_0\\) a sequence of iterates can be generated by\n\\[ x_{n+1}=x_n-\\frac{f(x_n)}{f\u0026#39;(x_n)}. \\]\nAs is well known, Newton\u0026rsquo;s method exhibits quadratic convergence; that is, if the sequence of iterates converges to a root value \\(r\\), then the limit\n\\[ \\lim_{n\\to\\infty}\\frac{x_{n+1}-r}{(x_n-r)^2} \\]\nis finite. This means, in effect, that the number of correct decimal places doubles at each step. Householder\u0026rsquo;s method for a rate of convergence \\(d+1\\) is defined by\n\\[ x\\leftarrow x+d\\frac{(1/f)^{(d-1)}(x)}{(1/f)^{(d)}(x)}. \\]\nWe show how this definition can be rewritten in terms of ratios of derivatives, by using Python and its symbolic toolbox SymPy.\nWe start by defining some variables and functions.\nfrom sympy import * x = Symbol(\u0026#39;x\u0026#39;) f = Function(\u0026#39;f\u0026#39;)(x) Now we can define the first Householder formula, with \\(d=1\\):\nd = 1 H1 = x + d*diff(1/f,x,d-1)/diff(1/f,x,d) H1 \\[ x-\\frac{f(x)}{\\frac{d}{dx}f(x)} \\]\nwhich is Newton\u0026rsquo;s formula. 
Now for \\(d=2\\):\nd = 2 H2 = x + d*diff(1/f,x,d-1)/diff(1/f,x,d) H2 \\[ x - \\frac{2 \\frac{d}{d x} f{\\left (x \\right )}}{- \\frac{d^{2}}{d x^{2}} f{\\left (x \\right )} + \\frac{2 \\left(\\frac{d}{d x} f{\\left (x \\right )}\\right)^{2}}{f{\\left (x \\right )}}} \\]\nThis is a mighty messy formula, but it can be greatly simplified by using ratios of derivatives defined by\n\\[ r_k=\\frac{f^{(k-1)}(x)}{f^{(k)}(x)} \\] This means that \\[ r_1=\\frac{f}{f\u0026#39;},\\quad r_2=\\frac{f\u0026#39;}{f^{\\prime\\prime}} \\] To substitute these into the expression above, we can use the substitutions \\[ f^{\\prime\\prime}=f\u0026#39;/r_2,\\quad f\u0026#39;=f/r_1 \\] to be done sequentially (first defining the new symbols)\nr_1,r_2,r_3 = symbols(\u0026#39;r_1,r_2,r_3\u0026#39;) H2r = H2.subs([(Derivative(f,x,2), Derivative(f,x)/r_2), (Derivative(f,x), f/r_1)]).simplify() H2r \\[ x-\\frac{2r_1r_2}{2r_2-r_1} \\] Dividing the top and bottom of the fraction by \\(2r_2\\) produces the formulation \\[ \\frac{r_1}{1-\\displaystyle{\\frac{r_1}{2r_2}}} \\] and so Householder\u0026rsquo;s method for \\(d=2\\) is defined by the recurrence \\[ x\\leftarrow x-\\frac{r_1}{1-\\displaystyle{\\frac{r_1}{2r_2}}}. \\] This is known as Halley\u0026rsquo;s method, after Edmond Halley, also known for his comet. This method has been called the most often rediscovered iteration formula in the literature.\nIt exhibits cubic convergence, which means that the number of correct figures roughly triples at each step.\nApplying the same sequence of steps for \\(d=3\\), and including the substitution \\[ f^{\\prime\\prime\\prime} = f^{\\prime\\prime}/r_3 \\] produces the fourth order formula \\[ x\\leftarrow x-\\frac{3 r_{1} r_{3} \\left(2r_{2} - r_{1}\\right)}{r_{1}^{2} - 6 r_{1} r_{3} + 6 r_{2} r_{3}} \\]\nA test We\u0026rsquo;ll use the equation \\[ x^5+x-1=0 \\] which has a root close to \\(0.7\\). 
First we try Newton\u0026rsquo;s method, which is the Householder method with \\(d=1\\). We start by defining the symbol \\(x\\) and the function \\(f\\):\nx = Symbol(\u0026#39;x\u0026#39;) f = x**5+x-1 Next define the iteration of Newton\u0026rsquo;s method, which can be turned into a function with the handy tool lambdify:\nnr = lambdify(x, x - f/diff(f,x)) Now, a few iterations, and print them as strings:\ny = 0.7 ys = [y] for i in range(10): y = N(nr(y),100) ys += [y] for i in ys: print(str(i)) 0.7 0.7599545557827765973613054484303575009107589721679687500000000000000000000000000000000000000000000000 0.7549197891599746887794253559985793967456078439525201893202319456623650882121929457935763902468565963 0.7548776691557956141971506438033504033307707534709697222674827264390889507161368160254597915269779252 0.7548776662466927739251146002523856449587324643131536407777773148939177229546284200355119465808326870 0.7548776662466927600495088963585290075677963335246916447723036615900830138144428153523526591809355834 0.7548776662466927600495088963585286918946066177727931439892839706462440390043279509776806970677946058 0.7548776662466927600495088963585286918946066177727931439892839706460806551280810907382270928422503037 0.7548776662466927600495088963585286918946066177727931439892839706460806551280810907382270928422503037 0.7548776662466927600495088963585286918946066177727931439892839706460806551280810907382270928422503037 0.7548776662466927600495088963585286918946066177727931439892839706460806551280810907382270928422503037 We can easily compute the number of correct decimal places each time by simply finding the first place in each string where it differs from the next one:\nfor i in range(1,7): d = [str(ys[i])[j] == str(ys[i+1])[j] for j in range(102)] print(d.index(False)-2) \\begin{array}{r} 2\\cr 3\\cr 8\\cr 16\\cr 32\\cr 66 \\end{array}\nand we see that the number of correct decimal places very nearly doubles at each iteration.\nNow, the fourth order method, with 
\\(d=3\\):\nr1 = lambdify(x,f/diff(f,x)) r2 = lambdify(x,diff(f,x)/diff(f,x,2)) r3 = lambdify(x,diff(f,x,2)/diff(f,x,3)) h3 = lambdify(x,x-3*r1(x)*r3(x)*(2*r2(x)-r1(x))/(r1(x)**2-6*r1(x)*r3(x)+6*r2(x)*r3(x))) Now we basically copy down the above commands, except that we\u0026rsquo;ll use 1500 decimal places instead of 100:\ny = 0.7 ys = [str(y)] for i in range(10): y = N(h3(y),1500) ys += [str(y)] for i in range(1,6): d = [ys[i][j] == ys[i+1][j] for j in range(1502)] print(d.index(False)-2) \\begin{array}{r} 4\\\\ 19\\\\ 76\\\\ 308\\\\ 1233 \\end{array}\nand we see that the number of correct decimal places at each step is indeed increased by a factor very close to 4.\n","link":"https://numbersandshapes.net/posts/householders_methods/","section":"posts","tags":["mathematics","algebra"],"title":"Householder's methods"},{"body":"The Joukowsky Transform is an elegant and simple way to create an airfoil shape.\nLet \\(C\\) be a circle in the complex plane that passes through the point \\(z=1\\) and encompasses the point \\(z=-1\\). The transform is defined as\n\\[ \\zeta=z+\\frac{1}{z}. \\]\nWe can explore the transform by looking at the circles centred at \\((-r,0)\\) with \\(r\u0026gt;0\\) and with radius \\(1+r\\):\n\\[ \\|z+r\\|=1+r \\]\nor in cartesian coordinates with parameter \\(t\\):\n\\[\n\\begin{aligned} x \u0026amp;= -r+(1+r)\\cos(t)\\\\ y \u0026amp;= (1+r)\\sin(t) \\end{aligned}\n\\]\nso that \\[ (x,y)\\rightarrow \\left(x+\\frac{x}{x^2+y^2},y-\\frac{y}{x^2+y^2}\\right). \\]\nTo see this in action, move the point \\(c\\) in this diagram about. 
You\u0026rsquo;ll get the best result when it is close to the origin.\n","link":"https://numbersandshapes.net/posts/joukowsky-transform/","section":"posts","tags":["mathematics","geometry","jsxgraph"],"title":"The Joukowsky Transform"},{"body":"This was a comedy sketch initially performed in the revue \u0026ldquo;Clowns in Clover\u0026rdquo; which had its first performance at the Adelphi Theatre in London on December 1, 1927. This particular sketch was written by Dion Titheradge and starred the inimitable Cicely Courtneidge as the annoyed customer Mrs Spooner. It has been recorded and is available on many different collections; you can also hear it on youtube.\nI have loved this sketch since I first heard it as a teenager on a three record collection called something like \u0026ldquo;Masters of Comedy\u0026rdquo;, being a collection of classic sketches. Double Damask has also been performed by Beatrice Lillie, and you can search for this also on youtube. For example, here. I hope admirers of the excellent Ms Lillie will not be upset by my saying I far prefer Cicely Courtneidge, whose superb diction and impeccable comic timing are beyond reproach.\nNo doubt the original script is available somewhere, but in the annoying way of the internet, I couldn\u0026rsquo;t find it. So here is my transcription of the Courtneidge version of \u0026ldquo;Double Damask\u0026rdquo;.\nDouble Damask\nwritten by\nDion Titheradge\nCharacters:\\ A customer, Mrs Spooner\\ A shop assistant (unnamed)\\ A manager, Mr Peters\nScene: The linen department of a large store.\nMRS SPOONER: I wonder if you could tell me if my order has gone off yet?\nASSISTANT: Not knowing your order, madam, I really couldn\u0026rsquo;t say.\nMRS SPOONER: But I was in here an hour ago and gave it to you.\nASSISTANT: What name, madam?\nMRS SPOONER: Spooner, Mrs Spooner,\nASSISTANT: Have you an address?\nMRS SPOONER: Do I look as if I live in the open air? 
I gave a large order for sheets and tablecloths, to be sent to Bacon Villa, Egham. (pronounced \u0026ldquo;Eg\u0026rsquo;m\u0026rdquo;)\nASSISTANT: Eg\u0026rsquo;m?\nMRS SPOONER: I hope I speak plainly: Egg Ham!\nASSISTANT: Oh yes, yes I remember perfectly now, Madam. Let me see now\u0026hellip; no, your order won\u0026rsquo;t go through until tomorrow morning. Is there anything further?\nMRS SPOONER: Yes, (very quickly) I want two dozen double damask dinner napkins.\nASSISTANT: I beg your pardon?\nMRS SPOONER (as quicky as before): I said two dozen double damask dinner napkins.\nASSISTANT: I\u0026rsquo;m sorry madam, I don\u0026rsquo;t quite catch -\nMRS SPOONER: Dinner napkins, man! Dinner napkins!\nASSISTANT: Of course madam. Plain?\nMRS SPOONER: Not plain, double damask.\nASSISTANT: Yes\u0026hellip; would you mind repeating your order Madam? I\u0026rsquo;m not quite sure.\nMRS SPOONER: I want two dozen dammle dubbuck; I want two dammle dubb\u0026hellip; oh dear, stupid of me! I want two dozen dammle dizzick danner nipkins.\nASSISTANT: Danner nipkins Madam?\nMRS SPOONER: Yes.\nASSISTANT: You mean dinner napkins.\nMRS SPOONER: That\u0026rsquo;s what I said.\nASSISTANT: No, pardon me, Madam, you said danner nipkins!\nMRS SPOONER: Don\u0026rsquo;t be ridiculous! I said dinner napkins, and I meant danner nipkins. Nipper dank\u0026hellip;you know you\u0026rsquo;re getting me muddled now.\nASSISTANT: I\u0026rsquo;m sorry Madam. You want danner nipkins, exactly. How many?\nMRS SPOONER: Two duzzle.\nASSISTANT: Madam?\nMRS SPOONER: Oh, gracious, young man - can\u0026rsquo;t you get it right? I want two dubbin duzzle damask dinner napkins.\nASSISTANT: Oh no, Madam, not two dubbin - you mean two dozen!\nMRS SPOONER: I said two dozen! Only they must be dammle duzzick!\nASSISTANT: No, we haven\u0026rsquo;t any of that in stock, Madam.\nMRS SPOONER (in a tone of complete exasperation): Oh dear, of all the fools! 
Can\u0026rsquo;t I find anybody, just anybody with a modicum of intelligence in this store?\nASSISTANT: Well, here is our Mr Peters, Madam. Now perhaps if you ask him he might-\nMR PETERS (In an authoritative \u0026ldquo;we can fix anything\u0026rdquo; kind of voice): Can I be of any assistance to you, Madam?\nMRS SPOONER: I\u0026rsquo;m sorry to say that your assistant doesn\u0026rsquo;t appear to speak English. I\u0026rsquo;m giving an order, but it might just as well be in Esperanto for all he understands.\nMR PETERS: Allow me to help you Madam. You require?\nMRS SPOONER: I require (as quickly as before) two dozen double damask dinner napkins.\nMR PETERS: I beg pardon, Madam?\nMRS SPOONER: Oh heavens - can\u0026rsquo;t you understand?\nMR PETERS: Would you mind repeating your order, Madam.\nMRS SPOONER: I want two dazzen -\nMR PETERS: Two dozen!\nMRS SPOONER: I said two dozen!\nMR PETERS: Oh no no Madam - no, you said two dazzen. But I understand perfectly what you mean. You mean two dozen; in other words - a double dozen.\nMRS SPOONER: That\u0026rsquo;s it! A duzzle dubbin double damask dinner napkins.\nMR PETERS: Oh no, pardon me, Madam, pardon me: you mean a double dozen double dummick dinner napkins.\nASSISTANT: Double damask, sir.\nMR PETERS: I said double damask! It\u0026rsquo;s\u0026hellip; dapper ninkins you require, sir.\nMRS SPOONER: Please get it right, I want dinner napkins, dinner napkins.\nMR PETERS: I beg pardon, Madam. So stupid of me\u0026hellip;one gets so confused\u0026hellip; (Laughs)\nMRS SPOONER: It is not a laughing matter.\nMR PETERS: Of course. Dipper nankins, madam.\nASSISTANT: Dapper ninkins, sir.\nMRS SPOONER: Danner nipkins.\nMR PETERS: I understand exactly what Madam wants. It is two d-d-d-d-..two d- Would you mind repeating your order please, Madam?\nMRS SPOONER: Ohhh, dear.. I want two duzzle dizzen damask dinner dumplings!\nMR PETERS: Allow me, Madam, allow me. 
The lady requires (quickly) two dubbin double damask dunner napkins.\nASSISTANT: Dunner napkins sir?\nMR PETERS: Certainly! Two dizzen.\nMRS SPOONER: Not two dizzen - I want two dowzen!\nMR PETERS: Quite so, Madam, quite so. If I may say so we\u0026rsquo;re getting a little bit confused, splitting it up, as it were. Now, the full order, the full order, is two dazzen dibble dummisk n\u0026rsquo;dipper dumkins.\nASSISTANT: Excuse me, sir, you mean two dummen dammle dimmick dizzy napkins.\n(The next four lines are spoken almost on top of each other)\nMRS SPOONER: I do not want dizzy napkins, I want two dizzle dammen damask -\nMR PETERS: No - two dizzle dammle dizzick!\nASSISTANT: Two duzzle dummuck dummy!\nMRS SPOONER: Two damn dizzy diddle dimmer dipkins!\nMR PETERS (Shocked): Madam, Madam! Please, please - your language!\nMRS SPOONER: Oh, blast. Give me twenty four serviettes.\n","link":"https://numbersandshapes.net/posts/double-damask/","section":"posts","tags":["humour"],"title":"Double Damask"},{"body":"","link":"https://numbersandshapes.net/tags/humour/","section":"tags","tags":null,"title":"Humour"},{"body":"I recently came across some nice material on John Cook\u0026rsquo;s blog about equations that describe eggs.\nIt turns out there is a vast number of equations whose graphs are egg-shaped: that is, basically elliptical, but with one end \u0026ldquo;rounder\u0026rdquo; than the other.\nYou can see lots at Jürgen Köller\u0026rsquo;s Mathematische Basteleien page. (Although this blog is mostly in German, there are enough English language pages for monoglots such as me). And plenty of egg equations can be found in the 2dcurves pages.\nAnother excellent source of eggy equations is TDCC Laboratory from Japan (the link here is to their English language page). For the purposes of experimenting we will use equations from TDCC, adjusted as necessary. 
Many of their equations are given in parametric form, which means they can be easily graphed and explored using JSXGraph.\nThe first set of parametric equations, attributed to Nobuo Yamamoto, is:\n\\[\\begin{aligned} x\u0026amp;=(a+b+b\\cos\\theta)\\cos\\theta\\\\ y\u0026amp;=(a+b\\cos\\theta)\\sin\\theta \\end{aligned}\\]\nIf we divide these equations by \\(a\\), and use the parameter \\(c\\) for \\(b/a\\) we obtain slightly simpler equations:\n\\[\\begin{aligned} x\u0026amp;=(1+c+c\\cos\\theta)\\cos\\theta\\\\ y\u0026amp;=(1+c\\cos\\theta)\\sin\\theta \\end{aligned}\\]\nHere you can explore values of \\(c\\) between 0 and 1:\nAnother set of equations is said to be due to Tadao Ito (whose surname is sometimes transliterated as Itou):\n\\[\\begin{aligned} x\u0026amp;=\\cos\\theta\\\\ y\u0026amp;=c\\cos\\frac{\\theta}{4}\\sin\\theta \\end{aligned}\\]\nMany more equations, both parametric and implicit, can be found at the sites linked above.\n","link":"https://numbersandshapes.net/posts/egg_graphs/","section":"posts","tags":["geometry","jsxgraph"],"title":"Graphs of Eggs"},{"body":"JSXGraph is a graphics package developed in Javascript, and which seems to be tailor-made for a static blog such as this. It consists of only two files: the javascript file itself, and an accompanying css file, which you can download. Alternatively you can simply link to the online files at the Javascript content delivery site cdnjs managed by cloudflare. There are cloudflare servers all over the world - even in my home town of Melbourne, Australia.\nSo I modified the head.html file of my theme to include a link to the necessary files:\nSo I downloaded the javascript and css files as described here and also, for good measure, added the script line (from that page) to the layouts/partials/head.html file of the theme. 
Then copied the following snippet from the JSXGraph site:\n\u0026lt;div id=\u0026#34;box\u0026#34; class=\u0026#34;jxgbox\u0026#34; style=\u0026#34;width:500px; height:500px;\u0026#34;\u0026gt;\u0026lt;/div\u0026gt; \u0026lt;script type=\u0026#34;text/javascript\u0026#34;\u0026gt; var board = JXG.JSXGraph.initBoard(\u0026#39;box\u0026#39;, {boundingbox: [-10, 10, 10, -10], axis:true}); \u0026lt;/script\u0026gt; However, to make this work the entire script needs to be inside a \u0026lt;div\u0026gt;, \u0026lt;/div\u0026gt; pair, like this:\n\u0026lt;div id=\u0026#34;box\u0026#34; class=\u0026#34;jxgbox\u0026#34; style=\u0026#34;width:500px; height:500px;\u0026#34;\u0026gt; \u0026lt;script type=\u0026#34;text/javascript\u0026#34;\u0026gt; var board = JXG.JSXGraph.initBoard(\u0026#39;box\u0026#39;, {boundingbox: [-10, 10, 10, -10], axis:true}); \u0026lt;/script\u0026gt; \u0026lt;/div\u0026gt; Just to see how well this works, here\u0026rsquo;s Archimedes\u0026rsquo; neusis construction of an angle trisection: given an angle \\(\\theta\\) in a unit semicircle, its trisection is obtained by laying against the circle a straight line with points spaced 1 apart (drag point A about the circle to see this in action):\nFor what it\u0026rsquo;s worth, here is the splendid javascript code to produce the above figure:\n\u0026lt;div id=\u0026#34;box\u0026#34; class=\u0026#34;jxgbox\u0026#34; style=\u0026#34;width:500px; height:333.33px;\u0026#34;\u0026gt; \u0026lt;script type=\u0026#34;text/javascript\u0026#34;\u0026gt; JXG.Options.axis.ticks.insertTicks = false; JXG.Options.axis.ticks.drawLabels = false; var board = JXG.JSXGraph.initBoard(\u0026#39;box\u0026#39;, {boundingbox: [-1.5, 1.5, 3, -1.5],axis:true}); var p = board.create(\u0026#39;point\u0026#39;,[0,0],{visible:false,fixed:true}); var neg = board.create(\u0026#39;point\u0026#39;,[-0.67,0],{visible:false,fixed:true}); var c = board.create(\u0026#39;circle\u0026#39;,[[0,0],1.0]); var a = 
board.create(\u0026#39;glider\u0026#39;,[-Math.sqrt(0.5),Math.sqrt(0.5),c],{name:\u0026#39;A\u0026#39;}); var l1 = board.create(\u0026#39;segment\u0026#39;,[a,p]); var ang = board.create(\u0026#39;angle\u0026#39;,[a,p,neg],{radius:0.67,name:\u0026#39;θ\u0026#39;}); var theta = JXG.Math.Geometry.rad(a,p,neg); var bb = board.create(\u0026#39;point\u0026#39;,[function(){return Math.cos(Math.atan2(a.Y(),-a.X())/3);},function(){return Math.sin(Math.atan2(a.Y(),-a.X())/3);}],{name:\u0026#39;B\u0026#39;}); var w = board.create(\u0026#39;point\u0026#39;,[function(){return Math.cos(Math.atan2(a.Y(),-a.X())/3)/0.5;},0]); var l2 = board.create(\u0026#39;line\u0026#39;,[a,w]); var l3 = board.create(\u0026#39;segment\u0026#39;,[p,bb]); var l4 = board.create(\u0026#39;segment\u0026#39;,[bb,w],{strokeWidth:6,strokeColor:\u0026#39;#FF0000\u0026#39;}); var ang2 = board.create(\u0026#39;angle\u0026#39;,[bb,w,neg],{radius:0.67,name:\u0026#39;θ/3\u0026#39;}); \u0026lt;/script\u0026gt; \u0026lt;/div\u0026gt; Quite wonderful, it is.\n","link":"https://numbersandshapes.net/posts/exploring_jsxgraph/","section":"posts","tags":["jsxgraph"],"title":"Exploring JSXGraph"},{"body":"When I was teaching the binomial theorem (or, to be more accurate, the binomial expansion) to my long-suffering students, one of them asked me if there was a trinomial theorem. Well, of course there is, although in fact expanding sums of greater than two terms is generally not classed as a theorem described by the number of terms. The general result is\n\\[ (x_1+x_2+\\cdots+x_k)^n=\\sum_{a_1+a_2+\\cdots+a_k=n} {n\\choose a_1,a_2,\\ldots,a_k}x_1^{a_1}x_2^{a_2}\\cdots x_k^{a_k} \\]\nso in particular a \u0026ldquo;trinomial theorem\u0026rdquo; would be\n\\[ (x+y+z)^n=\\sum_{a+b+c=n}{n\\choose a,b,c}x^ay^bz^c. \\]\nHere we define\n\\[ {n\\choose a,b,c}=\\frac{n!}{a!b!c!} \\]\nand this is known as a trinomial coefficient; more generally, for an arbitrary number of variables, it is a multinomial coefficient. 
It is guaranteed to be an integer if the lower values sum to the upper value.\nSo to compute \\((x+y+z)^5\\) we could list all triples of integers \\(a,b,c\\) with \\(0\\le a,b,c\\le 5\\) for which \\(a+b+c=5\\), and put them all into the above sum.\nBut of course there\u0026rsquo;s a better way, and it comes from expanding \\((x+y+z)^5\\) as a binomial \\((x+(y+z))^5\\) so that\n\\begin{array}{rcl} (x+(y+z))^5\u0026amp;=\u0026amp;x^5\\\\ \u0026amp;\u0026amp;+5x^4(y+z)\\\\ \u0026amp;\u0026amp;+10x^3(y+z)^2\\\\ \u0026amp;\u0026amp;+10x^2(y+z)^3\\\\ \u0026amp;\u0026amp;+5x(y+z)^4\\\\ \u0026amp;\u0026amp;+(y+z)^5 \\end{array}\nNow we can expand each of those binomial powers:\n\\begin{array}{rcl} (x+(y+z))^5\u0026amp;=\u0026amp;x^5\\\\ \u0026amp;\u0026amp;+5x^4(y+z)\\\\ \u0026amp;\u0026amp;+10x^3(y^2+2yz+z^2)\\\\ \u0026amp;\u0026amp;+10x^2(y^3+3y^2z+3yz^2+z^3)\\\\ \u0026amp;\u0026amp;+5x(y^4+4y^3z+6y^2z^2+4yz^3+z^4)\\\\ \u0026amp;\u0026amp;+(y^5+5y^4z+10y^3z^2+10y^2z^3+5yz^4+z^5) \\end{array}\nExpanding this produces\n\\begin{split} x^5\u0026amp;+5x^4y+5x^4z+10x^3y^2+20x^3yz+10x^3z^2+10x^2y^3+30x^2y^2z+30x^2yz^2\\\\ \u0026amp;+10x^2z^3+5xy^4+20xy^3z+30xy^2z^2+20xyz^3+5xz^4+y^5+5y^4z+10y^3z^2\\\\ \u0026amp;+10y^2z^3+5yz^4+z^5 \\end{split}\nwhich is an expression of rare beauty.\nBut there\u0026rsquo;s a nice way of setting this up, which involves writing down Pascal\u0026rsquo;s triangle to the fifth row, and putting the fifth row, as a column, on the side. 
Then multiply across:\n\\begin{array}{lcccccccccc} 1\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;1\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;\\\\ 5\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;1\u0026amp;\u0026amp;1\u0026amp;\u0026amp;\u0026amp;\u0026amp;\\\\ 10\\quad\\times\u0026amp;\u0026amp;\u0026amp;\u0026amp;1\u0026amp;\u0026amp;2\u0026amp;\u0026amp;1\u0026amp;\u0026amp;\u0026amp;\\\\ 10\u0026amp;\u0026amp;\u0026amp;1\u0026amp;\u0026amp;3\u0026amp;\u0026amp;3\u0026amp;\u0026amp;1\u0026amp;\u0026amp;\\\\ 5\u0026amp;\u0026amp;1\u0026amp;\u0026amp;4\u0026amp;\u0026amp;6\u0026amp;\u0026amp;4\u0026amp;\u0026amp;1\u0026amp;\\\\ 1\u0026amp;1\u0026amp;\u0026amp;5\u0026amp;\u0026amp;10\u0026amp;\u0026amp;10\u0026amp;\u0026amp;5\u0026amp;\u0026amp;1 \\end{array}\nto produce the final array of coefficients (with index numbers at the left):\n\\begin{array}{l*{10}{c}} 0\\qquad{}\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;1\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;\\\\ 1\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;5\u0026amp;\u0026amp;5\u0026amp;\u0026amp;\u0026amp;\u0026amp;\\\\ 2\u0026amp;\u0026amp;\u0026amp;\u0026amp;10\u0026amp;\u0026amp;20\u0026amp;\u0026amp;10\u0026amp;\u0026amp;\u0026amp;\\\\ 3\u0026amp;\u0026amp;\u0026amp;10\u0026amp;\u0026amp;30\u0026amp;\u0026amp;30\u0026amp;\u0026amp;10\u0026amp;\u0026amp;\\\\ 4\u0026amp;\u0026amp;5\u0026amp;\u0026amp;20\u0026amp;\u0026amp;30\u0026amp;\u0026amp;20\u0026amp;\u0026amp;5\u0026amp;\\\\ 5\u0026amp;1\u0026amp;\u0026amp;5\u0026amp;\u0026amp;10\u0026amp;\u0026amp;10\u0026amp;\u0026amp;5\u0026amp;\u0026amp;1 \\end{array}\nRow \\(i\\) of this array corresponds to \\(x^{5-i}\\) and all combinations of powers \\(y^bz^c\\) with \\(b+c=i\\). Thus for example the fourth row down, corresponding to \\( i=3 \\), may be considered as the coefficients of the terms\n\\[ x^2y^3,\\quad x^2y^2z,\\quad x^2yz^2,\\quad x^2z^3. 
\\]\nNote that the triangle of coefficients is symmetrical along all three centre lines, as well as rotationally symmetric by 120°.\n","link":"https://numbersandshapes.net/posts/trinomial_theorem/","section":"posts","tags":["mathematics","algebra"],"title":"The trinomial theorem"},{"body":"","link":"https://numbersandshapes.net/tags/hugo/","section":"tags","tags":null,"title":"Hugo"},{"body":"","link":"https://numbersandshapes.net/tags/org/","section":"tags","tags":null,"title":"Org"},{"body":"I\u0026rsquo;ve been using wordpress as my blogging platform since I first started, about 10 years ago. (In fact the first post I can find is dated March 30, 2008.) I chose wordpress.com back then because it was (a) free, and (b) supported mathematics through a version (or subset) of LaTeX. As I have used LaTeX extensively for all my writing since the early 1990\u0026rsquo;s, it\u0026rsquo;s a standard requirement for me.\nSome time later I decided to start hosting my own server (well, a VPS), on which I could use wordpress.org, which is the self-hosted version of wordpress. The advantages of a self hosted blog are many, but I particularly like the greater freedom, the ability to include a far greater variety of plugins, and the larger choice of themes. And one of the plugins I liked particularly was WP QuickLaTeX which provided a LaTeX engine far superior to the in-built one of wordpress.com. Math bloggin heaven!\nHowever, hosting my own wordpress site was not without difficulty. First I had to install it and get it up and running (even this was non-trivial), and then I had to manage all the users and passwords: myself as a standard user, wp-admin for accessing the Wordpress site itself, a few others. I have quite a long list containing all the commands I used, and all the users and passwords I created.\nThis served me well, but it was also slow to use. 
My VPS is perfectly satisfactory, but it is not fast (I\u0026rsquo;m too cheap to pay for much more than a low-powered one), and the edit-save-preview cycle of online blogging with my wordpress installation was getting tiresome.\nPlus the issue of security. I\u0026rsquo;ve been hacked once, and I\u0026rsquo;ve since managed to secure my site with a free certificate from Let\u0026rsquo;s Encrypt. In fact, in many ways Let\u0026rsquo;s Encrypt is one of the best things to have happened for security. An open Certificate Authority is manna from heaven, as far as I\u0026rsquo;m concerned.\nWordpress is of course more than just blogging software. It now grandly styles itself as Site Building software and Content Management System, and the site claims that \u0026ldquo;30% of the web uses Wordpress\u0026rdquo;. It is in fact hugely powerful and deservedly popular, and can be used for pretty much whatever sort of site you want to build. Add to that a seemingly infinite set of plugins, and you have an entire ecosystem of web-building.\nHowever, all of that popularity and power comes at a cost: it is big, confusing, takes work to maintain, keep secure, and keep up-to-date, and is a target for hackers. Also for me, it has become colossal overkill. I don\u0026rsquo;t need all those bells and whistles; all I want to do is host my blog and share my posts with the world (the \\(1.5\\times 10^{-7}\\%\\) of the world who reads it).\nThe kicker for me was checking out a mathematics education blog by an author I admire greatly, to discover it was built with the static blog engine jekyll. So being the inventive bloke I am, I thought I\u0026rsquo;d do the same.\nBut a bit of hunting led me to Hugo, which apparently is very similar to jekyll, but much faster, and written in Go instead of Ruby. Since I know nothing about either Go or Ruby I don\u0026rsquo;t know if it\u0026rsquo;s the language which makes the difference, or something else. 
But it sure looks nice, and supports mathjax for LaTeX.\nSo my current plan is to migrate from wordpress to Hugo, and see how it goes!\n","link":"https://numbersandshapes.net/posts/playing_with_hugo/","section":"posts","tags":["hugo","org"],"title":"Playing with Hugo"},{"body":"Election mapping A few weeks ago there was a by-election in my local electorate (known as an electoral division) of Batman here in Australia. I was interested in comparing the results of this election with the previous election two years ago. In this division it\u0026rsquo;s become a two-horse race: the Greens against the Australian Labor Party. Although Batman had been a solid Labor seat for almost its entire existence - it used to be considered one of the safest Labor seats in the country - over the past decade or so the Greens have been making inroads into this Labor heartland, to the extent that it is no longer considered a safe seat. And in fact for this particular election the Greens were the popular choice to win. In the end Labor won, but my interest is not so much in tracing the votes as in trying to map them.\nPython has a vast suite of mapping tools, so much so that it may be that Python has become the GIS tool of choice. And there are lots of web pages devoted to discussing these tools and their uses, such as this one.\nMy interest was producing maps such as are produced by pollbludger. This is the image from that page:\n!pollbludger\nAs you can see there are basically three elements:\nthe underlying streetmap; the border of the division; the numbers showing the percentage wins of each party at the various polling booths. 
I wanted to do something similar, but replace the numbers with circles whose sizes showed the strength of the percentage win at each place.\nGetting the information Because this election was in a federal division, the management of the polls and of the results (including counting the votes) was managed by the Australian Electoral Commission, whose pages about this by-election contain pretty much all publicly available information. You can copy and paste the results from their pages, or download them as CSV files.\nThen I needed to find the coordinates (Longitude and Latitude) of all the polling places, of which there were 42 at fixed locations. There didn\u0026rsquo;t seem to be a downloadable file for this, so for each booth address (given on the AEC site), I entered it into Google Maps and copied down the coordinates as given.\nThe boundaries of all the divisions can again be downloaded from the AEC GIS page. These are given in various standard GIS files.\nPutting it all together The tools I felt brave enough to use were:\nPandas: Python\u0026rsquo;s data analysis library. I really only needed to read information from CSV files that I could then use later. Geopandas: This is a GIS library with Pandas-like syntax, and is designed in part to be a GIS extension to Pandas. I would use it to extract and manage the boundary data of the electoral division. Cartopy: which is a library of \u0026ldquo;cartographic tools\u0026rdquo;. 
And of course the standard matplotlib for plotting, and numpy for array handling.\nMy guides were the London tube stations example from Cartopy and a local (Australian) data analysis blog which discussed the use of Cartopy, including adding graphics to a map image.\nThere are lots of other GIS tools for Python, some of which seem to be very good indeed, and all of which I downloaded:\nFiona: which is a \u0026ldquo;nimble\u0026rdquo; API for handling maps\nDescartes: which provides a means by which matplotlib can be used to manage geographic objects\ngeoplotlib: for \u0026ldquo;visualizing geographical data and making maps\u0026rdquo;\nFolium: for visualizing maps using the leaflet.js library. It may be that the mapping I wanted to do with Python could have been done just as well in Javascript alone. And probably other languages. I stuck with Python simply because I knew it best.\nQGIS: which is designed to be a complete free and open source GIS, and with APIs both for Python and C++\nGDAL: the \u0026ldquo;Geospatial Data Abstraction Library\u0026rdquo; which has a Python package also called GDAL, for manipulating geospatial raster and vector data.\nI suspect that if I were working professionally in the GIS area some or all of these packages would be at least as suitable as - and maybe even more suitable than - the ones I ended up using. But then, I was starting from a position of absolute zero with regards to GIS, and also I wanted to be able to make use of the tools I already knew, such as Pandas, matplotlib, and numpy.\nHere\u0026rsquo;s the start, importing the libraries, or the bits of them I needed:\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport cartopy.crs as ccrs\nfrom cartopy.io.img_tiles import GoogleTiles\nimport geopandas as gpd\nimport pandas as pd\nI then had to read in the election data, which was a CSV file from the AEC containing the Booth, the final distributed percentage weighting to the ALP and Greens candidates, and their percentage scores. 
As well, I read in the boundary data:\nbb = pd.read_csv(\u0026#39;Elections/batman_booths_coords.csv\u0026#39;) # contains all election info plus lat, long of booths\nlongs = np.array(bb[\u0026#39;Long\u0026#39;])\nlats = np.array(bb[\u0026#39;Lat\u0026#39;])\nv = gpd.read_file(\u0026#39;VicMaps/VIC_ELB.MIF\u0026#39;) # all electoral divisions in MapInfo form\nbg = v.loc[2].geometry # This is the Polygon representing Batman\nb_longs = bg.exterior.xy[0] # These next two lines are the longitudes and latitudes\nb_lats = bg.exterior.xy[1] # of the vertices of the division boundary\nNotice that bb uses Pandas to read in the CSV file which contains all the AEC information, as well as the latitude and longitude of each booth, which I\u0026rsquo;d added myself. Here longs and lats are the coordinates of the polling booths, and b_longs and b_lats are all the vertices which form the boundary of the division.\nNow it\u0026rsquo;s all pretty straightforward, especially with the examples mentioned above:\nfig = plt.figure(figsize=(16,16))\ntiler = GoogleTiles()\nax = plt.axes(projection=tiler.crs)\nmargin=0.01\nax.set_extent((bg.bounds[0]-margin, bg.bounds[2]+margin,bg.bounds[1]-margin, bg.bounds[3]+margin))\nax.add_image(tiler,12)\nfor i in range(44):\n    plt.plot(longs[i],lats[i],ga2[i],markersize=abs(ga[i]),alpha=0.7,transform=ccrs.Geodetic())\nplt.plot(b_longs,b_lats,\u0026#39;k-\u0026#39;,linewidth=5,transform=ccrs.Geodetic())\nplt.title(\u0026#39;Booth results in the 2018 Batman by-election\u0026#39;)\nplt.show()\nHere GoogleTiles provides the street map to be used as the \u0026ldquo;base\u0026rdquo; of our map. Open Street Map (as OSM) is available too, but I think in this instance Google Maps is better. Because the map is rendered as an image (with some unavoidable blurring), I find that Google gave a better result than OSM.\nAlso, ga2 is a little array which simply produces plotting of the style ro (red circle) or go (green circle). 
Again, I make the program do most of the work.\nAnd here is the result, saved as an image:\n!Batman 2018\nI\u0026rsquo;m quite pleased with this output.\n","link":"https://numbersandshapes.net/posts/python_gis/","section":"posts","tags":["GIS","python","voting"],"title":"Python GIS, and election results"},{"body":"Presentations are a modern bugbear. Anybody in academia or business, or any professional field really, will have sat through untold hours of presentations. And almost all of them are terrible. Wordy, uninteresting, too many \u0026ldquo;transition effects\u0026rdquo;, low information content, you know as well as I do.\nPretty much every speaker reads the words on their slides, as though the audience were illiterate. I went to a talk once which consisted of 60 \u0026ndash; yes, sixty \u0026ndash; slides of very dense text, and the presenter read through each one. I think people were gnawing their own limbs off out of sheer boredom by the end. Andy Warhol\u0026rsquo;s \u0026ldquo;Empire\u0026rdquo; would have been a welcome relief.\nSince most of my talks are technical and full of mathematics, I have naturally gravitated to the LaTeX presentation tool Beamer. Now Beamer is a lovely thing for LaTeX: as part of the LaTeX ecosystem you get all of LaTeX\u0026rsquo;s loveliness along with elegant slide layouts, transitions, etc. My only issue with Beamer (and this is not a new observation by any means), is that all Beamer presentations have a certain sameness to them. I suspect that this is because most Beamer users are mathematicians, who are rightly more interested in content than appearance. It is quite possible of course to make Beamer look like something new and different, but hardly anybody does.\nHowever, I am not a mathematician, I am a mathematics educator, and I do like my presentations to look good, and if possible to stand out a little. 
I also have a minor issue in that I use Linux on my laptop, which sometimes means my computer won\u0026rsquo;t talk to an external projector system. Or my USB thumb drive won\u0026rsquo;t be recognized by the computer I\u0026rsquo;ll be using, and so on. One way round all this is to use an online system; maybe one which can be displayed in a browser, and which can be placed on a web server somewhere. There are of course plenty of such tools, and I have had a brief dalliance with prezi, but for me prezi was not the answer: yes it was fun and provided a new paradigm for organizing slides, but really, when you took the whizz-bang aspect out, what was left? The few prezis I\u0026rsquo;ve seen in the wild showed that you can be as dull with prezi as with any other software. Also, at the time it didn\u0026rsquo;t support mathematics.\nIn fact I have an abiding distrust of the whole concept of \u0026ldquo;presentations\u0026rdquo;. Most are a colossal waste of time \u0026ndash; people can read so there\u0026rsquo;s no need for wordiness, and most of the graphs and charts that make up the rest of most slides are dreary and lacklustre. Hardly anybody knows how to present information graphically in a way that really grabs people\u0026rsquo;s attention. It\u0026rsquo;s lazy and insulting to your audience to simply copy a chart from your spreadsheet and assume they\u0026rsquo;ll be delighted by it. Then you have the large class of people who fill their blank spaces with cute cartoons and clip art. This sort of thing annoys me probably more than it should \u0026ndash; when I\u0026rsquo;m in an audience I don\u0026rsquo;t want to be entertained with cute irrelevant additions, I want to learn. This comes to the heart of presenting. A presenter is acting as a teacher; the audience the learners. So presenting should be about engaging the audience. What\u0026rsquo;s in your slides comes a distant second. 
I don\u0026rsquo;t want new technology with clever animations and transitions, bookmarks, non-linear slide shows; I want presenters to be themselves interesting. (As an aside, some of the very worst presentations have been at education conferences.)\nFor a superb example of attention-grabbing graphics, check out the TED talk by the late Hans Rosling. Or you can admire the work of David McCandless.\nI seem to have digressed, from talking about presentation software to banging on about the awfulness of presentations generally. So, back to the topic.\nFor a recent conference I determined to do just that: use an online presentation tool, and I chose reveal.js. I reckon reveal.js is presentations done right: elegant, customizable, making the best use of html for content and css for design; and with nicely chosen defaults so that even if you just put a few words on your slides the result will still look good. Even better, you can take your final slides and put them up on github pages so that you can access them from anywhere in the world with a web browser. And if you\u0026rsquo;re going somewhere which is not networked, you can always take your slides on some sort of portable media. And it has access to almost all of LaTeX via MathJax.\nOne minor problem with reveal.js is that the slides are built up with raw html code, and so can be somewhat verbose and hard to read (at least for me). However, there is a companion software for emacs org mode called org-reveal, which enables you to structure your reveal.js presentation as an org file. This is presentation heaven. The org file gives you structure, and reveal.js gives you a lovely presentation.\nTo make it available, you upload all your presentations to github.pages, and you can present from anywhere in the world with an internet connection! 
You can see an example of one of my short presentations at\nhttps://amca01.github.io/ATCM_talks/lindenmayer.html\nOf course the presentation (the software and what you do with it), is in fact the least part of your talk. By far the most important part is the presenter. The best software in the world won\u0026rsquo;t overcome a boring speaker who can\u0026rsquo;t engage an audience.\nI like my presentations to be simple and effect-free; I don\u0026rsquo;t want the audience to be distracted from my leaping and capering about. Just to see how it works\n","link":"https://numbersandshapes.net/posts/presentations_and_js_reveal/","section":"posts","tags":null,"title":"Presentations and the delight of js-reveal"},{"body":"","link":"https://numbersandshapes.net/tags/cryptography/","section":"tags","tags":null,"title":"Cryptography"},{"body":"","link":"https://numbersandshapes.net/tags/haskell/","section":"tags","tags":null,"title":"Haskell"},{"body":"Programming the Vigenère cipher is my go-to problem when learning a new language. It\u0026rsquo;s only ever a few lines of code, but it\u0026rsquo;s a pleasant way of getting to grips with some of the basics of syntax. For the past few weeks I\u0026rsquo;ve been wrestling with Haskell, and I\u0026rsquo;ve now got to the stage where a Vigenère program is in fact pretty easy.\nAs you know, the Vigenère cipher works using a plaintext and a keyword, which is repeated as often as need be:\nT H I S I S T H E P L A I N T E X T K E Y K E Y K E Y K E Y K E Y K E Y The corresponding letters are added modulo 26 (using the values A=0, B=1, C=2, and on up to Z=25), then converted back to letters again. 
So for the example above, we have these corresponding values:\n19 7 8 18 8 18 19 7 4 15 11 0 8 13 19 4 23 19\n10 4 24 10 4 24 10 4 24 10 4 24 10 4 24 10 4 24\nAdding modulo 26 and converting back to letters:\n3 11 6 2 12 16 3 11 2 25 15 24 18 17 17 14 1 17\nD L G C M Q D L C Z P Y S R R O B R\ngives us the ciphertext.\nThe Vigenère cipher is historically important as it is one of the first cryptosystems where a single letter may be encrypted to different characters in the ciphertext. For example, the two \u0026ldquo;S\u0026quot;s are encrypted to \u0026ldquo;C\u0026rdquo; and \u0026ldquo;Q\u0026rdquo;; the first and last \u0026ldquo;T\u0026quot;s are encrypted to \u0026ldquo;D\u0026rdquo; and \u0026ldquo;R\u0026rdquo;. For this reason the cipher was considered unbreakable - as indeed it was for a long time - and was known to the French as le chiffre indéchiffrable - the unbreakable cipher. It was broken in 1863. See the Wikipedia page for more history.\nSuppose the length of the keyword is \\(n\\). Then the \\(i\\)-th character of the plaintext will correspond to character \\(i \\bmod n\\) of the keyword (assuming a zero-based indexing). Thus the encryption can be defined as\n\\[ c_i = p_i+k_{i\\pmod{n}}\\pmod{26} \\]\nHowever, encryption can also be done without knowing the length of the keyword, but by shifting the keyword each time - first letter to the end - and simply taking the left-most letter. Like this:\nT H I S I S T H E P L A I N T E X T\nK E Y\nso \u0026ldquo;T\u0026rdquo;+\u0026ldquo;K\u0026rdquo; (modulo 26) is the first encryption. Then we shift the keyword:\nT H I S I S T H E P L A I N T E X T\nE Y K\nand \u0026ldquo;H\u0026rdquo;+\u0026ldquo;E\u0026rdquo; (modulo 26) is the second encrypted letter. Shift again:\nT H I S I S T H E P L A I N T E X T\nY K E\nfor \u0026ldquo;I\u0026rdquo;+\u0026ldquo;Y\u0026rdquo;; shift again:\nT H I S I S T H E P L A I N T E X T\nK E Y\nfor \u0026ldquo;S\u0026rdquo;+\u0026ldquo;K\u0026rdquo;. And so on.\nThis is almost trivial in Haskell. 
We need two extra functions from the module Data.Char: chr which gives the character corresponding to the ascii value, and ord which gives the ascii value of a character:\nλ\u0026gt; ord \u0026#39;G\u0026#39; 71 λ\u0026gt; chr 88 \u0026#39;X\u0026#39; So here\u0026rsquo;s what might go into a little file called vigenere.hs:\nimport Data.Char (ord,chr) vige :: [Char] -\u0026gt; [Char] -\u0026gt; [Char] vige [] k = [] vige p [] = [] vige (p:ps) (k:ks) = (encode p k):(vige ps (ks++[k])) where encode a b = chr $ 65 + mod (ord a + ord b) 26 vigd :: [Char] -\u0026gt; [Char] -\u0026gt; [Char] vigd [] k = [] vigd p [] = [] vigd (p:ps) (k:ks) = (decode p k):(vigd ps (ks++[k])) where decode a b = chr $ 65 + mod (ord a - ord b) 26 And a couple of tests: the example from above, and the one on the Wikipedia page:\nλ\u0026gt; vige \u0026#34;THISISTHEPLAINTEXT\u0026#34; \u0026#34;KEY\u0026#34; \u0026#34;DLGCMQDLCZPYSRROBR\u0026#34; λ\u0026gt; vige \u0026#34;ATTACKATDAWN\u0026#34; \u0026#34;LEMON\u0026#34; \u0026#34;LXFOPVEFRNHR\u0026#34; ","link":"https://numbersandshapes.net/posts/vigenere_cipher_haskell/","section":"posts","tags":["cryptography","haskell"],"title":"The Vigenere cipher in haskell"},{"body":"On November 18, 2017, a by-election was held in my suburb of Northcote, on account of the death by cancer of the sitting member. It turned into a two-way contest between Labor (who had held the seat since its inception in 1927), and the Greens, who are making big inroads into the inner city. The Greens candidate won, much to Labor\u0026rsquo;s surprise. As I played a small part in this election, I had some interest in its result. 
And so I thought I\u0026rsquo;d experiment with the results and see how close the result was, and what other voting systems might have produced.\nIn Australia, the voting method used for almost all lower house elections (state and federal), is Instant Runoff Voting, also known as the Alternative Vote, and known locally as the \u0026ldquo;preferential method\u0026rdquo;. Each voter must number the candidates sequentially starting from 1. All boxes must be filled in (except the last); no numbers can be repeated or missed. In Northcote there were 12 candidates, and so each voter had to number the boxes from 1 to 12 (or 1 to 11); any vote without those numbers is invalid and can\u0026rsquo;t be counted. Such votes are known as \u0026ldquo;informal\u0026rdquo;. Ballots are distributed according to first preferences. If no candidate has obtained an absolute majority, then the candidate with the lowest count is eliminated, and all those ballots distributed according to their second preferences. This continues through as many redistributions as necessary until one candidate ends up with an absolute majority of ballots. So at any stage the candidate with the lowest number of ballots is eliminated, and those ballots redistributed to the remaining candidates on the basis of the highest preferences. As voting systems go it\u0026rsquo;s not the worst, although it has many faults. However, it is too entrenched in Australian political life for change to be likely.\nEach candidate had prepared a How to Vote card, listing the order of candidates they saw as being most likely to ensure a good result for themselves. In fact there is no requirement for any voter to follow a How to Vote card, but most voters do. 
For this reason the ordering of candidates on these cards is taken very seriously, and one of the less savoury aspects of Australian politics is backroom \u0026ldquo;preference deals\u0026rdquo;, where parties will wheel and deal to ensure best possible preference positions on other How to Vote cards.\nHere are the 12 candidates, in the order as listed on the ballots: Hayward, Sanaghan, Thorpe, Lenk, Chipp, Cooper, Rossiter, Burns, Toscano, Edwards, Spirovska, Fontana.\nFor this election the How to Vote cards can be seen at the ABC news site. The only candidate not to provide a full ordered list was Joseph Toscano, who simply advised people to number his square 1, and the other squares in any order they liked, along with a recommendation for people to number Lidia Thorpe 2.\nAs I don\u0026rsquo;t have a complete list of all possible ballots with their orderings and numbers, I\u0026rsquo;m going to make the following assumptions:\nEvery voter followed the How to Vote card of their preferred candidate exactly.\nJoseph Toscano\u0026rsquo;s preference ordering is: 3,4,2,5,6,7,8,9,1,10,11,12 (This gives Toscano 1; Thorpe 2; and puts the numbers 3 \u0026ndash; 12 in order in the remaining spaces).\nThese assumptions are necessarily crude, and don\u0026rsquo;t reflect the nuances of the election. 
But as we\u0026rsquo;ll see they end up providing a remarkably close fit with the final results.\nFor the exploration of the voting data I\u0026rsquo;ll use Python, and so here is all the How to Vote information as a dictionary:\nIn [ ]: htv = dict() htv[\u0026#39;Hayward\u0026#39;]=[1,10,7,6,8,5,12,11,3,2,4,9] htv[\u0026#39;Sanaghan\u0026#39;]=[3,1,2,5,6,7,8,9,10,11,12,4] htv[\u0026#39;Thorpe\u0026#39;]=[6,9,1,3,10,8,12,2,7,4,5,11] htv[\u0026#39;Lenk\u0026#39;]=[7,8,3,1,5,11,12,2,9,4,6,10] htv[\u0026#39;Chipp\u0026#39;]=[10,12,4,5,1,6,7,3,11,9,2,8] htv[\u0026#39;Cooper\u0026#39;]=[5,12,8,6,2,1,7,3,11,9,10,4] htv[\u0026#39;Rossiter\u0026#39;]=[6,12,9,11,2,7,1,5,8,10,3,4] htv[\u0026#39;Burns\u0026#39;]=[10,12,5,3,2,4,6,1,11,9,8,7] htv[\u0026#39;Toscano\u0026#39;]=[3,4,2,5,6,7,8,9,1,10,11,12] htv[\u0026#39;Edwards\u0026#39;]=[2,10,4,3,8,9,12,6,5,1,7,11] htv[\u0026#39;Spirovska\u0026#39;]=[2,12,3,7,4,5,6,8,10,9,1,11] htv[\u0026#39;Fontana\u0026#39;]=[2,3,4,5,6,7,8,9,10,11,12,1] In [ ]: cands = list(htv.keys()) voting took place at different voting centres (also known as \u0026ldquo;booths\u0026rdquo;), and the first preferences for each candidate at each booth can be found at the Victorian Electoral Commission. I copied this information into a spreadsheet and saved it as a CSV file. I then used the data analysis library pandas to read it in as a DataFrame:\nIn [ ]: import pandas as pd firstprefs = pd.read_csv(\u0026#39;northcote_results.csv\u0026#39;) firsts = firstprefs.loc[:,\u0026#39;Hayward\u0026#39;:\u0026#39;Fontana\u0026#39;].sum(axis=0) firsts Out[ ]: Hayward 354 Sanaghan 208 Thorpe 16254 Lenk 770 Chipp 1149 Cooper 433 Rossiter 1493 Burns 12721 Toscano 329 Edwards 154 Spirovska 214 Fontana 1857 dtype: int64 As Thorpe has more votes than any other candidate, then by the voting system of simple plurality (or First Past The Post) she would win. 
This system is used in the USA, and is possibly the worst of all systems for more than two candidates.\nChecking IRV\nSo let\u0026rsquo;s first check how IRV works, with a little program that starts with a dictionary and first preferences of each candidate. Recall our simplifying assumption that all voters vote according to the How to Vote cards, which means that when a candidate is eliminated, all those votes will go to just one other remaining candidate. In practice, of course, those ballots would be redistributed across a number of candidates.\nHere\u0026rsquo;s a simple program to manage this version of IRV:\ndef IRV(votes):\n    # performs an IRV simulation on a list of first preferences: at each stage\n    # deleting the candidate with the lowest current score, and distributing\n    # that candidate\u0026#39;s votes to the highest remaining candidate\n    vote_counts = votes.copy()\n    for i in range(10):\n        # m is the (candidate, count) pair with the lowest current count\n        m = min(vote_counts.items(), key = lambda x: x[1])\n        # ind is the best preference on m\u0026#39;s card that points to a remaining candidate\n        ind = next(j for j in range(2,13) if cands[htv[m[0]].index(j)] in vote_counts)\n        c = cands[htv[m[0]].index(ind)]\n        vote_counts[c] += m[1]\n        del(vote_counts[m[0]])\n    return(vote_counts)\nWe could make this code a little more efficient by stopping when any candidate has amassed over 50% of the votes. But for simplicity we\u0026rsquo;ll eliminate 10 of the 12 candidates, so it will be perfectly clear who has won. Let\u0026rsquo;s try it out:\nIn [ ]: IRV(firsts)\nOut[ ]:\nThorpe 18648\nBurns 17288\ndtype: int64\nNote that this is very close to the results listed on the VEC site:\nThorpe: 18380\nBurns: 14410\nFontana: 3298\nAt this stage it doesn\u0026rsquo;t matter where Fontana\u0026rsquo;s votes go (in fact they would go to Burns), as Thorpe already has a majority. 
But the result we obtained above with our simplifying assumptions gives very similar values.\nNow lets see what happens if we work through each booth independently:\nIn [ ]: finals = {\u0026#39;Thorpe\u0026#39;:0,\u0026#39;Burns\u0026#39;:0} In [ ]: for i in firstprefs.index: ...: booth = dict(firstprefs.loc[i,\u0026#39;Hayward\u0026#39;:\u0026#39;Fontana\u0026#39;]) ...: f = IRV(booth) ...: finals[\u0026#39;Thorpe\u0026#39;] += f[\u0026#39;Thorpe\u0026#39;] ...: finals[\u0026#39;Burns\u0026#39;] += f[\u0026#39;Burns\u0026#39;] ...: print(firstprefs.loc[i,\u0026#39;Booth\u0026#39;],\u0026#39;: \u0026#39;,f) ...: Alphington : {\u0026#39;Thorpe\u0026#39;: 524, \u0026#39;Burns\u0026#39;: 545} Alphington North : {\u0026#39;Thorpe\u0026#39;: 408, \u0026#39;Burns\u0026#39;: 485} Bell : {\u0026#39;Thorpe\u0026#39;: 1263, \u0026#39;Burns\u0026#39;: 893} Croxton : {\u0026#39;Thorpe\u0026#39;: 950, \u0026#39;Burns\u0026#39;: 668} Darebin Parklands : {\u0026#39;Thorpe\u0026#39;: 180, \u0026#39;Burns\u0026#39;: 204} Fairfield : {\u0026#39;Thorpe\u0026#39;: 925, \u0026#39;Burns\u0026#39;: 742} Northcote : {\u0026#39;Thorpe\u0026#39;: 1043, \u0026#39;Burns\u0026#39;: 875} Northcote North : {\u0026#39;Thorpe\u0026#39;: 1044, \u0026#39;Burns\u0026#39;: 1012} Northcote South : {\u0026#39;Thorpe\u0026#39;: 1392, \u0026#39;Burns\u0026#39;: 1137} Preston South : {\u0026#39;Thorpe\u0026#39;: 677, \u0026#39;Burns\u0026#39;: 639} Thornbury : {\u0026#39;Thorpe\u0026#39;: 1158, \u0026#39;Burns\u0026#39;: 864} Thornbury East : {\u0026#39;Thorpe\u0026#39;: 1052, \u0026#39;Burns\u0026#39;: 804} Thornbury South : {\u0026#39;Thorpe\u0026#39;: 1310, \u0026#39;Burns\u0026#39;: 1052} Westgarth : {\u0026#39;Thorpe\u0026#39;: 969, \u0026#39;Burns\u0026#39;: 536} Postal Votes : {\u0026#39;Thorpe\u0026#39;: 1509, \u0026#39;Burns\u0026#39;: 2262} Early Votes : {\u0026#39;Thorpe\u0026#39;: 5282, \u0026#39;Burns\u0026#39;: 3532} In [ ]: finals Out[ ]: {\u0026#39;Burns\u0026#39;: 16250, 
\u0026#39;Thorpe\u0026#39;: 19686}\nNote again that the results are surprisingly close to the \u0026ldquo;two-party preferred\u0026rdquo; results as reported again on the VEC site. This adds weight to the notion that our assumptions, although crude, do in fact provide a reasonable way of experimenting with the election results.\nBorda counts\nThese are named for Jean Charles de Borda (1733 \u0026ndash; 1799), an early voting theorist. The idea is to weight all the preferences, so that a preference of 1 has a higher weighting than a preference of 2, and so on. All the weights are added, and the candidate with the greatest total is deemed to be the winner. With \\(n\\) candidates, there are different methods of determining weighting; probably the most popular is a simple linear weighting, so that a preference of \\(k\\) is weighted as \\(n-k\\). This gives weightings from \\(n-1\\) down to zero. Alternatively a weighting of \\(n-k+1\\) can be used, which gives weights of \\(n\\) down to \\(1\\).\nBoth are equivalent in determining a winner. Another possible weighting is \\(1/k\\).\nHere\u0026rsquo;s a program to compute Borda counts, again with our simplification:\ndef borda(x): # x is 0 (weights 1/i) or 1 (linear weights)\n    borda_count = dict()\n    for c in cands:\n        borda_count[c]=0.0\n    for c in cands:\n        v = firsts[c] # number of 1st pref votes for candidate c\n        for i in range(1,13):\n            appr = cands[htv[c].index(i)] # the candidate against position i on c htv card\n            if x==0:\n                borda_count[appr] += v/i\n            else:\n                borda_count[appr] += v*(11-i) # linear weights shifted by a constant, so the ranking is unchanged\n    if x==0:\n        for k, val in borda_count.items():\n            borda_count[k] = float(\u0026#34;{:.2f}\u0026#34;.format(val))\n    else:\n        for k, val in borda_count.items():\n            borda_count[k] = int(val)\n    return(borda_count)\nNow we can run this, and to make our lives easier we\u0026rsquo;ll sort the results:\nIn [ ]: sorted(borda(1).items(), key = lambda x: x[1], reverse = True)\nOut[ ]: [(\u0026#39;Burns\u0026#39;, 308240), (\u0026#39;Thorpe\u0026#39;, 279392), (\u0026#39;Lenk\u0026#39;, 266781), (\u0026#39;Chipp\u0026#39;, 179179), (\u0026#39;Cooper\u0026#39;, 167148), 
(\u0026#39;Spirovska\u0026#39;, 165424), (\u0026#39;Edwards\u0026#39;, 154750), (\u0026#39;Hayward\u0026#39;, 136144), (\u0026#39;Fontana\u0026#39;, 88988), (\u0026#39;Toscano\u0026#39;, 80360), (\u0026#39;Rossiter\u0026#39;, 75583), (\u0026#39;Sanaghan\u0026#39;, 38555)] In [ ]: sorted(borda(0).items(), key = lambda x: x[1], reverse = True) Out[ ]: [(\u0026#39;Burns\u0026#39;, 22409.53), (\u0026#39;Thorpe\u0026#39;, 20455.29), (\u0026#39;Lenk\u0026#39;, 11485.73), (\u0026#39;Chipp\u0026#39;, 10767.9), (\u0026#39;Spirovska\u0026#39;, 6611.22), (\u0026#39;Cooper\u0026#39;, 6592.5), (\u0026#39;Edwards\u0026#39;, 6569.93), (\u0026#39;Hayward\u0026#39;, 6186.93), (\u0026#39;Fontana\u0026#39;, 6006.25), (\u0026#39;Rossiter\u0026#39;, 5635.08), (\u0026#39;Toscano\u0026#39;, 4600.15), (\u0026#39;Sanaghan\u0026#39;, 4196.47)] Note that in both cases Burns has the highest output. This is in general to be expected of Borda counts: that the highest value does not necessarily correspond to the candidate which is seen as better overall. For this reason Borda counts are rarely used in modern systems, although they can be used to give a general picture of an electorate.\nCondorcet criteria There are a vast number of voting systems which treat the vote as simultaneous pairwise contests. For example in a three way contest, between Alice, Bob, and Charlie the system considers the contest between Alice and Bob, between Alice and Charlie, and between Bob and Charlie. Each of these contests will produce a winner, and the outcome of all the pairwise contests is used to determine the overall winner. If there is a single person who is preferred, by a majority of voters, in each of their pairwise contests, then that person is called a Condorcet winner. This is named for the Marquis de Condorcet (1743 \u0026ndash; 1794) another early voting theorist. 
The Condorcet criterion is one of many criteria considered appropriate for a voting system; it says that if the ballots return a Condorcet winner, then that winner should be chosen by the system. This is one of the faults of IRV: that it does not necessarily return a Condorcet winner.\nLet\u0026rsquo;s look again at the How to Vote preferences, and the numbers of voters of each:\nIn [ ]: htvd = pd.DataFrame(list(htv.values()),index=htv.keys(),columns=htv.keys()).transpose() In [ ]: htvd.loc[\u0026#39;Firsts\u0026#39;]=list(firsts.values) In [ ]: htvd Out[ ]: Hayward Sanaghan Thorpe Lenk Chipp Cooper Rossiter Burns Toscano Edwards Spirovska Fontana Hayward 1 3 6 7 10 5 6 10 3 2 2 2 Sanaghan 10 1 9 8 12 12 12 12 4 10 12 3 Thorpe 7 2 1 3 4 8 9 5 2 4 3 4 Lenk 6 5 3 1 5 6 11 3 5 3 7 5 Chipp 8 6 10 5 1 2 2 2 6 8 4 6 Cooper 5 7 8 11 6 1 7 4 7 9 5 7 Rossiter 12 8 12 12 7 7 1 6 8 12 6 8 Burns 11 9 2 2 3 3 5 1 9 6 8 9 Toscano 3 10 7 9 11 11 8 11 1 5 10 10 Edwards 2 11 4 4 9 9 10 9 10 1 9 11 Spirovska 4 12 5 6 2 10 3 8 11 7 1 12 Fontana 9 4 11 10 8 4 4 7 12 11 11 1 Firsts 354 208 16254 770 1149 433 1493 12721 329 154 214 1857 Here the how to vote information is in the columns. If we look at just the first two candidates, we see that Hayward is preferred to Sanaghan by all voters except for those who voted for Sanaghan. 
Thus a majority (in fact, nearly all) voters preferred Hayward to Sanaghan.\nFor each pair of candidates, the number of voters preferring one to the other can be computed by this program:\ndef condorcet():\n    condorcet_table = pd.DataFrame(columns=cands,index=cands).fillna(0)\n    for c in cands:\n        hc = htv[c] # the How to Vote card of candidate c\n        for i in range(12):\n            for j in range(12):\n                if hc[i] \u0026lt; hc[j]: # cands[i] is preferred to cands[j] on this card\n                    condorcet_table.loc[cands[i],cands[j]] += firsts[c]\n    return(condorcet_table)\nWe can see the results of this program:\nIn [ ]: ct = condorcet(); ct\nOut[ ]:\nHayward Sanaghan Thorpe Lenk Chipp Cooper Rossiter Burns Toscano Edwards Spirovska Fontana\nHayward 0 35728 4505 5042 19370 21633 20573 3116 35607 4888 3335 18283\nSanaghan 208 0 2065 2394 18648 3164 19926 2748 2835 2394 2394 17715\nThorpe 31431 33871 0 21504 20140 20935 34010 19370 33760 35428 32726 32153\nLenk 30894 33542 14432 0 19926 33442 34229 3886 33760 33935 32726 31945\nChipp 16566 17288 15796 16010 0 18895 34443 6037 18845 18404 18960 33871\nCooper 14303 32772 15001 2494 17041 0 34443 3395 18075 18404 15548 31608\nRossiter 15363 16010 1926 1707 1493 1493 0 4101 18075 18404 17041 15906\nBurns 32820 33188 16566 32050 29899 32541 31835 0 35099 35428 32726 32024\nToscano 329 33101 2176 2176 17091 17861 17861 837 0 3887 2902 18075\nEdwards 31048 33542 508 2001 17532 17532 17532 508 32049 0 20359 18075\nSpirovska 32601 33542 3210 3210 16976 20388 18895 3210 33034 15577 0 20717\nFontana 17653 18221 3783 3991 2065 4328 20030 3912 17861 17861 15219 0\nWhat we want to see, of course, is whether anybody has obtained a majority of preferences against everybody else. To do this we can find all the values which are at least the majority, and count them. 
A value of 11 indicates a Condorcet winner:\nIn [ ]: maj = firsts.sum()//2 + 1; maj\nOut[ ]: 17969\nIn [ ]: ((ct \u0026gt;= maj)*1).sum(axis = 1)\nOut[ ]:\nHayward 6.0\nSanaghan 2.0\nThorpe 11.0\nLenk 9.0\nChipp 6.0\nCooper 5.0\nRossiter 2.0\nBurns 10.0\nToscano 2.0\nEdwards 5.0\nSpirovska 6.0\nFontana 2.0\ndtype: float64\nSo in this case we do indeed have a Condorcet winner in Thorpe, and this election (at least with our simplifying assumptions) is also one in which IRV returned the Condorcet winner.\nRange and approval voting\nIf you go to rangevoting.org you\u0026rsquo;ll find a spirited defense of a system called range voting. To vote in such a system, each voter gives an \u0026ldquo;approval weight\u0026rdquo; for each candidate. For example, the voter may mark off a value between 0 and 10 against each candidate, indicating their level of approval. There is no requirement for a voter to mark candidates differently: a voter might give all candidates a value of 10, or of zero, or give one candidate 10 and all the others zero. One simplified version of range voting is approval voting, where the voter simply indicates as many or as few candidates as she or he approves of. A voter may approve of just one candidate, or all of them. As with range voting, the winner is the one with the maximum number of approvals. A system where each voter approves of just one candidate is the First Past the Post system, and as we have seen previously, this is equivalent to simply counting only the first preferences of our ballots.\nWe can\u0026rsquo;t possibly know how voters may have approved of the candidates, but we can run a simple simulation: given a number \\(n\\) between 1 and 12, suppose that each voter approves of their first \\(n\\) preferences. 
Given the preferences and numbers, we can easily tally the approvals for each candidate:\ndef approvals(n):\n    # Determines the approvals result if voters took their\n    # first n preferences as approvals\n    approvals_result = dict()\n    for c in cands:\n        approvals_result[c] = 0\n    firsts = firstprefs.loc[:,\u0026#39;Hayward\u0026#39;:\u0026#39;Fontana\u0026#39;].sum(axis=0)\n    for c in cands:\n        v = firsts[c] # number of 1st pref votes for candidate c\n        for i in range(1,n+1):\n            appr = cands[htv[c].index(i)] # the candidate against position i on c htv card\n            approvals_result[appr] += v\n    return(approvals_result)\nNow we can see what happens with approvals for \\(n = 1, 2, \\ldots, 6\\):\nIn [ ]: for i in range(1,7):\n   ...:     si = sorted(approvals(i).items(),key = lambda x: x[1],reverse=True)\n   ...:     print([i]+[s[0] for s in si])\n   ...:\n[1, \u0026#39;Thorpe\u0026#39;, \u0026#39;Burns\u0026#39;, \u0026#39;Fontana\u0026#39;, \u0026#39;Rossiter\u0026#39;, \u0026#39;Chipp\u0026#39;, \u0026#39;Lenk\u0026#39;, \u0026#39;Cooper\u0026#39;, \u0026#39;Hayward\u0026#39;, \u0026#39;Toscano\u0026#39;, \u0026#39;Spirovska\u0026#39;, \u0026#39;Sanaghan\u0026#39;, \u0026#39;Edwards\u0026#39;]\n[2, \u0026#39;Burns\u0026#39;, \u0026#39;Thorpe\u0026#39;, \u0026#39;Chipp\u0026#39;, \u0026#39;Hayward\u0026#39;, \u0026#39;Fontana\u0026#39;, \u0026#39;Rossiter\u0026#39;, \u0026#39;Spirovska\u0026#39;, \u0026#39;Lenk\u0026#39;, \u0026#39;Edwards\u0026#39;, \u0026#39;Cooper\u0026#39;, \u0026#39;Toscano\u0026#39;, \u0026#39;Sanaghan\u0026#39;]\n[3, \u0026#39;Burns\u0026#39;, \u0026#39;Lenk\u0026#39;, \u0026#39;Thorpe\u0026#39;, \u0026#39;Chipp\u0026#39;, \u0026#39;Hayward\u0026#39;, \u0026#39;Spirovska\u0026#39;, \u0026#39;Sanaghan\u0026#39;, \u0026#39;Fontana\u0026#39;, \u0026#39;Rossiter\u0026#39;, \u0026#39;Toscano\u0026#39;, \u0026#39;Edwards\u0026#39;, \u0026#39;Cooper\u0026#39;]\n[4, \u0026#39;Burns\u0026#39;, \u0026#39;Lenk\u0026#39;, \u0026#39;Thorpe\u0026#39;, \u0026#39;Edwards\u0026#39;, \u0026#39;Chipp\u0026#39;, \u0026#39;Cooper\u0026#39;, 
\u0026#39;Fontana\u0026#39;, \u0026#39;Spirovska\u0026#39;, \u0026#39;Hayward\u0026#39;, \u0026#39;Sanaghan\u0026#39;, \u0026#39;Rossiter\u0026#39;, \u0026#39;Toscano\u0026#39;] [5, \u0026#39;Thorpe\u0026#39;, \u0026#39;Lenk\u0026#39;, \u0026#39;Burns\u0026#39;, \u0026#39;Spirovska\u0026#39;, \u0026#39;Edwards\u0026#39;, \u0026#39;Chipp\u0026#39;, \u0026#39;Cooper\u0026#39;, \u0026#39;Fontana\u0026#39;, \u0026#39;Hayward\u0026#39;, \u0026#39;Sanaghan\u0026#39;, \u0026#39;Rossiter\u0026#39;, \u0026#39;Toscano\u0026#39;] [6, \u0026#39;Lenk\u0026#39;, \u0026#39;Thorpe\u0026#39;, \u0026#39;Burns\u0026#39;, \u0026#39;Hayward\u0026#39;, \u0026#39;Spirovska\u0026#39;, \u0026#39;Chipp\u0026#39;, \u0026#39;Edwards\u0026#39;, \u0026#39;Cooper\u0026#39;, \u0026#39;Rossiter\u0026#39;, \u0026#39;Fontana\u0026#39;, \u0026#39;Sanaghan\u0026#39;, \u0026#39;Toscano\u0026#39;] It\u0026rsquo;s remarkable that, after n = 1, the first number of approvals required for Thorpe to win again is n = 5.\nOther election methods There are of course many, many other methods of selecting a winning candidate from ordered ballots. And each of them has advantages and disadvantages. Some of the disadvantages are subtle (although important); others are glaring inadequacies, such as first past the post for more than two candidates. One such comparison table lists voting methods against standard criteria. Note that IRV \u0026ndash; the Australian preferential system \u0026ndash; is one of the very few methods to fail monotonicity. This is seen as one of the system\u0026rsquo;s worst failings. You can see an example of this in an old blog post.\nRather than write our own programs, we shall simply dump our information into the Ranked-ballot voting calculator page and see what happens. 
First the data needs to be massaged into an appropriate form:\nIn [ ]: for c in cands: ...: st = str(firsts[c])+\u0026#34;:\u0026#34;+c ...: for i in range(2,13): ...: st += \u0026#34;\u0026gt;\u0026#34;+cands[htv[c].index(i)] ...: print(st) ...: 354:Hayward\u0026gt;Edwards\u0026gt;Toscano\u0026gt;Spirovska\u0026gt;Cooper\u0026gt;Lenk\u0026gt;Thorpe\u0026gt;Chipp\u0026gt;Fontana\u0026gt;Sanaghan\u0026gt;Burns\u0026gt;Rossiter 208:Sanaghan\u0026gt;Thorpe\u0026gt;Hayward\u0026gt;Fontana\u0026gt;Lenk\u0026gt;Chipp\u0026gt;Cooper\u0026gt;Rossiter\u0026gt;Burns\u0026gt;Toscano\u0026gt;Edwards\u0026gt;Spirovska 16254:Thorpe\u0026gt;Burns\u0026gt;Lenk\u0026gt;Edwards\u0026gt;Spirovska\u0026gt;Hayward\u0026gt;Toscano\u0026gt;Cooper\u0026gt;Sanaghan\u0026gt;Chipp\u0026gt;Fontana\u0026gt;Rossiter 770:Lenk\u0026gt;Burns\u0026gt;Thorpe\u0026gt;Edwards\u0026gt;Chipp\u0026gt;Spirovska\u0026gt;Hayward\u0026gt;Sanaghan\u0026gt;Toscano\u0026gt;Fontana\u0026gt;Cooper\u0026gt;Rossiter 1149:Chipp\u0026gt;Spirovska\u0026gt;Burns\u0026gt;Thorpe\u0026gt;Lenk\u0026gt;Cooper\u0026gt;Rossiter\u0026gt;Fontana\u0026gt;Edwards\u0026gt;Hayward\u0026gt;Toscano\u0026gt;Sanaghan 433:Cooper\u0026gt;Chipp\u0026gt;Burns\u0026gt;Fontana\u0026gt;Hayward\u0026gt;Lenk\u0026gt;Rossiter\u0026gt;Thorpe\u0026gt;Edwards\u0026gt;Spirovska\u0026gt;Toscano\u0026gt;Sanaghan 1493:Rossiter\u0026gt;Chipp\u0026gt;Spirovska\u0026gt;Fontana\u0026gt;Burns\u0026gt;Hayward\u0026gt;Cooper\u0026gt;Toscano\u0026gt;Thorpe\u0026gt;Edwards\u0026gt;Lenk\u0026gt;Sanaghan 
12721:Burns\u0026gt;Chipp\u0026gt;Lenk\u0026gt;Cooper\u0026gt;Thorpe\u0026gt;Rossiter\u0026gt;Fontana\u0026gt;Spirovska\u0026gt;Edwards\u0026gt;Hayward\u0026gt;Toscano\u0026gt;Sanaghan 329:Toscano\u0026gt;Thorpe\u0026gt;Hayward\u0026gt;Sanaghan\u0026gt;Lenk\u0026gt;Chipp\u0026gt;Cooper\u0026gt;Rossiter\u0026gt;Burns\u0026gt;Edwards\u0026gt;Spirovska\u0026gt;Fontana 154:Edwards\u0026gt;Hayward\u0026gt;Lenk\u0026gt;Thorpe\u0026gt;Toscano\u0026gt;Burns\u0026gt;Spirovska\u0026gt;Chipp\u0026gt;Cooper\u0026gt;Sanaghan\u0026gt;Fontana\u0026gt;Rossiter 214:Spirovska\u0026gt;Hayward\u0026gt;Thorpe\u0026gt;Chipp\u0026gt;Cooper\u0026gt;Rossiter\u0026gt;Lenk\u0026gt;Burns\u0026gt;Edwards\u0026gt;Toscano\u0026gt;Fontana\u0026gt;Sanaghan 1857:Fontana\u0026gt;Hayward\u0026gt;Sanaghan\u0026gt;Thorpe\u0026gt;Lenk\u0026gt;Chipp\u0026gt;Cooper\u0026gt;Rossiter\u0026gt;Burns\u0026gt;Toscano\u0026gt;Edwards\u0026gt;Spirovska The above can be copied and pasted into the given text box. Then the page returns:\nwinner method(s) Thorpe Baldwin Black Carey Coombs Copeland Dodgson Hare Nanson Raynaud Schulze Simpson Small Tideman Burns Borda Bucklin You can see that Thorpe would be the winner under almost every other voting system. This indicates that Thorpe being returned by IRV seems not just an artifact of the system, but represents the genuine wishes of the electorate.\n","link":"https://numbersandshapes.net/posts/analysis_recent_election/","section":"posts","tags":["voting","python"],"title":"Analysis of a recent election"},{"body":"Every few years I decide to have a go at using a CAD package for the creation of 3D diagrams and shapes, and every time I give it up. 
There\u0026rsquo;s simply too much to learn in terms of creating shapes, moving them about, and so on, and every system seems to have its own ways of doing things. My son (who is an expert in Blender) recommended that I experiment with Tinkercad, and indeed this is probably a pretty easy way of getting started with 3D CAD. But it didn\u0026rsquo;t suit me: I wanted to place things precisely in relation to each other, and fiddling with dragging and dropping with the mouse was harder and more inconvenient than it should have been. No doubt there are ways of getting exact line ups, but they aren\u0026rsquo;t obvious to the raw beginner.\nI then discovered that there are lots of different CAD \u0026ldquo;programming languages\u0026rdquo;; or more properly scripting languages, where the user describes how the figure is to be built in the system\u0026rsquo;s language. Then the system builds it from the script. In this sense these systems are descendants of the venerable VRML, of which you can see some examples here, and its modern version X3D.\nSome of the systems that I looked at were:\nOpenSCAD, which uses its own scripting language; OpenJSCAD, based on JavaScript; implicitCAD, based on Haskell. No doubt there are others. All of these systems have primitive shapes (spheres, cubes, cylinders etc), operations on shapes (shifting, stretching, rotating, extruding etc) so a vast array of different forms can be generated. Some systems allow for a great deal of flexibility, so that a cylinder with a radius of zero at one end will be a cone, or of different radii at each end a frustum.\nI ended up choosing OpenJSCAD, which is being actively developed, is based on a well known and robust language, and is also great fun to use. Here is a simple example, to construct a tetrahedron whose vertices are chosen from the vertices of a cube with vertices (±1, ±1, ±1). The vertices whose coordinates have product 1 will be the vertices of a tetrahedron. 
We can make a nice tetrahedral shape by putting a small sphere at each vertex, and joining each sphere by a cylinder of the same radius:\nThe code should be fairly self-explanatory. And here is the tetrahedron:\nI won\u0026rsquo;t put these models in this post, as one of them is slow to render: but look at a coloured tetrahedron, and an icosahedron.\nNote that CAD design of this sort is not so much for animated media so much as precise designs for 3D printing. But I like it for exploring 3D geometry.\n","link":"https://numbersandshapes.net/posts/programmable-cad/","section":"posts","tags":["CAD"],"title":"Programmable CAD"},{"body":" #\nSelf avoiding walks A self avoiding walk is a path through a graph where no vertex is visited more than once. One problem is to consider the cartesian lattice, and the paths from $(0,0)$ to $(m,n)$. If we only allow paths in the \u0026#34;positive direction\u0026#34;, so that from $(i,j)$ the path can only proceed to $(i+1,j)$ or $(i,j+1)$ we have what are sometimes called \u0026#34;staircase paths\u0026#34;:\nThe number of such paths can be easily shown to be\n\\[ \\dbinom{m+n}{n}. \\]\nSuppose that the number of ways of reaching $(x,y)$ is $N(x,y)$. There is only one way to reach either of $(i,0)$ or $(0,j)$, that is $N(i,0) = N(0,j) = 1$. 
Also, the number of ways of reaching $(x,y)$ is the sum of the numbers of ways of reaching $(x-1,y)$ and $(x,y-1)$; that is $N(x,y) = N(x-1,y)+N(x,y-1)$.\nWe thus have the following numbers:\n$$\\begin{array}{rrrrrrr} \\vdots \\\\ 1\u0026amp;6\u0026amp;21\u0026amp;56\u0026amp;126\u0026amp;252\\\\ 1\u0026amp;5\u0026amp;15\u0026amp;35\u0026amp;70\u0026amp;126\\\\ 1\u0026amp;4\u0026amp;10\u0026amp;20\u0026amp;35\u0026amp;56\\\\ 1\u0026amp;3\u0026amp;6\u0026amp;10\u0026amp;15\u0026amp;21\\\\ 1\u0026amp;2\u0026amp;3\u0026amp;4\u0026amp;5\u0026amp;6\\\\ 1\u0026amp;1\u0026amp;1\u0026amp;1\u0026amp;1\u0026amp;1\u0026amp;\\ldots \\end{array}$$\nwhich are obviously the binomial coefficients, and with\n\\[ N(x,y) = \\dbinom{x+y}{x}. \\]\nSo for the $10\\times 10$ grid above, the number of staircase walks is\n\\[ \\dbinom{20}{10}=184{,}756. \\]\nIf we allow walks that can go in any direction (with the only restriction being that they stay within the grid, and never visit a lattice point more than once), the number of self avoiding walks becomes very large.\nThere is in fact no known formula for the number of such paths on an $n\\times n$ grid, but the numbers are given as [sequence A007764 of the OEIS](https://oeis.org/A007764), and for a $10\\times 10$ grid there are\n\\[ 1{,}568{,}758{,}030{,}464{,}750{,}013{,}214{,}100 \\]\nsuch paths. That\u0026#39;s about 1.6 heptillion.\nThirty years of the ATCM conference The [Asian Technology Conference in Mathematics](https://atcm.mathandtech.org) is a friendly, genial conference which has been held almost every year since 1996. All the proceedings are freely available on that website, but there is no search method. If you want to find out if anybody has presented a paper about, say, teaching linear algebra with Python, you\u0026#39;re stuck.\nTo help, I created a database of all papers and authors since 1997. 
The 1996 proceedings only seem to exist as a PDF file for which each page is an image; to create a database of authors, affiliations, papers, titles and abstracts would require more copying by hand than I want. Maybe some day…\nThis database has been put online using the excellent open-source self-hosted system [Mathesar](https://mathesar.org) - named, you\u0026#39;ll be delighted to know, for Enrico Colantoni\u0026#39;s brilliant character in the utterly brilliant film [Galaxy Quest](https://www.imdb.com/title/tt0177789/).\nMy implementation is currently off-line, but will be made available again soon (I hope).\nThe database has two tables: one consists of all papers (well, all presentations really), which includes papers, workshops, posters and panel sessions. There is one record for each paper. The other table lists all the authors: there is a record for every author/paper combination. So if a paper has four authors, there will be four records in this table.\nThere are many faults with this database:\nAs much of the information has been scraped from the ATCM website using the Python library [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/), there is text which hasn\u0026#39;t survived the scraping. A single author might exist in different forms, depending on whether initials are used, or various accented letters. For example, \u0026#34;Nguyễn Ngọc Trường Sơn\u0026#34; might appear as \u0026#34;Ngọc Trường Sơn Nguyễn\u0026#34;, or without any of the accents as \u0026#34;Truong Son Ngoc Nguyen\u0026#34; or as \u0026#34;Truong Son N Nguyen\u0026#34; or as \u0026#34;T S N Nguyen\u0026#34;, and so on. This partly depends on the spelling in the proceedings. Ensuring that all authors appear with only one spelling of their name is nearly impossible - at least for me. 
This in turn makes it impossible to determine precisely the number of different authors and speakers. Some titles and abstracts include mathematics typeset with LaTeX. This generally becomes garbled by the time it makes it into the database. A single author might appear several times with a different affiliation, if they\u0026#39;ve changed jobs, or maybe were on a sabbatical. Thus \u0026#34;Wolfgang Mozart, University of Vienna\u0026#34;, and \u0026#34;Wolfgang A. Mozart, University of Salzburg\u0026#34;, will appear as 2 different authors.\nThen again, two different people might share the same name. Are \u0026#34;Li Chen, Normal University of Beijing\u0026#34;, and \u0026#34;Li Chen, University of Shanghai\u0026#34; different people or not? However, even with those caveats, we can obtain a pretty good idea of the conference statistics since 1997.\nSome numbers There have been 2027 papers published, by about 2013 different authors. (The exact number of authors will be a bit less than this, but I\u0026#39;m hoping not by too much.)\nThe country which has provided the largest number of different authors is China; the following table shows the top ten:\nCountry Number of authors China 244 Malaysia 228 Japan 201 USA 175 Philippines 149 Singapore 90 Australia 81 Taiwan 75 South Korea 72 India 69 The most published author has appeared 37 times. (I\u0026#39;ve got a measly 19). Authors have come from 75 different countries. By region:\nRegion Number of authors Asia 1233 Europe 204 North America 192 Oceania 101 Middle East 77 Africa 17 Independent States 12 The \u0026#34;Independent States\u0026#34; region includes Russia and previous members of the USSR.\nThe Aberth-Ehrlich method Recall from the previous post that the Weierstrass-Durand-Kerner (WDK) method finds all roots of a polynomial equation in a manner similar to Newton\u0026#39;s method, and so converges quadratically. 
The Aberth-Ehrlich method (more commonly called Aberth\u0026#39;s method), is similar, but converges cubically, and as we will see, is similar to Halley\u0026#39;s method for root finding.\n* Derivation from Halley\u0026#39;s method\nHalley\u0026#39;s root finding method is sometimes called the most often rediscovered method in numerical analysis. It is an iterative method defined by\n\\[ x \\leftarrow x - \\frac{f(x)f\u0026#39;(x)}{f\u0026#39;(x)^2 - \\dfrac{1}{2}f(x)f\u0026#39;\u0026#39;(x)}. \\]\nSee the [Wikipedia page](https://en.wikipedia.org/wiki/Halley%27s_method) for the derivation.\nWe can divide through by $f(x)f\u0026#39;(x)$ to obtain the formulation\n\\[ x \\leftarrow x - \\frac{1}{\\dfrac{f\u0026#39;(x)}{f(x)} - \\dfrac{1}{2}\\dfrac{f\u0026#39;\u0026#39;(x)}{f\u0026#39;(x)}} = x - \\left[\\frac{f\u0026#39;(x)}{f(x)} - \\frac{1}{2}\\frac{f\u0026#39;\u0026#39;(x)}{f\u0026#39;(x)}\\right]^{-1}. \\]\nwhich is very slightly easier to write.\nSuppose now we are trying to find all roots simultaneously of a monic quartic polynomial $p(x)$, and that the current approximations are $a,b,c,d$. And as before, define the temporary function\n\\[ t(x) = (x-a)(x-b)(x-c)(x-d). \\]\nThen we have\n$$\\begin{aligned} t\u0026#39;(x) \u0026amp;= (x-a)(x-b)(x-c)+(x-a)(x-b)(x-d)\\\\ \u0026amp;\\qquad +(x-a)(x-c)(x-d)+(x-b)(x-c)(x-d) \\end{aligned}$$\nand\n$$\\begin{aligned} t\u0026#39;\u0026#39;(x) \u0026amp;= 2\\bigl((x-a)(x-b)+(x-a)(x-c)+(x-a)(x-d)\\bigr.\\\\ \u0026amp;\\qquad +\\bigl.(x-b)(x-c)+(x-b)(x-d)+(x-c)(x-d)\\bigr). \\end{aligned}$$\nThen, evaluating at $x=a$ (where every term containing the factor $x-a$ vanishes),\n$$\\begin{aligned} \\frac{t\u0026#39;\u0026#39;(a)}{t\u0026#39;(a)} \u0026amp;= \\frac{2\\left((a-b)(a-c)+(a-b)(a-d)+(a-c)(a-d)\\right)}{(a-b)(a-c)(a-d)}\\\\ \u0026amp; = 2\\left(\\frac{1}{a-b}+\\frac{1}{a-c}+\\frac{1}{a-d}\\right). 
\\end{aligned}$$\nSubstituting this into the second fraction in the denominator above produces the iteration:\n\\[ a\\leftarrow a - \\left[\\frac{p\u0026#39;(a)}{p(a)}-\\left(\\frac{1}{a-b}+\\frac{1}{a-c}+\\frac{1}{a-d}\\right)\\right]^{-1}. \\]\nMore generally, if the current approximations are $a_1,a_2,\\ldots,a_n$, then\n\\[ a_k \\leftarrow a_k - \\left[\\frac{p\u0026#39;(a_k)}{p(a_k)}-\\sum\\limits_{\\substack{i=1\\\\i\\ne k}}^n\\frac{1}{a_k-a_i}\\right]^{-1}. \\]\nThis is the Aberth-Ehrlich method.\nIt will be seen that our derivation is quite general, and will work for polynomials of any degree. If the polynomial is not monic, or if we don\u0026#39;t want to divide through by the leading coefficient, the only difference is that we define\n\\[ t(x) = a_0(x-a)(x-b)(x-c)(x-d) \\]\nwhere $a_0$ is the leading coefficient; in this case the coefficient of $x^4$.\n* Implementation\nAs previously, we\u0026#39;ll use PARI/GP, and write the method in the form\n\\[ a_k \\leftarrow a_k - \\left[\\frac{p\u0026#39;(a_k)}{p(a_k)}-\\frac{1}{2}\\frac{t\u0026#39;\u0026#39;(a_k)}{t\u0026#39;(a_k)}\\right]^{-1} \\]\nwhere $t(x)$ is the product of $x - a_k$ for all current root approximations $a_k$.\nThe function definition is very similar to that for WDK (note that we\u0026#39;ve included the use of the leading coefficient):\naberth(f,eps,N = 200) = { local(deg,xs,df,ys,dt,dfrac,single_diffs,mean_diffs_vector,mean_diffs,count); deg = poldegree(f); lead = pollead(f); xs = vector(deg,k,(0.6 + 0.8*I)^k); \\\\ starting values df = deriv(f); \\\\ derivative of f ys = vector(deg); \\\\ next computed values single_diffs = vector(deg); mean_diffs = 1.0; mean_diffs_vector = Vec([1.0]); count = 1; while(mean_diffs \u0026gt; eps \u0026amp;\u0026amp; count \u0026lt; N, t = lead*prod(k = 1,deg,x - xs[k]); \\\\ create temporary function t(x) dt = deriv(t,x); ddt = deriv(dt,x); \\\\ compute its first two derivatives for(k = 1, deg, dtk = subst(dt,\u0026#39;x,xs[k]); ddtk = 
subst(ddt,\u0026#39;x,xs[k]); fk = subst(f,\u0026#39;x,xs[k]); dfk = subst(df,\u0026#39;x,xs[k]); ys[k] = xs[k] - 1/(dfk/fk - 0.5*ddtk/dtk); single_diffs[k] = abs(ys[k] - xs[k]); xs[k] = ys[k] ); mean_diffs = vecsum(single_diffs)/deg; mean_diffs_vector = concat(mean_diffs_vector,[mean_diffs]); count = count + 1; ); return([count,mean_diffs_vector,xs]); } To see how fast it converges, we\u0026#39;ll try it out on a fifth degree polynomial:\n\\p2000 s = aberth(x^5 + x^2 - 7,0.1^1500); printoutput(s) which produces:\nnumber of iterations: 12 1.188550919 0.9079189478 0.09240861551 0.001667886271 7.282133417 e-9 6.098846060 e-25 3.582735741 e-73 7.263000773 e-218 6.050900797 e-652 3.498902657 e-1954 1.384316604 + 0.e-3032I 0.5300508632 + 1.457706660I -1.222209165 + 0.7797478348I -1.222209165 - 0.7797478348I 0.5300508632 - 1.457706660I You can see that the accuracy basically triples each time.\nThe Weierstrass-Durand-Kerner method This is a method for finding all the roots of a polynomial simultaneously, by applying a sort of Newton-Raphson method. It gets its name from everybody associated with it: Weierstrass published a version of the algorithm in 1891, and it was then rediscovered (independently) by Durand in 1960 and Kerner in 1966, as you can see at its [Wikipedia page](https://en.wikipedia.org/wiki/Durand%E2%80%93Kerner_method).\nIt\u0026#39;s easier to describe by an example:\n\\[ x^4 - 26x^2 - 75x -56 = 0 \\]\nwhich has roots $(5\\pm\\sqrt{57})/2, (-5\\pm\\sqrt{3} i)/2$. 
Numerically, the roots are\n\\[ 6.27491721763537,\\; -1.27491721763537,\\; -2.5 + 0.866025403784439i,\\; -2.5 - 0.866025403784439i \\]\nFor a fourth degree equation such as above, if one approximation to the roots is given by $x_0, x_1,x_2,x_3$, then the next approximation is given by\n$$\\begin{aligned} y_0 \u0026amp;= x_0 - \\frac{f(x_0)}{(x_0-x_1)(x_0-x_2)(x_0-x_3)}\\\\ y_1 \u0026amp;= x_1 - \\frac{f(x_1)}{(x_1-x_2)(x_1-x_3)(x_1-x_0)}\\\\ y_2 \u0026amp;= x_2 - \\frac{f(x_2)}{(x_2-x_3)(x_2-x_0)(x_2-x_1)}\\\\ y_3 \u0026amp;= x_3 - \\frac{f(x_3)}{(x_3-x_0)(x_3-x_1)(x_3-x_2)} \\end{aligned}$$\nSuppose however we create the temporary polynomial (for this to work, we make the assumption that $f$ is a monic polynomial; that is, that its leading coefficient is 1):\n\\[ t(x) = (x-x_0)(x-x_1)(x-x_2)(x-x_3). \\]\nSince\n$$\\begin{aligned} t\u0026#39;(x) \u0026amp;= (x-x_1)(x-x_2)(x-x_3) + (x-x_0)(x-x_2)(x-x_3) \\\\ \u0026amp; \\qquad + (x-x_0)(x-x_1)(x-x_3) + (x-x_0)(x-x_1)(x-x_2) \\end{aligned}$$\nwe can write the above expressions for $y_k$ more simply as\n\\[ y_k = x_k - \\frac{f(x_k)}{t\u0026#39;(x_k)}. \\]\nwhich shows the connection with the Newton-Raphson method. 
Here\u0026#39;s how it might be done in [PARI/GP](https://pari.math.u-bordeaux.fr) - more specifically, in the scripting language gp:\ndurandkerner(f,eps,N = 200) = { local(pd,xs,ys,dsa,dss,count,dt); pd = poldegree(f); xs = vector(pd,k,(0.6 + 0.8*I)^k); ys = vector(pd); ds = vector(pd); dsa = 1.0; dss = Vec([1.0]); count = 1; while(dsa \u0026gt; eps \u0026amp;\u0026amp; count \u0026lt; N, dt = deriv(prod(k = 1,pd,x - xs[k]),x); for(k = 1,pd, ys[k] = xs[k] - subst(f,\u0026#39;x,xs[k])/subst(dt,\u0026#39;x,xs[k]); ds[k] = abs(ys[k] - xs[k]); xs[k] = ys[k] ); dsa = vecsum(ds)/pd; dss = concat(dss,[dsa]); count = count + 1; ); return([count,dss,xs]); } We\u0026#39;ve chosen as our beginning values $(0.6 + 0.8i)^k$ - one of the requirements of the method is that all computations must be in complex numbers.\nHere goes:\n\\p20 f = x^4 - 26*x^2 - 75*x - 56 s = durandkerner(f,0.1^20); print(\u0026#34;number of iterations: \u0026#34;,s[1]); print() vp = vecextract(s[2],\u0026#34;-10..-1\u0026#34;); \\\\ extracts the last ten differences {for(k = 1,10, printf(\u0026#34;%.10g\\n\u0026#34;,vp[k]))} \\\\ and prints them print() { for(k = 1,poldegree(f), \\\\ prints out the solutions printf(\u0026#34;%.15g\\n\u0026#34;,s[3][k]) ) } which produces the output:\nnumber of iterations: 17 1.123079176 0.6846519849 0.3550410771 0.1757081770 0.04978125947 0.003740061706 2.016207001 e-5 5.729914134 e-10 4.478174636 e-19 2.615741385 e-37 6.27491721763537-1.45113252725757 e-89I -2.50000000000000+0.866025403784439I -1.27491721763537+3.98066040407694 e-74I -2.50000000000000-0.866025403784439I As you see, the final average absolute difference between successive iterates is about $2.6\\times 10^{-37}$, obtained in 17 iterations. And the values are certainly correct. 
You\u0026#39;ll also notice that the imaginary parts of the first and third solutions are vanishingly small; in effect, zero.\nFor a bit of fun, let\u0026#39;s try with a high precision:\n\\p1000 f = x^4 - 26*x^2 - 75*x - 56 s = durandkerner(f,0.1^1000); the result of which (using the same printing script as above) is\nnumber of iterations: 22 0.003740061706 2.016207001 e-5 5.729914134 e-10 4.478174636 e-19 2.615741385 e-37 8.409722008 e-74 8.032655807 e-147 6.583139811 e-293 3.794211385 e-585 9.907223519 e-1170 6.27491721763537+2.96119399627067 e-2361I -2.50000000000000+0.866025403784439I -1.27491721763537+1.01326544935096 e-2339I -2.50000000000000-0.866025403784439I It\u0026#39;s only taken 5 more steps to get over 1000 place precision! This is because, once it hits its stride, so to speak, this method converges quadratically, which you can see in that the accuracy of the iteration basically doubles each step.\nWe can also experiment with a large polynomial:\ndeg = 50; f = x^deg + sum(k=0,deg-1,(random(20)-10)*x^k); s = durandkerner(f,0.1^1000); print(\u0026#34;number of iterations: \u0026#34;,s[1]); print() vp = vecextract(s[2],\u0026#34;-10..-1\u0026#34;); \\\\ extracts the last ten differences {for(k = 1,10, printf(\u0026#34;%.10g\\n\u0026#34;,vp[k]))} which produces the output\nnumber of iterations: 56 4.520878142 e-6 7.817132307 e-9 9.072720843 e-14 1.471330591 e-23 3.873798136 e-43 2.685287774 e-82 1.290323278 e-160 2.979297878 e-317 1.588344551 e-630 4.514465082 e-1257 (If you try this you\u0026#39;ll get different numbers, seeing as you\u0026#39;ll be starting with a different random polynomial.) 
We won\u0026#39;t print out all the solutions, as there are too many, but we can print out the absolute values $\\|f(x_k)\\|$ of the function at each solution:\n{ for(k=1,deg, printf(\u0026#34;%.10g\\n\u0026#34;,abs(subst(f,\u0026#39;x,s[3][k]))) ) } which provides a list of 50 values beginning (in my case) with:\n2.395686462 e-2332 9.801085659 e-2386 2.362011441 e-2291 4.227497886 e-2284 4.335737935 e-2330 These are vanishingly small; in effect zero.\nPlaying with the algorithm To try this algorithm, open up the [PARI/GP](https://pari.math.u-bordeaux.fr) site, and go to \u0026#34;Try GP in your browser\u0026#34;, or just go [here](https://pari.math.u-bordeaux.fr/gpwasm.html). Then you can cut and paste the Durand-Kerner function into a cell and play with it.\nFor ease of playing, here are a couple of \u0026#34;helper\u0026#34; functions for printing the output. The function printcomplex simply adds more space around the + or - in a complex number, and the function printoutput does just that.\nprintcomplex(z) = { local(rz,iz); rz = real(z); iz = imag(z); if(imag(z) \u0026lt; 0, printf(\u0026#34;%.10g - %.10gI\\n\u0026#34;,rz,-iz), printf(\u0026#34;%.10g + %.10gI\\n\u0026#34;,rz,iz)); } printoutput(s) = { print(\u0026#34;number of iterations: \u0026#34;,s[1]); print(); if(length(s[2]) \u0026lt; 11, foreach(s[2],z,printf(\u0026#34;%.10g\\n\u0026#34;,z)), foreach(vecextract(s[2],\u0026#34;-10..-1\u0026#34;),z,printf(\u0026#34;%.10g\\n\u0026#34;,z))); print(); foreach(s[3],z,printcomplex(z)); } So, open up GP \u0026#34;in your browser\u0026#34;, and in a cell enter the durandkerner, printcomplex, and printoutput functions. You can enlarge the cell from the bottom right. 
Press the \u0026#34;Evaluate with PARI\u0026#34; button, or simply press Shift+Enter.\nOpen up a new cell, and enter something like\n\\p50 f = x^4 + x^3 - 19 sol = durandkerner(f,0.1^50); printoutput(sol) Enjoy!\nParabolas, numerically Recap, and background Two posts ago we showed how, given four points in the plane in general position, but with a few restrictions, it was possible to find two parabolas through those points. We used computer algebra.\nThe steps were:\nCreate four equations \\[ (Ax_i+By_i)^2+Cx_i+Dy_i+E=0 \\] for each of the four $(x_i,y_i)$ coordinates. Solve the first three equations for $C$, $D$ and $E$: the results will be expressions in $A$ and $B$. Substitute the values from the previous step into the last equation and solve for $A$ and $B$ - there will in general be two solutions. Substitute those $A$ and $B$ values into the expressions for $C$, $D$ and $E$ to obtain the parabola equations. An example As an example, with the four points:\n\\[ (2,3),\\quad (2,1),\\quad (-4,1),\\quad (1,0). \\]\nwe have the equations (which we note are linear in $C$, $D$ and $E$):\n$$\\begin{gathered} (2A+3B)^2+2C+3D+E=0\\\\ (2A+B)^2+2C+D+E=0\\\\ (-4A+B)^2-4C+D+E=0\\\\ (A)^2+C+E=0\\end{gathered}$$\nfor which we find from the first three that\n$$\\begin{aligned} C\u0026amp;=2A^2-2AB\\\\ D\u0026amp;=-4AB-4B^2\\\\ E\u0026amp;=-8A^2+4AB+3B^2 \\end{aligned}$$\nor alternatively, that\n$$\\begin{bmatrix}C\\\\ D\\\\ E\\end{bmatrix}=\\begin{bmatrix*}[r]2\u0026amp;-1\u0026amp;0\\\\ 0\u0026amp;-2\u0026amp;-4\\\\ -8\u0026amp;2\u0026amp;3\\end{bmatrix*}\\begin{bmatrix}A^2\\\\ 2AB\\\\ B^2\\end{bmatrix}$$\nSubstituting these expressions into the last equation produces (after a bit of algebraic simplification):\n\\[ -5A^2+2AB+3B^2=0 \\]\nwhich has the two solutions\n\\[ A = s,\\; B = -\\frac{5}{3}s\\qquad\\mathrm{and}\\qquad A=t,\\;B=t \\]\nfor arbitrary $s$ and $t$. 
Substituting these back into the expressions for $C$, $D$ and $E$:\n$$\\begin{aligned} C\u0026amp;=\\frac{16}{3}s^2\u0026amp;D\u0026amp;=-\\frac{40}{9}s^2\u0026amp;E\u0026amp;=-\\frac{19}{3}s^2\\\\\\[2mm] C\u0026amp;=0\u0026amp;D\u0026amp;=-8t^2\u0026amp;E\u0026amp;=-t^2 \\end{aligned}$$\nThese expressions can now be used for the parabola equations. In each equation, the squared parameter $s^2$ or $t^2$ can be factored out.\nNumerical approach Here we show how to do the same thing numerically.\nRather than explain in full algebraic detail, we\u0026#39;ll simply work through an example, from which the general method will be obvious.\nWe start with the four points $(x_i,y_i)$:\n\\[ (2,3),\\quad (2,1),\\quad (-4,1),\\quad (1,0). \\]\nWe first create a $3\\times 3$ matrix $M$ for which the first two columns are the first three $x$ and $y$ values, and the last column is all ones:\n\\[ M=\\begin{bmatrix*}[r]2\u0026amp;3\u0026amp;1\\\\ 2\u0026amp;1\u0026amp;1\\\\ -4\u0026amp;1\u0026amp;1\\end{bmatrix*} \\]\nNext is a $4\\times 3$ matrix $N$ whose rows consist of the values $x^2,\\;2xy,\\; y^2$ for each $(x,y)$ coordinates:\n\\[ N = \\begin{bmatrix*}[r]4\u0026amp;12\u0026amp;9\\\\ 4\u0026amp;4\u0026amp;1\\\\ 16\u0026amp;-8\u0026amp;1\\\\ 1\u0026amp;0\u0026amp;0\\end{bmatrix*} \\]\nNext multiply the negative of the inverse of $M$ by the top three rows of $N$ (the negative sign arises from moving the quadratic terms to the other side of the equations):\n$$ P=-\\begin{bmatrix*}[r]2\u0026amp;3\u0026amp;1\\\\ 2\u0026amp;1\u0026amp;1\\\\ -4\u0026amp;1\u0026amp;1\\end{bmatrix*}^{-1}\\begin{bmatrix*}[r]4\u0026amp;12\u0026amp;9\\\\ 4\u0026amp;4\u0026amp;1\\\\16\u0026amp;-8\u0026amp;1\\end{bmatrix*}=\\begin{bmatrix*}[r]2\u0026amp;-2\u0026amp;0\\\\ 0\u0026amp;-4\u0026amp;-4\\\\ -8\u0026amp;4\u0026amp;3\\end{bmatrix*} $$\nWhat this matrix contains, of course, are the coefficients of $A^2$, $AB$ and $B^2$ in the initial expressions for $C$, $D$ and 
$E$.\nThen compute\n\\[ Q=\\begin{bmatrix}1\u0026amp;0\u0026amp;1\\end{bmatrix}\\begin{bmatrix*}[r]2\u0026amp;-2\u0026amp;0\\\\ 
0\u0026amp;-4\u0026amp;-4\\\\ -8\u0026amp;4\u0026amp;3\\end{bmatrix*}+\\begin{bmatrix}1\u0026amp;0\u0026amp;0\\end{bmatrix}=\\begin{bmatrix*}[r]-5\u0026amp;2\u0026amp;3\\end{bmatrix*} \\]\nHere, the first matrix on the right consists of the last (so far unused) coordinate values and a 1; the central matrix is $P$ (just computed) and the last matrix is the bottom row of $N$. Thus:\n\\[ Q=\\begin{bmatrix}x_4\u0026amp;y_4\u0026amp;1\\end{bmatrix}P+\\begin{bmatrix}n_{41}\u0026amp; n_{42}\u0026amp; n_{43}\\end{bmatrix} \\]\nThe values of $Q$ are the coefficients $a,b,c$ for the final equation $aA^2+2bAB+cB^2=0$ for $A$ and $B$.\nNow, it is easy to show that the solutions of the equation\n\\[ aA^2+2bAB+cB^2=0 \\]\nare\n\\[ A=s,\\quad B = \\frac{-b+\\sqrt{b^2-ac}}{c}s\\qquad\\mathrm{and}\\qquad A=t,\\quad B=\\frac{-b-\\sqrt{b^2-ac}}{c}t \\]\nfor arbitrary values of $s$ and $t$. Since in our case we have $a=-5$, $b=1$ and $c=3$:\n\\[ A=s,\\quad B = \\frac{-1+\\sqrt{1+15}}{3}s=s\\qquad\\mathrm{and}\\qquad A=t,\\quad B=\\frac{-1-\\sqrt{1+15}}{3}t=-\\frac{5}{3}t \\]\nWe can set $s=t=1$, or we can set $s=1$ and $t=3$ (which eliminates fractions) and substitute those values for $A$ and $B$ into the equations for $C$, $D$ and $E$. 
So putting $s=1$:\n$$\\begin{bmatrix}C\\\\D\\\\ E\\end{bmatrix}=\\begin{bmatrix*}[r]2\u0026amp;-2\u0026amp;0\\\\ 0\u0026amp;-4\u0026amp;-4\\\\ -8\u0026amp;4\u0026amp;3\\end{bmatrix*}\\begin{bmatrix*}[r]1\\\\ 1\\\\ 1\\end{bmatrix*}=\\begin{bmatrix*}[r]0\\\\ -8\\\\ -1\\end{bmatrix*}$$\nIf we put $t=1$, in the second solution, then $A = 1$ and $B = -5/3$, from which $C$, $D$ and $E$ can be easily obtained.\nFinishing off From above, we have the two solutions $$\\begin{aligned} A,B,C,D,E\u0026amp;=1,1,0,-8,-1\\\\ \u0026amp; = 1,-\\dfrac{5}{3},\\dfrac{16}{3},-\\dfrac{40}{9},-\\dfrac{19}{3} \\end{aligned}$$\nThis produces the two parabolas:\n$$\\begin{aligned} \u0026amp;(x+y)^2-8y-1=0\\\\ \u0026amp;(3x-5y)^2+48x-40y-57=0 \\end{aligned}$$\nwhere in the last equation every coefficient has been multiplied by 9 to clear all the fractions.\nUsing Python With NumPy:\n[] import numpy as np [] import numpy.linalg as la [] xs = np.array([2,2,-4,1]) [] ys = np.array([3,1,1,0]) [] M = np.matrix([xs[:-1],ys[:-1],[1,1,1]]).T [] N = np.matrix([xs**2,2*xs*ys,ys**2]).T [] P = -la.inv(M)*N[:-1,:] [] Q = np.matrix([[xs[-1],ys[-1],1]])*P+N[-1,:] [] [a,b,c] = Q.tolist()[0] [] b/=2 [] b0 = (-b+np.sqrt(b*b-a*c))/c [] b1 = (-b-np.sqrt(b*b-a*c))/c [] AB = np.matrix([[1,b0,b0*b0],[1,b1,b1*b1]]).T [] R = P*AB [] coeffs = np.vstack([AB[:-1,:],R]).T [] display(coeffs) matrix([[ 1.00000000e+00, 1.00000000e+00, 2.77555756e-17, -8.00000000e+00, -1.00000000e+00], [ 1.00000000e+00, -1.66666667e+00, 5.33333333e+00, -4.44444444e+00, -6.33333333e+00]]) The parabolas can then be plotted using the parameterization given in the previous post.\nParameterization of the parabola It is (well?) 
known that if $x = at^2+bt+c$ and $y=pt^2+qt+r$, then\n\\[ (Ax+By)^2+Cx+Dy+E=0 \\]\nwhere\n$$\\begin{aligned} A\u0026amp;=p\\\\ B\u0026amp;=-a\\\\ C\u0026amp;=-qv_3-2pv_2\\\\ D\u0026amp;=bv_3+2av_2\\\\ E\u0026amp;=v_2^2-v_1v_3 \\end{aligned}$$\nwith $\\langle v_1, v_2, v_3\\rangle =\\langle a,b,c\\rangle \\times \\langle p,q,r\\rangle$; that is, the $v_i$ values are the elements of the cross product of the vectors of the coefficients. For example, $x=t^2$, $y=t$ gives $\\langle v_1,v_2,v_3\\rangle=\\langle 0,0,1\\rangle$ and hence $y^2-x=0$.\nIn other words, two quadratic functions parameterize a parabola.\nFinding the equations which parameterise a given parabola Suppose we have a general parabola given by $(Ax+By)^2+Cx+Dy+E=0$. Clearly we need to choose coefficients of $t^2$ in $x$ and $y$ in such a way that they cancel out in the first bracket.\nWe can start with, say\n$$\\begin{aligned} x\u0026amp;=Bt^2+bt+c\\\\ y\u0026amp;=-(At^2+qt+r) \\end{aligned}$$\nThen\n$$\\begin{aligned} (Ax+By)^2\u0026amp;=((ABt^2+Abt+Ac)-(BAt^2+Bqt+Br))^2\\\\ \u0026amp;=((Ab-Bq)t+(Ac-Br))^2\\\\ \u0026amp;=(Ab-Bq)^2t^2+2(Ab-Bq)(Ac-Br)t+(Ac-Br)^2 \\end{aligned}$$\nAdding this to the rest of the expression ($Cx+Dy+E$) and collecting \u0026#34;like terms\u0026#34;, we end up with:\n$$\\begin{aligned} t^2:\u0026amp;\\quad (Ab-Bq)^2+CB-DA=0\\\\ t:\u0026amp;\\quad 2(Ab-Bq)(Ac-Br)+Cb-Dq = 0\\\\ 1:\u0026amp;\\quad (Ac-Br)^2+Cc-Dr+E=0 \\end{aligned}$$\nFrom the first equation, if $b=s$, say, then\n\\[ q = \\frac{As-\\sqrt{DA-CB}}{B}. \\]\nBut we don\u0026#39;t want square roots if we can avoid them. 
One thing we can do is to introduce an extra multiplicative variable into the original equations:\n$$\\begin{aligned} x\u0026amp;=k(Bt^2+bt+c)\\\\ y\u0026amp;=-k(At^2+qt+r) \\end{aligned}$$ Then the three equations corresponding to the coefficients $t^2,t,1$ can be written as\n$$\\begin{aligned} \u0026amp;(Ab-Bq)^2k^2+(BC-AD)k=0\\\\ \u0026amp;2(Ab-Bq)(Ac-Br)k^2 + (Cb-Dq)k=0\\\\ \u0026amp;(Ac-Br)^2k^2+(Cc-Dr)k+E=0 \\end{aligned}$$\nThe first equation can be solved to produce two solutions:\n\\[ b=s,\\quad q = \\frac{Aks+\\sqrt{(AD-BC)k}}{Bk} \\]\nand\n\\[ b=t,\\quad q = \\frac{Akt-\\sqrt{(AD-BC)k}}{Bk} \\]\nClearly to eliminate the square root we can set $k=1/(AD-BC)$. This produces the solutions\n\\[ b=s,\\quad q = \\frac{As+(AD-BC)}{B} \\]\nand\n\\[ b=t,\\quad q = \\frac{At-(AD-BC)}{B} \\]\nSince $s$ and $t$ are arbitrary values, and we only want one solution to our equations, we can choose $s=-D$ and $t=D$. The two solutions then simplify to\n\\[ b=-D,\\quad q = -C \\qquad\\mathrm{and}\\qquad b=D,\\quad q = C. \\]\nIn fact, we can do the entire solution using a computer algebra system (in this case [SageMath](https://www.sagemath.org)). 
We start by creating the variables we need, finishing with a polynomial in $t$:\n\u0026lt;Sage\u0026gt;: var(\u0026#39;A,B,C,D,E,x,y,t,b,c,q,r,k\u0026#39;) \u0026lt;Sage\u0026gt;: x = k*(B*t^2 + b*t +c) \u0026lt;Sage\u0026gt;: y = -k*(A*t^2 + q*t + r) \u0026lt;Sage\u0026gt;: p = (A*x + B*y)^2 + C*x + D*y + E \u0026lt;Sage\u0026gt;: pc = p.expand().poly(t) The equations are the coefficients of this polynomial, which we require to be equal to zero:\n\u0026lt;Sage\u0026gt;: eqns = [pc.coefficient(t,i).subs({k:1/(A*D-B*C)}).factor().numerator() for i in range(3)] \u0026lt;Sage\u0026gt;: sols = solve(eqns,[b,c,q,r],solution_dict=True) Now we substitute $-D$ for the free parameter that Sage introduces in the solution, and simplify:\n\u0026lt;Sage\u0026gt;: s = sols[0][b].free_variables()[0] \u0026lt;Sage\u0026gt;: [sols[0][z].subs({s:-D}).full_simplify() for z in [b,c,q,r]] This produces the output\n\\[ [-D, BE, -C, AE] \\]\nwhich means that the general parabola $(Ax+By)^2+Cx+Dy+E=0$ can be parameterized by\n\\[ x = \\frac{Bt^2 - Dt +BE}{AD-BC},\\qquad y = \\frac{-At^2 + Ct -AE}{AD-BC} \\]\nFour point parabolas Introduction It is (or should be) well known that a parabola has the cartesian form\n\\[ (Ax+By)^2+Cx+Dy+E = 0. \\]\nThis looks as though there are five values needed, but we can divide through in such a way as to make any of the coefficients we like equal to 1:\n\\[ (Px+Qy)^2+Rx+Sy+1 = 0, \\]\nand so we see that only four values are needed to define a parabola. A parabola thus has one more degree of freedom than a circle, for which only three values are needed:\n\\[ (x-A)^2+(y-B)^2 = C^2. \\]\nFor any four points \u0026#34;in general position\u0026#34; in the plane, there will be two parabolas that pass through those points. 
The purpose here is to explore how that can be done.\nThere are some explanations at [mathpages](https://www.mathpages.com/home/kmath037/kmath037.htm) and also [here](https://www.mathpages.com/home/kmath546/kmath546.htm) but the confusion of algebra and the old-fashioned typesetting makes these articles hard to read.\nIt\u0026#39;s easiest to do it by an example first, then discuss the general method afterwards.\nOne way - a bit more complicated than necessary Suppose we start with four points\n\\[ (0,-3),\\quad (-3,3),\\quad (1,-3),\\quad (-3,-1) \\]\nBy substituting each of these values into the first equation above, we obtain four equations:\n\\begin{gather} (-3B)^2 - 3D + E = 0\\\\ (-3A+3B)^2-3C+3D+E=0\\\\ (A-3B)^2+C-3D+E=0\\\\ (-3A-B)^2-3C-D+E=0 \\end{gather} The plan of attack is this:\n1. Solve the first three equations for $C$, $D$ and $E$. The results will be expressions in $A$ and $B$. 2. Substitute the results from step 1 into the last equation, and solve for $A$ and $B$. There will in general be two solutions. 3. Substitute the newly found values of $A$ and $B$ into the expressions for $C$, $D$ and $E$ to obtain the parabola equations. And of course we can do all of this in Sage. Here\u0026#39;s step 1:\nvar(\u0026#39;A,B,C,D,E,x,y\u0026#39;) xs,ys = [0,-3,1,-3],[-3,3,-3,-1] eqns = [(A*x+B*y)^2+C*x+D*y+E for x,y in zip(xs,ys)] sols1 = solve(eqns[:-1],[C,D,E],solution_dict=True) sols1 This produces:\n\\[ \\left[\\left\\{C : -A^{2} + 6 \\, A B, D : -2 \\, A^{2} + 6 \\, A B, E : -6 \\, A^{2} + 18 \\, A B - 9 \\, B^{2}\\right\\}\\right] \\]\nNow for step 2:\nsols2 = solve(eqns[-1].subs(sols1[0]),[A,B], solution_dict=True) This produces:\n\\[ \\left[\\left\\{A : r_{1}, B : -r_{1}\\right\\}, \\left\\{A : r_{2}, B : r_{2}\\right\\}\\right] \\]\nNote that the parameters in which $A$ and $B$ are given may well depend on how many such computations you\u0026#39;ve done. Sage will simply keep increasing the parameter index each time parameters are needed. 
(When I performed this particular calculation, the indices were in fact 925 and 926.) You can however re-set the indices by prefixing the above command as follows:\nmaxima_calculus.reset() sols2 = solve(eqns[-1].subs(sols1[0]),[A,B], solution_dict=True) Before we substitute back into the expressions for $C$, $D$, $E$, we turn the parameters into variables:\np = sols2[0][A].free_variables()[0] q = sols2[1][A].free_variables()[0] Now for step 3 (note that sols1 is a list containing a single dictionary, hence the index):\ns0 = sols2[0] s1 = sols2[1] para0 = (s0[A]*x+s0[B]*y)^2+sols1[0][C].subs(s0)*x+sols1[0][D].subs(s0)*y+sols1[0][E].subs(s0) parab0 = (para0.factor()/p^2).numerator().expand() para1 = (s1[A]*x+s1[B]*y)^2+sols1[0][C].subs(s1)*x+sols1[0][D].subs(s1)*y+sols1[0][E].subs(s1) parab1 = (para1.factor()/q^2).numerator().expand() At this point, the equations of the two parabolas are\n\\[ x^{2} - 2 x y + y^{2} - 7 x - 8 y - 33 = 0,\\qquad x^{2} + 2 x y + y^{2} + 5 x + 4 y + 3 = 0 \\]\nor, to conform with the general equation from above:\n\\[ (x-y)^2-7x-8y-33=0,\\quad (x+y)^2+5x+4y+3=0. \\]\nThey can be displayed, along with the initial points, like this:\nb = 3 xmin,xmax = min(xs)-b,max(xs)+b ymin,ymax = min(ys)-b,max(ys)+b p1 = implicit_plot(parab0,(x,xmin,xmax),(y,ymin,ymax)) p2 = implicit_plot(parab1,(x,xmin,xmax),(y,ymin,ymax),color=\u0026#39;green\u0026#39;) p3 = list_plot([(s,t) for s,t in zip(xs,ys)],plotjoined=False,marker=\u0026#34;o\u0026#34;,size=60,color=\u0026#39;red\u0026#39;,faceted=True) p1+p2+p3 A slightly simpler way Shift the points so that one of them is at the origin. For example, with the points above, we could shift by $(0,3)$ to obtain\n\\[ (0,0),\\quad (-3,6),\\quad (1,0),\\quad (-3,2) \\]\nSince the parabolas must pass through the origin, we must have $E=0$, so that we are looking for expressions of the form\n\\[ (Ax+By)^2+Cx+Dy=0 \\]\nand we only need to use the three points away from the origin. 
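As a quick sanity check of the claim that $E=0$ after the shift, here is a small sketch in plain Python (not Sage). The function `shift_parabola` is a made-up helper for this illustration: it rewrites $(Ax+By)^2+Cx+Dy+E$ in coordinates measured from a point $(a,b)$, using the expansion of $(A(x+a)+B(y+b))^2$. The coefficient tuples are the two parabolas found by the first method.

```python
def shift_parabola(coeffs, a, b):
    """Coefficients of (Ax+By)^2 + Cx + Dy + E after moving the origin to (a, b).

    Hypothetical helper: substitutes x -> x + a, y -> y + b and collects terms.
    A and B are unchanged; the new constant term is the value at (a, b).
    """
    A, B, C, D, E = coeffs
    m = A * a + B * b
    return (A, B, C + 2 * A * m, D + 2 * B * m, m * m + C * a + D * b + E)


# The four points of the example, and the shift that puts the first at the origin
points = [(0, -3), (-3, 3), (1, -3), (-3, -1)]
a, b = points[0]
shifted = [(x - a, y - b) for x, y in points]

# (A, B, C, D, E) for (x-y)^2-7x-8y-33 and (x+y)^2+5x+4y+3
parabolas = [(1, -1, -7, -8, -33), (1, 1, 5, 4, 3)]
shifted_parabolas = [shift_parabola(p, a, b) for p in parabolas]

print(shifted)
print(shifted_parabolas)
```

In both shifted parabolas the constant term comes out as zero, which is why only $C$ and $D$ need to be found in the simpler method.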
Then we work very similarly to above, except that we will only use three points and three equations, and there is no $E$ to solve for.\na,b = xs[0],ys[0] xs,ys = [x-a for x in xs[1:]],[y-b for y in ys[1:]] eqns = [(A*x+B*y)^2+C*x+D*y for x,y in zip(xs,ys)] sols1 = solve(eqns[:-1],[C,D],solution_dict=True)[0] sols1 The outcome of all this is\n\\[ \\left\\{C : -A^{2}, D : -2 \\, A^{2} + 6 \\, A B - 6 \\, B^{2}\\right\\} \\]\nNow we can substitute and solve for $A$ and $B$ as above:\nmaxima_calculus.reset() sols2 = solve(eqns[-1].subs(sols1),[A,B], solution_dict=True) Continuing as before:\np = sols2[0][A].free_variables()[0] q = sols2[1][A].free_variables()[0] s0 = sols2[0] s1 = sols2[1] para0 = (s0[A]*x+s0[B]*y)^2+sols1[C].subs(s0)*x+sols1[D].subs(s0)*y parab0 = (para0.factor()/p^2).numerator().expand() para1 = (s1[A]*x+s1[B]*y)^2+sols1[C].subs(s1)*x+sols1[D].subs(s1)*y parab1 = (para1.factor()/q^2).numerator().expand() The two parabolas here are\n\\[ x^{2} - 2 x y + y^{2} - x - 14y = 0,\\quad x^{2} + 2 x y + y^{2} - x - 2 y = 0 \\]\nTo get back to the parabolas we want, simply shift back:\nparab0.subs({x:x-a,y:y-b}).expand() parab1.subs({x:x-a,y:y-b}).expand() Another example \\[ (x,y) = (2,3),\\quad (2,1),\\quad (-4,1),\\quad (1,0). 
\\]\nRepeat the above commands with the points shifted to include the origin:\nxs0, ys0 = [2,2,-4,1], [3,1,1,0] a,b = xs0[0],ys0[0] xs,ys = [x-a for x in xs0[1:]],[y-b for y in ys0[1:]] eqns = [(A*x+B*y)^2+C*x+D*y for x,y in zip(xs,ys)] sols1 = solve(eqns[:-1],[C,D],solution_dict=True)[0] Now substitute into the last equation and solve for $A$ and $B$:\nsols2 = solve(eqns[-1].subs(sols1),[A,B], solution_dict=True) Finally, substitute both solutions back to obtain $C$ and $D$, and factor out the parameter:\np = sols2[0][A].free_variables()[0] q = sols2[1][A].free_variables()[0] s0 = sols2[0] s1 = sols2[1] para0 = (s0[A]*x+s0[B]*y)^2+sols1[C].subs(s0)*x+sols1[D].subs(s0)*y parab0 = (para0.factor()/p^2).numerator().expand() para1 = (s1[A]*x+s1[B]*y)^2+sols1[C].subs(s1)*x+sols1[D].subs(s1)*y parab1 = (para1.factor()/q^2).numerator().expand() We could obtain the same result a bit more easily by letting the variables p and q be in a tuple, similarly for s0 and s1, and for the parabolas. This would cut the commands down to four instead of eight.\nBefore plotting, we need to substitute back in the original $x$ and $y$ values by another shift:\nparab0 = parab0.subs({x:x-a,y:y-b}).expand() parab1 = parab1.subs({x:x-a,y:y-b}).expand() Now plot them; we need to refer to the original xs0, ys0 values:\nb = 3 xmin,xmax = min(xs0)-b,max(xs0)+b ymin,ymax = min(ys0)-b,max(ys0)+b p1 = implicit_plot(parab0,(x,xmin,xmax),(y,ymin,ymax)) p2 = implicit_plot(parab1,(x,xmin,xmax),(y,ymin,ymax),color=\u0026#39;green\u0026#39;) p3 = list_plot([(s,t) for s,t in zip(xs0,ys0)],plotjoined=False,marker=\u0026#34;o\u0026#34;,size=60,color=\u0026#39;red\u0026#39;,faceted=True) p1+p2+p3 General expressions Although the method is simple to describe, the algebra becomes messy when written in full generality. 
For example, suppose we use the second method, with three points $(x_1,y_1)$, $(x_2,y_2)$, $(x_3,y_3)$ none of which is at the origin.\nThe three equations are\n\\begin{gather} (Ax_1+By_1)^2+Cx_1+Dy_1=0\\\\ (Ax_2+By_2)^2+Cx_2+Dy_2=0\\\\ (Ax_3+By_3)^2+Cx_3+Dy_3=0 \\end{gather} Solving the first two for $C$ and $D$ produces:\n\\[ C = \\frac{(Ax_2+By_2)^2y_1-(Ax_1+By_1)^2y_2}{x_1y_2-x_2y_1},\\quad D = \\frac{(Ax_1+By_1)^2x_2-(Ax_2+By_2)^2x_1}{x_1y_2-x_2y_1} \\]\nIt will simplify matters to introduce the notation\n\\[ v_{ij}=x_iy_j-x_jy_i. \\]\nThe discussion at [mathpages](https://www.mathpages.com/home/kmath546/kmath546.htm) does much the same thing, but treats the $v$ values as the elements of the cross product of the vectors $[x_1,x_2,x_3]$ and $[y_1,y_2,y_3]$.\nNow, substituting into the last equation produces an equation of the form\n\\[ aA^2+2bAB+cB^2=0 \\]\nwhere\n\\begin{aligned} a \u0026amp; = -v_{23}x_1^2+v_{13}x_2^2-v_{12}x_3^2\\\\ b \u0026amp; = -v_{23}x_1y_1+v_{13}x_2y_2-v_{12}x_3y_3\\\\ c \u0026amp; = -v_{23}y_1^2+v_{13}y_2^2-v_{12}y_3^2 \\end{aligned} The solutions are then\n\\begin{aligned} A\u0026amp;=r, \u0026amp; B \u0026amp;= \\frac{-b+\\sqrt{b^2-ac}}{c}\\,r\\\\ A\u0026amp;=s, \u0026amp; B \u0026amp;= \\frac{-b-\\sqrt{b^2-ac}}{c}\\,s \\end{aligned} The values $a$, $b$ and $c$ can all be expressed as the negative determinants:\n\\[ a = -\\begin{vmatrix}x_1^2\u0026amp;x_2^2\u0026amp;x_3^2\\\\ x_1\u0026amp;x_2\u0026amp;x_3\\\\ y_1\u0026amp;y_2\u0026amp;y_3\\end{vmatrix},\\qquad b = -\\begin{vmatrix}x_1y_1\u0026amp;x_2y_2\u0026amp;x_3y_3\\\\ x_1\u0026amp;x_2\u0026amp;x_3\\\\ y_1\u0026amp;y_2\u0026amp;y_3\\end{vmatrix},\\qquad c = -\\begin{vmatrix}y_1^2\u0026amp;y_2^2\u0026amp;y_3^2\\\\ x_1\u0026amp;x_2\u0026amp;x_3\\\\ y_1\u0026amp;y_2\u0026amp;y_3\\end{vmatrix}. 
\\]\nThe next step would be to substitute these values into the expressions above for $C$ and $D$, but as you see we\u0026#39;re already getting to the reasonable limit of complexity for algebraic expressions. Substituting the first pair of values for $A$ and $B$ into the equation for $C$ produces an utterly hideous expression!\nBicentric heptagons A bicentric heptagon is one for which all vertices lie on a circle, and for which all edges are tangential to another circle. If $R$ and $r$ are the radii of the outer and inner circles respectively, and $d$ is the distance between their centres, there is an expression which relates the three values when a bicentric heptagon can be formed.\nTo start, define\n\\[ a = \\frac{1}{R+d},\\quad b = \\frac{1}{R-d},\\quad c = \\frac{1}{r} \\]\nand then:\n\\[ E_1 = -a^2+b^2+c^2,\\quad E_2 = a^2-b^2+c^2,\\quad E_3 = a^2+b^2-c^2 \\]\nThe expression we want is:\n\\[ E_1E_2E_3+2abE_1E_2 -2bcE_2E_3-2acE_1E_3=0. \\]\nSee the [page at Wolfram Mathworld](https://mathworld.wolfram.com/PonceletsPorism.html) for details.\nHowever, a bicentric heptagon can exist in three forms: a convex polygon, and two stars.\nThe above expression, impressive though it is (even more so when it is rewritten in terms of $R$, $r$ and $d$), doesn\u0026#39;t give any hint as to which values give rise to which form of polygon.\nHowever, suppose we scale the heptagon by setting $R=1$. We can then rewrite the above expression as a polynomial in $r$, whose coefficients are functions of $d$:\n\\begin{multline*} 64d^2r^6-32d^2(d^4-1)r^5-16d^2(d^2-1)^2r^4+8(d^2-1)^3(3d^2+1)r^3\\\\ -4(d^2-1)^4r^2-4(d^2-1)^5r+(d^2-1)^6=0. \\end{multline*} and this can be simplified with the substitutions $u=d^2-1$ and $x=2r$:\n\\[ (u+1)x^6-u(u+1)(u+2)x^5-u^2(u+1)x^4+u^3(3u+4)x^3-u^4x^2-2u^5x+u^6=0. 
\\]\nSince $R=1$, it follows that $d$ is between 0 and 1 (and so $u$ is between $-1$ and 0), and it turns out that in this range the sextic polynomial equation above has four real roots, of which only three can be used. For the other root $d+r\u0026gt;1$, which would indicate the inner circle not fully contained in the outer circle.\nYou can play with this polynomial here:\nThen the different forms of the bicentric heptagon correspond with the different roots; the root with the largest absolute value produces a convex polygon, the root with the smallest absolute value produces the star with [Schläfli symbol](https://en.wikipedia.org/wiki/Schl%C3%A4fli_symbol) $\\{7/3\\}$ (which is the \u0026#34;pointiest\u0026#34; star), and the other root corresponds to the star with symbol $\\{7/2\\}$. Look at the table on the Wikipedia page just linked, and the column for heptagons.\nHere are the heptagons, which because of Poncelet\u0026#39;s Porism, can be dragged around (if the diagram doesn\u0026#39;t update, refresh the page; it should work):\nPoncelet\u0026#39;s porism on non-circular conic sections\u0026#xa0;\u0026#xa0;\u0026#xa0;mathematics\u0026#xa0;geometry\u0026#xa0;jsxgraph Introduction Poncelet\u0026#39;s porism or Poncelet\u0026#39;s closure theorem is one of the most remarkable results in plane geometry. It is most easily described in terms of circles: suppose we have two circles $C$ and $D$, with $D$ lying entirely inside $C$. Pick a point $p_0$ on $C$, and find the tangent from $p_0$ to $D$. Let $p_1$ be the other intersection of the tangent line with $C$. So the line $p_0 - p_1$ is a chord of $C$ which is tangential to $D$. Continue creating $p_2$, $p_3$ and so on. 
The porism claims that: If at some stage these tangents \u0026#34;join up\u0026#34;; that is, if there is a point $p_k$ equal to $p_0$, then the tangents will join up for any initial choice of $p_0$ on $C$.\nThe polygon so created from the vertices $p_0,\\, p_1,\\,\\cdots,p_k$ is called a bicentric polygon: all its vertices lie on one circle $C$, and all its edges are tangential to another circle $D$.\nIf $r$ and $R$ are the radii of $D$ and $C$ respectively, and $d$ is the distance between their centres, much effort has been expended over the past two centuries determining conditions on these three values for an $n$ sided bicentric polygon to exist. Euler established that for triangles:\n\\[ \\frac{1}{R+d}+\\frac{1}{R-d}=\\frac{1}{r} \\]\nor that\n\\[ R^2-2Rr-d^2=0. \\]\nEuler\u0026#39;s amanuensis, Nicholas Fuss (who would marry one of Euler\u0026#39;s granddaughters) determined that for bicentric quadrilaterals:\n\\[ \\frac{1}{(R+d)^2}+\\frac{1}{(R-d)^2}=\\frac{1}{r^2} \\]\nor that\n\\[ (R^2-d^2)^2=2r^2(R^2+d^2). \\]\nLooking at the first expressions, you might hope that $n$ sided polygons might have similarly nice expressions. Unfortunately, the expressions get considerably more complicated as $n$ increases, and the only way to write them succinctly is with a sequence of substitutions.\nThere is a good demonstration and explanation at [Wolfram Mathworld](https://mathworld.wolfram.com/PonceletsPorism.html) which has examples of some further expressions.\nAn example with two circles Here\u0026#39;s an example with a quadrilateral. To use it, move the point $A$ along the $x$ axis. You\u0026#39;ll see that the inner circle changes size according to Fuss\u0026#39; formula. Then you can drag the circled point around the outer circle to demonstrate the porism.\nAn example with non-circular conic sections Poncelet\u0026#39;s porism is in fact a result for conic sections, not just circles. 
However, circles are easy to work with and define - as seen above, just three parameters are needed to define two circles. This means that nobody has tried to develop similar formulas to Euler and Fuss for general conic sections: the complexity is simply too great. In the most general form, five points are needed to fix a conic section. That is: given any five points in general position, there will be a unique conic section passing through all of them. Here\u0026#39;s how this figure works:\nThe green dots define the interior ellipse (two foci and a point on the ellipse). They can be moved any way you like. The red points on the ellipse: $p_0$, $p_1$, $p_2$, $p_3$ and $p_4$ can be slid around the ellipse. The tangents to these points and their intersections define a pentagon, whose vertices define a larger ellipse. When you have a nice shape that you like, use the button \u0026#34;Hide initial pentagon\u0026#34;. All current labels will vanish, and you\u0026#39;ll have one circled point which can be dragged around the outer ellipse to demonstrate the porism. What happens if you allow two of the points $p_i$ to \u0026#34;cross over\u0026#34;?\nA note on the diagrams These were created with the amazing JavaScript library [JSXGraph](https://jsxgraph.uni-bayreuth.de/wp/index.html) which is a very powerful tool for creating interactive diagrams. I am indebted to the many answers I\u0026#39;ve received to questions on its Google group, and in particular to its lead developer, [Professor Dr Alfred Wassermann](https://jsxgraph.uni-bayreuth.de/~alfred/home/) from the [University of Bayreuth](https://www.uni-bayreuth.de/en), who has been unfailingly generous with his time and detail in answering my many queries.\nImage dithering (2): error diffusion\u0026#xa0;\u0026#xa0;\u0026#xa0;julia\u0026#xa0;image_processing A totally different approach to dithering is error diffusion. Here, the image is scanned pixel by pixel. 
Each pixel is thresholded to 1 or 0 depending on whether the pixel value is greater than 0.5 or not, and the error - the difference between the pixel value and its threshold - is diffused across neighbouring pixels.\nThe first method was developed by Floyd and Steinberg, who proposed the following diffusion:\n$$\\frac{1}{16}\\begin{bmatrix} \u0026amp;*\u0026amp;7\\\\ 3\u0026amp;5\u0026amp;1 \\end{bmatrix}$$\nWhat this means is that if the current pixel\u0026#39;s value is $m_{ij}$, it will be thresholded to 1 or 0 depending on whether its value is greater than 0.5 or not. We set\n$$t_{ij}=\\left\\{\\begin{array}{ll}1\u0026amp;\\mbox{if $m_{ij}\u0026gt;0.5$}\\\\ 0\u0026amp;\\mbox{otherwise}\\end{array}\\right.$$\nAlternatively,\n$$t_{ij} = m_{ij}\u0026gt;0.5$$\nassuming that the right hand expression returns 1 for true, 0 for false. Then the error is\n$$e_{ij}=m_{ij}-t_{ij}.$$\nThe surrounding pixels are then updated by fractions of this error:\n$$\\begin{aligned} m_{i,j+1}\u0026amp;=m_{i,j+1}+\\frac{7}{16}e_{ij}\\\\ m_{i+1,j-1}\u0026amp;=m_{i+1,j-1}+\\frac{3}{16}e_{ij}\\\\ m_{i+1,j}\u0026amp;=m_{i+1,j}+\\frac{5}{16}e_{ij}\\\\ m_{i+1,j+1}\u0026amp;=m_{i+1,j+1}+\\frac{1}{16}e_{ij} \\end{aligned}$$\nThere is a Julia package for computing error diffusion called [DitherPunk](https://github.com/JuliaImages/DitherPunk.jl), but in fact the basic logic can be easily managed:\nJulia\u0026gt; r,c = size(img) Julia\u0026gt; out = Float64.(img) Julia\u0026gt; fs = [0 0 7;3 5 1]/16 Julia\u0026gt; for i = 1:r-1 for j = 2:c-1 old = out[i,j] new = round(old) out[i,j] = new err = old - new out[i:i+1,j-1:j+1] += err * fs end end To ensure that the final image is the same size as the original, we can just take the central, changed pixels, and pad them by replication:\nJulia\u0026gt; padding = Pad(:replicate, (0,1),(1,1)) Julia\u0026gt; out = padarray(out[1:r-1, 2:c-1], padding) Julia\u0026gt; out = Gray.(abs.(out)) Here\u0026#39;s the result applied to the bridge image from 
before, again with the original image for comparison:\nIf you compare this dithered image with the halftoned image from the last blog post, you\u0026#39;ll notice some slight cross-hatching in the halftoned image; this is an artefact of the repetition of the B4 Bayer matrix. Such an artefact doesn\u0026#39;t exist in the error-diffused image. To make that comparison easier, here they both are, with the half-tone image on the left, and the image with error diffusion on the right:\nError diffusion has the added advantage of being able to produce an output of more than two levels; this allows the number of colours in an image to be reduced while at the same time reducing the degradation of the image. If we want $k$ output grey values, from 0 to $k-1$, all we do is change the definition of new in the loop given above to\nnew = round(old * (k-1)) / (k-1) Here, for example, is the same image with 4 and with 8 grey levels:\nThe above process for dithering can easily be adjusted for any other dither matrices, of which there are many, for example:\nJarvis-Judice-Ninke:\n$$\\frac{1}{48}\\begin{bmatrix} 0\u0026amp; 0\u0026amp; *\u0026amp; 7\u0026amp; 5\\\\3\u0026amp; 5\u0026amp; 7\u0026amp; 5\u0026amp; 3\\\\1\u0026amp; 3\u0026amp; 5\u0026amp; 3\u0026amp; 1 \\end{bmatrix}$$\nStucki:\n$$\\frac{1}{42}\\begin{bmatrix} 0\u0026amp; 0\u0026amp; *\u0026amp; 8\u0026amp; 4\\\\2\u0026amp; 4\u0026amp; 8\u0026amp; 4\u0026amp; 2\\\\1\u0026amp; 2\u0026amp; 4\u0026amp; 2\u0026amp; 1 \\end{bmatrix}$$\nAtkinson:\n$$\\frac{1}{8}\\begin{bmatrix} 0\u0026amp; *\u0026amp; 1\u0026amp; 1\\\\1\u0026amp; 1\u0026amp; 1\u0026amp; 0\\\\0\u0026amp; 1\u0026amp; 0\u0026amp; 0 \\end{bmatrix}$$\nBurkes:\n$$\\frac{1}{32}\\begin{bmatrix} 0\u0026amp; 0\u0026amp; *\u0026amp; 8\u0026amp; 4\\\\2\u0026amp; 4\u0026amp; 8\u0026amp; 4\u0026amp; 2 \\end{bmatrix}$$\nSierra:\n$$\\frac{1}{32}\\begin{bmatrix} 0\u0026amp; 0\u0026amp; *\u0026amp; 5\u0026amp; 3\\\\2\u0026amp; 4\u0026amp; 5\u0026amp; 4\u0026amp; 2\\\\0\u0026amp; 
2\u0026amp; 3\u0026amp; 2\u0026amp; 0 \\end{bmatrix}$$\nTwo row Sierra:\n$$\\frac{1}{16}\\begin{bmatrix} 0\u0026amp; 0\u0026amp; *\u0026amp; 4\u0026amp; 3\\\\1\u0026amp; 2\u0026amp; 3\u0026amp; 2\u0026amp; 1 \\end{bmatrix}$$\nSierra Lite:\n$$\\frac{1}{4}\\begin{bmatrix} 0\u0026amp; *\u0026amp; 2\\\\ 1\u0026amp; 1\u0026amp; 0 \\end{bmatrix}$$\nNote that the DitherPunk package is named after [a very nice blog post](https://surma.dev/things/ditherpunk/) discussing dithering. This post makes reference to [another post](https://forums.tigsource.com/index.php?topic=40832.msg1363742#msg1363742) which discusses dithering in the context of the game [Return of the Obra Dinn](https://en.wikipedia.org/wiki/Return_of_the_Obra_Dinn). This game is set in the year 1807, and to obtain a sort of \u0026#34;antique\u0026#34; look, its developer used dithering techniques extensively to render all scenes using \u0026#34;1-bit graphics\u0026#34;. A glimpse at its [trailer](https://www.dailymotion.com/video/x8afie2) will show you how well this has been done. (Note: I haven\u0026#39;t played the game; I\u0026#39;m not a game player. But as with all games there are plenty of videos - some many hours long - about the game and playing it.)\nColour images Dithering of colour images is very simple: simply dither each of the red, green and blue colour planes, then put them all back together. 
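For readers who don't use Julia, the same idea can be sketched in plain Python with nested lists. This is only an illustration under stated assumptions, not the code used for the images in this post: `dither` and `cdither` are hypothetical Python stand-ins, pixel values are assumed to be floats between 0 and 1, and errors that would fall outside the image are simply dropped rather than handled by padding.

```python
def dither(plane, nlevels=2):
    """Floyd-Steinberg error diffusion on a 2-D list of floats in [0, 1]."""
    rows, cols = len(plane), len(plane[0])
    out = [row[:] for row in plane]  # work on a copy of the plane
    k = nlevels - 1
    for i in range(rows):
        for j in range(cols):
            old = out[i][j]
            # quantise to the nearest of the nlevels values 0, 1/k, ..., 1
            new = min(max(round(old * k), 0), k) / k
            out[i][j] = new
            err = old - new
            # diffuse the error to unvisited neighbours: weights 7, 3, 5, 1 over 16
            for di, dj, w in ((0, 1, 7), (1, -1, 3), (1, 0, 5), (1, 1, 1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < rows and 0 <= jj < cols:
                    out[ii][jj] += err * w / 16
    return out


def cdither(image, nlevels=2):
    """Dither an image given as rows of (r, g, b) tuples, one plane at a time."""
    planes = [[[px[ch] for px in row] for row in image] for ch in range(3)]
    rd, gd, bd = (dither(p, nlevels) for p in planes)
    # put the three dithered planes back together as (r, g, b) tuples
    return [list(zip(r, g, b)) for r, g, b in zip(rd, gd, bd)]


# A tiny flat-colour test image: 3 rows of 4 identical pixels
image = [[(0.2, 0.5, 0.9)] * 4 for _ in range(3)]
result = cdither(image)
```

With two levels every channel of every output pixel is either 0.0 or 1.0, so at most eight distinct colours can appear.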
Assuming the above process for grey level dithering has been encapsulated in a function called dither, we can write:\nfunction cdither(image;nlevels = 2) image_rgb = channelview(image) rd = dither(image_rgb[1,:,:],nlevels) gd = dither(image_rgb[2,:,:],nlevels) bd = dither(image_rgb[3,:,:],nlevels) image_d = colorview(RGB, rd,gd,bd) return(image_d) end Here, for example, are the results of dithering at two and four levels of the \u0026#34;lighthouse\u0026#34; image from the test images database:\nThe first image shows some artefacts as noise, but is still remarkably clear; the second image is surprisingly good. If the three images (original, two-level dither, four-level dither) are saved into variables img, img_d2, img_d4, then:\nJulia\u0026gt; [length(unique(x)) for x in [img,img_d2,img_d4]]\u0026#39; 1×3 adjoint(::Vector{Int64}) with eltype Int64: 29317 8 34 So the second image, clear as it is, uses only 34 distinct colours as opposed to the nearly 30000 of the original image.\nImage dithering (1): half toning\u0026#xa0;\u0026#xa0;\u0026#xa0;julia\u0026#xa0;image_processing Image dithering, also known as half-toning, is a method for reducing the number of colours in an image, while at the same time trying to retain as much of its \u0026#34;look and feel\u0026#34; as possible. Originally this was required for newspaper printing, where no shades of grey were possible, and only black and white could be printed. So a light grey area would be printed as a few dots of black, but mostly white, and a dark grey area with mostly black, with some white spots. 
One of the obvious problems is that the image resolution would be decreased, but in fact the human visual system can interpret an image even after a loss of information.\nAssuming an image to have grey scales between 0.0 (black) and 1.0 (white), one way of half-toning is to threshold the image against copies of the so-called \u0026#34;Bayer\u0026#34; matrices, of which the first two are:\n$$B_2 = \\frac{1}{4}\n\\begin{bmatrix} 0\u0026amp;2\\\\ 3\u0026amp;1 \\end{bmatrix}, \\qquad B_4 = \\frac{1}{16} \\begin{bmatrix} 0\u0026amp;8\u0026amp;2\u0026amp;10\\\\ 12\u0026amp;4\u0026amp;14\u0026amp;6\\\\ 3\u0026amp;11\u0026amp;1\u0026amp;9\\\\ 15\u0026amp;7\u0026amp;13\u0026amp;5 \\end{bmatrix}$$ See the [Wikipedia page](https://en.wikipedia.org/wiki/Ordered_dithering) for discussions and derivation. And here\u0026#39;s a quick example in Julia: #+BEGIN_SRC Julia Julia\u0026gt; using Images, FileIO, TestImages Julia\u0026gt; img = Gray.(testimage(\u0026#34;walkbridge.tif\u0026#34;)); Julia\u0026gt; B4 = Gray.(N0f8.(1/16*[0 8 2 10;12 4 14 6;3 11 1 9;15 7 13 5])); Julia\u0026gt; B512 = repeat(B4,128,128); Julia\u0026gt; img_halftone = Gray(img .\u0026gt; B512); Julia\u0026gt; mosaic(img,img_halftone,nrow=1,npad=10,fillcolour=1) #+END_SRC [[file:/montage_halftone.png]] Note that #+BEGIN_SRC Julia Julia\u0026gt; length(unique(img)), length(unique(img_halftone)) 256, 2 #+END_SRC The original image had 256 different grey levels, the new image has only 2 - yet, even if it is a much poorer image, it still retains a lot of the pictorial aspects of the original. * The Pegasus and related methods for solving equations :mathematics:julia:computation: :PROPERTIES: :EXPORT_FILE_NAME: pegasus_method :EXPORT_DATE: 2023-07-06 :END: In the previous post, we saw that a small change to the method of false position provided much faster convergence, while retaining its bracketing. 
This was the Illinois method which is only one of a whole host of similar methods, some of which converge even faster. And as a reminder, here\u0026#39;s its definition, with a very slight change: Given $x_{i-1}$ and $x_i$ that bracket a root and their function values $f_{i-1}$, $f_i$, first compute the secant value \\[ x_{i+1}=\\frac{x_{i-1} f_i - x_i f_{i-1}}{f_i - f_{i-1}}. \\] and let $f_{i+1}=f(x_{i+1})$. Then: 1. if $f_if_{i+1}\u0026lt;0$, replace $(x_{i-1},f_{i-1})$ with $(x_i,f_i)$ 2. if $f_if_{i+1}\u0026gt;0$, replace $(x_{i-1},f_{i-1})$ with $(x_{i-1},\\gamma f_{i-1})$ with $\\gamma=0.5$. In each case we replace $(x_i,f_i)$ by $(x_{i+1},f_{i+1})$. Much research since has been investigating possible scaling values for $\\gamma$. If $\\gamma$ is to be constant, then it can be shown that $\\gamma=0.5$ is optimal. But $\\gamma$ need not be constant. ** The Pegasus method This was defined by Dowell \u0026amp; Jarratt, whose form of the Illinois method we used in the last post; see their article \u0026#34;The \u0026#39;Pegasus\u0026#39; method for computing the root of an equation.\u0026#34; /BIT Numerical Mathematics/ 12 (1972),pp503-508. 
Here we use \\[ \\gamma = \\frac{f_i}{f_i+f_{i+1}} \\] And here\u0026#39;s the Julia function for defining it; basically the same as the Illinois function of the previous post, with the differences both of $\\gamma$ and of showing the absolute difference between successive iterations: #+BEGIN_SRC Julia function pegasus(f,a,b;num_iter = 20) if f(a)*f(b) \u0026gt; 0 error(\u0026#34;Values given are not guaranteed to bracket a root\u0026#34;) else fa, fb = f(a), f(b) c = b for k in 1:num_iter c_old = c c = b - fb*(b-a)/(fb-fa) fc = f(c) if fb*fc \u0026lt; 0 a, fa = b, fb else fa = fa*fb/(fb+fc) end b, fb = c, fc @printf(\u0026#34;%2d: %1.60f, %1.15e\\n\u0026#34;,k,c,abs(c_old-c)) #println(c,\u0026#34;, \u0026#34;,abs(c_old-c)) end end end #+END_SRC And with the same function $f(x)=x^5-2$: #+BEGIN_SRC Julia Julia\u0026gt; pegasus(f,BigFloat(\u0026#34;2\u0026#34;),BigFloat(\u0026#34;3\u0026#34;)) 1: 1.032258064516129032258064516129032258064516129032258064516128, 9.677419354838710e-01 2: 1.058249216160286723536401978566436704093370062081736874593804, 2.599115164415769e-02 3: 1.095035652659330505424147240084976534578418952240763525441535, 3.678643649904378e-02 4: 1.131485704080638653175790037904708402277455906284414289399986, 3.645005142130815e-02 5: 1.147884687198048718506398066361222614002745776909137282797727, 1.639898311741007e-02 6: 1.148720321893174344989720370927796480725850707839146637042703, 8.356346951256265e-04 7: 1.148698323855563082475350143443841164951825177665336110156093, 2.199803761126251e-05 8: 1.148698354995843974508573437584337692570858791644071608835544, 3.114028089203322e-08 9: 1.148698354997035006796886004382924209539326506468986782249386, 1.191032288312567e-12 10: 1.148698354997035006798626946777931199507739956238584846007635, 1.740942395006990e-21 11: 1.148698354997035006798626946777927589443850889097797494571041, 3.610063889067141e-33 12: 1.148698354997035006798626946777927589443850889097797505513712, 1.094267079108742e-53 13: 
1.148698354997035006798626946777927589443850889097797505513712, 0.000000000000000e+00 14: 1.148698354997035006798626946777927589443850889097797505513711, 1.244603055572228e-60 15: 1.148698354997035006798626946777927589443850889097797505513711, 0.000000000000000e+00 16: 1.148698354997035006798626946777927589443850889097797505513712, 1.244603055572228e-60 17: 1.148698354997035006798626946777927589443850889097797505513711, 1.244603055572228e-60 18: 1.148698354997035006798626946777927589443850889097797505513711, 0.000000000000000e+00 19: 1.148698354997035006798626946777927589443850889097797505513712, 1.244603055572228e-60 20: 1.148698354997035006798626946777927589443850889097797505513711, 1.244603055572228e-60 #+END_SRC This is indeed faster than the Illinois method, with an efficiency index of $E\\approx 1.64232$. For another example, compute the value of Lambert\u0026#39;s W function $W(100)$; this is the solution of $xe^x-100=0$, since Lambert\u0026#39;s function is the inverse of $y=xe^x$. #+BEGIN_SRC Julia Julia\u0026gt; pegasus(x-\u0026gt;x*exp(x)-100,BigFloat(\u0026#34;3\u0026#34;),BigFloat(\u0026#34;4\u0026#34;),num_iter=11) 1: 3.251324125460162218273541775021703846514592591436893826064677, 7.486758745398378e-01 2: 3.340634428196726809051441645629843725138212599102446141079158, 8.931030273656459e-02 3: 3.380778785367380815168698736373250080216344738301354721341292, 4.014435717065401e-02 4: 3.385665875268764090984779722929318950325131118803147296896820, 4.887089901383276e-03 5: 3.385630033731458485133511320033990390684414498737973972888242, 3.584153730560585e-05 6: 3.385630140287712138407043176029520651934410763166067578057173, 1.065562536532735e-07 7: 3.385630140290050184887119964591950428421045978948047751759305, 2.338046480076789e-12 8: 3.385630140290050184888244364529728481623112988246686294800948, 1.124399937778053e-21 9: 3.385630140290050184888244364529726867491694170157806679271792, 1.614131418818089e-33 10: 
3.385630140290050184888244364529726867491694170157806680386175, 1.114382727073817e-54 11: 3.385630140290050184888244364529726867491694170157806680386175, 0.000000000000000e+00 #+END_SRC ** Other similar methods These differ only in the definition of the scaling value $\\gamma$; several are given by J. A. Ford in \u0026#34;Improved Algorithms of Illinois-Type for the Numerical Solution of Nonlinear Equations\u0026#34; /University of Essex, Department of Computer Science/ (1995). This article is happily [available online](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.53.8676\u0026amp;rep=rep1\u0026amp;type=pdf). Following Ford, define \\[ \\phi_k=\\frac{f_{k+1}}{f_k} \\] and then: | Method | $\\gamma$ | Efficiency Index | |-------------------+-----------------------------------------------+--------------------| | Anderson \u0026amp; Björck | if $f_i\u0026gt;f_{i+1}$ then $1-\\phi_i$ else 0.5 | 1.70998 or 1.68179 | | Ford method 1 | $(1-\\phi_i-\\phi_{i-1})/(1+\\phi_i-\\phi_{i-1})$ | 1.55113 | | Ford method 2 | $(1-\\phi_i)/(1-\\phi_{i-1})$ | 1.61803 | | Ford method 3 | $1-(\\phi_i/(1-\\phi_{i-1}))$ | 1.70998 | | Ford method 4 | $1-\\phi_i-\\phi_{i-1}$ | 1.68179 | | Ford method 5 | $(1-\\phi_i)/(1+\\phi_i-\\phi_{i-1})$ | not given | The efficiency of the Anderson-Björck method depends on whether \\[ K = \\left(\\frac{c_2}{c_1}\\right)^{\\!2}-\\frac{c_3}{c_1} \\] is positive or negative, where \\[ c_k = \\frac{f^{(k)}(x^*)}{k!},\\;k\\ge 1 \\] and $x^*$ is the solution. Note that the $c_k$ values are simply the coefficients of the Taylor series expansion of $f(x)$ about the root $x^*$; that is, since $f(x^*)=0$, \\[ f(x) = c_1(x-x^*)+c_2(x-x^*)^2+c_3(x-x^*)^3+\\cdots \\] * The Illinois method for solving equations :mathematics:julia:computation: :PROPERTIES: :EXPORT_FILE_NAME: illinois_method :EXPORT_DATE: 2023-07-05 :END: Such a long time since a last post! Well, that\u0026#39;s academic life for you ... 
If you look at pretty much any modern textbook on numerical methods, of which there are many, you\u0026#39;ll find that the following methods will be given for the solution of a single non-linear equation $f(x)=0$: - direct iteration, also known as [fixed-point iteration](https://en.wikipedia.org/wiki/Fixed-point_iteration) - [bisection method](https://en.wikipedia.org/wiki/Bisection_method) - method of false position, also known as [regula falsi](https://en.wikipedia.org/wiki/Regula_falsi) - [secant method](https://en.wikipedia.org/wiki/Secant_method) - [Newton\u0026#39;s method](https://en.wikipedia.org/wiki/Newton%27s_method), also known as the Newton-Raphson method Occasionally a text might specify, or mention, one or two more, but these five seem to be the \u0026#34;classic\u0026#34; methods. All of the above have their advantages and disadvantages: - direct iteration is easy, but can\u0026#39;t be guaranteed to converge, nor converge to a particular solution - bisection /is/ guaranteed to work (assuming the function $f(x)$ is continuous in a neighbourhood of the solution), but is very slow - method of false position is supposed to improve on bisection, but has its own problems, including also slow convergence - the secant method is quite fast (order of convergence is about 1.6) but is not guaranteed to converge always - Newton\u0026#39;s method is fast (quadratic convergence), theoretically very straightforward, but does require the computation of the derivative $f\u0026#39;(x)$ and is not guaranteed to converge. 
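As a rough illustration of the bisection entry in the list above, here is a minimal sketch. The name =bisect= and its keyword =niter= are my own choices for illustration, not part of any package. Each pass halves the bracket, so 20 passes shrink an interval of width 1 to a width of about $10^{-6}$:

#+BEGIN_SRC Julia
# Sketch only: plain bisection on a bracketing interval [a,b].
function bisect(f, a, b; niter = 20)
    fa = f(a)
    # the end points must give function values of opposite sign
    fa*f(b) > 0 && throw(DomainError((a, b)))
    for k in 1:niter
        c = (a + b)/2
        fc = f(c)
        if fa*fc <= 0
            b = c            # the root lies in [a,c]
        else
            a, fa = c, fc    # the root lies in [c,b]
        end
    end
    (a + b)/2                # midpoint of the final bracket
end
#+END_SRC

Unlike the printing functions used later in this post, this sketch returns its estimate, so it can be compared directly with a known root.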
Methods such as the Brent-Dekker-van Wijngaarden method (also known as [Brent\u0026#39;s method](https://en.wikipedia.org/wiki/Brent%27s_method)) and [Ridders\u0026#39; method](https://en.wikipedia.org/wiki/Ridders%27_method), both of which may be considered as a sort of amalgam of bisection and inverse quadratic interpolation, are generally not covered in introductory texts, although some of the newer methods are both simple and guaranteed to converge quickly. All these methods have the advantage of not requiring the computation of the derivative. This blog post is about a variation of the method of false position, which is amazingly simple, and yet extremely fast. ** The Illinois method This method seems to go back to 1953 when it was published in an internal memo at the University of Illinois Computer Laboratory by J. N. Snyder, \u0026#34;Inverse interpolation, a real root of $f(x)=0$\u0026#34;, /University of Illinois Digital Computer Laboratory, ILLIAC I Library Routine H1-71 4/. Since then it seems to have been called the \u0026#34;Illinois method\u0026#34; by almost everybody, although a few writers are now trying to name it \u0026#34;Snyder\u0026#39;s method\u0026#34;. To start, note a well known problem with false position: if the function is concave (or convex) in a neighbourhood of the root including the bracketing interval, then the values will converge from one side only: [[file:/false_position.png]] This slows down convergence. Snyder\u0026#39;s insight was that if this behaviour started, that is, if there were two consecutive iterations $x_i$, $x_{i+1}$ on one side of the root, then for the next iteration the secant would be computed not with the function value $f(x_n)$ at the end point $x_n$ on the other side of the root, but with /half/ that function value, $f(x_n)/2$. The algorithm for making this work has been described thus by M. Dowell and P. Jarratt (\u0026#34;A modified regula falsi method for computing the root of an equation\u0026#34;, /BIT 11/, 1971, pp. 168-174).
At each stage we keep track of the $x$ values $x_i$ and the corresponding function values $f_i=f(x_i)$. As usual $x_{i-1}$ and $x_i$ bracket the root. Start by performing the usual secant operation \\[ x_{i+1}=x_i-f(x_i)\\frac{x_i-x_{i-1}}{f(x_i)-f(x_{i-1})}=\\frac{x_{i-1}f(x_i)-x_if(x_{i-1})}{f(x_i)-f(x_{i-1})} \\] and set $f_{i+1}=f(x_{i+1})$. Then: 1. if $f_if_{i+1}\u0026lt;0$, replace $(x_{i-1},f_{i-1})$ with $(x_i,f_i)$ 2. if $f_if_{i+1}\u0026gt;0$, replace $(x_{i-1},f_{i-1})$ with $(x_{i-1},f_{i-1}/2)$ In each case we replace $(x_i,f_i)$ by $(x_{i+1},f_{i+1})$. Before we show how fast this can be, here\u0026#39;s a function to perform false position, in Julia: #+BEGIN_SRC Julia using Printf function false_pos(f,a,b;niter = 20) if f(a)*f(b)\u0026gt;0 error(\u0026#34;Values are not guaranteed to bracket a root\u0026#34;) else for k in 1:niter c = b - f(b)*(b-a)/(f(b)-f(a)) if f(b)*f(c)\u0026lt;0 a = b end b = c @printf(\u0026#34;%2d: %.15f, %.15f, %.15f\\n\u0026#34;,k,a,b,abs(a-b)) end end end #+END_SRC Note that we have adopted the Dowell-Jarratt logic, so that if $f_if_{i+1}\u0026gt;0$, then $x_{i-1}$ is left unchanged. And here $a$, $b$, $c$ correspond to $x_{i-1}$, $x_i$, and $x_{i+1}$. This function, as you see, doesn\u0026#39;t so much return a value as simply print out the current bracketing values, along with their difference.
Here\u0026#39;s an example: #+BEGIN_SRC Julia Julia\u0026gt; f(x) = x^5 - 2 Julia\u0026gt; false_pos(f,0.5,1.5) 1: 1.500000000000000, 0.760330578512397, 0.739669421487603 2: 1.500000000000000, 0.936277160385007, 0.563722839614993 3: 1.500000000000000, 1.041285513445667, 0.458714486554333 4: 1.500000000000000, 1.097156710176020, 0.402843289823980 5: 1.500000000000000, 1.124679454971997, 0.375320545028003 6: 1.500000000000000, 1.137668857062543, 0.362331142937457 7: 1.500000000000000, 1.143668984638562, 0.356331015361438 8: 1.500000000000000, 1.146412444361109, 0.353587555638891 9: 1.500000000000000, 1.147660927013766, 0.352339072986234 10: 1.500000000000000, 1.148227852400553, 0.351772147599447 11: 1.500000000000000, 1.148485034668113, 0.351514965331887 12: 1.500000000000000, 1.148601651598731, 0.351398348401269 13: 1.500000000000000, 1.148654519726769, 0.351345480273231 14: 1.500000000000000, 1.148678485212590, 0.351321514787410 15: 1.500000000000000, 1.148689348478166, 0.351310651521834 16: 1.500000000000000, 1.148694272572126, 0.351305727427874 17: 1.500000000000000, 1.148696504543074, 0.351303495456926 18: 1.500000000000000, 1.148697516236793, 0.351302483763207 19: 1.500000000000000, 1.148697974810134, 0.351302025189866 20: 1.500000000000000, 1.148698182668834, 0.351301817331166 #+END_SRC In fact, because of the problem we showed earlier, which this example exemplifies, the distance between bracketing values converges to a non-zero value (hence is more-or-less irrelevant as a measure of convergence), and it\u0026#39;s the values in the second column which converge to the root. Since $2^{1/5}=1.148698354997035$, we have got only 6 decimal places at 20 iterations, which is no faster than bisection. Here\u0026#39;s the Illinois method. 
Note that it is very similar to the above false position function, except that we are also keeping track of function values: #+BEGIN_SRC Julia function illinois(f,a,b;num_iter = 20) if f(a)*f(b) \u0026gt; 0 error(\u0026#34;Values given are not guaranteed to bracket a root\u0026#34;) else fa, fb = f(a), f(b) for k in 1:num_iter c = b - fb*(b-a)/(fb-fa) fc = f(c) if fb*fc \u0026lt; 0 a, fa = b, fb else fa = fa/2 end b, fb = c, fc @printf(\u0026#34;%2d: %.15f, %.15f, %.15f\\n\u0026#34;,k,a,b,abs(a-b)) end end end #+END_SRC And with the same function and initial bracketing values as before: #+BEGIN_SRC Julia julia\u0026gt; illinois(f,0.5,1.5) 1: 1.500000000000000, 0.760330578512397, 0.739669421487603 2: 1.500000000000000, 0.936277160385007, 0.563722839614993 3: 1.500000000000000, 1.113315730198992, 0.386684269801008 4: 1.113315730198992, 1.179659804462764, 0.066344074263773 5: 1.179659804462764, 1.146786019205345, 0.032873785257419 6: 1.179659804462764, 1.148597847114352, 0.031061957348412 7: 1.148597847114352, 1.148787731780184, 0.000189884665832 8: 1.148787731780184, 1.148698339356448, 0.000089392423736 9: 1.148787731780184, 1.148698354994601, 0.000089376785583 10: 1.148698354994601, 1.148698354999468, 0.000000000004867 11: 1.148698354994601, 1.148698354997035, 0.000000000002434 12: 1.148698354994601, 1.148698354997035, 0.000000000002434 13: 1.148698354997035, 1.148698354997035, 0.000000000000000 14: 1.148698354997035, 1.148698354997035, 0.000000000000000 15: 1.148698354997035, 1.148698354997035, 0.000000000000000 16: 1.148698354997035, 1.148698354997035, 0.000000000000000 17: 1.148698354997035, 1.148698354997035, 0.000000000000000 18: 1.148698354997035, 1.148698354997035, 0.000000000000000 19: 1.148698354997035, 1.148698354997035, 0.000000000000000 20: 1.148698354997035, 1.148698354997035, 0.000000000000000 #+END_SRC Here the bracketing values do indeed get close together so that their difference converges to zero (as you\u0026#39;d expect), and also by 13 
iterations we have obtained 15 decimal place accuracy. Dowell and Jarratt show that this method has an efficiency index $E = 3^{1/3}\\approx 1.442$; here $E=p^{1/C}$ where $p$ is the order of convergence and $C$ is the \u0026#34;cost\u0026#34; (measured in function evaluations). For the Illinois method a cycle of three iterations, each costing one function evaluation, has order 3. As we see in the example below, the number of correct significant figures roughly triples every three iterations. A more useful output can be obtained by slightly adjusting the above function so that it prints out the most recent values, and their successive absolute differences. Even better, we can use =BigFloat= to see a few more digits: #+BEGIN_SRC Julia Julia\u0026gt; setprecision(200) Julia\u0026gt; illinois(f,BigFloat(\u0026#34;0.5\u0026#34;),BigFloat(\u0026#34;1.5\u0026#34;)) 1: 0.760330578512396694214876033057851239669421487603305785123967, 7.396694214876033e-01 2: 0.936277160385007078769511771881822637553598228912035496092550, 1.759465818726104e-01 3: 1.113315730198991546263163399473256828664229343275663554233298, 1.770385698139845e-01 4: 1.179659804462764330162645293397126720145901264006900448944524, 6.634407426377278e-02 5: 1.146786019205344856070782528108967774498747386650573905592277, 3.287378525741947e-02 6: 1.148597847114352112643459413647963815311973435038986609382286, 1.811827909007257e-03 7: 1.148787731780183703868308039549154052483519742061776232119911, 1.898846658315912e-04 8: 1.148698339356448159061880000968862518745749828816440530467845, 8.939242373554481e-05 9: 1.148698354994601301571564951679026859974778940076712488922696, 1.563815314250968e-08 10: 1.148698354999467954514826379562077090994393405782639396087335, 4.866652943261428e-12 11: 1.148698354997035006798616637583092713250045478960431687721718, 2.432947716209742e-12 12: 1.148698354997035006798626946777927545774018995974486715443598, 1.030919483483252e-23 13: 1.148698354997035006798626946777927633113682781851136780227430, 8.733966378587665e-35 14: 
1.148698354997035006798626946777927589443850889097797505513711, 4.366983189275334e-35 15: 1.148698354997035006798626946777927589443850889097797505513711, 0.000000000000000e+00 16: 1.148698354997035006798626946777927589443850889097797505513712, 1.244603055572228e-60 17: 1.148698354997035006798626946777927589443850889097797505513711, 1.244603055572228e-60 18: 1.148698354997035006798626946777927589443850889097797505513711, 0.000000000000000e+00 19: 1.148698354997035006798626946777927589443850889097797505513712, 1.244603055572228e-60 20: 1.148698354997035006798626946777927589443850889097797505513711, 1.244603055572228e-60 #+END_SRC The speed of reaching 60 decimal place accuracy is very much in keeping with the order of convergence being about 1.4. Alternatively, we\u0026#39;d expect the number of correct significant figures to roughly triple each three iterations. The Illinois method is disarmingly simple, produces excellent results, and since it\u0026#39;s a bracketing method, will be guaranteed to converge. What\u0026#39;s not to like? Time to get it back in the textbooks! * Carroll\u0026#39;s \u0026#34;improved\u0026#34; Doublets: allowing permutations :programming:julia: :PROPERTIES: :EXPORT_FILE_NAME: doublets_with_permutations :EXPORT_DATE: 2022-11-07 :END: Carroll originally invented his Doublets in 1877, they were published in \u0026#34;Vanity Fair\u0026#34; (the magazine, not the Thackeray novel) in 1879. Some years later, in an 1892 letter, Carroll added another rule: that permutations were allowed. 
This allows very neat chains such as: #+begin_export html \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; roses, noses, notes, steno, stent, scent \u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+end_export Because the words stay the same length here, but more connectivity is allowed, we would expect that not only would the largest connected component of the graph be bigger than before, but that chains would be shorter. And this time we can connect \u0026#34;chair\u0026#34; with \u0026#34;table\u0026#34;: #+begin_export html \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; chair, chain, china, chink, clink, blink, blind, blend, blent, bleat, table \u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+end_export To test if two words are permutations, we can create two small functions: #+begin_src Julia Julia\u0026gt; ssort(x) = join(sort(collect(x))) Julia\u0026gt; scmp(x,y) = cmp(ssort(x),ssort(y)) #+end_src The first function =ssort= simply alphabetises a string; the second function =scmp= compares the sorted forms of two strings, and returns zero if they are identical, that is, if the two words are permutations of each other. We can then create the graph. As before, we\u0026#39;ll start with the \u0026#34;medium\u0026#34; word list, and its sublist of five-letter words. #+begin_src Julia Julia\u0026gt; nw = length(words5) Julia\u0026gt; G5 = Graph(nw) Julia\u0026gt; for i in 1:nw for j in i+1:nw wi = words5[i] wj = words5[j] if (Hamming()(wi,wj) == 1) || (scmp(wi,wj) == 0) add_edge!(G5,i,j) end end end #+end_src This graph G5 has 4388 vertices and 11107 edges. As before, find the largest connected component: #+begin_src Julia Julia\u0026gt; CC = connected_components(G5) Julia\u0026gt; CL = map(length, CC) Julia\u0026gt; mx, indx = findmax(CL) Julia\u0026gt; C1, vmap1 = induced_subgraph(G5,CC[indx]) #+end_src This new graph has 3665 vertices and 10946 edges.
This is larger than the largest component of the graph using only the Hamming distance, which had 3315 vertices. Rather than just find aloof words, we\u0026#39;ll find the number of connected components of all sizes; it turns out that there are only a small number of different such sizes: #+begin_src Julia Julia\u0026gt; u = sort(unique(CL), rev = true) Julia\u0026gt; show(u) [3665, 12, 6, 5, 4, 3, 2, 1] Julia\u0026gt; freqs = zeros(Int16,2,length(u)) Julia\u0026gt; for i in 1:length(u) freqs[1,i] = u[i] freqs[2,i] = count(x -\u0026gt; x == u[i], CL) end Julia\u0026gt; display(freqs) 2×8 Matrix{Int16}: 3665 12 6 5 4 3 2 1 1 1 2 4 4 18 62 485 #+end_src We see that there are 485 aloof words (fewer than before), and various small components. The components between 4 and 12 words are: #+begin_export html \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; alive, voice, alike, olive, voile allow, local, loyal, royal, vocal, aglow, allay, alley, allot, alloy, focal, atoll group, tutor, trout, croup, grout radio, ratio, patio, radii amber, embed, ebbed, ember, umbel, umber chief, thief, fiche, niche fizzy, fuzzy, dizzy, tizzy moron, bacon, baron, baton, boron ocean, canoe, canon, capon pupil, papal, papas, pupae, pupal buddy, giddy, muddy, ruddy, biddy, middy \u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+end_export And now for the longest ladder: #+begin_src Julia Julia\u0026gt; eccs = eccentricity(C1); Julia\u0026gt; mx = maximum(eccs) Julia\u0026gt; inds = findall(x -\u0026gt; x==mx, eccs) Julia\u0026gt; show(inds) [83, 2984, 3024] #+end_src These correspond to the words \u0026#34;court\u0026#34;, \u0026#34;cabby\u0026#34;, \u0026#34;gabby\u0026#34;. The last two words are adjacent in the graph, but each of the other pairs produces a ladder of the maximum length, which is 27 words.
Here\u0026#39;s one of them: #+begin_src Julia Julia\u0026gt; ld = ladder(subwords[inds[1]], subwords[inds[2]]) Julia\u0026gt; println(join(ld,\u0026#34;, \u0026#34;),\u0026#34;\\nlength is \u0026#34;,length(ld)) court, count, mount, mound, wound, would, could, cloud, clout, flout, flour, floor, flood, blood, brood, broad, board, boars, boors, boobs, booby, bobby, hobby, hubby, tubby, tabby, cabby length is 27 #+end_src * Super Doublets: more word ladders with Julia :programming:julia: :PROPERTIES: :EXPORT_FILE_NAME: super_doublets_more_word_ladders_with_julia :EXPORT_DATE: 2022-11-05 :END: Apparently there\u0026#39;s a version of Doublets (see previous post) which allows you to add or delete a letter each turn. Thus we can go from WHEAT to BREAD as WHEAT, HEAT, HEAD, READ, BREAD which is shorter than the ladder given in that previous post. We can easily adjust the material from that post to implement this new version. There are two major differences: 1. We have to use all the words in the list, since with additions and deletions, all words are potential elements in a ladder. That is, we can\u0026#39;t restrict words by length. 2. The distance between words is no longer the Hamming distance. For this version we need the [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance). This counts the minimum number of additions, deletions, and replacements needed to go from one string to another. The new version of Doublets thus requires that the Levenshtein distance between two words is 1. Other than that it\u0026#39;s all the same as previously. Starting with the list =medium.wds=, constructing the graph (which has 59577 vertices and 78888 edges), and determining the eccentricities of the largest connected component now take a longer time: constructing the graph took over 24 minutes, and the eccentricities took a bit under 4 1/2 minutes on my machine. The largest connected component has 23801 words; the next largest has 41 words (see below).
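One possible way to cut down the time spent building the graph, which I have not timed against the figures above, is to notice that we never need the actual Levenshtein distance, only whether it is exactly 1. Here is a sketch of such a test. The name =one_edit= is my own and is not part of StringDistances, and the sketch assumes plain ASCII words so that letters can be picked out by position:

#+begin_src Julia
# Sketch only: true when a and b differ by exactly one replacement,
# insertion or deletion (that is, their Levenshtein distance is 1).
# Assumes ASCII words, so that a[i] is the i-th letter.
function one_edit(a, b)
    la, lb = length(a), length(b)
    la > lb && return one_edit(b, a)   # make a the shorter word
    lb - la > 1 && return false        # lengths differ by too much
    i = 1
    while i <= la && a[i] == b[i]      # skip the common prefix
        i += 1
    end
    if la == lb
        # same length: one replacement at i, and the rest must agree
        return i <= la && a[i+1:end] == b[i+1:end]
    else
        # b is one letter longer: drop b[i], and the rest must agree
        return a[i:end] == b[i+1:end]
    end
end
#+end_src

In the double loop that builds the graph, a test like this would stand in for comparing the Levenshtein distance with 1; pairs of words whose lengths differ by more than one letter are rejected immediately.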
There are also 14688 aloof words; here\u0026#39;s a random 10 of them: #+begin_export html \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; evading, foliage, irksome, thalami, discrediting, embodiment, absorption, persisted, supplementing, dispatch\u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+end_export We do see some reductions of ladder lengths. Getting \u0026#34;scent\u0026#34; from \u0026#34;roses\u0026#34; took 11 words, but in this new version it\u0026#39;s quicker: #+begin_export html \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; roses, roes, res, rest, rent, cent, scent \u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+end_export And the longest word ladder has 42 words: #+begin_export html \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; hammerings, hammering, hampering, pampering, papering, capering, catering, cantering, bantering, battering, bettering, fettering, festering, pestering, petering, peering, peeing, pieing, piing, ping, pine, pane, pale, paled, pealed, peeled, peered, petered, pestered, festered, fettered, bettered, battered, bantered, cantered, catered, capered, tapered, tampered, hampered, hammered, yammered \u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+end_export Interestingly, we might expect that most words are connected, but in fact there are 25337 connected components. One of them consists of #+begin_export html \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; metabolise, metabolised, metabolises, metabolism, metabolisms, metabolite, metabolites, metabolize, metabolized, metabolizes \u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+end_export Many of the connected components seem to be like this: all based around one particular word with its various grammatical forms. 
The two next-largest connected components contain 41 words each. Here\u0026#39;s one: #+begin_export html \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; complete, completed, completes, compose, composed, composes, compute, computed, computer, computers, computes, comfort, compete, competed, competes, composer, composers, comforted, comforts, commune, communed, communes, commute, commuted, commuter, commuters, commutes, completer, completest, complexes, compost, composted, composts, comforter, comforters, complected, comport, comported, comports, compote, compotes \u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+end_export * Word ladders with Julia :programming:julia: :PROPERTIES: :EXPORT_FILE_NAME: word_ladders_with_julia :EXPORT_DATE: 2022-11-03 :END: ** Lewis Carroll\u0026#39;s game of Doublets Such a long time since my last post! Well, that\u0026#39;s the working life for you. Anyway, recently I was reading about Lewis Carroll - always one of my favourite people - and was reminded of his word game \u0026#34;Doublets\u0026#34; in which one word is turned into another by changing one letter at a time, each new word being English. You can read Carroll\u0026#39;s original description [here](https://lewiscarrollresources.net/doublets/). Note his last sentence: \u0026#34;It is, perhaps, needless to state that it is /de rigueur/ that the links should be English words, such as might be used in good society.\u0026#34; Carroll, it seems, frowned on slang words, or \u0026#34;low\u0026#34; words - very much in keeping with his personality and with his social and professional positions. One of his examples was \u0026#34;Change WHEAT into BREAD\u0026#34; which has an answer: WHEAT CHEAT CHEAP CHEEP CREEP CREED BREED BREAD.
Clearly we would want the length of the chain of words to be as short as possible; with a lower bound being the Hamming distance between the words: the number of places in which letters are different. This distance is 3 for WHEAT and BREAD and so the minimum changes will be 3. But in practice chains are longer, as the one above. English simply doesn\u0026#39;t contain all possible words of 5 letters, and so we can\u0026#39;t have, for example: WHEAT WHEAD WREAD BREAD This form of word puzzle, so simple and addictive, has been resurrected many times, often under such names as \u0026#34;word ladders\u0026#34;, or \u0026#34;laddergrams\u0026#34;. ** Obtaining word lists Every computer system will have a spell-check list on it; on a Linux system these are usually found under =/usr/share/dict= . My system, running Arch Linux has these lists: #+begin_src bash $ wc -l /usr/share/dict/* 123115 /usr/share/dict/american-english 127466 /usr/share/dict/british-english 189057 /usr/share/dict/catalan 54763 /usr/share/dict/cracklib-small 88328 /usr/share/dict/finnish 221377 /usr/share/dict/french 304736 /usr/share/dict/german 92034 /usr/share/dict/italian 76258 /usr/share/dict/ogerman 56329 /usr/share/dict/spanish 123115 /usr/share/dict/usa #+end_src These all come from the package [words](https://archlinux.org/packages/community/any/words/), which is a collection of spell-check dictionaries. Although the sizes of the English word lists may seem impressive, there are bigger lists available. One of the best is at [SCOWL](http://wordlist.aspell.net) (Spell Checker Oriented Word Lists). You can download the compressed SCOWL file, and when uncompressed you\u0026#39;ll find it contains a directory called =final=. 
In this directory, the largest files are those of the sort =english-words.*=, and here\u0026#39;s how big they are: #+begin_src bash $ wc -l english-words.* 4373 english-words.10 7951 english-words.20 36103 english-words.35 6391 english-words.40 23796 english-words.50 6233 english-words.55 13438 english-words.60 33270 english-words.70 139209 english-words.80 219489 english-words.95 490253 total #+end_src These lists contain increasingly abstruse and unusual words. Thus =english-words.10= contains words that most adept speakers of English would know: #+begin_src bash $ shuf -n 10 english-words.10 errors hints green connections still mouth category\u0026#39;s pi won\u0026#39;s varied #+end_src At the other end, =english-words.95= consists of unusual, obscure words unlikely to be in a common vocabulary; many of them are seldom used, have very specific meanings, or are technical: #+begin_src bash $ shuf -n 10 english-words.95 deutschemark\u0026#39;s disingenious retanner advancer\u0026#39;s shlimazl unpontifical nonrequirement peccancy\u0026#39;s photozinco nonuniting #+end_src This is the list which contains some splendid biological terms: \u0026#34;bdelloid\u0026#34;, which is a class of microscopic water animals called [rotifers](https://en.wikipedia.org/wiki/Bdelloidea); \u0026#34;ctenizidae\u0026#34;, a small family of spiders (but the list does not contain \u0026#34;ctenidae\u0026#34;, the much larger family of wandering spiders, including the infamous Brazilian wandering spider); \u0026#34;cnidocyst\u0026#34; which forms the stinging mechanism in the cells of the tentacles of jellyfish. Putting the lists 10, 20, 35 together makes a \u0026#34;small\u0026#34; list; adding in 40 and 50 gives a \u0026#34;medium\u0026#34; list; next include 55, 60, 70 for \u0026#34;large\u0026#34;; 80 for \u0026#34;huge\u0026#34;, and 95 for \u0026#34;insane\u0026#34;.
The author of SCOWL claims that 60 is the right level for spell checking, while: \u0026#34;The 95 contains just about every English word in existence and then some. Many of the words at the 95 level will probably not be considered valid English words by most people.\u0026#34; (This is not true. I\u0026#39;ve discovered some words not in any list. One such is \u0026#34;buckleys\u0026#34;, as in \u0026#34;He hasn\u0026#39;t got buckleys chance\u0026#34;, sometimes spelled with a capital B, and meaning \u0026#34;He hasn\u0026#39;t got a chance\u0026#34;; that is, no chance at all. \u0026#34;Buckley\u0026#34; is only in the list of proper names, but given this usage it should be at least in the Australian words list. Which it isn\u0026#39;t.) You will also see from above that some words contain apostrophes: these words will need to be weeded out. Here\u0026#39;s how to make a list and clean it up: #+begin_src bash $ cat english-words.10 english-words.20 english-words.35 \u0026gt; english-words-small.txt $ grep -v \u0026#34;\u0026#39;\u0026#34; english-words-small.txt \u0026gt; small.wds #+end_src This approach will yield five lists with the following numbers of words: #+begin_src bash $ wc -l *.wds 234563 huge.wds 414365 insane.wds 107729 large.wds 59577 medium.wds 38013 small.wds 854247 total #+end_src These can be read into Julia. These lists are unsorted, but that won\u0026#39;t be an issue for our use of them. But you can certainly include a sort along the way. Another list is available at https://github.com/dwyl/english-words which claims to have \u0026#34;over 466k English words\u0026#34;. However, this list is not as carefully curated as is SCOWL. Finally, note that, unlike the original lists, our lists are not disjoint. Each list includes its predecessor, so that =insane.wds= contains all of the words in all of the lists.
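The same sort of clean-up can also be done inside Julia once a list has been read in. Here is a small sketch; the name =clean= is my own, and note that it is stricter than the =grep= above, since it drops every entry containing any character that is not a letter, not just apostrophes. It also sorts the list and removes duplicates along the way:

#+begin_src Julia
# Sketch only: keep the entries made up entirely of letters,
# then remove duplicates and sort.
clean(ws) = sort(unique(filter(w -> all(isletter, w), ws)))
#+end_src

Applied to the result of =readlines= on one of the raw SCOWL files, this should give a list playing the same role as the corresponding =.wds= file, although the counts may differ slightly from those above because of the stricter filter.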
** Using graph theory The computation of word ladders can easily be managed using the tools of graph theory, with vertices being the words, and two vertices being adjacent if their Hamming distance is 1. Finding a word ladder then amounts to finding a shortest path between two vertices. There is a problem though, as Donald Knuth discovered when he launched the first computerization of this puzzle, of which an explanation is available in his 1994 book [The Stanford GraphBase](https://www-cs-faculty.stanford.edu/~knuth/sgb.html). This page, you\u0026#39;ll notice, contains \u0026#34;the 5757 five-letter words of English\u0026#34;. However, deciding what is and what isn\u0026#39;t an English word can be tricky: American versus English spellings, dialect words, newly created words and so on. I touched on this in an [earlier post](https://numbersandshapes.net/posts/five_letter_words_in_english/). Knuth also found that there were 671 words which were connected to no others; he called these \u0026#34;aloof words\u0026#34;, with ALOOF, of course, being one of them. ** Using Julia Although Python has its mature and powerful [NetworkX](https://networkx.org) package for graph theory and network analysis, Python is too slow for this application: we are looking at very large graphs of many thousands of vertices, and computing the edges is a non-trivial task. So our choice of language is Julia. Julia\u0026#39;s graph theory packages are in a bit of a state of flux: an old package ~Graphs.jl~ is unmaintained, as is ~LightGraphs.jl~. However, this latter package is receiving a new lease of life with the unfortunately confusing name of ~Graphs.jl~, and is designed to be \u0026#34;functionally equivalent\u0026#34; to ~LightGraphs.jl~. This is the package I\u0026#39;ll be using. ** Setting up the graph We\u0026#39;ll use the ~medium.wds~ dictionary since it\u0026#39;s relatively small, and we\u0026#39;ll look at five letter words.
Using six-letter words or the larger list will then be a simple matter of changing a few parameters. We start by opening the list in Julia: #+BEGIN_SRC Julia Julia\u0026gt; f = open(\u0026#34;medium.wds\u0026#34;, \u0026#34;r\u0026#34;) Julia\u0026gt; words = readlines(f) Julia\u0026gt; length(words) 59577 #+END_SRC Now we can easily extract the five-letter words, and set up the graph, first of all loading the Graphs package. We also need the StringDistances package to find the Hamming distance. #+BEGIN_SRC Julia Julia\u0026gt; using Graphs, StringDistances Julia\u0026gt; words5 = filter(x -\u0026gt; length(x)==5, words) Julia\u0026gt; w5 = length(words5) 4388 Julia\u0026gt; G5 = Graph() Julia\u0026gt; add_vertices!(G5, w5); #+END_SRC Now the edges: #+BEGIN_SRC Julia Julia\u0026gt; for i in 1:w5 for j in i+1:w5 if Hamming()(words5[i],words5[j]) == 1 add_edge!(G5,i,j) end end end Julia\u0026gt; #+END_SRC Note that there is a Julia package ~MetaGraph.jl~ which allows you to add labels to edges. However, it\u0026#39;s just as easy to use the vertex numbers as indices into the list of words. We can\u0026#39;t use the graph G5 directly, as it is not a connected graph (remember Donald Knuth\u0026#39;s \u0026#34;aloof\u0026#34; words?) We\u0026#39;ll do two things: find the aloof words, and choose from G5 the largest connected component. First the aloof words: #+BEGIN_SRC Julia Julia\u0026gt; aloofs = map(x-\u0026gt;words5[x],findall(iszero, degree(G5))) Julia\u0026gt; show(aloofs) #+END_SRC I won\u0026#39;t include this list, as it\u0026#39;s too long - it contains 616 words. But if you display it, you\u0026#39;ll see some surprises: who\u0026#39;d have thought that such innocuous words as \u0026#34;opera\u0026#34;, \u0026#34;salad\u0026#34; or \u0026#34;wagon\u0026#34; were aloof? But they most certainly are, at least within this set of words. 
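The construction so far (vertices are words, edges join words at Hamming distance 1, aloof words are the vertices of degree zero) can be sketched in miniature in plain Python, where speed is not a concern at this size. This is only an illustration under stated assumptions: the eight-word list =toy_words= is made up for the example, and the helper names =hamming=, =build_graph= and =ladder= are hypothetical, not part of any package used in this post.

```python
from collections import deque

def hamming(a, b):
    # Number of positions at which two equal-length words differ.
    return sum(x != y for x, y in zip(a, b))

def build_graph(words):
    # Adjacency lists: two words are adjacent when their Hamming distance is 1.
    adj = {w: [] for w in words}
    for i, u in enumerate(words):
        for v in words[i + 1:]:
            if hamming(u, v) == 1:
                adj[u].append(v)
                adj[v].append(u)
    return adj

def ladder(adj, start, goal):
    # Breadth-first search gives a shortest ladder, or None if there is none.
    previous = {start: None}
    queue = deque([start])
    while queue:
        w = queue.popleft()
        if w == goal:
            path = []
            while w is not None:
                path.append(w)
                w = previous[w]
            return path[::-1]
        for n in adj[w]:
            if n not in previous:
                previous[n] = w
                queue.append(n)
    return None

# A made-up toy list (an assumption for illustration, not one of the SCOWL lists).
toy_words = ['wheat', 'cheat', 'cleat', 'bleat', 'bleak', 'break', 'bread', 'opera']
adj = build_graph(toy_words)
aloof = [w for w in toy_words if not adj[w]]   # words with no neighbours
print(aloof)
print(ladder(adj, 'wheat', 'bread'))
```

In this toy list only 'opera' has no neighbours, so it is the single aloof word, and no ladder can reach it from any of the other words.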
And now for the connected component: #+BEGIN_SRC Julia Julia\u0026gt; CC = connected_components(G5) Julia\u0026gt; CL = map(length,CC) # size of each component Julia\u0026gt; mx, idx = findmax(CL) Julia\u0026gt; C5,vmap = induced_subgraph(G5, CC[idx]) #+END_SRC You will find that the maximum connected component ~C5~ has 3315 vertices, and the value of ~idx~ above is 3. Here ~vmap~ is a list of length 3315 which is the set of indices into the original list. Or, if the original list of vertices consisted of the numbers 1,2, up to 4388, then ~vmap~ is a sublist of length 3315. And we can consider all those numbers as indices into the ~words5~ list. Now we can write a little function to produce word ladders; here using the Julia function ~a_star~ to find a shortest path: #+BEGIN_SRC Julia Julia\u0026gt; subwords = words5[vmap] Julia\u0026gt; function ladder(w1,w2) i = findfirst(x -\u0026gt; words5[x] == w1, vmap) j = findfirst(x -\u0026gt; words5[x] == w2, vmap) P = a_star(C5,i,j) verts = append!([i],map(dst,P)) subwords[verts] end #+END_SRC And of course try it out: #+BEGIN_SRC Julia Julia\u0026gt; show(ladder(\u0026#34;wheat\u0026#34;,\u0026#34;bread\u0026#34;)) [\u0026#34;wheat\u0026#34;, \u0026#34;cheat\u0026#34;, \u0026#34;cleat\u0026#34;, \u0026#34;bleat\u0026#34;, \u0026#34;bleak\u0026#34;, \u0026#34;break\u0026#34;, \u0026#34;bread\u0026#34;] #+END_SRC Note that a word ladder can only exist when both words have indices in the chosen largest connected component. 
For example: #+BEGIN_SRC Julia Julia\u0026gt; show(ladder(\u0026#34;roses\u0026#34;,\u0026#34;scent\u0026#34;)) [\u0026#34;roses\u0026#34;, \u0026#34;ruses\u0026#34;, \u0026#34;rusts\u0026#34;, \u0026#34;rests\u0026#34;, \u0026#34;tests\u0026#34;, \u0026#34;teats\u0026#34;, \u0026#34;seats\u0026#34;, \u0026#34;scats\u0026#34;, \u0026#34;scans\u0026#34;, \u0026#34;scant\u0026#34;, \u0026#34;scent\u0026#34;] #+END_SRC but #+BEGIN_SRC Julia Julia\u0026gt; show(ladder(\u0026#34;chair\u0026#34;,\u0026#34;table\u0026#34;)) #+END_SRC will produce an error, as neither of those words have indices in the largest connected component. In fact \u0026#34;chair\u0026#34; sits in a small connected component of 3 words, and \u0026#34;table\u0026#34; is in another connected component of 5 words. ** The longest possible ladder We have seen that the length of a word ladder will often exceed the Hamming distance between the start and ending words. But what is the maximum length of such a ladder? Here the choice of function is the /eccentricities/ of a graph: for each vertex, find the shortest path to every other vertex. The length of the longest such path is the /eccentricity/ of that vertex. #+BEGIN_SRC Julia Julia\u0026gt; eccs = eccentricity(C5) #+END_SRC This command will take a noticeable amount of time - only a few seconds, it\u0026#39;s true, but it is far from instantaneous. The intense computation needed here is one of the reasons that I prefer Julia to Python for this experiment. Now we can use the eccentricities to find the longest path. Since this is an undirected graph, there must be at least two vertices with equal largest eccentricities. 
#+BEGIN_SRC Julia Julia\u0026gt; mx = maximum(eccs) Julia\u0026gt; inds = findall(x -\u0026gt; x == mx, eccs) Julia\u0026gt; ld = ladder(subwords[inds[1]], subwords[inds[2]]) Julia\u0026gt; print(join(ld,\u0026#34;, \u0026#34;),\u0026#34;\\nwhich has length \u0026#34;,length(ld)) aloud, cloud, clout, flout, float, bloat, bleat, cleat, cleft, clefs, clews, slews, sleds, seeds, feeds, feels, fuels, furls, curls, curly, curvy, curve, carve, calve, valve, value, vague, vogue, rogue which has length 29 #+END_SRC Who\u0026#39;d have guessed? ** Results from other lists With the 12478 five-letter words from the \u0026#34;large\u0026#34; list, there are 748 aloof words, and the longest ladder is #+BEGIN_EXPORT HTML \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; rayon, racon, bacon, baron, boron, boson, bosom, besom, besot, beset, beret, buret, curet, cures, curds, surds, suras, auras, arras, arias, arils, anils, anile, anole, anode, abode, abide, amide, amine, amino, amigo\u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+END_EXPORT which has length 31. We might in general expect a shorter \u0026#34;longest ladder\u0026#34; than from using a smaller list: the \u0026#34;large\u0026#34; list has far more words, hence greater connectivity, which would lead in many cases to shorter paths. 
The \u0026#34;huge\u0026#34; and \u0026#34;insane\u0026#34; lists have longest ladders of length 28 and 27 respectively: #+BEGIN_EXPORT HTML \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; essay, assay, asway, alway, allay, alley, agley, aglet, ablet, abled, ailed, aired, sired, sered, seres, serrs, sears, scars, scary, snary, unary, unarm, inarm, inerm, inert, inept, inapt, unapt\u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+END_EXPORT and #+BEGIN_EXPORT HTML \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; entry, entsy, antsy, artsy, artly, aptly, apply, apple, ample, amole, amoke, smoke, smote, smite, suite, quite, quote, quott, quoit, qubit, oubit, orbit, orbic, urbic, ureic, ureid, ursid\u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+END_EXPORT For six-letter words, here are longest ladders in each of the lists: Small, length 32: #+BEGIN_EXPORT HTML \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; steady, steamy, steams, steals, steels, steers, sheers, cheers, cheeks, checks, chicks, clicks, slicks, slices, spices, spites, smites, smiles, smiled, sailed, tailed, tabled, tabbed, dabbed, dubbed, rubbed, rubber, rubier, rubies, rabies, babies, babied\u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+END_EXPORT Medium, length 44: #+BEGIN_EXPORT HTML \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; trusty, trusts, crusts, crests, chests, cheats, cleats, bleats, bloats, floats, flouts, flours, floors, floods, bloods, broods, brooks, crooks, crocks, cracks, cranks, cranes, crates, crated, coated, boated, bolted, belted, belied, belies, bevies, levies, levees, levers, lovers, hovers, hovels, hotels, motels, models, modals, morals, morays, forays\u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+END_EXPORT Large, length 43: 
#+BEGIN_EXPORT HTML \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; uneasy, unease, urease, crease, creese, cheese, cheesy, cheeky, cheeks, creeks, breeks, breeds, breads, broads, broods, bloods, bloops, sloops, stoops, strops, strips, stripe, strive, shrive, shrine, serine, ferine, feline, reline, repine, rapine, raping, raring, haring, hiring, siring, spring, sprint, splint, spline, saline, salina, saliva\u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+END_EXPORT Huge, length 63: #+BEGIN_EXPORT HTML \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; aneath, sneath, smeath, smeeth, smeech, sleech, fleech, flench, clench, clunch, clutch, crutch, crotch, crouch, grouch, grough, trough, though, shough, slough, slouch, smouch, smooch, smooth, smoots, smolts, smalts, spalts, spaits, spains, spaing, spring, sprint, splint, spline, upline, uplink, unlink, unkink, unkind, unkend, unkent, unsent, unseat, unseal, unseel, unseen, unsewn, unsews, unmews, enmews, emmews, emmers, embers, embars, embark, imbark, impark, impart, import, impost, impose, impone \u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+END_EXPORT Insane, length 49: #+BEGIN_EXPORT HTML \u0026lt;div style = \u0026#34;background-color:#DDDDDD;font-size:0.8em\u0026#34;\u0026gt;\u0026lt;samp\u0026gt; ambach, ambash, ambush, embush, embusk, embulk, embull, emball, empall, emparl, empark, embark, embars, embers, emmers, emmews, enmews, endews, enders, eiders, ciders, coders, cooers, cooees, cooeed, coomed, cromed, cromes, crimes, crises, crisis, crisic, critic, iritic, iridic, imidic, amidic, amidin, amidon, amydon, amylon, amylin, amyrin, amarin, asarin, asaron, usaron, uzaron, uzarin\u0026lt;/samp\u0026gt;\u0026lt;/div\u0026gt; #+END_EXPORT The length of time taken for the computations increases with size of the lists, and up to a certain extent, the length of the words. 
On my somewhat mature laptop (Lenovo X1 Carbon 3rd Generation), computing the eccentricities for six-letter words and the \u0026#34;insane\u0026#34; list took over 6 minutes. ** A previous experiment with Mathematica You can see this (with some nice graph diagrams) at https://blog.wolfram.com/2012/01/11/the-longest-word-ladder-puzzle-ever/ from about 10 years ago. This experiment used Mathematica\u0026#39;s English language list, about which I can find nothing other than that it exists. However, in the comments, somebody shows that Mathematica 8 had 92518 words in its English dictionary. And this number is dwarfed by Finnish, with 728498 words. But that might just as much be an artifact of Finnish lexicography. Deciding whether or not a word is English is very hard indeed, and any lexicographer will decide on whether two words need separate definitions, or whether one can be defined within the other, so to speak - that is, under the same [headword](https://www.thefreedictionary.com/headword). MOUSE, MOUSY, MOUSELIKE - do these require three definitions, or two, or just one? * Every academic their own text-matcher :Python:Education: :PROPERTIES: :EXPORT_FILE_NAME: academic_text_matching :EXPORT_DATE: 2022-06-19 :END: ** Plagiarism, text matching, and academic integrity Every modern academic teacher is in thrall to giant text-matching systems such as [Ouriginal](https://www.ouriginal.com) or [Turnitin](https://en.wikipedia.org/wiki/Turnitin). These systems are sold as \u0026#34;plagiarism detectors\u0026#34;, which they are not - they are text matching systems, and they generally work by providing a report showing how much of a student\u0026#39;s submitted work matches text from other sources. It is up to the academic to decide if the level of text matching constitutes plagiarism. 
Although Turnitin sells itself as a plagiarism detector, or at any rate a tool for supporting academic integrity, its software is closed source, so, paradoxically, there\u0026#39;s no way of knowing if any of its source code has been plagiarized from another source. Such systems work by having access to a giant corpus of material: published articles, reports, text on websites, blogs, previous student work obtained from all over, and so on. The more texts a system can try to match a submission against, the more confident an academic is supposed to have in its findings. (And the more likely an administration will see fit to paying the yearly licence costs.) Of course in the arms-race of academic integrity, you\u0026#39;ll find plenty of websites offering advice on \u0026#34;how to beat Turnitin\u0026#34;; but in the interests of integrity I\u0026#39;m not going to link to any, but they\u0026#39;re not hard to find. And of course Turnitin will presumably up its game to counter these methods, and the sites will be rewritten, and so on. ** My problem I have been teaching a fully online class; although my university is slowly trying to move back (at least partially) into on-campus delivery after 2 1/2 years of Covid remote learning, some classes will still run online. My students were completing an online \u0026#34;exam\u0026#34;: a timed test (un-invigilated) in which the questions were randomized so that no students got the same set of questions. They were all \u0026#34;Long Answer\u0026#34; questions in the parlance of our learning management system; at any rate for each question a text box was given for the student to enter their answer. The test was to be marked \u0026#34;by hand\u0026#34;. That is, by me. Many of my students speak English as a second language, and although they are supposed to have a basic competency sufficient for tertiary study, many of them struggle. 
And if a question asks them to define, for example, \u0026#34;layering\u0026#34; in the context of cybersecurity, I have not the slightest problem with them searching for information online, finding it, and copying it into the textbox. If they can search for the correct information and find it, that\u0026#39;s good enough for me. This exam is also open book. As far as I\u0026#39;m concerned, finding correct information is a useful and valuable skill; testing for the use of what they might remember, and \u0026#34;in their own words\u0026#34; is pedagogically indefensible. So, working my way grimly through these exams, I had a \u0026#34;this seems familiar...\u0026#34; moment. And indeed, searching through some previous submissions I found exactly the same answer submitted by another student. Well, that can happen. What is less likely to happen, at least by chance, is for almost all of the 16 questions to have the same submissions as other students. People working in the area of academic integrity sometimes speak of a \u0026#34;spidey sense\u0026#34; a sort of sixth sense that alerts you that something\u0026#39;s not right, even if you can\u0026#39;t quite yet pinpoint the issue. This was that sense, and more. It turned out that the entire test and all answers could be downloaded and saved as a CSV File, and hence loaded into Python as a Pandas DataFrame. My first attempt had me looking at all pairs of students and their test answers, to see if any of the answer text strings matched. And some indeed did. Because of the randomized nature of the test, one student might receive as question 7, say, the same question that another student might see as question 5, or question 8. The data I had to work with consisted of two DataFrames. 
One contained all the exam information: #+BEGIN_SRC Python examdata.dtypes Username object FirstName object LastName object Q # int64 Q Text object Answer object Score float64 Out Of float64 dtype: object #+END_SRC This DataFrame was ordered by student, and then by question number. This meant that every student had up to 16 rows of the DataFrame. I had another DataFrame containing just the names and cohorts (there were two distinct cohorts, and this information was not given in the dump of exam data to the CSV file.) #+BEGIN_SRC Python names.dtypes Username object FirstName object LastName object Cohort object dtype: object #+END_SRC I added the cohorts by hand. This could then be merged with the exam data: #+BEGIN_SRC Python data = examdata.merge(names,on=[\u0026#34;Username\u0026#34;,\u0026#34;FirstName\u0026#34;,\u0026#34;LastName\u0026#34;],how=\u0026#39;left\u0026#39;).reset_index(drop=True) #+END_SRC ** String similarity Since the exam answers in my DataFrame were text strings, any formatting that the student might have given in an answer, such as bullet points or a numbered list, a table, font changes, were ignored. All I had to work with were ASCII strings. However, exact string matching led to very few results. This is because there might have been a difference in starting or ending whitespace or other characters, or because one student\u0026#39;s submission included another student\u0026#39;s submission as a substring. Consider for example these two (synthetic) examples: + \u0026#34;A man-in-the-middle attack is a cyberattack where the attacker secretly relays and possibly alters the communications between two parties who believe that they are directly communicating with each other, as the attacker has inserted themselves between the two parties.\u0026#34; (from the [Wikipedia page](https://en.wikipedia.org/wiki/Man-in-the-middle_attack) on the Man-In-The-Middle attack.) 
+ \u0026#34;I think it\u0026#39;s this: A man-in-the-middle attack is a cyberattack where Mallory secretly relays and possibly alters the communications between Alice and Bob who believe that they are directly communicating with each other, as Mallory has inserted himself between them.\u0026#34; There are various ways of measuring the distance between strings, or alternatively of their similarity. Two much used methods are the /Jaro similarity/ measure (named for Matthew Jaro, who introduced it in 1989), and the /Jaro-Winkler measure/, a version named also for William Winkler, who discussed it in 1990. Both of these are defined on their [Wikipedia page](https://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance). Winkler\u0026#39;s measure adds to the original Jaro measure a factor based on the equality of any beginning substring. It turns out that the Jaro-Winkler similarity of the two strings above is about 0.78. If the first \u0026#34;I think it\u0026#39;s this: \u0026#34; is removed from the second string, then the similarity increases to 0.89. Both the Jaro and Jaro-Winkler measures are happily implemented in the Python [jellyfish](https://github.com/jamesturk/jellyfish/) package. This package also includes some other standard measurements of the closeness of two strings. My approach was to find the number of submissions whose Jaro-Winkler similarity exceeded 0.85. And I found this number empirically, by checking a number of (what appeared to me to be) very similar submissions, and computing their similarities. ** Some results In this class there were 39 students, divided into two cohorts: 12 were taught by me, and the rest by another teacher. I was only concerned with mine. There were 16 questions, but not every student answered every question, and so the maximum size of my DataFrame would be $12\\times 16=192$; in fact I had a total of 171 different answers. 
The numbers of questions submitted by the students were: 11, 16, 14, 16, 16, 16, 15, 13, 12, 12, 16, 14 and so (to avoid comparing pairs of submissions twice) I aimed to compare every student\u0026#39;s submission to the submissions of all students below them in the DataFrame. This makes for 13,383 comparisons. In fact, because I\u0026#39;m a lazy programmer, I simply compared every submission to every submission below it in the DataFrame (which meant that I was also comparing submissions from a single student with each other), for a total of 14,535 comparisons. This is how (assuming that the jellyfish package has been loaded as =jf=): #+BEGIN_SRC Python match_list = [] N = my_data.shape[0] for i in range(N): for j in range(i+1,N): jfs = jf.jaro_winkler_similarity(my_data.at[i,\u0026#34;Answer\u0026#34;],my_data.at[j,\u0026#34;Answer\u0026#34;]) if jfs \u0026gt; 0.85: match_list += [[my_data.at[i,\u0026#34;Username\u0026#34;],my_data.at[j,\u0026#34;Username\u0026#34;],my_data.at[i,\u0026#34;Q #\u0026#34;],my_data.at[j,\u0026#34;Q #\u0026#34;],jfs]] #+END_SRC I ended up with 33 matches, which I put into a DataFrame: #+BEGIN_SRC Python matches = pd.DataFrame(match_list,columns=[\u0026#34;ID 1\u0026#34;,\u0026#34;ID 2\u0026#34;,\u0026#34;Q# 1\u0026#34;,\u0026#34;Q# 2\u0026#34;,\u0026#34;Similarity\u0026#34;]) #+END_SRC As you see, each row of the DataFrame contained the two student ID numbers, the relevant question numbers, and the similarity measure. Because of the randomisation of the exam, two students might get the same question but with a different number (as I mentioned earlier). To see if any pair of students appeared more than once, I grouped the DataFrame by their ID numbers: #+BEGIN_SRC Python dg = matches.groupby([\u0026#34;ID 1\u0026#34;,\u0026#34;ID 2\u0026#34;]).size() dg.values array([ 1, 1, 1, 1, 1, 1, 1, 11, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, 1]) #+END_SRC Notice something? There\u0026#39;s a pair of students who submitted very similar answers to 11 questions! 
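The two comparison counts quoted above can be checked with a little arithmetic, and the idea of counting how often each pair of students turns up can be mimicked without pandas. What follows is only a sketch using the standard library: the per-student answer counts are the ones listed above, but =toy_matches= and its student IDs are invented for illustration and are not the real exam data.

```python
from collections import Counter
from math import comb

# Number of answers submitted by each of the 12 students (from the text above).
answers_per_student = [11, 16, 14, 16, 16, 16, 15, 13, 12, 12, 16, 14]

total = sum(answers_per_student)          # total number of answers
all_pairs = comb(total, 2)                # every answer against every later answer
same_student_pairs = sum(comb(n, 2) for n in answers_per_student)
between_students = all_pairs - same_student_pairs

print(total, all_pairs, between_students)

# Invented (ID 1, ID 2) pairs standing in for rows of the matches table.
toy_matches = [('s01', 's02'), ('s03', 's07'), ('s03', 's07'),
               ('s04', 's09'), ('s03', 's07')]
pair_counts = Counter(toy_matches)        # how many times each pair matched
print(pair_counts.most_common(1))
```

The first line printed gives 171 answers, 14,535 comparisons in the lazy all-pairs version, and 13,383 when pairs from the same student are left out, which agrees with the figures in the text.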
Now this pair can be isolated: #+BEGIN_SRC Python maxdg = max(dg.values) cheats = dg.loc[dg.values==maxdg].index[0] c0, c1 = cheats #+END_SRC The matches can now be listed: #+BEGIN_SRC Python collusion = matches.loc[(matches[\u0026#34;ID 1\u0026#34;]==c0) \u0026amp; (matches[\u0026#34;ID 2\u0026#34;]==c1)].reset_index(drop=True) #+END_SRC and we can print off these matches as evidence. #+BEGIN_SRC Python #+END_SRC * More mapping \u0026#34;not quite how-to\u0026#34; - Voronoi regions :Python:GIS: :PROPERTIES: :EXPORT_FILE_NAME: more_mapping_howto :EXPORT_DATE: 2022-06-18 :END: ** What this post is about In the previous post we showed how to set up a simple interactive map using Python and its folium package. As the example, we used a Federal electorate situated within the city of Melbourne, Australia, and the various voting places, or polling places (also known as polling \u0026#34;booths\u0026#34;) associated with it. This post takes the map a little further, and we show how to use Python\u0026#39;s [geovoronoi](https://github.com/WZBSocialScienceCenter/geovoronoi) package to create [Voronoi regions](https://en.wikipedia.org/wiki/Voronoi_diagram) around each booth. This (we hope) will give a map of where voting for a particular party might be strongest. (We make the assumption that every voter will vote at the booth closest to their home.) Because some voting booths are outside the electorate - this includes early voting centres - we first need to reduce the booths to only those within the boundary of the electorate. In this case, as the boundary is a simply-connected region, this is straightforward. Then we can create the Voronoi regions and map them. 
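The nearest-booth assumption is exactly what defines a Voronoi region: the region of a booth is the set of all points closer to that booth than to any other. As a rough sketch of the idea only, here is a nearest-site assignment in plain Python. The booth names, their coordinates and the home locations are all made up, and ordinary planar distance is assumed rather than anything geographically careful.

```python
from math import dist

# Made-up booth positions as (x, y) pairs in arbitrary planar units.
booths = {'North': (0.0, 4.0), 'South': (0.0, 0.0), 'East': (5.0, 2.0)}

def nearest_booth(home):
    # The booth whose Voronoi region contains this point is simply
    # the booth at the smallest distance from it.
    return min(booths, key=lambda name: dist(home, booths[name]))

# Made-up home locations.
homes = [(0.5, 3.0), (1.0, 0.5), (4.0, 2.5), (2.0, 1.0)]
assignment = [nearest_booth(h) for h in homes]
print(assignment)
```

Grouping every point of the electorate by its nearest booth in this way would carve the electorate into the regions that a Voronoi package computes directly as polygons.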
** Message about the underlying software #+BEGIN_EXPORT HTML \u0026lt;div style = \u0026#34;color:darkred;font-size:large\u0026#34;\u0026gt;\u0026lt;b\u0026gt;NOTE:\u0026lt;/b\u0026gt; much of the material and discussion here uses the Python package \u0026#34;folium\u0026#34;, which is a front end to the Javascript package \u0026#34;leaflet.js\u0026#34;. The lead developer of leaflet.js is \u0026lt;a href=\u0026#34;https://agafonkin.com\u0026#34;\u0026gt;Volodymyr Agafonkin\u0026lt;/a\u0026gt;, a Ukrainian up until recently living and working in Kyiv.\u0026lt;/div\u0026gt; #+END_EXPORT Leaflet version 1.80 was released on April 18, \u0026#34;in the middle of war\u0026#34;, with \u0026#34;air raid sirens sounding outside\u0026#34;. See the full statement [here](https://leafletjs.com/2022/04/18/leaflet-1.8.0.html). #+BEGIN_EXPORT HTML \u0026lt;a href=\u0026#34;https://stand-with-ukraine.pp.ua\u0026#34;\u0026gt;\u0026lt;img src=\u0026#34;https://raw.githubusercontent.com/vshymanskyy/StandWithUkraine/main/banner-direct.svg\u0026#34; alt=\u0026#34;Stand With Ukraine\u0026#34; /\u0026gt;\u0026lt;/a\u0026gt;\u0026lt;/p\u0026gt; #+END_EXPORT (This banner is included here with the kind permission of Mr Agafonkin.) Please consider the Ukrainian people who are suffering under an unjust aggressor, and help them. 
** Obtaining interior points and Voronoi regions We start by simplifying a few of the variable names: #+BEGIN_SRC Python hbooths = higgins_booths.copy() #+END_SRC We will also need the boundary of the electorate: #+BEGIN_SRC Python higgins_crs=higgins.to_crs(epsg=4326) higgins_poly = higgins_crs[\u0026#34;geometry\u0026#34;].iat[0] #+END_SRC Now finding the interior points is easy(ish) using [GeoPandas](https://geopandas.org/en/stable/): #+BEGIN_SRC Python higgins_gpd = gpd.GeoDataFrame( hbooths, geometry=gpd.points_from_xy(hbooths.Longitude, hbooths.Latitude)) higgins_crs = higgins_gpd.set_crs(epsg=4326) interior = higgins_crs[higgins_crs.geometry.within(higgins_poly)].reset_index(drop=True) #+END_SRC We should also check if any of the interior points double up. This might be if one location is used say, for both an early voting centre, and a voting booth on election day. The geovoronoi package will throw an error if a location is repeated. #+BEGIN_SRC Python ig = interior.groupby([\u0026#34;Latitude\u0026#34;,\u0026#34;Longitude\u0026#34;]).size() ig.loc[ig.values\u0026gt;1] Latitude Longitude -37.846104 144.998383 2 dtype: int64 #+END_SRC The geographical coordinates can be obtained with, say #+BEGIN_SRC Python double = ig.loc[ig.values\u0026gt;1].index[0] #+END_SRC This means we can find the offending booths in the interior: #+BEGIN_SRC Python interior.loc[(interior[\u0026#34;Latitude\u0026#34;]==double[0]) \u0026amp; (interior[\u0026#34;Longitude\u0026#34;]==double[1])] #+END_SRC | \u0026lt;l\u0026gt; | \u0026lt;r\u0026gt; | \u0026lt;r\u0026gt; | \u0026lt;r\u0026gt; | \u0026lt;r\u0026gt; | | | PollingPlaceNm | Latitude | Longitude | geometry | |--------+--------------------------+------------+------------+-----------------------------| | 28 | South Yarra HIGGINS PPVC | -37.846104 | 144.998383 | POINT (144.99838 -37.84610) | | 29 | South Yarra South | -37.846104 | 144.998383 | POINT (144.99838 -37.84610) | We\u0026#39;ll remove the pre-polling voting centre at 
row 28: #+BEGIN_SRC Python interior = interior.drop(28) #+END_SRC Now we\u0026#39;re in a position to create the Voronoi regions. We have the external polygon (the boundary of the electorate), and all the internal points we need, with no duplications. #+BEGIN_SRC Python from geovoronoi import voronoi_regions_from_coords, points_to_region interior_coords = np.array(interior[[\u0026#34;Longitude\u0026#34;,\u0026#34;Latitude\u0026#34;]]) polys, pts = voronoi_regions_from_coords(interior_coords, higgins_poly) #+END_SRC Both of the variables =polys= and =pts= are given as dictionaries. We want to associate each interior voting booth with a given region. This can be done by creating an index between the points and regions, and then adding the regions to the =interior= dataframe: #+BEGIN_SRC Python index = points_to_region(pts) N = len(polys) geometries = [polys[index[k]] for k in range(N)] interior[\u0026#34;geometry\u0026#34;]=geometries #+END_SRC What this has done is replace the original =geometry= data (which were just the coordinates of each voting booth, given as \u0026#34;POINT\u0026#34; datatypes), with regions given as \u0026#34;POLYGON\u0026#34; datatypes. ** Adding some voting data We could simply map the Voronoi regions now: #+BEGIN_SRC Python from geovoronoi.plotting import subplot_for_map, plot_voronoi_polys_with_points_in_area fig, ax = subplot_for_map(figsize=(10,10)) plot_voronoi_polys_with_points_in_area(ax, higgins_poly, polys, interior_coords, pts) plt.savefig(\u0026#39;higgins_voronoi.png\u0026#39;) plt.show() #+END_SRC [[file:/higgins_voronoi.png]] But what we want is a little more control. But first, we\u0026#39;ll add some more information to the DataFrame. The simplest information is the \u0026#34;two candidate preferred\u0026#34; data: these are the number of votes allocated to the two final candidates after the preferential counting. 
The files are available on the AEC website; they can be downloaded and used: #+BEGIN_SRC Python tcp = pd.read_csv(\u0026#39;HouseTcpByCandidateByPollingPlaceDownload-27966_2.csv\u0026#39;) higgins_tcp = tcp.loc[tcp[\u0026#34;DivisionNm\u0026#34;]==\u0026#34;Higgins\u0026#34;] #+END_SRC Each candidate gets their own row, which means we have to copy the votes from each candidate into the interior DataFrame. In the case of the 2022 election, the Higgins final candidates represented the Australian Labor Party (ALP), and the Liberal Party (LP). The party is given in the column \u0026#34;PartyAb\u0026#34; in the =higgins_tcp= data frame. Adding them to the interior data frame is only a tiny bit fiddly: #+BEGIN_SRC Python lp_votes = [] alp_votes = [] for index,row in interior.iterrows(): place = row[\u0026#34;PollingPlaceNm\u0026#34;] alp_votes += [higgins_tcp.loc[(higgins_tcp[\u0026#34;PollingPlace\u0026#34;]==place) \u0026amp; (higgins_tcp[\u0026#34;PartyAb\u0026#34;]==\u0026#34;ALP\u0026#34;),\u0026#34;OrdinaryVotes\u0026#34;].iat[0]] lp_votes += [higgins_tcp.loc[(higgins_tcp[\u0026#34;PollingPlace\u0026#34;]==place) \u0026amp; (higgins_tcp[\u0026#34;PartyAb\u0026#34;]==\u0026#34;LP\u0026#34;),\u0026#34;OrdinaryVotes\u0026#34;].iat[0]] interior[\u0026#34;ALP Votes\u0026#34;] = alp_votes interior[\u0026#34;LP Votes\u0026#34;] = lp_votes #+END_SRC ** Creating the map The base map is the same as before: #+BEGIN_SRC Python hmap2 = folium.Map(location=centre,# crs=\u0026#39;EPSG4283\u0026#39;, tiles=\u0026#39;OpenStreetMap\u0026#39;, min_lat=b1-extent, max_lat=b3+extent, min_long=b0-extent, max_long=b2+extent, width=800,height=800,zoom_start=13,scrollWheelZoom=False) #+END_SRC We don\u0026#39;t need to draw the boundary or interior as the Voronoi regions will cover it. What we\u0026#39;ll do instead is draw each Voronoi region, colouring it red (for a Labor majority) or blue (for a Liberal majority). 
Like this: #+BEGIN_SRC Python for index,row in interior.iterrows(): rloc = [row[\u0026#34;Latitude\u0026#34;],row[\u0026#34;Longitude\u0026#34;]] row_json = gpd.GeoSeries([row[\u0026#34;geometry\u0026#34;]]).to_json() tooltip = (\u0026#34;\u0026lt;b\u0026gt;{s1}\u0026lt;/b\u0026gt;\u0026#34;).format(s1 = row[\u0026#34;PollingPlaceNm\u0026#34;]) if row[\u0026#34;ALP Votes\u0026#34;] \u0026gt; row[\u0026#34;LP Votes\u0026#34;]: folium.GeoJson(data=row_json,style_function=lambda x: {\u0026#39;fillColor\u0026#39;: \u0026#39;red\u0026#39;}).add_to(hmap2) folium.CircleMarker(radius=5,color=\u0026#34;black\u0026#34;,fill=True,location=rloc,tooltip=tooltip).add_to(hmap2) else: folium.GeoJson(data=row_json,style_function=lambda x: {\u0026#39;fillColor\u0026#39;: \u0026#39;blue\u0026#39;}).add_to(hmap2) folium.CircleMarker(radius=5,color=\u0026#34;black\u0026#34;,fill=True,location=rloc,tooltip=tooltip).add_to(hmap2) #+END_SRC And to view the map: #+BEGIN_SRC Python hmap2 #+END_SRC #+BEGIN_EXPORT HTML \u0026lt;iframe width=\u0026#34;810\u0026#34; height=\u0026#34;810\u0026#34; src=\u0026#34;/higgins_regions.html\u0026#34;\u0026gt;\u0026lt;/iframe\u0026gt; #+END_EXPORT * A mapping \u0026#34;not quite how-to\u0026#34; :Python:GIS: :PROPERTIES: :EXPORT_FILE_NAME: mapping_howto :EXPORT_DATE: 2022-06-11 :END: ** Message about the underlying software #+BEGIN_EXPORT HTML \u0026lt;div style = \u0026#34;color:darkred;font-size:large\u0026#34;\u0026gt;\u0026lt;b\u0026gt;NOTE:\u0026lt;/b\u0026gt; much of the material and discussion here uses the Python package \u0026#34;folium\u0026#34;, which is a front end to the Javascript package \u0026#34;leaflet.js\u0026#34;. 
The lead developer of leaflet.js is \u0026lt;a href=\u0026#34;https://agafonkin.com\u0026#34;\u0026gt;Volodymyr Agafonkin\u0026lt;/a\u0026gt;, a Ukrainian up until recently living and working in Kyiv.\u0026lt;/div\u0026gt; #+END_EXPORT Leaflet version 1.8.0 was released on April 18, \u0026#34;in the middle of war\u0026#34;, with \u0026#34;air raid sirens sounding outside\u0026#34;. See the full statement [here](https://leafletjs.com/2022/04/18/leaflet-1.8.0.html). #+BEGIN_EXPORT HTML \u0026lt;a href=\u0026#34;https://stand-with-ukraine.pp.ua\u0026#34;\u0026gt;\u0026lt;img src=\u0026#34;https://raw.githubusercontent.com/vshymanskyy/StandWithUkraine/main/banner-direct.svg\u0026#34; alt=\u0026#34;Stand With Ukraine\u0026#34; /\u0026gt;\u0026lt;/a\u0026gt;\u0026lt;/p\u0026gt; #+END_EXPORT (This banner is included here with the kind permission of Mr Agafonkin.) Please consider the Ukrainian people who are suffering under an unjust aggressor, and help them. ** The software libraries and data we need The idea of this post is to give a little insight into how my maps were made. All software is of course open-source, and all data is freely available. The language used is Python, along with the libraries: + [Pandas](https://pandas.pydata.org) for data manipulation and analysis + [GeoPandas](https://geopandas.org/en/stable/) which adds geospatial capabilities to Pandas + [folium](http://python-visualization.github.io/folium/) for map creation - this is basically a Python front-end to creating interactive maps with the powerful [leaflet.js](https://leafletjs.com) JavaScript library. + [geovoronoi](https://github.com/WZBSocialScienceCenter/geovoronoi) for adding Voronoi diagrams to maps Other standard libraries such as [numpy](https://numpy.org) and [matplotlib](https://matplotlib.org) are also used. 
The standard mapping element is a [shapefile](https://en.wikipedia.org/wiki/Shapefile) which encodes a map element: for example the shape of a country or state; the position of a city. In order to use them, they have to be downloaded from somewhere. For Australian Federal elections, the AEC makes available much relevant [geospatial information](https://www.aec.gov.au/Electorates/gis/index.htm). Victorian geospatial information can be obtained from [Vicmap Admin](https://discover.data.vic.gov.au/dataset/vicmap-admin). Coordinates of polling booths can be obtained again from the AEC for each election. For the recent 2022 election, data is available at their [Tallyroom](https://tallyroom.aec.gov.au/HouseDownloadsMenu-27966-Csv.htm). You\u0026#39;ll see that this page contains geospatial data as well as election results. Polling booth locations, using latitude and longitude, are available here. ** Building a basic map We can download the shapefiles and polling booth information (unzip any zip files to extract the documents as needed), and read them into Python: #+BEGIN_SRC Python import pandas as pd import geopandas as gpd import folium vic = gpd.read_file(\u0026#34;E_VIC21_region.shp\u0026#34;) booths = pd.read_csv(\u0026#34;GeneralPollingPlacesDownload-27966.csv\u0026#34;) #+END_SRC Any Division in Victoria can be obtained and quickly plotted; for example Higgins: #+BEGIN_SRC Python higgins = vic.loc[vic[\u0026#34;Elect_div\u0026#34;]==\u0026#34;Higgins\u0026#34;] higgins.plot() #+END_SRC [[file:/higgins.png]] We can also get a list of the Polling Places in the electorate, including their locations: #+BEGIN_SRC Python higgins_booths = booths.loc[booths[\u0026#34;DivisionNm\u0026#34;]==\u0026#34;Higgins\u0026#34;][[\u0026#34;PollingPlaceNm\u0026#34;,\u0026#34;Latitude\u0026#34;,\u0026#34;Longitude\u0026#34;]] #+END_SRC With this shape, we can create an interactive map, showing for example the names of each polling booth. 
To plot the electorate on a background map we need to first turn the shapefile into a GeoJSON file: #+BEGIN_SRC Python higgins_json = folium.GeoJson(data = higgins.to_json()) #+END_SRC And to plot it, we can find its bounds (in latitude and longitude) and place it in a map made just a little bigger: #+BEGIN_SRC Python b0,b1,b2,b3 = higgins.total_bounds extent = 0.1 centre = [(b1+b3)/2,(b0+b2)/2] hmap = folium.Map(location=centre, min_lat=b1-extent, max_lat=b3+extent, min_long=b0-extent, max_long=b2+extent, width=800,height=800, zoom_start=13, scrollWheelZoom=False ) higgins_json.add_to(hmap) hmap #+END_SRC The various commands and parameters above should be straightforward: from the latitude and longitude given as bounds, the variable =centre= is exactly that. Because our area is relatively small, we can treat the earth\u0026#39;s surface as effectively flat, and treat geographical coordinates as though they were Cartesian coordinates. Thus for this simple map we don\u0026#39;t have to worry about map projections. The defaults will work fine. The variables =min_lat= and the others define the extent of our map; width and height are given in pixels; and an initial zoom factor is given. The final setting ``scrollWheelZoom=False`` stops the map from being inadvertently zoomed in or out by the mouse scrolling on it (very easy to do). 
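The arithmetic behind =centre= and the padded map limits can be sketched in plain Python, with no mapping libraries involved. The helper name =map_frame= and the sample bounds are hypothetical, chosen only for illustration (they are not the real bounds of Higgins); the one thing assumed from GeoPandas is that =total_bounds= is ordered as minimum longitude, minimum latitude, maximum longitude, maximum latitude.

```python
def map_frame(bounds, extent=0.1):
    """Return the map centre and padded limits for a bounding box.

    bounds: (min_lon, min_lat, max_lon, max_lat), the order used by
            GeoPandas' total_bounds.
    extent: padding in degrees added on every side (0.1 as in the post).
    """
    b0, b1, b2, b3 = bounds
    # Maps take [latitude, longitude], so latitude comes first here.
    centre = [(b1 + b3) / 2, (b0 + b2) / 2]
    limits = {
        "south": b1 - extent,
        "north": b3 + extent,
        "west": b0 - extent,
        "east": b2 + extent,
    }
    return centre, limits


# Made-up bounding box somewhere near Melbourne, for illustration only.
centre, limits = map_frame((144.9, -37.9, 145.1, -37.8))
print(centre)
print(limits)
```

Treating the midpoint of the bounds as the centre is only reasonable because the area is small, which is the same flat-earth simplification described above.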
The map can be zoomed by the controls in the upper left: #+BEGIN_EXPORT HTML \u0026lt;iframe width=\u0026#34;810\u0026#34; height=\u0026#34;810\u0026#34; src=\u0026#34;/higgins1.html\u0026#34;\u0026gt;\u0026lt;/iframe\u0026gt; #+END_EXPORT We can color the electorate by adding a style to the JSON variable: #+BEGIN_SRC Python higgins_json = folium.GeoJson(data = higgins.to_json(), style_function = lambda feature: { \u0026#39;color\u0026#39;: \u0026#39;green\u0026#39;, \u0026#39;weight\u0026#39;:6, \u0026#39;fillColor\u0026#39; : \u0026#39;yellow\u0026#39;, \u0026#39;fill\u0026#39;: True, \u0026#39;fillOpacity\u0026#39;: 0.4} ) #+END_SRC Because folium is a front end to the Javascript leaflet.js package, much information is available on that site. For instance, all the parameters available to change the colors, border etc of the electorate are listed in the description of the [leaflet Path](https://leafletjs.com/reference.html#path). ** Adding interactivity So far the map is pretty static; we can zoom in and out, but that\u0026#39;s about it. Let\u0026#39;s add the voting booths as circles, each one with a \u0026#34;tooltip\u0026#34; giving its name. A tooltip is like a popup which automatically appears when the cursor hovers over the relevant marker on the map. A popup, on the other hand, requires a mouse click to be seen. We can create the points and tooltips from the list of booths in Higgins. 
#+BEGIN_SRC Python for index, row in higgins_booths.iterrows(): loc = [row[\u0026#34;Latitude\u0026#34;],row[\u0026#34;Longitude\u0026#34;]] tooltip = (\u0026#34;\u0026lt;b\u0026gt;{s1}\u0026lt;/b\u0026gt;\u0026#34;).format(s1 = row[\u0026#34;PollingPlaceNm\u0026#34;]) folium.CircleMarker(radius=5,color=\u0026#34;black\u0026#34;,fill=True,location=loc,tooltip=tooltip).add_to(hmap) #+END_SRC The map can now be saved: #+BEGIN_SRC Python hmap.save(\u0026#34;higgins_booths.html\u0026#34;) #+END_SRC and viewed in a web browser: #+BEGIN_EXPORT HTML \u0026lt;iframe width=\u0026#34;810\u0026#34; height=\u0026#34;810\u0026#34; src=\u0026#34;/higgins2.html\u0026#34;\u0026gt;\u0026lt;/iframe\u0026gt; #+END_EXPORT This is a very simple example of an interactive map. We can do lots more: display markers as large circles with numbers in them; divide the map into regions and make the regions respond to hovering or selection, add all sorts of text (even html iframes) as popups or tooltips, and so on. * Further mapping: a win and a near miss :GIS:voting: :PROPERTIES: :EXPORT_FILE_NAME: Further_mapping :EXPORT_DATE: 2022-06-09 :END: In this post we look at two Divisions from the recent Federal election: the inner city seat of [Melbourne](https://en.wikipedia.org/wiki/Division_of_Melbourne), and the bayside seat of [Macnamara](https://en.wikipedia.org/wiki/Division_of_Macnamara). Up until the recent election, Melbourne was the only Division to have a Greens representative. Macnamara, previously known as \u0026#34;Melbourne Ports\u0026#34; has been a Labor stronghold for all of its existence. ** The near miss: Macnamara The contest in Macnamara was curious, and the vote counting took a very long time. Unlike almost all other Divisions, in Macnamara it was a three way contest, with Labor, Liberals and Greens polling very similar numbers at each polling booth. 
In fact the Greens pulled more votes individually than either Labor or the Liberals, but none of these parties had enough first preference votes to win. The decision thus came down to preferences, which knocked out the Greens and in the end put Labor back as the winner. Whether this is a good thing or not depends on your perspective, but it does show that a relatively high first preference count may not necessarily transfer to a win; one of the other parties might pick up more votes through preferences, and enough to push them over an absolute majority of 50% + 1. But what I decided to do was, similar to my previous mapping post, show the Division of Macnamara with coloured Voronoi regions depending on first preferences. In this map, green shows support for the Greens; red for the Australian Labor Party, and blue for the Liberal Party. (Note that the Liberal Party is not \u0026#34;liberal\u0026#34; in any dictionary sense; this is a right-wing party, once the support of business and economic interests, it has steadily become more conservative over the years.) #+BEGIN_EXPORT HTML \u0026lt;iframe width=\u0026#34;1010\u0026#34; height=\u0026#34;810\u0026#34; src=\u0026#34;/Macnamara_first_prefs.html\u0026#34;\u0026gt;\u0026lt;/iframe\u0026gt; #+END_EXPORT This map seems to show that statistically, Greens support is fairly widespread across the Division. There is also a lot of detail left out: for example, many votes were cast at pre-polling Centres, and the numbers are large enough to significantly affect the result. 
This table compares votes cast in the Division on the day, which is what the map shows, against votes cast at the pre-polling centres: | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | | Type | GRN | ALP | Lib | Others | |------------+-------+-------+-------+--------| | On the day | 12110 | 11499 | 8737 | 4857 | | PPVC | 8014 | 8791 | 8085 | 3262 | |------------+-------+-------+-------+--------| | Totals | 20124 | 20290 | 16822 | 7849 | Clearly first preferences show the Greens votes exceed votes both for the Labor and Liberal parties on the day. At pre-polling, Labor did better than the Greens. It does seem though that the Liberal first preferences were very much smaller, so it seems a bit paradoxical that the Greens were knocked out first, giving a final TCP of Labor and Liberal. But that\u0026#39;s preferences for you! ** The win The current leader of the Australian Greens is [Adam Bandt](https://en.wikipedia.org/wiki/Adam_Bandt), who is the Federal MP for the Division of Melbourne. He has held this seat, increasing his majority at each election, since first winning it in 2010. The next map shows the TCP results from the most recent election, which shows that Bandt has a TCP majority at every polling place except two: #+BEGIN_EXPORT HTML \u0026lt;iframe width=\u0026#34;1010\u0026#34; height=\u0026#34;610\u0026#34; src=\u0026#34;/Melbourne_map_with_regions.html\u0026#34;\u0026gt;\u0026lt;/iframe\u0026gt; #+END_EXPORT As with all such maps, the amount of information here is not huge - but it /looks/ nice. In particular, it shows that statistically, Greens support is fairly widespread across the Division. ** Other similar informational sites There are other sites, showing results at various booths. 
One is the excellently named [PollBludger](https://www.pollbludger.net) from the political analyst William Bowe, who according to the site \u0026#34;is a Perth-based election analyst and occasional teacher of political science. His blog, The Poll Bludger, has existed in one form or another since 2004, and is one of the most heavily trafficked websites on Australian politics.\u0026#34; Another site is [The Tally Room](https://www.tallyroom.com.au) from Ben Raue, who has an [adjunct appointment](https://www.sydney.edu.au/arts/about/our-people/academic-staff/benjamin-raue.html) at the University of Sydney. A very nice addition to this site is a [tutorial on how to create your own maps](https://www.tallyroom.com.au/maps/tutorial), in this case using Google Earth. (I don\u0026#39;t know how recent this is, though.) (The above site is not to be confused with the AEC\u0026#39;s own [Tallyroom](https://tallyroom.aec.gov.au/HouseDefault-27966.htm) which also has a lot of results for the downloading.) However, I don\u0026#39;t know of any sites which add Voronoi diagrams around the polling booths so as to give a picture of the voting characteristics of an electorate. Whether this is of any use is of course a moot point. * Post-election mapping :GIS:voting: :PROPERTIES: :EXPORT_FILE_NAME: post_election_mapping :EXPORT_DATE: 2022-06-05 :END: This continues on from the previous post, trying to make some sense of the voting in my electorate of [Wills](https://en.wikipedia.org/wiki/Division_of_Wills) and the neighbouring electorate of [Cooper](https://en.wikipedia.org/wiki/Division_of_Cooper). Both these electorates (or more formally \u0026#34;Divisions\u0026#34;), as I mentioned in the previous post, are very similar in their geography, demography, and history. Last post I simply showed a map of voting booths, using a circle roughly proportional to the size of the ratio of votes between the two major candidates. 
This used a local system called [Two Candidate Preferred](https://en.wikipedia.org/wiki/Two-party-preferred_vote), which indicates the results after all preferences have been distributed. Australian lower house elections use a preferential system formally called [instant run-off voting](https://en.wikipedia.org/wiki/Instant-runoff_voting), in which each voter numbers all candidates in order of preference. The candidates with lowest counts are successively removed from the counting; their ballots being passed on to other candidates using the highest available preference on a ballot. This continues until only two candidates remain; the one with the largest number of votes is the winner. Although the full count can take some weeks - this must include all the pre-poll votes, postal votes, and absentee votes - an indicative TCP is usually available on the evening of an election. There are sometimes a handful of electorates for which an outcome may not be known for some time, especially if the count is very close. In this most recent election, the division of [Macnamara](https://tallyroom.aec.gov.au/HouseDivisionPage-27966-322.htm) took a long time to be counted: it was a three way contest between Liberal, Labor, and the Greens, with very similar first preference counts for all three parties. It was thus a count which relied very heavily on preferences. For this post I was interested in overlaying the electorate with a [Voronoi diagram](https://en.wikipedia.org/wiki/Voronoi_diagram) based on the booths. This is a subdivision of the electorate into regions around each booth; each such region consists of the points in the plane which are closer to that particular booth than to any other. If we make the simplifying (and not unreasonable) assumption that everybody votes in the booth closest to where they live, we can thus subdivide the electorate into Greens/Labor regions. 
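The defining property of a Voronoi region can be sketched in a few lines of plain Python: a point belongs to the region of whichever booth is nearest to it. This is only an illustration of the definition, not how geovoronoi builds its polygons; the booth names and coordinates below are hypothetical, and distances are measured as if the coordinates were planar.

```python
import math


def nearest_booth(point, booths):
    """Return the name of the booth whose location is closest to `point`.

    point:  an (x, y) pair.
    booths: a dict mapping booth name -> (x, y) location.
    The point lies in the Voronoi region of the booth returned.
    """
    return min(booths, key=lambda name: math.dist(point, booths[name]))


# Hypothetical booths on a small flat grid, for illustration only.
booths = {"North Hall": (0.0, 4.0), "South School": (0.0, 0.0), "East Church": (5.0, 2.0)}

for home in [(1.0, 1.0), (1.0, 3.5), (4.0, 2.0)]:
    print(home, "->", nearest_booth(home, booths))
```

Under the assumption that everybody votes at their closest booth, applying this rule to every household would reproduce the subdivision that the Voronoi diagram draws.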
The idea is to colour each region by its TCP: a booth that favours Labor will have its corresponding region red, and a booth that favours the Greens will have its corresponding region green. To obtain the Voronoi diagram we make use of the Python library [geovoronoi](https://github.com/WZBSocialScienceCenter/geovoronoi) which returns regions as shapefiles. These can then be easily converted to JSON files for including on a [folium](https://python-visualization.github.io/folium/) map. Here are the results, first for Wills: #+BEGIN_EXPORT HTML \u0026lt;iframe width=\u0026#34;750\u0026#34; height=\u0026#34;1000\u0026#34; src=\u0026#34;/wills_voronoi.html\u0026#34;\u0026gt;\u0026lt;/iframe\u0026gt; #+END_EXPORT and for Cooper: #+BEGIN_EXPORT HTML \u0026lt;iframe width=\u0026#34;750\u0026#34; height=\u0026#34;1000\u0026#34; src=\u0026#34;/cooper_voronoi.html\u0026#34;\u0026gt;\u0026lt;/iframe\u0026gt; #+END_EXPORT Naturally these maps cannot tell us everything, and their limitations must be noted: there is no attempt to provide shades of colour for the size of the ratio. That is, a booth with a Labor to Greens voting ratio of 3.5 gets the same shade of red as a booth with a ratio of 1.01. However, the popups show the ratio at each booth. The numbers of votes cast at the booths are not equal. For instance, if you go to the [AEC page for Wills](https://tallyroom.aec.gov.au/HouseDivisionPage-27966-234.htm) and check out the TCP numbers by polling place, numbers of votes cast range from 71 at Strathmore North to 2739 at Brunswick North, to even larger numbers at the pre-poll voting centres, Northcote and Pascoe Vale, with 5210 and 15141 total votes cast respectively. A better map may make adjustments both for the value of the ratio, and the total number of votes cast. 
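One possible way of shading by the size of the ratio is sketched below. It is not what the maps above do: the function name =region_style= and the particular opacity formula are assumptions made purely for illustration. The keys =fillColor= and =fillOpacity= are the standard leaflet Path options, so a dictionary like this could in principle be returned from a folium style function.

```python
def region_style(alp_votes, grn_votes):
    """Pick a fill colour and an opacity that grows with the winning margin.

    The winner's share of the two-candidate vote lies between 0.5 and 1;
    it is mapped linearly onto an opacity between 0.2 (a near tie)
    and 1.0 (a clean sweep). This scaling is an arbitrary choice.
    """
    total = alp_votes + grn_votes
    if total == 0:
        # No votes recorded: neutral colour, faint fill.
        return {"fillColor": "grey", "fillOpacity": 0.2}
    if alp_votes > grn_votes:
        colour, winner = "red", alp_votes
    else:
        colour, winner = "green", grn_votes
    share = winner / total
    opacity = 0.2 + 0.8 * (share - 0.5) / 0.5
    return {"fillColor": colour, "fillOpacity": round(opacity, 3)}


# A ratio of 3.5 now looks different from a ratio of 1.01.
print(region_style(350, 100))
print(region_style(101, 100))
```

This addresses only the ratio; accounting for the very unequal numbers of votes cast at each booth would need a further adjustment.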
* Post-election swings :GIS:voting: :PROPERTIES: :EXPORT_FILE_NAME: post_election_swings :EXPORT_DATE: 2022-05-22 :END: So the Australian federal election of 2022 is over as far as the public is concerned; all votes have been cast and now it\u0026#39;s a matter of waiting while the [Australian Electoral Commission](https://www.aec.gov.au/) tallies the numbers, sorts all the preferences, and arrives at a result. Because of the complications of the voting system, and of all the checks and balances within it, a final complete result may not be known for some days or even weeks. What /is/ known, though, is that the sitting government has been ousted, and that the Australian Labor Party (ALP) will lead the new government. Whether the ALP amasses enough wins for it to govern with a complete majority is not known; they may have a \u0026#34;minority government\u0026#34; in coalition with either independent candidates or the Greens. In Australia, there are 151 federal electorates or \u0026#34;Divisions\u0026#34;; each one corresponds to a seat in the House of Representatives; the Lower House of the Federal government. The winner of an election is whichever party or coalition wins the majority of seats; the Prime Minister is simply the leader of the major party in that coalition. Australians thus have no say whatsoever in the Prime Minister; that is entirely a party matter, which is why Prime Ministers have sometimes been replaced in the middle of a term. My concern is the neighbouring electorates of [Cooper](https://en.wikipedia.org/wiki/Division_of_Cooper) and [Wills](https://en.wikipedia.org/wiki/Division_of_Wills). Both are very similar both geographically and politically; both are Labor strongholds, in each of which the Greens have made considerable inroads. Indeed Cooper (called Batman until a few years ago) used to be one of the safest Labor seats in the country; it has now become far less so, and in each election the battle is now between Labor and the Greens. 
Both are urban seats, in Melbourne; in each of them the southern portion is more gentrified, diverse, and left-leaning, and the Northern part is more solidly working-class, and Labor-leaning. In each of them the dividing line is [Bell St](https://bit.ly/3NvMAZn), known as the \u0026#34;tofu curtain\u0026#34;. (Also as the \u0026#34;Latte Line\u0026#34; or the \u0026#34;Hipster-Proof Fence\u0026#34;.) Thus Greens campaigning consists of letting the southerners know they haven\u0026#39;t been forgotten, and attempting to reach out to the northerners. This is mainly done with door-knocking by volunteers, and there is never enough time, or enough volunteers, to reach every household. Anyway, here are some maps showing the Greens/Labor result at each polling booth. The size of the circle represents the ratio of votes: red for a Labor majority; green for a Greens majority. And the popup tooltip gives the name of the polling booth, and the swing either to Labor or to the Greens. I couldn\u0026#39;t decide the best way of displaying the swings, so in the end I just displayed the swing to Labor in all booths with a Labor majority, even if that swing was sometimes negative. And similarly for the Greens. Note that a large swing may correspond to a relatively small number of votes being cast. 
** The Division of Cooper #+BEGIN_EXPORT HTML \u0026lt;iframe width=\u0026#34;750\u0026#34; height=\u0026#34;1000\u0026#34; src=\u0026#34;/cooper.html\u0026#34;\u0026gt;\u0026lt;/iframe\u0026gt; #+END_EXPORT ** The Division of Wills #+BEGIN_EXPORT HTML \u0026lt;iframe width=\u0026#34;750\u0026#34; height=\u0026#34;1000\u0026#34; src=\u0026#34;/wills.html\u0026#34;\u0026gt;\u0026lt;/iframe\u0026gt; #+END_EXPORT * Ramanujan\u0026#39;s cubes :mathematics:computation: :PROPERTIES: :EXPORT_FILE_NAME: ramanujans_cubes :EXPORT_DATE: 2022-04-03 :END: This post illustrates the working of Ramanujan\u0026#39;s generating functions for solving Euler\u0026#39;s diophantine equation \\(a^3+b^3=c^3+d^3\\) as described by Andrews and Berndt in \u0026#34;Ramanujan\u0026#39;s Lost Notebook, Part IV\u0026#34;, pp 199 - 205 (Section 8.5). The text is available from [[https://link.springer.com/book/10.1007/978-1-4614-4081-9][Springer]]. Ramanujan\u0026#39;s result is that if \\[ f_1(x) = \\frac{1+53x+9x^2}{1-82x-82x^2+x^3} = a_0+a_1x+a_2x^2+a_3x^3+\\cdots = \\alpha_0+\\frac{\\alpha_1}{x}+\\frac{\\alpha_2}{x^2}+\\frac{\\alpha_3}{x^3}+\\cdots\\] \\[ f_2(x) = \\frac{2-26x-12x^2}{1-82x-82x^2+x^3} = b_0+b_1x+b_2x^2+b_3x^3+\\cdots = \\beta_0+\\frac{\\beta_1}{x}+\\frac{\\beta_2}{x^2}+\\frac{\\beta_3}{x^3}+\\cdots\\] \\[ f_3(x) = \\frac{2+8x-10x^2}{1-82x-82x^2+x^3} = c_0+c_1x+c_2x^2+c_3x^3+\\cdots = \\gamma_0+\\frac{\\gamma_1}{x}+\\frac{\\gamma_2}{x^2}+\\frac{\\gamma_3}{x^3}+\\cdots\\] then for every value of \\(n\\) we have: \\(a_n^3+b_n^3=c_n^3+(-1)^n\\) and \\(\\alpha_n^3+\\beta_n^3=\\gamma_n^3-(-1)^n\\) thus providing infinite sequences of solutions to Euler\u0026#39;s equation. The values \\(\\alpha_1,\\beta_1,\\gamma_1\\) are \\(9, -12, -10\\) giving rise to \\((9)^3+(-12)^3=(-10)^3-(-1)^1\\) which can be rewritten as \\(9^3+10^3=12^3+1^3\\). 
Andrews and Berndt comment that: #+BEGIN_QUOTE This is another of those many results of Ramanujan for which one wonders, “How did he ever think of this?” #+END_QUOTE And we all know the story from Hardy about Ramanujan\u0026#39;s comment about the number 1729. #+BEGIN_SRC python import sympy as sy sy.init_printing(num_columns=120) x,a,b,c = sy.var(\u0026#39;x,a,b,c\u0026#39;) n = sy.Symbol(\u0026#39;n\u0026#39;, positive=True, integer=True) #+END_SRC Start by entering the three rational functions. #+BEGIN_SRC python g = 1-82*x-82*x**2+x**3 f1 = (1+53*x+9*x**2)/g f2 = (2-26*x-12*x**2)/g f3 = (2+8*x-10*x**2)/g display(f1) display(f2) display(f3) #+END_SRC \\[ \\frac{9x^2+53x+1}{x^3-82x^2-82x+1} \\] \\[ \\frac{-12x^2-26x+2}{x^3-82x^2-82x+1} \\] \\[ \\frac{-10x^2+8x+2}{x^3-82x^2-82x+1} \\] #+BEGIN_SRC python fp1 = f1.apart(x,full=True).doit() fp2 = f2.apart(x,full=True).doit() fp3 = f3.apart(x,full=True).doit() fs1 = [z.simplify() for z in fp1.args] fs2 = [z.simplify() for z in fp2.args] fs3 = [z.simplify() for z in fp3.args] display(fs1) display(fs2) display(fs3) #+END_SRC \\[ \\left[ -\\frac{43}{85x+85}, \\frac{8(101+11\\sqrt{85})}{85(2x-83-9\\sqrt{85})}, \\frac{8(101-11\\sqrt{85})}{85(2x-83+9\\sqrt{85})}\\right] \\] \\[ \\left[ \\frac{16}{85x+85}, \\frac{28(-37-4\\sqrt{85})}{85(2x-83-9\\sqrt{85})}, \\frac{28(-37+4\\sqrt{85})}{85(2x-83+9\\sqrt{85})}\\right] \\] \\[ \\left[ -\\frac{16}{85x+85}, \\frac{6(-139-15\\sqrt{85})}{85(2x-83-9\\sqrt{85})}, \\frac{6(-139+15\\sqrt{85})}{85(2x-83+9\\sqrt{85})}\\right] \\] Note that the denominators of the fractions are the same in each list (as we\u0026#39;d expect). 
Now we use the fact that \\[\\frac{a}{bx+c}\\] has the infinite series expansion \\[\\frac{a}{c}\\left(1- \\frac{b}{c}x+\\left(\\frac{b}{c}\\right)^2x^2-\\left(\\frac{b}{c}\\right)^3x^3+\\cdots\\right)\\] This means that the coefficient of \\(x^n\\) is \\[\\frac{a}{c}\\left(-\\frac{b}{c}\\right)^n\\] Because of the denominators, the values of \\(b\\) and \\(c\\) are always the same. We start by considering =fs1=, which consists of the partial fraction sums of =f1=. #+BEGIN_SRC python a1_s = [sy.numer(z) for z in fs1] b1_s = [sy.denom(z).coeff(x) for z in fs1] c1_s = [sy.denom(z).coeff(x,0) for z in fs1] ac1_s = [sy.simplify(s/t) for s,t in zip(a1_s,c1_s)] bc1_s = [sy.simplify(s/t) for s,t in zip(b1_s,c1_s)] display(ac1_s) display(bc1_s) #+END_SRC \\[ \\left[-\\frac{43}{85},\\frac{64}{85}-\\frac{8\\sqrt{85}}{85}, \\frac{64}{85}+\\frac{8\\sqrt{85}}{85}\\right] \\] \\[ \\left[1, -\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}, -\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right] \\] Now we can determine the coefficient of \\(x^n\\) in the power series expansion of $f_1(x)$: #+BEGIN_SRC python a_n = sum(s*(-t)**n for s,t in zip(ac1_s,bc1_s)) display(a_n) #+END_SRC \\[ -\\frac{43(-1)^n}{85} + \\left(\\frac{64}{85}-\\frac{8\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(\\frac{64}{85}+\\frac{8\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\] And repeat all of the above for $f_2(x)$ and its partial fractions =fs2=. 
#+BEGIN_SRC python a2_s = [sy.numer(z) for z in fs2] b2_s = [sy.denom(z).coeff(x) for z in fs2] c2_s = [sy.denom(z).coeff(x,0) for z in fs2] ac2_s = [sy.simplify(s/t) for s,t in zip(a2_s,c2_s)] bc2_s = [sy.simplify(s/t) for s,t in zip(b2_s,c2_s)] b_n = sum(s*(-t)**n for s,t in zip(ac2_s,bc2_s)) display(b_n) #+END_SRC \\[ \\frac{16(-1)^n}{85} + \\left(\\frac{77}{85}-\\frac{7\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(\\frac{77}{85}+\\frac{7\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\] Continuing for $f_3(x)$: #+BEGIN_SRC python a3_s = [sy.numer(z) for z in fs3] b3_s = [sy.denom(z).coeff(x) for z in fs3] c3_s = [sy.denom(z).coeff(x,0) for z in fs3] ac3_s = [sy.simplify(s/t) for s,t in zip(a3_s,c3_s)] bc3_s = [sy.simplify(s/t) for s,t in zip(b3_s,c3_s)] c_n = sum(s*(-t)**n for s,t in zip(ac3_s,bc3_s)) display(c_n) #+END_SRC \\[ -\\frac{16(-1)^n}{85} + \\left(\\frac{93}{85}-\\frac{9\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(\\frac{93}{85}+\\frac{9\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\] In order to see their similarities and differences, we now show them together: #+BEGIN_SRC python display(a_n) display(b_n) display(c_n) #+END_SRC \\[ -\\frac{43(-1)^n}{85} + \\left(\\frac{64}{85}-\\frac{8\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(\\frac{64}{85}+\\frac{8\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\] \\[ \\frac{16(-1)^n}{85} + \\left(\\frac{77}{85}-\\frac{7\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(\\frac{77}{85}+\\frac{7\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\] \\[ -\\frac{16(-1)^n}{85} + \\left(\\frac{93}{85}-\\frac{9\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + 
\\left(\\frac{93}{85}+\\frac{9\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\] The first few coefficients can be checked against a Taylor series expansion: #+BEGIN_SRC python s1 = f1.series(n=6) display([s1.coeff(x,k) for k in range(6)]) a_s = [a_n.subs(n,k).simplify() for k in range(6)] display(a_s) #+END_SRC \\begin{aligned} \u0026amp;[1,\\quad 135,\\quad 11161,\\quad 926271,\\quad 76869289,\\quad 6379224759]\\\\ \u0026amp;\\\\ \u0026amp;[1,\\quad 135,\\quad 11161,\\quad 926271,\\quad 76869289,\\quad 6379224759] \\end{aligned} #+BEGIN_SRC python s2 = f2.series(n=6) display([s2.coeff(x,k) for k in range(6)]) b_s = [b_n.subs(n,k).simplify() for k in range(6)] display(b_s) #+END_SRC \\begin{aligned} \u0026amp;[2,\\quad 138,\\quad 11468,\\quad 951690,\\quad 78978818,\\quad 6554290188]\\\\ \u0026amp;\\\\ \u0026amp;[2,\\quad 138,\\quad 11468,\\quad 951690,\\quad 78978818,\\quad 6554290188] \\end{aligned} #+BEGIN_SRC python s3 = f3.series(n=6) display([s3.coeff(x,k) for k in range(6)]) c_s = [c_n.subs(n,k).simplify() for k in range(6)] display(c_s) #+END_SRC \\begin{aligned} \u0026amp;[2,\\quad 172,\\quad 14258,\\quad 1183258,\\quad 98196140,\\quad\\quad 8149006378]\\\\ \u0026amp;\\\\ \u0026amp;[2,\\quad 172,\\quad 14258,\\quad 1183258,\\quad 98196140,\\quad\\quad 8149006378] \\end{aligned} Now, if everything has behaved properly, we should have \\[ a_n^3 + b_n^3 = c_n^3 + (-1)^n \\] and we can check the first few values: #+BEGIN_SRC python [s**3+t**3-u**3 for s,t,u in zip(a_s,b_s,c_s)] #+END_SRC \\[ [1,\\quad -1,\\quad 1,\\quad -1,\\quad 1,\\quad -1] \\] And now for the general result: #+BEGIN_SRC python sy.powsimp(sy.expand(a_n**3 + b_n**3 - c_n**3),combine=\u0026#39;all\u0026#39;,force=True).factor() #+END_SRC \\[ (-1)^n \\] Woo hoo! Now for the other expansions, in negative powers of $x$; in other words based on the functions $f_k(1/x)$. We\u0026#39;ll rename these functions: $g_k(x) = f_k(1/x)$. 
After that it\u0026#39;s pretty much a carbon copy of the preceding computations. #+BEGIN_SRC python g1 = f1.subs(x,1/x).simplify() g2 = f2.subs(x,1/x).simplify() g3 = f3.subs(x,1/x).simplify() display(g1) display(g2) display(g3) #+END_SRC \\[ \\frac{x(x^2+53x+9)}{x^3-82x^2-82x+1} \\] \\[ \\frac{2x(x^2 - 13x - 6)}{x^3 - 82x^2 - 82x + 1} \\] \\[ \\frac{2x(x^2 + 4x - 5)}{x^3 - 82x^2 - 82x + 1} \\] #+BEGIN_SRC python gp1 = g1.apart(x,full=True).doit() gp2 = g2.apart(x,full=True).doit() gp3 = g3.apart(x,full=True).doit() gs1 = [z.simplify() for z in gp1.args] gs2 = [z.simplify() for z in gp2.args] gs3 = [z.simplify() for z in gp3.args] display(gs1) display(gs2) display(gs3) #+END_SRC \\[ \\left[1, \\frac{43}{85x+85}, \\frac{8(1429+155\\sqrt{85})}{85(2x-83-9\\sqrt{85})}, \\frac{8(1429-155\\sqrt{85})}{85(2x-83+9\\sqrt{85})}\\right] \\] \\[ \\left[2, -\\frac{16}{85x+85}, \\frac{14(839+91\\sqrt{85})}{85(2x-83-9\\sqrt{85})}, \\frac{14(839-91\\sqrt{85})}{85(2x-83+9\\sqrt{85})}\\right] \\] \\[ \\left[2, \\frac{16}{85x+85}, \\frac{12(1217+132\\sqrt{85})}{85(2x-83-9\\sqrt{85})}, \\frac{12(1217-132\\sqrt{85})}{85(2x-83+9\\sqrt{85})}\\right] \\] For ease of writing in Python, we\u0026#39;ll use =d=, =e= and =f= instead of $\\alpha$, $\\beta$ and $\\gamma$. 
#+BEGIN_SRC python d1_s = [sy.numer(z) for z in gs1] e1_s = [sy.denom(z).coeff(x) for z in gs1] f1_s = [sy.denom(z).coeff(x,0) for z in gs1] df1_s = [sy.simplify(s/t) for s,t in zip(d1_s,f1_s)] ef1_s = [sy.simplify(s/t) for s,t in zip(e1_s,f1_s)] d_n = sum(s*(-t)**n for s,t in zip(df1_s,ef1_s)) #+END_SRC #+BEGIN_SRC python d2_s = [sy.numer(z) for z in gs2] e2_s = [sy.denom(z).coeff(x) for z in gs2] f2_s = [sy.denom(z).coeff(x,0) for z in gs2] df2_s = [sy.simplify(s/t) for s,t in zip(d2_s,f2_s)] ef2_s = [sy.simplify(s/t) for s,t in zip(e2_s,f2_s)] e_n = sum(s*(-t)**n for s,t in zip(df2_s,ef2_s)) #+END_SRC #+BEGIN_SRC python d3_s = [sy.numer(z) for z in gs3] e3_s = [sy.denom(z).coeff(x) for z in gs3] f3_s = [sy.denom(z).coeff(x,0) for z in gs3] df3_s = [sy.simplify(s/t) for s,t in zip(d3_s,f3_s)] ef3_s = [sy.simplify(s/t) for s,t in zip(e3_s,f3_s)] f_n = sum(s*(-t)**n for s,t in zip(df3_s,ef3_s)) #+END_SRC #+BEGIN_SRC python display(d_n) display(e_n) display(f_n) #+END_SRC \\[ \\frac{43(-1)^n}{85} + \\left(-\\frac{64}{85}-\\frac{8\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(-\\frac{64}{85}+\\frac{8\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\] \\[ -\\frac{16(-1)^n}{85} + \\left(-\\frac{77}{85}-\\frac{7\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(-\\frac{77}{85}+\\frac{7\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\] \\[ \\frac{16(-1)^n}{85} + \\left(-\\frac{93}{85}-\\frac{9\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}-\\frac{9\\sqrt{85}}{2}\\right)^n + \\left(-\\frac{93}{85}+\\frac{9\\sqrt{85}}{85}\\right) \\left(\\frac{83}{2}+\\frac{9\\sqrt{85}}{2}\\right)^n \\] As before, a quick check: #+BEGIN_SRC python ds = [d_n.subs(n,k).simplify() for k in range(6)] es = [e_n.subs(n,k).simplify() for k in range(6)] fs = [f_n.subs(n,k).simplify() for k in range(6)] display(ds) display(es) display(fs) #+END_SRC \\begin{aligned} 
\u0026amp;[-1, 9, 791, 65601, 5444135, 451797561]\\\\ \u0026amp;\\\\ \u0026amp;[-2, -12, -1010, -83802, -6954572, -577145658]\\\\ \u0026amp;\\\\ \u0026amp;[-2, -10, -812, -67402, -5593538, -464196268] \\end{aligned} #+BEGIN_SRC python [s**3+t**3-v**3 for s,t,v in zip(ds,es,fs)] #+END_SRC \\[ [-1,\\quad 1,\\quad -1,\\quad 1,\\quad -1,\\quad 1] \\] And finally, confirming the general result: #+BEGIN_SRC python sy.powsimp(sy.expand(d_n**3 + e_n**3 - f_n**3),combine=\u0026#39;all\u0026#39;,force=True).factor() #+END_SRC \\[ -(-1)^n \\] Again, woo hoo! : * Wordle :Julia: :PROPERTIES: :EXPORT_FILE_NAME: wordle :EXPORT_DATE: 2022-01-24 :END: [Wordle](https://wordlegame.org) is a pleasant game, basically Mastermind with words. You choose an English word (although it can also be played in other languages), and then you\u0026#39;re told if your letters are incorrect, correct but in the wrong place, or correct and in the right place. These are shown by the colours grey, yellow, and green. The genius is that you can share your result: the grid of coloured squares which shows how quickly or slowly you\u0026#39;ve guessed the word. The shared grid shows only the colours, and not the letters, so is not helpful for anybody else. At the time of writing, Twitter seems awash with Wordle grids. For example, today my attempt looked like this: #+ATTR_HTML: :width 300 [[file:/wordle_letters.png]] but the grid I could share looked like this: #+ATTR_HTML: :width 250 [[file:/wordle_colours.png]] Since all words are English, each turn considerably lessens the pool of words from which to choose. Since you have no idea what the hidden word might be, you just need to make a guess at each stage. I don\u0026#39;t know what pool of words is used to create the daily wordle, whether its mostly simple English, or whether it includes some unusual words. So the plan was to create a Wordle \u0026#34;helper\u0026#34; program which would provide the list of allowable words at each stage. 
I used Julia for speed. to start, read in the word list (I used the ~all5.txt~ list from the previous post): #+BEGIN_SRC Julia Julia\u0026gt; wds = readlines(open(\u0026#34;all5.txt\u0026#34;)) #+END_SRC What we\u0026#39;re going to do is to create a simple function which will take a word list, a chosen word, and its Wordle colours, and produce a new word list containing all possible usable words. It will look like: =wordle_guess(word_list, word, colours)= and so for the third word down above, we would use something like: =wordle_guess(current_words,\u0026#34;pylon\u0026#34;,\u0026#34;nnyyy\u0026#34;)= where the characters ~n~, ~y~, ~g~ will be used for grey, yellow, and green. The use of ~n~ can be taken to stand for \u0026#34;no\u0026#34; or \u0026#34;nil\u0026#34; or \u0026#34;not\u0026#34; or \u0026#34;never\u0026#34;. (For an entertaining riff on words beginning with \u0026#34;n\u0026#34; I can\u0026#39;t recommend strongly enough the first story in Stanisław Lem\u0026#39;s [\u0026#34;The Cyberiad\u0026#34;](https://en.wikipedia.org/wiki/The_Cyberiad) which is quite possibly the most brilliantly inventive book ever - and certainly in the realm of science fiction.) 
Then the logic of the program will be very simple; we walk through the current word one letter at a time, and reduce the word-list according to the colour: - for grey, find all words which do not use that particular letter - for yellow, find all words with that letter but in another position - for green, find all words with that letter in that position No doubt there are cleverer ways of doing this, but here\u0026#39;s what I whipped up: #+BEGIN_SRC Julia function wordle_guess(ws,wd,cs) wlist = copy(ws) for i in 1:5 if cs[i] == \u0026#39;n\u0026#39; wlist = [w for w in wlist if isnothing(findfirst(wd[i],w))] elseif cs[i] == \u0026#39;y\u0026#39; wlist = [w for w in wlist if !isnothing(findfirst(wd[i],w)) \u0026amp;\u0026amp; w[i] != wd[i]] else wlist = [w for w in wlist if w[i]==wd[i]] end end return(wlist) end #+END_SRC And here\u0026#39;s how it was used: #+BEGIN_SRC Julia Julia\u0026gt; wds2 = wordle_guess(wds,\u0026#34;cream\u0026#34;,\u0026#34;nnnnn\u0026#34;) Julia\u0026gt; wds3 = wordle_guess(wds2,\u0026#34;shunt\u0026#34;,\u0026#34;nnnyn\u0026#34;) Julia\u0026gt; wds4 = wordle_guess(wds3,\u0026#34;pylon\u0026#34;,\u0026#34;nnyyy\u0026#34;) 2-element Vector{String}: \u0026#34;knoll\u0026#34; \u0026#34;lingo\u0026#34; #+END_SRC At this stage I simply had to flip a coin, and as you see from above, I chose the wrong one! The remarkable thing is how much the word lists are reduced at each turn: #+BEGIN_SRC Julia Julia\u0026gt; [length(x) for x in [wds,wds2,wds3,wds4]] 4-element Vector{Int64}: 6196 937 37 2 #+END_SRC An interesting question might be: if you choose a random word at each stage from the current list of usable words, what is the expected number of trials to find the word? And does that value differ between different hidden words? And also: given a hidden word, and choosing a random word each time, what is the maximum number of trials that could be used? These will need to wait for another day (or maybe they\u0026#39;ve been answered already.) 
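The questions about random guessing can be explored with a small simulation. What follows is only a sketch in Python rather than the Julia used above: the names `score`, `trials_needed` and `mean_trials` are invented for illustration, and the colouring rule is a simplified one (a letter is yellow whenever it occurs anywhere in the hidden word), chosen so that it agrees with the filtering logic of `wordle_guess`; the real game treats repeated letters more carefully.

```python
import random

def score(guess, hidden):
    # Simplified colouring (an assumption): g = right place,
    # y = occurs elsewhere in the hidden word, n = absent.
    out = []
    for gch, hch in zip(guess, hidden):
        if gch == hch:
            out.append('g')
        elif gch in hidden:
            out.append('y')
        else:
            out.append('n')
    return ''.join(out)

def wordle_guess(words, guess, colours):
    # Same filtering rules as the Julia function of the same name.
    keep = list(words)
    for i, (ch, c) in enumerate(zip(guess, colours)):
        if c == 'n':
            keep = [w for w in keep if ch not in w]
        elif c == 'y':
            keep = [w for w in keep if ch in w and w[i] != ch]
        else:
            keep = [w for w in keep if w[i] == ch]
    return keep

def trials_needed(words, hidden, rng):
    # Guess uniformly at random from the surviving words until the
    # hidden word is chosen; every wrong guess removes itself from the pool.
    pool, count = list(words), 0
    while True:
        guess = rng.choice(pool)
        count += 1
        if guess == hidden:
            return count
        pool = wordle_guess(pool, guess, score(guess, hidden))

def mean_trials(words, hidden, runs=1000, seed=0):
    # Monte Carlo estimate of the expected number of guesses.
    rng = random.Random(seed)
    return sum(trials_needed(words, hidden, rng) for _ in range(runs)) / runs
```

With the full word list read into `words`, calling `mean_trials(words, hidden)` for a few different hidden words would give a rough idea of how much the expected number of guesses varies; taking the maximum of `trials_needed` over many runs gives a lower bound on the worst case.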
* Five letter words in English :PROPERTIES: :EXPORT_FILE_NAME: five_letter_words_in_english :EXPORT_DATE: 2022-01-23 :END: I was going to make a little post about [Wordle](https://wordlegame.org), but I got sidetracked exploring five letter words. At the same time, I had a bit of fun with [regular expressions](https://en.wikipedia.org/wiki/Regular_expression) and some simple scripting with [ZSH](https://zsh.sourceforge.io). The start was to obtain lists of 5-letter words. One is available at the [Stanford Graphbase](https://www-cs-faculty.stanford.edu/~knuth/sgb.html) site; the file ~sgb-words.txt~ contains \u0026#34;the 5757 five-letter words of English\u0026#34;. Others are available through English-language wordlists, such as those in the Linux directory ~/usr/share/dict/~. There are two such files for British and American English. So we can start by gathering all these different lists of words, and also sorting Knuth\u0026#39;s file so that it is in alphabetical order. Here\u0026#39;s how (assuming that we\u0026#39;re in a writable directory that will contain these files): #+BEGIN_SRC Bash grep -E \u0026#39;^[a-z]{5}$\u0026#39; /usr/share/dict/american-english \u0026gt; ./usa5.txt grep -E \u0026#39;^[a-z]{5}$\u0026#39; /usr/share/dict/british-english \u0026gt; ./brit5.txt sort sgb-words.txt \u0026gt; ./knuth5.txt #+END_SRC The regular expressions in the first two lines simply ask for words of exactly five letters made from lower-case characters. This eliminates proper names and words with apostrophes. Now let\u0026#39;s see how many words each file contains: #+BEGIN_SRC Bash $ wc -l *5.txt 5300 brit5.txt 5757 knuth5.txt 5150 usa5.txt 16207 total #+END_SRC Note the nice use of ZSH\u0026#39;s powerful [globbing](https://en.wikipedia.org/wiki/Glob_(programming)) features - one area in which it is more powerful than BASH.
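For readers who would rather not use grep, the same regular-expression filter can be mirrored in Python. This is only an illustrative sketch; the function name `five_letter_words` and the sample list are made up for the example.

```python
import re

# Exactly five lower-case ASCII letters; fullmatch plays the role
# of the ^ and $ anchors in the grep pattern.
FIVE_LOWER = re.compile(r'[a-z]{5}')

def five_letter_words(lines):
    # Strip the trailing newline from each line before testing it.
    stripped = (line.strip() for line in lines)
    return [w for w in stripped if FIVE_LOWER.fullmatch(w)]

sample = ['Paris', "can't", 'house', 'cat', 'trains', 'rigor']
print(five_letter_words(sample))   # ['house', 'rigor']
```

The proper name and the word with an apostrophe are rejected even though both have five characters, which is the behaviour described above.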
Now there are too many words to see exactly the differences between them, but to start let\u0026#39;s list the words in ~brit5.txt~ which are not in ~usa5.txt~, and also the words in ~usa5.txt~ which are not in ~brit5.txt~: #+BEGIN_SRC bash $ grep -f usa5.txt -v brit5.txt \u0026gt; brit-usa_d.txt $ grep -f brit5.txt -v usa5.txt \u0026gt; usa-brit_d.txt #+END_SRC I\u0026#39;m using a debased version of set difference for the output file, so that ~brit-usa_d.txt~ are those words in ~brit5.txt~ which are not in ~usa5.txt~. I\u0026#39;ve added a ~_d~ to make a handle for globbing: #+BEGIN_SRC bash $ wc -l *_d.txt 188 brit-usa_d.txt 38 usa-brit_d.txt 226 total #+END_SRC And now we can look at the contents of these files, using the handy Linux `column` command to print the output with fewer lines: #+BEGIN_SRC Bash $ cat usa-brit_d.txt | column arbor\tchili\tfagot\tfeces\thonor\tmiter\tniter\trigor\tsavor\tvapor ardor\tcolor\tfavor\tfiber\thumor\tmolds\tocher\truble\tslier\tvigor armor\tdolor\tfayer\tfuror\tlabor\tmoldy\todors\trumor\ttumor calks\tedema\tfecal\tgrays\tliter\tmolts\tplows\tsaber\tvalor #+END_SRC Notice, as you may expect, that this file contains American spellings: \u0026#34;rigor\u0026#34; instead of British \u0026#34;rigour\u0026#34;, \u0026#34;liter\u0026#34; instead of British \u0026#34;litre\u0026#34;, and so on.
However, the other file difference contains only a few spelling differences, and quite a lot of words not in the American wordlist: #+BEGIN_SRC bash $ cat brit-usa_d.txt | column abaci\tblent\tcroci\tflyer\thollo\tliras\tnitre\tpupas\tslily\ttogae\twrapt aeons\tblest\tcurst\tfogey\thomie\tlitre\tnosey\trajas\tslyer\ttopis\twrier agism\tblubs\tdados\tfondu\thooka\tloxes\tochre\tranis\tsnuck\ttorah\tyacks ameer\tbocce\tdeers\tfrier\thorsy\tlupin\todour\trecta\tspacy\ttorsi\tyocks amirs\tbocci\tdicky\tgamey\thuzza\tmacks\toecus\trelit\tspelt\ttsars\tyogin amnia\tboney\tdidos\tgaols\tidyls\tmaths\tpanty\tropey\tspick\ttyres\tyucks ampul\tbosun\tditzy\tgayly\tikons\tmatts\tpapaw\tsabre\tspilt\ttzars\tyuppy amuck\tbriar\tdjinn\tgipsy\timbed\tmavin\tpease\tsaree\tstogy\tulnas\tzombi appal\tbrusk\tdrily\tgismo\tindue\tmetre\tpenes\tsheik\tstyes\tvacua aquae\tbunko\tenrol\tgnawn\tjehad\tmiaow\tpigmy\tsherd\tswops\tveldt arses\tburqa\tenure\tgreys\tjinns\tmicra\tpilau\tshlep\tsynch\tvitas aunty\tcaddy\teying\tgybed\tjunky\tmitre\tpilaw\tshoed\ttabus\tvizir aurae\tcalfs\teyrie\tgybes\tkabob\tmomma\tpinky\tshoon\ttempi\tvizor baddy\tcalif\tfaery\thadji\tkebob\tmould\tpodgy\tshorn\tthymi\twelch bassi\tcelli\tfayre\thallo\tkerbs\tmoult\tpodia\tshtik\ttikes\twhirr baulk\tchapt\tfezes\thanky\tkiddy\tmynah\tpricy\tsiree\ttipis\twhizz beaux\tclipt\tfibre\theros\tkopek\tnarks\tprise\tsitup\ttiros\twizes bided\tconey\tfiord\thoagy\tleapt\tnetts\tpryer\tskyed\ttoffy\twooly #+END_SRC Of course, some of these words are spelled with a different number of letters in American English: for example the British \u0026#34;djinn\u0026#34; is the American \u0026#34;jinn\u0026#34;; the British \u0026#34;saree\u0026#34; is the American \u0026#34;sari\u0026#34;. 
Now of course we want to see how the Knuth file differs, as it\u0026#39;s the file with the largest number of words: #+BEGIN_SRC bash $ grep -f usa5.txt -v knuth5.txt \u0026gt; knuth-usa_d.txt $ grep -f brit5.txt -v knuth5.txt \u0026gt; knuth-brit_d.txt $ wc -l knuth*_d.txt 895 knuth-brit_d.txt 980 knuth-usa_d.txt 1875 total #+END_SRC Remarkably enough, there are also words in both the original files which are not in Knuth\u0026#39;s list: #+BEGIN_SRC bash $ grep -f knuth5.txt -v usa5.txt \u0026gt; usa-knuth_d.txt $ grep -f knuth5.txt -v brit5.txt \u0026gt; brit-knuth_d.txt $ wc -l *knuth_d.txt 438 brit-knuth_d.txt 373 usa-knuth_d.txt 811 total #+END_SRC So maybe our best bet would be to concatenate all the files, and take all the words, leaving out any duplicates. Something like this: #+BEGIN_SRC bash $ cat usa5.txt brit5.txt knuth5.txt | sort | uniq -u \u0026gt; allu5.txt $ cat usa5.txt brit5.txt knuth5.txt | sort | uniq -d \u0026gt; alld5.txt $ cat allu5.txt alld5.txt | sort \u0026gt; all5.txt #+END_SRC The first line finds all the words which are unique - that is, that appear only once in the concatenated file, and the second line finds all the words which are repeated. These two lists are disjoint, and so may then be concatenated to form a master list, which can be found to contain 6196 words. Surely this file is complete? Well, the English language is a great collector of words, and every year we find new words being used, many from other languages and cultures. Here are some words that are not in the ~all5.txt~ file: Australian words: galah, dunny, smoko, durry, bogan, chook (there are almost certainly others) Indian words: crore, lakhs, dosai, iddli, baati, chaat, kheer, kofta, kulfi, rasam, poori (the first two are numbers, the others are foods) Scots words: canty, curch, flang, kythe, plack, routh, saugh, teugh, wadna - these are all used by Burns in his poems, which are written in English (admittedly a dialectal form of it).
New words: qango, fubar, crunk, noobs, vlogs, rando, vaper (the first two are excellent acronyms; the others are new words) As with the Australian words, none of these lists is exhaustive; the full list of five-letter English words not in the file ~all5.txt~ would probably run into the many hundreds, maybe even thousands. ** A note on word structures I was curious about the numbers of vowels and consonants in words. To start, here\u0026#39;s a little Julia function which encodes the positions of consonants as an integer between 0 and 31. For example, take the word \u0026#34;drive\u0026#34;. We can encode this as [1,1,0,1,0] where the 1\u0026#39;s are at the positions of the consonants. Then this can be considered as binary digits representing the number 26. #+BEGIN_SRC Julia julia\u0026gt; function cvs(word) vs = indexin(collect(word),collect(\u0026#34;aeiou\u0026#34;)) vs2 = replace(x -\u0026gt; isnothing(x) ? 1 : 0,vs) return(sum(vs2 .* [16,8,4,2,1])) end #+END_SRC Now we simply walk through the words in ~all5.txt~ determining their values as we go, and keeping a running total: #+BEGIN_SRC Julia julia\u0026gt; wds = readlines(open(\u0026#34;all5.txt\u0026#34;)) julia\u0026gt; cv = zeros(Int16,32) julia\u0026gt; for w in wds c = cvs(w) cv[c+1] = cv[c+1]+1 end julia\u0026gt; hcat(0:31,cv) 32×2 Matrix{Int64}: 0 0 1 0 2 2 3 1 4 4 5 48 6 10 7 19 8 2 9 61 10 96 11 156 12 24 13 262 14 21 15 24 16 1 17 34 18 105 19 585 20 97 21 1514 22 432 23 1158 24 5 25 301 26 249 27 832 28 16 29 96 30 15 31 26 #+END_SRC We see that the most common patterns are 21 = 10101, and 23 = 10111. But what about some of the smaller values? #+BEGIN_SRC Julia julia\u0026gt; for w in wds if cvs(w) == 24 println(w) end end stoae wheee whooo xviii xxiii #+END_SRC Yes, there are some Roman numerals hanging about, and probably they should be removed.
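The encoding is easy to mirror in Python, which may make the bit pattern easier to see. This is a sketch only: the function name `cvs` is borrowed from the Julia code, while `pattern_counts` is a helper invented here, and as in the Julia version the letter y counts as a consonant.

```python
from collections import Counter

def cvs(word, vowels='aeiou'):
    # Read the word left to right, most significant bit first:
    # a consonant contributes 1, a vowel contributes 0.
    value = 0
    for ch in word:
        value = 2 * value + (0 if ch in vowels else 1)
    return value

def pattern_counts(words):
    # Tally how many words fall under each of the 32 possible patterns.
    return Counter(cvs(w) for w in words)

print(cvs('stoae'))                  # 24
print(format(cvs('stoae'), '05b'))   # 11000
```

Here \u0026#34;stoae\u0026#34; has consonants only in its first two positions, giving the pattern 11000, which is 24 and agrees with the list printed above.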
And one more, 30 = 11110: #+BEGIN_SRC Julia julia\u0026gt; for w in wds if cvs(w) == 30 println(w) end end chyme clxvi cycle hydra hydro lycra phyla rhyme schmo schwa style styli thyme thymi xxxvi #+END_SRC Again a few Roman numerals. These may need to be removed by hand. One way to do this is by using regular expressions again: #+BEGIN_SRC Bash $ grep -E \u0026#39;[xlcvi]{5}\u0026#39; all5.txt civic civil clvii clxii clxiv clxix clxvi lxvii villi xcvii xviii xxiii xxvii xxxii xxxiv xxxix xxxvi #+END_SRC and we see that we have 3 English words, and the rest Roman numerals. These can be deleted. * A new year (2022) :PROPERTIES: :EXPORT_FILE_NAME: new_year_2022 :EXPORT_DATE: 2022-01-01 :END: What does one do on the first day of a new year but write a blog post, and in it clearly delineate all plans for the coming year? Well, I\u0026#39;m doing the first part, but not the second, as I know that any plans will not be fulfilled - something /always/ gets in the way. You will notice a complete absence of posts last year after April. This was due to pressures both at work and outside, and left me with little mental energy to blog. I don\u0026#39;t know that this year will be much different. However, a few things to note (in no particular order): 1. I decided to upgrade my VPS, starting with my installation of [Nextcloud](https://nextcloud.com). I\u0026#39;ve been using Nextcloud as a replacement for Dropbox for years. But I managed to stuff-up an upgrade. While it was unavailable to me I used [Syncthing](https://syncthing.net) to keep my various computers in sync with each other, and I must say this works extremely well. So well, in fact, that it seems hardly worth going back to Nextcloud. And it seems that there are people who are ditching Nextcloud for simpler solutions: Syncthing for syncing, and something else for backups. [Rsync](https://rsync.samba.org) could be used here; I\u0026#39;ve also been recommended to look at [Duplicati](https://www.duplicati.com). 2. 
The upgrade of both Nextcloud and my VPS system (Ubuntu 18.04 to Ubuntu 20.04) both went awry, and I ended up with a non-working system. I could ssh into it, but none of the services would work. So I decided to ditch the lot, and start from scratch by re-imaging my VPS. This \u0026#34;scorched earth\u0026#34; approach meant at one stroke I got rid of years of accumulated rubbish: files and apps I\u0026#39;d downloaded, experimented and discarded; any number of old unused docker containers and images. (Although I had made an effort to clean up those.) And I\u0026#39;ve been slowly building everything back up again, with much external help in particular for managing [traefik](https://traefik.io), which I like very much. But it has a configuration which is tricky, at least for me. 3. I did manage to attach a USB drive to my home wireless router, and make that drive visible to both Linux and Windows - which took longer than it should have, as I kept misunderstanding descriptions and instructions. But it\u0026#39;s working now. So in one sense I have at least local backups. However, I also need \u0026#34;off-site\u0026#34; backups. 4. In my teaching this last year I used, as for previous years, a mixture of Excel and [Geogebra](https://www.geogebra.org) for my numerical methods class, and Excel with its [Solver](https://www.solver.com) add-in for my linear programming class. These students are all pre-service teachers, and so I use the software they will be most likely to encounter in their professional lives. I have come to quite like Excel, and my students even get to do a bit of VBA programming. (Well, I write the programs, and then they edit them slightly.) I have a love-hate relationship with Geogebra. It does many things well, but there\u0026#39;s always an annoying limit, or things you can\u0026#39;t change. I hope to write up about this. 
But here\u0026#39;s one: Geogebra\u0026#39;s default variable names for lists (if you don\u0026#39;t give them names yourself) are l1, l2, l3 and so on. But Geogebra uses a sans-serif font in which a lower-case L is indistinguishable from an upper-case I. And you can\u0026#39;t change the font. So if you\u0026#39;re seeing \u0026#34;l1\u0026#34; for the first time, you can\u0026#39;t distinguish the first character. This is a very poor GUI decision. And it annoys me a lot, because it would be trivial to fix: use an upper-case L instead, or allow users to change the font! 5. I taught for the first time a data analytics subject, based around [R](https://www.r-project.org), which I\u0026#39;d never before used. Well, there\u0026#39;s nothing like having to teach something to learn it quickly, and I learned it well enough to teach it to a beginning class, and also to enjoy it. Like all languages, R comes in for plenty of criticism, and much of its functionality can be managed now with Python, but R has been a sort of standard for a decade or more, and that alone is a very good reason for learning it. What\u0026#39;s more, it seems to be getting a new lease of life with the [tidyverse](https://www.tidyverse.org) suite of packages by[Hadley Wickham](https://en.wikipedia.org/wiki/Hadley_Wickham). And these come with excellent documentation. 6. I finished the year looking again at [bicentric polygons](https://en.wikipedia.org/wiki/Bicentric_polygon), which fascinate me. Some years ago, [Phil Todd](https://saltire.com/people.html), the creator of the [Saltire](https://saltire.com) geometry application, found a bicentric pentagon whose vertices were a subset of the vertices of a regular nonagon. You can see his [PDF file here](https://atcm.mathandtech.org/EP2015/invited/1.pdf). I was wondering if there are other bicentric polygons whose vertices are subsets of a regular \\(n\\)-gon (other than triangles or regularly-spaced vertices), and this led me on a bit of a hunt. 
Using a very inefficient program (and in Python), I found no other bicentric pentagons, and the only bicentric quadrilaterals were right-angled kites; that is, whose vertices are at \\[ (\\pm 1,0),\\quad (\\cos(x),\\pm\\sin(x)) \\] for \\(0\u0026lt;x\u0026lt;\\pi/2\\). This either means there are no others, or there are others I haven\u0026#39;t found. I don\u0026#39;t know. It would be nice to discover a symmetric but non-regular bicentric hexagon (vertices a subset of an \\(n\\)-gon, for \\(n\u0026gt;7\\)). So - software and mathematics - plenty going on! * A note on Steffensen\u0026#39;s method for solving equations :mathematics:computation: :PROPERTIES: :EXPORT_FILE_NAME: steffensen :EXPORT_DATE: 2021-04-30 :END: Steffensen\u0026#39;s method is based on Newton\u0026#39;s iteration for solving a non-linear equation \\(f(x)=0\\): \\[ x\\leftarrow x-\\frac{f(x)}{f\u0026#39;(x)} \\] Newton\u0026#39;s method can fail to work in a number of ways, but when it does work it displays /quadratic convergence/; the number of correct significant figures roughly doubling at each step. However, it also has the disadvantage of needing to compute the derivative as well as the function. This may be difficult for some functions. Steffensen\u0026#39;s idea was to use the quotient approximation of the derivative: \\[ f\u0026#39;(x)\\approx\\frac{f(x+h)-f(x)}{h} \\] when \\(h\\) is small, and since we are trying to solve \\(f(x)=0\\), we may assume that in the neighbourhood of the solution \\(f(x)\\) is itself small, so can be used for \\(h\\). This means we can write \\[ f\u0026#39;(x)\\approx\\frac{f(x+f(x))-f(x)}{f(x)} \\] which leads to the following version of Newton\u0026#39;s method: \\[ x\\leftarrow x-\\frac{f(x)^2}{f(x+f(x))-f(x)}. \\] This is a neat idea, and in fact when it works it converges almost as fast as Newton\u0026#39;s method. However, it is very sensitive to the starting value.
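The two iterations are short enough to sketch side by side. The code below is a minimal illustration in ordinary floating point, not a robust solver: the function names, the tolerance and the iteration cap are all assumptions made for the example, and the test function is a deliberately well-behaved one.

```python
import math

def newton(f, df, x, tol=1e-12, max_iter=50):
    # Newton: x <- x - f(x)/f'(x), which needs the derivative df.
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        x = x - fx / df(x)
    return x

def steffensen(f, x, tol=1e-12, max_iter=50):
    # Steffensen: replace f'(x) by (f(x + f(x)) - f(x)) / f(x),
    # so only values of f itself are needed.
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        denom = f(x + fx) - fx
        if denom == 0:          # cannot form the next step; give up
            break
        x = x - fx * fx / denom
    return x

f = lambda x: x * x - 2
df = lambda x: 2 * x
print(newton(f, df, 1.5), steffensen(f, 1.5), math.sqrt(2))
```

For this easy equation both iterations settle on the square root of 2 within a handful of steps from the starting value 1.5; the sketch says nothing about harder functions or poorer starting values.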
For example, suppose we want to find the value of \\(W(10)\\), where \\(W(x)\\) is Lambert\u0026#39;s \\(W\\) function; the inverse of \\(y=xe^x\\). Finding \\(W(10)\\) then means solving the equation \\[ xe^x-10=0. \\] Newton\u0026#39;s method uses the iteration \\[ x\\leftarrow x-\\frac{xe^x-10}{e^x(x+1)} \\] and with a positive starting value not too big will converge; the first 50 places of the solution are: \\[ 1.74552800274069938307430126487538991153528812908093 \\] Starting with \\(x=2\\) will produce over 1000 correct decimal places in 12 steps. If we apply Steffensen\u0026#39;s method starting with \\(x=2\\) we\u0026#39;ll see values that wobble about 1.9 for ages before converging to the wrong value. Newton\u0026#39;s method will work for almost any value (although the larger the initial value, the longer the iterations take to \u0026#34;settle down\u0026#34;); Steffensen\u0026#39;s method will only work when the initial value is close to 1.7. ** A slight improvement Using the \u0026#34;central\u0026#34; approximation of the derivative: \\[ f\u0026#39;(x)\\approx\\frac{f(x+h)-f(x-h)}{2h} \\] makes a considerable difference; this leads to the iteration \\[ x\\leftarrow x-\\frac{2f(x)^2}{f(x+f(x))-f(x-f(x))} \\] This does however require the computation of three function values, rather than just the original two. A slightly faster version of the above is \\[ x\\leftarrow x-\\frac{f(x)^2}{f(x+\\frac{1}{2}f(x))-f(x-\\frac{1}{2}f(x))}. \\] * Exploring Tanh-Sinh quadrature :mathematics:computation: :PROPERTIES: :EXPORT_FILE_NAME: tanh-sinh quadrature :EXPORT_DATE: 2021-04-30 :END: As is well known, tanh-sinh quadrature takes an integral \\[ \\int_{-1}^1f(x)dx \\] and uses the substitution \\[ x = g(t) = \\tanh\\left(\\frac{\\pi}{2}\\sinh t\\right) \\] to transform the integral into \\[ \\int_{-\\infty}^{\\infty}f(g(t))g\u0026#39;(t)dt.
\\] The reason this works so well is that the derivative \\(g\u0026#39;(t)\\) dies away at a /double exponential rate/; that is, at the rate of \\[ e^{-e^t} \\] In fact, \\[ g\u0026#39;(t)=\\frac{\\frac{\\pi}{2}\\cosh t}{\\cosh^2(\\frac{\\pi}{2}\\sinh t)} \\] and for example, \\(g\u0026#39;(10)\\approx 4.4\\times 10^{-15022}\\). To apply this technique, [David Bailey](https://www.davidhbailey.com/dhbpapers/dhb-tanh-sinh.pdf), one of the strongest proponents for it, recommends choosing first a small value \\(h\\), some integer \\(N\\) and a working precision so that \\(|g\u0026#39;(Nh)f(g(Nh))|\\) is less than the precision you want. So if you want 1000 decimal place accuracy, you need to be sure that your working precision allows you to determine that \\[ |g\u0026#39;(Nh)f(g(Nh))| \u0026lt; 10^{-1000}. \\] If you start with say \\(h=0.5\\) you can then halve this value at each step until you have approximations that agree to your precision. Bailey claims that \\(h=2^{-12}\\) is \u0026#34;sufficient to evaluate most integrals to 1000 place accuracy\u0026#34;. The computation then requires calculating the value of \\[ h\\sum_{j=-N}^N g\u0026#39;(hj)f(g(hj)). \\] (I\u0026#39;m using exactly the same notation as in Bailey\u0026#39;s description.) The elegant thing is that the nodes or abscissae (values at which the function is evaluated) are based on values of \\(hj\\), each step of which includes all values from the previous step. So at each stage we only have to compute the \u0026#34;new\u0026#34; values. For example, if \\(Nh=2\\), the first positive values of \\(hj\\) are \\[ 0,\\frac{1}{2},1,\\frac{3}{2},2 \\] and the next step will have \\[ 0,\\frac{1}{4},\\frac{1}{2},\\frac{3}{4},1,\\frac{5}{4},\\frac{3}{2},\\frac{7}{4},2 \\] of which the new values are \\[ \\frac{1}{4},\\frac{3}{4},\\frac{5}{4},\\frac{7}{4}.
\\] At the next step, the new values will be \\[ \\frac{1}{8},\\frac{3}{8},\\frac{5}{8},\\frac{7}{8},\\frac{9}{8},\\frac{11}{8},\\frac{13}{8},\\frac{15}{8} \\] and so on. In fact at each step, with \\(h=2^{-k}\\) we only have to perform computations at the values \\[ \\frac{1}{2^k},\\frac{3}{2^k},\\frac{5}{2^k},\\ldots,\\frac{2N-1}{2^k}. \\] Writing \\(j\\) for the index, we can express this sequence as \\[ h+2jh,\\quad j = 0,1,2,\\ldots,N-1 \\] and as the value of \\(h\\) halves each step, the value of \\(N\\) doubles each step. ** A test in Julia I decided to see how easy this might be by attempting to evaluate \\[ \\int_0^1\\frac{\\arctan x}{x}dx \\] This is a good test integral, as the integrand has a removable singularity at one end. By integrating the Taylor series term by term (and assuming convergence), we obtain: \\[ 1-\\frac{1}{3^2}+\\frac{1}{5^2}-\\frac{1}{7^2}+\\frac{1}{9^2}-\\cdots \\] and this particular sum is known as [Catalan\u0026#39;s constant](https://en.wikipedia.org/wiki/Catalan%27s_constant) and denoted \\(G\\). It is unknown whether \\(G\\) is irrational, let alone transcendental. Because the integral is equal to a known value, the results can be checked against published values including [one publication](http://www.gutenberg.org/ebooks/812) providing 1,500,000 decimal places. Over the interval \\([-1,1]\\) the integral transforms to \\[ \\int_{-1}^1\\frac{\\arctan((x+1)/2)}{x+1}dx \\] I decided to try for 1000 decimal places, and go up to \\(Nh=8\\), given that \\(g\u0026#39;(8)\\approx 2.5\\times 10^{-2030}\\). (This is overkill, of course.) We then need enough precision so that near the left endpoint, the function value is never given as -1.
For example, in Julia: #+BEGIN_SRC julia setprecision(4000) function g(t)::BigFloat bt = big(t) bp = big(pi) return tanh(bp/2*sinh(bt)) end function gd(t)::BigFloat bt = big(t) bp = big(pi) return bp/2*cosh(bt)/cosh(bp/2*sinh(bt))^2 end function at(x)::BigFloat y = big(x) return atan((y+1)/2)/(y+1) end #+END_SRC The trouble is that at this precision, the value of \\(g(-8)\\) is given as \\(-1.0\\), at which the function is undefined. If we increase the precision to 7000, we find that #+BEGIN_SRC julia setprecision(7000) using Printf @printf \u0026#34;%.8e\u0026#34; 1+g(-8) 5.33290917e-2034 #+END_SRC This small difference can\u0026#39;t be managed at only 4000 bits. Now we can have a crack at the first iteration of computation: #+BEGIN_SRC julia h = big\u0026#34;0.5\u0026#34; N = 16 inds = [k*h for k=1:N] xs = [g(z) for z in inds] ws = [gd(z) for z in inds] ts = h*gd(0)*at(g(0)) ts += h*sum(ws[i]*at(xs[i]) for i = 1:N) ts += h*sum(ws[i]*at(-xs[i]) for i = 1:N) @printf \u0026#34;%.50e\u0026#34; ts 9.15969525022017573265491207994328001754668713901325e-01 #+END_SRC and this is correct to about 6 decimal places. We are here exploiting the fact that the weight function \\(g\u0026#39;(t)\\) is even, and the substitution function \\(g(t)\\) is odd, so we only have to compute values for positive \\(t\\). The promise of tanh-sinh quadrature is that the accuracy roughly doubles for each halving of the step size.
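The structure of the one-level sum is perhaps easier to see without the high-precision machinery. Here is a rough Python sketch in ordinary double precision; it cannot come anywhere near the hundreds of digits sought here, and the cut-off at t = 3 (rather than 8) is an assumption made so that g(t) stays strictly inside the interval from -1 to 1 in floating point. The function names follow the Julia code; `tanh_sinh` is introduced only for this example.

```python
import math

def g(t):
    # The substitution x = tanh((pi/2) sinh t).
    return math.tanh(math.pi / 2 * math.sinh(t))

def gd(t):
    # Its derivative, which acts as the weight function.
    return math.pi / 2 * math.cosh(t) / math.cosh(math.pi / 2 * math.sinh(t)) ** 2

def at(x):
    # The test integrand arctan(x)/x, moved from [0,1] to [-1,1].
    return math.atan((x + 1) / 2) / (x + 1)

def tanh_sinh(f, h, N):
    # g is odd and gd is even, so only t >= 0 is evaluated
    # and each weight is applied to f(x) + f(-x).
    total = gd(0.0) * f(g(0.0))
    for j in range(1, N + 1):
        x, w = g(j * h), gd(j * h)
        total += w * (f(x) + f(-x))
    return h * total

CATALAN = 0.915965594177219   # Catalan's constant, rounded to double precision
for h, N in [(1 / 2, 6), (1 / 4, 12), (1 / 8, 24)]:
    print(h, abs(tanh_sinh(at, h, N) - CATALAN))
```

Each halving of h in this loop doubles N so that the product Nh stays fixed at 3. In double precision the error soon hits the floor set by rounding and by the cut-off, so the doubling of correct digits can only really be observed with the arbitrary-precision arithmetic used in this post.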
We can test this, by repeatedly iterating the current value with new ordinates, and compare each to a downloaded value of a few thousand decimal places of the constant: #+BEGIN_SRC julia for j = 2:10 h = h/2 inds = [h+2*k*h for k in 0:N-1] xs = [g(z) for z in inds] ws = [gd(z) for z in inds] ts = ts/2 + h*sum(ws[i]*at(xs[i]) for i in 1:N) ts = ts + h*sum(ws[i]*at(-xs[i]) for i in 1:N) print(j,\u0026#34; \u0026#34;) @printf \u0026#34;%.8e\u0026#34; abs(ts-catalan) println() N = N*2 end 2 6.01994061e-10 3 6.03834702e-20 4 8.07587315e-38 5 1.15722093e-74 6 9.05835440e-148 7 7.95770023e-294 8 2.44238219e-585 9 4.47198995e-1167 10 1.24233864e-1355 #+END_SRC At this stage we have \\(h=0.5^{10}\\) and a result that is well over our intended accuracy of 1000 decimal places. If we were instead to compare the absolute differences of successive results, we would see: #+BEGIN_SRC julia 2 3.93144679e-06 3 6.01994061e-10 4 6.03834702e-20 5 8.07587315e-38 6 1.15722093e-74 7 9.05835440e-148 8 7.95770023e-294 9 2.44238219e-585 10 4.47198995e-1167 #+END_SRC and we could infer that we have at least 1167 decimal places accuracy. (If we were to go one more step, the difference becomes about \\(1.004\\times 10^{-2304}\\) which is the limit of the set precision.) * High precision quadrature with Clenshaw-Curtis :mathematics:computation: :PROPERTIES: :EXPORT_FILE_NAME: high_precision_clenshaw_curtis :EXPORT_DATE: 2021-04-21 :END: An article by Bailey, Jeyabalan and Li, \u0026#34;A comparison of three high-precision quadrature schemes\u0026#34;, and available online [here](https://www.davidhbailey.com/dhbpapers/quadrature-em.pdf), compares Gauss-Legendre quadrature, tanh-sinh quadrature, and a rule where the nodes and weights are given by the error function and its integrand respectively.
However, [Nick Trefethen](https://en.wikipedia.org/wiki/Nick_Trefethen) of Oxford has [shown experimentally](https://people.maths.ox.ac.uk/trefethen/cctalk.pdf) that Clenshaw-Curtis quadrature is generally no worse than Gaussian quadrature, with the added advantage that the nodes and weights are easier to obtain. I\u0026#39;m using here the slightly sloppy convention of taking \u0026#34;Clenshaw-Curtis quadrature\u0026#34; to be the generic name for any integration rule over the interval \\([-1,1]\\) whose nodes are given by cosine values (more particularly, the zeros of Chebyshev polynomials). However, rules very similar to Clenshaw-Curtis were described by the Hungarian mathematician [Lipót Fejér](https://en.wikipedia.org/wiki/Lip%C3%B3t_Fej%C3%A9r) in the 1930s; some authors like to be very clear about the distinctions between the \u0026#34;Fejér I\u0026#34;, \u0026#34;Fejér II\u0026#34;, and \u0026#34;Clenshaw-Curtis\u0026#34; rules, all of which have very slightly different nodes. ** The quadrature rule The particular quadrature rule may be considered to be an \u0026#34;open rule\u0026#34; in that, like Gauss-Legendre quadrature, it doesn\u0026#39;t use the endpoints. An \\(n\\)-th order rule will have the nodes \\[ x_k = \\cos\\left(\\frac{2k+1}{2n}\\pi\\right), k= 0,1,\\ldots,n-1. \\] The idea is that we create an interpolating polynomial \\(p(x)\\) through the points \\((x_k,f(x_k))\\), and use that polynomial to obtain the integral approximation \\[ \\int_{-1}^1f(x)dx\\approx\\int_{-1}^1p(x)dx. \\] However, such a polynomial can be written in Lagrange form as \\[ p(x) = \\sum_{k=0}^{n-1}f(x_k)p_k(x) \\] with \\[ p_k(x)=\\frac{\\prod(x-x_i)}{\\prod (x_k-x_i)} \\] where the products are taken over \\(0\\le i\\le n-1\\) excluding \\(k\\). This means that the integral approximation can be written as \\[ \\int_{-1}^1\\sum_{k=0}^{n-1}f(x_k)p_k(x)dx=\\sum_{k=0}^{n-1}\\left(\\int_{-1}^1p_k(x)dx\\right)f(x_k).
\\] Thus writing \\[ w_k = \\int_{-1}^1p_k(x)dx \\] we have an integration rule \\[ \\int_{-1}^1f(x)dx\\approx\\sum w_kf(x_k). \\] Alternatively, as for a Newton-Cotes rule, we can determine the weights as being the unique values for which \\[ \\int_{-1}^1x^mdx = \\sum w_k(x_k)^m \\] for all \\(m=0,1,\\ldots,n-1\\). The integral \\(I_m\\) on the left is equal to 0 for odd \\(m\\) and equal to \\(2/(m+1)\\) for even \\(m\\), so we have an \\(n\\times n\\) linear system consisting of the equations \\[ w_0x_0^m+w_1x_1^m+\\cdots+w_{n-1}x_{n-1}^m=I_m \\] For example, here is how the weights could be constructed in Python: #+BEGIN_SRC python import numpy as np N = 9 xs = [np.cos((2*(N-k)-1)*np.pi/(2*N)) for k in range(N)] A = np.array([[xs[i]**j for i in range(N)] for j in range(N)]) b = [(1+(-1)**k)/(k+1) for k in range(N)] ws = np.linalg.solve(A,b) ws[:,None] array([[0.05273665], [0.17918871], [0.26403722], [0.33084518], [0.34638448], [0.33084518], [0.26403722], [0.17918871], [0.05273665]]) #+END_SRC Then we can approximate an integral, say \\[ \\int_{-1}^1e^{-x^2}dx \\] by #+BEGIN_SRC python f = lambda x: np.exp(-x*x) sum(ws * np.vectorize(f)(xs)) 1.4936477751634403 #+END_SRC and this is correct to six decimal places. ** Use of the DCT The above method for obtaining the weights is conceptually easy, but computationally expensive. A far neater method, described in [an article by Alvise Sommariva](https://www.sciencedirect.com/science/article/pii/S089812211200689X), is to use the discrete cosine transform - in particular the DCT III, which is available in most standard implementations as ~idct~. Define a vector \\(m\\) (called by Sommariva the weighted /modified moments/) of length \\(N\\) by \\begin{aligned} m_0\u0026amp;=\\sqrt{2}\\\\ m_k\u0026amp;=2/(1-k^2)\\text{ if \\(k\\ge 2\\) is even}\\\\ m_k\u0026amp;=0\\text{ if \\(k\\) is odd} \\end{aligned} Then the weights are given by the orthonormal DCT III (the inverse DCT) of \\(m\\), scaled by \\(\\sqrt{2/N}\\). 
Again in Python: #+BEGIN_SRC python from scipy.fftpack import idct m = [np.sqrt(2),0]+[(1+(-1)**k)/(1-k**2) for k in range(2,N)] ws = np.sqrt(2/N)*idct(m,norm=\u0026#39;ortho\u0026#39;) ws[:,None] array([[0.05273665], [0.17918871], [0.26403722], [0.33084518], [0.34638448], [0.33084518], [0.26403722], [0.17918871], [0.05273665]]) #+END_SRC which agrees with the previous result. ** Arbitrary precision Our choice here (in Python) will be the [mpmath](https://mpmath.org) library, in spite of some [criticisms](https://fredrikj.net/blog/2018/11/announcing-python-flint-0-2/) against it. Our use of it though will be confined to simple arithmetic (sometimes of matrices) rather than more complicated routines such as quadrature. Note also that the criticisms were more in the nature of comparisons; the writer is in fact the principal author of mpmath. As a start, here\u0026#39;s how we might develop the above nodes and weights for 30 decimal places. First the nodes: #+BEGIN_SRC python from mpmath import mp mp.dps = 30 xs = [mp.cos((2*(N-k)-1)*mp.pi/(2*N)) for k in range(N)] for i in range(N): print(xs[i]) -0.984807753012208059366743024589 -0.866025403784438646763723170753 -0.642787609686539326322643409907 -0.342020143325668733044099614682 8.47842766036889964395870146939e-32 0.342020143325668733044099614682 0.642787609686539326322643409907 0.866025403784438646763723170753 0.984807753012208059366743024589 #+END_SRC Next the weights. 
As mpmath doesn\u0026#39;t have its own DCT routine, we can construct a DCT matrix and multiply by it: #+BEGIN_SRC python dct = mp.matrix(N,N) dct[:,0] = mp.sqrt(1/N) for i in range(N): for j in range(1,N): dct[i,j] = mp.sqrt(2/N)*mp.cos(mp.pi/2/N*j*(2*i+1)) m = mp.matrix(N,1) m[0] = mp.sqrt(2) for k in range(2,N,2): m[k] = mp.mpf(2)/(1-k**2) ws = dct*m*mp.sqrt(2/N) ws matrix( [[\u0026#39;0.0527366499099067816401650387891\u0026#39;], [\u0026#39;0.179188712522045851600122685138\u0026#39;], [\u0026#39;0.264037222541004397180813776173\u0026#39;], [\u0026#39;0.330845175168136422780278417119\u0026#39;], [\u0026#39;0.346384479717813038086088934304\u0026#39;], [\u0026#39;0.330845175168136422780278417119\u0026#39;], [\u0026#39;0.264037222541004397180813776173\u0026#39;], [\u0026#39;0.179188712522045851600122685138\u0026#39;], [\u0026#39;0.0527366499099067816401650387892\u0026#39;]]) #+END_SRC Given the nodes and the weights, evaluating the integral is straightforward: #+BEGIN_SRC python f = lambda x: mp.exp(-x*x) fs = [f(x) for x in xs] ia = sum([a*b for a,b in zip(ws,fs)]) print(ia) 1.49364777516344031419344222023 #+END_SRC This is no more accurate than the previous result, as we are still using only 9 nodes and weights. But suppose we increase both the precision and the number of nodes: #+BEGIN_SRC python mp.dps = 100 N = 128 #+END_SRC If we carry out all the above computations with these values, we can check the accuracy of the approximate result against the exact value which is \\(\\sqrt{\\pi}\\phantom{.}\\text{erf}(1)\\): #+BEGIN_SRC python ie = mp.sqrt(mp.pi)*mp.erf(1) mp.nprint(abs(ia-ie),10) 2.857468478e-101 #+END_SRC and we see that we have accuracy to 100 decimal places. 
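As a double-precision cross-check of the nine-point weights above, the scaled inverse DCT of the moment vector can also be written out as an explicit cosine sum. The sketch below uses only the standard library; the function names `fejer1` and `integrate` are illustrative choices, not part of any library, and the sum is simply the DCT III expanded term by term.

```python
import math

def fejer1(n):
    # Nodes are the zeros of the Chebyshev polynomial T_n; the weights
    # are the scaled inverse DCT of the moments, written as a cosine sum.
    nodes, weights = [], []
    for k in range(n):
        theta = (2 * k + 1) * math.pi / (2 * n)
        s = sum(math.cos(2 * j * theta) / (4 * j * j - 1)
                for j in range(1, n // 2 + 1))
        nodes.append(math.cos(theta))
        weights.append(2 / n * (1 - 2 * s))
    return nodes, weights

def integrate(f, n):
    # Weighted sum of function values at the nodes.
    xs, ws = fejer1(n)
    return sum(w * f(x) for x, w in zip(xs, ws))

approx = integrate(lambda x: math.exp(-x * x), 9)
exact = math.sqrt(math.pi) * math.erf(1)
print(approx, abs(approx - exact))
```

Replacing `math` by `mp` throughout should give the same weights at whatever precision `mp.dps` is set to, without building the full DCT matrix.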
** Onwards and upwards With the current integral \\[ \\int_{-1}^1e^{-x^2}dx \\] here are some values with different precisions and values of N, starting with the two we have already: #+ATTR_CSS: :width 50% | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | | dps | N | absolute difference | |------+-----+---------------------| | 30 | 9 | 4.904614138e-7 | | 100 | 128 | 2.857468478e-101 | | 500 | 256 | 8.262799923e-298 | | 1000 | 512 | 8.033083996e-667 | Note that the code above is in no way optimized; there is no hint of a fast DCT, for example. Thus for large values of N and for high precision there is a time-cost; the last computation in the table took 44 seconds. (I have an IBM ThinkPad X1 Carbon 3rd Generation laptop running Arch Linux. It\u0026#39;s several years old, but it\u0026#39;s proved to be a terrific workhorse.) But this shows that a Clenshaw-Curtis type quadrature approach is quite appropriate for high precision integration. * The circumference of an ellipse :mathematics:computation: :PROPERTIES: :EXPORT_FILE_NAME: circumference_ellipse :EXPORT_DATE: 2021-04-10 :END: *Note:* This blog post is mainly computational, with a hint of proof-oriented mathematics here and there. For a more in-depth analysis, read the excellent article \u0026#34;Gauss, Landen, Ramanujan, the Arithmetic-Geometric Mean, Ellipses, pi, and the Ladies Diary\u0026#34; by Gert Almkvist and Bruce Berndt, in /The American Mathematical Monthly/, vol 95 no. 7 (August-September 1988), pages 585-608, and happily made available online for free at [[https://www.maa.org/sites/default/files/pdf/upload_library/22/Ford/Almkvist-Berndt585-608.pdf]] This article was the 1989 winner of the MAA Halmos-Ford award \u0026#34;For outstanding expository papers in /The American Mathematical Monthly/\u0026#34; and you should certainly read it. 
** Introduction It\u0026#39;s one of the delights or annoyances of mathematics, whichever way you look at it, that there\u0026#39;s no simple formula for the circumference of an ellipse comparable to \\(2\\pi r\\) for a circle. Indeed, for an ellipse with semi-axes \\(a\\) and \\(b\\), the circumference can be expressed as the integral \\[ 4\\int_0^{\\pi/2}\\sqrt{a^2\\cos^2\\theta + b^2\\sin^2\\theta}\\,d\\theta \\] which is one of a class of integrals called /elliptic integrals/, and which cannot be expressed using algebraic or standard transcendental functions. However, it turns out that there are ways of very quickly computing the circumference of an ellipse to any desired accuracy, using methods which originate with Carl Friedrich Gauss. ** The arithmetic-geometric mean Given \\(a\u0026gt;b\u0026gt;0\\), we can define two sequences by: \\[ a_0 = a,\\qquad a_{k+1}=(a_k+b_k)/2 \\] and \\[ b_0=b,\\qquad b_{k+1}=\\sqrt{a_kb_k}. \\] Thus the \\(a\\) values are the arithmetic means of the previous pair; the \\(b\\) values the geometric mean of the pair. Since \\[ b\u0026lt;\\sqrt{ab}\u0026lt;\\frac{a+b}{2}\u0026lt;a \\] we have \\(a_{k+1}\u0026lt;a_k\\) and \\(b_{k+1}\u0026gt;b_k\\), so the \\(a\\) values are decreasing and bounded below, and the \\(b\\) values are increasing and bounded above, and hence both sequences converge. Also, if \\(c_k=\\sqrt{a_k^2-b_k^2}\\) then (see Almkvist \\\u0026amp; Berndt pp 587-588): \\begin{aligned} c_{k+1}\u0026amp;=\\sqrt{a_{k+1}^2-b_{k+1}^2}\\\\ \u0026amp;=\\sqrt{\\frac{1}{4}(a_k+b_k)^2-a_kb_k}\\\\ \u0026amp;=\\frac{1}{2}(a_k-b_k)\\\\ \u0026amp;=\\frac{c_k^2}{4a_{k+1}}\\\\ \u0026amp;\u0026lt;\\frac{c_k^2}{4M(a,b)}. \\end{aligned} This shows that not only do the sequences converge to the same limit, but that the sequences converge /quadratically/; each iteration roughly doubles the number of correct digits. 
The common limit is called the /arithmetic-geometric mean/ of \\(a\\) and \\(b\\) and will be denoted here as \\(M(a,b)\\). To give an indication of this speed, in Python: #+BEGIN_SRC python from mpmath import mp mp.dps = 50 a,b = mp.mpf(\u0026#39;3\u0026#39;), mp.mpf(\u0026#39;2\u0026#39;) for i in range(10): a,b = (a+b)/2, mp.sqrt(a*b) print(\u0026#39;{0:52s} {1:52s}\u0026#39;.format(str(a),str(b))) 2.5 2.4494897427831780981972840747058913919659474806567 2.4747448713915890490986420373529456959829737403283 2.4746160019198827700554874647235766528956885806854 2.4746804366557359095770647510382611744393311605069 2.474680435816873015671798992747485357556612143505 2.4746804362363044626244318718928732659979716520059 2.4746804362363044625888873352899297125054403531594 2.4746804362363044626066596035914014892517060025827 2.4746804362363044626066596035914014892516421855508 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 2.4746804362363044626066596035914014892516740940667 #+END_SRC You see that we have reached the limit of precision in six steps. To better demonstrate the speed of convergence, use greater precision and display the difference \\(a_k-b_k\\): #+BEGIN_SRC python mp.dps = 2000 a,b = mp.mpf(\u0026#39;3\u0026#39;), mp.mpf(\u0026#39;2\u0026#39;) for i in range(10): a,b = (a+b)/2, mp.sqrt(a*b) mp.nprint(a-b,10) 0.05051025722 0.0001288694717 8.388628939e-10 3.55445366e-20 6.381703188e-41 2.057141145e-82 2.137563719e-165 2.307963982e-331 2.690598786e-663 3.656695286e-1327 #+END_SRC You can see that the precision does indeed roughly double at each step. 
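If mpmath is not to hand, the same iteration can be sketched with the standard library `decimal` module. The function name `agm`, the ten guard digits and the stopping tolerance are all choices made for this illustration rather than anything taken from the post.

```python
from decimal import Decimal, localcontext

def agm(a, b, digits=50):
    # Iterate arithmetic and geometric means until they agree to
    # the requested number of decimal places.
    with localcontext() as ctx:
        ctx.prec = digits + 10          # a few guard digits
        a, b = Decimal(a), Decimal(b)
        tol = Decimal(10) ** (-digits)
        steps = 0
        while abs(a - b) > tol:
            a, b = (a + b) / 2, (a * b).sqrt()
            steps += 1
        return a, steps

value, steps = agm(3, 2)
print(steps)
print(value)
```

Because the number of correct digits roughly doubles at each step, asking for twice as many digits should cost only one more pass through the loop.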
** Elliptic integrals and the AGM This integral: \\[ K(k) = \\int_0^{\\pi/2}\\frac{1}{\\sqrt{1-k^2\\sin^2\\theta}}d\\theta \\] is called a /complete elliptic integral of the first kind/. If \\[ k^2=1-\\frac{b^2}{a^2} \\] then \\[ \\int_0^{\\pi/2}\\frac{1}{\\sqrt{1-k^2\\sin^2\\theta}} d\\theta = a\\int_0^{\\pi/2} \\frac{1}{\\sqrt{a^2\\cos^2\\theta +b^2\\sin^2\\theta}} d\\theta \\] and the integral on the right is denoted \\(I(a,b)\\). Gauss observed (and proved) that \\[ I(a,b)=\\frac{\\pi}{2M(a,b)}. \\] This is in fact equivalent to the assertion that \\[ K(x) = \\frac{\\pi}{2M(1-x,1+x)}. \\] Gauss started by assuming that \\(M(1-x,1+x)\\) was an even function, so could be expressed as a power series in even powers of \\(x\\). He then compared the power series representing \\(K(x)\\) with the integral computed by expanding the integrand as a power series using the binomial theorem and integrating term by term, to obtain \\[ \\frac{1}{M(1-x,1+x)}=\\frac{2}{\\pi}K(x)=1+\\left(\\frac{1}{2}\\right)^{2}x^2+ \\left(\\frac{1\\cdot 3}{2\\cdot 4}\\right)^{2}x^4+ \\left(\\frac{1\\cdot 3\\cdot 5}{2\\cdot 4\\cdot 6}\\right)^{2}x^6+\\cdots \\] and this power series can be written as \\[ \\sum_{n=0}^{\\infty}\\frac{(2n)!^2}{2^{4n}(n!)^4}x^{2n}. \\] Another proof, apparently originally due to Newman, uses first the substitution \\(t=b\\tan\\theta\\) to rewrite the integral: \\[ I(a,b)=\\int_0^{\\pi/2}\\frac{1}{\\sqrt{a^2\\cos^2\\theta+b^2\\sin^2\\theta}} d\\theta = \\frac{1}{2}\\int^{\\infty}_{-\\infty}\\frac{1}{\\sqrt{(a^2+t^2)(b^2+t^2)}} dt \\] and then makes the substitution \\[ u = \\frac{1}{2}\\left(t-\\frac{ab}{t}\\right) \\] which produces (after \u0026#34;some care\u0026#34;, according to the Borwein brothers in their book /Pi and the AGM/): \\[ I(a,b) = \\frac{1}{2}\\int^{\\infty}_{-\\infty}\\frac{1}{\\sqrt{(\\left(\\frac{a+b}{2}\\right)^2+u^2)(ab+u^2)}} du =I(\\frac{a+b}{2},\\sqrt{ab}) = I(a_1,b_1). 
\\] Continuing this process we see that \\(I(a,b) = I(a_k,b_k)\\) for any \\(k\\), and taking the limit, that \\(I(a,b)=I(M,M)\\). Finally \\[ I(M,M) = \\int_0^{\\pi/2}\\frac{1}{\\sqrt{M^2\\cos^2\\theta+M^2\\sin^2\\theta}} d\\theta = \\int_0^{\\pi/2}\\frac{1}{M} d\\theta = \\frac{\\pi}{2M}. \\] So we now know that the complete elliptic integral of the first kind is computable by means of the AGM. But what about the complete elliptic integral of the /second/ kind? ** Complete elliptic integrals of the second kind These are defined as \\begin{aligned} E(k)\u0026amp;=\\int_0^{\\pi/2}\\sqrt{1-k^2\\sin^2\\theta} d\\theta\\\\ \u0026amp;=\\int_0^1\\sqrt{\\frac{1-k^2t^2}{1-t^2}}\\,dt. \\end{aligned} *Note:* An alternative convention is to write: \\begin{aligned} E(m)\u0026amp;=\\int_0^{\\pi/2}\\sqrt{1-m\\sin^2\\theta} d\\theta\\\\ \u0026amp;=\\int_0^1\\sqrt{\\frac{1-mt^2}{1-t^2}}\\,dt. \\end{aligned} This elliptic integral is the one we want, since the perimeter of an ellipse given in cartesion form as \\[ \\frac{x^2}{a^2}+\\frac{y^2}{b^2}=1 \\] is equal to \\[ 4aE\\left(\\sqrt{1-\\frac{b^2}{a^2}}\\right) \\] (using the first definition). This can be written as \\[ 4aE(e) \\] where \\(e\\) is the eccentricity of the ellipse. The computation of elliptic integrals goes back as far as Euler, taking in Gauss and then Legendre along the way. [Adrien-Marie Legendre](https://en.wikipedia.org/wiki/Adrien-Marie_Legendre) (1752-1833) is better known now than he was in his lifetime; his contributions to elliptic integrals are now seen as fundamental, and in paving the way for the greater work of Abel and Jacobi. In particular, Legendre showed that \\[ \\frac{K(k)-E(k)}{K(k)}=\\frac{1}{2}(c_0^2+2c_1^2+4c_2^2+\\cdots + 2^nc_n^2+\\cdots ) \\] where \\[ c_n^2=a_n^2-b_n^2. 
\\] The above equation can be rewritten to provide a fast-converging series for \\(E(k)\\): \\begin{aligned} E(k)\u0026amp;=K(k)(1-\\frac{1}{2}(c_0^2+2c_1^2+4c_2^2+\\cdots + 2^nc_n^2+\\cdots))\\\\ \u0026amp;=\\frac{\\pi}{2M(a,b)}(1-\\frac{1}{2}(c_0^2+2c_1^2+4c_2^2+\\cdots + 2^nc_n^2+\\cdots)). \\end{aligned} In the sequences for the AGM (started here from \\(a_0=1\\) and \\(b_0=\\sqrt{1-k^2}\\)), let \\(M(a,b)\\) be approximated by \\(a_n\\). This produces a sequence that converges very quickly to \\(E(k)\\): \\begin{aligned} e_0\u0026amp;=\\frac{\\pi}{2a}(1-\\frac{1}{2}c_0^2)\\\\ e_1\u0026amp;=\\frac{\\pi}{2a_1}(1-\\frac{1}{2}(c_0^2+2c_1^2))\\\\ e_2\u0026amp;=\\frac{\\pi}{2a_2}(1-\\frac{1}{2}(c_0^2+2c_1^2+4c_2^2)) \\end{aligned} For an ellipse with semi-axes \\(a\\) and \\(b\\), this can easily be managed recursively by setting \\[ a_0=a,\\qquad b_0=b,\\qquad p_0=1,\\qquad s_0=a^2-b^2 \\] and then iterating by \\begin{aligned} a_{k+1}\u0026amp;=(a_k+b_k)/2\\\\ b_{k+1}\u0026amp;=\\sqrt{a_kb_k}\\\\ p_{k+1}\u0026amp;=2p_k\\\\ s_{k+1}\u0026amp;=s_k+p_{k+1}(a_{k+1}^2-b_{k+1}^2) \\end{aligned} Then the values \\[ e_k = \\frac{2\\pi}{a_k}(a_0^2-\\frac{1}{2}s_k) \\] approach the perimeter of the ellipse. 
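As a sanity check on the recursion, here is a double-precision sketch that compares it with a direct midpoint-rule evaluation of the perimeter integral. The names `perimeter_agm` and `perimeter_quad`, the tolerance and the number of midpoints are assumptions made for this illustration. The term for each step is computed as half the difference of the previous pair, squared, using the identity for the c values derived earlier, which avoids subtracting two nearly equal squares.

```python
import math

def perimeter_agm(a, b, tol=1e-15):
    # AGM recursion for the perimeter of an ellipse with semi-axes a, b.
    a0, p, s = a, 1.0, a * a - b * b
    while abs(a - b) > tol * a:
        c = (a - b) / 2                  # next c value, from the current pair
        a, b = (a + b) / 2, math.sqrt(a * b)
        p *= 2
        s += p * c * c
    return 2 * math.pi / a * (a0 * a0 - s / 2)

def perimeter_quad(a, b, n=2000):
    # Midpoint rule applied to the defining integral over a quarter period.
    h = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += math.sqrt((a * math.cos(t)) ** 2 + (b * math.sin(t)) ** 2)
    return 4 * h * total

print(perimeter_agm(3, 2))
print(perimeter_quad(3, 2))
```

For a circle the loop never runs and the function returns the circumference directly.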
We can demonstrate this in Python for \\(a,b=3,2\\), again using the multiprecision library mpmath: #+BEGIN_SRC python mp.dps = 100 a = mp.mpf(3) b = mp.mpf(2) p = 1 s = a**2-b**2 a0 = a e = 2*mp.pi/a*(a0*a0-1/2*s) for i in range(11): a, b, p = (a+b)/2, mp.sqrt(a*b), p*2 s += p*(a**2-b**2) print(2*mp.pi/a*(a0*a0-1/2*s)) 15.70796326794896619231321691639751442098584699687552910487472296153908203143104499314017412671058534 15.86502654157467714117218441586859215641446367453641805450510958042565274839788072998452444741042018 15.86543958660157016021207778974106592713197464956329054113604818310136964968401110893643543265751673 15.86543958929058979121772312468572564968472056328417484804713380278017532700185422580941088121856672 15.86543958929058979133166302778307249672987827943566144626073834574026314097232621543195947183210587 15.86543958929058979133166302778307249673008284832650068966726311774248223910968899525488214058871233 15.86543958929058979133166302778307249673008284832650068966726311774248223910968899591430967903912196 15.865439589290589791331663027783072496730082848326500689667263117742482239109688995914309679039122 15.86543958929058979133166302778307249673008284832650068966726311774248223910968899591430967903912207 15.86543958929058979133166302778307249673008284832650068966726311774248223910968899591430967903912222 15.86543958929058979133166302778307249673008284832650068966726311774248223910968899591430967903912252 #+END_SRC We see that we have reached the limits of this precision very quickly. 
And as above, we can look instead at the differences between successive approximations: #+BEGIN_SRC python mp.dps=2000 a = mp.mpf(3) b = mp.mpf(2) p = 1 s = a**2-b**2 a0 = a e = 2*mp.pi/a*(a0*a0-1/2*s) for i in range(11): a, b, p = (a+b)/2, mp.sqrt(a*b), p*2 s += p*(a**2-b**2) e1 = 2*mp.pi/a*(a0*a0-1/2*s) mp.nprint(e1-e,10) e = e1 2.094395102 0.1570632736 0.0004130450269 2.689019631e-9 1.139399031e-19 2.045688908e-40 6.594275385e-82 6.852074221e-165 7.398301331e-331 8.624857552e-663 1.172173128e-1326 #+END_SRC The convergence is seen to be quadratic. ** Using Excel It might seem utterly regressive to use Excel - or indeed, any spreadsheet - for computations such as this, but in fact Excel can be used very easily for recursive computations, as long as you\u0026#39;re happy to use only IEEE double precision. In Cells A1-E2, enter: | | A | B | C | D | E | | 1 | ~3~ | ~2~ | ~1~ | ~=A1^2-B1^2~ | | | 2 | ~=(A1+B1)/2~ | ~=SQRT(A1*B1)~ | ~=2*C1~ | ~=D1+C2*(A2^2-B2^2)~ | ~=2*PI()/A2*($A$1^2-1/2*D2)~ | and then copy A2-E2 down a few rows. Because of rounding errors, if you show 14 decimal places in column E, they\u0026#39;ll never quite settle down. But in fact you reach the limits of double precision by row 5. * Voting power (7): Quarreling voters :voting:algebra:python: :PROPERTIES: :EXPORT_FILE_NAME: voting_power_quarreling :EXPORT_DATE: 2021-01-24 :END: In all the previous discussions of voting power, we have assumed that all winning coalitions are equally likely. But in practice that is not necessarily the case. Two or more voters may be opposed on so many issues that they would never vote the same way on any issues: such a pair of voters may be said to be /quarrelling/. To see how this may make a difference, consider the voting game \\[ [51; 49,48,3] \\] The winning coalitions for which a party is critical are \\[ [49,48],\\\\; [49,3],\\\\; [48,3] \\] and since each party is critical to the same number of winning coalitions, the Banzhaf indices are equal. 
But suppose that the two largest parties are quarrelling, and so the only winning coalitions are then \\[ [49,3],\\\\; [48,3]. \\] This means that the smaller party has /twice/ the power of each of the larger ones! We can set this up in Sage, using polynomial rings, with an ideal that represents the quarrel. In this case: #+BEGIN_SRC python sage: q = 51 sage: w = [49,48,3] sage: R.\u0026lt;x,y,z\u0026gt; = PolynomialRing(QQ) sage: qu = R.ideal(x*y) sage: p = ((1+x^49)*(1+y^48)*(1+z^3)).reduce(qu) sage: p #+END_SRC \\[ x^{49}z^3+y^{48}z^3+x^{49}+y^{48}+z^3+1 \\] From this polynomial, we choose the monomials whose degree sums are not less than the quota: #+BEGIN_SRC python sage: wcs = [m.degrees() for m in p.monomials() if sum(m.degrees()) \u0026gt;= q] sage: wcs #+END_SRC \\[ [(49,0,3),(0,48,3)] \\] To determine the initial (not normalized) Banzhaf values, we traverse this list of tuples, checking in each tuple if the value is critical to its coalition: #+BEGIN_SRC python sage: beta = [0,0,0] sage: for c in wcs: sc = sum(c) for i in range(3): if sc-c[i] \u0026lt; q: beta[i] += 1 sage: beta #+END_SRC \\[ [1,1,2] \\] or \\[ [0.25, 0.25, 0.5] \\] for normalized values. ** Generalizing Of course the above can be generalized to any number of voters, and any number of quarrelling pairs. For example, consider the Australian Senate, of which the current party representation is: #+ATTR_CSS: :width 50% | Alignment | Party | Seats | |------------+-----------------+-------| | Government | Liberal | 31 | | | National | 5 | |------------+-----------------+-------| | Opposition | Labor | 26 | |------------+-----------------+-------| | Crossbench | Greens | 9 | | | One Nation | 2 | | | Centre Alliance | 1 | | | Lambie Alliance | 1 | | | Patrick Team | 1 | Although the current government consists of two separate parties, they act in effect as one party with a weight of 36. (This is known formally as the \u0026#34;Liberal-National Coalition\u0026#34;.) 
With no quarrels then, we have \\[ [39;36,26,9,2,1,1,1] \\] of which the Banzhaf values have been found to be 52, 12, 12, 10, 4, 4, 4 and the Banzhaf power indices \\[ 0.5306, 0.1224, 0.1224, 0.102, 0.0408, 0.0408, 0.0408. \\] Suppose that the Government and Opposition are quarrelling, as are also the Greens and One Nation. (This is a reasonable assumption, given current politics and the platforms of the respective parties.) #+BEGIN_SRC python sage: q = 39 sage: w = [36,26,9,2,1,1,1] sage: n = len(w) sage: R = PolynomialRing(QQ,\u0026#39;x\u0026#39;,n) sage: xs = R.gens() sage: qu = R.ideal([xs[0]*xs[1],xs[2]*xs[3]]) sage: pr = prod(1+xs[j]^w[j] for j in range(n)).reduce(qu) #+END_SRC As before, we extract the degrees of those monomials whose sum is at least $q$, and determine which party in each is critical: #+BEGIN_SRC python sage: wcs = [m.degrees() for m in pr.monomials() if sum(m.degrees()) \u0026gt;= q] sage: beta = [0]*n sage: for c in wcs: sc = sum(c) for i in range(n): if sc-c[i] \u0026lt; q: beta[i] += 1 sage: beta #+END_SRC $[16,0,7,6,2,2,2]$ which can be normalized to $[0.4571, 0.0, 0.2, 0.1714, 0.0571, 0.0571, 0.0571]$. The remarkable result is that Labor - the Opposition party, with the second largest number of seats in the Senate - loses all its power! This means that Labor /cannot/ afford to be in perpetual quarrel with the government parties. 
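As a cross-check that needs no Sage, the same counts can be reproduced by brute-force enumeration: list every coalition, discard those containing a quarrelling pair, and count the members who are critical in each remaining winning coalition. This is only a sketch for small games (it visits all subsets), and the function name `quarrel_banzhaf` is made up for this illustration.

```python
from itertools import combinations

def quarrel_banzhaf(q, w, quarrels):
    # Raw Banzhaf counts when the index pairs in quarrels never vote together.
    n = len(w)
    beta = [0] * n
    for r in range(1, n + 1):
        for S in combinations(range(n), r):
            members = set(S)
            # Skip any coalition that contains both halves of a quarrel.
            if any(i in members and j in members for i, j in quarrels):
                continue
            total = sum(w[i] for i in S)
            if total < q:
                continue
            # A member is critical if the coalition loses without it.
            for i in S:
                if total - w[i] < q:
                    beta[i] += 1
    return beta

print(quarrel_banzhaf(51, [49, 48, 3], [(0, 1)]))
print(quarrel_banzhaf(39, [36, 26, 9, 2, 1, 1, 1], [(0, 1), (2, 3)]))
```

Passing an empty list of quarrels should give the ordinary Banzhaf values, which makes it easy to compare the two situations side by side.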
** A simple program And of course, all of the above can be written into a simple Sage program: #+BEGIN_SRC python def qbanzhaf(q,w,r): n = len(w) R = PolynomialRing(QQ,\u0026#39;x\u0026#39;,n) xs = R.gens() qu = R.ideal([xs[y[0]]*xs[y[1]] for y in r]) pr = prod(1+xs[j]^w[j] for j in range(n)).reduce(qu) wcs = [m.degrees() for m in pr.monomials() if sum(m.degrees()) \u0026gt;= q] beta = [0]*n for c in wcs: sc = sum(c) for i in range(n): if sc-c[i] \u0026lt; q: beta[i] += 1 return(beta) #+END_SRC All the quarrelling pairs are given as a list, so that the Australian Senate computation could be entered as #+BEGIN_SRC python sage: qbanzhaf(q,w,[[0,1],[2,3]]) #+END_SRC * Voting power (6): Polynomial rings :voting:algebra:python: :PROPERTIES: :EXPORT_FILE_NAME: voting_power_polynomial_rings :EXPORT_DATE: 2021-01-22 :END: As we have seen previously, it\u0026#39;s possible to compute power indices by means of polynomial generating functions. We shall extend previous examples to include the Deegan-Packel index, in a way somewhat different to that of Alonso-Meijide et al (see previous post for reference). Again, suppose we consider the voting game \\[ [30;28,16,5,4,3,3] \\] What we\u0026#39;ll do here though, rather than using just one variable, is to have a variable for each voter. We\u0026#39;ll use Sage for this, as it\u0026#39;s open source, and provides a very rich environment for computing with polynomial rings. We first create the polynomial \\[ p = \\prod_{k=0}^5(1+x_k^{w_k}) \\] where $w_k$ are the weights given above. We are using the Python (and hence Sage) convention that indices are numbered starting at zero. 
#+BEGIN_SRC python sage: q = 30 sage: w = [28,16,5,4,3,3] sage: n = len(w) sage: R = PolynomialRing(QQ,\u0026#39;x\u0026#39;,n) sage: xs = R.gens() sage: pr = prod(1+xs[i]^w[i] for i in range(n)) #+END_SRC Now we can extract all the monomials, and consider only those for which the degree sum is not less than the quota $q$: #+BEGIN_SRC python sage: pm = pr.monomials() sage: pw = [x for x in pm if sum(x.degrees()) \u0026gt;= q] sage: pw = pw[::-1]; pw #+END_SRC \\begin{aligned} \u0026amp;\\left[x_{1}^{16} x_{2}^{5} x_{3}^{4} x_{4}^{3} x_{5}^{3},\\\\; x_{0}^{28} x_{5}^{3},\\\\; x_{0}^{28} x_{4}^{3},\\\\; x_{0}^{28} x_{3}^{4},\\\\; x_{0}^{28} x_{2}^{5},\\\\; x_{0}^{28} x_{4}^{3} x_{5}^{3},\\\\; x_{0}^{28} x_{3}^{4} x_{5}^{3},\\\\; x_{0}^{28} x_{3}^{4} x_{4}^{3},\\right.\\cr \u0026amp;\\qquad x_{0}^{28} x_{2}^{5} x_{5}^{3},\\\\; x_{0}^{28} x_{2}^{5} x_{4}^{3},\\\\; x_{0}^{28} x_{2}^{5} x_{3}^{4},\\\\; x_{0}^{28} x_{3}^{4} x_{4}^{3} x_{5}^{3},\\\\; x_{0}^{28} x_{2}^{5} x_{4}^{3} x_{5}^{3},\\\\; x_{0}^{28} x_{2}^{5} x_{3}^{4} x_{5}^{3},\\cr \u0026amp;\\qquad x_{0}^{28} x_{2}^{5} x_{3}^{4} x_{4}^{3},\\\\; x_{0}^{28} x_{2}^{5} x_{3}^{4} x_{4}^{3} x_{5}^{3},\\\\; x_{0}^{28} x_{1}^{16},\\\\; x_{0}^{28} x_{1}^{16} x_{5}^{3},\\\\; x_{0}^{28} x_{1}^{16} x_{4}^{3},\\\\; x_{0}^{28} x_{1}^{16} x_{3}^{4},\\cr \u0026amp;\\qquad x_{0}^{28} x_{1}^{16} x_{2}^{5},\\\\; x_{0}^{28} x_{1}^{16} x_{4}^{3} x_{5}^{3},\\\\; x_{0}^{28} x_{1}^{16} x_{3}^{4} x_{5}^{3},\\\\; x_{0}^{28} x_{1}^{16} x_{3}^{4} x_{4}^{3},\\\\; x_{0}^{28} x_{1}^{16} x_{2}^{5} x_{5}^{3},\\cr \u0026amp;\\qquad x_{0}^{28} x_{1}^{16} x_{2}^{5} x_{4}^{3},\\\\; x_{0}^{28} x_{1}^{16} x_{2}^{5} x_{3}^{4},\\\\; x_{0}^{28} x_{1}^{16} x_{3}^{4} x_{4}^{3} x_{5}^{3},\\\\; x_{0}^{28} x_{1}^{16} x_{2}^{5} x_{4}^{3} x_{5}^{3},\\cr \u0026amp;\\qquad \\left. 
x_{0}^{28} x_{1}^{16} x_{2}^{5} x_{3}^{4} x_{5}^{3},\\\\; x_{0}^{28} x_{1}^{16} x_{2}^{5} x_{3}^{4} x_{4}^{3},\\\\; x_{0}^{28} x_{1}^{16} x_{2}^{5} x_{3}^{4} x_{4}^{3} x_{5}^{3}\\right] \\end{aligned} As we did with subsets, we can winnow out the monomials which are multiples of others, by writing a recursive function: #+BEGIN_SRC python def mwc(q,t,p): if len(p)==0: return(t) else: for x in p[1:]: if p[0].divides(x): p.remove(x) return(mwc(q,t+[p[0]],p[1:])) #+END_SRC We can apply it: #+BEGIN_SRC python sage: pw2 = pw.copy() sage: mwc1 = mwc(q,[],pw2) sage: mwc1 #+END_SRC \\[ \\left[x_{1}^{16} x_{2}^{5} x_{3}^{4} x_{4}^{3} x_{5}^{3},\\\\; x_{0}^{28} x_{5}^{3},\\\\; x_{0}^{28} x_{4}^{3},\\\\; x_{0}^{28} x_{3}^{4},\\\\; x_{0}^{28} x_{2}^{5},\\\\; x_{0}^{28} x_{1}^{16}\\right] \\] Now it\u0026#39;s just a matter of working out the indices from the variables, and this is most easily done in Python with a dictionary, with the variables as keys and their indices as values. #+BEGIN_SRC python sage: dp = {xs[i]:0 for i in range(n)} sage: for m in mwc1: mv = m.variables() nm = len(mv) for x in mv: dp[x] += 1/nm sage: dp #+END_SRC \\[ \\left\\{x_{0} : \\frac{5}{2}, x_{1} : \\frac{7}{10}, x_{2} : \\frac{7}{10}, x_{3} : \\frac{7}{10}, x_{4} : \\frac{7}{10}, x_{5} : \\frac{7}{10}\\right\\} \\] And this of course can be normalized: #+BEGIN_SRC python sage: s = sum(dp.values()) sage: for x in dp.keys(): dp[x] = dp[x]/s sage: dp #+END_SRC \\[ \\left\\{x_{0} : \\frac{5}{12}, x_{1} : \\frac{7}{60}, x_{2} : \\frac{7}{60}, x_{3} : \\frac{7}{60}, x_{4} : \\frac{7}{60}, x_{5} : \\frac{7}{60}\\right\\} \\] And these are the values we found in the previous post, using Julia and working with subsets. ** Back to Banzhaf Recall that the Banzhaf power indices could be computed from polynomials in two steps. For voter $i$, define \\[ f_i(x) = \\prod_{j\\ne i}(1+x^{w_j}) \\] and suppose that the coefficient of $x^k$ is $a_k$. 
Then define \\[ \\beta_i=\\sum_{j=q-w_i}^{q-1}a_j \\] and these values are normalized for the Banzhaf power indices. Suppose we define, as for the Deegan-Packel indices, \\[ p = \\prod_{k=1}^n(1+x_k^{w_k}). \\] Then reduce $p$ modulo $x_i^{w_i}$. This has the effect of setting $x_i^{w_i}=0$ in $p$ so simply removes all monomials containing $x_i^{w_i}$. Then $\\beta_i$ is computed by adding the coefficients of all monomials whose degree sum lies between $q-w_i$ and $q-1$. First define the polynomial ring and the polynomial $p$: #+BEGIN_SRC python sage: w = [28,16,5,4,3,3] sage: q = 30 sage: n = len(w) sage: R = PolynomialRing(QQbar,\u0026#39;x\u0026#39;,n) sage: xs = R.gens() sage: p = prod(1+xs[i]^w[i] for i in range(n)) #+END_SRC Now for the reduction and computation: #+BEGIN_SRC python sage: ids = [R.ideal(xs[i]^w[i]) for i in range(n)] sage: beta = [] sage: for i in range(n): pri = p.reduce(ids[i]) ms = pri.monomials() cs = pri.coefficients() cds = [(x,sum(y.degrees())) for x,y in zip(cs,ms)] beta += [sum(x for (x,y) in cds if y\u0026gt;=(q-w[i]) and (y\u0026lt;q))] sage: print(beta) #+END_SRC \\[ [30,2,2,2,2,2] \\] from which the Banzhaf power indices can be computed as \\[ \\left[\\frac{3}{4},\\\\;\\frac{1}{20},\\\\;\\frac{1}{20},\\\\;\\frac{1}{20},\\\\;\\frac{1}{20},\\\\;\\frac{1}{20}\\right]. \\] Note that this is of course unnecessarily clumsy, as the Banzhaf indices can be easily and readily computed using univariate polynomials. We may thus consider this approach as a proof of concept, rather than a genuine alternative. In Sage, Banzhaf indices can be computed with polynomials very neatly, given weights and a quota: #+BEGIN_SRC python sage: def banzhaf(q,w): n = len(w) beta = [] for i in range(n): pp = prod(1+x^w[j] for j in range(n) if j != i) cc = pp.list() beta += [sum(cc[j] for j in range(len(cc)) if j\u0026gt;=(q-w[i]) and (j\u0026lt;q))] return(beta) #+END_SRC These values can then be easily normalized to produce the Banzhaf power indices. 
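For readers without Sage, the univariate generating-function method, including the final normalization, can be sketched in plain Python by storing each polynomial as a list of coefficients. The function name `banzhaf_indices` is chosen for this illustration; exact fractions are used so that the indices come out as rationals.

```python
from fractions import Fraction

def banzhaf_indices(q, w):
    # Raw Banzhaf counts and normalized indices for the game [q; w].
    beta = []
    for i in range(len(w)):
        # Coefficient list of the product of (1 + x^w_j) over all j != i.
        coeffs = [1]
        for j, wj in enumerate(w):
            if j == i:
                continue
            new = coeffs + [0] * wj
            for k, c in enumerate(coeffs):
                new[k + wj] += c
            coeffs = new
        # Add the coefficients of x^k for q - w_i <= k <= q - 1.
        lo = max(q - w[i], 0)
        beta.append(sum(coeffs[lo:q]))
    total = sum(beta)
    return beta, [Fraction(b, total) for b in beta]

beta, idx = banzhaf_indices(30, [28, 16, 5, 4, 3, 3])
print(beta)
print(idx)
```

Each multiplication by a factor of the product is just a shifted addition of the coefficient list, so the cost grows with the sum of the weights rather than with the number of subsets.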
* Voting power (5): The Deegan-Packel and Holler power indices :voting:algebra:python:julia: :PROPERTIES: :EXPORT_FILE_NAME: voting_power_deegan-packel_holler :EXPORT_DATE: 2021-01-14 :END: We have explored the Banzhaf and Shapley-Shubik power indices, which both consider the ways in which any voter can be pivotal, or critical, or necessary, to a winning coalition. A more recent power index, which takes a different approach, was defined by Deegan and Packel in 1976, and considers only /minimal winning coalitions/. A winning coalition $S$ is /minimal/ if every member of $S$ is critical to it, or alternatively, that $S$ does not contain any winning coalition as a proper subset. It is easy to see that these are equivalent, for if $i\\in S$ was not critical, then $S-\\{i\\}$ would be a winning coalition which is a proper subset. Given $N=\\{1,2,\\ldots,n\\}$, let $W\\subset 2^N$ be the set of all minimal winning coalitions, and let $W_i\\subset W$ be those that contain the voter $i$. Then we define \\[ DP_i=\\sum_{S\\in W_i}\\frac{1}{|S|} \\] where $|S|$ is the cardinality of $S$. For example, consider the voting game \\[ [16;10,9,6,5] \\] Using the indicies 1 to 4 for the voters, the minimal winning coalitions are \\[ \\{1,2\\},\\{1,3\\},\\{2,3,4\\}. \\] and hence \\begin{aligned} DP_1 \u0026amp;= \\frac{1}{2}+\\frac{1}{2} = 1 \\\\cr DP_2 \u0026amp;= \\frac{1}{2}+\\frac{1}{3} = \\frac{5}{6} \\\\cr DP_3 \u0026amp;= \\frac{1}{2}+\\frac{1}{3} = \\frac{5}{6} \\\\cr DP_4 \u0026amp;= \\frac{1}{3} \\end{aligned} and these values can be normalized so that their sum is unity: \\[ [1/3,5/18,5/18,1/9]\\approx [0.3333, 0.2778,0.2778, 0.1111]. \\] In comparison, both the Banzhaf and Shapley-Shubik indices return \\[ [0.4167, 0.25, 0.25, 0.0833]. \\] Allied to the Deegan-Packel index is /Holler\u0026#39;s public-good index/, also called the /Holler-Packel/ index, defined as \\[ H_i=\\frac{|W_i|}{\\sum_{j\\in N}|W_j|}. 
\\] In other words, this index first counts the number of minimal winning coalitions that contain $i$, and then normalises those values for the sum to be unity. In the example above, we have voters 1, 2, 3 and 4 being members of 2, 2, 2, 1 minimal winning coalitions respectively, and so the power indices are \\[ [2/7, 2/7, 2/7, 1/7] \\approx [0.2857,0.2857,0.2857,0.1429]. \\] ** Implementation (1): Python We can implement the Deegan-Packel index in Python, either by using itertools, or simply rolling our own little functions: #+BEGIN_SRC python def all_subsets(X): T = [[]] for x in X: T += [t+[x] for t in T] return(T) def is_subset(A,B): out = True for a in A: if B.count(a) == 0: out = False break return(out) def mwc(q,S,T): if len(T) == 0: return(S) else: if sum(T[0]) \u0026gt;= q: S += [T[0]] temp = T.copy() for t in temp: if is_subset(T[0],t): temp.remove(t) return(mwc(q,S,temp)) else: return(mwc(q,S,T[1:])) #+END_SRC #+BEGIN_SRC python def prod(A): m = len(A) n = len(A[0]) p = 1 for i in range(m): for j in range(n): p *= A[i][j] return(p) #+END_SRC Of the three functions above, the first simply returns all subsets (as lists); the second tests whether one list is a subset of another (treating both as sets), and the final routine returns all minimal winning coalitions using an elementary recursion. The function starts off considering all subsets of the set of weights, and goes through the list until it finds one whose sum is at least equal to the quota. Then it removes all other subsets which are supersets of the found one. Then it calls the routine on this smaller list. 
For example: #+BEGIN_SRC python \u0026gt;\u0026gt;\u0026gt; wts = [10,9,6,5] \u0026gt;\u0026gt;\u0026gt; T = all_subsets(wts)[1:] \u0026gt;\u0026gt;\u0026gt; q = 16 \u0026gt;\u0026gt;\u0026gt; mwc(q,[],T) [[10, 9], [10, 6], [9, 6, 5]] #+END_SRC It is an easy matter now to obtain the Deegan-Packel indices: #+BEGIN_SRC python def dpi(q,wts): m = mwc(q,[],all_subsets(wts)[1:]) dp = [] for w in wts: d = 0 for x in m: d += x.count(w)/len(x) dp += [d] return(dp) #+END_SRC And as an example: #+BEGIN_SRC python \u0026gt;\u0026gt;\u0026gt; wts = [10,9,6,5] \u0026gt;\u0026gt;\u0026gt; q = 16 \u0026gt;\u0026gt;\u0026gt; dpi(q,wts) [1.0, 0.8333333333333333, 0.8333333333333333, 0.3333333333333333] #+END_SRC and of course these can be normalized so that their sum is unity. ** Implementation (2): Julia Now we\u0026#39;ll use [Julia](https://julialang.org), and its [Combinatorics](https://github.com/JuliaMath/Combinatorics.jl) library. Because Julia implements JIT compiling, its speed is generally faster than that of Python. 
Just to be different, we\u0026#39;ll develop two functions, one which first produces all winning coalitions, and the second which winnows that set to just the minimal winning coalitions: #+BEGIN_SRC julia using Combinatorics function wcs(q,w) S = powerset(w) out = [] for s in S if sum(s) \u0026gt;= q append!(out,[s]) end end return(out) end function mwc(q,out,wc) if isempty(wc) return(out) else f = wc[1] popfirst!(wc) temp = [] # temp finds all supersets of f = wc[1] for w in wc if issubset(f,w) append!(temp,[w]) end end return(mwc(q,append!(out,[f]),setdiff(wc,temp))) end end #+END_SRC Now we can try it out: #+BEGIN_SRC julia julia\u0026gt; q = 16; julia\u0026gt; w = [10,9,6,5]; julia\u0026gt; cs = wcs(q,w) 7-element Array{Any,1}: [10, 9] [10, 6] [10, 9, 6] [10, 9, 5] [10, 6, 5] [9, 6, 5] [10, 9, 6, 5] julia\u0026gt; mwc(q,[],cs) 3-element Array{Any,1}: [10, 9] [10, 6] [9, 6, 5] #+END_SRC ** Repeated elements Although both Julia and Python work with multisets, this becomes tricky in terms of the power indices. A simple expedient is to change repeated indices by small amounts so that they are all different, but that the sum will not affect any quota. If we have for example four indices which are the same, we can add 0.1, 0.2, 0.3 to three of them. So we consider the example \\[ [30;28,16,5,4,3,3] \\] given as an example of a polynomial method in the article \u0026#34;Computation of several power indices by generating functions\u0026#34; by J. M. Alonso-Meijide et al; you can find the article on [Science Direct](https://www.sciencedirect.com/science/article/pii/S0096300312010089?casa_token=VhxXp9aJU_MAAAAA:VI_M5HKBYi7Ge41BGBobiw1Zkh_u8Lv6IJFRZs7FqDoOVl9VnzU7sx6ulhS2QCvdjOZedAW6eQ). 
So: #+BEGIN_SRC julia julia\u0026gt; q = 30; julia\u0026gt; w = [28,16,5,4,3.1,3]; julia\u0026gt; cs = wcs(q,w); julia\u0026gt; ms = mwc(q,[],cs) 6-element Array{Any,1}: [28.0, 16.0] [28.0, 5.0] [28.0, 4.0] [28.0, 3.1] [28.0, 3.0] [16.0, 5.0, 4.0, 3.1, 3.0] #+END_SRC From here it\u0026#39;s an easy matter to compute the Deegan-Packel power indices: #+BEGIN_SRC julia julia\u0026gt; dp = [] for i = 1:6 x = 0//1 for m in ms x = x + count(j -\u0026gt; j == w[i],m)//length(m) end append!(dp, [x]) end julia\u0026gt; print(dp) Any[5//2, 7//10, 7//10, 7//10, 7//10, 7//10] julia\u0026gt; print([x/sum(dp) for x in dp]) Rational{Int64}[5//12, 7//60, 7//60, 7//60, 7//60, 7//60] #+END_SRC and these are the values obtained by the authors (but with a lot less work). * Three-dimensional impossible CAD :geometry:CAD: :PROPERTIES: :EXPORT_FILE_NAME: three_d_impossible_CAD :EXPORT_HUGO_CUSTOM_FRONT_MATTER: :mathjax true :EXPORT_DATE: 2021-01-10 :END: Recently a friend and I wrote a semi-serious paper called \u0026#34;The geometry of impossible objects\u0026#34; to be delivered at a mathematics technology conference. The reviewer was not hugely complimentary, saying that there was nothing new in the paper. Well, maybe not, but we had fun pulling together some information about impossible shapes and how to draw them. You can see some of our programs hosted at [repl.it](https://repl.it/@AlasdairMcAndrew/ATCM2020#main.py). But my interest /here/ is to look at the three-dimensional versions of such objects; how to use a programmable CAD to create 3d objects which, from a particular perspective, look impossible. Here\u0026#39;s an example in Perth, Western Australia: [[file:/penrose_triangle.png]] (I think the sides are a bit too thin to give a proper feeling for the shape, though.)
Another image, neatly showing how such a Penrose triangle can be created, is this of a \u0026#34;[Unique vase](http://www.marvelbuilding.com/unique-vase-inspired-penrose-triangle-90.html)\u0026#34;: [[file:/Unique-Vase-Inspired-by-Penrose-Triangle-3.jpg]] With this in mind, we can easily create such a shape in any 3D CAD program we like. I\u0026#39;ve spoken before about [programmable CAD](https://numbersandshapes.net/post/programmable-cad/), but then I was using [OpenJSCAD](https://openjscad.org/dokuwiki/doku.php?id=start). However, I decided to switch to Python\u0026#39;s [CadQuery](https://cadquery.readthedocs.io/en/latest/), because it offered the option of orthographic viewing instead of just perspective viewing. This meant that all objects retained their size even if further away from the viewpoint, which made cutting out from the foreground objects much easier. For making shapes available online, the best solution I found was the free online viewer and widget provided by [LAI4D](http://www.lai4d.com/) and its online [Laboratory](http://www.lai4d.com/laboratory/en/). This provides not only a nice environment to explore 3D shapes, but also the facility to export the shape as an HTML widget for viewing in an iframe. ** Penrose triangle In fact, this first shape is created with [OpenJSCAD](https://en.wikibooks.org/wiki/OpenJSCAD_User_Guide) which I was using before discovering CadQuery. The only problem with OpenJSCAD - of which there\u0026#39;s a new version called simply [JSCAD](https://openjscad.org/dokuwiki/doku.php?id=start) - is that it doesn\u0026#39;t seem to allow orthographic viewing. Orthographic viewing is preferable to perspective viewing for impossible figures, as it\u0026#39;s much easier to work out how to line everything up as necessary. You can see the version of my OpenJSCAD Penrose triangle by opening up this link: http://bit.ly/3q5OAer This should open up OpenJSCAD with the shape in it. You should be able to move it around until you get the impossible effect.
You\u0026#39;ll see the JavaScript file that produced it on the right. This shows the starting shape and what you should aim to produce: [[file:/penrose_triangles.jpg]] And here is the same construction in CadQuery, but working with an orthographic projection. As before, we create the cut-away beam as a two-dimensional shape given by the coordinates of its vertices, then \u0026#34;extrude\u0026#34; it perpendicular to its plane. The remaining beam is a rectangular box. #+BEGIN_SRC python pts = [ (0,0), (50,0), (50,10), (49.9,10), (49.9,0.1), (39.9,0.1), (30,10), (10,10), (10,40), (0,40) ] L_shape = cq.Workplane(\u0026#34;front\u0026#34;).polyline(pts).close().extrude(10) upright = cq.Workplane(\u0026#34;front\u0026#34;).transformed(offset=(5,45,-15)).box(10,10,50) pt = Assembly( [ Part(L_shape, \u0026#34;L shape\u0026#34;), Part(upright, \u0026#34;Upright\u0026#34;), ], \u0026#34;Penrose triangle\u0026#34;) exportSTL(pt, \u0026#34;penrose_triangle.stl\u0026#34;, linear_deflection=0.01, angular_deflection=0.1) show(pt) #+END_SRC The saved STL file can then be imported into the LAI4D online widget, and saved as an iframe for viewing. The Reutersvärd triangle, which is named for the Swedish graphic designer [Oscar Reutersvärd](https://en.wikipedia.org/wiki/Oscar_Reutersv%C3%A4rd), seems to predate the Penrose triangle, and is a thing of great beauty.
Anyway, to see it in LAI4D, go to: [STL Reutersvard triangle here](http://www.lai4d.com/lai4dwidget/lai4d_viewer.html?%7B%7Bfile%7D%7B%7Blai4d%7D%7Breutersvard_triangle_a64%7D%7D%7Bview%7D%7B%7Btype%7D%7Bview%7D%7Bprojection%7D%7Bparallel%7D%7Borientation%7D%7B%7B44.877776998530976%7D%7B54.68478347702594%7D%7B-17.49622953308079%7D%7D%7Bpoint%7D%7B%7B78.896957544495%7D%7B-22.970889712348303%7D%7B33.09381031469302%7D%7D%7Bzoom%7D%7B100%7D%7Btarget%7D%7B-86.60254037844388%7D%7Bfar%20light%20direction%7D%7B%7B1%7D%7B-1%7D%7B-3%7D%7D%7Bfar%20light%20factor%7D%7B0.7%7D%7Bfocus%20reflection%20factor%7D%7B0%7D%7D%7D) There are many hundreds of clever and witty impossible figures at the [Impossible Figure Library](https://im-possible.info/english/library/index.html), which I strongly recommend you visit! ** Penrose Staircase There\u0026#39;s a nice video [on YouTube here](https://www.youtube.com/watch?v=7pPPWei2oEA) which I used as the basis for my model, but since then I\u0026#39;ve discovered other CAD models, for example on [yeggi.com](https://www.yeggi.com/q/penrose+stairs/).
Anyway, the model is made up of square prisms of different heights and bases, with a final shape jutting out at the top: [[file:/staircases.jpg]] The Python code is: transforms = [(0,0,0),(0,-1,0),(0,-2,5),(0,-3,5),(1,-3,5),(2,-3,5),(3,-3,5),(4,-3,5), (4,-2,5),(4,-1,5),(3,-1,5),(2,-1,20)] vscale = 0.2 vtransforms = [(t[0],t[1],t[2]*vscale) for t in transforms] heights = [11,12,8,9,10,11,12,13,14,15,16,2] boxes = [cq.Workplane(\u0026#34;front\u0026#34;).transformed(offset = vtransforms[i]).box(1,1,heights[i]*vscale,centered=(True,True,False)) for i in range(11)] pts1 = [(0,0),(0,1),(-1,1),(-1,0.2),(-0.2,0.2)] box12 = cq.Workplane(\u0026#34;front\u0026#34;).polyline(pts1).close().extrude(0.4) pts2 = [(0.2,0),(0.6,0),(0.2,0.4)] prism = cq.Workplane(\u0026#34;YZ\u0026#34;).transformed(offset=(0,0,-1)).polyline(pts2).close().extrude(0.8) shape = box12.cut(prism).translate((2.5,-1.5,4)) boxes += [shape] ps = Assembly( [Part(boxes[i],\u0026#34;Box \u0026#34;+str(i),\u0026#34;#daa520\u0026#34;) for i in range(12)], \u0026#34;Penrose staircase\u0026#34;) exportSTL(ps, \u0026#34;penrose_staircase2.stl\u0026#34;, linear_deflection=0.01, angular_deflection=0.1) # show(ps)// ...
code Below is an iframe containing the LAI4D widget; it may take a few seconds to load, and you may need to refresh the page: #+BEGIN_EXPORT HTML \u0026lt;iframe width=\u0026#34;1114\u0026#34; height=\u0026#34;868\u0026#34; src=\u0026#34;http://widget.lai4d.org/lai4d_viewer.html?{{file}{{lai4d}{penrose_triangle2_a49}}{view}{{type}{view}{projection}{parallel}{orientation}{{45.622279541377225}{49.11108740776223}{5.344582071903109}}{point}{{6.3974657616243}{-5.484774096849316}{7.296050456965282}}{zoom}{7.769169839822013}{target}{-7.7691698398220135}{far light direction}{{1}{-1}{-3}}{far light factor}{0.7}{focus reflection factor}{0}}}\u0026#34; style=\u0026#34;margin: 0px; border: 1px solid black; overflow: hidden;\u0026#34;\u0026gt;\u0026lt;/iframe\u0026gt; #+END_EXPORT If you get the shape into a state from which you can\u0026#39;t get the stair effect working, click on the little circular arrows in the bottom left, which will reset the object to its initial orientation. ** Impossible box This was just a matter of creating the edges, finding a nice view, and using some trial and error to slice through the frontmost beams so that it seemed that the rear beams were at the front. Here is another widget; again, you may have to wait for it to load.
#+BEGIN_EXPORT HTML \u0026lt;iframe width=\u0026#34;1552\u0026#34; height=\u0026#34;670\u0026#34; src=\u0026#34;http://widget.lai4d.org/lai4d_viewer.html?{{file}{{lai4d}{impossible_box_a12}}{view}{{type}{view}{projection}{parallel}{orientation}{{117.46214919672738}{61.92386833724598}{96.59634990634542}}{point}{{109.04777595866362}{68.68195696813399}{75.52581510314023}}{zoom}{123.96235987858037}{target}{-107.35455276791946}{far%20light%20direction}{{1}{-1}{-3}}{far%20light%20factor}{0.7}{focus%20reflection%20factor}{0}}}\u0026#34; style=\u0026#34;margin: 0px; border: 1px solid black; overflow: hidden;\u0026#34;\u0026gt;\u0026lt;/iframe\u0026gt; #+END_EXPORT * Voting power (4): Speeding up the computation :voting:algebra:julia: :PROPERTIES: :EXPORT_FILE_NAME: voting_power_speeding :EXPORT_DATE: 2021-01-06 :END: ** Introduction and recapitulation Recall from previous posts that we have considered two power indices for computing the power of a voter in a weighted system; that is, the ability of a voter to influence the outcome of a vote. Such systems occur when the voting body is made up of a number of \u0026#34;blocs\u0026#34;: these may be political parties, countries, states, or any other groupings of people, and it is assumed that within every bloc all members will vote the same way. Indeed, in some legislatures, voting according to \u0026#34;the party line\u0026#34; is a requirement of membership. Examples include the American Electoral College, in which the \u0026#34;voters\u0026#34; are the states; the Australian Senate, the European Union, the International Monetary Fund, and many others. Given a set $N=\\{1,2,\\ldots,n\\}$ of voters and their weights $w_i$, and a quota $q$ required to pass any motion, we have represented this as \\[ [q;w_1,w_2,\\ldots,w_n] \\] and we define a /winning coalition/ as any subset $S\\subset N$ of voters for which \\[ \\sum_{i\\in S}w_i\\ge q. 
\\] It is convenient to define a characteristic function $v$ on all subsets of $N$ so that $v(S)=1$ if $S$ is a winning coalition, and 0 otherwise. Given a winning coalition $S$, a voter $i\\in S$ is /necessary/ if $v(S)-v(S-\\{i\\})=1$. For any voter $i$, the number of winning coalitions for which that voter is necessary is \\[ \\beta_i = \\sum_S\\left(v(S)-v(S-\\{i\\})\\right) \\] where the sum is taken over all winning coalitions. Then the /Banzhaf power index/ is this value normalized so that the sum of all indices is unity: \\[ B_i=\\frac{\\beta_i}{\\sum_{j=1}^n \\beta_j}. \\] The /Shapley-Shubik power index/ is defined by considering all permutations $p$ of $N$. Taking cumulative sums of the weights from the left, a voter $p_k$ is /pivotal/ if this is the first voter for which the cumulative sum is at least $q$. For each voter $i$, let $\\sigma_i$ be the number of permutations for which $i$ is pivotal. Then \\[ S_i=\\frac{1}{n!}\\sigma_i \\] which ensures that the sum of all indices is unity. Although these two indices seem very different, there is in fact a deep connection. Consider any permutation $p$ and suppose that $i=p_k$ is the pivotal voter. This voter will also be pivotal in all permutations for which $i=p_k$ and the values to the left and right of $i$ stay on their respective sides: there will be $(k-1)!(n-k)!$ such permutations. However, we can consider the values up to and including $i=p_k$ as a winning coalition $S$ of size $k$ for which $i$ is necessary, which means we can write \\[ S_i=\\sum_S\\frac{(n-k)!(k-1)!}{n!}\\left(v(S)-v(S-\\{i\\})\\right) \\] where $k=|S|$; this can be compared to the Banzhaf index above as being similar but with a different weighting function. Note that the above expression can be written as \\[ S_i=\\sum_S\\left(k{\\dbinom n k}\\right)^{-1}\\left(v(S)-v(S-\\{i\\})\\right) \\] which uses smaller numbers. For example, if $n=50$ then $n!\\approx 3.0414\\times 10^{64}$ but the largest binomial value is only approximately $1.264\\times 10^{14}$.
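The equivalence of the two formulas for the Shapley-Shubik index can be checked by brute force on a small game. The following is a sketch only: the function names are hypothetical, exact arithmetic is done with =fractions.Fraction=, and both routines enumerate everything, so they are suitable only for a handful of voters.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb, factorial


def shapley_by_permutations(q, w):
    """Count, for each voter, the permutations in which that voter is pivotal."""
    n = len(w)
    sigma = [0] * n
    for p in permutations(range(n)):
        total = 0
        for voter in p:
            total += w[voter]
            if total >= q:          # first voter to reach the quota is pivotal
                sigma[voter] += 1
                break
    return [Fraction(s, factorial(n)) for s in sigma]


def shapley_by_coalitions(q, w):
    """Sum 1/(k*C(n,k)) over winning coalitions of size k in which i is necessary."""
    n = len(w)
    inds = [Fraction(0)] * n
    for k in range(1, n + 1):
        weight = Fraction(1, k * comb(n, k))
        for S in combinations(range(n), k):
            total = sum(w[j] for j in S)
            if total >= q:
                for i in S:
                    if total - w[i] < q:   # S loses without voter i
                        inds[i] += weight
    return inds


q, w = 16, [10, 9, 6, 5]
print(shapley_by_permutations(q, w))
print(shapley_by_coalitions(q, w))
```

For the game $[16;10,9,6,5]$ both calls give 5/12, 1/4, 1/4, 1/12, which sum to unity. The second form never needs the factorial of $n$; its largest intermediate integer is $k$ times a binomial coefficient.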
** Computing with polynomials We have also seen that if we define \\[ f_i(x) = \\prod_{m\\ne i}(1+x^{w_m}) \\] then \\[ \\beta_i = \\sum_{j=q-w_i}^{q-1}a_j \\] where $a_j$ is the coefficient of $x^j$ in $f_i(x)$. Similarly, if \\[ f_i(x,y) = \\prod_{m\\ne i}(1+yx^{w_m}) \\] then \\[ S_i=\\sum_{k=0}^{n-1}\\frac{k!(n-1-k)!}{n!}\\sum_{j=q-w_i}^{q-1}c_{jk} \\] where $c_{jk}$ is the coefficient of $x^jy^k$ in the expansion of $f_i(x,y)$. In a previous post we have shown how to implement these in Python, using the SymPy library. However, Python can be slow, and using Cython is not trivial. We thus here show how to use Julia and its Polynomials package. ** Using Julia for speed The Banzhaf power indices can be computed almost directly from the definition: #+BEGIN_SRC julia using Polynomials function px(n) return(Polynomial([0,1])^n) end function banzhaf(q,w) n = length(w) inds = vec(zeros(Int64,1,n)) for i in 1:n p = Polynomial([1]) for j in 1:i-1 p = p * (1 + px(w[j])) end for j in i+1:n p = p * (1 + px(w[j])) end c = coeffs(p) inds[i] = sum(c[max(q-w[i]+1,1):min(q,length(c))]) end return(inds) end #+END_SRC This will actually return the vector of $\\beta_i$ values which can then be easily normalized. The function =px= is a \u0026#34;helper function\u0026#34; that simply returns the polynomial $x^n$, so that each factor $1+x^{w_j}$ is formed as =1 + px(w[j])=; the coefficient of $x^j$ is stored at position $j+1$, and the range of positions is clipped to the length of the coefficient vector. For the Shapley-Shubik indices, the situation is a bit trickier. There are indeed some Julia libraries for multivariate polynomials, but they seem (at the time of writing) to be not fully functional. However, consider the polynomials \\[ f_i(x,y)=\\prod_{m\\ne i}(1+yx^{w_m}) \\] from above. We can consider this as a polynomial of degree $n-1$ in $y$, whose coefficients are polynomials in $x$. So if \\[ f_i(x,y) = 1 + p_1(x)y + p_2(x)y^2 +\\cdots + p_{n-1}(x)y^{n-1} \\] then $f_i(x,y)$ can be represented as a vector of polynomials \\[ [1,p_1(x),p_2(x),\\ldots,p_{n-1}(x)]. \\] With this representation, we need to perform a multiplication by $1+yx^p$ and also determine coefficients.
Multiplication is easy, noting at once that $1+yx^p$ is linear in $y$, and so we expand the product \\[ (1+ay)(1 + b_1y + b_2y^2 + \\cdots + b_{n-1}y^{n-1}) \\] to obtain \\[ 1 + (a+b_1)y + (ab_1+b_2)y^2 + \\cdots + (ab_{n-2}+b_{n-1})y^{n-1} + ab_{n-1}y^n. \\] This can be readily programmed as: #+BEGIN_SRC julia function mulp1(n,p) p0 = Polynomial(0) px = Polynomial([0,1]) c1 = cat(p,p0,dims=1) c2 = cat(p0,p,dims=1) return(c1 + px^n .* c2) end #+END_SRC The first two lines aren\u0026#39;t really necessary, but they do make the procedure easier to read. And we need a little program for extracting coefficients, with a result of zero if the power is greater than the degree of the polynomial. (Julia\u0026#39;s =coeffs= function simply produces a list of all the coefficients in the polynomial.) #+BEGIN_SRC julia function one_coeff(p,n) d = degree(p) pc = coeffs(p) if n \u0026lt;= d return(pc[n+1]) else return(0) end end #+END_SRC Now we can put all of this together in a function very similar to the Python function for computing the Shapley-Shubik indices with polynomials: #+BEGIN_SRC julia function shapley(q,w) n = length(w) inds = vec(zeros(Float64,1,n)) for i in 1:n p = Polynomial(1) for j in 1:i-1 p = mulp1(w[j],p) end for j in i+1:n p = mulp1(w[j],p) end B = vec(zeros(Float64,1,n)) for j in 1:n B[j] = sum(one_coeff(p[j],k) for k in q-w[i]:q-1) end inds[i] = sum(B[j+1]/binomial(n,j)/(n-j) for j in 0:n-1) end return(inds) end #+END_SRC And a quick test (with timing) of the powers of the states in the Electoral College; here =ecv= is the number of electors of all the states, in alphabetical order (the first states are Alabama, Alaska, Arizona, and the last states are West Virginia, Wisconsin, Wyoming): #+BEGIN_SRC julia ecv = [9, 3, 11, 6, 55, 9, 7, 3, 3, 29, 16, 4, 4, 20, 11, 6, 6, 8, 8, 4, 10, 11, 16, 10, 6, 10, 3, 5, 6, 4, 14, 5, 29, 15, 3, 18, 7, 7, 20, 4, 9, 3, 11, 38, 6, 3, 13, 12, 5, 10, 3] @time(s = shapley(270,ecv)); 0.722626 seconds (605.50 k
allocations: 713.619 MiB, 7.95% gc time) #+END_SRC This is running on a Lenovo X1 Carbon, 3rd generation, using Julia 1.5.3. The operating system is a very recently upgraded version of Arch Linux, currently running kernel 5.10.3. * Voting power (3): The American swing states :voting:algebra: :PROPERTIES: :EXPORT_FILE_NAME: voting_power_swing_states :EXPORT_DATE: 2021-01-03 :END: As we all know, American Presidential elections are done with a two-stage process: first the public votes, and then the Electoral College votes. It is the Electoral College that actually votes for the President; but they vote (in their respective states) in accordance with the plurality determined by the public vote. This unusual system was devised by the Founding Fathers as a compromise between mob rule and autocracy, both of which they were determined to guard against. The Electoral College is not now an independent body: in all states but two, all Electoral College votes are given to the winner in that state. This means that the Electoral College may \u0026#34;amplify\u0026#34; the public vote; or it may return a vote which differs from the public vote, in that a candidate may receive a majority of public votes, and yet still lose the Electoral College vote. As a result there are periodic calls for the Electoral College to be disbanded, but in reality that seems unlikely. And in fact as far back as 1834 the then President, Andrew Jackson, was demanding its disbanding: a President, according to Jackson, should be a \u0026#34;man of the people\u0026#34; and hence elected by the people, rather than by an elite \u0026#34;College\u0026#34;. This is one of the few instances where Jackson didn\u0026#39;t get his way.
The initial idea of the Electoral College was that voters in their respective states would vote for Electors who would best represent their interests in a Presidential vote: these Electors were supposed to be wise and understanding men who could be relied on to vote in a principled manner. [Article II, Section 1](https://www.senate.gov/civics/constitution_item/constitution.htm#a2_sec1) of the US Constitution describes how this was to be done. When it became clear that electors were not in fact acting impartially, but only at the behest of the voters, some of the Founding Fathers were horrified. And like so many political institutions the world over, the Electoral College does not now live up to its original expectations, but is also too entrenched in the political process to be removed. The purpose of /this/ post is to determine the voting power of the \u0026#34;swing states\u0026#34;, in which most of a Presidential campaign is conducted. It has been estimated that something like 75% of Americans are ignored in a campaign; this might be true, but that\u0026#39;s just plain politics. For example, California (with 55 Electoral College votes) is so likely to return a Democrat candidate that it may be considered a \u0026#34;safe state\u0026#34; (at least, for the Democrats); it would be a waste of time for a candidate to spend too much time there. Instead, a candidate should stump in Florida, for example, which is considered a swing state, and may go either way: we have seen how close votes in Florida can be. For discussion about measuring voting power using /power indices/ check out the previous two blog posts. ** The American Electoral College According to the excellent site [270 to win](https://www.270towin.com) and their very useful election histories, we can determine which states have voted \u0026#34;the same\u0026#34; for any election after 1964. Taking 2000 as a reasonable starting point, we have the following results.
Some states have voted the same in every election from 2000 onwards; others have not. | \u0026lt;l\u0026gt; | \u0026lt;r\u0026gt; | \u0026lt;l\u0026gt; | \u0026lt;r\u0026gt; | \u0026lt;l\u0026gt; | \u0026lt;r\u0026gt; | | Safe Democrat | | Safe Republican | | Swing | | |---------------+-----+-----------------+-----+----------------+-----| | California | 55 | Alabama | 9 | Colorado | 9 | | Connecticut | 7 | Alaska | 3 | Florida | 29 | | Delaware | 3 | Arizona | 11 | Idaho | 4 | | DC | 3 | Arkansas | 6 | Indiana | 11 | | Hawaii | 4 | Georgia | 16 | Iowa | 6 | | Illinois | 20 | Kansas | 6 | Michigan | 16 | | Maine | 3 | Kentucky | 8 | Nevada | 6 | | Maryland | 10 | Louisiana | 8 | New Hampshire | 4 | | Massachusetts | 11 | Mississippi | 6 | New Mexico | 5 | | Minnesota | 10 | Missouri | 10 | North Carolina | 15 | | New Jersey | 14 | Montana | 3 | Ohio | 18 | | New York | 29 | Nebraska | 4 | Pennsylvania | 20 | | Oregon | 7 | North Dakota | 3 | Virginia | 13 | | Rhode Island | 4 | Oklahoma | 7 | Wisconsin | 10 | | Vermont | 3 | South Carolina | 9 | Maine CD 2 | 1 | | Washington | 12 | South Dakota | 3 | Nebraska CD 2 | 1 | | | | Tennessee | 11 | | | | | | Texas | 38 | | | | | | Utah | 6 | | | | | | West Virginia | 5 | | | | | | Wyoming | 3 | | | |---------------+-----+-----------------+-----+----------------+-----| | | 195 | | 175 | | 168 | From the table, we see that since 2000, we can count on 195 \u0026#34;safe\u0026#34; Electoral College votes for the Democrats, and 175 \u0026#34;safe\u0026#34; Electoral College votes for the Republicans. Thus of the 168 undecided votes, for a Democrat win the party must obtain at least 75 votes, and for a Republican win, the party needs to amass 95 votes. Note that according to the site, of the votes in Maine and Nebraska, all but one are considered safe - remember that these are the only two states to apportion votes by Congressional district. 
Of Maine\u0026#39;s 4 Electoral College votes, 3 are safe Democrat and one is a swing vote; for Nebraska, 4 of its votes are safe Republican, and 1 is a swing vote. All this means that a Democrat candidate should be campaigning considering the power given by \\[ [75; 9,29,4,11,6,16,6,4,5,15,18,20,13,10,1,1] \\] and a Republican candidate will be working with \\[ [95; 9,29,4,11,6,16,6,4,5,15,18,20,13,10,1,1] \\] ** A Democrat campaign So let\u0026#39;s imagine a Democrat candidate who wishes to maximize the efforts of the campaign by concentrating more on states with the greatest power to influence the election. #+BEGIN_SRC python In [1]: q = 75; w = [9,29,4,11,6,16,6,4,5,15,18,20,13,10,1,1] In [2]: b = banzhaf(q,w); bn = [sy.Float(x/sum(b)) for x in b]; [sy.N(x,4) for x in bn] Out[2]: [0.05192, 0.1867, 0.02271, 0.06426, 0.03478, 0.09467, 0.03478, 0.02271, 0.02870, 0.08801, 0.1060, 0.1196, 0.07515, 0.05800, 0.005994, 0.005994] In [3]: s = shapley(q,w); [sy.N(x/sum(s),4) for x in s] Out[3]: [0.05102, 0.1902, 0.02188, 0.06375, 0.03367, 0.09531, 0.03367, 0.02188, 0.02770, 0.08833, 0.1073, 0.1217, 0.07506, 0.05723, 0.005662, 0.005662] #+END_SRC The values are not the same, but they are in fact quite close, and in this case they are comparable to the numbers of Electoral votes in each state. To compare values, it will be most efficient to set up a DataFrame using Python\u0026#39;s data analysis library [pandas](https://pandas.pydata.org). We shall also convert the Banzhaf and Shapley-Shubik values from SymPy floats into ordinary Python floats.
#+BEGIN_SRC python In [4]: import pandas as pd In [5]: bf = [float(x) for x in bn] In [6]: sf = [float(x) for x in s] In [7]: d = {\u0026#34;States\u0026#34;:states, \u0026#34;EC Votes\u0026#34;:ec_votes, \u0026#34;Banzhaf indices\u0026#34;:bf, \u0026#34;Shapley-Shubik indices\u0026#34;:sf} In [8]: swings = pd.DataFrame(d) In [9]: swings.sort_values(by = \u0026#34;EC Votes\u0026#34;, ascending = False) In [10]: swings.sort_values(by = \u0026#34;Banzhaf indices\u0026#34;, ascending = False) In [11]: swings.sort_values(by = \u0026#34;Shapley-Shubik indices\u0026#34;, ascending = False) #+END_SRC Here =states= and =ec_votes= are assumed to hold the swing-state names and their Electoral votes, in the same order as =w=. We won\u0026#39;t show the results of the last three expressions, but they all give rise to the same ordering. We can still get some information by looking not so much at the values of the power indices as at their values /relative/ to the number of Electoral votes. To do this we need a new column which normalizes the Electoral votes so that their sum is unity: #+BEGIN_SRC python In [12]: swings[\u0026#34;Normalized EC Votes\u0026#34;] = swings[\u0026#34;EC Votes\u0026#34;]/168.0 In [13]: swings[\u0026#34;Ratio B to N\u0026#34;] = swings[\u0026#34;Banzhaf indices\u0026#34;]/swings[\u0026#34;Normalized EC Votes\u0026#34;] In [14]: swings[\u0026#34;Ratio S to N\u0026#34;] = swings[\u0026#34;Shapley-Shubik indices\u0026#34;]/swings[\u0026#34;Normalized EC Votes\u0026#34;] In [15]: swings.sort_values(by = \u0026#34;EC Votes\u0026#34;, ascending = False) #+END_SRC The following table shows the result.
| \u0026lt;l\u0026gt; | \u0026lt;l6\u0026gt; | \u0026lt;l8\u0026gt; | \u0026lt;l16\u0026gt; | \u0026lt;l11\u0026gt; | \u0026lt;l8\u0026gt; | \u0026lt;l8\u0026gt; | | States | EC Votes | Banzhaf indices | Shapley-Shubik indices | EC Votes Normalized | Ratio B to N | Ratio S to N | |----------------+----------+-----------------+------------------------+---------------------+--------------+--------------| | Florida | 29 | 0.186702 | 0.190164 | 0.172619 | 1.081585 | 1.101640 | | Pennsylvania | 20 | 0.119575 | 0.121732 | 0.119048 | 1.004430 | 1.022552 | | Ohio | 18 | 0.106034 | 0.107289 | 0.107143 | 0.989655 | 1.001360 | | Michigan | 16 | 0.094671 | 0.095309 | 0.095238 | 0.994049 | 1.000743 | | North Carolina | 15 | 0.088014 | 0.088330 | 0.089286 | 0.985754 | 0.989293 | | Virginia | 13 | 0.075149 | 0.075057 | 0.077381 | 0.971155 | 0.969966 | | Indiana | 11 | 0.064261 | 0.063752 | 0.065476 | 0.981447 | 0.973660 | | Wisconsin | 10 | 0.058004 | 0.057227 | 0.059524 | 0.974471 | 0.961422 | | Colorado | 9 | 0.051922 | 0.051017 | 0.053571 | 0.969215 | 0.952318 | | Iowa | 6 | 0.034777 | 0.033670 | 0.035714 | 0.973770 | 0.942774 | | Nevada | 6 | 0.034777 | 0.033670 | 0.035714 | 0.973770 | 0.942774 | | New Mexico | 5 | 0.028695 | 0.027704 | 0.029762 | 0.964169 | 0.930862 | | Idaho | 4 | 0.022714 | 0.021877 | 0.023810 | 0.953972 | 0.918823 | | New Hampshire | 4 | 0.022714 | 0.021877 | 0.023810 | 0.953972 | 0.918823 | | Maine CD 2 | 1 | 0.005994 | 0.005662 | 0.005952 | 1.007058 | 0.951282 | | Nebraska CD 2 | 1 | 0.005994 | 0.005662 | 0.005952 | 1.007058 | 0.951282 | We can thus infer that a Democrat candidate should indeed campaign most vigorously in the states with the largest number of Electoral votes. This might seem to be obvious, but as we have shown in previous posts, there is not always a correlation between voting weight and voting power, and that a voter with a low weight might end up having considerable power. 
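The Banzhaf ratio column in the table can be recomputed without SymPy or pandas. The following is a sketch under stated assumptions: =banzhaf_counts= and =power_to_weight_ratios= are hypothetical names, polynomials are stored as plain lists of integer coefficients, and only the Banzhaf index is treated.

```python
def banzhaf_counts(q, w):
    """Raw Banzhaf counts from the coefficients of prod_{m != i} (1 + x^w_m)."""
    counts = []
    for i, wi in enumerate(w):
        # coeffs[j] = number of coalitions of the other voters with total weight j
        coeffs = [1]
        for m, wm in enumerate(w):
            if m == i:
                continue
            new = coeffs + [0] * wm            # the "1 *" part of (1 + x^wm)
            for j, c in enumerate(coeffs):
                new[j + wm] += c               # the "x^wm *" part
            coeffs = new
        lo, hi = max(q - wi, 0), min(q, len(coeffs))
        counts.append(sum(coeffs[lo:hi]))      # coefficients of x^(q-wi) .. x^(q-1)
    return counts


def power_to_weight_ratios(q, w):
    """Normalized Banzhaf index divided by normalized weight, per voter."""
    b = banzhaf_counts(q, w)
    total_b, total_w = sum(b), sum(w)
    return [(bi / total_b) / (wi / total_w) for bi, wi in zip(b, w)]


# Swing-state weights in the order used in this post, Democrat quota of 75.
swing = [9, 29, 4, 11, 6, 16, 6, 4, 5, 15, 18, 20, 13, 10, 1, 1]
for weight, ratio in zip(swing, power_to_weight_ratios(75, swing)):
    print(weight, round(ratio, 4))
```

A ratio above 1 means that a voter has more power than its share of the weight alone would suggest.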
** A Republican candidate Going through all of the above, but with a quota of 95, produces in the end the following: | \u0026lt;l\u0026gt; | \u0026lt;l6\u0026gt; | \u0026lt;l8\u0026gt; | \u0026lt;l16\u0026gt; | \u0026lt;l11\u0026gt; | \u0026lt;l8\u0026gt; | \u0026lt;l8\u0026gt; | | States | EC Votes | Banzhaf indices | Shapley-Shubik indices | EC Votes Normalized | Ratio B to N | Ratio S to N | |----------------+----------+-----------------+------------------------+----------------------+--------------+--------------| | Florida | 29 | 0.186024 | 0.190086 | 0.172619 | 1.077658 | 1.101190 | | Pennsylvania | 20 | 0.119789 | 0.121871 | 0.119048 | 1.006230 | 1.023718 | | Ohio | 18 | 0.106258 | 0.107258 | 0.107143 | 0.991741 | 1.001075 | | Michigan | 16 | 0.094453 | 0.095156 | 0.095238 | 0.991756 | 0.999140 | | North Carolina | 15 | 0.088106 | 0.088410 | 0.089286 | 0.986789 | 0.990194 | | Virginia | 13 | 0.075362 | 0.074940 | 0.077381 | 0.973906 | 0.968460 | | Indiana | 11 | 0.064064 | 0.063568 | 0.065476 | 0.978439 | 0.970862 | | Wisconsin | 10 | 0.058073 | 0.057394 | 0.059524 | 0.975628 | 0.964219 | | Colorado | 9 | 0.052133 | 0.051209 | 0.053571 | 0.973140 | 0.955892 | | Iowa | 6 | 0.034692 | 0.033612 | 0.035714 | 0.971363 | 0.941142 | | Nevada | 6 | 0.034692 | 0.033612 | 0.035714 | 0.971363 | 0.941142 | | New Mexico | 5 | 0.028776 | 0.027715 | 0.029762 | 0.966885 | 0.931235 | | Idaho | 4 | 0.022912 | 0.021963 | 0.023810 | 0.962300 | 0.922436 | | New Hampshire | 4 | 0.022912 | 0.021963 | 0.023810 | 0.962300 | 0.922436 | | Maine CD 2 | 1 | 0.005877 | 0.005621 | 0.005952 | 0.987357 | 0.944289 | | Nebraska CD | 1 | 0.005877 | 0.005621 | 0.005952 | 0.987357 | 0.944289 | and we see a similar result as for the Democrat version, an obvious difference though being that Michigan has decreased its relative power, at least as measured using the Shapley-Shubik index. 
\\ * Voting power (2): computation :voting:algebra: :PROPERTIES: :EXPORT_FILE_NAME: voting_power_computation :EXPORT_DATE: 2020-12-31 :END: ** Naive implementation of Banzhaf power indices As we saw in the previous post, computation of the power indices can become unwieldy as the number of voters increases. However, we can very easily write a program to compute the Banzhaf power indices simply by looping over all subsets of the weights: #+BEGIN_SRC python def banzhaf1(q,w): n = len(w) inds = [0]*n P = [[]] # the next two lines create the powerset of 0,1,...,(n-1) for i in range(n): P += [p+[i] for p in P] for S in P[1:]: T = [w[s] for s in S] if sum(T) \u0026gt;= q: for s in S: rest = [t for t in S if t != s] if sum(w[j] for j in rest)\u0026lt;q: inds[s]+=1 return(inds) #+END_SRC And we can test it: #+BEGIN_SRC python In [1]: q = 51; w = [49,49,2] In [2]: banzhaf1(q,w) Out[2]: [2, 2, 2] In [3]: banzhaf1(12,[4,4,4,2,2,1]) Out[3]: [10, 10, 10, 6, 6, 0] #+END_SRC The origin of the Banzhaf power indices was when John Banzhaf explored the fairness of a local voting system where six bodies had votes 9, 9, 7, 3, 1, 1 and a majority of 16 was required to pass any motion: #+BEGIN_SRC python In [4]: banzhaf1(16,[9,9,7,3,1,1]) Out[4]: [16, 16, 16, 0, 0, 0] #+END_SRC This result led Banzhaf to campaign against this system as being manifestly unfair. ** Implementation using polynomials In 1976, the eminent mathematical political theorist [Steven Brams](https://en.wikipedia.org/wiki/Steven_Brams), along with Paul Affuso, in an article \u0026#34;Power and Size: A New Paradox\u0026#34; (in the journal /Theory and Decision/) showed how generating functions could be used effectively to compute the Banzhaf power indices. For example, suppose we have \\[ [6; 4,3,2,2] \\] and we wish to determine the power of the first voter. We consider the formal polynomial \\[ q_1(x) = (1+x^3)(1+x^2)(1+x^2) = 1 + 2x^2 + x^3 + x^4 + 2x^5 + x^7.
\\] The coefficient of $x^j$ is the number of ways all the /other/ voters can combine to form a weight sum equal to $j$. For example, there are two ways voters can join to create a sum of 5: voters 2 and 3, or voters 2 and 4. But there is only one way to create a sum of 4: with voters 3 and 4. Then the number of ways in which voter 1 will be necessary can be found by adding all the coefficients of $x^{6-4}$ to $x^5$. This gives a value $p(1) = 6$. In general, define \\[ q_i(x) = \\prod_{j\\ne i}(1+x^{w_j}) = a_0 + a_1x + a_2x^2 +\\cdots + a_kx^k. \\] Then it is easily shown that \\[ p(i) = \\sum_{j=q-w_i}^{q-1}a_j. \\] As another example, suppose we use this method to compute Luxembourg\u0026#39;s power in the EEC: \\[ q_6(x) = (1+x^4)^3(1+x^2)^2 = 1 + 2x^2 + 4x^4 + 6x^6 + 6x^8 + 6x^{10} + 4x^{12} + 2x^{14} + x^{16} \\] and we find $p(6)$ by adding the coefficients of $x^{12-w_6}$ to $x^{12-1}$, which produces zero. This can be readily implemented in Python, using the [sympy](https://www.sympy.org/en/index.html) library for symbolic computation. #+BEGIN_SRC python import sympy as sy def banzhaf(q,w): sy.var(\u0026#39;x\u0026#39;) n = len(w) inds = [] for i in range(n): p = 1 for j in range(i): p *= (1+x**w[j]) for j in range(i+1,n): p *= (1+x**w[j]) p = p.expand() inds += [sum(p.coeff(x,k) for k in range(q-w[i],q))] return(inds) #+END_SRC ** Computation of Shapley-Shubik index The use of permutations will clearly be too unwieldy. Even for say 15 voters, there are $2^{15}=32768$ subsets, but $1,307,674,368,000$ permutations, which is already too big for enumeration (except possibly on a very fast machine, or in parallel, or using a clever algorithm). 
The use of polynomials for computations in fact precedes the work of Brams and Affuso; it was published by Irwin Mann and Lloyd Shapley in 1962, in a \u0026#34;memorandum\u0026#34; called /Values of Large Games IV: Evaluating the Electoral College Exactly/ which happily you can find as a [PDF file here](https://apps.dtic.mil/dtic/tr/fulltext/u2/276368.pdf). Building on some previous work, they showed that the Shapley-Shubik index corresponding to voter $i$, could be defined as \\[ \\Phi_i=\\sum_{k=0}^{n-1}\\frac{k!(n-1-k)!}{n!}\\sum_{j=q-w_i}^{q-1}c_{jk} \\] where $c_{jk}$ is the coefficient of $x^jy^k$ in the expansion of \\[ f_i(x,y)=\\prod_{m\\ne i}(1+x^{w_m}y). \\] This of course has many similarities to the polynomial definition of the Banzhaf power index, and can be computed similarly: #+BEGIN_SRC python def shapley(q,w): sy.var(\u0026#39;x,y\u0026#39;) n = len(w) inds = [] for i in range(n): p = 1 for j in range(i): p *= (1+y*x**w[j]) for j in range(i+1,n): p *= (1+y*x**w[j]) p = p.expand() B = [] for j in range(n): pj = p.coeff(y,j) B += [sum(pj.coeff(x,k) for k in range(q-w[i],q))] inds += [sum(sy.Float(B[j]/sy.binomial(n,j)/(n-j)) for j in range(n))] return(inds) #+END_SRC ** A few simple examples The Australian (federal) Senate consists of 76 members, of which a simple majority is required to pass a bill. It is unusual for the current elected government (which will have a majority in the lower house: the House of Representatives) also to have a majority in the Senate. Thus it is quite possible for a party with small numbers to wield significant power. A case in point is that of the \u0026#34;[Australian Democrats](https://en.wikipedia.org/wiki/Australian_Democrats)\u0026#34; party, founded in 1977 by a disaffected ex-Liberal politician called [Don Chipp](https://en.wikipedia.org/wiki/Don_Chipp), with the uniquely Australian slogan \u0026#34;Keep the Bastards Honest\u0026#34;. 
For nearly two decades they were a vital force in Australian politics; they have pretty much lost all power they once had, although the [party still exists](https://www.democrats.org.au). Here\u0026#39;s a little table showing the Senate composition in various years: #+ATTR_CSS: :width 75% | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | | Party | 1985 | 2000 | 2020 | |---------------------+------+------+------| | Government | 34 | 35 | 36 | | Opposition | 33 | 29 | 26 | | Democrats | 7 | 9 | | | Independent | 1 | 1 | 1 | | Nuclear Disarmament | 1 | | | | Greens | | 1 | 9 | | One Nation | | 1 | 2 | | Centre Alliance | | | 1 | | Lambie Party | | | 1 | This composition in 1985 can be described as \\[ [39; 34,33,7,1,1]. \\] And now: #+BEGIN_SRC python In [1]: b = banzhaf(39,[34,33,7,1,1]) [sy.N(x/sum(b),4) for x in b] Out[1]: [0.3333, 0.3333, 0.3333, 0, 0] In [2]: s = shapley(39,[34,33,7,1,1]) [sy.N(x,4) for x in s] Out[2]: [0.3333, 0.3333, 0.3333, 0, 0] #+END_SRC Here we see that both power indices give the same result: that the Democrats had equal power in the Senate to the two major parties, and the other two senate members had no power at all. In 2000, we have $[39;35,29,9,1,1,1]$ and: #+BEGIN_SRC python In [1]: b = banzhaf(39,[35,29,9,1,1,1]) [sy.N(x/sum(b),4) for x in b] Out[1]: [0.34, 0.3, 0.3, 0.02, 0.02, 0.02] In [2]: s = shapley(39,[35,29,9,1,1,1]) [sy.N(x,4) for x in s] Out[2]: [0.35, 0.3, 0.3, 0.01667, 0.01667, 0.01667] #+END_SRC We see here that the two power indices give two slightly different results, but in each case the power of the Democrats was equal to that of the opposition, and this time the parties with single members had real (if small) power. 
By 2020 the Democrats have disappeared as a political force, their place being more-or-less taken (at least numerically) by the Greens: #+BEGIN_SRC python In [1]: b = banzhaf(39,[36,26,9,2,1,1,1]) [sy.N(x/sum(b),4) for x in b] Out[1]: [0.5306, 0.1224, 0.1224, 0.102, 0.04082, 0.04082, 0.04082] In [2]: s = shapley(39,[36,26,9,2,1,1,1]) [sy.N(x,4) for x in s] Out[2]: [0.5191, 0.1357, 0.1357, 0.1024, 0.03571, 0.03571, 0.03571] #+END_SRC This shows a very different sort of power balance to previously: the Government has much more power in the Senate, partly due to having close to a majority and partly because of the fracturing of other Senate members through a host of smaller parties. Note that the Greens, while having more members than the Democrats did in 1985, have far less power. Note also that One Nation, while only having twice as many members as the singleton parties, has far more power: 2.5 times by Banzhaf, 2.8667 times by Shapley-Shubik. * Voting power :voting: :PROPERTIES: :EXPORT_FILE_NAME: voting_power :EXPORT_DATE: 2020-12-30 :END: After the 2020 American Presidential election, with the usual post-election analyses and (in this case) vast numbers of lawsuits, I started looking at the Electoral College, and trying to work out how it worked in terms of power. Although power is often conflated simply with the number of votes, that\u0026#39;s not necessarily the case. We consider /power/ as the ability of any state to affect the outcome of an election. Clearly a state with more votes, such as California with 55, will be more powerful than a state with fewer, for example Wyoming with 3. But often power is not directly correlated with size. For example, imagine a version of America with just 3 states, Alpha, Beta, and Gamma, with electoral votes 49, 49, 2 respectively, and 51 votes needed to win. 
The following table shows the ways that the states can join to reach (or exceed) that majority, and in each case which state is \u0026#34;necessary\u0026#34; for the win: #+ATTR_CSS: :width 75% | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | | Winning Coalitions | Votes won | Necessary States | |--------------------+-----------+------------------| | Alpha, Beta | 98 | Alpha, Beta | | Alpha, Gamma | 51 | Alpha, Gamma | | Beta, Gamma | 51 | Beta, Gamma | | Alpha, Beta, Gamma | 100 | No single state | By \u0026#34;necessary states\u0026#34; we mean a state whose votes are necessary for the win. And in looking at that table, we see that in terms of influencing the vote, Gamma, with only 2 electors, is /equally as powerful/ as the other two states. To give another example, the Treaty Of Rome in the 1950\u0026#39;s established the first version of the European Common Market, with six member states, each allocated a number of votes for decision making: #+ATTR_CSS: :width 50% | | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | | | Member | Votes | |---+-----------------+-------| | 1 | France | 4 | | 2 | West Germany | 4 | | 3 | Italy | 4 | | 4 | The Netherlands | 2 | | 5 | Belgium | 2 | | 6 | Luxembourg | 1 | The treaty determined that a quota of 12 votes was needed to pass any resolution. At first this table might seem manifestly unfair: West Germany with a population of over 55 million compared with Luxembourg\u0026#39;s roughly 1/3 of a million, thus with something like 160 times the population, West Germany got only 4 times the number of votes of Luxembourg. But in fact it\u0026#39;s even worse: since 12 votes are required to win, and all the other numbers of votes are even, there is no way that Luxembourg can influence any vote at all: its voting power was zero. If another state joined, also with a vote of 1, then it and Luxembourg together can influence a vote, and so Luxembourg\u0026#39;s voting power would increase. 
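The claim about Luxembourg is easy to check by machine. What follows is only a small Python sketch: the helper name =times_necessary= is made up for illustration, and the last line assumes (hypothetically) that the quota would stay at 12 if a seventh member with one vote joined.

```python
from itertools import combinations

def times_necessary(q, weights, i):
    """Count the coalitions of the *other* voters that lose on their own
    but reach the quota q once voter i joins them."""
    others = [w for j, w in enumerate(weights) if j != i]
    count = 0
    for r in range(len(others) + 1):
        for coalition in combinations(others, r):
            total = sum(coalition)
            # voter i is necessary exactly when its weight lifts the total to the quota
            if total < q <= total + weights[i]:
                count += 1
    return count

eec = [4, 4, 4, 2, 2, 1]
print(times_necessary(12, eec, 5))        # Luxembourg among the original six
print(times_necessary(12, eec + [1], 5))  # Luxembourg with a hypothetical seventh member
```

This prints 0 and then 6: with six members there is no coalition that Luxembourg can tip, but with a second one-vote member there are six.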
A /power index/ is some numerical value attached to a weighted vote which describes its power in this sense. Although there are many such indices, there are two which are most widely used. The first was developed by [Lloyd Shapley](https://en.wikipedia.org/wiki/Lloyd_Shapley) (who would win the Nobel Prize for Economics in 2012) and [Martin Shubik](https://en.wikipedia.org/wiki/Martin_Shubik) in 1954; the second by [John Banzhaf](https://en.wikipedia.org/wiki/John_Banzhaf) in 1965. ** Basic definitions First, some notation. In general we will have $n$ voters each with a /weight/ $w_i$, and a quota $q$ to be reached. For the American Electoral College, the voters are the states, the weights are the numbers of Electoral votes, and $q$ is the number of votes required: 270. This is denoted as \\[ [q; w_1, w_2,\\ldots,w_n]. \\] The three state example above is thus denoted \\[ [51; 49, 49, 2] \\] and the EEC votes as \\[ [12; 4,4,4,2,2,1]. \\] *** The Shapley-Shubik index Suppose we have $n$ voters with weights $w_1$, $w_2$ up to $w_n$, and a quota $q$ required. Consider all permutations of $1,2, \\ldots,n$. For each permutation, add up the weights starting at the left, and designate as the /pivot voter/ the first voter who causes the cumulative sum to equal or exceed the quota. For each voter $i$, let $s_i$ be the number of times that voter has been chosen as a pivot. Then its power index is $s_i/n!$. This means that the sum of all power indices is unity. 
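For small games this definition can be turned directly into a program. Here is a minimal Python sketch that walks through every permutation and records the pivot voter; the function name is my own choice for illustration, and the approach is only practical for a handful of voters, since the number of permutations grows as $n!$.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def shapley_shubik_by_permutations(q, weights):
    """Shapley-Shubik indices straight from the definition: count pivots over all orderings."""
    n = len(weights)
    pivots = [0] * n
    for order in permutations(range(n)):
        running = 0
        for voter in order:
            running += weights[voter]
            if running >= q:       # the first voter to reach the quota is the pivot
                pivots[voter] += 1
                break
    # s_i / n!, kept as exact fractions
    return [Fraction(s, factorial(n)) for s in pivots]

print(shapley_shubik_by_permutations(51, [49, 49, 2]))
```

For the three state example $[51; 49, 49, 2]$ each of the three indices comes out as 1/3.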
Consider the three state example above, where $w_1=w_2=49$ and $w_3=2$, and where we compute cumulative sums only up to reaching or exceeding the quota: #+ATTR_CSS: :width 75% | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | | Permutation | Cumulative sum of weights | Pivot Voter | |-------------+---------------------------+-------------| | 1 2 3 | 49, 98 | 2 | | 1 3 2 | 49, 51 | 3 | | 2 1 3 | 49, 98 | 1 | | 2 3 1 | 49, 51 | 3 | | 3 1 2 | 2, 51 | 1 | | 3 2 1 | 2, 51 | 2 | We see that $s_1=s_2=s_3=2$ and so the Shapley-Shubik power indices are all $1/3$. *** The Banzhaf index For the Banzhaf index, we consider the /winning coalitions/: these are any subset $S$ of voters for which the sum of weights is not less than $q$. It\u0026#39;s convenient to define a function for this: \\[ v(S) = \\begin{cases} 1 \u0026amp; \\text{if } \\sum_{i\\in S}w_i\\ge q \\cr 0 \u0026amp; \\text{otherwise} \\end{cases} \\] A voter $i$ is /necessary/ for a winning coalition $S$ if $S-\\{i\\}$ is not a winning coalition; that is, if $v(S)-v(S-\\{i\\})=1$. If we define \\[ p(i) =\\sum_S \\bigl(v(S)-v(S-\\{i\\})\\bigr) \\] then $p(i)$ is a measure of power, and the (normalized) Banzhaf power indices are defined as \\[ b(i) = \\frac{p(i)}{\\sum_i p(i)} \\] so that the sum of all indices (as for the Shapley-Shubik index) is again unity. Considering the first table above, we see that $p(1)=p(2)=p(3)=2$ and the Banzhaf power indices are all $1/3$. For this example the Banzhaf and Shapley-Shubik values agree. This is not always the case. 
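The definitions of $v(S)$, $p(i)$ and $b(i)$ translate almost word for word into Python. The following is a brute-force sketch over all coalitions, with function names chosen purely for illustration; it is meant to mirror the formulas rather than to be efficient.

```python
from fractions import Fraction
from itertools import combinations

def v(q, weights, S):
    """1 if the coalition S (a set of voter positions) reaches the quota q, otherwise 0."""
    return 1 if sum(weights[i] for i in S) >= q else 0

def banzhaf_by_definition(q, weights):
    """Return the raw counts p(i) and the normalized indices b(i)."""
    n = len(weights)
    p = [0] * n
    for r in range(1, n + 1):
        for members in combinations(range(n), r):
            S = set(members)
            for i in S:
                # contributes 1 exactly when i is necessary for the winning coalition S
                p[i] += v(q, weights, S) - v(q, weights, S - {i})
    total = sum(p)
    return p, [Fraction(x, total) for x in p]

counts, indices = banzhaf_by_definition(51, [49, 49, 2])
print(counts, indices)
```

Only coalitions containing voter $i$ are examined, since for any other $S$ the difference $v(S)-v(S-\{i\})$ is zero. For the three state example the counts are all 2 and the indices all 1/3.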
For the EEC example, the winning coalitions are, with necessary voters: | \u0026lt;l20\u0026gt; | \u0026lt;l10\u0026gt; | \u0026lt;l15\u0026gt; | | Winning Coalition | Votes | Necessary voters | |-------------------+-------+------------------| | 1,2,3 | 12 | 1,2,3 | | 1,2,4,5 | 12 | 1,2,4,5 | | 1,3,4,5 | 12 | 1,3,4,5 | | 2,3,4,5 | 12 | 2,3,4,5 | | 1,2,3,6 | 13 | 1,2,3 | | 1,2,4,5,6 | 13 | 1,2,4,5 | | 1,3,4,5,6 | 13 | 1,3,4,5 | | 2,3,4,5,6 | 13 | 2,3,4,5 | | 1,2,3,4 | 14 | 1,2,3 | | 1,2,3,5 | 14 | 1,2,3 | | 1,2,3,4,6 | 15 | 1,2,3 | | 1,2,3,5,6 | 15 | 1,2,3 | | 1,2,3,4,5 | 16 | No single voter | | 1,2,3,4,5,6 | 17 | No single voter | Counting up the number of times each voter appears in the rightmost column, we see that \\[ p(1) = p(2) = p(3) = 10,\\quad p(4) = p(5) = 6,\\quad p(6) = 0 \\] and so \\[ b(1) = b(2) = b(3) = \\frac{5}{21},\\quad b(4) = b(5) = \\frac{1}{7}. \\] Note that the power of the three biggest states is in fact only 5/3 times that of the smaller states, in spite of having twice as many votes. This is a striking example of how power is not proportional to voting weight. Note that computing the Shapley-Shubik index could be unwieldy; there are \\[ \\frac{6!}{3!2!} = 60 \\] different permutations of the weights, and clearly as the number of weights increases, possibly with very few repetitions, the number of permutations will be excessive. For the Electoral College, with 51 members, and a few states with the same numbers of voters, the total number of permutations will be \\[ \\frac{51!}{(2!)^4(3!)^3(4!)(5!)(6!)(8!)} = 5368164393879631593058456306349344975896576000000000 \\] which is clearly far too large for enumeration. But as we shall see, there are other methods. * Electing a president :voting:linear_programming:julia: :PROPERTIES: :EXPORT_FILE_NAME: electing_a_president :EXPORT_DATE: 2020-11-07 :END: Every four years (barring death or some other catastrophe), the USA goes through the periodic madness of a presidential election. 
Wild behaviour, inaccuracies, mud-slinging from both sides have been central since George Washington\u0026#39;s second term. And the entire business of voting is muddied by the Electoral College, the 538 members of which do the actual voting: the public, in their own voting, merely instruct the College what to do. Although it has been said that the EC \u0026#34;magnifies\u0026#34; the popular vote, this is not always the case, and quite often a president will be elected with a majority (270 or more) of Electoral College votes, in spite of losing the popular vote. This dichotomy encourages periodic calls for the College to be disbanded. As you probably know, each of the 50 states and the District of Columbia has Electors allocated to it, roughly proportional to population. Thus California, the most populous state, has 55 electors, and several smaller states (and DC) only 3. In all states except Maine and Nebraska, the votes are allocated on a \u0026#34;winner takes all\u0026#34; principle: that is, all the Electoral votes will be allocated to whichever candidate has obtained a plurality in that state. For only two candidates then, if a state\u0026#39;s voters produce a simple majority of votes for one of them, that candidate gets all the EC votes. Maine and Nebraska however, allocate their EC votes by congressional district. In each state, 2 EC votes are allocated to the winner of the popular vote in the state, and for each congressional district (2 in Maine, 3 in Nebraska), the other votes are allocated to the winner in that district. It\u0026#39;s been a bit of a mathematical game to determine the theoretical lowest bound on a popular vote for a president to be elected. 
To show how this works, imagine a miniature system with four states and 14 electoral college votes: #+ATTR_CSS: :width 50% | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | | State | Population | Electors | |-------+------------+----------| | Abell | 100 | 3 | | Bisly | 100 | 3 | | Champ | 120 | 4 | | Dairy | 120 | 4 | Operating on the winner takes all principle in each state, 8 EC votes are required for a win. Suppose that in each state, the votes are cast as follows, for the candidates Mr Man and Mr Guy: #+ATTR_CSS: :width 75% | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | \u0026lt;l\u0026gt; | | State | Mr Man | Mr Guy | EC Votes to Man | EC Votes to Guy | |-----------+--------+--------+-----------------+-----------------| | Abell | 0 | 100 | 0 | 3 | | Bisly | 0 | 100 | 0 | 3 | | Champ | 61 | 59 | 4 | 0 | | Dairy | 61 | 59 | 4 | 0 | |-----------+--------+--------+-----------------+-----------------| | **Total** | 122 | 318 | 8 | 6 | and Mr Man wins with 8 EC votes but only about 27.7% of the popular vote (122 of 440). Now you might reasonably argue that this situation would never occur in practice, and probably you\u0026#39;re right. But extreme examples such as this are used to show up inadequacies in voting systems. And sometimes very strange things /do/ happen. So: what is the smallest percentage of the popular vote under which a president could be elected? To experiment, we need to know the number of registered voters in each state (and it appears that the percentage of eligible citizens enrolled to vote differs markedly between the states), and the numbers of electors. The first I ran to ground [here](https://worldpopulationreview.com/state-rankings/number-of-registered-voters-by-state) and the few states not accounted for I found information on their Attorney Generals\u0026#39; sites. 
The one state for which I couldn\u0026#39;t find statistics was Illinois, so I used the number 7.8 million, which has been bandied about on a few news sites. The numbers of electors per state is easy to find, for example on the [wikipedia page](https://en.wikipedia.org/wiki/United_States_Electoral_College). I make the following simplifying assumptions: all registered voters will vote; and all states operate on a winner takes all principle. Thus, for simplicity, I am not using the apportionment scheme of Maine and Nebraska. (I suspect that taking this into account wouldn\u0026#39;t affect the result much anyway.) Suppose that the registered voting population of each state (including DC) is $v_i$ and the number of EC votes is $c_i$. For any state, either the winner will be chosen by a bare majority, or all the votes will go to the loser. This becomes then a simple integer programming problem; in fact a knapsack problem. For each state, define \\[ m_i = \\lfloor v_i/2\\rfloor +1 \\] for the majority votes needed. We want to minimize \\[ V = \\sum_{i=1}^{51}x_im_i \\] subject to the constraint \\[ \\sum_{i=1}^{51}c_ix_i \\ge 270 \\] and each $x_i$ is zero or one. Now all we need to do is set up this problem in a suitable system and solve it! I chose [Julia](https://julialang.org/) and its [JuMP](https://jump.dev/) modelling language, and for actually doing the dirty work, [GLPK](https://www.gnu.org/software/glpk/). JuMP in fact can be used with pretty much any optimisation software available, including commercial systems. 
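Before turning to the full 51-variable model in Julia, the formulation itself can be sanity-checked by brute force on the four-state miniature described earlier. This is only a sketch in Python (not JuMP, and not part of the solution proper): it tries every 0/1 choice of the $x_i$, which is feasible for four states but hopeless for 51. The function name =cheapest_win= is made up for illustration.

```python
from itertools import product

def cheapest_win(populations, electors, needed):
    """Minimise the sum of bare majorities m_i over the states won,
    subject to winning at least `needed` electoral votes."""
    names = list(populations)
    majority = {s: populations[s] // 2 + 1 for s in names}  # m_i = floor(v_i/2) + 1
    best = None
    for x in product([0, 1], repeat=len(names)):            # every choice of the x_i
        ec = sum(electors[s] * xi for s, xi in zip(names, x))
        if ec >= needed:
            votes = sum(majority[s] * xi for s, xi in zip(names, x))
            if best is None or votes < best[0]:
                best = (votes, [s for s, xi in zip(names, x) if xi])
    return best

populations = {"Abell": 100, "Bisly": 100, "Champ": 120, "Dairy": 120}
electors = {"Abell": 3, "Bisly": 3, "Champ": 4, "Dairy": 4}
print(cheapest_win(populations, electors, 8))
```

This prints =(122, ['Champ', 'Dairy'])=, the same 122 votes as in the table for Mr Man, which confirms that bare majorities in the two larger states are the cheapest route to 8 electoral votes.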
#+BEGIN_SRC Julia using JuMP, GLPK states = [\u0026#34;Alabama\u0026#34;,\u0026#34;Alaska\u0026#34;,\u0026#34;Arizona\u0026#34;,\u0026#34;Arkansas\u0026#34;,\u0026#34;California\u0026#34;,\u0026#34;Colorado\u0026#34;,\u0026#34;Connecticut\u0026#34;,\u0026#34;Delaware\u0026#34;,\u0026#34;DC\u0026#34;,\u0026#34;Florida\u0026#34;,\u0026#34;Georgia\u0026#34;,\u0026#34;Hawaii\u0026#34;, \u0026#34;Idaho\u0026#34;,\u0026#34;Illinois\u0026#34;,\u0026#34;Indiana\u0026#34;,\u0026#34;Iowa\u0026#34;,\u0026#34;Kansas\u0026#34;,\u0026#34;Kentucky\u0026#34;,\u0026#34;Louisiana\u0026#34;,\u0026#34;Maine\u0026#34;,\u0026#34;Maryland\u0026#34;,\u0026#34;Massachusetts\u0026#34;,\u0026#34;Michigan\u0026#34;,\u0026#34;Minnesota\u0026#34;, \u0026#34;Mississippi\u0026#34;,\u0026#34;Missouri\u0026#34;,\u0026#34;Montana\u0026#34;,\u0026#34;Nebraska\u0026#34;,\u0026#34;Nevada\u0026#34;,\u0026#34;New Hampshire\u0026#34;,\u0026#34;New Jersey\u0026#34;,\u0026#34;New Mexico\u0026#34;,\u0026#34;New York\u0026#34;,\u0026#34;North Carolina\u0026#34;, \u0026#34;North Dakota\u0026#34;,\u0026#34;Ohio\u0026#34;,\u0026#34;Oklahoma\u0026#34;,\u0026#34;Oregon\u0026#34;,\u0026#34;Pennsylvania\u0026#34;,\u0026#34;Rhode Island\u0026#34;,\u0026#34;South Carolina\u0026#34;,\u0026#34;South Dakota\u0026#34;,\u0026#34;Tennessee\u0026#34;,\u0026#34;Texas\u0026#34;,\u0026#34;Utah\u0026#34;, \u0026#34;Vermont\u0026#34;,\u0026#34;Virginia\u0026#34;,\u0026#34;Washington\u0026#34;,\u0026#34;West Virginia\u0026#34;,\u0026#34;Wisconsin\u0026#34;,\u0026#34;Wyoming\u0026#34;] reg_voters = [3560686,597319,4281152,1755775,22047448,4238513,2375537,738563,504043,14065627,7233584,795248,1010984,7800000,4585024, 2245092,1851397,3565428,3091340,1063383,4141498,4812909,8127040,3588563,2262810,4213092,696292,1252089,1821356,913726,6486299, 1350181,13555547,6838231,540302,8080050,2259113,2924292,9091371,809821,3486879,578666,3931248,16211198,1857861,495267,5975696, 4861482,1268460,3684726,268837] majorities = [Int(floor(x/2+1)) for 
x in reg_voters] ec_votes = [9,3,11,6,55,9,7,3,3,29,16,4,4,20,11,6,6,8,8,4,10,11,16,10,6,10,3,5,6,4,14,5,29,15,3,18,7,7,20,4,9,3,11,38,6,3,13,12,5,10,3] potus = Model(GLPK.Optimizer) @variable(potus, x[i=1:51], Bin) @constraint(potus, sum(ec_votes .* x) \u0026gt;= 270) @objective(potus, Min, sum(majorities .* x)); #+END_SRC Solving the problem is now easy: #+BEGIN_SRC Julia optimize!(potus) #+END_SRC Now let\u0026#39;s see what we\u0026#39;ve got: #+BEGIN_SRC Julia vx = value.(x) sum(ec_votes .* vx) 270.0 votes = Int(objective_value(potus)) 46146767 votes*100/sum(reg_voters) 21.584985938021866 #+END_SRC and we see we have elected a president with slightly less than 21.6% of the popular vote. Digging a little further, we first find the states in which a bare majority voted for the winner: #+BEGIN_SRC Julia f = findall(x -\u0026gt; x == 1.0, vx) for i in f print(states[i],\u0026#34;, \u0026#34;) end Alabama, Alaska, Arizona, Arkansas, California, Connecticut, Delaware, DC, Hawaii, Idaho, Illinois, Indiana, Iowa, Kansas, Louisiana, Maine, Minnesota, Mississippi, Montana, Nebraska, Nevada, New Hampshire, New Mexico, North Dakota, Oklahoma, Oregon, Rhode Island, South Carolina, South Dakota, Tennessee, Utah, Vermont, West Virginia, Wisconsin, Wyoming, #+END_SRC and the other states, in which every voter voted for the loser: #+BEGIN_SRC Julia nf = findall(x -\u0026gt; x == 0.0, vx) for i in nf print(states[i],\u0026#34;, \u0026#34;) end Colorado, Florida, Georgia, Kentucky, Maryland, Massachusetts, Michigan, Missouri, New Jersey, New York, North Carolina, Ohio, Pennsylvania, Texas, Virginia, Washington, #+END_SRC In point of history, the election in which the president-elect did worst was in 1824, when John Quincy Adams was elected over Andrew Jackson; this was in fact a four-way contest, and the decision was in the end made by the House of Representatives, who elected Adams by one vote. 
And Jackson, never one to neglect an opportunity for vindictiveness, vowed that he would destroy Adams\u0026#39;s presidency, which he did. More recently, since the Electoral College has sat at 538 members, in 2000 George W. Bush won in spite of losing the popular vote by 0.51%, and in 2016 Donald Trump won in spite of losing the popular vote by 2.09%. Plenty of numbers can be found on [wikipedia](https://en.wikipedia.org/wiki/List_of_United_States_presidential_elections_by_popular_vote_margin) and elsewhere. * Enumerating the rationals :mathematics: :PROPERTIES: :EXPORT_FILE_NAME: enumerating_the_rationals :EXPORT_HUGO_CUSTOM_FRONT_MATTER: :mathjax true :EXPORT_DATE: 2020-01-18 :END: The /rational numbers/ are well known to be countable, and one standard method of counting them is to put the positive rationals into an infinite matrix $M=m_{ij}$, where $m_{ij}=i/j$ so that you end up with something that looks like this: \\[ \\left[\\begin{array}{ccccc} \\frac{1}{1}\u0026amp;\\frac{1}{2}\u0026amp;\\frac{1}{3}\u0026amp;\\frac{1}{4}\u0026amp;\\dots\\\\\\\\[1ex] \\frac{2}{1}\u0026amp;\\frac{2}{2}\u0026amp;\\frac{2}{3}\u0026amp;\\frac{2}{4}\u0026amp;\\dots\\\\\\\\[1ex] \\frac{3}{1}\u0026amp;\\frac{3}{2}\u0026amp;\\frac{3}{3}\u0026amp;\\frac{3}{4}\u0026amp;\\dots\\\\\\\\[1ex] \\frac{4}{1}\u0026amp;\\frac{4}{2}\u0026amp;\\frac{4}{3}\u0026amp;\\frac{4}{4}\u0026amp;\\dots\\\\\\\\[1ex] \\vdots\u0026amp;\\vdots\u0026amp;\\vdots\u0026amp;\\vdots\u0026amp;\\ddots \\end{array}\\right] \\] It is clear that not only will each positive rational appear somewhere in this matrix, but its value will appear an infinite number of times. For example $2 / 3$ will appear also as $4 / 6$, as $6 / 9$ and so on. 
Then we can enumerate all the elements of this matrix by traversing all the SW--NE diagonals: #+ATTR_HTML: :width 250 [[file:/enumerate_rationals.png]] This provides an enumeration of all the positive rationals: \\[ \\frac{1}{1}, \\frac{1}{2}, \\frac{2}{1}, \\frac{3}{1}, \\frac{2}{2}, \\frac{1}{3}, \\frac{1}{4}, \\frac{2}{3},\\ldots \\] To enumerate all rationals (positive and negative), we simply place the negative of each value immediately after it: \\[ \\frac{1}{1}, -\\frac{1}{1}, \\frac{1}{2}, -\\frac{1}{2}, \\frac{2}{1}, -\\frac{2}{1}, \\frac{3}{1}, -\\frac{3}{1}, \\frac{2}{2}, -\\frac{2}{2}, \\frac{1}{3}, -\\frac{1}{3}, \\frac{1}{4}, -\\frac{1}{4}, \\frac{2}{3}, -\\frac{2}{3}\\ldots \\] This is all standard, well-known stuff, and as far as countability goes, pretty trivial. One might reasonably ask: is there a way of enumerating all rationals in such a way that no rational is repeated, and that every rational appears naturally in its lowest form? Indeed there is; in fact there are several, of which one of the newest, most elegant, and simplest, is using the [Calkin-Wilf tree](https://en.wikipedia.org/wiki/Calkin–Wilf_tree). This is named for its discoverers (or creators, depending on which philosophy of mathematics you espouse), who described it in [an article](https://www.math.upenn.edu/~wilf/website/recounting.pdf) happily available on the archived web site of the [second author](https://www.math.upenn.edu/~wilf/). Herbert Wilf died in 2012, but the Mathematics Department at the University of Pennsylvania have maintained the page as he left it, as an online memorial to him. The Calkin-Wilf tree is a binary tree with root $a / b = 1 / 1$. From each node $a / b$ the left child is $a / (a+b)$ and the right child is $(a+b) / b$. From each node $a / b$, the path back to the root contains the fractions which encode, as it were, the Euclidean algorithm for determining the greatest common divisor of $a$ and $b$. 
It is not hard to show that every fraction in the tree is in its lowest terms, and appears only once; also that every rational appears in the tree. The enumeration of the rationals can thus be made by a breadth-first traversal of the tree; in other words listing each level of the tree one after the other: \\[ \\underbrace{\\frac{1}{1}}_{\\text{The root}},\\; \\underbrace{\\frac{1}{2},\\; \\frac{2}{1}}_{\\text{first level}},\\; \\underbrace{\\frac{1}{3},\\; \\frac{3}{2},\\; \\frac{2}{3},\\; \\frac{3}{1}}_{\\text{second level}},\\; \\underbrace{\\frac{1}{4},\\; \\frac{4}{3},\\; \\frac{3}{5},\\; \\frac{5}{2},\\; \\frac{2}{5},\\; \\frac{5}{3},\\; \\frac{3}{4},\\; \\frac{4}{1}}_{\\text{third level}}\\;\\ldots \\] Note that the denominator of each fraction is the numerator of its successor (again, this is not hard to prove in general); thus given the sequence \\[ b_i=0,1,1,2,1,3,2,3,1,4,3,5,2,5,3,4,\\ldots \\] (indexed from zero), the rationals are enumerated by $b_i/b_{i+1}$ for $i\\ge 1$. This sequence pre-dates Calkin and Wilf; it goes back to an older enumeration now called the [Stern-Brocot tree](https://en.wikipedia.org/wiki/Stern–Brocot_tree) named for the mathematician Moritz Stern and the clock-maker Achille Brocot (who was investigating gear ratios), who discovered this tree independently in the early 1860\u0026#39;s. The sequence $b_i$ is called /Stern\u0026#39;s diatomic sequence/ and can be generated recursively: \\[ b_i=\\left\\{\\begin{array}{ll} i,\u0026amp;\\text{if $i\\le 1$}\\\\ b_{i/2},\u0026amp;\\text{if $i$ is even}\\\\ b_{(i-1)/2}+b_{(i+1)/2},\u0026amp;\\text{if $i$ is odd} \\end{array} \\right. \\] Alternatively: \\[ b_0=0,\\;b_1=1,\\;b_{2i}=b_i,\\;b_{2i+1}=b_i+b_{i+1}\\text{ for }i\\ge 1. \\] This is the form in which it appears as [sequence 2487 in the OEIS](https://oeis.org/A002487). So we can generate Stern\u0026#39;s diatomic sequence $b_i$, and then the successive fractions $b_i/b_{i+1}$ will generate each rational exactly once. 
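The recurrence is short enough to try out directly. Here is a minimal Python sketch (the function names =stern= and =first_rationals= are just illustrative) which builds the sequence with $b_{2i}=b_i$ and $b_{2i+1}=b_i+b_{i+1}$ and then forms the quotients of neighbouring terms, starting from $b_1/b_2$:

```python
from fractions import Fraction

def stern(n):
    """Return the list [b_0, ..., b_n] of Stern's diatomic sequence."""
    b = [0, 1]
    for k in range(2, n + 1):
        half = k // 2
        # even index: b_{2i} = b_i; odd index: b_{2i+1} = b_i + b_{i+1}
        b.append(b[half] if k % 2 == 0 else b[half] + b[half + 1])
    return b[: n + 1]

def first_rationals(count):
    """The first `count` positive rationals b_i / b_{i+1}, for i = 1, 2, ..."""
    b = stern(count + 1)
    return [Fraction(b[i], b[i + 1]) for i in range(1, count + 1)]

print(stern(15))
print([str(f) for f in first_rationals(7)])
```

The first call reproduces the sixteen terms listed above, and the second gives 1, 1/2, 2, 1/3, 3/2, 2/3, 3: the root and the first two levels of the tree, in order.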
If that isn\u0026#39;t remarkable enough, sometime prior to 2003, Moshe Newman showed that the Calkin-Wilf enumeration of the rationals can in fact be done directly: \\[ x_0 = 1,\\quad x_{i+1}=\\frac{1}{2\\lfloor x_i\\rfloor -x_i +1}\\;\\text{for}\\;i\\ge 0 \\] will generate all the positive rationals. I can\u0026#39;t find anything at all about Moshe Newman; he is always just mentioned as having \u0026#34;shown\u0026#34; this result. Never where, or to whom. There is a proof for this in an article \u0026#34;New Looks at Old Number Theory\u0026#34; by Aimeric Malter, Dierk Schleicher and Don Zagier, published in /The American Mathematical Monthly/ , Vol. 120, No. 3 (March 2013), pp. 243-264. The part of the article relating to enumeration of rationals is based on a prize-winning mathematical essay by the first author (who at the time was a high school student in Bremen, Germany), when he was only 13. Here is the skeleton of Malter\u0026#39;s proof: If $x$ is any node, then its left and right hand children are $L = x / (x+1)$ and $R = 1+x = 1 / (1-L)$ respectively. And clearly $R = 1/(2\\lfloor L\\rfloor -L +1)$, since $\\lfloor L\\rfloor = 0$. Suppose now that $A$ is a right child, and $B$ is its successor rational. Then $A$ and $B$ will have a common ancestor $z=p/q$, say $k$ generations ago. To get from $z$ to $A$ will require one left step and $k-1$ right steps. It is easy to show (by induction if you like), that \\[ A = k-1+\\frac{p}{p+q} \\] and for its successor $B$, obtained by one right step from $z$ and $k-1$ left steps: \\[ B = \\frac{1}{\\frac{q}{p+q}+k-1}. \\] Since $k-1=\\lfloor A\\rfloor$, and since \\[ \\frac{p}{p+q} = A-\\lfloor A\\rfloor \\] it follows that \\[ B=\\frac{1}{1-(A-\\lfloor A\\rfloor)+\\lfloor A\\rfloor}=\\frac{1}{2\\lfloor A\\rfloor-A+1}. \\] The remaining case is moving from the end of one row to the beginning of the next, that is, from $n$ to $1 / (n+1)$. And this is trivial. 
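Newman\u0026#39;s iteration is a one-liner in practice. The following Python sketch (the function name =newman= is only for illustration) uses exact fractions so that the floor is computed without rounding error:

```python
from fractions import Fraction
from math import floor

def newman(count):
    """First `count` terms of x_0 = 1, x_{i+1} = 1 / (2*floor(x_i) - x_i + 1)."""
    x = Fraction(1)
    terms = []
    for _ in range(count):
        terms.append(x)
        x = 1 / (2 * floor(x) - x + 1)
    return terms

print([str(t) for t in newman(8)])
```

The first eight terms are 1, 1/2, 2, 1/3, 3/2, 2/3, 3, 1/4, which is the breadth-first order of the tree, including the step from the end of one row ($3$) to the start of the next ($1/4$).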
What\u0026#39;s more, we can write down the isomorphism between this sequence of positive rationals and the positive integers. Define $N:\\Bbb{Q}\\to\\Bbb{Z}$ as follows: \\[ N(p/q)=\\left\\{\\begin{array}{ll} 1,\u0026amp;\\text{if $p=q$}\\\\ 2 N(p/(q-p)),\u0026amp;\\text{if $p\\lt q$}\\\\ 2 N((p-q)/q)+1,\u0026amp;\\text{if $p\\gt q$} \\end{array} \\right. \\] Without going through a formal proof, what this does is simply count the number of steps taken to perform the Euclidean algorithm on $p$ and $q$. The extra factors of 2 ensure that rationals in level $k$ have values between $2^k$ and $2^{k+1}$, and the final \u0026#34;$+1$\u0026#34; differentiates left and right children. This function assumes that $p$ and $q$ are relatively prime; that is, that the fraction $p/q$ is in its lowest terms. (The isomorphism in the other direction is given by $k\\mapsto b_k/b_{k+1}$ where $b_k$ are the elements of Stern\u0026#39;s diatomic sequence discussed above.) This is just the sort of mathematics I like: simple, but surprising, and with depth. What\u0026#39;s not to like? * Fitting the SIR model of disease to data in Julia :mathematics:julia: :PROPERTIES: :EXPORT_FILE_NAME: fitting_sir_to_data_in_julia :EXPORT_HUGO_CUSTOM_FRONT_MATTER: :mathjax true :EXPORT_DATE: 2020-01-15 :END: A few posts ago I showed how to do this in Python. Now it\u0026#39;s Julia\u0026#39;s turn. The data is the same: spread of influenza in a British boarding school with a population of 762. This was reported in the British Medical Journal on March 4, 1978, and you can read the original short article [here](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1603269/pdf/brmedj00115-0064.pdf). 
As before we use the [SIR model](https://en.wikipedia.org/wiki/Compartmental_models_in_epidemiology#The_SIR_model_without_vital_dynamics), with equations \\begin{aligned} \\frac{dS}{dt}\u0026amp;=-\\frac{\\beta IS}{N}\\\\ \\frac{dI}{dt}\u0026amp;=\\frac{\\beta IS}{N}-\\gamma I\\\\ \\frac{dR}{dt}\u0026amp;=\\gamma I \\end{aligned} where $S$, $I$, and $R$ are the numbers of /susceptible/, /infected/, and /recovered/ people. This model assumes a constant population - so no births or deaths - and that once recovered, a person is immune. There are more complex models which include a changing population, as well as other disease dynamics. The above equations can be written without the population; since it is constant, we can just write $\\beta$ instead of $\\beta/N$. The values $\\beta$ and $\\gamma$ are the parameters which affect the working of this model, their values and that of their ratio $\\beta/\\gamma$ provide information of the speed of the disease spread. As with Python, our interest will be to see if we can find values of $\\beta$ and $\\gamma$ which model the school outbreak. We will do this in three functions. The first sets up the differential equations: #+BEGIN_SRC Julia using DifferentialEquations function SIR!(du,u,p,t) S,I,R = u β,γ = p du[1] = dS = -β*I*S du[2] = dI = β*I*S - γ*I du[3] = dR = γ*I end #+END_SRC The next function determines the sum of squares between the data and the results of the SIR computations for given values of the parameters. 
Since we will put all our functions into one file, we can create the constant values outside any functions which might need them: #+BEGIN_SRC Julia data = [1, 3, 6, 25, 73, 222, 294, 258, 237, 191, 125, 69, 27, 11, 4] tspan = (0.0,14.0) u0 = [762.0,1.0,0.0] function ss(x) prob = ODEProblem(SIR!,u0,tspan,(x[1],x[2])) sol = solve(prob) sol_data = sol(0:14)[2,:] return(sum((sol_data - data) .^2)) end #+END_SRC Note that we don\u0026#39;t have to carefully set up the problem to produce values at each of the data points, which in our case are the integers from 0 to 14. Julia will use a standard numerical technique with a dynamic step size, and values corresponding to the data points can then be found by interpolation. All of this functionality is provided by the =DifferentialEquations= package. For example, =R[10]= will return the 10th value of the list of computed =R= values, but =R(10)= will produce the interpolated value of =R= at $t=10$. Finally we use the =Optim= package to minimize the sum of squares, and the =Plots= package to plot the result: #+BEGIN_SRC Julia using Optim using Plots function run_optim() opt = optimize(ss,[0.001,0.01],NelderMead()) beta,gamma = opt.minimizer prob = ODEProblem(SIR!,u0,tspan,(beta,gamma)) sol = solve(prob) plot(sol, linewidth=2, xaxis=\u0026#34;Time in days\u0026#34;, label=[\u0026#34;Susceptible\u0026#34; \u0026#34;Infected\u0026#34; \u0026#34;Recovered\u0026#34;]) plot!([0:14],data,linestyle=:dash,marker=:circle,markersize=4,label=\u0026#34;Data\u0026#34;) end #+END_SRC Running the last function will produce values \\begin{aligned} \\beta\u0026amp;=0.0021806887934782853\\\\ \\gamma\u0026amp;=0.4452595474326912 \\end{aligned} and the final plot looks like this: ![Julia SIR plot](/julia_sir_plot.png) This was at least as easy as in Python, and with a few extra bells and whistles, such as interpolation of data points. Nice! 
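As a rough cross-check of the sum-of-squares idea, here is a minimal Python sketch. It does not use =DifferentialEquations=: I have swapped in a hand-written fixed-step fourth-order Runge-Kutta integrator, so the values at whole days come straight from the stepping rather than from interpolation. The names `rk4_step` and `infected_by_day`, and the choice of 100 steps per day, are assumptions of this sketch only.

```python
# Infected counts on days 0..14 and the initial state, as in the Julia code above
data = [1, 3, 6, 25, 73, 222, 294, 258, 237, 191, 125, 69, 27, 11, 4]
u0 = (762.0, 1.0, 0.0)


def sir(u, beta, gamma):
    """Right-hand side of the SIR equations (population absorbed into beta)."""
    s, i, r = u
    return (-beta * i * s, beta * i * s - gamma * i, gamma * i)


def rk4_step(u, h, beta, gamma):
    """One classical Runge-Kutta step of size h."""
    k1 = sir(u, beta, gamma)
    k2 = sir(tuple(x + 0.5 * h * k for x, k in zip(u, k1)), beta, gamma)
    k3 = sir(tuple(x + 0.5 * h * k for x, k in zip(u, k2)), beta, gamma)
    k4 = sir(tuple(x + h * k for x, k in zip(u, k3)), beta, gamma)
    return tuple(x + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(u, k1, k2, k3, k4))


def infected_by_day(beta, gamma, days=14, steps_per_day=100):
    """Model values of I at t = 0, 1, ..., days."""
    u = u0
    out = [u[1]]
    h = 1.0 / steps_per_day
    for _ in range(days):
        for _ in range(steps_per_day):
            u = rk4_step(u, h, beta, gamma)
        out.append(u[1])
    return out


def ss(beta, gamma):
    """Sum of squared differences between the model and the data."""
    return sum((m - d) ** 2 for m, d in zip(infected_by_day(beta, gamma), data))


# The fitted parameters should give a far smaller value than the starting guess
print(ss(0.0021806887934782853, 0.4452595474326912), ss(0.001, 0.01))
```

A fixed step is cruder than the adaptive solver Julia uses, but for this small, smooth system it is enough to compare parameter choices.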
* The Butera-Pernici algorithm (2) :mathematics:computation: :PROPERTIES: :EXPORT_FILE_NAME: the_butera_pernici_algorithm_2 :EXPORT_HUGO_CUSTOM_FRONT_MATTER: :mathjax true :EXPORT_DATE: 2020-01-06 :END: The purpose of /this/ post will be to see if we can implement the algorithm in Julia, and thus leverage Julia\u0026#39;s very fast execution time. We are working with polynomials defined on nilpotent variables, which means that the degree of any generator in a polynomial term will be 0 or 1. Assume that our generators are indexed from zero: $x_0,x_1,\\ldots,x_{n-1}$, then any term in a polynomial will have the form \\[ cx_{i_1}x_{i_2}\\cdots x_{i_k} \\] where $\\{i_1, i_2,\\ldots, i_k\\}\\subseteq\\{0,1,2,\\ldots,n-1\\}$. We can then express this term as an element of a dictionary ={k =\u0026gt; v}= where \\[ k = 2^{i_1}+2^{i_2}+\\cdots+2^{i_k}. \\] So, for example, the polynomial term $7x_2x_3x_5$ would correspond to the dictionary term =44 =\u0026gt; 7= since $44 = 2^2+2^3+2^5$. Two polynomial terms ={k1 =\u0026gt; v1}= and ={k2 =\u0026gt; v2}= with no variables in common can then be multiplied simply by adding the k terms, and multiplying the v values, to obtain ={k1+k2 =\u0026gt; v1*v2}= . And we can check if =k1= and =k2= have a common variable easily by evaluating =k1 \u0026amp; k2=; a non-zero value indicates a common variable. This leads to the following Julia function for multiplying two such dictionaries: #+BEGIN_SRC Julia function poly_dict_mul(p1, p2) p3 = Dict{BigInt,BigInt}() for (k1, v1) in p1 for (k2, v2) in p2 if k1 \u0026amp; k2 \u0026gt; 0 continue else if k1 + k2 in keys(p3) p3[k1+k2] += v1 * v2 else p3[k1+k2] = v1 * v2 end end end end return (p3) end #+END_SRC As you see, this is a simple double loop over the terms in each polynomial dictionary. If two terms have a non-zero conjunction, we simply move on. If two terms when added already exist in the new dictionary, we add to that term. 
If the sum of terms is new, we create a new dictionary element. The use of =BigInt= is to ensure that no matter how big the terms and coefficients become, we don\u0026#39;t suffer from arithmetic overflow. For example, suppose we consider the product \\[ (x_0+x_1+x_2)(x_1+x_2+x_3). \\] A straightforward expansion produces \\[ x_0x_3 + x_1x_3 + x_1^2 + x_0x_1 + x_0x_2 + 2x_1x_2 + x_2^2 + x_2x_3. \\] which by nilpotency becomes \\[ x_0x_3 + x_1x_3 + x_0x_1 + x_0x_2 + 2x_1x_2 + x_2x_3. \\] The dictionaries corresponding to the two polynomials are ={1 =\u0026gt; 1, 2 =\u0026gt; 1, 4 =\u0026gt; 1}= and ={2 =\u0026gt; 1, 4 =\u0026gt; 1, 8 =\u0026gt; 1}= Then: #+BEGIN_SRC Julia julia\u0026gt; poly_dict_mul(Dict(1=\u0026gt;1,2=\u0026gt;1,4=\u0026gt;1),Dict(2=\u0026gt;1,4=\u0026gt;1,8=\u0026gt;1)) Dict{BigInt,BigInt} with 6 entries: 9 =\u0026gt; 1 10 =\u0026gt; 1 3 =\u0026gt; 1 5 =\u0026gt; 1 6 =\u0026gt; 2 12 =\u0026gt; 1 #+END_SRC If we were to rewrite the keys as binary numbers, we would have ={1001 =\u0026gt; 1, 1010 =\u0026gt; 1, 11 =\u0026gt; 1, 101 =\u0026gt; 1, 110 =\u0026gt; 2, 1100 =\u0026gt; 1}= in which you can see that each term corresponds with the term of the product above. Having conquered multiplication, finding the permanent should then require two steps: 1. Turning each row of the matrix into a polynomial dictionary. 2. Starting with $p=1$, multiply all rows together, one at a time. For step 1, suppose we have a row $i$ of a matrix $M=m_{ij}$. Then starting with an empty dictionary =p=, we move along the row, and for each non-zero element $m_{ij}$ we add the term =p[BigInt(1)\u0026lt;\u0026lt;j] = M[i,j]=. For speed we use bit operations instead of arithmetic operations. 
This means we can create a list of all polynomial dictionaries: #+BEGIN_SRC Julia function mat_polys(M) (n,ncols) = size(M) ps = [] for i in 1:n p = Dict{BigInt,BigInt}() for j in 1:n if M[i,j] == 0 continue else p[BigInt(1)\u0026lt;\u0026lt;(j-1)] = M[i,j] end end push!(ps,p) end return(ps) end #+END_SRC Step 2 is a simple loop; the permanent will be given as the value in the final step: #+BEGIN_SRC Julia function poly_perm(M) (n,ncols) = size(M) mp = mat_polys(M) p = Dict{BigInt,BigInt}(0=\u0026gt;1) for i in 1:n p = poly_dict_mul(p,mp[i]) end return(collect(values(p))[1]) end #+END_SRC We don\u0026#39;t in fact need two separate functions here; since the polynomial dictionary for each row is only used once, we could simply create each one as we needed. However, given that none of our matrices will be too large, the saving of time and space would be minimal. Now for a few tests: #+BEGIN_SRC Julia julia\u0026gt; n = 10; M = [BigInt(1)*(mod(j-i,n) in [1,2,3]) for i = 1:n, j = 1:n]; julia\u0026gt; poly_perm(M) 125 julia\u0026gt; n = 20; M = [BigInt(1)*(mod(j-i,n) in [1,2,3]) for i = 1:n, j = 1:n]; julia\u0026gt; @time poly_perm(M) 0.003214 seconds (30.65 k allocations: 690.875 KiB) 15129 julia\u0026gt; n = 40; M = [BigInt(1)*(mod(j-i,n) in [1,2,3]) for i = 1:n, j = 1:n]; julia\u0026gt; @time poly_perm(M) 0.014794 seconds (234.01 k allocations: 5.046 MiB) 228826129 julia\u0026gt; n = 100; M = [BigInt(1)*(mod(j-i,n) in [1,2,3]) for i = 1:n, j = 1:n]; julia\u0026gt; @time poly_perm(M) 0.454841 seconds (3.84 M allocations: 83.730 MiB, 27.98% gc time) 792070839848372253129 julia\u0026gt; lucasnum(n)+2 792070839848372253129 #+END_SRC This is extraordinarily fast, especially compared with our previous attempts: naive attempts using all permutations, and using Ryser\u0026#39;s algorithm. 
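For readers who would like to check these values in a second language, here is a minimal Python sketch of the same bitmask-dictionary idea. It mirrors the Julia functions above but is not a translation of them line for line: the helper `circulant` is my own, keys are combined with bitwise OR (which agrees with addition when the keys share no bits), and the result is looked up by the key with all $n$ bits set rather than taken as the first value in the dictionary.

```python
def poly_dict_mul(p1, p2):
    """Multiply two nilpotent polynomials stored as {bitmask: coefficient}."""
    p3 = {}
    for k1, v1 in p1.items():
        for k2, v2 in p2.items():
            if k1 & k2:
                # a shared variable would be squared, so the term vanishes
                continue
            key = k1 | k2
            p3[key] = p3.get(key, 0) + v1 * v2
    return p3


def poly_perm(M):
    """Permanent of a square matrix given as a list of rows."""
    n = len(M)
    p = {0: 1}  # the constant polynomial 1
    for row in M:
        q = {1 << j: m for j, m in enumerate(row) if m != 0}
        p = poly_dict_mul(p, q)
    # only the term containing every variable survives; absent means zero
    return p.get((1 << n) - 1, 0)


def circulant(n):
    """The n x n circulant 0-1 matrix with ones where (j - i) mod n is 1, 2 or 3."""
    return [[1 if (j - i) % n in (1, 2, 3) else 0 for j in range(n)]
            for i in range(n)]


print(poly_perm(circulant(10)), poly_perm(circulant(20)))  # 125 15129
```

Python integers are arbitrary precision, so nothing like =BigInt= is needed. Looking up the full mask also means a matrix with zero permanent returns 0 instead of failing on an empty dictionary.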
**** A few comparisons Over the previous blog posts, we have explored various different methods of computing the permanent: - =permanent=, which is the most naive method, using the formal definition, and summing over all the permutations $S_n$. - =perm1=, Ryser\u0026#39;s algorithm, using the Combinatorics package and iterating over all non-empty subsets of $\\{1,2,\\ldots,n\\}$. - =perm2=, Same as =perm1= but instead of using subsets, we use all non-zero binary vectors of length n. - =perm3=, Ryser\u0026#39;s algorithm using Gray codes to speed the transition between subsets, and using a lookup table. All these are completely general, and aside from the first function, which is the most inefficient, can be used for any matrix up to size about $25\\times 25$. So consider the $n\\times n$ circulant matrix with three ones in each row, whose permanent is $L(n)+2$. The following table shows times in seconds (except where minutes is used) for each calculation: #+ATTR_CSS: :width 75% :text-align left | | 10 | 12 | 15 | 20 | 30 | 40 | 60 | 100 | |-------------+--------+-------+-------+-------+-------+------+------+------| | =permanent= | 9.3 | - | - | - | - | - | - | - | | =perm1= | 0.014 | 0.18 | 0.72 | 47 | - | - | - | - | | =perm2= | 0.03 | 0.105 | 2.63 | 166 | - | - | - | - | | =perm3= | 0.004 | 0.016 | 0.15 | 12.4 | - | - | - | - | | =poly_perm= | 0.0008 | 0.004 | 0.001 | 0.009 | 0.008 | 0.02 | 0.05 | 0.18 | Assuming that the time taken for =permanent= is roughly proportional to $n!n$, then we would expect that the time for matrices of sizes 23 and 24 would be about $1.5\\times 10^{17}$ and $3.8\\times 10^{18}$ seconds respectively. Note that the age of the universe is approximately $4.32\\times 10^{17}$ seconds, so my laptop would need to run for about the third of the universe\u0026#39;s age to compute the permanent of a $23\\times 23$ matrix. That\u0026#39;s about the time since the solar system and the Earth were formed. 
Note also that =poly_perm= will slow down if the number of non-zero values in each row increases. For example, with four consecutive ones in each row, it takes over 10 seconds for a $100\\times 100$ matrix. With five ones in each row, it takes about 2.7 and 21.6 seconds respectively for matrices of size 40 and 60. Extrapolating indicates that it would take about 250 seconds for the $100\\times 100$ matrix. In general, an $n\\times n$ matrix with $k$ non-zero elements in each row will have a time complexity approximately of order $n^k$. However, including the extra optimization (which we haven\u0026#39;t done) that allows for elements to be set to one before the multiplication, produces an algorithm whose complexity is $O(2^{\\text{min}(2w,n)}(w+1)n^2)$ where $n$ is the size of the matrix, and $w$ its band-width. See the [original paper](https://arxiv.org/abs/1406.5337) for details. * The Butera-Pernici algorithm (1) :mathematics:computation: :PROPERTIES: :EXPORT_FILE_NAME: the_butera_pernici_algorithm_1 :EXPORT_HUGO_CUSTOM_FRONT_MATTER: :mathjax true :EXPORT_DATE: 2020-01-04 :END: *** Introduction We know that there is no known general sub-exponential algorithm for computing the permanent of a square matrix. But we may very reasonably ask -- might there be a faster, possibly even polynomial-time algorithm, for some specific classes of matrices? For example, a sparse matrix will have most terms of the permanent zero -- can this be somehow leveraged for a better algorithm? The answer seems to be a qualified \u0026#34;yes\u0026#34;. In particular, if a matrix is banded, so that most diagonals are zero, then a very fast algorithm can be applied. 
This algorithm is described in an online article by [Paolo Butera](https://arxiv.org/search/hep-lat?searchtype=author\u0026amp;query=Butera%2C+P) and [Mario Pernici](https://arxiv.org/search/hep-lat?query=pernici%2C+m\u0026amp;searchtype=author) called [Sums of permanental minors using Grassmann algebra](https://arxiv.org/abs/1406.5337). Accompanying software (Python programs) is available at [github](https://github.com/pernici/hobj). This software has been rewritten for the [SageMath](http://www.sagemath.org) system, and you can read about it in the [documentation](https://doc.sagemath.org/html/en/reference/matrices/sage/matrix/matrix_misc.html#sage.matrix.matrix_misc.permanental_minor_polynomial). The algorithm as described by Butera and Pernici, and as implemented in Sage, actually produces a generating function. Our intention here is to investigate a simpler version, which computes the permanent only. *** Basic outline Let $M$ be an $n\\times n$ square matrix, and consider the polynomial ring on $n$ variables $x_1,x_2,\\ldots,x_n$. Each row of the matrix will correspond to an element of this ring; in particular row $i$ will correspond to \\[ \\sum_{j=1}^nm_{ij}x_j=m_{i1}x_1+m_{i2}x_2+\\cdots+m_{in}x_n. \\] Suppose further that all the generating elements $x_i$ are nilpotent of order two, so that $x_i^2=0$. Now if we take all the row polynomials and multiply them, each term of the product will have degree $n$. But by nilpotency, all terms which contain a repeated element will vanish. The result will be only those terms which contain each generator exactly once, of which there will be $n!$. To obtain the permanent all that is required is to set $x_i=1$ for each generator. Here\u0026#39;s an example in Sage. 
#+BEGIN_SRC Python sage: R.\u0026lt;a,b,c,d,e,f,g,h,i,x1,x2,x3\u0026gt; = PolynomialRing(QQbar) sage: M = matrix([[a,b,c],[d,e,f],[g,h,i]]) sage: X = matrix([[x1],[x2],[x3]]) sage: MX = M*X [a*x1 + b*x2 + c*x3] [d*x1 + e*x2 + f*x3] [g*x1 + h*x2 + i*x3] #+END_SRC To implement nilpotency, it\u0026#39;s easiest to reduce modulo the ideal defined by $x_i^2=0$ for all $i$. So we take the product of those row elements, and reduce: #+BEGIN_SRC Sage sage: I = R.ideal([x1^2, x2^2, x3^2]) sage: pr = MX[0,0]*MX[1,0]*MX[2,0] sage: pr.reduce(I) c*e*g*x1*x2*x3 + b*f*g*x1*x2*x3 + c*d*h*x1*x2*x3 + a*f*h*x1*x2*x3 + b*d*i*x1*x2*x3 + a*e*i*x1*x2*x3 #+END_SRC Finally, set each generator equal to 1: #+BEGIN_SRC Sage sage: pr.reduce(I).subs({x1:1, x2:1, x3:1}) c*e*g + b*f*g + c*d*h + a*f*h + b*d*i + a*e*i #+END_SRC and this is indeed the permanent for a general $3\\times 3$ matrix. *** Some experiments Let\u0026#39;s experiment now with the matrices we\u0026#39;ve seen in a [previous post](https://numbersandshapes.net/post/permanents_and_rysers_algorithm/), which contain three consecutive super-diagonals of ones, and the rest zero. 
Such a matrix is easy to set up in Sage: #+BEGIN_SRC Sage sage: n = 10 sage: v = n*[0]; v[1:4] = [1,1,1] sage: M = matrix.circulant(v) sage: M [0 1 1 1 0 0 0 0 0 0] [0 0 1 1 1 0 0 0 0 0] [0 0 0 1 1 1 0 0 0 0] [0 0 0 0 1 1 1 0 0 0] [0 0 0 0 0 1 1 1 0 0] [0 0 0 0 0 0 1 1 1 0] [0 0 0 0 0 0 0 1 1 1] [1 0 0 0 0 0 0 0 1 1] [1 1 0 0 0 0 0 0 0 1] [1 1 1 0 0 0 0 0 0 0] #+END_SRC Similarly we can define the polynomial ring: #+BEGIN_SRC Sage sage: R = PolynomialRing(QQbar,x,n) sage: R.inject_variables() Defining x0, x1, x2, x3, x4, x5, x6, x7, x8, x9 #+END_SRC And now the polynomials corresponding to the rows: #+BEGIN_SRC Sage sage: MX = M*matrix(R.gens()).transpose() sage: MX [x1 + x2 + x3] [x2 + x3 + x4] [x3 + x4 + x5] [x4 + x5 + x6] [x5 + x6 + x7] [x6 + x7 + x8] [x7 + x8 + x9] [x0 + x8 + x9] [x0 + x1 + x9] [x0 + x1 + x2] #+END_SRC If we multiply them, we will end up with a huge expression, far too long to display: #+BEGIN_SRC Sage sage: pr = prod(MX[i,0] for i in range(n)) sage: len(pr.monomials()) 14103 #+END_SRC We /could/ reduce this by the ideal, but that would be slow. Far better to reduce after each separate multiplication: #+BEGIN_SRC Sage sage: I = R.ideal([v^2 for v in R.gens()]) sage: p = R.one() sage: for i in range(n): p = p*MX[i,0] p = p.reduce(I) sage: p.subs({v:1 for v in R.gens()}) 125 #+END_SRC The answer is almost instantaneous. We can repeat the above list of commands starting with different values of =n=; for example with n=20 the result is 15129, as we expect. This is not yet optimal; for n=20 on my machine the final loop takes about 7.8 seconds. Butera and Pernici show that the multiplication and setting the variables to one can sometimes be done in the opposite order; that is, some variables can be identified to be set to one /before/ the multiplication. This can speed the entire loop dramatically, and this optimization has been included in the Sage implementation. For details, see their paper. 
* The size of the universe :science:astronomy: :PROPERTIES: :EXPORT_FILE_NAME: the_size_of_the_universe :EXPORT_HUGO_CUSTOM_FRONT_MATTER: :mathjax true :EXPORT_DATE: 2020-01-02 :END: As a first blog post for 2020, I\u0026#39;m dusting off one from my previous blog, which I\u0026#39;ve edited only slightly. --- I\u0026#39;ve been looking up at the sky at night recently, and thinking about the sizes of things. Now it\u0026#39;s all very well to say something is for example a million kilometres away; that\u0026#39;s just a number, and as far as the real numbers go, a pretty small one (all finite numbers are \u0026#34;small\u0026#34;). The difficulty comes in trying to marry very large distances and times with our own human scale. I suppose if you\u0026#39;re a cosmologist or astrophysicist this is trivial, but for the rest of us it\u0026#39;s pretty daunting. It\u0026#39;s all a problem of scale. You can say the sun has an average distance of 149.6 million kilometres from earth (roughly 93 million miles), but how big, really, is that? I don\u0026#39;t have any sense of how big such a distance is: my own sense of scale goes down to about 1mm in one direction, and up to about 1000km in the other. This is hopelessly inadequate for cosmological measurements. So let\u0026#39;s start with some numbers: - Diameter of Earth: 12,742 km - Diameter of the moon: 3,475km - Diameter of Sun: 1,391,684 km - Diameter of Jupiter: 139,822 km - Average distance of Earth to the sun: 149,597,870km - Average distance of Jupiter to the sun: 778.5 million km - Average distance of Earth to the moon: 384,400 km Of course since all orbits are elliptical, distances will both exceed and be less than the average at different times. However, for our purposes of scale, an average is quite sufficient. By doing a bit of division, we find that the moon is about 0.27 the width of the earth, Jupiter is about 11 times bigger (in linear measurements) and the Sun about 109.2 times bigger than the Earth. 
Now for some scaling. We will scale the earth down to the size of a mustard seed, which is about 1mm in diameter. On this scale, the Sun is about the size of a large grapefruit (which happily is large, round, and yellow), and the moon is about the size of a dust mite. On this new scale, with 12,742 km equal to 1 millimetre, the above distances become: - Diameter of Earth: 1mm - Diameter of the moon: 0.27mm - Diameter of Sun: 109.2mm - Diameter of Jupiter: 10.97mm - Average distance of Earth to the sun: 11740mm = 11.74m - Average distance of Jupiter to the sun: 61097.2mm = 61.1m - Average distance of Earth to the moon: 30.2mm = 3cm So how long is the distance from the sun to the Earth? Well, a cricket pitch is 22 yards long, so 11 yards from centre to end, which is about 10.1 metres. So imagine our grapefruit placed at the centre of a cricket pitch. Go to an end of the pitch, and about 1.5 metres (about 5 feet) beyond. Place the mustard seed there. What you now have is a scale model of the sun and earth. Here\u0026#39;s a cricket pitch to give you an idea of its size: #+ATTR_HTML: :width 600 [[file:/Pollock_to_Hussey2.jpg]] Note that in this picture, the yellow circle is not drawn to size. If you look just left of centre, you\u0026#39;ll see the cricket ball, which has a diameter of about 73mm. Our \u0026#34;Sun\u0026#34; grapefruit should be about half again as wide as that. If you don\u0026#39;t have a sense of a cricket pitch (even with this picture), consider instead a tennis court: the distance from the net to the baseline is 39 feet, or 11.9m. At our scale, this is nearly exact: #+ATTR_HTML: :width 600 [[file:/tennis_court_annotated.png]] (Note that on this scale the sun is somewhat bigger than a tennis ball, and the Earth would in fact be too small to see on this picture.) So we now have a scale model of the Sun and Earth. 
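The scaled figures so far are nothing more than division by the Earth\u0026#39;s diameter. A few lines of Python reproduce them; the function name `scaled_mm` is just a label I have chosen for this sketch, and the inputs are the kilometre values listed earlier.

```python
EARTH_DIAMETER_KM = 12742.0  # this many kilometres shrink to 1 mm


def scaled_mm(km):
    """Length in millimetres on the Earth = mustard seed scale."""
    return km / EARTH_DIAMETER_KM


sun_diameter = scaled_mm(1391684)    # about 109.2 mm: the grapefruit
earth_to_sun = scaled_mm(149597870)  # about 11740 mm, or 11.74 m
jupiter_to_sun = scaled_mm(778.5e6)  # about 61097 mm, or 61.1 m
earth_to_moon = scaled_mm(384400)    # about 30.2 mm, or 3 cm

print(sun_diameter, earth_to_sun, jupiter_to_sun, earth_to_moon)
```

Any other distance in kilometres can be fed through the same function.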
If we wanted to include the Moon, start with its average distance from Earth (384,400 km), then we\u0026#39;d have a dust mite circling our mustard seed at a distance of 3cm. How about Jupiter? Well, we noted before that it is about 61m away. Continuing with our cricket pitch analogy, imagine three pitches laid end to end, which is 66 yards, or 60.35 metres. Not too far off, really! So place the grapefruit at the end of the first pitch, the mustard seed a little away from centre, and at the end of the third pitch place an 11mm ball for Jupiter: a glass marble will do nicely for this. And the size of the solar system? Assuming the edge is given by the [[http://en.wikipedia.org/wiki/Heliosphere#Heliopause][heliopause]] (where the Sun\u0026#39;s solar wind is slowed down by interstellar particles); this is at a distance of about 18,100,000,000 km from the Sun, which in our scale is about 1.42 km, or a bit less than a mile (0.88 miles). Get that? With Earth the size of a mustard seed, the edge of the solar system is nearly a mile away! *** Onwards and outwards So with this scaling we have got the solar system down to a reasonably manageable size. If 149,600,000 km seems too vast a distance to make much sense of, scaling it down to 11.7 metres is a lot easier. But let\u0026#39;s get cosmological here, and start with a light year, which is 9,460,730,472,580,800 m, or more simply (and inexactly) 9.46\\times 10^{15}m. In our scale, that becomes 742,483,948.562 mm, or about 742 km, which is about 461 miles. That\u0026#39;s about the distance from New York city to Greensboro, NC, or from Melbourne to Sydney. The nearest star is [[http://en.wikipedia.org/wiki/Proxima_Centauri][Proxima Centauri]], which is 4.3 light years away: at our Earth=mustard seed scale, that\u0026#39;s about 3192.6 km, or 1983.8 miles. This is the flight distance from Los Angeles to Detroit. 
Look at that distance on an atlas, imagine our planet home mustard seed at one place and consider getting to the other. The furthest any human has been from the mustard seed Earth is to the dust-mite Moon: 3cm, or 1.2 inches away. To get to the nearest star is, well, a /lot/ further! The [[http://en.wikipedia.org/wiki/Canis_Major_Overdensity][nearest galaxy]] to the Milky Way is about 0.025 mly away. (\u0026#34;mly\u0026#34; = \u0026#34;millions of light years\u0026#34;). Now we\u0026#39;re getting into the big stuff. At our mustard seed scale, the nearest galaxy is about 18.5 /million/ kilometres away. And there are lots of other galaxies, and much further away than this. For example, the [[http://en.wikipedia.org/wiki/Andromeda_Galaxy][Andromeda Galaxy]] is 2,538,000 light years away, which at our scale is 1,884,465,000 km -- nearly /two billion/ kilometres! What\u0026#39;s remarkable is that even scaling the Earth down to a tiny mustard seed speck, we are still up against distances too vast for human scale. We could try scaling the Earth down to a ball whose diameter is the thickness of the finest human hair -- about 0.01 mm -- which is the smallest distance within reach of our own scale. But even at this scale distances are only reduced by a factor of 100, so the Andromeda Galaxy is still 18,844,650 km away. One last try: suppose we scale the entire Solar System, out to the heliopause, down to a mustard seed. This means that the diameter of the heliopause: 36,200,000,000 km, is scaled down to 1mm. Note that the heliopause is about three times further away from the sun than the mean distance of Pluto. At this scale, one light year is a happily manageable 261mm, or about ten and a quarter inches. So the nearest star is 1.12m away, or about 44 inches. And the nearest galaxy? Well, it\u0026#39;s 25000 light years away, which puts it at about 6.5 km. 
The Andromeda Galaxy is somewhat over 663 km away. The furthest galaxy, with the enticing name of [GN-z11](https://en.wikipedia.org/wiki/GN-z11) is said to be about 34 billion light years away. On our heliopause=mustard seed scale, that\u0026#39;s about /9.1 million kilometres/. There\u0026#39;s no escaping it, the Universe is big, and the scales needed to describe it, no matter how you approach them, quickly leap out of our own human scale. * Permanents and Ryser\u0026#39;s algorithm :mathematics:computation:julia: :PROPERTIES: :EXPORT_FILE_NAME: permanents_and_rysers_algorithm :EXPORT_HUGO_CUSTOM_FRONT_MATTER: :mathjax true :EXPORT_DATE: 2019-12-22 :END: As I discussed in my last blog post, the /permanent/ of an $n\\times n$ matrix $M=m_{ij}$ is defined as \\[ \\text{per}(M)=\\sum_{\\sigma\\in S_n}\\prod_{i=1}^nm_{i,\\sigma(i)} \\] where the sum is taken over all permutations of the $n$ numbers $1,2,\\ldots,n$. It differs from the better known determinant in having no sign changes. For example: $$\\text{per} \\begin{bmatrix} a\u0026amp;b\u0026amp;c\\\\ d\u0026amp;e\u0026amp;f\\\\ g\u0026amp;h\u0026amp;i \\end{bmatrix} =aei+afh+bfg+bdi+cdh+ceg.$$\nBy comparison, here is the determinant:\n$$\\text{det} \\begin{bmatrix} a\u0026amp;b\u0026amp;c\\\\ d\u0026amp;e\u0026amp;f\\\\ g\u0026amp;h\u0026amp;i \\end{bmatrix} =aei - afh + bfg - bdi + cdh - ceg.$$\nThe apparent simplicity of the permanent definition hides the fact that there is no known sub-exponential algorithm to compute it, nor does it satisfy most of the nice properties of determinants. For example, we have \\[ \\text{det}(AB)=\\text{det}(A)\\text{det}(B) \\]\nbut in general $\\text{per}(AB)\\ne\\text{per}(A)\\text{per}(B)$. 
Nor is the permanent zero if two rows are equal, or if any subset of rows is linearly dependent.\nApplying the definition and summing over all the permutations is prohibitively slow; of $O(n!n)$ complexity, and unusable except for very small matrices.\nIn the small but excellent textbook [\u0026#34;Combinatorial Mathematics\u0026#34;](https://bookstore.ams.org/car-14/) by Herbert J. Ryser and published in 1963, one chapter is devoted to the inclusion-exclusion principle, of which the computation of permanents is given as an example. The permanent may be considered as a sum of products, where in each product we choose one value from each row and one value from each column.\nSuppose we start by adding the rows together and multiplying them: \\[ P = (a+d+g)(b+e+h)(c+f+i). \\] This will certainly contain all elements of the permanent, but it also includes products we don\u0026#39;t want, such as for example $aef$ where the elements are chosen from only two rows, and $ghi$ where the elements are all in one row.\nTo eliminate all possible products from only two rows we subtract them:\n$$\\begin{aligned} P \u0026amp; -(a+d)(b+e)(c+f)\\text{ rows $1$ and $2$}\\\\ \u0026amp; - (a+g)(b+h)(c+i)\\text{ rows $1$ and $3$}\\\\ \u0026amp; - (d+g)(e+h)(f+i)\\text{ rows $2$ and $3$} \\end{aligned}$$\nThe trouble with that subtraction is that we are subtracting products of each individual row twice; for example $abc$ is in the first and second products. But we only want to subtract those products once from $P$. So we have to add them again:\n$$\\begin{aligned} P \u0026amp; -(a+d)(b+e)(c+f)\\qquad\\text{ rows $1$ and $2$}\\\\ \u0026amp; - (a+g)(b+h)(c+i)\\qquad\\text{ rows $1$ and $3$}\\\\ \u0026amp; - (d+g)(e+h)(f+i)\\qquad\\text{ rows $2$ and $3$}\\\\ \u0026amp; + abc\\\\ \u0026amp; + def\\\\ \u0026amp; + ghi. \\end{aligned}$$\nComputing the permanent of a $4\\times 4$ matrix would start by adding all the rows and multiplying the sums. 
Then we would subtract all products of rows taken three at a time. But this would subtract all products of rows taken two at a time twice for each pair, so we add those products back in again. Finally we find that the number of times we\u0026#39;ve subtracted products of a single row have cancelled out, so we need to subtract them again:\n$$\\begin{array}{ccccc} \u0026amp;1\u0026amp;2\u0026amp;3\u0026amp;4\\\\ -\u0026amp;1\u0026amp;2\u0026amp;3\u0026amp;\\\\ -\u0026amp;1\u0026amp;2\u0026amp;\u0026amp;4\\\\ -\u0026amp;1\u0026amp;\u0026amp;3\u0026amp;4\\\\ -\u0026amp;\u0026amp;2\u0026amp;3\u0026amp;4\\\\ +\u0026amp;1\u0026amp;2\u0026amp;\u0026amp;\\\\ +\u0026amp;1\u0026amp;\u0026amp;3\u0026amp;\\\\ +\u0026amp;1\u0026amp;\u0026amp;\u0026amp;4\\\\ +\u0026amp;\u0026amp;2\u0026amp;3\u0026amp;\\\\ +\u0026amp;\u0026amp;2\u0026amp;\u0026amp;4\\\\ +\u0026amp;\u0026amp;\u0026amp;3\u0026amp;4\\\\ -\u0026amp;1\\\\ -\u0026amp;\u0026amp;2\\\\ -\u0026amp;\u0026amp;\u0026amp;3\\\\ -\u0026amp;\u0026amp;\u0026amp;\u0026amp;4 \\end{array}$$\nAfter all of this we are left with only those products which include all four rows and columns. And as you see, this is a standard inclusion-exclusion approach.\nFor an $n\\times n$ matrix, let $S=\\{1,2,\\ldots,n\\}$, and let $X\\subseteq S$. Then define $R(X)$ to be the product of the sums of elements of rows indexed by $X$. For example, with $S=\\{1,2,3\\}$ and $X=\\{1,3\\}$, then\n\\[ R(X)=(a+g)(b+h)(c+i). 
\\] We can thus write the above method for obtaining the permanent as:\n$$\\begin{aligned} \\text{per}(M)\u0026amp;=\\sum_{\\emptyset\\ne X\\subseteq S}(-1)^{n-|X|}R(X)\\\\ \u0026amp;= (-1)^n\\sum_{\\emptyset\\ne X\\subseteq S}(-1)^{|X|}R(X) \\end{aligned}$$\nThis is Ryser\u0026#39;s algorithm.\nNaive Implementation A naive implementation would be to simply iterate through all the non-empty subsets $X$ of $S=\\{1,2,\\ldots,n\\}$, and for each subset add those rows, and multiply the resulting sums.\nHere\u0026#39;s one such Julia function:\nusing Combinatorics function perm1(M) n, nc = size(M) S = 1:n P = 0 for X in setdiff(powerset(S),powerset([])) P += (-1)^length(X)*prod(sum(M[i,:] for i in X)) end return((-1)^n * P) end Alternatively, we can manage without any extra packages, and use the fact that the subsets of $S$ correspond to the binary digits of integers between 1 and $2^n-1$:\nfunction perm2(M) n,nc = size(M) P = 0 for i in (1:2^n-1) indx = digits(i,base=2,pad=n) P += (-1)^sum(indx)*prod(sum(M .* indx,dims=1)) end return((-1)^n * P) end Now for some tests. There are very few matrices for which the permanent has a known value; however there are some circulant matrices of zeros and ones whose permanent is known. One such is the $n\\times n$ matrix $M=m_{ij}$ whose first, second, and third circulant superdiagonals are ones; that is, for which\n\\[ m_{ij}=1 \\Leftrightarrow\\bmod(j-i,n)\\in\\{1,2,3\\}. \\]\n(Note that since the permanent is trivially unchanged by any permutation of the rows, $M$ can be also defined as being a circulant matrix each row of which has three consecutive ones.)\nThen \\[ \\text{per}(M)=F_{n-1}+F_{n+1}+2 \\]\nwhere $F_k$ is the $k$-th Fibonacci number indexed from 1, so that $F_1=F_2=1$. 
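The inclusion-exclusion sum can also be written out in a few lines of Python. This is only a minimal sketch of the subset form of Ryser\u0026#39;s formula, with subsets encoded as bitmasks; it is not a port of the Julia functions, and the names `ryser_permanent` and `circulant` are my own.

```python
def ryser_permanent(M):
    """Permanent via Ryser's formula: sum over non-empty row subsets X of
    (-1)^(n - |X|) * R(X), where R(X) multiplies the column sums over rows in X."""
    n = len(M)
    total = 0
    for mask in range(1, 1 << n):  # each non-empty subset of rows
        col_sums = [0] * n
        size = 0
        for i in range(n):
            if (mask >> i) & 1:  # row i belongs to the subset
                size += 1
                for j in range(n):
                    col_sums[j] += M[i][j]
        product = 1
        for s in col_sums:
            product *= s
        total += (-1) ** (n - size) * product
    return total


def circulant(n):
    """0-1 matrix with ones where (j - i) mod n is 1, 2 or 3."""
    return [[1 if (j - i) % n in (1, 2, 3) else 0 for j in range(n)]
            for i in range(n)]


print(ryser_permanent(circulant(10)))  # 125, which is F_9 + F_11 + 2 = 34 + 89 + 2
```

The loop visits all $2^n-1$ subsets, so in plain Python it is only practical for small $n$.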
Note that the sums of Fibonacci numbers whose indices differ by two form the Lucas numbers $L_n$.\nAlternatively,\n\\[ \\text{per}(M)=\\text{trace}(C_2^n)+2 \\]\nwhere\n$$C_2=\\begin{bmatrix} 0\u0026amp;1\\\\ 1\u0026amp;1 \\end{bmatrix}$$\nThis result, and some others, can be found in the article \u0026#34;Permanents\u0026#34; by Marvin Marcus and Henryk Minc, in The American Mathematical Monthly, Vol. 72, No. 6 (Jun. - Jul., 1965), pp. 577-591.\njulia\u0026gt; n = 10; julia\u0026gt; M = [1*(mod(j-i,n) in [1,2,3]) for i=1:n, j=1:n] 10×10 Array{Int64,2}: 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 julia\u0026gt; perm1(M) 125 julia\u0026gt; perm2(M) 125 julia\u0026gt; fibonaccinum(n-1)+fibonaccinum(n+1)+2 125 julia\u0026gt; lucasnum(n)+2 125 julia\u0026gt; C2=[0 1; 1 1]; julia\u0026gt; tr(C2^n)+2 125 However, it seems that using the machinery of subsets adds to the time, which becomes noticeable if $n$ is large:\njulia\u0026gt; using BenchmarkTools julia\u0026gt; n = 20; M = [1*(mod(j-i,n) in [1,2,3]) for i=1:n, j=1:n]; julia\u0026gt; @time perm1(M) 6.703187 seconds (31.46 M allocations: 5.097 GiB, 17.93% gc time) 15129 julia\u0026gt; @time perm2(M) 1.677242 seconds (3.21 M allocations: 3.721 GiB, 8.69% gc time) 15129 That is, the perm2 implementation is about four times faster than perm1.\nImplementation with Gray codes Here is where we can show some cleverness. Recall that a Gray code is a listing of all numbers 0 through $2^n-1$ whose binary expansion changes in only one bit between consecutive values. It is also known that for $1\\le k\\le 2^n-1$ the Gray code corresponding to $k$ is given by $k\\oplus (k\\gg 1)$. 
For example, for $n=4$:\njulia\u0026gt; for i in 1:15 j = xor(i,i\u0026gt;\u0026gt;1) println(lpad(i,2),\u0026#34;:\\t\\b\\b\\b\u0026#34;,bin(i,4),\u0026#39;\\t\u0026#39;,lpad(j,2),\u0026#34;:\\t\\b\\b\\b\u0026#34;,bin(j,4)) end 1: [1, 0, 0, 0] 1: [1, 0, 0, 0] 2: [0, 1, 0, 0] 3: [1, 1, 0, 0] 3: [1, 1, 0, 0] 2: [0, 1, 0, 0] 4: [0, 0, 1, 0] 6: [0, 1, 1, 0] 5: [1, 0, 1, 0] 7: [1, 1, 1, 0] 6: [0, 1, 1, 0] 5: [1, 0, 1, 0] 7: [1, 1, 1, 0] 4: [0, 0, 1, 0] 8: [0, 0, 0, 1] 12: [0, 0, 1, 1] 9: [1, 0, 0, 1] 13: [1, 0, 1, 1] 10: [0, 1, 0, 1] 15: [1, 1, 1, 1] 11: [1, 1, 0, 1] 14: [0, 1, 1, 1] 12: [0, 0, 1, 1] 10: [0, 1, 0, 1] 13: [1, 0, 1, 1] 11: [1, 1, 0, 1] 14: [0, 1, 1, 1] 9: [1, 0, 0, 1] 15: [1, 1, 1, 1] 8: [0, 0, 0, 1] Note that in the rightmost column of binary expansions, only one bit changes between consecutive values: either a single 1 is added, or removed.\nFor Ryser\u0026#39;s algorithm, we can consider this in terms of our sum of rows: if the subsets are given in Gray code order, then moving from one subset to the next is a matter of just adding or subtracting one row from the current sum. We are not so much interested in the codes themselves, as in their successive differences. For example here are the differences for $n=4$:\njulia\u0026gt; [xor(k,k\u0026gt;\u0026gt;1)-xor(k-1,(k-1)\u0026gt;\u0026gt;1) for k in 1:15]\u0026#39; 1×15 Adjoint{Int64,Array{Int64,1}}: 1 2 -1 4 1 -2 -1 8 1 2 -1 -4 1 -2 -1 This sequence is [A055975](https://oeis.org/A055975) in the [Online Encyclopaedia of Integer Sequences](https://oeis.org), and can be computed as\n\\[ a[n] = \\begin{cases} n,\u0026amp;\\text{if $n\\le 2$}\\\\ 2a[n/2],\u0026amp;\\text{if $n$ is even}\\\\ (-1)^{(n-1)/2},\u0026amp;\\text{if $n$ is odd} \\end{cases} \\]\nWe can interpret each term in this sequence as the row that should be added or subtracted from the current sum. A value of $2^m$ means adding row $m+1$; a value of $-2^m$ means subtracting row $m+1$. 
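The bookkeeping just described can be sketched in a few lines of Python. This is only an illustration: the helper names gray, gray_differences and decode are assumptions of this sketch, and decode uses int.bit_length to recover the row number from a difference.

```python
def gray(k):
    # Gray code of k: k XOR (k shifted right by one bit)
    return k ^ (k >> 1)


def gray_differences(n):
    # Successive differences of the Gray codes of 0, 1, ..., 2^n - 1
    return [gray(k) - gray(k - 1) for k in range(1, 2**n)]


def decode(d):
    # A difference of +2^m means add row m+1; -2^m means subtract row m+1.
    row = abs(d).bit_length()        # 2^m has exactly m+1 binary digits
    sign = 1 if d > 0 else -1
    return row, sign


for d in gray_differences(3):
    row, sign = decode(d)
    print(d, 'add row' if sign > 0 else 'subtract row', row)
```

Because every difference is plus or minus a power of two, the row and the sign are all that is needed to update a running sum of rows.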
From any difference, we can obtain the bit position by taking the logarithm to base 2, and whether to add or subtract by its sign. But in fact the number of differences is very small: only $2n$, so it will be much easier to create a lookup table, using a dictionary:\nv = vcat([(2^i,i+1,1) for i in 0:n-1],[(-2^i,i+1,-1) for i in 0:n-1]) lut = Dict(x[1] =\u0026gt; (x[2],x[3]) for x in v) For example, for $n=3$ the lookup dictionary is\n4 =\u0026gt; (3, 1) -4 =\u0026gt; (3, -1) 2 =\u0026gt; (2, 1) -2 =\u0026gt; (2, -1) -1 =\u0026gt; (1, -1) 1 =\u0026gt; (1, 1) Now we can implement the algorithm by first pre-computing the sequences of differences, and for each element of the sequence, use the lookup dictionary to determine what row is to be added or subtracted. We start with the first row.\nPutting all this together gives another implementation of the permanent:\nfunction perm3(M) # Gray code version with lookup tables n,nc = size(M) if n != nc error(\u0026#34;Matrix must be square\u0026#34;) end gd = zeros(Int64,1,2^n-1) gd[1] = 1 v = vcat([(2^i,i+1,1) for i in 0:n-1],[(-2^i,i+1,-1) for i in 0:n-1]) lut = Dict(x[1] =\u0026gt; (x[2],x[3]) for x in v) r = M[1,:] # r will contain the sum of rows s = -1 pm = s*prod(r) for i in (2:2^n-1) if iseven(i) gd[i] = 2*gd[div(i,2)] else gd[i] = (-1)^((i-1)/2) end r += M[lut[gd[i]][1],:]*lut[gd[i]][2] s *= -1 pm += s*prod(r) end return(pm * (-1)^n) end This can be timed as before:\njulia\u0026gt; @time perm3(M) 0.943328 seconds (3.15 M allocations: 728.004 MiB, 15.91% gc time) 15129 This is our best time yet.\nSpeeds of Julia and Python\u0026#xa0;\u0026#xa0;\u0026#xa0;programming\u0026#xa0;python\u0026#xa0;julia Introduction [Python](https://en.wikipedia.org/wiki/Python_(programming_language)) is of course one of the world\u0026#39;s currently most popular languages, and there are plenty of [statistics](https://www.infoworld.com/article/3401536/python-popularity-reaches-an-all-time-high.html) to show it. 
Of all languages in current use, Python is one of the oldest (in the very quick time-scale of programming languages) dating from 1990 - only C and its variants are older. However, it seems to keep its eternal youth by being re-invented, and by its constantly increasing libraries. Indeed, one of Python\u0026#39;s greatest strengths is its libraries, and pretty much every Python user will have worked with [numpy](https://numpy.org), [scipy](https://www.scipy.org/scipylib/index.html), [matplotlib](https://matplotlib.org), [pandas](https://pandas.pydata.org), to name but four. In fact, aside from some specialized applications (mainly involving security, speed, or memory) Python can be happily used for almost everything.\n[Julia](https://en.wikipedia.org/wiki/Julia_(programming_language)) on the other hand is newer, dating from 2012. (Only Swift is newer.) It was designed to have the speed of C, the power of Matlab, and the ease of use of Python. Note the comparison with Matlab - Julia was designed as a language for technical computing, although it is very much a general purpose language. It can even be used for [low-level systems programming](https://web.archive.org/web/20181105083419/http://juliacon.org/2018/talks_workshops/42/).\nLike Python, Julia can be extended through packages, of which there are many: according to Julia\u0026#39;s [package repository](https://pkg.julialang.org/docs/) there are 2554 at the time of writing. Some of the packages are big, mature, and robust, others are smaller or represent a niche interest. You can go to [Julia Observer](https://juliaobserver.com) to get a sense of which packages are the most popular, largest, have the most commits on github, and so on. Because Julia is still relatively new, packages are still being actively developed. 
However, some such as [Plots](https://github.com/JuliaPlots/Plots.jl), [JuMP](https://github.com/JuliaOpt/JuMP.jl) for optimization, [Differential Equations](https://github.com/JuliaDiffEq/DifferentialEquations.jl), to name but three, are very much ready for the Big Time. The purpose of this post is to do a single comparison of Julia and Python for speed.\nMatrix permanents Given a square matrix, its [determinant](https://en.wikipedia.org/wiki/Determinant) is a well-known and useful construct (in spite of [Sheldon Axler](http://www.axler.net/DwD.html)).\nThe determinant of an $n\\times n$ matrix $M=m_{ij}$ can be formally defined as\n\\[ \\det(M)=\\sum_{\\sigma\\in S_n}\\left(\\text{sgn}(\\sigma)\\prod_{i=1}^nm_{i,\\sigma(i)}\\right) \\] where the sum is taken over all permutations $\\sigma$ of $1,2,\\ldots,n$, and where $\\text{sgn}(\\sigma)$ is the sign of the permutation; which is defined in terms of the number of digit swaps to get to it: an even number of swaps has a sign of 1, and an odd number a sign of $-1$. The determinant can be effectively computed by Gaussian elimination of a matrix into triangular form, which takes in the order of $n^3$ operations; the determinant is then the product of the diagonal elements.\nThe [permanent](https://en.wikipedia.org/wiki/Permanent_(mathematics)) is defined similarly, except for the sign:\n\\[ \\text{per}(M)=\\sum_{\\sigma\\in S_n}\\prod_{i=1}^nm_{i,\\sigma(i)}. \\]\nRemarkably enough, this simple change renders the permanent impossible to be computed effectively; all known algorithms have exponential orders. Computing by expanding each permutation takes $O(n!n)$ operations, some better algorithms (such as [Ryser\u0026#39;s algorithm](https://en.wikipedia.org/wiki/Permanent_(mathematics))) have order $O(2^{n-1}n)$. The permanent has some applications, although not as many as the determinant. 
An easy and immediate result is that if $M$ is a matrix consisting entirely of ones, except for the main diagonal of zeros (so that it is the \u0026#34;ones complement\u0026#34; of the identity matrix), its permanent is the number of derangements of $n$ objects; that is, the number of permutations in which there are no fixed points.\nFirst Python. Here is a simple program, saved as permanent.py, to compute the permanent from its definition:\nimport itertools as it import numpy as np def permanent(m): nr,nc = np.shape(m) if nr != nc: raise ValueError(\u0026#34;Matrix must be square\u0026#34;) pm = 0 for p in it.permutations(range(nr)): pm += np.prod([m[i,p[i]] for i in range(nr)]) return pm I am not interested in optimizing speed; simply to implement the same algorithm in Python and Julia to see what happens. Now let\u0026#39;s run this in a Python REPL (I\u0026#39;m using IPython here):\nIn [1]: import permanent as pt In [2]: import numpy as np In [3]: M = (1 - np.identity(4)).astype(np.intc) In [4]: pt.permanent(M) Out[4]: 9 and this is correct. This result was practically instantaneous, but it slows down appreciably, as you\u0026#39;d expect, for larger matrices:\nIn [5]: from timeit import default_timer as timer In [6]: M = (1 - np.identity(8)).astype(np.intc) In [7]: t = timer();print(pt.permanent(M));timer()-t 14833 Out[7]: 0.7398275199811906 In [8]: M = (1 - np.identity(9)).astype(np.intc) In [9]: t = timer();print(pt.permanent(M));timer()-t 133496 Out[9]: 10.244881154998438 In [10]: M = (1 - np.identity(10)).astype(np.intc) In [11]: t = timer();print(pt.permanent(M));timer()-t 1334961 Out[11]: 86.57762016600464 Now no doubt this could be speeded up in numerous ways, but that is not my point: I am simply implementing the same algorithm in each language. At any rate, my elementary program becomes effectively unusable for matrices bigger than about $8\\times 8$.\nNow for Julia. 
Again, we start with a simple program:\nusing Combinatorics function permanent(m) nr,nc = size(m) if nr != nc error(\u0026#34;Matrix must be square\u0026#34;) end pm = 0 for p in permutations(1:nr) pm += prod(m[i,p[i]] for i in 1:nr) end return(pm) end You can see this program and the Python one above are, to all intents and purposes, identical. There are no clever optimizing tricks, it is a raw implementation of the basic definition.\nFirst, a quick test:\njulia\u0026gt; using LinearAlgebra julia\u0026gt; M = 1 .- Matrix(1I,4,4); julia\u0026gt; include(\u0026#34;permanent.jl\u0026#34;) julia\u0026gt; permanent(M) 9 So far, so good. Now for some time trials:\njulia\u0026gt; using BenchmarkTools julia\u0026gt; M = 1 .- Matrix(1I,8,8); julia\u0026gt; @time permanent(M) 0.020514 seconds (201.61 k allocations: 14.766 MiB) 14833 julia\u0026gt; M = 1 .- Matrix(1I,9,9); julia\u0026gt; @time permanent(M) 0.245049 seconds (1.81 M allocations: 143.965 MiB, 33.73% gc time) 133496 julia\u0026gt; M = 1 .- Matrix(1I,10,10); julia\u0026gt; @time permanent(M) 1.336724 seconds (18.14 M allocations: 1.406 GiB, 3.20% gc time) 1334961 You\u0026#39;ll see that Julia, thanks to its JIT compiler, is much much faster than Python. The point is that I didn\u0026#39;t have to do anything here to access that speed, it\u0026#39;s just a splendid part of the language.\nWinner: Julia, by a country mile.\nA few words at the end The timings given above are not absolute - running on a different system or with different versions of Python, Julia, and their libraries, will give different results. But the point is not the exact times taken, but the comparison of time between Julia and Python. For what it\u0026#39;s worth, I\u0026#39;m running a fairly recently upgraded version of Arch Linux on a Lenovo Thinkpad X1 Carbon, generation 3. I\u0026#39;m running Julia 1.3.0 and Python 3.7.4. 
The machine has 8Gb of memory, of which about 2Gb were free.\nPoles of inaccessibility\u0026#xa0;\u0026#xa0;\u0026#xa0;image_processing\u0026#xa0;julia Just recently there was a news item about a solo explorer being the first Australian to reach the Antarctic \u0026#34;Pole of Inaccessibility\u0026#34;. Such a Pole is usually defined as that place on a continent that is furthest from the sea. The [South Pole](https://en.wikivoyage.org/wiki/South_Pole) is about 1300km from the nearest open sea, and can be reached by specially fitted aircraft, or by tractors and sleds along the 1600km \u0026#34;South Pole Highway\u0026#34; from McMurdo Base. However, it is only about 500km from the nearest coast line on the Ross Ice Shelf. McMurdo Base is situated on the outside of the Ross Ice Shelf, so that it is accessible from the sea.\nThe Southern Pole of Inaccessibility is about 870km further inland from the South Pole, and is very hard to reach—indeed the first people there were a Russian party in 1958, whose enduring legacy is a bust of Lenin at that Pole. Unlike at the South Pole, there is no base or habitation there; just a frigid wilderness. The Southern Pole of Inaccessibility is 1300km from the nearest coast.\nA pole of inaccessibility on any landmass can be defined as the centre of the largest circle that can be drawn entirely within it. You can see all of these for the world\u0026#39;s continents at an [ArcGIS site](https://apl.maps.arcgis.com/apps/MapJournal/index.html?appid=ce19bec7a3c541d0b95c449df9bb8eb5). If you don\u0026#39;t want to wait for the images to load, here is Antarctica:\nYou\u0026#39;ll notice that this map of Antarctica is missing the ice shelves, which fill up most of the bays. 
If the ice shelves are included, then we can draw a larger circle.\nAs an image processing exercise, I decided to experiment, using the [distance transform](https://en.wikipedia.org/wiki/Distance_transform) to measure distances from the coasts, and [Julia](https://julialang.org) as the language. Although Julia has now been in development for a decade, it\u0026#39;s still a \u0026#34;new kid on the block\u0026#34;. But some of its libraries (known as \u0026#34;packages\u0026#34;) are remarkably mature and robust. One such is its imaging package [Images](https://juliaimages.org/latest/).\nIn fact, we need to install and use several packages as well as Images: Colors, FileIO, ImageView, Plots. These can all be added with Pkg.add(\u0026#34;packagename\u0026#34;) and brought into the namespace with using packagename. We also need an image to use, and for Antarctica I chose this very nice map of the ice surface:\ntaken from a [BBC report](https://www.bbc.com/news/science-environment-21692423). The nice thing about this map is that it shows the ice shelves, so that we can experiment with and without them. We start by reading the image, making it gray-scale, and thresholding it so as to remove the ice-shelves:\njulia\u0026gt; ant = load(\u0026#34;antarctica.jpg\u0026#34;); julia\u0026gt; G = Gray.(ant); julia\u0026gt; B = G .\u0026gt; 0.8; julia\u0026gt; imshow(B) which produces this:\nwhich as you see has removed the ice shelves. Now we apply the distance transform and find its maximum:\njulia\u0026gt; D = distance_transform(feature_transform(B)); julia\u0026gt; findmax(D) (116.48175822848829, CartesianIndex(214, 350)) This indicates that the largest distance from all edges is about 116, at pixel location (214,350). To show this circle on the image it\u0026#39;s easiest to simply plot the image, and then plot the circle on top of it. 
We\u0026#39;ll also plot the centre, as a smaller circle:\njulia\u0026gt; plot(ant,aspect_ratio = 1) julia\u0026gt; x(t) = 214 + 116*cos(t) julia\u0026gt; y(t) = 350 + 116*sin(t) julia\u0026gt; xc(t) = 214 + 5*cos(t) julia\u0026gt; yc(t) = 350 + 5*sin(t) julia\u0026gt; plot!(y,x,0,2*pi,lw=3,lc=\u0026#34;red\u0026#34;,leg=false) julia\u0026gt; plot!(yc,xc,0,2*pi,lw=2,lc=\u0026#34;white\u0026#34;,leg=false) This shows the Pole of Inaccessibility in terms of the actual Antarctic continent. However, in practical terms the ice shelves, even though not actually part of the landmass, need to be traversed just the same. To include the ice shelves, we just threshold at a higher value:\njulia\u0026gt; B1 = opening(G .\u0026gt; 0.95); julia\u0026gt; D1 = distance_transform(feature_transform(B1)); julia\u0026gt; findmax(D1) (160.22796260328596, CartesianIndex(225, 351)) The use of opening in the first line is to fill in any holes in the image: the distance transform is very sensitive to holes. Then we plot the continent again with the two circles as above, but using this new centre and radius. This produces:\nThe position of the Pole of Inaccessibility has not in fact changed all that much from the first one.\nAn interesting sum\u0026#xa0;\u0026#xa0;\u0026#xa0;mathematics\u0026#xa0;analysis I am not an analyst, so I find the sums of infinite series quite mysterious. For example, here are three. The first one is the value of $\\zeta(2)$, very well known, sometimes called the \u0026#34;Basel Problem\u0026#34; and first determined by (of course) Euler: \\[ \\sum_{n=1}^\\infty\\frac{1}{n^2}=\\frac{\\pi^2}{6}. 
\\] Second, subtracting one from the denominator: \\[ \\sum_{n=2}^\\infty\\frac{1}{n^2-1}=\\frac{3}{4} \\] This sum is easily demonstrated by partial fractions: \\[ \\frac{1}{n^2-1}=\\frac{1}{2}\\left(\\frac{1}{n-1}-\\frac{1}{n+1}\\right) \\] and so the series can be expanded as: \\[ \\frac{1}{2}\\left(\\frac{1}{1}-\\frac{1}{3}+\\frac{1}{2}-\\frac{1}{4}+\\frac{1}{3}-\\frac{1}{5}+\\cdots\\right) \\] This is a telescoping series in which every term in the brackets is cancelled except for $1+1/2$, which produces the sum immediately.\nFinally, add one to the denominator: \\[ \\sum_{n=1}^\\infty\\frac{1}{n^2+1}=\\frac{1}{2}(\\pi\\coth(\\pi)-1). \\] And this sum is obtained by setting $z=\\pi$ in one of the [series representations](http://functions.wolfram.com/ElementaryFunctions/Coth/06/05/) for $\\coth(z)$: \\[ \\coth(z)=\\frac{1}{z}+2z\\sum_{n=1}^\\infty\\frac{1}{\\pi^2n^2+z^2} \\]\n(for all $z$ except for when $\\pi^2n^2+z^2=0$).\nI was looking around for infinite series to give my numerical methods students to test their powers of approximation, and I came across [this beauty](https://www.wolframalpha.com/input/?i=sum+1%2F(n^2%2Bn-1)%2C+n%3D1+to+infinity): \\[ \\sum_{n=1}^\\infty\\frac{1}{n^2+n-1}=1+\\frac{\\pi}{\\sqrt{5}}\\tan\\left(\\frac{\\sqrt{5}\\pi}{2}\\right). \\] This led me on a mathematical treasure hunt through books and all over the internet, until I had worked it out. My starting place, after googling \u0026#34;sum quadratic reciprocal\u0026#34; was a [very nice and detailed post on stackexchange](https://math.stackexchange.com/questions/1322086/series-of-reciprocals-of-a-quadratic-polynomial). 
This post then referred to a [previous one](https://math.stackexchange.com/questions/1294790/the-value-of-sum-n-inftyn-infty-frac1n2-z2-on-mathbbc-setmi/1294829#1294829) which started with the infinite product expression for $\\sin(x)$ and turned it (by taking logarithms and differentiating) into a series for $\\cot(x)$.\nHowever, I want an expression for $\\tan(x)$, which means starting with the infinite product form for $\\sec(x)$, which is:\n\\[ \\sec(x)=\\prod_{n=1}^\\infty\\frac{\\pi^2(2n-1)^2}{\\pi^2(2n-1)^2-4x^2}. \\] Making a substitution simplifies the expression in the product: \\[ \\sec\\left(\\frac{\\pi x}{2}\\right)=\\prod_{n=1}^\\infty\\frac{(2n-1)^2}{(2n-1)^2-x^2}. \\] Now take logs of both sides:\n\\[ \\log\\left(\\sec\\left(\\frac{\\pi x}{2}\\right)\\right)= \\sum_{n=1}^\\infty\\log\\left(\\frac{(2n-1)^2}{(2n-1)^2-x^2}\\right) \\]\nand differentiate: \\[ \\frac{\\pi}{2}\\tan\\left(\\frac{\\pi x}{2}\\right)= \\sum_{n=1}^\\infty\\frac{2x}{(2n-1)^2-x^2}. \\] Now we have to somehow equate this new sum on the right with our original sum. So let\u0026#39;s go back to it.\nFirst of all, a bit of completing the square produces \\[ \\frac{1}{n^2+n-1}=\\frac{1}{\\left(n+\\frac{1}{2}\\right)^2-\\frac{5}{4}}=\\frac{4}{(2n+1)^2-5}. \\] This means that \\[ \\sum_{n=1}^\\infty\\frac{1}{n^2+n-1}=\\sum_{n=2}^\\infty\\frac{4}{(2n-1)^2-5}= \\frac{2}{\\sqrt{5}}\\sum_{n=2}^\\infty\\frac{2\\sqrt{5}}{(2n-1)^2-5}. \\] We have changed the index from $n=1$ to $n=2$ which allows the rewriting of $2n+1$ as $2n-1$. This means we are missing a first term. Comparing the final sum with that for $\\tan(x)$ above (with $x=\\sqrt{5}$), we have \\[ \\sum_{n=1}^\\infty\\frac{1}{n^2+n-1}=\\frac{2}{\\sqrt{5}}\\left(\\frac{\\pi}{2}\\tan\\left(\\frac{\\pi \\sqrt{5}}{2}\\right)-\\frac{-\\sqrt{5}}{2}\\right) \\] where the last term is the missing first term: the summand for $n=1$. 
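Before tidying this up, the bracketed expression just obtained can be sanity-checked numerically. The following Python sketch compares it with a long partial sum; the function names and the cut-off of one million terms are arbitrary choices made for this illustration.

```python
import math


def partial_sum(terms):
    # Sum of 1/(n^2 + n - 1) for n = 1, 2, ..., terms
    return sum(1.0 / (n * n + n - 1) for n in range(1, terms + 1))


def closed_form():
    # (2/sqrt 5) * ((pi/2) tan(pi sqrt 5 / 2) - (-sqrt 5 / 2))
    s5 = math.sqrt(5)
    return (2 / s5) * ((math.pi / 2) * math.tan(math.pi * s5 / 2) + s5 / 2)


approx = partial_sum(10**6)
exact = closed_form()
print(approx, exact, abs(approx - exact))
```

The terms behave like $1/n^2$, so the tail beyond $N$ terms is roughly $1/N$; with a million terms the two values should agree to about five decimal places.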
Simplifying the right hand side produces \\[ \\sum_{n=1}^\\infty\\frac{1}{n^2+n-1}=1+\\frac{\\pi}{\\sqrt{5}}\\tan\\left(\\frac{\\sqrt{5}\\pi}{2}\\right). \\] Note that the above series for $\\tan(x)$ can be obtained directly, using a general technique discussed (for example) in that [fine old text](https://en.wikipedia.org/wiki/A_Course_of_Modern_Analysis): \u0026#34;A Course of Modern Analysis\u0026#34;, by E. T. Whittaker and G. N. Watson. If $f(x)$ has only simple poles $a_n$ with residues $b_n$, then \\[ f(x) = f(0)+\\sum_{n=1}^\\infty b_n\\left(\\frac{1}{x-a_n}+\\frac{1}{a_n}\\right). \\] Expressing a function as a series of such reciprocals is known as [Mittag-Leffler\u0026#39;s theorem](https://en.wikipedia.org/wiki/Mittag-Leffler\u0026#39;s_theorem) and in fact the series for $\\tan(x)$ is given there as one of the examples.\nRunge\u0026#39;s phenomenon in Geogebra\u0026#xa0;\u0026#xa0;\u0026#xa0;mathematics\u0026#xa0;computation\u0026#xa0;geogebra Runge\u0026#39;s phenomenon says roughly that a polynomial through equally spaced points over an interval will wobble a lot near the ends. Runge demonstrated this by fitting polynomials through equally spaced points in the interval $[-1,1]$ on the function \\[ \\frac{1}{1+25x^2} \\] and this function is now known as \u0026#34;Runge\u0026#39;s function\u0026#34;.\nIt turns out that Geogebra can illustrate this extremely well.\nEqually spaced vertices Either open up your local version of Geogebra, or go to http://geogebra.org/graphing. In the boxes on the left, enter the following expressions in turn:\nStart by entering Runge\u0026#39;s function \\[ f(x)=\\frac{1}{1+25x^2} \\] You should now either zoom in, or use the graph settings tool to display $x$ between $-1.5$ and $1.5$. 
Create a list of $x$ values: \\[ x1 = \\frac{\\\\{-5..5\\\\}}{5} \\] Use those values to create a set of points on the curve: \\[ p1 = (x1,f(x1)) \\] Now create an interpolating polynomial through them: \\[ \\mathsf{Polynomial}[p1(1)] \\] The resulting graph looks like this:\nChebyshev vertices For the purpose of this post, we\u0026#39;ll take the Chebyshev vertices to be those points in the graph whose $x$ coordinates are given by\n\\[ x_k = \\cos\\left(\\frac{k\\pi}{10}\\right) \\] for $k = 0,1,2,\\ldots,10$. These values are more clustered at the ends of the interval.\nIn Geogebra:\nAs before, enter the $x$ values first: \\[ x2 = \\cos(\\frac{\\\\{0..10\\\\}\\cdot\\pi}{10}) \\] Then turn them into a sequence of points on the curve \\[ p2 = (x2,f(x2)) \\] Finally create the polynomial through them: $\\mathsf{Polynomial}(p2(1))$. And this graph looks like this:\nYou\u0026#39;ll notice how much better the second polynomial hugs the curve. The issue is even more pronounced with 21 points, either separated by $0.1$, or with $x$ values given by the cosine function again. All we need do is to change the definitions of the $x$ value sequences $x1$ and $x2$ to:\n\\[ \\eqalign{ x1 \u0026amp;= \\frac{\\\\{-10..10\\\\}}{10}\\\\ x2 \u0026amp;= \\cos(\\frac{\\\\{0..20\\\\}\\pi}{20}) } \\]\nIn fact, you can create a slider $1\\le N \\le 20$, say, and then define\n\\[ \\eqalign{ x1 \u0026amp;= \\frac{\\\\{-N..N\\\\}}{N}\\\\ x2 \u0026amp;= \\cos(\\frac{\\\\{0..2N\\\\}\\pi}{2N}) } \\]\nand then see how as $N$ increases, the \u0026#34;Chebyshev\u0026#34; interpolant fits the curve better than the equally spaced interpolant. For $N=20$, the turning points of the equally spaced polynomial have $y$ values as high as $59.78$.\nIntegration Using equally spaced values to create an interpolating polynomial and then integrating that polynomial is [Newton-Cotes integration](https://en.wikipedia.org/wiki/Newton-Cotes_formulas). 
Runge\u0026#39;s phenomenon shows why it is better to partition the interval into small sub-intervals and apply a low-order rule to each one. For example, with 21 points on the curve, we would be better applying Simpson\u0026#39;s rule to each pair of adjacent sub-intervals, and adding the results. Using a 21-point polynomial is equivalent to a Newton-Cotes rule of order 20, which is far too inaccurate to use.\nWith our curve $f(x)$, and our equal-spaced polynomial $g(x)$, the integrals are\n\\[ \\eqalign{ \\int^1_{-1}\\frac{1}{1+25x^2}\\,dx\u0026amp;=\\frac{2}{5}\\arctan(5)\\approx 0.5493603067780064\\\\ \\int^1_{-1}g(x)\\,dx\u0026amp;\\approx -5.369910417304622 } \\]\nHowever, using the polynomial $h(x)$ through the Chebyshev nodes:\n\\[ \\int^1_{-1}h(x)\\,dx\\approx 0.5498082303389538. \\]\nThe absolute errors between the integral values and the exact values are thus (approximately) $5.92$ and $0.00045$ respectively.\nIntegrating an interpolating polynomial through Chebyshev nodes is one way of implementing [Clenshaw-Curtis quadrature](https://en.wikipedia.org/wiki/Clenshaw-Curtis_quadrature).\nNote that using Simpson\u0026#39;s rule on our 21 points produces a value of 0.5485816035037206, which has absolute error of about $0.00078$.\nFitting the SIR model of disease to data in Python\u0026#xa0;\u0026#xa0;\u0026#xa0;mathematics\u0026#xa0;computation\u0026#xa0;python Introduction and the problem The SIR model for spread of disease was first proposed in 1927 in a collection of three articles in the Proceedings of the Royal Society by [Anderson Gray McKendrick](https://en.wikipedia.org/wiki/Anderson_Gray_McKendrick) and [William Ogilvy Kermack](https://en.wikipedia.org/wiki/William_Ogilvy_Kermack); the resulting theory is known as [Kermack–McKendrick theory](https://en.wikipedia.org/wiki/Kermack%E2%80%93McKendrick_theory), now considered a subclass of a more general theory known as [compartmental models](https://en.wikipedia.org/wiki/Compartmental_models_in_epidemiology) in 
epidemiology. The three original articles were republished in 1991, in a special issue of the [Bulletin of Mathematical Biology](https://link.springer.com/journal/11538/53/1/page/1).\nThe SIR model is so named because it assumes a static population, so no births or deaths, divided into three mutually exclusive classes: those susceptible to the disease; those infected with the disease, and those recovered with immunity from the disease. This model is clearly not applicable to all possible epidemics: there may be births and deaths, people may be re-infected, and so on. More complex models take these and other factors into account.\nThe SIR model consists of three non-linear ordinary differential equations, parameterized by two growth factors $\\beta$ and $\\gamma$:\n\\begin{eqnarray*} \\frac{dS}{dt}\u0026amp;=\u0026amp;-\\frac{\\beta IS}{N}\\\\ \\frac{dI}{dt}\u0026amp;=\u0026amp;\\frac{\\beta IS}{N}-\\gamma I\\\\ \\frac{dR}{dt}\u0026amp;=\u0026amp;\\gamma I \\end{eqnarray*} Here $N$ is the population, and since each of $S$, $I$ and $R$ represent the number of people in mutually exclusive sets, we should have $S+I+R=N$. Note that the right hand sides of the equations sum to zero, hence \\[ \\frac{dS}{dt}+\\frac{dI}{dt}+\\frac{dR}{dt}=\\frac{dN}{dt}=0 \\] which indicates that the population is constant.\nThe problem here is to see if values of $\\beta$ and $\\gamma$ can be found which will provide a close fit to a well-known epidemiological case study: that of influenza in a British boarding school. This was described in a 1978 issue of the [British Medical Journal](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1603269/pdf/brmedj00115-0064.pdf).\nThis should provide a good test of the SIR model, as it satisfies all of the criteria. 
In this case there was a total population of 763, and an outbreak of 14 days, with infected numbers as:\nDay: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Infections: 1 3 6 25 73 222 294 258 237 191 125 69 27 11 4 Some years ago, on my old (and now taken off line) blog, I explored using [Julia](https://julialang.org/) for this task, which was managed easily (once I got the hang of Julia syntax). However, that was five years ago, and when I tried to recreate it I found that Julia had changed enough in the meantime that my original comments and code no longer worked. So I decided to experiment with Python instead, in which I have more expertise, or at least, experience.\nUsing Python Setup and some initial computation We start by importing some of the modules and functions we need, and define the data from the table above:\nimport matplotlib.pyplot as plt import numpy as np from scipy.integrate import solve_ivp data = [1, 3, 6, 25, 73, 222, 294, 258, 237, 191, 125, 69, 27, 11, 4] Now we enter the SIR system with some (randomly chosen) values of $\\beta$ and $\\gamma$, using syntax conformable with the solver solve_ivp. Note that the code omits the division by $N$, so that its beta plays the role of $\\beta/N$ in the equations above:\nbeta,gamma = [0.01,0.1] def SIR(t,y): S = y[0] I = y[1] R = y[2] return([-beta*S*I, beta*S*I-gamma*I, gamma*I]) We can now solve this system using solve_ivp and plot the results:\nsol = solve_ivp(SIR,[0,14],[762,1,0],t_eval=np.arange(0,14.2,0.2)) fig = plt.figure(figsize=(12,4)) plt.plot(sol.t,sol.y[0]) plt.plot(sol.t,sol.y[1]) plt.plot(sol.t,sol.y[2]) plt.plot(np.arange(0,15),data,\u0026#34;k*:\u0026#34;) plt.grid(True) plt.legend([\u0026#34;Susceptible\u0026#34;,\u0026#34;Infected\u0026#34;,\u0026#34;Removed\u0026#34;,\u0026#34;Original Data\u0026#34;]) Note that we have used the t_eval argument in our call to solve_ivp which allows us to exactly specify the points at which the solution will be given. 
This will allow us to align points in the computed values of $I$ with the original data.\nThe output is this plot:\n![First SIR plot](/first_sir_plot.png)\nThere is a nicer plot, with more attention paid to setup and colours, [here](https://scipython.com/book/chapter-8-scipy/additional-examples/the-sir-epidemic-model/). That example uses scipy\u0026#39;s odeint tool to solve the SIR equations. This works perfectly well, but I believe it is being deprecated in favour of solve_ivp.\nFitting the data To find values $\\beta$ and $\\gamma$ with a better fit to the data, we start by defining a function which gives the sum of squared differences between the data points, and the corresponding values of $I$. The problem will then be to minimize that function.\ndef sumsq(p): beta, gamma = p def SIR(t,y): S = y[0] I = y[1] R = y[2] return([-beta*S*I, beta*S*I-gamma*I, gamma*I]) sol = solve_ivp(SIR,[0,14],[762,1,0],t_eval=np.arange(0,14.2,0.2)) return(sum((sol.y[1][::5]-data)**2)) As you see, we have just wrapped the SIR definition and its solution inside a calling function whose variables are the parameters of the SIR equations.\nTo minimize this function we need to import another python function first:\nfrom scipy.optimize import minimize msol = minimize(sumsq,[0.001,1],method=\u0026#39;Nelder-Mead\u0026#39;) msol.x with output:\narray([0.00218035, 0.44553886]) To see if this does provide a better fit, we can simply run the solver with these values, and plot them, as we did at the beginning:\nbeta,gamma = msol.x def SIR(t,y): S = y[0] I = y[1] R = y[2] return([-beta*S*I, beta*S*I-gamma*I, gamma*I]) sol = solve_ivp(SIR,[0,14],[762,1,0],t_eval=np.arange(0,14.2,0.2)) fig = plt.figure(figsize=(10,4)) plt.plot(sol.t,sol.y[0],\u0026#34;b-\u0026#34;) plt.plot(sol.t,sol.y[1],\u0026#34;r-\u0026#34;) plt.plot(sol.t,sol.y[2],\u0026#34;g-\u0026#34;) plt.plot(np.arange(0,15),data,\u0026#34;k*:\u0026#34;) 
plt.legend([\u0026#34;Susceptible\u0026#34;,\u0026#34;Infected\u0026#34;,\u0026#34;Removed\u0026#34;,\u0026#34;Original Data\u0026#34;]) with output:\n![Second SIR plot](/second_sir_plot.png)\nand as you see, a remarkably close fit!\nMapping voting gains between elections\u0026#xa0;\u0026#xa0;\u0026#xa0;voting\u0026#xa0;GIS\u0026#xa0;python So this goes back quite some time to the recent Australian Federal election on May 18. In my own electorate (known formally as a \u0026#34;Division\u0026#34;) of Cooper, the Greens, who until recently had been showing signs of winning the seat, were pretty well trounced by Labor.\nSome background asides First, \u0026#34;Labor\u0026#34; as in \u0026#34;Australian Labor Party\u0026#34; is spelled the American way; that is, without a \u0026#34;u\u0026#34;, even though \u0026#34;labour\u0026#34; meaning work, is so spelled in Australian English. This is because much of Australia\u0026#39;s pre-federal political history has a large American influence; indeed one of the loudest political voices in the 19th century was [King O\u0026#39;Malley](https://en.wikipedia.org/wiki/King_O%27Malley) who was born in 1858 in Kansas, and didn\u0026#39;t come to Australia until he was about 30. He was responsible, amongst other things, for selecting the site for Canberra (the nation\u0026#39;s capital) and in selecting Walter Burley Griffin as its architect. As well, the Australian Constitution shows a large American influence; the constitutions of both countries bear a remarkably close resemblance, and Australia\u0026#39;s parliament is modelled on that of the United States Congress.\nSecond, my electorate was formerly known as \u0026#34;Batman\u0026#34; after [John Batman](https://en.wikipedia.org/wiki/John_Batman), the supposed founder of Melbourne. However, Batman was known in his lifetime as a contemptible figure, and the historical record well bears out the description of him as \u0026#34;the vilest man I have ever known\u0026#34;. 
He was responsible for the slaughter of indigenous peoples, and his so-called \u0026#34;treaty\u0026#34; with the people of the Kulin Nation in exchange for land on what would become Melbourne is considered invalid. In respect for the local peoples (\u0026#34;who have never ceded sovereignty\u0026#34;), the electorate was renamed last year in honour of [William Cooper](https://en.wikipedia.org/wiki/William_Cooper_(Aboriginal_Australian)), a Yorta Yorta elder, a tireless activist for Aboriginal rights, and the only individual in the world to lodge a formal protest to a German embassy on the occasion of Kristallnacht.\nBack to mapping All I wanted to do was to map the size of Labor\u0026#39;s gains (over the Greens) between the last election in 2016 and this one, at each polling booth in the electorate. For this I used the following Python packages: [matplotlib](https://matplotlib.org/), [pandas](https://pandas.pydata.org/), [geopandas](http://geopandas.org/), [numpy](https://numpy.org/), [cartopy](https://scitools.org.uk/cartopy/docs/latest/). The nice thing about Python, for me at least, is the ease of prototyping, and the availability of packages for just about everything. Indeed, for an amateur programmer like myself, one of the biggest difficulties is finding the right package for the job. There\u0026#39;s a score or more for GIS alone.\nAll information about the election can be downloaded from the [Australian Electoral Commission tallyroom](https://tallyroom.aec.gov.au/HouseDefault-24310.htm). The GIS information can also be obtained from the [AEC](https://www.aec.gov.au/Electorates/gis/gis_datadownload.htm). 
The Victorian [shapefile](https://en.wikipedia.org/wiki/Shapefile) needs to be unzipped before using.\nThen the map set-up looks like this:\nshp = gpd.read_file(\u0026#39;VicMaps/E_AUGFN3_region.shp\u0026#39;) cooper = shp.loc[shp[\u0026#39;Elect_div\u0026#39;]==\u0026#39;Cooper\u0026#39;] bounds = cooper.geometry.bounds bg = bounds.values[0] # latitude, longitude of extent of region pad = 0.01 # padding for display extent = [bg[0]-pad,bg[2]+pad,bg[1]-pad,bg[3]+pad] The idea of the padding is simply to ensure that the map, once displayed, extends beyond the area of the electorate. The units are degrees of latitude and longitude. In Melbourne, at about 37.8 degrees south, 0.01 degrees latitude corresponds to about 1.11km, and 0.01 degrees longitude corresponds to about 0.88km.\nNow we need to determine the percentage of votes to Labor (on a two-party preferred computation) which again involves reading material from the AEC site. I downloaded it first, but it could also be read directly from the site into pandas.\n# Get all votes by polling booths in Cooper for 2019 b19 = pd.read_csv(\u0026#39;Elections/TCPByCandidateByPollingPlaceVIC-2019.csv\u0026#39;) # b19 = booths, 2019 v19 = b19.loc[b19.DivisionNm==\u0026#39;Cooper\u0026#39;] # v19 = Cooper Booths, 2019 v19 = v19[[\u0026#39;PollingPlace\u0026#39;,\u0026#39;PartyAb\u0026#39;,\u0026#39;OrdinaryVotes\u0026#39;]] v19r = v19.loc[v19.PartyAb==\u0026#39;GRN\u0026#39;] v19k = v19.loc[v19.PartyAb==\u0026#39;ALP\u0026#39;] v19c = v19r.merge(v19k,left_on=\u0026#39;PollingPlace\u0026#39;,right_on=\u0026#39;PollingPlace\u0026#39;) # Complete votes v19c[\u0026#39;Percent_x\u0026#39;] = (v19c[\u0026#39;OrdinaryVotes_x\u0026#39;]*100/(v19c[\u0026#39;OrdinaryVotes_x\u0026#39;]+v19c[\u0026#39;OrdinaryVotes_y\u0026#39;])).round(2) v19c[\u0026#39;Percent_y\u0026#39;] = (v19c[\u0026#39;OrdinaryVotes_y\u0026#39;]*100/(v19c[\u0026#39;OrdinaryVotes_x\u0026#39;]+v19c[\u0026#39;OrdinaryVotes_y\u0026#39;])).round(2) v19c = v19c.dropna() 
# note: suffix _x is GRN; suffix _y is ALP The next step is to determine the positions of all polling places. For simplicity, I\u0026#39;m only interested in places used in both this most recent election, and the previous federal election in 2016:\nv16 = pd.read_csv(\u0026#39;Elections/Batman2016_TCP.csv\u0026#39;) v_both = v16.merge(v19c,right_on=\u0026#39;PollingPlace\u0026#39;,left_on=\u0026#39;Booth\u0026#39;) c19 = pd.read_csv(\u0026#39;Elections/Cooper_PollingPlaces.csv\u0026#39;) c19 = c19.drop(44).reset_index() # Drop Special Hospital, which has index 44 v_all=v_both.merge(c19,left_on=\u0026#39;PollingPlace\u0026#39;,right_on=\u0026#39;PollingPlaceNm\u0026#39;) lats = np.array(v_all[\u0026#39;Latitude\u0026#39;]) longs = np.array(v_all[\u0026#39;Longitude\u0026#39;]) booths = np.array(v_all[\u0026#39;PollingPlaceNm\u0026#39;]) diffs = np.array(v_all[\u0026#39;Percent_y\u0026#39;]-v_all[\u0026#39;ALP percent\u0026#39;]) # change in ALP percentage Having now got the boundary of the electorate, the positions of each polling booth, and the percentage change in votes for Labor, the map can now be created and displayed:\nfig = plt.figure(figsize = (16,16)) tiler = GoogleTiles() ax = plt.axes(projection=tiler.crs) ax.set_extent(extent) ax.add_image(tiler, 13, interpolation=\u0026#39;hanning\u0026#39;) ax.add_geometries(cooper.geometry,crs=ccrs.PlateCarree(),facecolor=\u0026#39;none\u0026#39;,edgecolor=\u0026#39;k\u0026#39;,linewidth=2) for i in range(len(diffs)): if diffs[i]\u0026gt;0: ax.plot(longs[i],lats[i],marker=\u0026#39;o\u0026#39;,markersize=diffs[i],markerfacecolor=\u0026#39;r\u0026#39;,transform=ccrs.Geodetic()) else: ax.plot(longs[i],lats[i],marker=\u0026#39;o\u0026#39;,markersize=-diffs[i],markerfacecolor=\u0026#39;g\u0026#39;,transform=ccrs.Geodetic()) plt.show() I found by trial and error that Hanning interpolation seemed to give the best results.\n![Swings 2019](/swings.jpg)\nSo this image shows not the size of Labor\u0026#39;s vote, but the size of 
Labor\u0026#39;s gain since the previous election. The larger gains are in the southern part of the electorate: the northern part has always been more Labor friendly, and so the gains were smaller there.\nEducational disciplines: size against market growth Here is an [interactive version](/educ.html) of this diagram:\n[![APF Figure 4](/APF_figure_4_small.png)](/APF_figure_4.png)\n(click on the image to show a larger version.)\nTschirnhausen transformations and the quartic\u0026#xa0;\u0026#xa0;\u0026#xa0;mathematics\u0026#xa0;algebra Here we show how a Tschirnhausen transformation can be used to solve a quartic equation. The steps are:\n1. Ensure the quartic is missing the cubic term, and its initial coefficient is 1. We can do this by first dividing by the initial coefficient to obtain an equation \\[ x^4+b_3x^3+b_2x^2+b_1x+b_0=0 \\] and then replacing the variable $x$ with $y-b_3/ 4$. This will produce a monic quartic equation missing the cubic term. We can thus take \\[ x^4+bx^2+cx+d=0 \\] as a completely general formulation of the quartic equation.\n2. Now apply the Tschirnhausen transformation \\[ y = x^2+rx+s \\] using the resultant, which will produce a quartic equation in $y$.\n3. Choose $r$ and $s$ so that the coefficients of the linear and cubic terms are zero. This will require solving a cubic equation.\n4. Substitute those $r,s$ values into the resultant, which will produce a biquadratic equation in $y$: \\[ y^4+Ay^2+B=0. \\] This can be readily solved as it\u0026#39;s a quadratic in $y^2$.\n5. Finally, for each value of $y$, and using the $r,s$ values, solve \\[ x^2+rx+s-y=0. \\] This will in fact produce eight values, of which four are the solutions to the original quartic. 
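The last step deserves a quick numerical sanity check. The sketch below is plain Python rather than Sagemath, and it borrows the values $r=-8$, $s=-47$ and the biquadratic $y^4-256y^2+1024=0$ from the worked example in the next section; the names `quartic` and `candidates` and the tolerance of 1e-6 are only illustrative choices. It builds all eight candidate values of $x$ and keeps those that actually satisfy the quartic.

```python
import cmath

def quartic(x):
    # The example quartic x^4 - 94x^2 - 480x - 671 from the next section
    return x**4 - 94*x**2 - 480*x - 671

# Values of r and s taken from the worked example (assumed here, not derived)
r, s = -8, -47

# Solve the biquadratic y^4 - 256y^2 + 1024 = 0 as a quadratic in y^2
disc = cmath.sqrt(256**2 - 4*1024)
y_values = []
for y_squared in ((256 + disc)/2, (256 - disc)/2):
    root = cmath.sqrt(y_squared)
    y_values += [root, -root]

# For each y, solve x^2 + r*x + (s - y) = 0 with the quadratic formula
candidates = []
for y in y_values:
    d = cmath.sqrt(r*r - 4*(s - y))
    candidates += [(-r + d)/2, (-r - d)/2]

# Keep only the candidates that really are roots of the quartic
roots = [x for x in candidates if abs(quartic(x)) < 1e-6]
print(len(candidates), len(roots))
for x in sorted(roots, key=lambda z: z.real):
    print(round(x.real, 6))
```

Running this should leave exactly four of the eight candidates, one for each value of $y$.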
An example Consider the equation \\[ x^4-94x^2-480x-671=0 \\] which has solutions\n\\begin{aligned} x \u0026amp;= -2 \\, \\sqrt{5} \\pm \\sqrt{3} \\sqrt{9 -4\\, \\sqrt{5}},\\\\ x \u0026amp;= 2 \\, \\sqrt{5} \\pm \\sqrt{3} \\sqrt{9 + 4 \\, \\sqrt{5}} \\end{aligned} Note that these relatively nice solutions arise from the polynomial being factorizable in the number field $\\mathbb{Q}[\\sqrt{5}]$. We can show this using [Sagemath](http://www.sagemath.org): N.\u0026lt;a\u0026gt; = NumberField(x^2-5) K.\u0026lt;x\u0026gt; = N[] factor(x^4 - 94*x^2 - 480*x - 671) \\[ (x^{2} - 4 a x - 12 a - 7) \\cdot (x^{2} + 4 a x + 12 a - 7) \\] We shall continue to use Sagemath to perform all the dirty work; here\u0026#39;s how this solution works:\nvar(\u0026#39;x,y,r,s\u0026#39;) qx = x^4 - 94*x^2 - 480*x - 671 res = qx.resultant(y-x^2-r*x-s,x).poly(y) We now want to find the values of $r$ and $s$ to eliminate the linear and cubic terms. The cubic term is easy:\nres.coefficient(y,3) \\[ -4s-188 \\] and so\ns_sol = solve(res.coefficient(y,3),s,solution_dict=True) and we can substitute this into the linear coefficient:\nres.coefficient(y,1).subs(s_sol[0]).factor() \\[ -480(r+10)(r+8)(r+6). 
\\] In general the coefficient would not be as neatly factorizable as this, but we can still find the values of $r$:\nr_sol = solve(res.coefficient(y,1).subs(s_sol[0]),r,solution_dict=True) We can choose any value we like; here let\u0026#39;s choose the first value and substitute it into the resultant from above, first creating a dictionary to hold the $r$ and $s$ values:\nrs = s_sol[0].copy() rs.update(r_sol[0]) rs \\[ \\lbrace s:-47,r:-8\\rbrace \\]\nres.subs(rs) \\[ y^4-256y^2+1024 \\]\ny_sol = solve(res.subs(rs),y,solution_dict=True) This will produce four values of $y$, and for each one we solve the equation \\[ x^2+rx+s-y=0 \\] for $x$:\nfor ys in y_sol: display(solve((x^2+r*x+s-y).subs(rs).subs(ys),x)) \\begin{aligned} x \u0026amp;= -\\sqrt{-4 \\, \\sqrt{2 \\, \\sqrt{15} + 8} + 63} + 4,\u0026amp; x \u0026amp;= \\sqrt{-4 \\, \\sqrt{2 \\, \\sqrt{15} + 8} + 63} + 4\\\\ x \u0026amp;= -\\sqrt{4 \\, \\sqrt{2 \\, \\sqrt{15} + 8} + 63} + 4,\u0026amp; x\u0026amp; = \\sqrt{4 \\,\\sqrt{2 \\, \\sqrt{15} + 8} + 63} + 4\\\\ x\u0026amp; = -\\sqrt{-4 \\, \\sqrt{-2 \\, \\sqrt{15} + 8} + 63} + 4,\u0026amp; x\u0026amp; = \\sqrt{-4 \\,\\sqrt{-2 \\, \\sqrt{15} + 8} + 63} + 4\\\\ x\u0026amp; = -\\sqrt{4 \\, \\sqrt{-2 \\, \\sqrt{15} + 8} + 63} + 4,\u0026amp; x\u0026amp; = \\sqrt{4 \\, \\sqrt{-2 \\, \\sqrt{15} + 8} + 63} + 4 \\end{aligned} We can check these values to see which ones are actually correct. 
But to experiment, we can determine the minimal polynomial of each value given:\nfor ys in y_sol: s1 = solve((x^2+r*x+s-y).subs(rs).subs(ys),x,solution_dict=True) ql = [QQbar(z[x]).minpoly() for z in s1] display(ql) \\begin{aligned} \u0026amp;x^{4} - 94 x^{2} - 480 x - 671,\u0026amp;\u0026amp; x^{4} - 32 x^{3} + 290 x^{2} - 64 x - 6431\u0026amp;\\\\ \u0026amp;x^{4} - 94 x^{2} - 480 x - 671,\u0026amp;\u0026amp; x^{4} - 32 x^{3} + 290 x^{2} - 64 x - 6431\u0026amp;\\\\ \u0026amp;x^{4} - 32 x^{3} + 290 x^{2} - 64 x - 6431,\u0026amp;\u0026amp; x^{4} - 94 x^{2} - 480 x - 671\u0026amp;\\\\ \u0026amp;x^{4} - 94 x^{2} - 480 x - 671,\u0026amp;\u0026amp; x^{4} - 32 x^{3} + 290 x^{2} - 64 x - 6431\u0026amp; \\end{aligned} Half of these are the original equation we tried to solve. And the others?\nqx.subs(x=8-x).expand() \\[ x^{4} - 32 x^{3} + 290 x^{2} - 64 x - 6431 \\] This is in fact what we should expect, from solving the equation \\[ x^2+rx+s-y=0 \\] If the roots are $x_1$ and $x_2$, then by Vieta\u0026#39;s formulas $x_1+x_2=-r=8$, so each spurious value is $8$ minus a genuine root.\nFurther comments The trouble with this method is that it only works nicely on some equations. In general, the snarls of square, cube, and fourth roots become unwieldy very quickly. For example, consider the equation \\[ x^4+6x^2-60x+36=0 \\] which according to Cardan in [Ars Magna](https://en.wikipedia.org/wiki/Ars_Magna_(Gerolamo_Cardano)) (Chapter XXXIX, Problem V) was first solved by Ferrari.\nTaking the resultant with $y-x^2-rx-s$ as a polynomial in $y$, we find that the coefficient of $y^3$ is $-4s+12$, and so $s=3$. Substituting this in the linear coefficient, we obtain this cubic in $r$: \\[ 5r^3-9r^2-60r+300=0. 
\\] The simplest (real) solution is: \\[ r = \\frac{1}{5} \\, {\\left(50 \\, \\sqrt{3767} - 3273\\right)}^{\\frac{1}{3}} + \\frac{109}{5 \\, {\\left(50 \\, \\sqrt{3767} - 3273\\right)}^{\\frac{1}{3}}} + \\frac{3}{5} \\] Substituting these values of $r$ and $s$ into the resultant, we obtain the equation \\[ y^4+c_2y^2+c_0=0 \\] with \\[ c_2=\\frac{6 \\, {\\left({\\left(50 \\, \\sqrt{3767} - 3273\\right)}^{\\frac{2}{3}} {\\left(50 \\, \\sqrt{3767} - 18969\\right)} - {\\left(7200 \\, \\sqrt{3767} - 483193\\right)} {\\left(50 \\, \\sqrt{3767} - 3273\\right)}^{\\frac{1}{3}} + 100 \\, \\sqrt{3767} - 6546\\right)}}{25 \\, {\\left(50 \\, \\sqrt{3767} - 3273\\right)}} \\] and \\[ c_0=\\frac{27\\left(% \\begin{array}{l} 2\\,(14353657451700 \\, \\sqrt{3767} - 880869586608887)\\, (50 \\, \\sqrt{3767} - 3273)^{\\frac{2}{3}}\\\\ \\quad+109541 \\, (2077754350 \\, \\sqrt{3767} - 127532539917) (50 \\, \\sqrt{3767} - 3273)^{\\frac{1}{3}}\\quad{}\\\\ \\hspace{18ex} - 5262543802068000 \\, \\sqrt{3767} + 322980491997672634 \\end{array}% \\right)} {625 \\, {\\left(2077754350 \\, \\sqrt{3767} - 127532539917\\right)} {\\left(50 \\, \\sqrt{3767} - 3273\\right)}^{\\frac{1}{3}}} \\] Impressed?\nRight, so we solve the equation for $y$, to obtain \\[ y=\\pm\\sqrt{-\\frac{1}{2}c_2\\pm\\frac{1}{2}\\sqrt{c_2^2-4c_0}}. 
\\] For each of those values of $y$, we solve the equation \\[ x^2+rx+s-y=0 \\] to obtain (for example) \\[ x = -\\frac{1}{2} \\, r \\pm \\frac{1}{2} \\, \\sqrt{r^{2} - 4 \\, s + 4 \\, \\sqrt{-\\frac{1}{2} \\, c_{2} + \\frac{1}{2} \\, \\sqrt{c_{2}^{2} - 4 \\, c_{0}}}} \\] With $r$ being the solution of a cubic equation, and $c_0$, $c_2$ being the appalling expressions above, you can see that this solution, while \u0026#34;true\u0026#34; in a logical sense, is hardly useful or enlightening.\nCardan again: \u0026#34;So progresses arithmetic subtlety, the end of which, it is said, is as refined as it is useless.\u0026#34;\nTschirnhausen\u0026#39;s solution of the cubic\u0026#xa0;\u0026#xa0;\u0026#xa0;mathematics\u0026#xa0;algebra A general cubic polynomial has the form \\[ ax^3+bx^2+cx+d \\] but a general cubic equation can have the form \\[ x^3+ax^2+bx+c=0. \\] We can always divide through by the coefficient of $x^3$ (assuming it to be non-zero) to obtain a monic equation; that is, with leading coefficient of 1. We can now remove the $x^2$ term by replacing $x$ with $y-a/3$: \\[ \\left(y-\\frac{a}{3}\\right)^{\\negmedspace 3}+a\\left(y-\\frac{a}{3}\\right)^{\\negmedspace 2} +b\\left(y-\\frac{a}{3}\\right)+c=0. \\] Expanding and simplifying produces \\[ y^3+\\left(b-\\frac{a^2}{3}\\right)y+\\frac{2}{27}a^3-\\frac{1}{3}ab+c=0. \\] In fact this can be simplified by writing the initial equation as \\[ x^3+3ax^2+bx+c=0 \\] and then substituting $x=y-a$ to obtain \\[ y^3+(b-3a^2)y+(2a^3-ab+c)=0. \\] This means that in fact an equation of the form \\[ y^3+Ay+B=0 \\] is a completely general form of the cubic equation. Such a form of a cubic equation, missing the quadratic term, is known as a depressed cubic.\nWe could go even further by substituting \\[ y=z\\sqrt{A} \\] to obtain\n\\[ A^{3/ 2}z^3+A\\sqrt{A}z+B=0 \\]\nand dividing through by \\(A^{3/ 2}\\) to produce\n\\[ z^3+z+BA^{-3/ 2}=0. 
\\]\nThis means that \\[ z^3+z+W=0 \\]\nis also a perfectly general form for the cubic equation.\nCardan\u0026#39;s method Although this is named for [Gerolamo Cardano](https://en.wikipedia.org/wiki/Gerolamo_Cardano) (1501-1576), the method was in fact discovered by [Niccolò Fontana](https://en.wikipedia.org/wiki/Niccolò_Fontana_Tartaglia) (1500-1557), known as Tartaglia (\u0026#34;the stammerer\u0026#34;) on account of an injury obtained when a soldier slashed his face when he was a boy. In the days before peer review and formal dissemination of ideas, any new mathematics was closely guarded: mathematicians would have public tests of skill, and a new solution method was invaluable. After assuring Tartaglia that his new method was safe with him, Cardan then proceeded to publish it as his own in his magisterial Ars Magna in 1545. A fascinating account of the mix of Cardan, Tartaglia, and several other egotistic mathematicians of the time, can be [read here](http://brain.caltech.edu/ist4/lectures/Cardano-Tartaglia_Dispute.pdf).\nCardan\u0026#39;s method solves the equation \\[ x^3-3ax-2b=0 \\] noting from above that this is a perfectly general form for the cubic, and where we have introduced factors of $-3$ and $-2$ to eliminate fractions later on. We start by assuming that the solution will have the form \\[ x=p^{1/ 3}+q^{1/ 3} \\] and so \\[ x^3=(p^{1/ 3}+q^{1/ 3})^3=p+3p^{2/ 3}q^{1/ 3}+3p^{1/ 3}q^{2/ 3}+q. \\] This last can be written as \\[ p+q+3p^{1/ 3}q^{1/ 3}(p^{1/ 3}+q^{1/ 3}). \\] We can thus write \\[ x^3=3p^{1/ 3}q^{1/ 3}x+p+q \\] and comparing with the initial cubic equation we have \\[ 3p^{1/ 3}q^{1/ 3}=3a,\\quad p+q=2b. \\] These can be written as \\[ pq=a^3,\\quad p+q=2b \\] for which the solutions are \\[ p,q=b\\pm\\sqrt{b^2-a^3} \\] and so \\[ x = (b+\\sqrt{b^2-a^3})^{1/ 3}+(b-\\sqrt{b^2-a^3})^{1/ 3}. \\] This can be written in various different ways.\nFor example, \\[ x^3-6x-6=0 \\] for which $a=2$ and $b=3$. 
Here $b^2-a^3=1$ and so one solution is \\[ x=4^{1/ 3}+2^{1/ 3}. \\] Given that a cubic must have three solutions, the other two are \\[ \\omega p^{1/ 3}+\\omega^2 q^{1/ 3},\\quad \\omega^2 p^{1/ 3}+\\omega q^{1/ 3} \\] where $\\omega$ is a primitive cube root of 1, for example \\[ \\omega=-\\frac{1}{2}+i\\frac{\\sqrt{3}}{2}. \\]\nAnd so to Tschirnhausen At the beginning we eliminated the $x^2$ terms from a cubic equation by a linear substitution $x=y-a/3$ or $y=x+a/3$. Writing in the year 1680, the German mathematician [Ehrenfried Walther von Tschirnhausen](https://en.wikipedia.org/wiki/Ehrenfried_Walther_von_Tschirnhaus) (1651-1708) began experimenting with more general polynomial substitutions, believing that it would be possible to eliminate other terms at the same time. Such substitutions are now known as [Tschirnhausen transformations](https://en.wikipedia.org/wiki/Tschirnhaus_transformation) and of course the modern general approach places them squarely within field theory. Tschirnhausen was only partially correct: it is indeed possible to remove some terms from a polynomial equation, and in 1858 the English mathematician [George Jerrard](https://en.wikipedia.org/wiki/George_Jerrard) (1804-1863) showed that it was possible to remove the terms of degree $n-1$, $n-2$ and $n-3$ from a polynomial of degree $n$. In particular, the general quintic equation can be reduced to \\[ x^5+px+q=0 \\] which is known as the Bring-Jerrard form; also honouring Jerrard\u0026#39;s predecessor, the Swedish mathematician [Erland Bring](https://en.wikipedia.org/wiki/Erland_Samuel_Bring) (1736-1798). 
Note that Jerrard was quite well aware of the work of Ruffini, Abel and Galois in proving the general unsolvability by radicals of the quintic equation.\nNeither Bring nor Tschirnhausen had the advantage of this knowledge, and both were working towards a general solution of the quintic.\nHappily, Tschirnhausen\u0026#39;s work is available in an English translation, published in the [ACM SIGSAM Bulletin by R. F. Green in 2003](https://dl.acm.org/citation.cfm?id=844078). For further delight, Jerrard\u0026#39;s text, with the splendidly formal English title \u0026#34;An Essay on the Resolution of Equations\u0026#34;, is also [available online](https://archive.org/details/essayonresolutio00jerrrich). After that history lesson, let\u0026#39;s explore how to remove both the quadratic and linear terms from a cubic equation using Tschirnhausen\u0026#39;s method, and also using [SageMath](http://www.sagemath.org) to do the heavy algebraic work. There is in fact nothing particularly conceptually difficult, but the algebra is quite messy and fiddly.\nWe start with a depressed cubic equation \\[ x^3+3ax+2b=0 \\] and we will use the Tschirnhausen transformation \\[ y=x^2+rx+s. \\]\nThis can be done by hand of course, using a somewhat fiddly argument, but for us the best approach is to compute the [resultant](https://en.wikipedia.org/wiki/Resultant) of the two polynomials, which is a polynomial expression equal to zero if the two polynomials have a common root. 
The resultant can be computed as the determinant of the [Sylvester matrix](https://en.wikipedia.org/wiki/Sylvester_matrix) (named for its discoverer); but we can simply use SageMath:\nvar(\u0026#39;a,b,c,x,y,r,s\u0026#39;) cb = x^3 + 3*a*x + 2*b res = cb.resultant(y-x^2-r*x-s,x).poly(y) res \\[ \\displaylines{ y^3+3(2a-s)y^{2}+3(ar^{2}+3a^{2}+2br-4as+s^{2})y\\\\ {\\ }\\mspace4em -4b^{2}+2br^{3}-3ar^{2}s+6abr-9a^{2}s-6brs+6as^{2}-s^{3} } \\]\nNow we find values of $r$ and $s$ for which the coefficients of $y^2$ and $y$ will be zero:\nsol = solve([res.coefficient(y,1),res.coefficient(y,2)],[r,s],solution_dict=True) sol \\[ \\left[\\left\\lbrace s : 2 \\, a, r : -\\frac{b + \\sqrt{a^{3} + b^{2}}}{a}\\right\\rbrace, \\left\\lbrace s : 2 \\, a, r : -\\frac{b - \\sqrt{a^{3} + b^{2}}}{a}\\right\\rbrace\\right] \\]\nWe can now substitute say the second solution into the resultant from above, which should produce an expression of the form $y^3+A$:\ncby = res.subs(sol[1]).canonicalize_radical().poly(y) cby \\[ y^3-8 \\, a^{3} - 16 \\, b^{2} + 8 \\, \\sqrt{a^{3} + b^{2}} b - \\frac{8 \\, b^{4}}{a^{3}} + \\frac{8 \\, \\sqrt{a^{3} + b^{2}} b^{3}}{a^{3}} \\] We can simply take the cube root of the constant term as our solution:\nsol_y = solve(cby,y,solution_dict=True) sol_y[2] \\[ \\left\\lbrace y : \\frac{2 \\, {\\left(a^{6} + 2 \\, a^{3} b^{2} - \\sqrt{a^{3} + b^{2}} a^{3} b + b^{4} - \\sqrt{a^{3} + b^{2}} b^{3}\\right)}^{\\frac{1}{3}}}{a}\\right\\rbrace \\]\nNow we solve the equation $y=x^2+rx+s$ using the values $r$ and $s$ from above, and the value of $y$ just obtained:\neq = x^2+r*x+s-y eqrs = eq.subs(sol[1]) eqx = eqrs.subs(sol_y[2]) solx = solve(eqx,x,solution_dict=True) solx[0] \\[ \\left\\lbrace x : \\frac{b - \\sqrt{a^{3} + b^{2}} - \\sqrt{-7 \\, a^{3} + 2 \\, b^{2} - 2 \\, \\sqrt{a^{3} + b^{2}} b + 8 \\, {\\left(a^{6} + 2 \\, a^{3} b^{2} + b^{4} - {\\left(a^{3} b + b^{3}\\right)} \\sqrt{a^{3} + b^{2}}\\right)}^{\\frac{1}{3}} a}}{2 \\, a}\\right\\rbrace \\]\nAn 
equation, of err… rare beauty, or if not beauty, then something else. It certainly lacks the elegant simplicity of Cardan\u0026#39;s solution. On the other hand, the method can be applied to quartic (and quintic) equations, which Cardan\u0026#39;s solution can\u0026#39;t.\nFinally, let\u0026#39;s test this formula, again on the equation $x^3-6x-6=0$, for which $a=-2$ and $b=-3$:\nxs = solx[0][x].subs({a:-2, b:-3}) xs \\[ \\frac{1}{4} \\, \\sqrt{-16 \\cdot 4^{\\frac{1}{3}} + 80} + 1 \\]\nThis can clearly be simplified to\n\\[ 1+\\sqrt{5-4^{1/ 3}} \\] It certainly looks different from Cardan\u0026#39;s result, but watch this:\nxt = QQbar(xs) xt.radical_expression() \\[ \\frac{1}{2}4^{2/ 3}+4^{1/ 3} \\]\nwhich is Cardan\u0026#39;s result, only very slightly rewritten. And finally:\nxt.minpoly() \\[ x^3-6x-6 \\]\nColonial massacres, 1794 to 1928\u0026#xa0;\u0026#xa0;\u0026#xa0;history\u0026#xa0;GIS\u0026#xa0;python The date January 26 is one of immense current debate in Australia. Officially it\u0026#39;s the date of [Australia Day](https://en.wikipedia.org/wiki/Australia_Day), which supposedly celebrates the founding of Australia. To Aboriginal peoples it is a day of deep mourning and sadness, as the date commemorates over two centuries of oppression, bloodshed, and dispossession. To them and their many supporters, January 26 is [Invasion Day](https://www.creativespirits.info/aboriginalculture/history/australia-day-invasion-day).\nThe date commemorates the landing in 1788 of [Arthur Phillip](https://en.wikipedia.org/wiki/Arthur_Phillip), in charge of the First Fleet and the first Governor of the colony of New South Wales.\nThe trouble is that \u0026#34;Australia\u0026#34; means two things: the island continent, and the country. The country didn\u0026#39;t exist until Federation on January 1, 1901; before which time the land since 1788 was subdivided into independent colonies. 
Many people believe that Australia Day would be better moved to January 1; the trouble with that is that it\u0026#39;s already a public holiday, and apparently you can\u0026#39;t have a national day that doesn\u0026#39;t have its own public holiday. And [many other dates](https://en.wikipedia.org/wiki/Australia_Day#Suggested_alternative_dates) have been proposed.\nMy own preferred date is June 3; this is the date of the High Court \u0026#34;Mabo\u0026#34; decision in 1992 which formally recognized native title and rejected the doctrine of terra nullius under which the British invaded.\nThat the continent was invaded rather than settled is well established: all serious historians take this view, and it can be backed up with legal arguments. The Aboriginal peoples, numbering maybe close to a million in 1788, had mastered the difficult continent and all of its many ecosystems, and had done so for around 80,000 years. Aboriginal culture is the oldest continually maintained culture on the planet, and by an enormous margin.\nArthur Phillip did in fact arrive with [formal instructions](https://www.foundingdocs.gov.au/resources/transcripts/nsw2_doc_1787.pdf) to create \u0026#34;amity\u0026#34; with the \u0026#34;natives\u0026#34; and indeed to live in \u0026#34;kindness\u0026#34; with them, but this soon went downhill. Although Phillip himself seems to have been a man of rare understanding for his time (when speared in the shoulder, for example, he refused to let his soldiers retaliate), he was no match for the many convicts and soldiers under his rule. 
When he retired back to England in 1792 the colony was ruled by a series of weak and ineffective governors, and in particular by the military, culminating in the governorship of [Lachlan Macquarie](https://en.wikipedia.org/wiki/Lachlan_Macquarie) who is seen as a mass murderer of Aboriginal peoples, although the [evidence is not clear-cut](https://www.abc.net.au/news/2017-09-27/fact-check-did-lachlan-macquarie-commit-mass-murder-and-genocide/8981092). What is clear is that mass murders of Aboriginal peoples were common and indiscriminate, and often with appalling cruelty. On many occasions large groups were poisoned with strychnine: this works by affecting the nerves which control muscle movement, so that the body goes into agonizing spasms resulting in death by asphyxiation. Strychnine is considered by toxicologists to be one of the most painful acting of all poisons. Even though Macquarie himself ordered retribution only after \u0026#34;resistance\u0026#34;; groups considered harmless, or consisting only of old men, women and children, were brutally murdered.\nPeople were routinely killed by gunfire, or by being hacked to death; there is at least one report of a crying baby - the only survivor of a massacre - being thrown onto the fire made to burn the victims.\nMany more died of disease: smallpox and tuberculosis were responsible for deaths of over 50% of Aboriginal peoples. Their numbers today are thus tiny, and as in the past they are still marginalized. Only recently has this harrowing part of Australia\u0026#39;s past been formally researched; the casual nature of the massacres meant that many were not recorded, and it has taken a great deal of time and work to uncover their details. This work has been headed by [Professor Lyndall Ryan](https://www.newcastle.edu.au/profile/lyndall-ryan) at the University of Newcastle. 
The painstaking and careful work by her team has unearthed much detail, and their results are available at their site [Colonial Frontier Massacres in Central and Eastern Australia 1788-1930](https://c21ch.newcastle.edu.au/colonialmassacres/)\nAs a January 26 exercise I decided to rework one of their maps, producing a single map which would show the sites of massacres by markers whose size is proportional to the number of people killed. This turned out to be quite easy using Python and its folium library, but naturally it took me a long time to get it right.\nI started by downloading the [timeline](https://c21ch.newcastle.edu.au/colonialmassacres/timeline.php) from the Newcastle site as a csv file, and going through each massacre adding its location. The project historians point out that the locations are deliberately vague. Sometimes this is because of the vagueness of the historical record; but also (from the [Introduction](https://c21ch.newcastle.edu.au/colonialmassacres/introduction.php)):\nIn order to protect the sites from desecration, and respect for the wishes of Aboriginal communities to observe the site as a place of mourning, the points have been made purposefully imprecise by rounding coordinates to 3 digits, meaning the point is precise only to around 250m.\nGiven the database, the Python commands were:\nimport folium import pandas as pd mass = pd.read_csv(\u0026#39;massacres.csv\u0026#39;) a = folium.Map(location=[-27,134],width=1000, height=1000,tiles=\u0026#39;OpenStreetMap\u0026#39;,zoom_start=4.5) for i in range(0,len(mass)): number_killed = mass.iloc[i][\u0026#39;Estimated Aboriginal People Killed\u0026#39;] folium.Circle( location=[float(mass.iloc[i][\u0026#39;Lat\u0026#39;]), float(mass.iloc[i][\u0026#39;Long\u0026#39;])], tooltip=mass.iloc[i][\u0026#39;Location\u0026#39;]+\u0026#39;: \u0026#39;+str(number_killed), radius=int(number_killed)*150, color=\u0026#39;goldenrod\u0026#39;, fill=True, fill_color=\u0026#39;gold\u0026#39; ).add_to(a) 
a.save(\u0026#34;massacres.html\u0026#34;) The result is shown below. You can zoom in and out, and hovering over a massacre site will produce the location and number of people murdered.\nThe research is ongoing and this data is incomplete \u0026lt;iframe seamless src=\u0026#34;/massacres.html\u0026#34; width=\u0026#34;1000\u0026#34; height=\u0026#34;1000\u0026#34;\u0026gt;\u0026lt;/iframe\u0026gt; The data was extracted from: Ryan, Lyndall; Richards, Jonathan; Pascoe, William; Debenham, Jennifer; Anders, Robert J; Brown, Mark; Smith, Robyn; Price, Daniel; Newley, Jack Colonial Frontier Massacres in Eastern Australia 1788 – 1872, v2.0 Newcastle: University of Newcastle, 2017, http://hdl.handle.net/1959.13/1340762 (accessed 08/02/2019). This project has been funded by the Australian Research Council (ARC).\nNote finally that Professor Ryan and team have defined a massacre to be a killing of at least six people. Thus we can assume there are many other killings of five or fewer people which are not yet properly documented, or more likely will never be known. A shameful history indeed.\nVote counting in the Australian Senate\u0026#xa0;\u0026#xa0;\u0026#xa0;voting Recently we have seen senators behaving in ways that seem stupid, or contrary to accepted public opinion. And then people will start jumping up and down and complaining that such a senator only got a tiny number of first preference votes. [One commentator](https://www.news.com.au/national/politics/19-people-got-this-bloke-a-200k-job/news-story/f8d8aaa83f0c2bcab53626455a3698d6) said that one senator, with 19 first preference votes, \u0026#34;couldn’t muster more than 19 members of his extended family to vote for him\u0026#34;. This displays an ignorance of how senate counting works. 
In fact first preference votes are almost completely irrelevant; or at least, far less relevant than they are in the lower house.\nSenate counting works on a proportional system, where multiple candidates are elected from the same group of ballots. This is different from the lower house (the House of Representatives federally) where only one person is elected. For the lower house, first preference votes are indeed much more important. As for the lower house, senate voting is preferential: voters number their preferred candidates starting with 1 for their most preferred, and so on (but see below).\nA full explanation is given by the Australian Electoral Commission on their [Senate Counting page](https://www.aec.gov.au/voting/counting/senate_count.htm); this blog post will run through a very simple example to demonstrate how a senator can be elected with a tiny number of first preference votes.\nAn aside on micro parties and voting One problem in Australia is the proliferation of micro parties, many of which hold racist, anti-immigration, or hard-line religious views, or who in some other ways represent only a tiny minority of the electorate. The problem is just as bad at State level; in my own state of Victoria we have the Shooters, Fishers and Farmers Party, the Aussie Battlers Party, and the Transport Matters Party (who represent taxi drivers) to name but three. This has the effect that the number of candidates standing for senate election has become huge, and the senate ballot papers absurdly unwieldy:\nInitially the law required voters to number every box starting from 1: on a large paper this would mean numbering carefully from 1 up to at least 96 in one recent election. To save this trouble (and most Australian voters are nothing if not lazy), \u0026#34;above the line voting\u0026#34; was introduced. 
This gave voters the option to put just a single \u0026#34;1\u0026#34; in the box representing the party of choice: you will see from the image above that the ballot paper is divided: the columns represent all the parties; the boxes below the line represent all the candidates from that party, and the single box above just the party name. Here is a close up of a NSW senate ballot:\nAlmost all voters willingly took advantage of that and voted above the line. The trouble is then that voters have no control over where their preferences go: that is handled by the parties themselves. By law, all parties must make their preferences available before the election, and they are published on the site of the relevant Electoral Commission. But the only people who carefully check this site and the party\u0026#39;s preferences are the sort of people who would carefully number each box below the line anyway. Most people don\u0026#39;t care enough to be bothered.\nThis enables all the micro-parties to make \u0026#34;preference deals\u0026#34;; in effect they act as one large bloc, ensuring that at least some of them get a senate seat. This has been handled by a so-called [\u0026#34;preference whisperer\u0026#34;](https://en.wikipedia.org/wiki/Glenn_Druery).\nThe current system in the state of Victoria has been to encourage voting below the line by allowing, instead of all boxes to be numbered, at least six. And there are strong calls for voting above the line to be abolished. A simple example To show how senate counting works, we suppose an electorate of 100 people, and three senators to be elected from five candidates. We also suppose that every ballot paper has been numbered from 1 to 5 indicating each voter\u0026#39;s preferences. Before the counting begins we need to determine the number of votes each candidate must amass to be elected: this is chosen as the smallest number of votes for which no more candidates can be elected. 
If there are $n$ formal votes cast, and $k$ senators to be elected, that number is clearly\n$$\\left\\lfloor\\frac{n}{k+1}\\right\\rfloor + 1.$$\nThis value is known as the Droop quota. In our example, this quota is\n\\[ \\left\\lfloor\\frac{100}{3+1}\\right\\rfloor +1 = 26. \\]\nYou can see that it is not possible for four candidates to obtain this value.\nSuppose that the ballots are distributed as follows, where the numbers under the candidates indicate the preferences cast:\nNumber of votes A B C D E 20 1 2 3 4 5 20 1 5 4 3 2 40 2 1 5 4 3 5 2 3 5 1 4 4 4 3 1 2 5 1 2 3 4 5 1 Counting first preferences produces:\nCandidate First Prefs A 40 B 40 C 4 D 5 E 1 The first step in the counting is to determine if any candidate has amassed first preference votes equal to or greater than the Droop quota. In the example just given, both A and B have 40 first preferences each, so they are both elected.\nSince only 26 votes are needed for election, for each of A and B there are 14 votes remaining which can be passed on to other candidates according to the voting preferences. Which votes are passed on? For B it doesn\u0026#39;t matter, but which votes do we deem surplus for A? The Australian choice is to pass on all votes, but at a reduced value known as the transfer value. This value is simply the fraction of surplus votes over total votes; in our case it is\n$$\\frac{14}{40}=0.35$$\nfor each of A and B. Looking at the first line of votes: the next highest preference from A to a non-elected candidate is C, so C gets 0.35 of those 20 votes. From the second line, E gets 0.35 of those 20 votes. From the third line, E gets 0.35 of all 40 votes. 
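As a rough check on the arithmetic so far, here is a small Python sketch of the quota, the transfer value, and the weighted transfers. The function names and the parcels list are my own illustrative choices, and the sketch covers only this toy example, not the full counting rules used by the AEC.

```python
from fractions import Fraction

def droop_quota(formal_votes, seats):
    # smallest whole number of votes that seats + 1 candidates cannot all reach
    return formal_votes // (seats + 1) + 1

def transfer_value(votes, quota):
    # fraction of an elected candidate's votes that counts as surplus
    return Fraction(votes - quota, votes)

quota = droop_quota(100, 3)
tv = transfer_value(40, quota)   # the same for A and B, who each hold 40 votes

# (number of ballots, next preference among the candidates not yet elected)
parcels = [(20, 'C'), (20, 'E'), (40, 'E')]

# first preference counts of the candidates still in the count
tally = {'C': 4, 'D': 5, 'E': 1}
for ballots, candidate in parcels:
    tally[candidate] += tv * ballots

print(quota, float(tv))
print({name: float(votes) for name, votes in tally.items()})
```

Exact fractions are used rather than floats so that the reduced-value votes add up without rounding error.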
The votes now allocated to the remaining candidates are as follows:\nC: $4 + 0.35\\times 20 = 11$\nD: 5\nE: $1 + 0.35\\times 20 + 0.35\\times 40 = 22$\nAt this stage no candidate has amassed a quota, so the lowest ranked candidate in the counting is eliminated - in this case D - and all of those votes are passed on to the highest candidate (of those that are left, which is now only C and E) in those preferences, which is E. This produces:\nC: 11\nE: $22 + 5 = 27$\nwhich means E has achieved the quota and thus is elected.\nThis is of course a very artificial example, but it shows two things:\nHow a candidate with a very small number of first preference votes can still be elected: in this case E had the lowest number of first preference votes. The importance of preferences. So let\u0026#39;s have no more complaining about the low number of first preference votes in a senate count. In a lower house count, sure, the candidate with the least number of first preference votes is eliminated, but in a senate count such a candidate might amass votes (or reduced values of votes) in such a way as to achieve the quota. Concert review: Lixsania and the Labyrinth\u0026#xa0;\u0026#xa0;\u0026#xa0;music This evening I saw the [Australia Brandenburg Orchestra](https://en.wikipedia.org/wiki/Australian_Brandenburg_Orchestra) with guest soloist [Lixsania Fernandez](https://lixsania.wordpress.com), a virtuoso player of the [viola da gamba](https://en.wikipedia.org/wiki/Viol), from Cuba. (Although she studied, and now lives, in Spain.) Lixsania is quite amazing: tall, statuesque, quite absurdly beautiful, and plays with a technique that encompasses the wildest of baroque extravagances as well as the most delicate and refined tenderness.\nThe trouble with the viol, being a fairly soft instrument, is that it\u0026#39;s not well suited to a large concert hall. This means that it\u0026#39;s almost impossible to get any sort of balance between it and the other instruments. 
Violins, for example, even if played softly, can overpower it.\n[Thomas Mace](https://en.wikipedia.org/wiki/Thomas_Mace), in his \u0026#34;Musick\u0026#39;s Monument\u0026#34;, published in 1676, complained vigorously about violins:\nMace has been described as a \u0026#34;conservative old git\u0026#34; which he certainly was, but I do love the idea of this last hold-out against the \u0026#34;High-Priz\u0026#39;d Noise\u0026#34; of the violin. And I can see his point!\nBut back to Lixsania. The concert started with a \u0026#34;pastiche\u0026#34; of La Folia, taking in parts of Corelli\u0026#39;s well known set for solo violin, Vivaldi\u0026#39;s for two, Scarlatti for harpsichord, and of course Marin Marais \u0026#34;32 couplets de Folies\u0026#34; from his second book of viol pieces. The Australian Brandenburgs have a nice line in stagecraft, and this started with a dark stage with only Lixsania lit, playing some wonderful arpeggiated figurations over all the strings, with a bowing of utter perfection. I was sitting side on to her position here, and I could see with what ease she moved over the fingerboard - the mark of a true master of their instrument - being totally at one with it. Little by little other instrumentalists crept in: a violinist here and there, Paul Dyer (leader of the orchestra) to the harpsichord, cellists and a bassist, until there was a sizable group on stage all playing madly. I thought it was just wonderful.\nFor this first piece Lixsania was wearing a black outfit with long and full skirts and sort of halter top which left her arms, sides and back bare. This meant I had an excellent view of her rib-cage, which was a first for me in a concert.\nThe second piece was the 12th concerto, the so called \u0026#34;Harmonic Labyrinth\u0026#34; from Locatelli\u0026#39;s opus 3. These concertos contain, in their first and last movements, a wild \u0026#34;capriccio\u0026#34; for solo violin. 
This twelfth concerto contains capricci of such superhuman difficulty that even now, nearly 300 years after they were composed, they still stand at the peak of virtuosity. The Orchestra\u0026#39;s concertmaster, Shaun Lee-Chen, was however well up to the challenge, and powered his way through both capricci with the audience hardly daring to breathe. Even though conventional concert behaviour does not include applause after individual movements, so excited was the audience that there was an outburst of clapping after the first movement. And quite right too.\nThe final piece of the first half was a Vivaldi concerto for two violins and cello, the cello part being taken by Lixsania on viol. I felt this didn\u0026#39;t come across so well; the viol really couldn\u0026#39;t be heard much, and you really do need the strength of the cello to make much sense of the music. However, it did give Lixsania some more stage-time. After interval we were treated to a concerto for viol by [Johann Gottlieb Graun](https://en.wikipedia.org/wiki/Johann_Gottlieb_Graun), a court composer to Frederick the Great of Prussia. Graun wrote five concertos for the instrument - all monumentally difficult to play - which have been recorded several times. However, a sixth one has recently been unearthed in manuscript - and apparently we were hearing it for the first time in this concert series. The softness of the viol in the largeness of the hall meant that it was not always easy to hear: I solved that by closing my eyes, so I could focus on the sound alone. Lixsania played, as you would imagine, as though she owned it, and its formidable technical difficulties simply melted away under the total assurance of her fingers. She\u0026#39;d changed into a yellow outfit for this second half, and all the male players were wearing yellow ties.\nThen came a short Vivaldi sinfonia - a quite remarkable piece; very stately and with shifting harmonies that gave it a surprisingly modern feel. 
Just when you think Vivaldi is mainly about pot-boilers, he gives you something like this. Short, but superb.\nFinally, the fourth movement of a concerto written in 2001 for two viols by \u0026#34;Renato Duchiffre\u0026#34; (the pen name of René Schiffer, cellist and violist with Apollo\u0026#39;s Fire): a Tango. Now my exposure to tangos has mainly been through that arch-bore Astor Piazolla. But this tango was magnificent. The other violist was Anthea Cottee, of whom I\u0026#39;d never heard, but she\u0026#39;s no mean player. She and Lixsania made a fine pair, playing like demons, complementing each other and happily grinning at some of the finer passages. One of the many likeable characteristics of Lixsania is that she seems to really enjoy playing, and smiles a lot - I hate that convention of players who adopt a poker-face. And she has a great smile.\nIn fact the whole orchestra has a wonderful enjoyment about them, led by Paul Dyer who displays a lovely dynamism at the harpsichord. Not for him the expressionless sitting still; he will leap up if given half an opportunity and conduct a passage with whichever hand is free; sometimes he would play standing and sort of conduct with his body; between him and Lixsania there was a chemistry of heart and mind, both leaning towards each other, as if inspiring each other to reach higher musical heights. This was one of the most delightful displays of communicative musicianship I\u0026#39;ve ever seen. Naturally there had to be an encore: and it was Lixsania singing a Cuban lullaby, accompanying herself by plucking the viol - which was stood on a chair for easier access - with Anthea Cottee providing a bowed accompaniment. Lixsania told us (of course she speaks English fluently, with a charming Cuban accent) that it was a lullaby of special significance, as it was the first song she\u0026#39;d ever sang to her son. 
There\u0026#39;s no reason why instrumentalists should be able to sing well, but in fact Lixsania has a lovely, rich, warm, enveloping sort of voice, and the effect was breathtakingly lovely. Lucky son!\nThis was a great concert.\nLinear programming in Python (2)\u0026#xa0;\u0026#xa0;\u0026#xa0;linear_programming\u0026#xa0;python Here\u0026#39;s an example of a transportation problem, with information given as a table:\nDemands 300 360 280 340 220 750 100 150 200 140 35 Supplies\u0026nbsp; 400 50 70 80 65 80 350 40 90 100 150 130 This is an example of a balanced, non-degenerate transportation problem. It is balanced since the sum of supplies equals the sum of demands, and it is non-degenerate as there is no proper subset of supplies whose sum is equal to that of a proper subset of demands. That is, there are no balanced \u0026#34;sub-problems\u0026#34;.\nIn such a problem, the array values may be considered to be the costs of transporting one object from a supplier to a demand. (In the version of the problem I pose to my students it\u0026#39;s cars between distributors and car-yards; in another version it\u0026#39;s tubs of ice-cream between dairies and supermarkets.) 
The idea of course is to move all objects from supplies to demands while minimizing the total cost.\nThis is a standard linear optimization problem, and it can be solved by any method used to solve such problems, although generally specialized methods are used.\nBut the intention here is to show how easily this problem can be managed using PyMathProg (and with numpy, for the simple use of printing an array):\nimport pymprog as py import numpy as np py.begin(\u0026#39;transport\u0026#39;) M = range(3) # number of rows and columns N = range(5) A = py.iprod(M,N) # Cartesian product x = py.var(\u0026#39;x\u0026#39;, A, kind=int) # all the decision variables are integers costs = [[100,150,200,140,35],[50,70,80,65,80],[40,90,100,150,130]] supplies = [750,400,350] demands = [300,360,280,340,220] py.minimize(sum(costs[i][j]*x[i,j] for i,j in A)) # the total sum in each row must equal the supplies for k in M: sum(x[k,j] for j in N)==supplies[k] # the total sum in each column must equal the demands for k in N: sum(x[i,k] for i in M)==demands[k] py.solve() print(\u0026#39;\\nMinimum cost: \u0026#39;,py.vobj()) A = np.array([[x[i,j].primal for j in N] for i in M]) print(\u0026#39;\\n\u0026#39;) print(A) print(\u0026#39;\\n\u0026#39;) #py.sensitivity() py.end() with solution:\nGLPK Simplex Optimizer, v4.65 8 rows, 15 columns, 30 non-zeros 0: obj = 0.000000000e+00 inf = 3.000e+03 (8) 7: obj = 1.789500000e+05 inf = 0.000e+00 (0) * 12: obj = 1.311000000e+05 inf = 0.000e+00 (0) OPTIMAL LP SOLUTION FOUND GLPK Integer Optimizer, v4.65 8 rows, 15 columns, 30 non-zeros 15 integer variables, none of which are binary Integer optimization begins... Long-step dual simplex will be used + 12: mip = not found yet \u0026gt;= -inf (1; 0) + 12: \u0026gt;\u0026gt;\u0026gt;\u0026gt;\u0026gt; 1.311000000e+05 \u0026gt;= 1.311000000e+05 0.0% (1; 0) + 12: mip = 1.311000000e+05 \u0026gt;= tree is empty 0.0% (0; 1) INTEGER OPTIMAL SOLUTION FOUND Minimum cost: 131100.0 [[ 0. 190. 0. 340. 220.] [ 0. 
120. 280. 0. 0.] [300. 50. 0. 0. 0.]] As you see, the definition of the problem in Python is very straightforward. Linear programming in Python\u0026#xa0;\u0026#xa0;\u0026#xa0;linear_programming\u0026#xa0;python For my elementary linear programming subject, the students (who are all pre-service teachers) use Excel and its Solver as the computational tool of choice. We do this for several reasons: Excel is software with which they\u0026#39;re likely to have had some experience, also it\u0026#39;s used in schools; it also means we don\u0026#39;t have to spend time and mental energy getting to grips with new and unfamiliar software. And indeed the [mandated curriculum](https://www.vcaa.vic.edu.au/Documents/vce/adviceforteachers/furthermaths/sample_learning_activity_graphs_relations.docx) includes computer exploration, using either [Excel Solver](https://www.excel-easy.com/data-analysis/solver.html), or the Wolfram Alpha [Linear Programming widget](http://www.wolframalpha.com/widgets/gallery/view.jsp?id=1e692c6f72587b2cbd3e7be018fd8960).\nThis is all very well, but I balk at the reliance on commercial software, no matter how widely used it may be. And for my own exploration I\u0026#39;ve been looking for an open-source equivalent.\nIn fact there are plenty of linear programming tools and libraries; two of the most popular open-source ones are:\nThe GNU Linear Programming Kit, [GLPK](https://www.gnu.org/software/glpk/) Coin-or Linear Programming, [Clp](https://projects.coin-or.org/Clp) There\u0026#39;s a [huge list on wikipedia](https://en.wikipedia.org/wiki/List_of_optimization_software) which includes open-source and proprietary software.\nFor pretty much any language you care to name, somebody has taken either GLPK or Clp (or both) and produced a language API for it. 
For Python there\u0026#39;s [PuLP](https://pythonhosted.org/PuLP/); for [Julia](https://julialang.org) there\u0026#39;s [JuMP](http://www.juliaopt.org); for Octave there\u0026#39;s the `glpk` command, and so on. Most of the API\u0026#39;s include methods of calling other solvers, if you have them available.\nHowever not all of these are well documented, and in particular some of them don\u0026#39;t allow sensitivity analysis: computing shadow prices, or ranges of the objective coefficients. I discovered that JuMP doesn\u0026#39;t yet support this - although to be fair sensitivity analysis does depend on the problem being solved, and the solver being used.\nBeing a Python aficionado, I thought I\u0026#39;d check out some Python packages, of which a list is given at an [operations research page](https://wiki.python.org/moin/PythonForOperationsResearch).\nHowever, I then discovered the Python package [PyMathProg](http://pymprog.sourceforge.net) which for my purposes is perfect - it just calls GLPK, but in a nicely \u0026#34;pythonic\u0026#34; manner, and the design of the package suits me very well.\nA simple example Here\u0026#39;s a tiny two-dimensional problem I gave to my students:\nA furniture workshop produces chairs and tables. Each day 30m2 of wood board is delivered to the workshop, of which chairs require 0.5m2 and tables 1.5m2. (We assume, of course, that all wood is used with no wastage.) All furniture needs to be laminated; there is only one machine available for 10 hours per day, and chairs take 15 minutes each, tables 20 minutes. If chairs are sold for $30 and tables for $60, then maximize the daily profit (assuming that all are sold).\nLetting $x$ be the number of chairs, and $y$ be the number of tables, the problem is to maximize \\[ 30x+60y \\] given\n$$\\begin{aligned} 0.5x+1.5y\u0026amp;\\le 30\\\\ 15x+20y\u0026amp;\\le 600\\\\ x,y\u0026amp;\\ge 0 \\end{aligned}$$\nProblems don\u0026#39;t get much simpler than this. 
In PyMathProg:\nimport pymprog as pm pm.begin(\u0026#39;furniture\u0026#39;) # pm.verbose(True) x, y = pm.var(\u0026#39;x, y\u0026#39;) # variables pm.maximize(30 * x + 60 * y, \u0026#39;profit\u0026#39;) 0.5*x + 1.5*y \u0026lt;= 30 # wood 15*x + 20*y \u0026lt;= 600 # laminate pm.solve() print(\u0026#39;\\nMax profit:\u0026#39;,pm.vobj()) pm.sensitivity() pm.end() with output:\nGLPK Simplex Optimizer, v4.65 2 rows, 2 columns, 4 non-zeros * 0: obj = -0.000000000e+00 inf = 0.000e+00 (2) * 2: obj = 1.440000000e+03 inf = 0.000e+00 (0) OPTIMAL LP SOLUTION FOUND Max profit: 1440.0 PyMathProg 1.0 Sensitivity Report Created: 2018/10/28 Sun 21:42PM ================================================================================ Variable Activity Dual.Value Obj.Coef Range.From Range.Till -------------------------------------------------------------------------------- *x 24 0 30 20 45 *y 12 0 60 40 90 ================================================================================ ================================================================================ Constraint Activity Dual.Value Lower.Bnd Upper.Bnd RangeLower RangeUpper -------------------------------------------------------------------------------- R1 30 24 -inf 30 20 45 R2 600 1.2 -inf 600 400 900 ================================================================================ From that output, we see that the required maximum is $1440, obtained by making 24 chairs and 12 tables. We also see that the shadow prices for the constraints are 24 and 1.2. 
Furthermore, the ranges of objective coefficients which will not affect the results are $[20,45]$ for prices for chairs, and $[40,90]$ for table prices.\nThis is the simplest API I\u0026#39;ve found so far which provides that sensitivity analysis.\nNote that if we just want a solution, we can use the linprog command from scipy:\nfrom scipy.optimize import linprog linprog([-30,-60],A_ub=[[0.5,1.5],[15,20]],b_ub=[30,600]) linprog automatically minimizes a function, so to maximize we use a negative function. The output is\nfun: -1440.0 message: \u0026#39;Optimization terminated successfully.\u0026#39; nit: 2 slack: array([0., 0.]) status: 0 success: True x: array([24., 12.]) The negative value given as fun above simply reflects that we are entering a negative function. In respect of our problem, we simply negate that value to obtain the required maximum of 1440.\nA test of OpenJSCAD\u0026#xa0;\u0026#xa0;\u0026#xa0;CAD Here\u0026#39;s an example of a coloured tetrahedron:\nhello The power of two irrational numbers being rational\u0026#xa0;\u0026#xa0;\u0026#xa0;mathematics There\u0026#39;s a celebrated elementary result which claims that:\nThere are irrational numbers $x$ and $y$ for which $x^y$ is rational.\nThe standard proof goes like this. Now, we know that $\\sqrt{2}$ is irrational, so let\u0026#39;s consider $r=\\sqrt{2}^\\sqrt{2}$. Either $r$ is rational, or it is not. If it is rational, then we set $x=\\sqrt{2}$, $y=\\sqrt{2}$ and we are done. If $r$ is irrational, then set $x=r$ and $y=\\sqrt{2}$. This means that \\[ x^y=\\left(\\sqrt{2}^\\sqrt{2}\\right)^{\\sqrt{2}}=\\sqrt{2}^2=2 \\] which is rational.\nThis is a perfectly acceptable proof, but highly non-constructive, And for some people, the fact that the proof gives no information about the irrationality of $\\sqrt{2}^\\sqrt{2}$ is a fault.\nSo here\u0026#39;s a lovely constructive proof I found on [reddit](https://www.reddit.com/r/math/comments/9i8lvl/classic/e6hnape) . 
Set $x=\\sqrt{2}$ and $y=2\\log_2{3}$. The fact that $y$ is irrational follows from the fact that if $y=p/q$ with $p$ and $q$ integers, then $2\\log_2{3}=p/q$ so that $2^{p/2q}=3$, or $2^p=3^{2q}$ which contradicts the fundamental theorem of arithmetic. Then:\n\\begin{eqnarray*} x^y\u0026amp;=\u0026amp;\\sqrt{2}^{2\\log_2{3}}\\\\ \u0026amp;=\u0026amp;2^{\\log_2{3}}\\\\ \u0026amp;=\u0026amp;3. \\end{eqnarray*} Wrestling with Docker For years I have been running a blog and other web apps on a VPS running Ubuntu 14.04 and Apache - a standard [LAMP](https://en.wikipedia.org/wiki/LAMP_(software_bundle)) system. However, after experimenting with some apps - temporarily installing them and testing them, only to discard them, the system was becoming a total mess. Worst of all, various MySQL files were ballooning out in size: the ibdata1 file in /var/lib/mysql was coming in at a whopping 37Gb (39568015360 bytes to be more accurate).\nNow, there are ways of dealing with this, but I don\u0026#39;t want to have to become an expert in MySQL; all I wanted to do was to recover my system and make it more manageable.\nI decided to use [Docker](www.docker.com). This is a \u0026#34;container system\u0026#34; where each app runs in its own container - a sort of mini system which contains all the files required to serve it up to the web. This clearly requires a certain amount of repetition between containers, but that\u0026#39;s the price to be paid for independence. The idea is that you can start or stop any container without affecting any of the others. For web apps many containers are based on [Alpine Linux](https://alpinelinux.org) which is a system designed to be as tiny as possible, along with the [nginx](https://www.nginx.com) web server.\nThere seems to be a sizable ecosystem of tools to help manage and deploy docker containers. 
Given my starting position of knowing nothing, I wanted to keep my extra tools to a minimum; I went with just two over and above docker itself: [docker-compose](https://docs.docker.com/compose/), which helps design, configure, and run docker containers, and [traefik](https://traefik.io), a reverse proxy, which handles all requests from the outside world to docker containers - thus managing things like ports - as well as interfacing with the certificate authority [Lets Encrypt](https://letsencrypt.org).\nMy hope was that I should be able to get these all set up so they would work as happily together as they were supposed to do. And so indeed it has turned out, although it took many days of fiddling, and innumerable questions to forums and web sites (such as reddit) to make it work.\nSo here\u0026#39;s my traefik configuration:\ndefaultEntryPoints = [\u0026#34;http\u0026#34;, \u0026#34;https\u0026#34;] [web] address = \u0026#34;:8080\u0026#34; [web.auth.basic] users = [\u0026#34;admin:$apr1$v7kJtvT7$h0F7kxt.lAzFH4sZ8Z9ik.\u0026#34;] [entryPoints] [entryPoints.http] address = \u0026#34;:80\u0026#34; [entryPoints.http.redirect] entryPoint = \u0026#34;https\u0026#34; [entryPoints.https] address = \u0026#34;:443\u0026#34; [entryPoints.https.tls] [traefikLog] filePath=\u0026#34;./traefik.log\u0026#34; format = \u0026#34;json\u0026#34; # Below here comes from # www.smarthomebeginner.com/traefik-reverse-proxy-tutorial-for-docker/ # with values adjusted for local use, of course # Let\u0026#39;s encrypt configuration [acme] email=\u0026#34;amca01@gmail.com\u0026#34; storage=\u0026#34;./acme.json\u0026#34; acmeLogging=true onHostRule = true entryPoint = \u0026#34;https\u0026#34; # Use a HTTP-01 acme challenge rather than TLS-SNI-01 challenge [acme.httpChallenge] entryPoint = \u0026#34;http\u0026#34; [[acme.domains]] main = \u0026#34;numbersandshapes.net\u0026#34; sans = [\u0026#34;monitor.numbersandshapes.net\u0026#34;, \u0026#34;adminer.numbersandshapes.net\u0026#34;, 
\u0026#34;portainer.numbersandshapes.net\u0026#34;, \u0026#34;kanboard.numbersandshapes.net\u0026#34;, \u0026#34;webwork.numbersandshapes.net\u0026#34;, \u0026#34;blog.numbersandshapes.net\u0026#34;] # Connection to docker host system (docker.sock) [docker] endpoint = \u0026#34;unix:///var/run/docker.sock\u0026#34; domain = \u0026#34;numbersandshapes.net\u0026#34; watch = true # This will hide all docker containers that don\u0026#39;t have explicitly set label to \u0026#34;enable\u0026#34; exposedbydefault = false and (part of) my docker-compose configuration, the file docker-compose.yml:\nversion: \u0026#34;3\u0026#34; networks: proxy: external: true internal: external: false services: traefik: image: traefik:1.6.0-alpine container_name: traefik restart: always command: --web --docker --logLevel=DEBUG volumes: - /var/run/docker.sock:/var/run/docker.sock - $PWD/traefik.toml:/traefik.toml - $PWD/acme.json:/acme.json networks: - proxy ports: - \u0026#34;80:80\u0026#34; - \u0026#34;443:443\u0026#34; labels: - traefik.enable=true - traefik.backend=traefik - traefik.frontend.rule=Host:monitor.numbersandshapes.net - traefik.port=8080 - traefik.docker.network=proxy blog: image: blog volumes: - /home/amca/docker/whats_this/public:/usr/share/nginx/html networks: - internal - proxy labels: - traefik.enable=true - traefik.backend=blog - traefik.docker.network=proxy - traefik.port=80 - traefik.frontend.rule=Host:blog.numbersandshapes.net The way this works, at least in respect of this blog, is that files copied into the directory /home/amca/docker/whats_this/public on my VPS will be automatically served by nginx. So all I now need is a command on my local system (on which I do all my blog writing), which serves up these files. 
I\u0026#39;ve called it docker-deploy:\nhugo -b \u0026#34;https://blog.numbersandshapes.net/\u0026#34; -t \u0026#34;blackburn\u0026#34; \u0026amp;\u0026amp; rsync -avz -e \u0026#34;ssh\u0026#34; --delete public/ amca@numbersandshapes.net:~/docker/whats_this/public Remarkably enough, it all works!\nOne issue I had at the beginning was that my original blog was served up at the URL https://numbersandshapes.net/blog and for some reason these links were still appearing in my new blog. It turned out (after a lot of anguished messages) that it was my mis-handling of rsync. I just ended up deleting everything except for the blog source files, and re-created everything from scratch.\nHouseholder\u0026#39;s methods\u0026#xa0;\u0026#xa0;\u0026#xa0;mathematics\u0026#xa0;algebra These are a class of root-finding methods; that is, for the numerical solution of a single nonlinear equation, developed by [Alston Scott Householder](https://en.wikipedia.org/wiki/Alston_Scott_Householder) in 1970. They may be considered a generalisation of the well known [Newton-Raphson method](https://en.wikipedia.org/wiki/Newton\u0026#39;s_method) (also known more simply as Newton\u0026#39;s method) defined by\n\\[ x\\leftarrow x-\\frac{f(x)}{f\u0026#39;(x)}. \\]\nwhere the equation to be solved is $f(x)=0$.\nFrom a starting value $x_0$ a sequence of iterates can be generated by\n\\[ x_{n+1}=x_n-\\frac{f(x_n)}{f\u0026#39;(x_n)}. \\]\nAs is well known, Newton\u0026#39;s method exhibits quadratic convergence; that is, if the sequence of iterates converges to a root value $r$, then the limit\n\\[ \\lim_{n\\to\\infty}\\frac{x_{n+1}-r}{(x_n-r)^2} \\]\nis finite. This means, in effect, that the number of correct decimal places roughly doubles at each step. Householder\u0026#39;s method for a rate of convergence $d+1$ is defined by\n\\[ x\\leftarrow x+d\\frac{(1/f)^{(d-1)}(x)}{(1/f)^{(d)}(x)}. 
\\]\nWe show how this definition can be rewritten in terms of ratios of derivatives, by using Python and its symbolic toolbox [SymPy](https://www.sympy.org/en/index.html).\nWe start by defining some variables and functions.\nfrom sympy import * x = Symbol(\u0026#39;x\u0026#39;) f = Function(\u0026#39;f\u0026#39;)(x) Now we can define the first Householder formula, with $d=1$:\nd = 1 H1 = x + d*diff(1/f,x,d-1)/diff(1/f,x,d) H1 \\[ x-\\frac{f(x)}{\\frac{d}{dx}f(x)} \\]\nwhich is Newton\u0026#39;s formula. Now for $d=2$:\nd = 2 H2 = x + d*diff(1/f,x,d-1)/diff(1/f,x,d) H2 \\[ x - \\frac{2 \\frac{d}{d x} f{\\left (x \\right )}}{- \\frac{d^{2}}{d x^{2}} f{\\left (x \\right )} + \\frac{2 \\left(\\frac{d}{d x} f{\\left (x \\right )}\\right)^{2}}{f{\\left (x \\right )}}} \\]\nThis is a mighty messy formula, but it can be greatly simplified by using ratios of derivatives defined by\n\\[ r_k=\\frac{f^{(k-1)}(x)}{f^{(k)}(x)} \\] This means that \\[ r_1=\\frac{f}{f\u0026#39;},\\quad r_2=\\frac{f\u0026#39;}{f^{\\prime\\prime}} \\] To make the substitution into the current expression above, we can use the substitutions \\[ f^{\\prime\\prime}=f\u0026#39;/r_2,\\quad f\u0026#39;=f/r_1 \\] to be done sequentially (first defining the new symbols)\nr_1,r_2,r_3 = symbols(\u0026#39;r_1,r_2,r_3\u0026#39;) H2r = H2.subs([(Derivative(f,x,2), Derivative(f,x)/r_2), (Derivative(f,x), f/r_1)]).simplify() H2r The term subtracted from $x$ is then \\[ -\\frac{2r_1r_2}{r_1-2r_2} \\] Dividing the top and bottom by $-2r_2$ produces the formulation \\[ \\frac{r_1}{1-\\displaystyle{\\frac{r_1}{2r_2}}} \\] and so Householder\u0026#39;s method for $d=2$ is defined by the recurrence \\[ x\\leftarrow x-\\frac{r_1}{1-\\displaystyle{\\frac{r_1}{2r_2}}}. \\] This is known as [Halley\u0026#39;s method](https://en.wikipedia.org/wiki/Halley\u0026#39;s_method), after [Edmond Halley](https://en.wikipedia.org/wiki/Edmond_Halley), also known for his comet. 
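To make the recurrence concrete, here is a minimal plain-Python sketch of one Halley step written in terms of the ratios $r_1$ and $r_2$. The function name halley_step and the test equation $x^3-2=0$ are illustrative choices of mine, and ordinary floating point is assumed rather than the high precision arithmetic used later in the post.

```python
def halley_step(x, f, df, d2f):
    # one application of x <- x - r1 / (1 - r1 / (2 * r2))
    r1 = f(x) / df(x)      # ratio f / f'
    r2 = df(x) / d2f(x)    # ratio f' / f''
    return x - r1 / (1 - r1 / (2 * r2))

# illustrative test equation x^3 - 2 = 0 with its first two derivatives
f = lambda x: x**3 - 2
df = lambda x: 3 * x**2
d2f = lambda x: 6 * x

x = 1.0
for _ in range(5):
    x = halley_step(x, f, df, d2f)
print(x)
```

Starting from 1.0 the first step gives 1.25, and a handful of steps is already enough to reach the limit of double precision.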
This method has been called the most often rediscovered iteration formula in the literature.\nIt would exhibit cubic convergence, which means that the number of correct figures roughly triples at each step.\nApply the same sequence of steps for $d=3$, and including the substitution \\[ f^{\\prime\\prime\\prime} = f^{\\prime\\prime}/r_3 \\] produces the fourth order formula \\[ x\\leftarrow x-\\frac{3 r_{1} r_{3} \\left(2r_{2} - r_{1}\\right)}{r_{1}^{2} - 6 r_{1} r_{3} + 6 r_{2} r_{3}} \\]\nA test We\u0026#39;ll use the equation \\[ x^5+x-1=0 \\] which has a root close to $0.7$. First Newton\u0026#39;s method, which is the Householder method of order $d=1$, and we start by defining the symbol $x$ and the function $f$:\nx = Symbol(\u0026#39;x\u0026#39;) f = x**5+x-1 Next define the iteration of Newton\u0026#39;s method, which can be turned into a function with the handy tool lambdify:\nnr = lambdify(x, x - f/diff(f,x)) Now, a few iterations, and print them as strings:\ny = 0.7 ys = [y] for i in range(10): y = N(nr(y),100) ys += [y] for i in ys: print(str(i)) 0.7 0.7599545557827765973613054484303575009107589721679687500000000000000000000000000000000000000000000000 0.7549197891599746887794253559985793967456078439525201893202319456623650882121929457935763902468565963 0.7548776691557956141971506438033504033307707534709697222674827264390889507161368160254597915269779252 0.7548776662466927739251146002523856449587324643131536407777773148939177229546284200355119465808326870 0.7548776662466927600495088963585290075677963335246916447723036615900830138144428153523526591809355834 0.7548776662466927600495088963585286918946066177727931439892839706462440390043279509776806970677946058 0.7548776662466927600495088963585286918946066177727931439892839706460806551280810907382270928422503037 0.7548776662466927600495088963585286918946066177727931439892839706460806551280810907382270928422503037 
0.7548776662466927600495088963585286918946066177727931439892839706460806551280810907382270928422503037 0.7548776662466927600495088963585286918946066177727931439892839706460806551280810907382270928422503037 We can easily compute the number of correct decimal places each time by simply finding the first place in each string where it differs from the next one:\nfor i in range(1,7): d = [str(ys[i])[j] == str(ys[i+1])[j] for j in range(102)] print(d.index(False)-2) \\begin{array}{r} 2\\cr 3\\cr 8\\cr 16\\cr 32\\cr 66\n\\end{array}\nand we see that the number of correct places very nearly doubles at each iteration.\nNow, the fourth order method, with $d=3$:\nr1 = lambdify(x,f/diff(f,x)) r2 = lambdify(x,diff(f,x)/diff(f,x,2)) r3 = lambdify(x,diff(f,x,2)/diff(f,x,3)) h3 = lambdify(x,x-3*r1(x)*r3(x)*(2*r2(x)-r1(x))/(r1(x)**2-6*r1(x)*r3(x)+6*r2(x)*r3(x))) Now we basically copy down the above commands, except that we\u0026#39;ll use 1500 decimal places instead of 100:\ny = 0.7 ys = [str(y)] for i in range(10): y = N(h3(y),1500) ys += [str(y)] for i in range(1,6): d = [ys[i][j] == ys[i+1][j] for j in range(1502)] print(d.index(False)-2) \\begin{array}{r} 4\\cr 19\\cr 76\\cr 308\\cr 1233\n\\end{array}\nand we see that the number of correct decimal places at each step is indeed increased by a factor very close to 4.\nThe Joukowsky Transform\u0026#xa0;\u0026#xa0;\u0026#xa0;mathematics\u0026#xa0;geometry\u0026#xa0;jsxgraph The [Joukowsky Transform](https://en.wikipedia.org/wiki/Joukowsky_transform) is an elegant and simple way to create an airfoil shape. Let $C$ be a circle in the complex plane that passes through the point $z=1$ and encompasses the point $z=-1$. The transform is defined as\n\\[ \\zeta=z+\\frac{1}{z}. 
\\]\nWe can explore the transform by looking at the circles centred at $(-r,0)$ with $r\u0026gt;0$ and with radius $1+r$:\n\\[ |z+r|=1+r \\]\nor in cartesian coordinates with parameter $t$:\n\\[\n\\begin{aligned} x \u0026amp;= -r+(1+r)\\cos(t)\\\\ y \u0026amp;= (1+r)\\sin(t) \\end{aligned} \\]\nso that \\[ (x,y)\\rightarrow \\left(x+\\frac{x}{x^2+y^2},y-\\frac{y}{x^2+y^2}\\right). \\]\nTo see this in action, move the point $c$ in this diagram about. You\u0026#39;ll get the best result when it is close to the origin.\nDouble Damask\u0026#xa0;\u0026#xa0;\u0026#xa0;humour This was a comedy sketch initially performed in the revue [\u0026#34;Clowns in Clover\u0026#34;](http://www.guidetomusicaltheatre.com/shows_c/clownsclover.htm) which had its first performance at the Adelphi Theatre in London on December 1, 1927. This particular sketch was written by [Dion Titheradge](http://en.wikipedia.org/wiki/Dion_Titheradge) and starred the inimitable [Cicely Courtneidge](https://en.wikipedia.org/wiki/Cicely_Courtneidge) as the annoyed customer Mrs Spooner. It has been recorded and is available on many different collections; you can also hear it on [youtube](https://www.youtube.com/watch?v=0P8XSUGSR-c).\nI have loved this sketch since I first heard it as a teenager on a three record collection called something like \u0026#34;Masters of Comedy\u0026#34;, being a collection of classic sketches. Double Damask has also been performed by Beatrice Lillie, and you can search for this also on youtube. For example, [here](https://www.youtube.com/watch?v=GiRyqDfNxqU). I hope admirers of the excellent Ms Lillie will not be upset by my saying I far prefer Cicely Courtneidge, whose superb diction and impeccable comic timing are beyond reproach.\nNo doubt the original script is available somewhere, but in the annoying way of the internet, I couldn\u0026#39;t find it. 
So here is my transcription of the Courtneidge version of \u0026#34;Double Damask\u0026#34;.\n—\nDouble Damask\nwritten by\nDion Titheradge\nCharacters:\\ A customer, Mrs Spooner\\ A shop assistant (unnamed)\\ A manager, Mr Peters\nScene: The linen department of a large store.\nMRS SPOONER: I wonder if you could tell me if my order has gone off yet?\nASSISTANT: Not knowing your order, madam, I really couldn\u0026#39;t say.\nMRS SPOONER: But I was in here an hour ago and gave it to you.\nASSISTANT: What name, madam?\nMRS SPOONER: Spooner, Mrs Spooner,\nASSISTANT: Have you an address?\nMRS SPOONER: Do I look as if I live in the open air? I gave a large order for sheets and tablecloths, to be sent to Bacon Villa, Egham. (pronounced \u0026#34;Eg\u0026#39;m\u0026#34;)\nASSISTANT: Eg\u0026#39;m?\nMRS SPOONER: I hope I speak plainly: Egg Ham!\nASSISTANT: Oh yes, yes I remember perfectly now, Madam. Let me see now… no, your order won\u0026#39;t go through until tomorrow morning. Is there anything further?\nMRS SPOONER: Yes, (very quickly) I want two dozen double damask dinner napkins.\nASSISTANT: I beg your pardon?\nMRS SPOONER (as quicky as before): I said two dozen double damask dinner napkins.\nASSISTANT: I\u0026#39;m sorry madam, I don\u0026#39;t quite catch -\nMRS SPOONER: Dinner napkins, man! Dinner napkins!\nASSISTANT: Of course madam. Plain?\nMRS SPOONER: Not plain, double damask.\nASSISTANT: Yes… would you mind repeating your order Madam? I\u0026#39;m not quite sure.\nMRS SPOONER: I want two dozen dammle dubbuck; I want two dammle dubb… oh dear, stupid of me! I want two dozen dammle dizzick danner nipkins.\nASSISTANT: Danner nipkins Madam?\nMRS SPOONER: Yes.\nASSISTANT: You mean dinner napkins.\nMRS SPOONER: That\u0026#39;s what I said.\nASSISTANT: No, pardon me, Madam, you said danner nipkins!\nMRS SPOONER: Don\u0026#39;t be ridiculous! I said dinner napkins, and I meant danner nipkins. 
Nipper dank…you know you\u0026#39;re getting me muddled now.\nASSISTANT: I\u0026#39;m sorry Madam. You want danner nipkins, exactly. How many?\nMRS SPOONER: Two duzzle.\nASSISTANT: Madam?\nMRS SPOONER: Oh, gracious, young man - can\u0026#39;t you get it right? I want two dubbin duzzle damask dinner napkins.\nASSISTANT: Oh no, Madam, not two dubbin - you mean two dozen!\nMRS SPOONER: I said two dozen! Only they must be dammle duzzick!\nASSISTANT: No, we haven\u0026#39;t any of that in stock, Madam.\nMRS SPOONER (in a tone of complete exasperation): Oh dear, of all the fools! Can\u0026#39;t I find anybody, just anybody with a modicum of intelligence in this store?\nASSISTANT: Well, here is our Mr Peters, Madam. Now perhaps if you ask him he might-\nMR PETERS (In an authoritative \u0026#34;we can fix anything\u0026#34; kind of voice): Can I be of any assistance to you, Madam?\nMRS SPOONER: I\u0026#39;m sorry to say that your assistant doesn\u0026#39;t appear to speak English. I\u0026#39;m giving an order, but it might just as well be in Esperanto for all he understands.\nMR PETERS: Allow me to help you Madam. You require?\nMRS SPOONER: I require (as quickly as before) two dozen double damask dinner napkins.\nMR PETERS: I beg pardon, Madam?\nMRS SPOONER: Oh heavens - can\u0026#39;t you understand?\nMR PETERS: Would you mind repeating your order, Madam.\nMRS SPOONER: I want two dazzen -\nMR PETERS: Two dozen!\nMRS SPOONER: I said two dozen!\nMR PETERS: Oh no no Madam - no, you said two dazzen. But I understand perfectly what you mean. You mean two dozen; in other words - a double dozen.\nMRS SPOONER: That\u0026#39;s it! A duzzle dubbin double damask dinner napkins.\nMR PETERS: Oh no, pardon me, Madam, pardon me: you mean a double dozen double dummick dinner napkins.\nASSISTANT: Double damask, sir.\nMR PETERS: I said double damask! 
It\u0026#39;s… dapper ninkins you require, sir.\nMRS SPOONER: Please get it right, I want dinner napkins, dinner napkins.\nMR PETERS: I beg pardon, Madam. So stupid of me…one gets so confused… (Laughs)\nMRS SPOONER: It is not a laughing matter.\nMR PETERS: Of course. Dipper nankins, madam.\nASSISTANT: Dapper ninkins, sir.\nMRS SPOONER: Danner nipkins.\nMR PETERS: I understand exactly what Madam wants. It is two d-d-d-d-..two d- Would you mind repeating your order please, Madam?\nMRS SPOONER: Ohhh, dear.. I want two duzzle dizzen damask dinner dumplings!\nMR PETERS: Allow me, Madam, allow me. The lady requires (quickly) two dubbin double damask dunner napkins.\nASSISTANT: Dunner napkins sir?\nMR PETERS: Certainly! Two dizzen.\nMRS SPOONER: Not two dizzen - I want two dowzen!\nMR PETERS: Quite so, Madam, quite so. If I may say so we\u0026#39;re getting a little bit confused, splitting it up, as it were. Now, the full order, the full order, is two dazzen dibble dummisk n\u0026#39;dipper dumkins.\nASSISTANT: Excuse me, sir, you mean two dummen dammle dimmick dizzy napkins.\n(The next four four lines are spoken almost on top of each other)\nMRS SPOONER: I do not want dizzy napkins, I want two dizzle dammen damask -\nMR PETERS: No - two dizzle dammle dizzick!\nASSISTANT: Two duzzle dummuck dummy!\nMRS SPOONER: Two damn dizzy diddle dimmer dipkins!\nMR PETERS (Shocked): Madam, Madam! Please, please - your language!\nMRS SPOONER: Oh, blast. 
Give me twenty four serviettes.\nGraphs of Eggs\u0026#xa0;\u0026#xa0;\u0026#xa0;geometry\u0026#xa0;jsxgraph I recently came across some nice material on [John Cook\u0026#39;s blog](https://www.johndcook.com/blog/) about equations that describe eggs.\nIt turns out there are a vast number of equations whose graphs are egg-shaped: that is, basically elliptical in shape, but with one end \u0026#34;rounder\u0026#34; than the other.\nYou can see lots at Jürgen Köller\u0026#39;s [Mathematische Basteleien](http://www.mathematische-basteleien.de/eggcurves.htm) page. (Although this blog is mostly in German, there are enough English language pages for monoglots such as me). And plenty of egg equations can be found in the [2dcurves](http://www.2dcurves.com/) pages. Another excellent source of eggy equations is [TDCC Laboratory](http://www.geocities.jp/nyjp07/index_egg_E.html) from Japan (the link here is to their English language page). For the purposes of experimenting we will use equations from TDCC, adjusted as necessary. 
Many of their equations are given in parametric form, which means they can be easily graphed and explored using [JSXGraph](https://jsxgraph.org/wp/index.html).\nThe first set of parametric equations, whose author is given as Nobuo Yamamoto, is:\n$$\\begin{aligned} x\u0026amp;=(a+b+b\\cos\\theta)\\cos\\theta\\\\ y\u0026amp;=(a+b\\cos\\theta)\\sin\\theta \\end{aligned}$$\nIf we divide these equations by $a$, and use the parameter $c$ for $b/a$ we obtain slightly simpler equations:\n$$\\begin{aligned} x\u0026amp;=(1+c+c\\cos\\theta)\\cos\\theta\\\\ y\u0026amp;=(1+c\\cos\\theta)\\sin\\theta \\end{aligned}$$\nHere you can explore values of $c$ between 0 and 1:\nAnother [set of equations](http://www.geocities.jp/nyjp07/index_egg_by_Itou_E.html) is said to be due to [Tadao Ito](http://web1.kcn.jp/hp28ah77/us_author.htm) (whose surname is sometimes transliterated as Itou):\n$$\\begin{aligned} x\u0026amp;=\\cos\\theta\\\\ y\u0026amp;=c\\cos\\frac{\\theta}{4}\\sin\\theta \\end{aligned}$$\nMany more equations, both parametric and implicit, can be found at the sites linked above.\nExploring JSXGraph\u0026#xa0;\u0026#xa0;\u0026#xa0;jsxgraph [JSXGraph](https://jsxgraph.org/wp/index.html) is a graphics package developed in Javascript, and which seems to be tailor-made for a static blog such as this. It consists of only two files: the javascript file itself, and an accompanying css file, which you can download. Alternatively you can simply link to the online files at the Javascript content delivery site [cdnjs](https://cdnjs.com/about) managed by [cloudflare](https://www.cloudflare.com/). There are cloudflare servers all over the world - even in my home town of Melbourne, Australia. 
So I modified the head.html file of my theme to include a link to the necessary files:\nSo I downloaded the javascript and css files as described [here](https://jsxgraph.uni-bayreuth.de/wp/download/index.html) and also, for good measure, added the script line (from that page) to the layouts/partials/head.html file of the theme. Then copied the following snippet from the JSXGraph site:\n\u0026lt;div id=\u0026#34;box\u0026#34; class=\u0026#34;jxgbox\u0026#34; style=\u0026#34;width:500px; height:500px;\u0026#34;\u0026gt;\u0026lt;/div\u0026gt; \u0026lt;script type=\u0026#34;text/javascript\u0026#34;\u0026gt; var board = JXG.JSXGraph.initBoard(\u0026#39;box\u0026#39;, {boundingbox: [-10, 10, 10, -10], axis:true}); \u0026lt;/script\u0026gt; However, to make this work the entire script needs to be inside a \u0026lt;div\u0026gt;, \u0026lt;/div\u0026gt; pair, like this:\n\u0026lt;div id=\u0026#34;box\u0026#34; class=\u0026#34;jxgbox\u0026#34; style=\u0026#34;width:500px; height:500px;\u0026#34;\u0026gt; \u0026lt;script type=\u0026#34;text/javascript\u0026#34;\u0026gt; var board = JXG.JSXGraph.initBoard(\u0026#39;box\u0026#39;, {boundingbox: [-10, 10, 10, -10], axis:true}); \u0026lt;/script\u0026gt; \u0026lt;/div\u0026gt; Just to see how well this works, here\u0026#39;s Archimedes\u0026#39; neusis construction of an angle trisection: given an angle $\\theta$ in a unit semicircle, its trisection is obtained by laying against the circle a straight line with points spaced 1 apart (drag point A about the circle to see this in action):\nFor what it\u0026#39;s worth, here is the splendid javascript code to produce the above figure:\n\u0026lt;div id=\u0026#34;box\u0026#34; class=\u0026#34;jxgbox\u0026#34; style=\u0026#34;width:500px; height:333.33px;\u0026#34;\u0026gt; \u0026lt;script type=\u0026#34;text/javascript\u0026#34;\u0026gt; JXG.Options.axis.ticks.insertTicks = false; JXG.Options.axis.ticks.drawLabels = false; var board = JXG.JSXGraph.initBoard(\u0026#39;box\u0026#39;, 
{boundingbox: [-1.5, 1.5, 3, -1.5],axis:true}); var p = board.create(\u0026#39;point\u0026#39;,[0,0],{visible:false,fixed:true}); var neg = board.create(\u0026#39;point\u0026#39;,[-0.67,0],{visible:false,fixed:true}); var c = board.create(\u0026#39;circle\u0026#39;,[[0,0],1.0]); var a = board.create(\u0026#39;glider\u0026#39;,[-Math.sqrt(0.5),Math.sqrt(0.5),c],{name:\u0026#39;A\u0026#39;}); var l1 = board.create(\u0026#39;segment\u0026#39;,[a,p]); var ang = board.create(\u0026#39;angle\u0026#39;,[a,p,neg],{radius:0.67,name:\u0026#39;θ\u0026#39;}); var theta = JXG.Math.Geometry.rad(a,p,neg); var bb = board.create(\u0026#39;point\u0026#39;,[function(){return Math.cos(Math.atan2(a.Y(),-a.X())/3);},function(){return Math.sin(Math.atan2(a.Y(),-a.X())/3);}],{name:\u0026#39;B\u0026#39;}); var w = board.create(\u0026#39;point\u0026#39;,[function(){return Math.cos(Math.atan2(a.Y(),-a.X())/3)/0.5;},0]); var l2 = board.create(\u0026#39;line\u0026#39;,[a,w]); var l3 = board.create(\u0026#39;segment\u0026#39;,[p,bb]); var l4 = board.create(\u0026#39;segment\u0026#39;,[bb,w],{strokeWidth:6,strokeColor:\u0026#39;#FF0000\u0026#39;}); var ang2 = board.create(\u0026#39;angle\u0026#39;,[bb,w,neg],{radius:0.67,name:\u0026#39;θ/3\u0026#39;}); \u0026lt;/script\u0026gt; \u0026lt;/div\u0026gt; Quite wonderful, it is.\nThe trinomial theorem\u0026#xa0;\u0026#xa0;\u0026#xa0;mathematics\u0026#xa0;algebra When I was teaching the binomial theorem (or, to be more accurate, the binomial expansion) to my long-suffering students, one of them asked me if there was a trinomial theorem. Well, of course there is, although in fact expanding sums of greater than two terms is generally not classed as a theorem described by the number of terms. 
The general result is\n\\[ (x_1+x_2+\\cdots+x_k)^n=\\sum_{a_1+a_2+\\cdots+a_k=n} {n\\choose a_1,a_2,\\ldots,a_k}x_1^{a_1}x_2^{a_2}\\cdots x_k^{a_k} \\]\nso in particular a \u0026#34;trinomial theorem\u0026#34; would be\n\\[ (x+y+z)^n=\\sum_{a+b+c=n}{n\\choose a,b,c}x^ay^bz^c. \\]\nHere we define\n\\[ {n\\choose a,b,c}=\\frac{n!}{a!b!c!} \\]\nand this is known as a trinomial coefficient; more generally, for an arbitrary number of variables, it is a multinomial coefficient. It is guaranteed to be an integer if the lower values sum to the upper value.\nSo to compute $(x+y+z)^5$ we could list all integers $a,b,c$ with $0\\le a,b,c\\le 5$ for which $a+b+c=5$, and put them all into the above sum. But of course there\u0026#39;s a better way, and it comes from expanding $(x+y+z)^5$ as a binomial $(x+(y+z))^5$ so that\n\\begin{array}{rcl} (x+(y+z))^5\u0026amp;=\u0026amp;x^5\\\\\n\u0026amp;\u0026amp;+5x^4(y+z)\\\\\n\u0026amp;\u0026amp;+10x^3(y+z)^2\\\\\n\u0026amp;\u0026amp;+10x^2(y+z)^3\\\\\n\u0026amp;\u0026amp;+5x(y+z)^4\\\\\n\u0026amp;\u0026amp;+(y+z)^5\n\\end{array}\nNow we can expand each of those binomial powers:\n\\begin{array}{rcl} (x+(y+z))^5\u0026amp;=\u0026amp;x^5\\\\\n\u0026amp;\u0026amp;+5x^4(y+z)\\\\\n\u0026amp;\u0026amp;+10x^3(y^2+2yz+z^2)\\\\\n\u0026amp;\u0026amp;+10x^2(y^3+3y^2z+3yz^2+z^3)\\\\\n\u0026amp;\u0026amp;+5x(y^4+4y^3z+6y^2z^2+4yz^3+z^4)\\\\\n\u0026amp;\u0026amp;+(y^5+5y^4z+10y^3z^2+10y^2z^3+5yz^4+z^5)\n\\end{array}\nExpanding this produces\n\\begin{split} x^5\u0026amp;+5x^4y+5x^4z+10x^3y^2+20x^3yz+10x^3z^2+10x^2y^3+30x^2y^2z+30x^2yz^2\\\\ \u0026amp;+10x^2z^3+5xy^4+20xy^3z+30xy^2z^2+20xyz^3+5xz^4+y^5+5y^4z+10y^3z^2\\\\ \u0026amp;+10y^2z^3+5yz^4+z^5 \\end{split} which is an equation of rare beauty.\nBut there\u0026#39;s a nice way of setting this up, which involves writing down Pascal\u0026#39;s triangle to the fifth row, and putting the fifth row, as a column, on the side. 
Then multiply across:\n\\begin{array}{lcccccccccc} 1\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;1\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;\\\\\n5\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;1\u0026amp;\u0026amp;1\u0026amp;\u0026amp;\u0026amp;\u0026amp;\\\\\n10\\quad×\u0026amp;\u0026amp;\u0026amp;\u0026amp;1\u0026amp;\u0026amp;2\u0026amp;\u0026amp;1\u0026amp;\u0026amp;\u0026amp;\\\\\n10\u0026amp;\u0026amp;\u0026amp;1\u0026amp;\u0026amp;3\u0026amp;\u0026amp;3\u0026amp;\u0026amp;1\u0026amp;\u0026amp;\\\\\n5\u0026amp;\u0026amp;1\u0026amp;\u0026amp;4\u0026amp;\u0026amp;6\u0026amp;\u0026amp;4\u0026amp;\u0026amp;1\u0026amp;\\\\\n1\u0026amp;1\u0026amp;\u0026amp;5\u0026amp;\u0026amp;10\u0026amp;\u0026amp;10\u0026amp;\u0026amp;5\u0026amp;\u0026amp;1\n\\end{array}\nto produce the final array of coefficients (with index numbers at the left):\n\\begin{array}{l*{10}{c}} 0\\qquad{}\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;1\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;\\\\\n1\u0026amp;\u0026amp;\u0026amp;\u0026amp;\u0026amp;5\u0026amp;\u0026amp;5\u0026amp;\u0026amp;\u0026amp;\u0026amp;\\\\\n2\u0026amp;\u0026amp;\u0026amp;\u0026amp;10\u0026amp;\u0026amp;20\u0026amp;\u0026amp;10\u0026amp;\u0026amp;\u0026amp;\\\\\n3\u0026amp;\u0026amp;\u0026amp;10\u0026amp;\u0026amp;30\u0026amp;\u0026amp;30\u0026amp;\u0026amp;10\u0026amp;\u0026amp;\\\\\n4\u0026amp;\u0026amp;5\u0026amp;\u0026amp;20\u0026amp;\u0026amp;30\u0026amp;\u0026amp;20\u0026amp;\u0026amp;5\u0026amp;\\\\\n5\u0026amp;1\u0026amp;\u0026amp;5\u0026amp;\u0026amp;10\u0026amp;\u0026amp;10\u0026amp;\u0026amp;5\u0026amp;\u0026amp;1\n\\end{array}\nRow $i$ of this array corresponds to $x^{5-i}$ and all combinations of powers $y^bz^c$ with $b+c=i$. Thus for example the fourth row down, corresponding to \\( i=3 \\), may be considered as the coefficients of the terms\n\\[ x^2y^3,\\quad x^2y^2z,\\quad x^2yz^2,\\quad x^2z^3. 
\\]\nNote that the triangle of coefficients is symmetrical along all three centre lines, as well as rotationally symmetric by 120°. Playing with Hugo\u0026#xa0;\u0026#xa0;\u0026#xa0;hugo\u0026#xa0;org I\u0026#39;ve been using wordpress as my blogging platform since I first started, about 10 years ago. (In fact the first post I can find is dated March 30, 2008.) I chose [wordpress.com](http://wordpress.com) back then because it was (a) free, and (b) supported mathematics through a version (or subset) of [LaTeX](https://www.latex-project.org). As I have used LaTeX extensively for all my writing since the early 1990\u0026#39;s, it\u0026#39;s a standard requirement for me.\nSome time later I decided to start hosting my own server (well, a VPS), on which I could use [wordpress.org](https://wordpress.org), which is the self-hosted version of wordpress. The advantages of a self hosted blog are many, but I particularly like the greater freedom, the ability to include a far greater variety of plugins, and the larger choice of themes. And one of the plugins I liked particularly was [WP QuickLaTeX](https://wordpress.org/plugins/wp-quicklatex/) which provided a LaTeX engine far superior to the in-built one of wordpress.com. Math bloggin heaven!\nHowever, hosting my own wordpress site was not without difficulty. First I had to install it and get it up and running (even this was non-trivial), and then I had to manage all the users and passwords: myself as a standard user, wp-admin for accessing the Wordpress site itself, a few others. I have quite a long list containing all the commands I used, and all the users and passwords I created.\nThis served me well, but it was also slow to use. My VPS is perfectly satisfactory, but it is not fast (I\u0026#39;m too cheap to pay for much more than a low-powered one), and the edit-save-preview cycle of online blogging with my wordpress installation was getting tiresome. Plus the issue of security. 
I\u0026#39;ve been hacked once, and I\u0026#39;ve since managed to secure my site with a free certificate from [Let\u0026#39;s Encrypt](https://letsencrypt.org). In fact, in many ways Let\u0026#39;s Encrypt is one of the best things to have happened for security. An open Certificate Authority is manna from heaven, as far as I\u0026#39;m concerned.\nWordpress is of course more than just blogging software. It now grandly styles itself as Site Building software and Content Management System, and the site claims that \u0026#34;30% of the web uses Wordpress\u0026#34;. It is in fact hugely powerful and deservedly popular, and can be used for pretty much whatever sort of site you want to build. Add to that a seemingly infinite set of plugins, and you have an entire ecosystem of web-building.\nHowever, all of that popularity and power comes at a cost: it is big, confusing, takes work to maintain, keep secure, and keep up-to-date, and is a target for hackers. Also for me, it has become colossal overkill. I don\u0026#39;t need all those bells and whistles; all I want to do is host my blog and share my posts with the world (the $1.5\\times 10^{-7}\\%$ of the world who reads it).\nThe kicker for me was checking out a [mathematics education blog](http://rtalbert.org) by an author I admire greatly, to discover it was built with the static blog engine [jekyll](https://jekyllrb.com). So being the inventive bloke I am, I thought I\u0026#39;d do the same.\nBut a bit of hunting led me to [Hugo](https://gohugo.io), which apparently is very similar to jekyll, but much faster, and written in [Go](https://golang.org) instead of [Ruby](https://www.ruby-lang.org/en/). Since I know nothing about either Go or Ruby I don\u0026#39;t know if it\u0026#39;s the language which makes the difference, or something else. 
But it sure looks nice, and supports [mathjax](https://www.mathjax.org) for LaTeX.\nSo my current plan is to migrate from wordpress to Hugo, and see how it goes!\nPython GIS, and election results\u0026#xa0;\u0026#xa0;\u0026#xa0;GIS\u0026#xa0;python\u0026#xa0;voting Election mapping A few weeks ago there was a by-election in my local electorate (known as an electoral division) of Batman here in Australia. I was interested in comparing the results of this election with the previous election two years ago. In this division it\u0026#39;s become a two-horse race: the Greens against the Australian Labor Party. Although Batman had been a solid Labor seat for almost its entire existence - it used to be considered one of the safest Labor seats in the country - over the past decade or so the Greens have been making inroads into this Labor heartland, to the extent that is no longer considered a safe seat. And in fact for this particular election the Greens were the popular choice to win. In the end Labor won, but my interest is not so much tracing the votes, but trying to map them.\nPython has a vast suite of mapping tools, so much so that it may be that Python has become the GIS tool of choice. And there are lots of web pages devoted to discussing these tools and their uses, such as [this one](http://matthewrocklin.com/blog/work/2017/09/21/accelerating-geopandas-1).\nMy interest was producing maps such as are produced by [pollbludger](https://www.pollbludger.net/by-elections/fed-2018-03-batman.htm) This is the image from that page:\n![pollbludger](/pollbludger_batman.png)\nAs you can see there are basically three elements:\nthe underlying streetmap the border of the division the numbers showing the percentage wins of each party at the various polling booths. 
I wanted to do something similar, but replace the numbers with circles whose sizes showed the strength of the percentage win at each place.\nGetting the information Because this election was in a federal division, the management of the polls and of the results (including counting the votes) was managed by the Australian Electoral Commission, whose [pages about this by-election]( http://www.aec.gov.au/Elections/supplementary_by_elections/2018-batman/) contain pretty much all publicly available information. You can copy and paste the results from their pages, or download them as CSV files.\nThen I needed to find the coordinates (Longitude and Latitude) of all the polling places, of which there were 42 at fixed locations. There didn\u0026#39;t seem to be a downloadable file for this, so for each booth address (given on the AEC site), I entered it into Google Maps and copied down the coordinates as given.\nThe boundaries of all the divisions can again be downloaded from the [AEC GIS page](http://www.aec.gov.au/Electorates/gis/index.htm). These are given in various standard GIS files.\nPutting it all together The tools I felt brave enough to use were:\n[Pandas:](https://pandas.pydata.org) Python\u0026#39;s data analysis library. I really only needed to read information from CSV files that I could then use later. [Geopandas:](http://geopandas.org) This is a GIS library with Pandas-like syntax, and is designed in part to be a GIS extension to Pandas. I would use it to extract and manage the boundary data of the electoral division. [Cartopy:](http://scitools.org.uk/cartopy/) which is a library of \u0026#34;cartographic tools\u0026#34;. 
And of course the standard [matplotlib](http://matplotlib.org) for plotting, [numpy](http://www.numpy.org) for array handling.\nMy guides were the [London tube stations example](http://scitools.org.uk/cartopy/docs/latest/gallery/tube_stations.html) from Cartopy and a local (Australian) data analysis blog which discussed the [use of Cartopy](http://www.net-analysis.com/blog/cartopytiles.html) including adding graphics to an map image.\nThere are lots of other GIS tools for Python, some of which seem to be very good indeed, and all of which I downloaded:\n[Fiona](https://github.com/Toblerity/Fiona): which is a \u0026#34;nimble\u0026#34; API for handling maps [Descartes](https://bitbucket.org/sgillies/descartes/): which provides a means by which matplotlib can be used to manage geographic objects [geoplotlib](https://github.com/andrea-cuttone/geoplotlib): for \u0026#34;visualizing geographical data and making maps\u0026#34; [Folium](http://python-visualization.github.io/folium/): for visualizing maps using the [leaflet.js](http://leafletjs.com) library. It may be that the mapping I wanted to do with Python could have been done just as well in Javascript alone. And probably other languages. I stuck with Python simply because I knew it best. [QGIS](https://qgis.org/en/site/): which is designed to be a complete free and open source GIS, and with APIs both for Python and C++ [GDAL](http://www.gdal.org): the \u0026#34;Geospatial Data Abstraction Library\u0026#34; which has a [Python package](https://pypi.python.org/pypi/GDAL) also called GDAL, for manipulating geospatial raster and vector data. I suspect that if I was professionally working in the GIS area some or all of these packages would be at least as - and maybe even more - suitable than the ones I ended up using. 
But then, I was starting from a position of absolute zero with regards to GIS, and also I wanted to be able to make use of the tools I already knew, such as Pandas, matplotlib, and numpy.\nHere\u0026#39;s the start, importing the libraries, or the bits of them I needed:\nimport matplotlib.pyplot as plt import numpy as np import cartopy.crs as ccrs from cartopy.io.img_tiles import GoogleTiles import geopandas as gpd import pandas as pd I then had to read in the election data, which was a CSV file from the AEC containing the Booth, and the final distributed percentage weighting to the ALP and Greens candidates, and their percentage scores. As well, I read in the boundary data:\nbb = pd.read_csv(\u0026#39;Elections/batman_booths_coords.csv\u0026#39;) # contains all election info plus lat, long of booths longs = np.array(bb[\u0026#39;Long\u0026#39;]) lats = np.array(bb[\u0026#39;Lat\u0026#39;]) v = gpd.read_file(\u0026#39;VicMaps/VIC_ELB.MIF\u0026#39;) # all electoral divisions in MapInfo form bg = v.loc[2].geometry # This is the Polygon representing Batman b_longs = bg.exterior.xy[0] # These next two lines are the longitudes and latitudes b_lats = bg.exterior.xy[1] # Notice that bb uses Pandas to read in the CSV file, which contains all the AEC information, as well as the latitude and longitude of each Booth, which I\u0026#39;d added myself. 
Here longs and lats are the coordinates of the polling booths, and b_longs and b_lats are all the vertices which form the boundary of the division.\nNow it\u0026#39;s all pretty straightforward, especially with the examples mentioned above:\nfig = plt.figure(figsize=(16,16)) tiler = GoogleTiles() ax = plt.axes(projection=tiler.crs) margin=0.01 ax.set_extent((bg.bounds[0]-margin, bg.bounds[2]+margin,bg.bounds[1]-margin, bg.bounds[3]+margin)) ax.add_image(tiler,12) for i in range(len(longs)): plt.plot(longs[i],lats[i],ga2[i],markersize=abs(ga[i]),alpha=0.7,transform=ccrs.Geodetic()) plt.plot(b_longs,b_lats,\u0026#39;k-\u0026#39;,linewidth=5,transform=ccrs.Geodetic()) plt.title(\u0026#39;Booth results in the 2018 Batman by-election\u0026#39;) plt.show() Here GoogleTiles provides the street map to be used as the \u0026#34;base\u0026#34; of our map. Open Street Map (as OSM) is available too, but I think in this instance Google Maps is better. Because the map is rendered as an image (with some unavoidable blurring), I find that Google gave a better result than OSM.\nAlso, ga2 is a little array which simply produces plotting of the style ro (red circle) or go (green circle). Again, I make the program do most of the work.\nAnd here is the result, saved as an image:\n![Batman 2018](/batman2018trim.png)\nI\u0026#39;m quite pleased with this output.\nPresentations and the delight of js-reveal Presentations are a modern bugbear. Anybody in academia or business, or any professional field really, will have sat through untold hours of presentations. And almost all of them are terrible. Wordy, uninteresting, too many \u0026#34;transition effects\u0026#34;, low information content, you know as well as I do.\nPretty much every speaker reads the words on their slides, as though the audience were illiterate. I went to a talk once which consisted of 60 – yes, sixty – slides of very dense text, and the presenter read through each one. 
I think people were gnawing their own limbs off out of sheer boredom by the end. Andy Warhol\u0026#39;s \u0026#34;Empire\u0026#34; would have been a welcome relief.\nSince most of my talks are technical and full of mathematics, I have naturally gravitated to the LaTeX presentation tool Beamer. Now Beamer is a lovely thing for LaTeX: as part of the LaTeX ecosystem you get all of LaTeX loveliness along with elegant slide layouts, transitions, etc. My only issue with Beamer (and this is not a new observation by any means), is that all Beamer presentations have a certain sameness to them. I suspect that this is because most Beamer users are mathematicians, who are rightly more interested in content than appearance. It is quite possible of course to make Beamer look like something new and different, but hardly anybody does.\nHowever, I am not a mathematician, I am a mathematics educator, and I do like my presentations to look good, and if possible to stand out a little. I also have a minor issue in that I use Linux on my laptop, which sometimes means my computer won\u0026#39;t talk to an external projector system. Or my USB thumb drive won\u0026#39;t be recognized by the computer I\u0026#39;ll be using, and so on. One way round all this is to use an online system; maybe one which can be displayed in a browser, and which can be placed on a web server somewhere. There are of course plenty of such tools, and I have had a brief dalliance with prezi, but for me prezi was not the answer: yes it was fun and provided a new paradigm for organizing slides, but really, when you took the whizz-bang aspect out, what was left? The few prezis I\u0026#39;ve seen in the wild showed that you can be as dull with prezi as with any other software. Also, at the time it didn\u0026#39;t support mathematics.\nIn fact I have an abiding distrust of the whole concept of \u0026#34;presentations\u0026#34;. 
Most are a colossal waste of time – people can read so there\u0026#39;s no need for wordiness, and most of the graphs and charts that make up the rest of most slides are dreary and lacklustre. Hardly anybody knows how to present information graphically in a way that really grabs people\u0026#39;s attention. It\u0026#39;s lazy and insulting to your audience to simply copy a chart from your spreadsheet and assume they\u0026#39;ll be delighted by it. Then you have the large class of people who fill their blank spaces with cute cartoons and clip art. This sort of thing annoys me probably more than it should – when I\u0026#39;m in an audience I don\u0026#39;t want to be entertained with cute irrelevant additions, I want to learn. This comes to the heart of presenting. A presenter is acting as a teacher; the audience the learners. So presenting should be about engaging the audience. What\u0026#39;s in your slides comes a distant second. I don\u0026#39;t want new technology with clever animations and transitions, bookmarks, non-linear slide shows; I want presenters to be themselves interesting. (As an aside, some of the very worst presentations have been at education conferences.)\nFor a superb example of attention-grabbing graphics, check out the TED talk by the late Hans Rosling. Or you can admire the work of David McCandless.\nI seem to have digressed, from talking about presentation software to banging on about the awfulness of presentations generally. So, back to the topic.\nFor a recent conference I determined to do just that: use an online presentation tool, and I chose reveal.js. I reckon reveal.js is presentations done right: elegant, customizable, making the best use of html for content and css for design; and with nicely chosen defaults so that even if you just put a few words on your slides the result will still look good. 
Even better, you can take your final slides and put them up on github pages so that you can access them from anywhere in the world with a web browser. And if you\u0026#39;re going somewhere which is not networked, you can always take your slides on some sort of portable media. And it has access to almost all of LaTeX via MathJax.\nOne minor problem with reveal.js is that the slides are built up with raw html code, and so can be somewhat verbose and hard to read (at least for me). However, there is a companion software for emacs org mode called org-reveal, which enables you to structure your reveal.js presentation as an org file. This is presentation heaven. The org file gives you structure, and reveal.js gives you a lovely presentation.\nTo make it available, you upload all your presentations to github.pages, and you can present from anywhere in the world with an internet connection! You can see an example of one of my short presentations at\nhttps://amca01.github.io/ATCM_talks/lindenmayer.html\nOf course the presentation (the software and what you do with it), is in fact the least part of your talk. By far the most important part is the presenter. The best software in the world won\u0026#39;t overcome a boring speaker who can\u0026#39;t engage an audience.\nI like my presentations to be simple and effect-free; I don\u0026#39;t want the audience to be distracted from my leaping and capering about. Just to see how it works\nThe Vigenere cipher in haskell\u0026#xa0;\u0026#xa0;\u0026#xa0;cryptography\u0026#xa0;haskell Programming the Vigenère cipher is my go-to problem when learning a new language. It\u0026#39;s only ever a few lines of code, but it\u0026#39;s a pleasant way of getting to grips with some of the basics of syntax. 
For the past few weeks I\u0026#39;ve been wrestling with Haskell, and I\u0026#39;ve now got to the stage where a Vigenère program is in fact pretty easy.\nAs you know, the Vigenère cipher works using a plaintext and a keyword, which is repeated as often as need be:\nT H I S I S T H E P L A I N T E X T\nK E Y K E Y K E Y K E Y K E Y K E Y\nThe corresponding letters are added modulo 26 (using the values A=0, B=1, C=2, and on up to Z=25), then converted back to letters again. So for the example above, we have these corresponding values:\n19 7 8 18 8 18 19 7 4 15 11 0 8 13 19 4 23 19\n10 4 24 10 4 24 10 4 24 10 4 24 10 4 24 10 4 24\nAdding modulo 26 and converting back to letters:\n3 11 6 2 12 16 3 11 2 25 15 24 18 17 17 14 1 17\nD L G C M Q D L C Z P Y S R R O B R\ngives us the ciphertext.\nThe Vigenère cipher is historically important as it is one of the first cryptosystems where a single letter may be encrypted to different characters in the ciphertext. For example, the two \u0026#34;S\u0026#34;s are encrypted to \u0026#34;C\u0026#34; and \u0026#34;Q\u0026#34;; the first and last \u0026#34;T\u0026#34;s are encrypted to \u0026#34;D\u0026#34; and \u0026#34;R\u0026#34;. For this reason the cipher was considered unbreakable - as indeed it was for a long time - and was known to the French as le chiffre indéchiffrable - the unbreakable cipher. It was broken in 1863. See the Wikipedia page for more history.\nSuppose the length of the keyword is \\(n\\). Then the \\(i\\)-th character of the plaintext will correspond to the \\(i\\pmod{n}\\)-th character of the keyword (assuming a zero-based indexing). Thus the encryption can be defined as\n\\[ c_i = p_i+k_{i\\pmod{n}}\\pmod{26} \\]\nHowever, encryption can also be done without knowing the length of the keyword, but by shifting the keyword each time - first letter to the end - and simply taking the left-most letter. Like this:\nT H I S I S T H E P L A I N T E X T\nK E Y\nso \u0026#34;T\u0026#34;+\u0026#34;K\u0026#34; (modulo 26) is the first encryption.
Then we shift the keyword:\nT H I S I S T H E P L A I N T E X T\nE Y K\nand \u0026#34;H\u0026#34;+\u0026#34;E\u0026#34; (modulo 26) is the second encrypted letter. Shift again:\nT H I S I S T H E P L A I N T E X T\nY K E\nfor \u0026#34;I\u0026#34;+\u0026#34;Y\u0026#34;; shift again:\nT H I S I S T H E P L A I N T E X T\nK E Y\nfor \u0026#34;S\u0026#34;+\u0026#34;K\u0026#34;. And so on.\nThis is almost trivial in Haskell. We need two extra functions from the module Data.Char: chr which gives the character corresponding to the ASCII value, and ord which gives the ASCII value of a character:\nλ\u0026gt; ord \u0026#39;G\u0026#39;\n71\nλ\u0026gt; chr 88\n\u0026#39;X\u0026#39;\nSo here\u0026#39;s what might go into a little file called vigenere.hs:\nimport Data.Char (ord,chr)\n\nvige :: [Char] -\u0026gt; [Char] -\u0026gt; [Char]\nvige [] k = []\nvige p [] = []\nvige (p:ps) (k:ks) = (encode p k):(vige ps (ks++[k]))\n  where encode a b = chr $ 65 + mod (ord a + ord b) 26\n\nvigd :: [Char] -\u0026gt; [Char] -\u0026gt; [Char]\nvigd [] k = []\nvigd p [] = []\nvigd (p:ps) (k:ks) = (decode p k):(vigd ps (ks++[k]))\n  where decode a b = chr $ 65 + mod (ord a - ord b) 26\nAnd a couple of tests: the example from above, and the one on the Wikipedia page:\nλ\u0026gt; vige \u0026#34;THISISTHEPLAINTEXT\u0026#34; \u0026#34;KEY\u0026#34;\n\u0026#34;DLGCMQDLCZPYSRROBR\u0026#34;\nλ\u0026gt; vige \u0026#34;ATTACKATDAWN\u0026#34; \u0026#34;LEMON\u0026#34;\n\u0026#34;LXFOPVEFRNHR\u0026#34;\nAnalysis of a recent election\u0026#xa0;\u0026#xa0;\u0026#xa0;voting\u0026#xa0;python\nOn November 18, 2017, a by-election was held in my suburb of Northcote, on account of the death by cancer of the sitting member. It turned into a two-way contest between Labor (who had held the seat since its inception in 1927), and the Greens, who are making big inroads into the inner city. The Greens candidate won, much to Labor\u0026#39;s surprise. As I played a small part in this election, I had some interest in its result.
And so I thought I\u0026#39;d experiment with the results and see how close the result was, and what other voting systems might have produced.\nIn Australia, the voting method used for almost all lower house elections (state and federal), is Instant Runoff Voting, also known as the Alternative Vote, and known locally as the \u0026#34;preferential method\u0026#34;. Each voter must number the candidates sequentially starting from 1. All boxes must be filled in (except the last); no numbers can be repeated or missed. In Northcote there were 12 candidates, and so each voter had to number the boxes from 1 to 12 (or 1 to 11); any vote without those numbers is invalid and can\u0026#39;t be counted. Such votes are known as \u0026#34;informal\u0026#34;. Ballots are distributed according to first preferences. If no candidate has obtained an absolute majority, then the candidate with the lowest count is eliminated, and all those ballots distributed according to their second preferences. This continues through as many redistributions as necessary until one candidate ends up with an absolute majority of ballots. So at any stage the candidate with the lowest number of ballots is eliminated, and those ballots redistributed to the remaining candidates on the basis of the highest preferences. As voting systems go it\u0026#39;s not the worst, although it has many faults. However, it is too entrenched in Australian political life for change to be likely.\nEach candidate had prepared a How to Vote card, listing the order of candidates they saw as being most likely to ensure a good result for themselves. In fact there is no requirement for any voter to follow a How to Vote card, but most voters do. 
For this reason the ordering of candidates on these cards is taken very seriously, and one of the less savoury aspects of Australian politics is backroom \u0026#34;preference deals\u0026#34;, where parties will wheel and deal to ensure best possible preference positions on other How to Vote cards.\nHere are the 12 candidates, in the order as listed on the ballots:\nHayward, Sanaghan, Thorpe, Lenk, Chipp, Cooper, Rossiter, Burns, Toscano, Edwards, Spirovska, Fontana\nFor this election the How to Vote cards can be seen at the ABC news site. The only candidate not to provide a full ordered list was Joseph Toscano, who simply advised people to number his square 1, and the other squares in any order they liked, along with a recommendation for people to number Lidia Thorpe 2.\nAs I don\u0026#39;t have a complete list of all possible ballots with their orderings and numbers, I\u0026#39;m going to make the following assumptions:\nEvery voter followed the How to Vote card of their preferred candidate exactly.\nJoseph Toscano\u0026#39;s preference ordering is: 3,4,2,5,6,7,8,9,1,10,11,12 (This gives Toscano 1; Thorpe 2; and puts the numbers 3 – 12 in order in the remaining spaces).\nThese assumptions are necessarily crude, and don\u0026#39;t reflect the nuances of the election.
But as we\u0026#39;ll see they end up providing a remarkably close fit with the final results.\nFor the exploration of the voting data I\u0026#39;ll use Python, and so here is all the How to Vote information as a dictionary:\nIn [ ]: htv = dict()\n        htv[\u0026#39;Hayward\u0026#39;]=[1,10,7,6,8,5,12,11,3,2,4,9]\n        htv[\u0026#39;Sanaghan\u0026#39;]=[3,1,2,5,6,7,8,9,10,11,12,4]\n        htv[\u0026#39;Thorpe\u0026#39;]=[6,9,1,3,10,8,12,2,7,4,5,11]\n        htv[\u0026#39;Lenk\u0026#39;]=[7,8,3,1,5,11,12,2,9,4,6,10]\n        htv[\u0026#39;Chipp\u0026#39;]=[10,12,4,5,1,6,7,3,11,9,2,8]\n        htv[\u0026#39;Cooper\u0026#39;]=[5,12,8,6,2,1,7,3,11,9,10,4]\n        htv[\u0026#39;Rossiter\u0026#39;]=[6,12,9,11,2,7,1,5,8,10,3,4]\n        htv[\u0026#39;Burns\u0026#39;]=[10,12,5,3,2,4,6,1,11,9,8,7]\n        htv[\u0026#39;Toscano\u0026#39;]=[3,4,2,5,6,7,8,9,1,10,11,12]\n        htv[\u0026#39;Edwards\u0026#39;]=[2,10,4,3,8,9,12,6,5,1,7,11]\n        htv[\u0026#39;Spirovska\u0026#39;]=[2,12,3,7,4,5,6,8,10,9,1,11]\n        htv[\u0026#39;Fontana\u0026#39;]=[2,3,4,5,6,7,8,9,10,11,12,1]\nIn [ ]: cands = list(htv.keys())\nVoting took place at different voting centres (also known as \u0026#34;booths\u0026#34;), and the first preferences for each candidate at each booth can be found at the Victorian Electoral Commission. I copied this information into a spreadsheet and saved it as a CSV file. I then used the data analysis library pandas to read it in as a DataFrame:\nIn [ ]: import pandas as pd\n        firstprefs = pd.read_csv(\u0026#39;northcote_results.csv\u0026#39;)\n        firsts = firstprefs.loc[:,\u0026#39;Hayward\u0026#39;:\u0026#39;Fontana\u0026#39;].sum(axis=0)\n        firsts\nOut[ ]:\nHayward 354\nSanaghan 208\nThorpe 16254\nLenk 770\nChipp 1149\nCooper 433\nRossiter 1493\nBurns 12721\nToscano 329\nEdwards 154\nSpirovska 214\nFontana 1857\ndtype: int64\nAs Thorpe has more votes than any other candidate, then by the voting system of simple plurality (or First Past The Post) she would win.
This system is used in the USA, and is possibly the worst of all systems for more than two candidates.\nChecking IRV\nSo let\u0026#39;s first check how IRV works, with a little program that starts with a dictionary and first preferences of each candidate. Recall our simplifying assumption that all voters vote according to the How to Vote cards, which means that when a candidate is eliminated, all those votes will go to just one other remaining candidate. In practice, of course, those ballots would be redistributed across a number of candidates.\nHere\u0026#39;s a simple program to manage this version of IRV:\ndef IRV(votes):\n    # performs an IRV simulation on a list of first preferences: at each stage\n    # deleting the candidate with the lowest current score, and distributing\n    # that candidate\u0026#39;s votes to the highest remaining candidate\n    vote_counts = votes.copy()\n    for i in range(10):\n        # m is the (candidate, count) pair with the lowest current count\n        m = min(vote_counts.items(), key = lambda x: x[1])\n        # ind is the best preference on m\u0026#39;s card that names a remaining candidate\n        ind = next(j for j in range(2,13) if cands[htv[m[0]].index(j)] in vote_counts)\n        c = cands[htv[m[0]].index(ind)]\n        vote_counts[c] += m[1]\n        del(vote_counts[m[0]])\n    return(vote_counts)\nWe could make this code a little more efficient by stopping when any candidate has amassed over 50% of the votes. But for simplicity we\u0026#39;ll eliminate 10 of the 12 candidates, so it will be perfectly clear who has won. Let\u0026#39;s try it out:\nIn [ ]: IRV(firsts)\nOut[ ]:\nThorpe 18648\nBurns 17288\ndtype: int64\nNote that this is very close to the results listed on the VEC site:\nThorpe: 18380\nBurns: 14410\nFontana: 3298\nAt this stage it doesn\u0026#39;t matter where Fontana\u0026#39;s votes go (in fact they would go to Burns), as Thorpe already has a majority.
But the result we obtained above with our simplifying assumptions gives very similar values.\nNow let\u0026#39;s see what happens if we work through each booth independently:\nIn [ ]: finals = {\u0026#39;Thorpe\u0026#39;:0,\u0026#39;Burns\u0026#39;:0}\nIn [ ]: for i in firstprefs.index:\n   ...:     booth = dict(firstprefs.loc[i,\u0026#39;Hayward\u0026#39;:\u0026#39;Fontana\u0026#39;])\n   ...:     f = IRV(booth)\n   ...:     finals[\u0026#39;Thorpe\u0026#39;] += f[\u0026#39;Thorpe\u0026#39;]\n   ...:     finals[\u0026#39;Burns\u0026#39;] += f[\u0026#39;Burns\u0026#39;]\n   ...:     print(firstprefs.loc[i,\u0026#39;Booth\u0026#39;],\u0026#39;: \u0026#39;,f)\n   ...:\nAlphington : {\u0026#39;Thorpe\u0026#39;: 524, \u0026#39;Burns\u0026#39;: 545}\nAlphington North : {\u0026#39;Thorpe\u0026#39;: 408, \u0026#39;Burns\u0026#39;: 485}\nBell : {\u0026#39;Thorpe\u0026#39;: 1263, \u0026#39;Burns\u0026#39;: 893}\nCroxton : {\u0026#39;Thorpe\u0026#39;: 950, \u0026#39;Burns\u0026#39;: 668}\nDarebin Parklands : {\u0026#39;Thorpe\u0026#39;: 180, \u0026#39;Burns\u0026#39;: 204}\nFairfield : {\u0026#39;Thorpe\u0026#39;: 925, \u0026#39;Burns\u0026#39;: 742}\nNorthcote : {\u0026#39;Thorpe\u0026#39;: 1043, \u0026#39;Burns\u0026#39;: 875}\nNorthcote North : {\u0026#39;Thorpe\u0026#39;: 1044, \u0026#39;Burns\u0026#39;: 1012}\nNorthcote South : {\u0026#39;Thorpe\u0026#39;: 1392, \u0026#39;Burns\u0026#39;: 1137}\nPreston South : {\u0026#39;Thorpe\u0026#39;: 677, \u0026#39;Burns\u0026#39;: 639}\nThornbury : {\u0026#39;Thorpe\u0026#39;: 1158, \u0026#39;Burns\u0026#39;: 864}\nThornbury East : {\u0026#39;Thorpe\u0026#39;: 1052, \u0026#39;Burns\u0026#39;: 804}\nThornbury South : {\u0026#39;Thorpe\u0026#39;: 1310, \u0026#39;Burns\u0026#39;: 1052}\nWestgarth : {\u0026#39;Thorpe\u0026#39;: 969, \u0026#39;Burns\u0026#39;: 536}\nPostal Votes : {\u0026#39;Thorpe\u0026#39;: 1509, \u0026#39;Burns\u0026#39;: 2262}\nEarly Votes : {\u0026#39;Thorpe\u0026#39;: 5282, \u0026#39;Burns\u0026#39;: 3532}\nIn [ ]: finals\nOut[ ]: {\u0026#39;Burns\u0026#39;: 16250,
\u0026#39;Thorpe\u0026#39;: 19686}\nNote again that the results are surprisingly close to the \u0026#34;two-party preferred\u0026#34; results as reported again on the VEC site. This adds weight to the notion that our assumptions, although crude, do in fact provide a reasonable way of experimenting with the election results.\nBorda counts\nThese are named for Jean Charles de Borda (1733 – 1799), an early voting theorist. The idea is to weight all the preferences, so that a preference of 1 has a higher weighting than a preference of 2, and so on. All the weights are added, and the candidate with the greatest total is deemed to be the winner. With \\(n\\) candidates, there are different methods of determining weighting; probably the most popular is a simple linear weighting, so that a preference of \\(i\\) is weighted as \\(n-i\\). This gives weightings from \\(n-1\\) down to zero. Alternatively a weighting of \\(n-i+1\\) can be used, which gives weights of \\(n\\) down to \\(1\\). Both are equivalent in determining a winner. Another possible weighting is \\(1/i\\).\nHere\u0026#39;s a program to compute Borda counts, again with our simplification:\ndef borda(x):\n    # x is 0 or 1\n    borda_count = dict()\n    for c in cands:\n        borda_count[c]=0.0\n    for c in cands:\n        v = firsts[c] # number of 1st pref votes for candidate c\n        for i in range(1,13):\n            appr = cands[htv[c].index(i)] # the candidate against position i on c\u0026#39;s htv card\n            if x==0:\n                borda_count[appr] += v/i\n            else:\n                borda_count[appr] += v*(11-i)\n    if x==0:\n        for k, val in borda_count.items():\n            borda_count[k] = float(\u0026#34;{:.2f}\u0026#34;.format(val))\n    else:\n        for k, val in borda_count.items():\n            borda_count[k] = int(val)\n    return(borda_count)\nNow we can run this, and to make our lives easier we\u0026#39;ll sort the results:\nIn [ ]: sorted(borda(1).items(), key = lambda x: x[1], reverse = True)\nOut[ ]: [(\u0026#39;Burns\u0026#39;, 308240), (\u0026#39;Thorpe\u0026#39;, 279392), (\u0026#39;Lenk\u0026#39;, 266781), (\u0026#39;Chipp\u0026#39;, 179179), (\u0026#39;Cooper\u0026#39;, 167148), (\u0026#39;Spirovska\u0026#39;, 165424),
(\u0026#39;Edwards\u0026#39;, 154750), (\u0026#39;Hayward\u0026#39;, 136144), (\u0026#39;Fontana\u0026#39;, 88988), (\u0026#39;Toscano\u0026#39;, 80360), (\u0026#39;Rossiter\u0026#39;, 75583), (\u0026#39;Sanaghan\u0026#39;, 38555)] In [ ]: sorted(borda(0).items(), key = lambda x: x[1], reverse = True) Out[ ]: [(\u0026#39;Burns\u0026#39;, 22409.53), (\u0026#39;Thorpe\u0026#39;, 20455.29), (\u0026#39;Lenk\u0026#39;, 11485.73), (\u0026#39;Chipp\u0026#39;, 10767.9), (\u0026#39;Spirovska\u0026#39;, 6611.22), (\u0026#39;Cooper\u0026#39;, 6592.5), (\u0026#39;Edwards\u0026#39;, 6569.93), (\u0026#39;Hayward\u0026#39;, 6186.93), (\u0026#39;Fontana\u0026#39;, 6006.25), (\u0026#39;Rossiter\u0026#39;, 5635.08), (\u0026#39;Toscano\u0026#39;, 4600.15), (\u0026#39;Sanaghan\u0026#39;, 4196.47)] Note that in both cases Burns has the highest output. This is in general to be expected of Borda counts: that the highest value does not necessarily correspond to the candidate which is seen as better overall. For this reason Borda counts are rarely used in modern systems, although they can be used to give a general picture of an electorate.\nCondorcet criteria There are a vast number of voting systems which treat the vote as simultaneous pairwise contests. For example in a three way contest, between Alice, Bob, and Charlie the system considers the contest between Alice and Bob, between Alice and Charlie, and between Bob and Charlie. Each of these contests will produce a winner, and the outcome of all the pairwise contests is used to determine the overall winner. If there is a single person who is preferred, by a majority of voters, in each of their pairwise contests, then that person is called a Condorcet winner. This is named for the Marquis de Condorcet (1743 – 1794) another early voting theorist. 
The Condorcet criterion is one of many criteria considered appropriate for a voting system; it says that if the ballots return a Condorcet winner, then that winner should be chosen by the system. This is one of the faults of IRV: that it does not necessarily return a Condorcet winner.\nLet\u0026#39;s look again at the How to Vote preferences, and the numbers of voters of each:\nIn [ ]: htvd = pd.DataFrame(list(htv.values()),index=htv.keys(),columns=htv.keys()).transpose() In [ ]: htvd.loc[\u0026#39;Firsts\u0026#39;]=list(firsts.values) In [ ]: htvd Out[ ]: Hayward Sanaghan Thorpe Lenk Chipp Cooper Rossiter Burns Toscano Edwards Spirovska Fontana Hayward 1 3 6 7 10 5 6 10 3 2 2 2 Sanaghan 10 1 9 8 12 12 12 12 4 10 12 3 Thorpe 7 2 1 3 4 8 9 5 2 4 3 4 Lenk 6 5 3 1 5 6 11 3 5 3 7 5 Chipp 8 6 10 5 1 2 2 2 6 8 4 6 Cooper 5 7 8 11 6 1 7 4 7 9 5 7 Rossiter 12 8 12 12 7 7 1 6 8 12 6 8 Burns 11 9 2 2 3 3 5 1 9 6 8 9 Toscano 3 10 7 9 11 11 8 11 1 5 10 10 Edwards 2 11 4 4 9 9 10 9 10 1 9 11 Spirovska 4 12 5 6 2 10 3 8 11 7 1 12 Fontana 9 4 11 10 8 4 4 7 12 11 11 1 Firsts 354 208 16254 770 1149 433 1493 12721 329 154 214 1857 Here the how to vote information is in the columns. If we look at just the first two candidates, we see that Hayward is preferred to Sanaghan by all voters except for those who voted for Sanaghan. 
Thus a majority (in fact, nearly all) of the voters preferred Hayward to Sanaghan.\nFor each pair of candidates, the number of voters preferring one to the other can be computed by this program:\ndef condorcet():\n    condorcet_table = pd.DataFrame(columns=cands,index=cands).fillna(0)\n    for c in cands:\n        hc = htv[c] # the htv card of candidate c\n        for i in range(12):\n            for j in range(12):\n                if hc[i] \u0026lt; hc[j]:\n                    # voters for c prefer candidate i to candidate j\n                    condorcet_table.loc[cands[i],cands[j]] += firsts[c]\n    return(condorcet_table)\nWe can see the results of this program:\nIn [ ]: ct = condorcet(); ct\nOut[ ]:\nHayward Sanaghan Thorpe Lenk Chipp Cooper Rossiter Burns Toscano Edwards Spirovska Fontana\nHayward 0 35728 4505 5042 19370 21633 20573 3116 35607 4888 3335 18283\nSanaghan 208 0 2065 2394 18648 3164 19926 2748 2835 2394 2394 17715\nThorpe 31431 33871 0 21504 20140 20935 34010 19370 33760 35428 32726 32153\nLenk 30894 33542 14432 0 19926 33442 34229 3886 33760 33935 32726 31945\nChipp 16566 17288 15796 16010 0 18895 34443 6037 18845 18404 18960 33871\nCooper 14303 32772 15001 2494 17041 0 34443 3395 18075 18404 15548 31608\nRossiter 15363 16010 1926 1707 1493 1493 0 4101 18075 18404 17041 15906\nBurns 32820 33188 16566 32050 29899 32541 31835 0 35099 35428 32726 32024\nToscano 329 33101 2176 2176 17091 17861 17861 837 0 3887 2902 18075\nEdwards 31048 33542 508 2001 17532 17532 17532 508 32049 0 20359 18075\nSpirovska 32601 33542 3210 3210 16976 20388 18895 3210 33034 15577 0 20717\nFontana 17653 18221 3783 3991 2065 4328 20030 3912 17861 17861 15219 0\nWhat we want to see, of course, is whether anybody has obtained a majority of preferences against everybody else. To do this we can find, in each row, all the values which reach the majority, and count them.
A value of 11 indicates a Condorcet winner:\nIn [ ]: maj = firsts.sum()//2 + 1; maj\nOut[ ]: 17969\nIn [ ]: ((ct \u0026gt;= maj)*1).sum(axis = 1)\nOut[ ]:\nHayward 6.0\nSanaghan 2.0\nThorpe 11.0\nLenk 9.0\nChipp 6.0\nCooper 5.0\nRossiter 2.0\nBurns 10.0\nToscano 2.0\nEdwards 5.0\nSpirovska 6.0\nFontana 2.0\ndtype: float64\nSo in this case we do indeed have a Condorcet winner in Thorpe, and this election (at least with our simplifying assumptions) is also one in which IRV returned the Condorcet winner.\nRange and approval voting\nIf you go to rangevoting.org you\u0026#39;ll find a spirited defense of a system called range voting. To vote in such a system, each voter gives an \u0026#34;approval weight\u0026#34; for each candidate. For example, the voter may mark off a value between 0 and 10 against each candidate, indicating their level of approval. There is no requirement for a voter to mark candidates differently: a voter might give all candidates a value of 10, or of zero, or give one candidate 10 and all the others zero. One simplified version of range voting is approval voting, where the voter simply indicates as many or as few candidates as she or he approves of. A voter may approve of just one candidate, or all of them. As with range voting, the winner is the one with the maximum number of approvals. A system where each voter approves of just one candidate is the First Past the Post system, and as we have seen previously, this is equivalent to simply counting only the first preferences of our ballots.\nWe can\u0026#39;t possibly know how voters may have approved of the candidates, but we can run a simple simulation: given a number \\(n\\) between 1 and 12, suppose that each voter approves of their first \\(n\\) preferences.
Given the preferences and numbers, we can easily tally the approvals for each candidate:\ndef approvals(n):\n    # Determines the approvals result if voters took their\n    # first n preferences as approvals\n    approvals_result = dict()\n    for c in cands:\n        approvals_result[c] = 0\n    firsts = firstprefs.loc[:,\u0026#39;Hayward\u0026#39;:\u0026#39;Fontana\u0026#39;].sum(axis=0)\n    for c in cands:\n        v = firsts[c] # number of 1st pref votes for candidate c\n        for i in range(1,n+1):\n            appr = cands[htv[c].index(i)] # the candidate against position i on c\u0026#39;s htv card\n            approvals_result[appr] += v\n    return(approvals_result)\nNow we can see what happens with approvals for \\(n = 1, 2, \\ldots, 6\\):\nIn [ ]: for i in range(1,7):\n   ...:     si = sorted(approvals(i).items(),key = lambda x: x[1],reverse=True)\n   ...:     print([i]+[s[0] for s in si])\n   ...:\n[1, \u0026#39;Thorpe\u0026#39;, \u0026#39;Burns\u0026#39;, \u0026#39;Fontana\u0026#39;, \u0026#39;Rossiter\u0026#39;, \u0026#39;Chipp\u0026#39;, \u0026#39;Lenk\u0026#39;, \u0026#39;Cooper\u0026#39;, \u0026#39;Hayward\u0026#39;, \u0026#39;Toscano\u0026#39;, \u0026#39;Spirovska\u0026#39;, \u0026#39;Sanaghan\u0026#39;, \u0026#39;Edwards\u0026#39;]\n[2, \u0026#39;Burns\u0026#39;, \u0026#39;Thorpe\u0026#39;, \u0026#39;Chipp\u0026#39;, \u0026#39;Hayward\u0026#39;, \u0026#39;Fontana\u0026#39;, \u0026#39;Rossiter\u0026#39;, \u0026#39;Spirovska\u0026#39;, \u0026#39;Lenk\u0026#39;, \u0026#39;Edwards\u0026#39;, \u0026#39;Cooper\u0026#39;, \u0026#39;Toscano\u0026#39;, \u0026#39;Sanaghan\u0026#39;]\n[3, \u0026#39;Burns\u0026#39;, \u0026#39;Lenk\u0026#39;, \u0026#39;Thorpe\u0026#39;, \u0026#39;Chipp\u0026#39;, \u0026#39;Hayward\u0026#39;, \u0026#39;Spirovska\u0026#39;, \u0026#39;Sanaghan\u0026#39;, \u0026#39;Fontana\u0026#39;, \u0026#39;Rossiter\u0026#39;, \u0026#39;Toscano\u0026#39;, \u0026#39;Edwards\u0026#39;, \u0026#39;Cooper\u0026#39;]\n[4, \u0026#39;Burns\u0026#39;, \u0026#39;Lenk\u0026#39;, \u0026#39;Thorpe\u0026#39;, \u0026#39;Edwards\u0026#39;, \u0026#39;Chipp\u0026#39;, \u0026#39;Cooper\u0026#39;,
\u0026#39;Fontana\u0026#39;, \u0026#39;Spirovska\u0026#39;, \u0026#39;Hayward\u0026#39;, \u0026#39;Sanaghan\u0026#39;, \u0026#39;Rossiter\u0026#39;, \u0026#39;Toscano\u0026#39;]\n[5, \u0026#39;Thorpe\u0026#39;, \u0026#39;Lenk\u0026#39;, \u0026#39;Burns\u0026#39;, \u0026#39;Spirovska\u0026#39;, \u0026#39;Edwards\u0026#39;, \u0026#39;Chipp\u0026#39;, \u0026#39;Cooper\u0026#39;, \u0026#39;Fontana\u0026#39;, \u0026#39;Hayward\u0026#39;, \u0026#39;Sanaghan\u0026#39;, \u0026#39;Rossiter\u0026#39;, \u0026#39;Toscano\u0026#39;]\n[6, \u0026#39;Lenk\u0026#39;, \u0026#39;Thorpe\u0026#39;, \u0026#39;Burns\u0026#39;, \u0026#39;Hayward\u0026#39;, \u0026#39;Spirovska\u0026#39;, \u0026#39;Chipp\u0026#39;, \u0026#39;Edwards\u0026#39;, \u0026#39;Cooper\u0026#39;, \u0026#39;Rossiter\u0026#39;, \u0026#39;Fontana\u0026#39;, \u0026#39;Sanaghan\u0026#39;, \u0026#39;Toscano\u0026#39;]\nIt\u0026#39;s remarkable that, after \\(n=1\\), the first number of approvals required for Thorpe again to win is \\(n=5\\).\nOther election methods\nThere are of course many, many other methods of selecting a winning candidate from ordered ballots. And each of them has advantages and disadvantages. Some of the disadvantages are subtle (although important); others have glaring inadequacies, such as first past the post for more than two candidates. One such comparison table lists voting methods against standard criteria. Note that IRV – the Australian preferential system – is one of the very few methods to fail monotonicity. This is seen as one of the system\u0026#39;s worst failings. You can see an example of this in an old blog post.\nRather than write our own programs, we shall simply dump our information into the Ranked-ballot voting calculator page and see what happens.
First the data needs to be massaged into an appropriate form:\nIn [ ]: for c in cands:\n   ...:     st = str(firsts[c])+\u0026#34;:\u0026#34;+c\n   ...:     for i in range(2,13):\n   ...:         st += \u0026#34;\u0026gt;\u0026#34;+cands[htv[c].index(i)]\n   ...:     print(st)\n   ...:\n354:Hayward\u0026gt;Edwards\u0026gt;Toscano\u0026gt;Spirovska\u0026gt;Cooper\u0026gt;Lenk\u0026gt;Thorpe\u0026gt;Chipp\u0026gt;Fontana\u0026gt;Sanaghan\u0026gt;Burns\u0026gt;Rossiter\n208:Sanaghan\u0026gt;Thorpe\u0026gt;Hayward\u0026gt;Fontana\u0026gt;Lenk\u0026gt;Chipp\u0026gt;Cooper\u0026gt;Rossiter\u0026gt;Burns\u0026gt;Toscano\u0026gt;Edwards\u0026gt;Spirovska\n16254:Thorpe\u0026gt;Burns\u0026gt;Lenk\u0026gt;Edwards\u0026gt;Spirovska\u0026gt;Hayward\u0026gt;Toscano\u0026gt;Cooper\u0026gt;Sanaghan\u0026gt;Chipp\u0026gt;Fontana\u0026gt;Rossiter\n770:Lenk\u0026gt;Burns\u0026gt;Thorpe\u0026gt;Edwards\u0026gt;Chipp\u0026gt;Spirovska\u0026gt;Hayward\u0026gt;Sanaghan\u0026gt;Toscano\u0026gt;Fontana\u0026gt;Cooper\u0026gt;Rossiter\n1149:Chipp\u0026gt;Spirovska\u0026gt;Burns\u0026gt;Thorpe\u0026gt;Lenk\u0026gt;Cooper\u0026gt;Rossiter\u0026gt;Fontana\u0026gt;Edwards\u0026gt;Hayward\u0026gt;Toscano\u0026gt;Sanaghan\n433:Cooper\u0026gt;Chipp\u0026gt;Burns\u0026gt;Fontana\u0026gt;Hayward\u0026gt;Lenk\u0026gt;Rossiter\u0026gt;Thorpe\u0026gt;Edwards\u0026gt;Spirovska\u0026gt;Toscano\u0026gt;Sanaghan\n1493:Rossiter\u0026gt;Chipp\u0026gt;Spirovska\u0026gt;Fontana\u0026gt;Burns\u0026gt;Hayward\u0026gt;Cooper\u0026gt;Toscano\u0026gt;Thorpe\u0026gt;Edwards\u0026gt;Lenk\u0026gt;Sanaghan
12721:Burns\u0026gt;Chipp\u0026gt;Lenk\u0026gt;Cooper\u0026gt;Thorpe\u0026gt;Rossiter\u0026gt;Fontana\u0026gt;Spirovska\u0026gt;Edwards\u0026gt;Hayward\u0026gt;Toscano\u0026gt;Sanaghan\n329:Toscano\u0026gt;Thorpe\u0026gt;Hayward\u0026gt;Sanaghan\u0026gt;Lenk\u0026gt;Chipp\u0026gt;Cooper\u0026gt;Rossiter\u0026gt;Burns\u0026gt;Edwards\u0026gt;Spirovska\u0026gt;Fontana\n154:Edwards\u0026gt;Hayward\u0026gt;Lenk\u0026gt;Thorpe\u0026gt;Toscano\u0026gt;Burns\u0026gt;Spirovska\u0026gt;Chipp\u0026gt;Cooper\u0026gt;Sanaghan\u0026gt;Fontana\u0026gt;Rossiter\n214:Spirovska\u0026gt;Hayward\u0026gt;Thorpe\u0026gt;Chipp\u0026gt;Cooper\u0026gt;Rossiter\u0026gt;Lenk\u0026gt;Burns\u0026gt;Edwards\u0026gt;Toscano\u0026gt;Fontana\u0026gt;Sanaghan\n1857:Fontana\u0026gt;Hayward\u0026gt;Sanaghan\u0026gt;Thorpe\u0026gt;Lenk\u0026gt;Chipp\u0026gt;Cooper\u0026gt;Rossiter\u0026gt;Burns\u0026gt;Toscano\u0026gt;Edwards\u0026gt;Spirovska\nThe above can be copied and pasted into the given text box. Then the page returns:\nwinner: method(s)\nThorpe: Baldwin, Black, Carey, Coombs, Copeland, Dodgson, Hare, Nanson, Raynaud, Schulze, Simpson, Small, Tideman\nBurns: Borda, Bucklin\nYou can see that Thorpe would be the winner under almost every other voting system. This indicates that Thorpe being returned by IRV seems not just an artifact of the system, but represents the genuine wishes of the electorate.\nProgrammable CAD\u0026#xa0;\u0026#xa0;\u0026#xa0;CAD\nEvery few years I decide to have a go at using a CAD package for the creation of 3D diagrams and shapes, and every time I give it up. There\u0026#39;s simply too much to learn in terms of creating shapes, moving them about, and so on, and every system seems to have its own ways of doing things.
My son (who is an expert in Blender) recommended that I experiment with Tinkercad, and indeed this is probably a pretty easy way of getting started with 3D CAD. But it didn\u0026#39;t suit me: I wanted to place things precisely in relation to each other, and dragging and dropping with the mouse was harder and more inconvenient than it should have been. No doubt there are ways of lining things up exactly, but they aren\u0026#39;t obvious to the raw beginner.\nI then discovered that there are lots of different CAD \u0026#34;programming languages\u0026#34;, or more properly scripting languages, where the user describes in the system\u0026#39;s language how the figure is to be built, and the system then builds it from the script. In this sense these systems are descendants of the venerable VRML, of which you can see some examples here, and its modern version X3D.\nSome of the systems that I looked at were:\nOpenSCAD, which uses its own scripting language\nOpenJSCAD, based on JavaScript\nImplicitCAD, based on Haskell\nNo doubt there are others. All of these systems have primitive shapes (spheres, cubes, cylinders etc) and operations on shapes (shifting, stretching, rotating, extruding etc), so a vast array of different forms can be generated. Some systems allow for a great deal of flexibility, so that a cylinder with a radius of zero at one end will be a cone, and one with different radii at each end a frustum.\nI ended up choosing OpenJSCAD, which is being actively developed, is based on a well known and robust language, and is also great fun to use. Here is a simple example, to construct a tetrahedron whose vertices are chosen from the vertices of a cube with vertices \\((\\pm 1,\\pm 1,\\pm 1)\\). The four vertices whose coordinates have a product of 1 will be the vertices of a tetrahedron. We can make a nice tetrahedral shape by putting a small sphere at each vertex, and joining each pair of spheres by a cylinder of the same radius:\nThe code should be fairly self-explanatory. 
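The vertex selection described above can also be checked in a few lines of Python. This is only an illustrative sketch and not the OpenJSCAD script: it assumes the cube has vertices at (+-1, +-1, +-1), and the names cube, tetra, edges and dist2 are made up for the example.

```python
from itertools import combinations, product

# All eight vertices of the cube, assumed here to be (+-1, +-1, +-1).
cube = list(product((1, -1), repeat=3))

# Keep only the vertices whose three coordinates multiply to 1.
tetra = [v for v in cube if v[0] * v[1] * v[2] == 1]

# Every pair of chosen vertices is joined by an edge
# (one cylinder per pair in the CAD model).
edges = list(combinations(tetra, 2))


def dist2(p, q):
    # Squared Euclidean distance between two points.
    return sum((a - b) ** 2 for a, b in zip(p, q))


# If the tetrahedron is regular, this set has exactly one element.
edge_lengths_squared = {dist2(p, q) for p, q in edges}

print(tetra)
print(len(edges), edge_lengths_squared)
```

Four vertices survive the filter, and all six edges have the same squared length, so the tetrahedron is regular. In a CAD script the entries of tetra would be the sphere centres and the entries of edges the endpoints of the cylinders.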
And here is the tetrahedron:\nI won\u0026#39;t put these models in this post, as one of them is slow to render, but look at a coloured tetrahedron, and an icosahedron.\nNote that CAD design of this sort is aimed less at animated media than at precise designs for 3D printing. But I like it for exploring 3D geometry.\n","link":"https://numbersandshapes.net/blog_posts/","section":"","tags":null,"title":""},{"body":"","link":"https://numbersandshapes.net/categories/","section":"categories","tags":null,"title":"Categories"}]