Speeds of Julia and Python

Share on:


Python is of course one of the world's currently most popular languages, and there are plenty of statistics to show it. Of all languages in current use, Python is one of the oldest (in the very quick time-scale of programming languages) dating from 1990 - only C and its variants are older. However, it seems to keep its eternal youth by being re-invented, and by its constantly increasing libraries. Indeed, one of Python's greatest strength is its libraries, and pretty much every Python user will have worked with numpy, scipy, matplotlib, pandas, to name but four. In fact, aside from some specialized applications (mainly involving security, speed, or memory) Python can be happily used for almost everything.

Julia on the other hand is newer, dating from 2012. (Only Swift is newer.) It was designed to have the speed of C, the power of Matlab, and the ease of use of Python. Note the comparison with Matlab - Julia was designed as a language for technical computing, although it is very much a general purpose language. It can even be used for low-level systems programming.

Like Python, Julia can be extended through packages, of which there are many: according to Julia's package repository there are 2554 at the time of writing. Some of the packages are big, mature, and robust, others are smaller or represent a niche interest. You can go to Julia Observer to get a sense of which packages are the most popular, largest, have the most commits on github, and so on. Because Julia is still relatively new, packages are still being actively developed. However, some such as Plots, JuMP for optimization, Differential Equations, to name but three, are very much ready for the Big Time.

The purpose of this post is to do a single comparison of Julia and Python for speed.

Matrix permanents

Given a square matrix, its determinant is a well-known and useful construct (in spite of Sheldon Axler).

The determinant of an \(n\times n\) matrix \(M=m_{ij}\) can be formally defined as

\[ \det(M)=\sum_{\sigma\in S_n}\left(\text{sgn}(\sigma)\prod_{i=1}^nm_{i,\sigma(i)}\right) \]

where the sum is taken over all permutations \(\sigma\) of \(1,2,\ldots,n\), and where \(\text{sgn}(\sigma)\) is the sign of the permutation; which is defined in terms of the number of digit swaps to get to it: an even number of swaps has a sign of 1, and an odd number a sign of \(-1\). The determinant can be effectively computed by Gaussian elimination of a matrix into triangular form, which takes in the order of \(n^3\) operations; the determinant is then the product of the diagonal elements.

The permanent is defined similarly, except for the sign:

\[ \text{per}(M)=\sum_{\sigma\in S_n}\prod_{i=1}^nm_{i,\sigma(i)}. \]

Remarkably enough, this simple change renders the permanent impossible to be computed effectively; all known algorithms have exponential orders. Computing by expanding each permutation takes \(O(n!n)\) operations, some better algorithms (such as Ryser's algorithm) have order \(O(2^{n-1}n)\).

The permanent has some applications, although not as many as the determinant. An easy and immediate result is that if \(M\) is a matrix consisting entirely of ones, except for the main diagonal of zeros (so that it is the "ones complement" of the identity matrix), its permanent is the number of derangements of \(n\) objects; that is, the number of permutations in which there are no fixed points.

First Python. Here is a simple program, saved as permanent.py to compute the permanent from its definition:

 1import itertools as it
 2import numpy as np
 4def permanent(m):
 5    nr,nc = np.shape(m)
 6    if nr != nc:
 7      raise ValueError("Matrix must be square")
 8    pm = 0
 9    for p in it.permutations(range(nr)):
10      pm += np.product([m[i,p[i]] for i in range(nr)])
11    return pm

I am not interested in optimizing speed; simply to implement the same algorithm in Python and Julia to see what happens. Now lets run this in a Python REPL (I'm using IPython here):

1In [1]: import permanent as pt
3In [2]: import numpy as np
5In [3]: M = (1 - np.identity(4)).astype(np.intc)
7In [4]: pt.permanent(M)
8Out[4]: 9

and this is correct. This result was practically instantaneous, but it slows down appreciably, as you'd expect, for larger matrices:

 1In [5]: from timeit import default_timer as timer
 3In [6]: M = (1 - np.identity(8)).astype(np.intc)
 5In [7]: t = timer();print(pt.permanent(M));timer()-t
 7Out[7]: 0.7398275199811906
 9In [8]: M = (1 - np.identity(9)).astype(np.intc)
11In [9]: t = timer();print(pt.permanent(M));timer()-t
13Out[9]: 10.244881154998438
15In [10]: M = (1 - np.identity(10)).astype(np.intc)
17In [11]: t = timer();print(pt.permanent(M));timer()-t
19Out[11]: 86.57762016600464

Now no doubt this could be speeded up in numerous ways, but that is not my point: I am simply implementing the same algorithm in each language. At any rate, my elementary program becomes effectively unusable for matrices bigger than about \(8\times 8\).

Now for Julia. Again, we start with a simple program:

 1using Combinatorics
 3function permanent(m)
 4    nr,nc = size(m)
 5    if nr != nc
 6      error("Matrix must be square")
 7    end
 8    pm = 0
 9    for p in permutations(1:nr)
10      pm += prod(m[i,p[i]] for i in 1:nr)
11    end
12    return(pm)

You can see this program and the Python one above are, to all intents and purposes, identical. There are no clever optimizing tricks, it is a raw implementation of the basic definition.

First, a quick test:

1julia> using LinearAlgebra
3julia> M = 1 .- Matrix(1I,4,4);
5julia> include("permanent.jl")
7julia> permanent(M)

So far, so good. Now for some time trials:

 1julia> using BenchmarkTools
 3julia> M = 1 .- Matrix(1I,8,8);
 5julia> @time permanent(M)
 60.020514 seconds (201.61 k allocations: 14.766 MiB)
 9julia> M = 1 .- Matrix(1I,9,9);
11julia> @time permanent(M)
12  0.245049 seconds (1.81 M allocations: 143.965 MiB, 33.73% gc time)
15julia> M = 1 .- Matrix(1I,10,10);
17julia> @time permanent(M)
18  1.336724 seconds (18.14 M allocations: 1.406 GiB, 3.20% gc time)

You'll see that Julia, thanks to its JIT compiler, is much much faster than Python. The point is that I didn't have to do anything here to access that speed, it's just a splendid part of the language.

Winner: Julia, by a country mile.

A few words at the end

The timings given above are not absolute - running on a different system or with different versions of Python, Julia, and their libraries, will give different results. But the point is not the exact times taken, but the comparison of time between Julia and Python.

For what it's worth, I'm running a fairly recently upgraded version of Arch Linux on a Lenovo Thinkpad X1 Carbon, generation 3. I'm running Julia 1.3.0 and Python 3.7.4. The machine has 8Gb of memory, of which about 2Gb were free.