Fitting the SIR model of disease to data in Julia

Share on:

A few posts ago I showed how to do this in Python. Now it's Julia's turn. The data is the same: spread of influenza in a British boarding school with a population of 762. This was reported in the British Medical Journal on March 4, 1978, and you can read the original short article here.

As before we use the SIR model, with equations

\begin{align*} \frac{dS}{dt}&=-\frac{\beta IS}{N}\\ \frac{dI}{dt}&=\frac{\beta IS}{N}-\gamma I\\ \frac{dR}{dt}&=\gamma I \end{align*}

where \(S\), \(I\), and \(R\) are the numbers of susceptible, infected, and recovered people. This model assumes a constant population - so no births or deaths - and that once recovered, a person is immune. There are more complex models which include a changing population, as well as other disease dynamics.

The above equations can be written without the population; since it is constant, we can just write \(\beta\) instead of \(\beta/N\). The values \(\beta\) and \(\gamma\) are the parameters which affect the working of this model, their values and that of their ratio \(\beta/\gamma\) provide information of the speed of the disease spread.

As with Python, our interest will be to see if we can find values of \(\beta\) and \(\gamma\) which model the school outbreak.

We will do this in three functions. The first sets up the differential equations:

1using DifferentialEquations
2
3function SIR!(du,u,p,t)
4    S,I,R = u
5    β,γ = p
6    du[1] = dS = -β*I*S
7    du[2] = dI = β*I*S - γ*I
8    du[3] = dR = γ*I
9end

The next function determines the sum of squares between the data and the results of the SIR computations for given values of the parameters. Since we will put all our functions into one file, we can create the constant values outside any functions which might need them:

 1data = [1, 3, 6, 25, 73, 222, 294, 258, 237, 191, 125, 69, 27, 11, 4]
 2tspan = (0.0,14.0)
 3u0 = [762.0,1.0,0.0]
 4
 5function ss(x)
 6    prob = ODEProblem(SIR!,u0,tspan,(x[1],x[2]))
 7    sol = solve(prob)
 8    sol_data = sol(0:14)[2,:]
 9    return(sum((sol_data - data) .^2))
10end

Note that we don't have to carefully set up the problem to produce values at each of the data points, which in our case are the integers from 0 to 14. Julia will use a standard numerical technique with a dynamic step size, and values corresponding to the data points can then be found by interpolation. All of this functionality is provided by the DifferentialEquations package. For example, R[10] will return the 10th value of the list of computed R values, but R(10) will produce the interpolated value of R at \(t=10\).

Finally we use the Optim package to minimize the sum of squares, and the Plots package to plot the result:

 1using Optim
 2using Plots
 3
 4function run_optim()
 5    opt = optimize(ss,[0.001,0.01],NelderMead())
 6    beta,gamma = opt.minimizer
 7    prob = ODEProblem(SIR!,u0,tspan,(beta,gamma))
 8    sol = solve(prob)
 9    plot(sol,
10	    linewidth=2,
11	    xaxis="Time in days",
12	    label=["Susceptible" "Infected" "Recovered"])
13    plot!([0:14],data,linestyle=:dash,marker=:circle,markersize=4,label="Data")
14end

Running the last function will produce values

\begin{align*} \beta&=0.0021806887934782853\\ \gamma&=0.4452595474326912 \end{align*}

and the final plot looks like this:

!Julia SIR plot

This was at least as easy as in Python, and with a few extra bells and whistles, such as interpolation of data points. Nice!