The Gini coefficient and income inequality in Australia

According to Australia’s national broadcasting network, the ABC, inequality is increasing. This claim has been refuted by several academics. There are many different measures of inequality, but here I’m going to look at just one: the Gini coefficient, named for Corrado Gini, who first discussed it in 1912.
Suppose we have a list of incomes, together with the number of people earning each income (we may assume that each number here refers to thousands), for example:

Population  Income
----------  ------
   741          0
   381        200
   692        400
   778        600
    20        800
   662       1000
   228       1200
   796       1400
   221       1600
    51       1800
   361       2000

The only restriction is that the incomes must be listed in increasing order. Start by forming the cumulative sums of both the incomes and the populations, then divide each cumulative sum by its total so that the values lie between 0 and 1:

Scaled cumulative population     Scaled cumulative income	 
----------------------------     ------------------------
           0.15027                        0.00000	
           0.22754                        0.01818	
           0.36788                        0.05455	
           0.52565                        0.10909	
           0.52971                        0.18182	
           0.66396                        0.27273	
           0.71020                        0.38182	
           0.87163                        0.50909	
           0.91645                        0.65455	
           0.92679                        0.81818	
           1.00000                        1.00000

The right column is now a fraction of the total income earned, and the left column is the fraction of the population who earn up to that income.
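
The two columns above can be reproduced in a few lines of GNU Octave. This is only a minimal sketch: the names `people` and `income` are my own choice for the raw columns, while `cp` and `ci` hold the scaled cumulative population and income.

```octave
% Example table: number of people at each income level, and the income levels
people = [741 381 692 778 20 662 228 796 221 51 361]';
income = [0 200 400 600 800 1000 1200 1400 1600 1800 2000]';

% Running totals divided by the grand total, so each column ends at 1
cp = cumsum(people) / sum(people);   % scaled cumulative population
ci = cumsum(income) / sum(income);   % scaled cumulative income

disp([cp ci])
```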

Plot the scaled cumulative income against the scaled cumulative population; the result is a curve lying on or below the line y=x, known as a Lorenz curve y=L(x).

If income is perfectly equal, then for any fraction p between 0 and 1, that fraction of the population will earn that fraction of total income, and the Lorenz curve will be the straight line y=x. The Gini coefficient is defined to be the fraction:

    \[ \frac{\mbox{area between the line $y=x$ and the Lorenz curve }} {\mbox{area under the line $y=x$}} \]

or more simply, since the area under the line y=x is 1/2,

    \begin{align*} G&=2\int^1_0\bigl(x-L(x)\bigr)\,dx\\ &=1-2\int^1_0L(x)\,dx. \end{align*}

Thus the larger the Gini coefficient, the more unequal the income. A population which enjoys perfectly equal incomes will have a Gini coefficient of zero.
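
As a quick worked example, take the Lorenz curve L(x)=x^2. This curve is a hypothetical choice for illustration only, not one fitted to any data here. The second form of the integral gives

    \[ G=1-2\int^1_0x^2\,dx=1-\frac{2}{3}=\frac{1}{3} \]

and the first form agrees:

    \[ G=2\int^1_0\bigl(x-x^2\bigr)\,dx=2\left(\frac{1}{2}-\frac{1}{3}\right)=\frac{1}{3} \]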

Given a discrete list of scaled cumulative sums of incomes and population, the integral of the Lorenz curve can be approximated by trapezoidal sums. If the cumulative income values are

    \[ w_0,w_1,w_2,\ldots,w_n \]

and the cumulative population values are

    \[ p_0,p_1,p_2,\ldots,p_n \]

then the area of the trapezoid between the x values of p_k and p_{k+1} is

    \[ (p_{k+1}-p_k)\left( \frac{w_k+w_{k+1}}{2} \right) \]

Thus the Gini coefficient can be computed as

    \[ G=1-\sum_{k=0}^{n-1}(p_{k+1}-p_k)(w_k+w_{k+1}) \]

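where the sum is twice the total area of the trapezoids that approximate the area under the Lorenz curve. The strip between x=0 and x=p_0 is left out of the sum, which is harmless whenever w_0=0 (as in the example above) or p_0=0.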
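
As a sanity check on this formula, consider perfect equality, where w_k=p_k for every k. Assume also that the curve starts at the origin, so p_0=0, while p_n=1 from the scaling. The sum then telescopes:

    \[ \sum_{k=0}^{n-1}(p_{k+1}-p_k)(p_k+p_{k+1})=\sum_{k=0}^{n-1}\left(p_{k+1}^2-p_k^2\right)=p_n^2-p_0^2=1 \]

so G=1-1=0, as expected for perfectly equal incomes.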

This can be computed easily in MATLAB or any similar matrix-oriented language such as GNU Octave. Suppose cp and ci are vectors holding the scaled cumulative population and scaled cumulative income from the table above. Then the Gini coefficient can be computed as:

1-sum(diff(cp).*(ci(1:end-1)+ci(2:end)))
ans = 0.52579
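
For reuse, the one-liner can be wrapped in an anonymous function. The sketch below is self-contained: it rebuilds cp and ci from the example table and should reproduce the value above. The names `gini`, `people` and `income` are my own, not built-ins.

```octave
% Trapezoidal Gini formula from above, wrapped as an anonymous function
gini = @(cp, ci) 1 - sum(diff(cp) .* (ci(1:end-1) + ci(2:end)));

% Example table: number of people at each income level, and the income levels
people = [741 381 692 778 20 662 228 796 221 51 361]';
income = [0 200 400 600 800 1000 1200 1400 1600 1800 2000]';
cp = cumsum(people) / sum(people);   % scaled cumulative population
ci = cumsum(income) / sum(income);   % scaled cumulative income

g = gini(cp, ci)                     % approximately 0.52579

% Perfect equality: the curve is y = x, so the coefficient is 0 up to rounding
x = linspace(0, 1, 11)';
g_equal = gini(x, x)
```

Keeping the formula in one place makes it easy to apply the same calculation to several population columns without retyping it.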

We shall look at Australian incomes at ten-year intervals (1995-1996, 2005-2006 and 2015-2016), using data from the Australian Bureau of Statistics obtained from http://www.abs.gov.au/ausstats/abs@.nsf/mf/6302.0

The raw data is given in the following table, where the values represent thousands of people earning that particular income:

   Income    1995-1996   2005-2006   2015-2016
   -------   ---------   ---------   ---------
      0.00      28.70       19.80        0.00         
     99.00      57.00       67.10       92.40         
    199.00      58.20       36.30       48.30         
    299.00     502.10      195.00       87.80         
    399.00     393.10      601.90      159.20         
    499.00     511.70      272.60      598.90         
    599.00     363.30      456.10      359.40         
    699.00     340.60      402.20      401.60         
    799.00     315.00      328.70      398.70         
    899.00     291.00      312.80      341.40         
    999.00     307.50      274.00      317.80         
   1099.00     264.60      266.50      315.30         
   1199.00     244.30      282.40      255.60         
   1299.00     204.60      295.40      271.40         
   1399.00     239.70      262.50      282.50         
   1499.00     220.80      236.20      238.30         
   1599.00     211.70      250.70      236.60         
   1699.00     200.30      216.80      251.80         
   1799.00     190.10      202.50      228.60         
   1899.00     174.20      213.80      230.90         
   1999.00     185.50      204.60      221.00         
   2199.00     263.00      369.40      426.30         
   2399.00     214.90      367.60      369.00         
   2599.00     165.70      271.60      334.80         
   2799.00     144.40      251.50      293.50         
   2999.00     120.10      219.50      248.70         
   3499.00     161.40      338.70      547.80         
   3999.00      90.00      243.20      371.40         
   4999.00      79.40      217.90      487.60         
   5000.00      77.10      227.60      509.90         

Suppose the data is read into GNU Octave as a matrix H. Then we can determine the scaled cumulative sums, and hence the Gini coefficients:

>  tmp = H(:,1);                   % income levels
>  inc = cumsum(tmp)/sum(tmp);     % scaled cumulative income
>  tmp = H(:,2);                   % 1995-1996 population
>  pop1 = cumsum(tmp)/sum(tmp);
>  tmp = H(:,3);                   % 2005-2006 population
>  pop2 = cumsum(tmp)/sum(tmp);
>  tmp = H(:,4);                   % 2015-2016 population
>  pop3 = cumsum(tmp)/sum(tmp);
>  gini1 = 1-sum(diff(pop1).*(inc(1:end-1)+inc(2:end)))

   gini1 =  0.57696

>  gini2 = 1-sum(diff(pop2).*(inc(1:end-1)+inc(2:end)))

   gini2 =  0.42627

>  gini3 = 1-sum(diff(pop3).*(inc(1:end-1)+inc(2:end)))

   gini3 =  0.29479

This means we have:

   Year        Gini coefficient
   ---------   ----------------
   1995-1996       0.57696
   2005-2006       0.42627
   2015-2016       0.29479

which seems to indicate that inequality is in fact decreasing. We can also plot the Lorenz curves for each population set in turn and, for comparison, show the line y=x of perfect income equality.

The Lorenz curves become closer to the straight line, which indicates a decrease in inequality over this sequence of measurements, at least as measured by the Gini coefficient.
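
A plot of this kind can be sketched as follows. To stay self-contained, the example draws the Lorenz curve for the small example table from the start of this post rather than the ABS data, and the variable names are my own. Prepending the origin is an assumption on my part, made so that the plotted curve starts at (0,0).

```octave
% Example table: number of people at each income level, and the income levels
people = [741 381 692 778 20 662 228 796 221 51 361]';
income = [0 200 400 600 800 1000 1200 1400 1600 1800 2000]';

% Scaled cumulative sums, with the origin prepended (an assumption for plotting)
cp = [0; cumsum(people) / sum(people)];
ci = [0; cumsum(income) / sum(income)];

% Lorenz curve together with the line of perfect equality y = x
plot(cp, ci, '-o', [0 1], [0 1], '--');
xlabel('Cumulative fraction of population');
ylabel('Cumulative fraction of income');
legend('Lorenz curve', 'y = x', 'location', 'northwest');
axis([0 1 0 1]);
```

With the vectors from the session above, the three ABS curves could be drawn the same way, for example with plot(pop1, inc, pop2, inc, pop3, inc, [0 1], [0 1]).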
