# The Gini coefficient and income inequality in Australia

According to Australia’s national broadcasting network, the ABC, inequality is increasing. This claim has been refuted by several academics. Now there are many different measures of inequality, but here I’m going to look at just one, the Gini coefficient, named for Corrado Gini who first discussed it in 1912.

Suppose we have a list of incomes and, for each one, the number of people earning that income (we may assume that each number here is in thousands), for example:

Population  Income
----------  ------
       741       0
       381     200
       692     400
       778     600
        20     800
       662    1000
       228    1200
       796    1400
       221    1600
        51    1800
       361    2000


The only restriction is that the incomes must be listed in increasing order. Start by forming the cumulative sums of both the incomes and the populations, then divide each by its total so that the values lie between 0 and 1:

Scaled cumulative population     Scaled cumulative income
----------------------------     ------------------------
                     0.15027                      0.00000
                     0.22754                      0.01818
                     0.36788                      0.05455
                     0.52565                      0.10909
                     0.52971                      0.18182
                     0.66396                      0.27273
                     0.71020                      0.38182
                     0.87163                      0.50909
                     0.91645                      0.65455
                     0.92679                      0.81818
                     1.00000                      1.00000
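The two columns above can be reproduced with a few lines of Octave. This is a minimal sketch: the variable names `pop`, `income`, `cp` and `ci` are chosen here for illustration. The income column is cumulated as listed, without weighting by population, because that is what reproduces the table.

```octave
% Example data from the first table: population counts and income levels.
pop    = [741 381 692 778 20 662 228 796 221 51 361]';
income = (0:200:2000)';              % 0, 200, ..., 2000, already increasing

% Cumulative sums, each divided by its total so that the last entry is 1.
% The incomes are cumulated as listed (not weighted by population).
cp = cumsum(pop)    / sum(pop);      % scaled cumulative population
ci = cumsum(income) / sum(income);   % scaled cumulative income

% Print the two columns side by side, one row per income level.
fprintf('%8.5f   %8.5f\n', [cp ci]');
```

The first printed row should read 0.15027 and 0.00000, since 741 of the 4931 people earn the lowest income, which is 0.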


Each row now pairs a fraction of the population (left column: those who earn up to that income) with a fraction of the total income (right column).

Plot income against population; the result is a curve that bows below the diagonal, known as a Lorenz curve.

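If income is perfectly equal, then for any fraction between 0 and 1, that fraction of the population will earn that fraction of total income, and the Lorenz curve will be the straight line $y = x$. The Gini coefficient is defined to be the fraction:

$$G = \frac{\text{area between the line } y = x \text{ and the Lorenz curve}}{\text{area under the line } y = x}$$

or more simply, writing $L(x)$ for the Lorenz curve,

$$G = 1 - 2\int_0^1 L(x)\,dx$$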

Thus the larger the Gini coefficient, the more unequal the income. A population which enjoys perfectly equal incomes will have a Gini coefficient of zero.
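The step from the fraction to the simpler form can be spelled out. As a sketch, write $L(x)$ for the Lorenz curve on $[0,1]$ (a symbol chosen here for illustration). The area under the line $y = x$ is $\tfrac12$, so:

```latex
\[
G \;=\; \frac{\tfrac12 - \int_0^1 L(x)\,dx}{\tfrac12}
  \;=\; 1 - 2\int_0^1 L(x)\,dx .
\]
```

Setting $L(x) = x$ gives $G = 1 - 2\cdot\tfrac12 = 0$, which matches the perfectly equal case, while a curve that hugs the horizontal axis pushes $G$ towards 1.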

Given a discrete list of scaled cumulative sums of incomes and population, the integral of the Lorenz curve can be approximated by trapezoidal sums. If the cumulative income values are

$$c_1, c_2, \ldots, c_n$$

and the cumulative population values are

$$p_1, p_2, \ldots, p_n$$

then the area of the trapezoid between the values $p_k$ and $p_{k+1}$ is

$$\frac{1}{2}\,(p_{k+1} - p_k)(c_k + c_{k+1})$$

Thus the Gini coefficient can be computed as

$$G = 1 - \sum_{k=1}^{n-1} (p_{k+1} - p_k)(c_k + c_{k+1})$$

where the sum is twice the area of the trapezoids which form the area under the Lorenz curve.

This can be computed easily in Matlab or any other matrix-oriented language such as GNU Octave or Scilab. Suppose cp and ci are the lists from above corresponding to population and income. Then the Gini coefficient can be computed as:

1-sum(diff(cp).*(ci(1:end-1)+ci(2:end)))
ans = 0.52579
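The one-liner starts at the first data point, so it leaves out the segment from the origin to that point. In the example this makes no difference, because the first cumulative income is 0. The sketch below wraps the same trapezoid formula in two anonymous functions (the names `trap2` and `gini` are made up for illustration) and prepends the origin, so that points lying on the line of equality give exactly zero.

```octave
% Twice the trapezoid area under the points (p, c).
trap2 = @(p, c) sum(diff(p) .* (c(1:end-1) + c(2:end)));

% Gini coefficient; the origin (0,0) is prepended so the first segment counts.
gini = @(cp, ci) 1 - trap2([0; cp(:)], [0; ci(:)]);

% Example data from earlier in the article.
pop    = [741 381 692 778 20 662 228 796 221 51 361]';
income = (0:200:2000)';
cp = cumsum(pop)    / sum(pop);      % scaled cumulative population
ci = cumsum(income) / sum(income);   % scaled cumulative income

g_example = gini(cp, ci);   % agrees with the one-liner, because ci(1) is 0
g_equal   = gini(cp, cp);   % points on the line y = x: zero up to rounding
fprintf('%.5f\n', g_example);
```

For the example data this prints 0.52579, the same value as above.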


We shall look at Australian incomes in three periods ten years apart (1995-1996, 2005-2006 and 2015-2016), using data from the Australian Bureau of Statistics obtained from http://www.abs.gov.au/ausstats/abs@.nsf/mf/6302.0

The raw data are given in the following table, where each value is the number of people, in thousands, earning that particular income:

 Income   1995-1996   2005-2006   2015-2016
-------   ---------   ---------   ---------
   0.00       28.70       19.80        0.00
  99.00       57.00       67.10       92.40
 199.00       58.20       36.30       48.30
 299.00      502.10      195.00       87.80
 399.00      393.10      601.90      159.20
 499.00      511.70      272.60      598.90
 599.00      363.30      456.10      359.40
 699.00      340.60      402.20      401.60
 799.00      315.00      328.70      398.70
 899.00      291.00      312.80      341.40
 999.00      307.50      274.00      317.80
1099.00      264.60      266.50      315.30
1199.00      244.30      282.40      255.60
1299.00      204.60      295.40      271.40
1399.00      239.70      262.50      282.50
1499.00      220.80      236.20      238.30
1599.00      211.70      250.70      236.60
1699.00      200.30      216.80      251.80
1799.00      190.10      202.50      228.60
1899.00      174.20      213.80      230.90
1999.00      185.50      204.60      221.00
2199.00      263.00      369.40      426.30
2399.00      214.90      367.60      369.00
2599.00      165.70      271.60      334.80
2799.00      144.40      251.50      293.50
2999.00      120.10      219.50      248.70
3499.00      161.40      338.70      547.80
3999.00       90.00      243.20      371.40
4999.00       79.40      217.90      487.60
5000.00       77.10      227.60      509.90


Suppose the data is read into GNU Octave as a matrix H. Then we can determine the scaled cumulative sums, and hence the Gini coefficients:

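> % Column 1 of H holds the incomes; columns 2 to 4 hold the populations.
> tmp = H(:,1);
> inc = cumsum(tmp)/sum(tmp);     % scaled cumulative income
> tmp = H(:,2);
> pop1 = cumsum(tmp)/sum(tmp);    % scaled cumulative population, 1995-1996
> tmp = H(:,3);
> pop2 = cumsum(tmp)/sum(tmp);    % scaled cumulative population, 2005-2006
> tmp = H(:,4);
> pop3 = cumsum(tmp)/sum(tmp);    % scaled cumulative population, 2015-2016
> gini1 = 1-sum(diff(pop1).*(inc(1:end-1)+inc(2:end)))

gini1 =  0.57696

> gini2 = 1-sum(diff(pop2).*(inc(1:end-1)+inc(2:end)))

gini2 =  0.42627

> gini3 = 1-sum(diff(pop3).*(inc(1:end-1)+inc(2:end)))

gini3 =  0.29479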
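The repeated lines in the session can be folded into a loop. The following is a sketch only: it assumes H is exactly the table printed above (typed in here so that the block runs on its own), uses the illustrative names `pop` and `g`, and relies on automatic broadcasting in the division, which current GNU Octave supports and MATLAB supports from R2016b.

```octave
% ABS table from above: income, then population (thousands) for each period.
H = [    0.00   28.70   19.80    0.00
        99.00   57.00   67.10   92.40
       199.00   58.20   36.30   48.30
       299.00  502.10  195.00   87.80
       399.00  393.10  601.90  159.20
       499.00  511.70  272.60  598.90
       599.00  363.30  456.10  359.40
       699.00  340.60  402.20  401.60
       799.00  315.00  328.70  398.70
       899.00  291.00  312.80  341.40
       999.00  307.50  274.00  317.80
      1099.00  264.60  266.50  315.30
      1199.00  244.30  282.40  255.60
      1299.00  204.60  295.40  271.40
      1399.00  239.70  262.50  282.50
      1499.00  220.80  236.20  238.30
      1599.00  211.70  250.70  236.60
      1699.00  200.30  216.80  251.80
      1799.00  190.10  202.50  228.60
      1899.00  174.20  213.80  230.90
      1999.00  185.50  204.60  221.00
      2199.00  263.00  369.40  426.30
      2399.00  214.90  367.60  369.00
      2599.00  165.70  271.60  334.80
      2799.00  144.40  251.50  293.50
      2999.00  120.10  219.50  248.70
      3499.00  161.40  338.70  547.80
      3999.00   90.00  243.20  371.40
      4999.00   79.40  217.90  487.60
      5000.00   77.10  227.60  509.90 ];

inc = cumsum(H(:,1)) / sum(H(:,1));        % scaled cumulative income
pop = cumsum(H(:,2:4)) ./ sum(H(:,2:4));   % one scaled cumulative column per period

% Same trapezoid formula as in the session, applied column by column.
g = zeros(1, 3);
for j = 1:3
  g(j) = 1 - sum(diff(pop(:,j)) .* (inc(1:end-1) + inc(2:end)));
end
disp(g)
```

The entries of `g` correspond, in order, to 1995-1996, 2005-2006 and 2015-2016, and are intended to mirror `gini1`, `gini2` and `gini3` from the session.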


This means we have:

Year        Gini coefficient
---------   ----------------
1995-1996            0.57696
2005-2006            0.42627
2015-2016            0.29479

which seems to indicate that inequality is in fact decreasing. We can also plot the Lorenz curves for each population set in turn, and for comparison also show the line of maximum income equality:

The Lorenz curves become closer to the straight line, which indicates a decrease in inequality over this sequence of measurements, at least as measured by the Gini coefficient.