Georg Hoffmann of PrimaKlima has turned away from climatology for a moment to carry out an interesting statistical analysis of the Iranian election results. Bizarrely, the percentage split between the incumbent and the closest rival remained entirely stable throughout the count – an R² value of 0.999. But even more bizarrely, the lead for Ahmadinejad doesn’t correlate with anything – as if the uniform national swing beloved of psephologists was real, or for that matter, as if someone had simply shifted the numbers across the board. For comparison, he ran the same exercise for the 2005 German elections, which shows a wide scatter of points with a concentration of big CDU leads in the south.

Then, however, comes the genuinely scientific bit. What would Benford’s law, the principle that in most data sets there is a large excess of numbers that begin with low digits, and that therefore fake data can be identified by its divergence from this, make of it? (The data, by the way, is available here.) Well…it turns out that the results pass the Benford test, which may mean that they are honest or possibly that the Iranian Ministry of the Interior reads blogs, too.

I don’t quite understand the second point. Can you please explain more?

What does this mean:

“in most data sets there is a large excess of numbers that begin with low digits, and that therefore fake data can be identified by its divergence from this”.

What do the “low digits” refer to?

Benford’s law is the observation that _dimensioned_ measurements are distributed as if drawn uniformly from a logarithmic scale, since that is the only distribution of leading digits that stays stable under an arbitrary change of units (does that help?).

Less precisely, but more comprehensibly, Benford’s law says that in many circumstances, measurements of various sorts are more likely to start with lower digits than with higher ones (for complex, but convincing, reasons as outlined above): a leading 1 turns up about 30% of the time, a leading 9 less than 5%. It is an important rule in, e.g., forensic data analysis.
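
As a rough illustration of what a first-digit test involves, here is a minimal Python sketch. It assumes the usual statement of the law, that a leading digit d occurs with probability log10(1 + 1/d). The function names are made up for this example, and powers of two stand in for real data because they are known to follow the law closely. It is not the analysis Hoffmann ran.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(n):
    """First significant digit of a non-zero integer."""
    return int(str(abs(int(n)))[0])


def leading_digit_counts(values):
    """Count how often each digit 1-9 leads, skipping zeros."""
    counts = Counter(leading_digit(v) for v in values if int(v) != 0)
    return {d: counts.get(d, 0) for d in range(1, 10)}


def chi_square_vs_benford(values):
    """Pearson chi-square statistic of observed counts against Benford's law."""
    counts = leading_digit_counts(values)
    total = sum(counts.values())
    expected = benford_expected()
    return sum(
        (counts[d] - total * expected[d]) ** 2 / (total * expected[d])
        for d in range(1, 10)
    )


if __name__ == "__main__":
    # Stand-in data (an assumption, not election figures): powers of two.
    sample = [2 ** k for k in range(1000)]
    counts = leading_digit_counts(sample)
    expected = benford_expected()
    for d in range(1, 10):
        print(d, counts[d] / len(sample), round(expected[d], 3))
    print("chi-square:", chi_square_vs_benford(sample))
```

With nine digit classes the statistic has eight degrees of freedom, so a value above roughly 15.5 would count as a significant departure at the 5% level. A small value only says the first digits look natural; it says nothing about whether the totals themselves are honest.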

Note that Andrew Gelman (who is a professor of statistics at Columbia) disputes the application of Benford’s law here on his blog. The posting is worth reading.

