Benford’s Law, also called the First-Digit Law, refers to the frequency distribution of digits in many (but not all) real-life sources of data. In this distribution, the number 1 occurs as the leading digit about 30% of the time, while larger numbers occur in that position less frequently: 9 as the first digit less than 5% of the time. Benford’s Law also concerns the expected distribution for digits beyond the first, which approach a uniform distribution.
I was curious if that might emerge in geography (or Ordnance Survey data) somehow. Turns out that if we look at the areas (in square metres) of the polygons in the Boundary Line Product (i.e. the areas of all the counties, wards, constituencies, districts, parishes etc. in GB), then we get a pretty good fit. In the table below the first column is the leading digit of the polygon area, the second is the percentage of areas starting with that leading digit, and the third is the percentage Benford’s Law predicts:
Digit  Observed %  Benford %
1      30.6        30.1
2      15.9        17.6
3      11.3        12.5
4      9.8         9.7
5      8.0         7.9
6      7.3         6.7
7      6.3         5.8
8      5.6         5.1
9      4.9         4.6
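For anyone wanting to check the predicted percentages, here is a rough Python sketch. It assumes the standard statement of Benford’s Law, in which leading digit d is expected with probability log10(1 + 1/d). The function names and the sample areas are made up for illustration; they are not taken from the Boundary Line data or from whatever I actually ran to produce the table.

```python
import math
from collections import Counter


def benford_expected(digit: int) -> float:
    """Percentage of values Benford's Law predicts will start with `digit`."""
    return 100 * math.log10(1 + 1 / digit)


def leading_digit(value: float) -> int:
    """First significant digit of a positive number."""
    # Scientific notation always puts the first significant digit first,
    # e.g. 230000.0 -> '2.3000000000000000e+05'.
    return int(f"{value:.16e}"[0])


def leading_digit_shares(values):
    """Observed percentage of positive values starting with each digit 1-9."""
    digits = [leading_digit(v) for v in values if v > 0]
    counts = Counter(digits)
    return {d: 100 * counts[d] / len(digits) for d in range(1, 10)}


# Hypothetical areas in square metres, for illustration only.
sample_areas = [1250.0, 18400.5, 230000.0, 1.9e6, 47.2, 960.0]

observed = leading_digit_shares(sample_areas)
for d in range(1, 10):
    print(f"{d}: {observed[d]:.1f} {benford_expected(d):.1f}")
```

With only six made-up values the observed shares will not look much like Benford’s Law; the fit only shows up with a large set of areas spanning several orders of magnitude.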
Tony Hirst blogs about two of my recent blogs…
Originally posted on OUseful.Info, the blog...:
As anyone who’s followed this blog for some time will know, my relationship with Linked Data has been an off and on again one over the years. At the current time, it’s largely off – all my OpenRefine installs seem to have given up the ghost as far as reconciliation and linking services go, and I have no idea where the problem lies (whether with the plugins, the installs, with Java, with the endpoints, with the reconciliations or linkages I’m trying to establish).
My dabblings with pulling data in from Wikipedia/DBpedia to Gephi (eg as described in Visualising Related Entries in Wikipedia Using Gephi and the various associated follow-on posts) continue to be hit and miss due to the vagaries of DBpedia and the huge gaps in infobox structured data across Wikipedia itself.
With OpenRefine not doing its thing for me, I haven’t been able to use that app as…
On behalf of my employer:
The growth and development of new web and mobile applications demands new data to enable effective location searching. In response, we have published illustrative data for an updated gazetteer of names. We’d love it if you would take a look and provide us with your feedback, as we want to make sure that the new product meets your needs. Access the data through OS Insight.
A linked data version of this data is also available via the above link. It contains RDF in N-Triples format.
The data will be available for review until Friday 4th October 2013.