Archive
Benford’s Law and the Administrative Geography of Great Britain
Just listened to the latest episode of the Infinite Monkey Cage, and was reminded of Benford’s Law. This states:
Benford’s Law, also called the First-Digit Law, refers to the frequency distribution of digits in many (but not all) real-life sources of data. In this distribution, the number 1 occurs as the leading digit about 30% of the time, while larger numbers occur in that position less frequently: 9 as the first digit less than 5% of the time. Benford’s Law also concerns the expected distribution for digits beyond the first, which approach a uniform distribution.
I was curious if that might emerge in geography (or Ordnance Survey data) somehow. It turns out that if we look at the areas (in square metres) of the polygons in the Boundary-Line product (i.e. the areas of all the counties, wards, constituencies, districts, parishes etc. in GB) then we get a pretty good fit. In the table below the first column is the leading digit of the polygon area, the second is the percentage of areas starting with that leading digit and the third column is the value Benford’s Law predicts:
Digit  Observed (%)  Benford (%)
1      30.6          30.1
2      15.9          17.6
3      11.3          12.5
4      9.8           9.7
5      8.0           7.9
6      7.3           6.7
7      6.3           5.8
8      5.6           5.1
9      4.9           4.6
Not bad…
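For anyone who wants to check the third column, here is a minimal sketch of where the predicted values come from. Benford’s Law gives the probability of leading digit d as log10(1 + 1/d). The function names are my own, and the leading-digit helper is just one simple way of pulling the first digit out of an area value.

```python
import math


def benford_expected(digit: int) -> float:
    """Percentage of values Benford's Law predicts will start with `digit`."""
    return 100 * math.log10(1 + 1 / digit)


def leading_digit(value: float) -> int:
    """First non-zero digit of a non-zero number, e.g. 37256.4 -> 3."""
    # repr() puts the mantissa before any exponent, so the first
    # non-zero character we meet is the leading digit.
    for ch in repr(abs(value)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("zero has no leading digit")


for d in range(1, 10):
    print(d, round(benford_expected(d), 1))
```

To reproduce the second column you would apply leading_digit to every polygon area and count how often each digit turns up.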
Quick Play with Cayley Graph DB and Ordnance Survey Linked Data
Earlier this month Google announced the release of the open source graph database/triplestore Cayley. This weekend I thought I would have a quick look at it, and try some simple queries using the Ordnance Survey Linked Data.
Cayley is written in Go, so first I had to download and install that. I then downloaded Cayley from here. As an initial experiment I decided to use the Boundary Line Linked Data, and you can grab the data as n-triples here. I only wanted a subset of this data – I didn’t need all of the triples storing the complex boundary geometries for my initial test, so I discarded the files of the form *-geom.nt and the files of the form county.nt, dbu.nt etc. (these are the ones with the boundaries in). Finally I put the remainder of the data into one file so it was ready to load into Cayley.
It is very easy to load data into Cayley – see the getting started section on the Cayley pages here. I decided I wanted to try the web interface so loading the data (in a file called all.nt) was a simple case of typing:
./cayley http --dbpath=./boundaryline/all.nt
Once you’ve done this point your web browser to http://localhost:64210/ and you should see something like:
One of the things that will first strike people used to RDF/triplestores is that Cayley does not have a SPARQL interface, and instead uses a query language based on Gremlin. I am new to Gremlin, but it seems it has already been used to explore linked data – see the blog post from Dan Brickley from a few years ago.
The main purpose of this blog post is to give a few simple examples of queries you can perform on the Ordnance Survey data in Cayley. If you have Cayley running then you can find the query language documented here.
At the simplest level the query language seems to be an easy way to traverse the graph by starting at a node/vertex and following incoming or outgoing links. So to find all the regions that touch Southampton it is a simple case of starting at the Southampton node, following a touches outbound link and returning the results:
g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches").All()
Giving:
If you want to return the names and not the IDs:
g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches").Out("http://www.w3.org/2000/01/rdf-schema#label").All()
You can also filter – so to just see the counties bordering Southampton:
g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches").Has("http://www.w3.org/1999/02/22-rdf-syntax-ns#type","http://data.ordnancesurvey.co.uk/ontology/admingeo/County").Out("http://www.w3.org/2000/01/rdf-schema#label").All()
The Ordnance Survey linked data also has spatial predicates ‘contains’, ‘within’ as well as ‘touches’. Analogous queries can be done with those. E.g. find me everything Southampton contains:
g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/contains").Out("http://www.w3.org/2000/01/rdf-schema#label").All()
So after this very quick initial experiment it seems that Cayley is very good at providing an easy way of doing very quick/simple queries. One query I wanted to do was find everything in, say, Hampshire – the full transitive closure. This is very easy to do in SPARQL, but in Cayley (at first glance) you’d have to write some extra code (not exactly rocket science, but a bit of a faff compared to SPARQL). I rarely touch Javascript these days so for me personally this will never replace a triplestore with a SPARQL endpoint, but for JS developers this tool will be a great way to get started with and explore linked data/RDF. I might well brush up on my Javascript and provide more complicated examples in a later blog post…
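To give a feel for the “extra code” a transitive closure needs, here is a minimal sketch of the idea in Python rather than in Cayley’s own query language. It does not use any Cayley API: the contains links live in a plain dictionary, and the region names are made-up placeholders rather than real Ordnance Survey identifiers.

```python
from collections import deque


def transitive_closure(start, contains):
    """Return every region reachable from `start` by following contains links."""
    seen = set()
    queue = deque([start])
    while queue:
        region = queue.popleft()
        for child in contains.get(region, ()):
            if child not in seen:  # guard against visiting a region twice
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical toy graph: a county contains districts, which contain parishes.
contains = {
    "county": ["district-a", "district-b"],
    "district-a": ["parish-1", "parish-2"],
    "district-b": ["parish-3"],
}

print(sorted(transitive_closure("county", contains)))
```

The same breadth-first loop would work against a real store if each dictionary lookup were swapped for a one-hop “contains” query.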
Visualising the Location Graph – example with Gephi and Ordnance Survey linked data
This is arguably a simpler follow up to my previous blog post, and here I want to look at visualising Ordnance Survey linked data in Gephi. Now Gephi isn’t really a GIS, but it can be used to visualise the adjacency graph where regions are represented as nodes in a graph, and links represent adjacency relationships.
The approach here will be very similar to the approach in my previous blog. The main difference is that you will need to use the Ordnance Survey SPARQL endpoint and not the DBpedia one. So this time in the Gephi semantic web importer enter the following endpoint URL:
http://data.ordnancesurvey.co.uk/datasets/os-linked-data/apis/sparql
The Ordnance Survey endpoint returns turtle by default, and Gephi does not seem to like this. I wanted to force the output as XML. I figured this could be done using a ‘REST parameter name’ (output) with value equal to xml. This did not seem to work, so instead I had to do a bit of a hack. In the ‘query tag…’ box you will need to change the value from ‘query’ to ‘output=xml&query’. The Semantic Web Importer should now show the new value in that box.
Now click on the query tab. If we want to, for example, view the adjacency graph for constituencies we can enter the following query:
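Here is a rough sketch of why the query tag hack works. I am assuming (I have not checked the Gephi source) that the importer builds its request as the endpoint URL, then the query tag, then an equals sign and the encoded query. The function name is hypothetical.

```python
from urllib.parse import quote_plus

ENDPOINT = "http://data.ordnancesurvey.co.uk/datasets/os-linked-data/apis/sparql"


def request_url(query_tag: str, sparql: str) -> str:
    # Assumption: the request is "<endpoint>?<query tag>=<encoded query>".
    return f"{ENDPOINT}?{query_tag}={quote_plus(sparql)}"


# With the default tag the endpoint only sees a "query" parameter ...
print(request_url("query", "select * where { ?s ?p ?o } limit 1"))
# ... with the hacked tag an extra "output=xml" parameter is smuggled in.
print(request_url("output=xml&query", "select * where { ?s ?p ?o } limit 1"))
```

If that assumption holds, the endpoint receives two ordinary parameters, output and query, which is exactly what we wanted.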
prefix gephi:<http://gephi.org/>
construct {
?s gephi:label ?label .
?s gephi:lat ?lat .
?s gephi:long ?long .
?s <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> ?o .}
where
{
?s a <http://data.ordnancesurvey.co.uk/ontology/admingeo/WestminsterConstituency> .
?o a <http://data.ordnancesurvey.co.uk/ontology/admingeo/WestminsterConstituency> .
?s <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> ?o .
?s <http://www.w3.org/2000/01/rdf-schema#label> ?label .
?s <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?s <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
}
and click ‘run’. To visualise the output you will need to follow the exact same steps mentioned here (remember to recast the lat and long variables to decimal).
If we want to view adjacency of London Boroughs then we can do this with a similar query:
prefix gephi:<http://gephi.org/>
construct {
?s gephi:label ?label .
?s gephi:lat ?lat .
?s gephi:long ?long .
?s <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> ?o .}
where
{
?s a <http://data.ordnancesurvey.co.uk/ontology/admingeo/LondonBorough> .
?o a <http://data.ordnancesurvey.co.uk/ontology/admingeo/LondonBorough> .
?s <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> ?o .
?s <http://www.w3.org/2000/01/rdf-schema#label> ?label .
?s <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?s <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
}
When visualising you might want to change the scale parameter to 10000.0. You should see something like this:
So far so good. Now imagine we want to bring in some other data – recall my previous blog post here. We can use SPARQL federation to bring in data from other endpoints. Suppose we would like to make the size of the node represent the ‘IMD rank‘ of each London Borough…we can do this by bringing in data from the Open Data Communities site:
prefix gephi:<http://gephi.org/>
construct {
?s gephi:label ?label .
?s gephi:lat ?lat .
?s gephi:long ?long .
?s gephi:imd-rank ?imdrank .
?s <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> ?o .}
where
{
?s a <http://data.ordnancesurvey.co.uk/ontology/admingeo/LondonBorough> .
?o a <http://data.ordnancesurvey.co.uk/ontology/admingeo/LondonBorough> .
?s <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> ?o .
?s <http://www.w3.org/2000/01/rdf-schema#label> ?label .
?s <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?s <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
SERVICE <http://opendatacommunities.org/sparql> {
?x <http://purl.org/linked-data/sdmx/2009/dimension#refArea> ?s .
?x <http://opendatacommunities.org/def/IMD#IMD-score> ?imdrank . }
}
You will need to recast the imdrank as an integer for what follows (do this using the same approach used to recast the lat/long variables). You can now use Gephi to resize the nodes according to IMD rank. We do this using the ranking tab:
You should now see your London Boroughs re-sized according to their IMD rank:
Turning the lights off and adding some labels we get:
Ordnance Survey Linked Data: The Search API
Please note in some of the examples below I have been having trouble with wordpress ‘correcting’ quote marks in my text. If you find the queries don’t work you may need to manually replace the copied quote marks from below with new ones via your keyboard. Hope that makes sense.
One of the biggest improvements to the new Ordnance Survey Linked Data site is the much improved search functionality. You can either search over a specific dataset (e.g. the Code-Point(R) Open linked data) or over all the combined datasets. I will first give some examples of using the Boundary-Line(TM) search API.
The Boundary-Line search API explorer can be found here. The simplest use of this search API is to enter some text for the name of an administrative area or the GSS code (the ONS identifier for a statistical region) into the search box. To get started enter Southampton into the query box. You will see that the search results are returned in JSON (RSS and Atom are additional options). Results contain the URI of the entities that match your queries along with a number of useful attributes.
Note that the Request box shows the actual GET request that is being made, and you can use this GET request in your applications. Now try searching for a GSS code: enter E06000045 into the query box. You should see results for the City of Southampton returned. So far so straightforward. The search function also allows for wildcards, for example in the Query box type:
label:Southa*
It is also possible to narrow search results by type. Recall that the search for Southampton returned both Westminster constituencies and a unitary authority with Southampton in their name. To just find the Westminster constituencies search for the following:
label:Southampton AND type:"http://data.ordnancesurvey.co.uk/ontology/admingeo/WestminsterConstituency"
The search API also allows you to perform a number of simple spatial queries. The first of these are bounding box queries. For the Boundary-Line data you can specify a bounding box, and find all the administrative regions whose centroids lie within that bounding box. The bounding box can be expressed in eastings and northings. For example try the following:
easting:[371000 TO 374000] AND northing:[161000 TO 164500]
in the query box.
The answers can be narrowed down further by specifying the type of object that should be returned. For example to just get the civil parishes in this bounding box try the following:
easting:[371000 TO 374000] AND northing:[161000 TO 164500] AND type:"http://data.ordnancesurvey.co.uk/ontology/admingeo/CivilParish"
Another type of simple spatial query we can do in the search API is ‘find me all features of a given type within a certain radius of a given point’. Here the point can be specified in either lat/long or easting/northing. To find all of the civil parishes in a 50 km radius of the point with easting 442339 and northing 112882 put:
type:"http://data.ordnancesurvey.co.uk/ontology/admingeo/CivilParish"
into the query box and put the appropriate values in the easting and northing boxes, followed by 50 in the radius search box. If, for example, you want to perform this query again but find civil parishes and districts, enter the following into the query box:
type:"http://data.ordnancesurvey.co.uk/ontology/admingeo/CivilParish" OR type:"http://data.ordnancesurvey.co.uk/ontology/admingeo/District"
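If you are building these bounding box queries in an application rather than typing them in, a small helper along these lines might be handy. This is only a sketch: the function name and parameters are my own, and it does nothing more than assemble the query text shown above.

```python
def bbox_query(e_min, e_max, n_min, n_max, type_uri=None):
    """Build a search query for centroids inside an easting/northing box."""
    query = f"easting:[{e_min} TO {e_max}] AND northing:[{n_min} TO {n_max}]"
    if type_uri is not None:
        # Optionally narrow the results to one type of administrative area.
        query += f' AND type:"{type_uri}"'
    return query


CIVIL_PARISH = "http://data.ordnancesurvey.co.uk/ontology/admingeo/CivilParish"

print(bbox_query(371000, 374000, 161000, 164500))
print(bbox_query(371000, 374000, 161000, 164500, CIVIL_PARISH))
```

Note that the helper writes plain straight quotes around the type URI, which sidesteps the curly quote problem mentioned at the top of this post.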
and try the query again.
These are just some simple examples of the search API. The full documentation is here.
Putting SPARQL on the Map with Ordnance Survey Linked Data & OS OpenSpace
A colleague was asking me if I knew how to plot SPARQL query results from the Ordnance Survey linked data onto an OS OpenSpace map. Although I’d done it a few times before, it was never something I’d blogged. So here goes…
This is a lot easier than you might imagine. The first thing you want to do is perform your SPARQL query and get back the results as a csv file. I blogged about this a while back, but here is a quick recap. Let us suppose I want to plot a centroid for all the districts in England, and have their name appear in the pop up text. It is easy to perform a query to get back the easting, northing and name for all the districts. First go to the Boundary-Line(TM) SPARQL endpoint and enter the following query:
select ?x ?y ?name
where
{
?a <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?a <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/easting> ?x .
?a <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/northing> ?y .
?a a <http://data.ordnancesurvey.co.uk/ontology/admingeo/District> .
}
Make sure the response format is set to CSV. Now click the query button. The “Response” box will have your query results, and the “Request” box should have a long, complicated looking URL:
http://data.ordnancesurvey.co.uk/datasets/boundary-line/apis/sparql?query=select+%3Fx+%3Fy+%3Fname%0D%0Awhere%0D%0A%7B%0D%0A%3Fa+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23label%3E+%3Fname+.%0D%0A%3Fa+%3Chttp%3A%2F%2Fdata.ordnancesurvey.co.uk%2Fontology%2Fspatialrelations%2Feasting%3E+%3Fx+.%0D%0A%3Fa+%3Chttp%3A%2F%2Fdata.ordnancesurvey.co.uk%2Fontology%2Fspatialrelations%2Fnorthing%3E+%3Fy+.%0D%0A%3Fa+a+%3Chttp%3A%2F%2Fdata.ordnancesurvey.co.uk%2Fontology%2Fadmingeo%2FDistrict%3E+.%0D%0A%7D&output=csv
So far so easy.
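That long URL is nothing magic: it is just the endpoint address followed by the URL-encoded query and the output format. Here is a minimal sketch of building it yourself with the Python standard library. The CRLF line endings are there only to mirror what a browser form submits, so the result should look like the URL in the Request box.

```python
from urllib.parse import urlencode

ENDPOINT = "http://data.ordnancesurvey.co.uk/datasets/boundary-line/apis/sparql"

QUERY = """select ?x ?y ?name
where
{
?a <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?a <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/easting> ?x .
?a <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/northing> ?y .
?a a <http://data.ordnancesurvey.co.uk/ontology/admingeo/District> .
}"""

# Browsers send form text with CRLF line breaks, hence the %0D%0A in the URL.
crlf_query = QUERY.replace("\n", "\r\n")

# urlencode turns spaces into "+" and escapes characters such as ? < > { }.
url = f"{ENDPOINT}?{urlencode({'query': crlf_query, 'output': 'csv'})}"
print(url)
```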
Now we come to the OS OpenSpace part. It is easy to plot a text file in OS OpenSpace. To find out how go to the OS OpenSpace Code Playground and select the link “Add markers and text from a file“. You should see an example mashup showing some points plotted. To see what is going on click on “Edit in Code Playground” and you should see the javascript & HTML that produces the map. In the Javascript window you can edit the code and preview the changes. For this example the first simple thing you need to do is adjust the zoom level. To do this change:
osMap.setCenter(new OpenSpace.MapPoint(400000, 400000), 7);
to
osMap.setCenter(new OpenSpace.MapPoint(400000, 400000), 1);
so we are zoomed all the way out.
We now need to change the input text file. To do this change the following line in the sample code:
var markersFile = "/res/mymarkers.txt";
In this line replace /res/mymarkers.txt with the URL you got from the SPARQL endpoint in the Request box. Once you have done that click the ‘render’ button and you should now see your results plotted on an OS OpenSpace map. Click on a map pointer to display the name of the district. Easy as that.
As an exercise to the reader…consult my last few blog posts and display markers for postcodes in a district of your choosing.
Ordnance Survey SPARQL Endpoint
I just wanted to quickly mention one feature of the Ordnance Survey linked data SPARQL endpoints that I think is pretty neat. Go to the SPARQL endpoint and try one of the queries from my last four blog posts. In this post I’ll go with the following simple query (recall this query gets the name, lat, long, GSS code and unit_id for all districts in Great Britain):
select ?name ?lat ?long ?gss ?unit_id
where
{
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?x <http://data.ordnancesurvey.co.uk/ontology/admingeo/gssCode> ?gss .
?x <http://data.ordnancesurvey.co.uk/ontology/admingeo/hasUnitID> ?unit_id .
?x a <http://data.ordnancesurvey.co.uk/ontology/admingeo/District> .
}
You will notice that on hitting the query button a box labelled “Request” will appear, containing a rather long URL:
You can now use this URL to issue a GET request in PHP, Javascript etc. and use the output within a web application just as you would with any API call. To see this working in a simple way, copy the long URL you get from your SPARQL query and at the command line (if running something UNIXy) type:
curl "LONG_URL"
where LONG_URL is your long URL (the quotes stop the shell from tripping over the & characters in it). You should now see the JSON response from that GET request.
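Once you have the JSON back, you will probably want to flatten it into simple rows. The sketch below assumes the endpoint returns the standard W3C SPARQL JSON results layout (a head with the variable names, then results.bindings). The sample response is a made-up placeholder, not real Ordnance Survey data.

```python
import json

# Hypothetical sample in the standard SPARQL JSON results layout.
SAMPLE = """
{
  "head": {"vars": ["name", "gss"]},
  "results": {"bindings": [
    {"name": {"type": "literal", "value": "Example District"},
     "gss": {"type": "literal", "value": "E00000000"}}
  ]}
}
"""


def rows(response_text):
    """Yield one plain dict per result row, mapping variable name to value."""
    data = json.loads(response_text)
    variables = data["head"]["vars"]
    for binding in data["results"]["bindings"]:
        # A variable can be unbound in a given row, so check before reading.
        yield {v: binding[v]["value"] for v in variables if v in binding}


for row in rows(SAMPLE):
    print(row)
```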
Happy SPARQLing…
Ordnance Survey Linked Data – Combining postcode and spatial queries
In my previous blog posts (here, here and here) I concentrated on a few simple SPARQL queries. However, these SPARQL queries were only performed on one dataset at a time. With linked data things get more interesting when you combine datasets (even if those datasets are both from the same publisher). The original opendata for Code-Point(R) Open was published as a CSV file, and Boundary-Line(TM) was published as a shape file. Both formats are useful, but combining the two datasets can be tricky if you are not sure how to use shape files. The two datasets are, however, implicitly linked via a key. In the linked data versions we have converted both datasets into a common data language (RDF) and made those implicit links explicit (see https://johngoodwin225.wordpress.com/2013/08/05/ordnance-survey-linked-data-a-simple-postcode-query/ ). We can now easily query both datasets together.
To query both datasets go to the SPARQL endpoint for the combined data:
http://data.ordnancesurvey.co.uk/datasets/os-linked-data/explorer/sparql
Recall that the query:
select ?postcode ?lat ?long
where
{
?x <http://www.w3.org/2000/01/rdf-schema#label> ?postcode .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?x <http://data.ordnancesurvey.co.uk/ontology/postcode/ward>
<http://data.ordnancesurvey.co.uk/id/7000000000017707> .
}
finds all of the postcodes (and their lat/long) in a ward called Bevois. Using the combined datasets we can run queries like: find me all postcodes in the ward Bevois and all the wards touching Bevois. Let’s break this down. To find all postcodes (and their lat/long) in regions that touch Bevois first find all the wards that touch Bevois, and then find all the postcodes within those wards. So the first line in the query below “?y <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> <http://data.ordnancesurvey.co.uk/id/7000000000017707>” finds all wards touching Bevois, and the rest of the query matches postcodes in the wards “?y”:
select ?postcode ?lat ?long
where
{
?y <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches>
<http://data.ordnancesurvey.co.uk/id/7000000000017707> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?postcode .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?x <http://data.ordnancesurvey.co.uk/ontology/postcode/ward> ?y .
}
To find all the postcodes in Bevois and the wards touching Bevois we simply union the two queries above together:
select ?postcode ?lat ?long
where
{{
?x <http://www.w3.org/2000/01/rdf-schema#label> ?postcode .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?x <http://data.ordnancesurvey.co.uk/ontology/postcode/ward>
<http://data.ordnancesurvey.co.uk/id/7000000000017707> .
}
UNION
{
?y <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches>
<http://data.ordnancesurvey.co.uk/id/7000000000017707> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?postcode .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?x <http://data.ordnancesurvey.co.uk/ontology/postcode/ward> ?y .
}}
As an exercise for the reader – find all postcodes in Southampton and districts touching Southampton. Happy SPARQLing…
Ordnance Survey Linked Data – A Simple Postcode Query
In the previous two blog posts (here and here) I gave some simple examples of queries you could run on the Ordnance Survey Boundary-Line(TM) linked data. Here I want to give some simple examples of what you can do with the Code-Point(R) Open linked data. The Code-Point(R) Open linked data has a URI for every Postcode Unit, Postcode Sector, Postcode District and Postcode Area in England, Scotland and Wales. Each Postcode Unit is nested within a Postcode Sector, each Postcode Sector is nested within a Postcode District and each Postcode District is nested within a Postcode Area. The reciprocal contains relationships are also included.
A Postcode Unit URI takes the form:
http://data.ordnancesurvey.co.uk/id/postcodeunit/SO160AS
A Postcode Sector URI takes the form:
http://data.ordnancesurvey.co.uk/id/postcodesector/SO160
A Postcode District URI takes the form:
http://data.ordnancesurvey.co.uk/id/postcodedistrict/SO16
A Postcode Area URI takes the form:
http://data.ordnancesurvey.co.uk/id/postcodearea/SO
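Since the four URI patterns follow directly from the structure of a postcode, they can be derived from the postcode itself. Here is a small sketch (the function name is my own) that relies on two facts about UK postcodes: the inward part is always the last three characters, and the area is the leading run of letters.

```python
import re

BASE = "http://data.ordnancesurvey.co.uk/id"


def postcode_uris(postcode: str) -> dict:
    """Derive the unit, sector, district and area URIs for one postcode."""
    code = postcode.replace(" ", "").upper()
    outward, inward = code[:-3], code[-3:]  # e.g. "SO16" and "0AS"
    area = re.match(r"[A-Z]+", outward).group()  # leading letters, e.g. "SO"
    return {
        "unit": f"{BASE}/postcodeunit/{code}",
        "sector": f"{BASE}/postcodesector/{outward}{inward[0]}",
        "district": f"{BASE}/postcodedistrict/{outward}",
        "area": f"{BASE}/postcodearea/{area}",
    }


for level, uri in postcode_uris("SO16 0AS").items():
    print(level, uri)
```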
Let us now try some SPARQL. Go to the CodePoint-Open SPARQL endpoint, and for simplicity set the response format to CSV. Suppose we wanted to select all of the postcodes, and their lat/long coordinates, within the SO postcode area. This can be done simply as follows:
select ?postcode ?lat ?long
where
{
?x <http://www.w3.org/2000/01/rdf-schema#label> ?postcode .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?x <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within>
<http://data.ordnancesurvey.co.uk/id/postcodearea/SO> .
}
Notice that I do not need to specify that ‘x’ is a postcode unit in this case as only postcode units have a lat/long value.
Let us try a slightly more complicated query – say I wanted all the postcodes, and their lat/long, within the postcode districts SO16 and SO17. I would do a query similar to the above, but with a UNION to collect together results for each postcode district:
select ?postcode ?lat ?long
where
{
{
?x <http://www.w3.org/2000/01/rdf-schema#label> ?postcode .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?x <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within>
<http://data.ordnancesurvey.co.uk/id/postcodedistrict/SO16> .
}
UNION
{
?x <http://www.w3.org/2000/01/rdf-schema#label> ?postcode .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?x <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within>
<http://data.ordnancesurvey.co.uk/id/postcodedistrict/SO17> .
}
}
So far so easy…
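Writing the UNION out by hand gets tedious once you have more than a couple of postcode districts. As a rough sketch, the query text can be generated instead: one block per district, joined with UNION. The helper name is hypothetical and it only builds the query string, it does not send it anywhere.

```python
# One UNION branch; doubled braces are literal braces for str.format.
PATTERN = """{{
?x <http://www.w3.org/2000/01/rdf-schema#label> ?postcode .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?x <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within>
<http://data.ordnancesurvey.co.uk/id/postcodedistrict/{district}> .
}}"""


def postcodes_in_districts(districts):
    """Build a SPARQL query selecting postcodes in any of the given districts."""
    branches = "\nUNION\n".join(PATTERN.format(district=d) for d in districts)
    return f"select ?postcode ?lat ?long\nwhere\n{{\n{branches}\n}}"


print(postcodes_in_districts(["SO16", "SO17"]))
```

For SO16 and SO17 this should print the same query as above, and adding a third district is just a matter of extending the list.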
The Code-Point Open linked data also contains links to the administrative areas that ‘contain’ the postcode. A word of caution is needed here. Postcodes do not respect administrative boundaries, so containment in this case actually means the administrative region that the lat/long for that postcode lies within. There are three predicates for relating postcode units to administrative areas. These are:
- ward – this relates postcode units to wards and unitary electoral divisions
- district – this relates postcode units to districts, metropolitan districts, London boroughs and unitary authorities
- county – this relates postcode units to counties (where applicable)
So say I want a list of all the postcode units (and their lat/longs) that lie within the ward of Bevois. This is a straightforward query. First you can find the URI for Bevois here. You’ll find the URI for Bevois is:
http://data.ordnancesurvey.co.uk/id/7000000000017707
Now enter the following SPARQL query to find all the postcodes in Bevois:
select ?postcode ?lat ?long
where
{
?x <http://www.w3.org/2000/01/rdf-schema#label> ?postcode .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?x <http://data.ordnancesurvey.co.uk/ontology/postcode/ward>
<http://data.ordnancesurvey.co.uk/id/7000000000017707> .
}
I’ll leave it as an exercise for the reader to find all of the postcode units (and their lat/long) in Southampton.
What if I want to find out the postcode districts that are covered by the ward Bevois? First I find all the postcode units in Bevois, and then I do a query to look up all the postcode districts those postcode units are within as follows:
select distinct ?postcodedistrict
where
{
?x <http://data.ordnancesurvey.co.uk/ontology/postcode/ward>
<http://data.ordnancesurvey.co.uk/id/7000000000017707> .
?x <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within> ?y .
?y a <http://data.ordnancesurvey.co.uk/ontology/postcode/PostcodeDistrict> .
?y <http://www.w3.org/2000/01/rdf-schema#label> ?postcodedistrict .
}
You will now see all the postcode districts the ward Bevois covers. As a final exercise to the reader perform the same query, but find postcode sectors that cover Southampton. Happy SPARQLing.
Ordnance Survey Linked Data – A Simple Spatial Query
In this blog I thought I would give an example of some very simple spatial queries using the Ordnance Survey Linked Data. When we first created the Ordnance Survey linked data not many RDF triplestores had spatial indexes, or in other words there was no easy way to say ‘find me all the Parishes in Hampshire‘ using a query based on the geometries of these regions. This functionality is fairly standard in GIS systems and a number of spatially enabled relational databases, and is now being increasingly implemented in RDF triplestores and other NoSQL technologies. To get round this issue it was decided that it would be very useful to precompute various topological relationships between the administrative areas described in the Boundary-Line(TM) linked data. What you will see in the data are explicit spatial relationships like touches, within and contains that relate the different administrative regions. Now the administrative geography of this country is complicated, and I’m no geographer so a complete description of it will be left for a later blog post. For now I will say that Boundary-Line contains different geographies, some based on national voting and some on local authorities. The spatial relationships are only included where relevant – for example you won’t find explicit spatial relationships between Westminster Constituencies and Counties, but you will find them between Counties and Districts.
In the Ordnance Survey linked data you will find three types of spatial relationship: touches, within and contains:
- touches means that two regions share a point on their boundary, but share no common points on their interior. They are adjacent/bordering. Touches relationships are typically only recorded between regions of the same type, i.e. which parish touches which parish. You won’t find a list of parishes touching counties. However, at some levels it gets a bit more complicated due to single tier local authorities (unitary authorities) and those based on a double tier (county/district). Counties and unitary authorities tessellate the country at some level, as do districts and unitary authorities.
- contains and within are fairly self explanatory I hope. Contains and within relationships are only stated between regions in the same geography and only explicitly stated between entities that directly contain/are within each other. What does this last part mean? In the local authorities geography counties contain districts and districts, in turn, contain parishes. You will only find explicit ‘contains’ statements between counties and districts, and between districts and parishes – you won’t find them between counties and parishes.
So now for some examples. Suppose I want to find the names of all the regions contained immediately in Hampshire. First you need to find which URI identifies Hampshire. Go to the Boundary-Line search API and search for Hampshire. You should then see that the county of Hampshire has the following URI:
http://data.ordnancesurvey.co.uk/id/7000000000017765
You can now use this in your query. Go to the SPARQL endpoint and enter the following:
select ?x ?name
where
{
?x <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within> <http://data.ordnancesurvey.co.uk/id/7000000000017765> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
}
You will see a list of everything immediately within Hampshire, and these will all be of type district. Suppose you now want to get everything within Hampshire. This can be done easily by adding a ‘+’ at the end of the within predicate as follows:
select ?x ?name
where
{
?x <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within>+ <http://data.ordnancesurvey.co.uk/id/7000000000017765> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
}
You now have a list of everything within Hampshire – this includes districts, wards and parishes. Now suppose you just want the parishes – you can do this by adding an extra line to the query to only match x to things of type civil parish:
select ?x ?name
where
{
?x <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within>+ <http://data.ordnancesurvey.co.uk/id/7000000000017765> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x a <http://data.ordnancesurvey.co.uk/ontology/admingeo/CivilParish> .
}
Touches works in a similar way. Suppose you want the names of unitary authorities that touch Hampshire. Issue the following query:
select ?x ?name
where
{
?x <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> <http://data.ordnancesurvey.co.uk/id/7000000000017765> .
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x a <http://data.ordnancesurvey.co.uk/ontology/admingeo/UnitaryAuthority> .
}
Say you want to find parishes that touch Hampshire. This is where it gets complicated and the following is maybe for advanced SPARQL-wizards only. First find all of the things that touch Hampshire (this will include other counties, unitary authorities and districts), then find all parishes within those regions and find which of those parishes touch parishes within Hampshire:
select distinct ?y ?name
where
{
?x <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> <http://data.ordnancesurvey.co.uk/id/7000000000017765> .
?y <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within> ?x .
?y a <http://data.ordnancesurvey.co.uk/ontology/admingeo/CivilParish> .
?z <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/within>+ <http://data.ordnancesurvey.co.uk/id/7000000000017765> .
?z a <http://data.ordnancesurvey.co.uk/ontology/admingeo/CivilParish> .
?z <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches> ?y .
?y <http://www.w3.org/2000/01/rdf-schema#label> ?name .
}
Congratulations – you now have a list of all the parishes that touch Hampshire.
Hopefully some of these queries are useful – happy SPARQLing.
Ordnance Survey Linked Data – Simple SPARQL example
Yesterday I received a request asking how to extract some simple data from the Ordnance Survey linked data using a SPARQL query. This post is not intended as a SPARQL tutorial – you can find plenty of those here.
A user wanted to know how to retrieve the name, unit-id, GSS Code, lat and long of all the unitary authorities, districts and metropolitan districts in England, Scotland and Wales as a CSV file.
To extract this information for all of the districts go to the Ordnance Survey’s Boundary-Line(TM) linked data SPARQL endpoint explorer and in the response format drop down menu select CSV. Now in the query window enter the following query:
select ?name ?lat ?long ?gss ?unit_id
where
{
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?x <http://data.ordnancesurvey.co.uk/ontology/admingeo/gssCode> ?gss .
?x <http://data.ordnancesurvey.co.uk/ontology/admingeo/hasUnitID> ?unit_id .
?x a <http://data.ordnancesurvey.co.uk/ontology/admingeo/District> .
}
This query selects the various attributes from the data, and the final line of the query makes sure that all of the entities selected from the data are of type District.
Scroll down the page and you should see the query response. To get the values for the district, unitary authorities and metropolitan districts we need to use a SPARQL union to gather together all of the results as follows:
select ?name ?lat ?long ?gss ?unit_id
where
{
{
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?x <http://data.ordnancesurvey.co.uk/ontology/admingeo/gssCode> ?gss .
?x <http://data.ordnancesurvey.co.uk/ontology/admingeo/hasUnitID> ?unit_id .
?x a <http://data.ordnancesurvey.co.uk/ontology/admingeo/District> .
}
UNION
{
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?x <http://data.ordnancesurvey.co.uk/ontology/admingeo/gssCode> ?gss .
?x <http://data.ordnancesurvey.co.uk/ontology/admingeo/hasUnitID> ?unit_id .
?x a <http://data.ordnancesurvey.co.uk/ontology/admingeo/MetropolitanDistrict> .
}
UNION
{
?x <http://www.w3.org/2000/01/rdf-schema#label> ?name .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat .
?x <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long .
?x <http://data.ordnancesurvey.co.uk/ontology/admingeo/gssCode> ?gss .
?x <http://data.ordnancesurvey.co.uk/ontology/admingeo/hasUnitID> ?unit_id .
?x a <http://data.ordnancesurvey.co.uk/ontology/admingeo/UnitaryAuthority> .
}
}
order by ?name
The ‘order by’ at the end of the query orders the results in alphabetical order.
To save the query results as a CSV file, again make sure that the response format is set to CSV and this time, before hitting the query button, make sure the ‘show raw response’ option is selected. Now hit the query button and you should be given the option to save your query result as a CSV file.
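Once you have saved the CSV file you can read it with pretty much anything. As a minimal sketch, here is how it might be loaded in Python. I am assuming the file starts with a header row naming the query variables (name, lat, long, gss, unit_id), and the sample row below is a made-up placeholder rather than real data.

```python
import csv
import io

# Hypothetical stand-in for the saved file; real rows come from the endpoint.
SAMPLE_CSV = "name,lat,long,gss,unit_id\nExample District,50.9,-1.4,E00000000,1\n"


def read_districts(handle):
    """Read the query results, converting lat/long to floats."""
    return [
        {
            "name": row["name"],
            "lat": float(row["lat"]),
            "long": float(row["long"]),
            "gss": row["gss"],
            "unit_id": row["unit_id"],
        }
        for row in csv.DictReader(handle)
    ]


# With a real file you would pass open("results.csv", newline="") instead.
for district in read_districts(io.StringIO(SAMPLE_CSV)):
    print(district)
```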