In a previous post I wrote up some of the features of the new Ordnance Survey Linked Data. In this blog post I want to run through a concrete example of the sort of thing you can build using this linked data.
A while ago Talis built their BIS Explorer. The aim of this application was to allow users to “identify centres of excellence at the click of a button” and more can be read about the application here. This data mash-up took different data sources about funded research projects and joined them together using linked data. In the original application you could, for example, look at funded research projects by European Region in Great Britain. This can be seen here. At the time this demo was created Ordnance Survey had yet to publish its postcode data as linked data, but if it had, it would have been very easy to get a more fine-grained view of research projects down at the county and district level. Here’s how…
The basic data model of the original BIS data was fairly straightforward. Universities and businesses have a link to the projects they worked on. For each university there is also postcode information. Things get interesting if, instead of (or as well as) linking to a string representation of a postcode, you link to the URI for that postcode. This can be done by using the property:
So say we wanted to do this for Imperial College. All we need is this in our data (this example is in N-Triples format):
Now, by the power of linked data, connecting to a resource for the postcode means we can enrich our university dataset with knowledge of the ward, county and district the university is in. Also, given that the university is connected to a project, we have a link from project to region. Through the link from project to university to postcode to region we can start to get a more fine-grained view of which areas are getting more funding.
So how do we do this in practice? These are the steps I followed.
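To make that chain of links concrete, here is a minimal Python sketch that holds a few triples in memory and walks from a project to the district its university sits in. Every identifier in it (the `ex:` resources and the `workedOn`, `postcode`, `district` and `ward` properties) is a made-up placeholder for illustration, not the actual vocabulary used by the BIS or Ordnance Survey data.

```python
# Toy triples as (subject, predicate, object).
# All identifiers below are hypothetical placeholders.
TRIPLES = {
    ("ex:imperial", "ex:workedOn", "ex:project1"),
    ("ex:imperial", "ex:postcode", "ex:postcodeA"),
    ("ex:postcodeA", "ex:district", "ex:districtX"),
    ("ex:postcodeA", "ex:ward", "ex:wardY"),
}


def objects(subject, predicate, triples=TRIPLES):
    """Return every object of triples matching the subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def subjects(predicate, obj, triples=TRIPLES):
    """Return every subject of triples matching the predicate and object."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def districts_for_project(project, triples=TRIPLES):
    """Follow project -> university -> postcode -> district."""
    found = set()
    for university in subjects("ex:workedOn", project, triples):
        for postcode in objects(university, "ex:postcode", triples):
            found |= objects(postcode, "ex:district", triples)
    return found
```

Because the postcode is a resource rather than a string, the last hop (postcode to district) comes for free once the postcode data is in the same store.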
- Download the BIS data from here and load it into a triple store (linked data database) of your choice. There are plenty of good open source ones available e.g. Sesame or TDB to name two.
- I then added the links to postcode URIs as described above.
- Following that I loaded the data for the postcodes in a similar manner to that described here. A relatively simple script retrieved the RDF for the relevant postcodes and loaded the RDF into my store. The nice thing about linked data and RDF is that the stores are like a big bucket of data and you can keep throwing more and more in. Hopefully future linked data tools will make this step trivial, but for now some scripting was required.
- Job done. I now have links from research projects to regions.
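The retrieve-and-load step above could be sketched roughly as follows. This is an illustration rather than the original script: it assumes the postcode URIs return RDF/XML when asked for it via the `Accept` header, and it uses a plain Python dict as a stand-in for a real triple store such as Sesame or TDB. The names `fetch_rdf` and `load_postcodes` are hypothetical.

```python
from urllib.request import Request, urlopen


def fetch_rdf(uri):
    """Dereference a URI, asking for RDF/XML (the media type is an assumption)."""
    request = Request(uri, headers={"Accept": "application/rdf+xml"})
    with urlopen(request) as response:
        return response.read().decode("utf-8")


def load_postcodes(store, postcode_uris, fetch=fetch_rdf):
    """Fetch each postcode document once and add it to the store.

    `store` is a dict standing in for a triple store: URI -> RDF text.
    `fetch` can be swapped out, e.g. for a local cache or a test double.
    """
    for uri in postcode_uris:
        if uri not in store:  # skip postcodes we have already loaded
            store[uri] = fetch(uri)
    return store
```

Passing the fetch function in as a parameter keeps the loading logic separate from the network call, so the same loop works whether the RDF comes from the live service or from files on disk.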
Basically what I created from this was an aggregation of various datasets that you can now query. This is something that is made very easy using linked data and URIs to identify things like postcodes. As more publishers release data in linked data form there is more and more potential for building services and applications on top of aggregations of these datasets. So that’s what I decided to do…
This application (I make the usual apology for my lack of web development skills and for the slowness, which some caching would no doubt sort out) builds a clickable map view of this data aggregation. The OS OpenSpace API makes it possible to retrieve the unit ID for selected polygons. I can then use this unit ID in a SPARQL query to find the projects funded in that region.
However, it would have been easier if there had been a RESTful API on top of the data aggregation that let me retrieve these results instead of writing SPARQL. So that is what I decided to build next using the Linked Data API. The Linked Data API basically lets you create RESTful-style shortcuts to relatively complex SPARQL queries. Due to my lack of PHP skills it was initially a bumpy ride getting it to work (see here) but I got there in the end, and the result was an API that lets you retrieve research projects by selecting regions either through their SNAC codes or Ordnance Survey IDs, e.g.:
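As a rough idea of what such a query might look like, the sketch below builds a SPARQL string from a unit ID. The namespace, the property names and the region URI pattern are all placeholders under `example.org`, not the real vocabulary, and the check that the unit ID is numeric is an assumption made here to keep arbitrary text out of the query.

```python
def projects_in_region_query(unit_id):
    """Build a SPARQL query for projects linked, via university and postcode, to a region."""
    unit_id = str(unit_id)
    if not unit_id.isdigit():
        # Assumption: unit IDs are purely numeric; reject anything else.
        raise ValueError("unit ID should be numeric")
    # Hypothetical URI pattern for a region resource.
    region = "http://example.org/region/%s" % unit_id
    return """
PREFIX ex: <http://example.org/ns#>
SELECT DISTINCT ?project WHERE {
  ?university ex:workedOn ?project ;
              ex:postcode ?postcode .
  ?postcode ex:district <%s> .
}
""".strip() % region
```

The resulting string would then be sent to whatever SPARQL endpoint the triple store exposes.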
Results can be returned in different formats using content negotiation or by simply adding the relevant extension (.html, .json) to the URI, e.g.:
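As a rough illustration with a placeholder URI (not the real API endpoint), the two ways of asking for a format might look like this from Python. The helper names and the media type mapping are assumptions for the sketch.

```python
from urllib.request import Request

# Assumed mapping from URI extension to media type.
MEDIA_TYPES = {"html": "text/html", "json": "application/json"}


def by_extension(uri, fmt):
    """Ask for a format by appending .html or .json to the URI.

    Assumes the URI has no query string.
    """
    if fmt not in MEDIA_TYPES:
        raise ValueError("unsupported format: %s" % fmt)
    return Request(uri + "." + fmt)


def by_conneg(uri, fmt):
    """Ask for a format through content negotiation (the Accept header)."""
    if fmt not in MEDIA_TYPES:
        raise ValueError("unsupported format: %s" % fmt)
    return Request(uri, headers={"Accept": MEDIA_TYPES[fmt]})


# Hypothetical endpoint, for illustration only.
API = "http://example.org/api/projects"
json_by_suffix = by_extension(API, "json")
json_by_header = by_conneg(API, "json")
```

Either request object can be passed to `urllib.request.urlopen`; the suffix form has the advantage of being easy to paste into a browser.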
I hope this example shows how linked data can be useful in building applications on top of data aggregations. To summarise:
- Publishers release data in linked data format.
- Having data in a common format (RDF) with dereferenceable URIs makes it relatively easy to retrieve and aggregate from a number of sources, especially if data is linked to URIs for ‘things’ and not just ‘strings’.
- The Linked Data API makes it possible to build a RESTful service on top of a data aggregation, so web developers need not be put off by complex SPARQL queries.
- Applications can then be built using these services.
Note: for some reason the HTML content negotiation (conneg) only seems to work in Firefox.