Archive

Posts Tagged ‘Ordnance Survey’

How are you using Ordnance Survey Linked Data?

I might have mentioned (a few times) that the new-look Ordnance Survey linked data site is now live. Two questions I ask from time to time are:

1) Are you using the data, and if so what for (if you don’t mind saying)?

2) Even if you aren’t actively using the data are you linking to it?

Please comment below if you have anything you’d like to share. Thank you in advance!

New Ordnance Survey Linked Data Site not just for Data Geeks

June 3, 2013 1 comment

Ordnance Survey’s new linked data site went live today. You can read the official press release here. One of the major improvements is the site’s look and feel, so it should now be useful to people who don’t care about ‘scary things’ like APIs, linked data or RDF.

One key additional feature of the new site is map views (!) of entities in the data. This means the site could be useful if you want to share your postcode with friends or colleagues as a means of locating your house or place of work. Every postcode in Great Britain has a webpage in the OS linked data of the form:

http://data.ordnancesurvey.co.uk/id/postcodeunit/{POSTCODE}

Examples of this would be the OS HQ postcode:


http://data.ordnancesurvey.co.uk/id/postcodeunit/SO160AS

or the postcode for the University of Southampton:


http://data.ordnancesurvey.co.uk/id/postcodeunit/SO171BJ

Click on either of these links and you’ll see a map of the postcode, which you can view at various levels of zoom. You’ll also see useful information about the postcode such as its lat/long coordinate. More interestingly, you’ll notice that it provides information about the ward, district/unitary authority, county (where applicable) and country your postcode is located in. So for the University of Southampton postcode we can see it’s located in the ward Portswood, the district Southampton and the country England.
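As a rough sketch, the URI pattern above can be built from a postcode typed the way people usually write it. The helper name `postcode_uri` is made up for illustration, and the normalisation (drop spaces, upper-case) is an assumption based on the two example URIs rather than documented behaviour of the site.

```python
BASE = "http://data.ordnancesurvey.co.uk/id/postcodeunit/"


def postcode_uri(postcode):
    """Build an OS linked data URI for a postcode (hypothetical helper)."""
    # Assumption: the URI uses the postcode with spaces removed, in upper
    # case, as in the SO160AS and SO171BJ examples.
    compact = "".join(postcode.split()).upper()
    return BASE + compact


print(postcode_uri("so16 0as"))
```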

Another interesting addition to the site is links to a few useful external sites such as: They Work For You, Fix My Street, NHS Choices and Police UK. This hopefully makes the linked data site a useful location-based hub for information about what’s going on in your particular postcode area.

Why not give it a try with your postcode…:)

GeoSPARQL and Ordnance Survey Linked Data

April 26, 2013 3 comments

The Ordnance Survey Linked Data contains lots of qualitative spatial information, that is, topological relationships between different regions. We have information about what each region contains, is within and touches (e.g. Cambridgeshire touches Norfolk). These relationships were encoded using an Ordnance Survey vocabulary as there was nothing suitable at the time. Since then a new standard has emerged from the OGC called GeoSPARQL. In the long term we would probably like to migrate the OS data over to the GeoSPARQL standard, but to stop third-party applications that use the data from breaking we decided not to in this release. However, mappings from the OS vocabulary have been made to the GeoSPARQL vocabulary via ‘owl:equivalentProperty’. So each of the spatial relationships now has a link to its equivalent in GeoSPARQL. Please see: contains, within, touches, equals, disjoint and partially overlaps for more details on which properties they are related to in GeoSPARQL.
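To illustrate what such a mapping buys a consumer, here is a small sketch that rewrites OS spatial predicates to GeoSPARQL ones using an equivalence table. The pairings below are assumptions for illustration only (as is the OS namespace string); the authoritative targets are whatever each OS property page lists under owl:equivalentProperty. The `ex:` identifiers are made up.

```python
OS = "http://data.ordnancesurvey.co.uk/ontology/spatialrelations/"
GEO = "http://www.opengis.net/ont/geosparql#"

# Assumption: illustrative pairings only. Check the OS property pages for
# the real owl:equivalentProperty targets before relying on these.
EQUIVALENT = {
    OS + "contains": GEO + "sfContains",
    OS + "within": GEO + "sfWithin",
    OS + "touches": GEO + "sfTouches",
}


def to_geosparql(triples):
    """Rewrite predicates that have a known equivalent; leave others alone."""
    return [(s, EQUIVALENT.get(p, p), o) for (s, p, o) in triples]


# Made-up identifiers standing in for Cambridgeshire and Norfolk.
data = [("ex:Cambridgeshire", OS + "touches", "ex:Norfolk")]
print(to_geosparql(data))
```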

 

Ordnance Survey Linked Data and the Reconciliation API

April 25, 2013 3 comments

The new Ordnance Survey Linked Data has a reconciliation API that allows users to turn text into URIs by matching against the Ordnance Survey linked data using a tool called Open Refine.

I’m not an expert on Open Refine but had a quick try of the tool today using some open data about libraries (available here). Instructions on installing Open Refine can be found here.

To use Open Refine, load your data into the tool and create a new project. On loading the library data into Open Refine you should see something like this:

Image

We can use Open Refine to turn the labels in both the ‘county’ column and the postcode column into URIs. For the county column, click the down arrow next to the column name and select reconcile -> start reconciling. Now click ‘Add Standard Service’ and add the following URL
http://beta.data.ordnancesurvey.co.uk/datasets/boundary-line/apis/reconciliation. 

As the ‘county’ column will contain a mixture of types, select the ‘reconcile against no particular type’ option and click ‘start reconciling’. You should now see that most of the text labels have turned into hyperlinks (note that OS linked data does not include Northern Ireland data, which accounts for the missing values).

You can do the same for the postcode column, but this time use the API at:
http://beta.data.ordnancesurvey.co.uk/datasets/code-point-open/apis/reconciliation

Your data should now look something like:

Image

You have now successfully replaced the text in these columns with links to the OS linked data.

Another useful thing to try is a simple bit of geocoding based on postcodes. Again go to the postcode column and select ‘Edit Column -> Add Column by fetching URLs’. Where asked, type in a column name (e.g. PC JSON) and in the Expression box type:

'http://beta.data.ordnancesurvey.co.uk/datasets/code-point-open/apis/search?output=json&query=' + escape(value, 'url')

You should now see a column appear full of JSON results:

Screen Shot 2013-04-25 at 15.23.11
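For anyone who wants to do the same thing outside Open Refine, the URL-building step might look something like this in Python. This is only a sketch: `search_url` is a name invented here, and `quote` is assumed to be close enough to what `escape(value, 'url')` does for ordinary postcodes.

```python
from urllib.parse import quote

SEARCH = ("http://beta.data.ordnancesurvey.co.uk/datasets/"
          "code-point-open/apis/search?output=json&query=")


def search_url(postcode):
    """Return the search API URL for one postcode (hypothetical helper)."""
    # quote(..., safe="") percent-encodes spaces and reserved characters,
    # which is roughly what escape(value, 'url') does in Open Refine.
    return SEARCH + quote(postcode, safe="")


print(search_url("SO16 0AS"))
```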

On the PC JSON column select “Edit Column -> Add Column Based on this column”. Again add a column name of your choice. I wanted to extract the value of the easting and northing and add it as a column so I called my new column ‘easting,northing’. In the expression box enter the following to get the value of the easting and northing:

with(value.parseJson(), pair, pair.results[0].easting + ',' + pair.results[0].northing)

and you should now see something like:

Screen Shot 2013-04-25 at 15.27.27
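The extraction step can be sketched in Python too. The only thing taken from the expression above is the shape of the response (a `results` list whose first item has `easting` and `northing`); the function name and the sample coordinate values are invented for illustration.

```python
import json


def easting_northing(cell):
    """Return 'easting,northing' from the first search result, or None."""
    results = json.loads(cell).get("results") or []
    if not results:
        return None  # e.g. a postcode the search API did not recognise
    first = results[0]
    return "%s,%s" % (first["easting"], first["northing"])


# Hypothetical response body; the coordinate values are invented.
sample = '{"results": [{"easting": 437000, "northing": 115000}]}'
print(easting_northing(sample))
```

Unlike the one-line expression, this version returns nothing rather than failing when a postcode has no match, which is handy for rows such as the Northern Ireland ones.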

Congratulations…you have now geo-coded your library spreadsheet via a postcode and the OS linked data.

For more info on how to use Open Refine for reconciliation watch this YouTube video.

Announcing new beta Ordnance Survey Linked Data Site

April 25, 2013 1 comment

Ordnance Survey has released a new beta linked data site. You can read the official press release here.

I thought I’d write a quick (unofficial) guide to some of the changes. The most obvious one, hopefully apparent as you navigate round the site, is the much improved look and feel, including maps (!) showing where particular resources are located. Try this and this for example. Maps can be viewed at different levels of zoom.

Another improvement is the addition of new APIs. The first of these is an improved search function. Supported fields for search and some examples can be found here. The search API now includes a spatial search element.

The SPARQL API is improved. Output is now available in additional formats (such as CSV) as well as the usual SPARQL-XML and SPARQL-JSON. Example SPARQL queries are also included to get users started.

Another interesting addition is a new reconciliation API. This allows developers to use the Ordnance Survey linked data with the Open Refine tool. This would allow a user to match a list of postcodes or place names in a spreadsheet to URIs in the Ordnance Survey linked data.

In the new release the Ordnance Survey linked data has been split into distinct datasets. You could use the above described APIs with the complete dataset or, if preferred, just work on the Code-Point Open or Boundary Line datasets.

For details on where to send feedback on the new site please see the official press release here.

Update: I blogged a bit more about some of the new APIs here.

Introducing RAGLD

December 21, 2011 1 comment

RAGLD (Rapid Assembly of Geo-centred Linked Data) is a project looking at the development of a software component library to support the Rapid Assembly of Geo-centred Linked Data applications.

The advent of new standards and initiatives for data publication in the context of the World Wide Web (in particular the move to linked data formats) has resulted in the availability of rich sources of information about the changing economic, geographic and socio-cultural landscape of the United Kingdom, and many other countries around the world. In order to exploit the latent potential of these linked data assets, we need to provide access to tools and technologies that enable data consumers to easily select, filter, manipulate, visualize, transform and communicate data in ways that are suited to specific decision-making processes.

In this project, we will enable organizations to press maximum value from the UK’s growing portfolio of linked data assets. In particular, we will develop a suite of software components that enables diverse organizations to rapidly assemble ‘goal-oriented’ linked data applications and data processing pipelines in order to enhance their awareness and understanding of the UK’s geographic, economic and socio-cultural landscape.

A specific goal for the project will be to support comparative and multi-perspective region-based analysis of UK linked data assets (this refers to an ability to manipulate data with respect to various geographic region overlays), and as part of this activity we will incorporate the results of recent experimental efforts which seek to extend the kind of geo-centred regional overlays that can be used for both analytic and navigational purposes. The technical outcomes of this project will lead to significant improvements in our ability to exploit large-scale linked datasets for the purposes of strategic decision-making.

RAGLD is a collaborative research initiative between the Ordnance Survey, Seme4 Ltd and the University of Southampton, and is funded in part by the Technology Strategy Board’s “Harnessing Large and Diverse Sources of Data” programme. Commencing October 2011, the project runs for 18 months.

If you’d like to input into the requirements phase of the project I’d be very grateful if you could fill in one of these questionnaires. Many thanks in advance.

Making things with Ordnance Survey Linked Data…

November 3, 2011 7 comments

Following the example of “Making things with BBC data” I thought I’d ask the same question for Ordnance Survey linked data. Please leave a comment if you’ve used Ordnance Survey linked data for anything from a quick hack to a full-blown project, or even if you just link to it in your data. Thanks!

 

 

Adventures with Kasabi…and a request for help…

October 5, 2011 1 comment

I’ve been playing around with Kasabi a bit of late. Kasabi is a new information market place from Talis that provides a useful place to publish your data, and then build services on top of it.

By way of a quick example, you’ll see that the Ordnance Survey Linked Data is currently hosted in Kasabi here. There are a number of standard APIs provided with each dataset, such as a SPARQL endpoint, search API etc. Kasabi provides an easy way to map complex SPARQL queries to simple API queries. This API, for instance, provides an easy way to do topological queries on the Ordnance Survey Linked Data. For example:

http://api.kasabi.com/dataset/ordnance-survey-linked-data/apis/33m?id=7000000000037256&spatialrelation=touches&apikey=yourkey

will find all regions that touch The City of Southampton.

This API gives you a list of all postcodes (and their lat/long) within a particular region. For example:

http://api.kasabi.com/api/ordnance-survey-postcode-region?district=7000000000037256&apikey=yourkey

will find all postcodes in The City of Southampton. Additionally we can apply a style sheet to get back the same information as KML:

http://api.kasabi.com/api/ordnance-survey-postcode-region?district=7000000000037256&apikey=yourkey&output=kml
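If you are calling these APIs from a script, it can be tidier to build the query string rather than paste URLs together by hand. A minimal sketch, using only the parameter names that appear in the URLs above (the function name is made up, and `yourkey` is a placeholder as before):

```python
from urllib.parse import urlencode

API = "http://api.kasabi.com/api/ordnance-survey-postcode-region"


def postcode_region_url(district, apikey, output=None):
    """Build a postcode-region request URL (hypothetical helper)."""
    params = [("district", district), ("apikey", apikey)]
    if output:
        # e.g. "kml" to get the style-sheeted KML version
        params.append(("output", output))
    return API + "?" + urlencode(params)


print(postcode_region_url("7000000000037256", "yourkey", output="kml"))
```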

Kasabi also provides a handy place to store and host data, and one Sunday afternoon I decided to see how easy it would be to create a couple of hyperlocal datasets: one for Southampton and one for Hampshire. The basic approach in creating these hyperlocal datasets was to effectively ‘cookie cut’ region-specific data from a number of different linked (open) data sets and put them into one store. So for Southampton and Hampshire we have a list of: airports, bus stops, stations, schools, GPs, hospitals, renewable energy generators, heritage sites, councillors, crime statistics, administrative regions and postcodes…

Each important element, like schools or hospitals, is linked to a postcode, district, ward and, in the case of Hampshire, county. The Ordnance Survey linked data is effectively acting as the glue between disparate sources of information. The fact that each of these datasets was provided as linked data, and furthermore referenced common identifiers for administrative regions and postcodes, meant it was very easy to bring them together in one store. Some sample queries are provided here.

It gets more interesting when you combine elements from different datasets to, for example, ask questions like ‘find me GPs in my ward, and all the bus stops within a 100 metre radius of those GPs’. If one were extra paranoid it would then be possible to extend the query to only find bus stops in areas with low levels of anti-social crime. These queries are all well and good, or ‘nerdy, but nice’ as Andrew Stott put it.
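The ‘within a 100 metre radius’ part of such a question is simple once you have coordinates. The sketch below assumes each GP and bus stop comes with an easting and northing in metres (an assumption about the data, not something stated above), and every name and coordinate in it is invented.

```python
import math


def within_radius(centre, points, radius_m=100.0):
    """Return the names of points within radius_m metres of centre.

    Coordinates are (easting, northing) pairs in metres, so straight-line
    distance is just Pythagoras.
    """
    cx, cy = centre
    return [name for name, (x, y) in points.items()
            if math.hypot(x - cx, y - cy) <= radius_m]


# Invented coordinates for one GP surgery and three bus stops.
gp = (442000.0, 112000.0)
bus_stops = {
    "stop_a": (442030.0, 112040.0),  # 50 m away
    "stop_b": (442100.0, 112000.0),  # exactly 100 m away
    "stop_c": (442300.0, 112400.0),  # 500 m away
}
print(within_radius(gp, bus_stops))
```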

What I was really hoping to do was build a nice webapp on top of this integrated data. Anyone who has seen my previous mash-ups will know that web design is not amongst my key skills, so Zach Beauvais suggested I put word out to the developer community to see if anyone fancied building on this data to make something cool and interesting like (for example) the (in)famous postcode paper. Any volunteers? :)

 

 

 

How can I use the Ordnance Survey Linked Data: a python rdflib example

January 18, 2011 3 comments

In this blog post I talked about the potential of (Ordnance Survey) linked data. Partly motivated by this challenge I decided to write up how I did the mash up of data.gov.uk data and Ordnance Survey linked data. This post is a slightly different take on a previous post.

For this mashup I used Python 2.7 and rdflib 3.0.0.

First off you need to install rdflib. Full instructions on doing this can be found here. If you use easy_install you can install rdflib by typing:

easy_install -U "rdflib>=3.0.0"

You will also need to install rdfextras (see here). This can also be done using easy_install

easy_install rdfextras

You are now good to go. The next thing I needed was the BIS funding data. This can be downloaded here. The original BIS data gives location for various organisations via a URI based on the organisation’s postcode. For example:

<http://education.data.gov.uk/id/institution/UniversityOfWalesSwansea>
<http://research.data.gov.uk/def/project/location>
<http://education.data.gov.uk/id/institution/BabrahamBioscienceTechnolgiesLtd/SA28PP> .

I edited the data to point to URIs for postcodes in the Ordnance Survey linked data (note these weren’t available when the BIS data was created). Now we have:

<http://education.data.gov.uk/id/institution/UniversityOfWalesSwansea>
<http://research.data.gov.uk/def/project/location>
<http://data.ordnancesurvey.co.uk/id/postcodeunit/SA28PP> .

This triple basically states the location of the University of Wales in terms of its postcode.
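One way to do that kind of edit in bulk is sketched below. It is not necessarily how the data was actually edited: it assumes the postcode is always the last path segment of the original location URI, and the function name is made up.

```python
OS_POSTCODE = "http://data.ordnancesurvey.co.uk/id/postcodeunit/"


def to_os_postcode_uri(location_uri):
    """Swap a BIS location URI for the matching OS postcode URI."""
    # Assumption: the postcode is the final path segment of the original URI.
    postcode = location_uri.rstrip("/").rsplit("/", 1)[-1]
    return OS_POSTCODE + postcode


old = ("http://education.data.gov.uk/id/institution/"
       "BabrahamBioscienceTechnolgiesLtd/SA28PP")
print(to_os_postcode_uri(old))
```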

So the edited RDF data now contains location information for research institutions in terms of a postcode URI, and it also contains information about the research projects worked on by those institutions and how much funding those projects received. Using rdflib it is very straightforward to load this data into Python and use it programmatically. Here’s how:

These first few lines load the necessary libraries and plugins:

import logging
import rdflib

# Configure how we want rdflib logger to log messages
_logger = logging.getLogger("rdflib")
_logger.setLevel(logging.DEBUG)
_hdlr = logging.StreamHandler()
_hdlr.setFormatter(logging.Formatter('%(name)s %(levelname)s: %(message)s'))
_logger.addHandler(_hdlr)

from rdflib import Graph
from rdflib import URIRef, Literal, BNode, Namespace, ConjunctiveGraph
from rdflib import RDF
from rdflib import RDFS
rdflib.plugin.register('sparql', rdflib.query.Processor,'rdfextras.sparql.processor', 'Processor')
rdflib.plugin.register('sparql', rdflib.query.Result, 'rdfextras.sparql.query', 'SPARQLQueryResult')

 

We now create a Graph in which to store the RDF:

store = Graph()

The data can be easily loaded from the web or hard drive. In this case I have the files stored locally:

store.parse("file:/C:/Projects/RDFPythonPlay/data/businessdatagovuk.nt", format="nt")
store.parse("file:/C:/Projects/RDFPythonPlay/data/educationdatagovuk.nt", format="nt")
store.parse("file:/C:/Projects/RDFPythonPlay/data/patentsdatagovuk.nt", format="nt")
store.parse("file:/C:/Projects/RDFPythonPlay/data/researchdatagovuk.nt", format="nt")

Recall from here that I am interested in seeing which parties are funding in which local authority areas. The data as it stands will not let me do this. However, the OS postcode linked data provides information about the local authority areas that a postcode is contained in. All I now have to do is ‘follow my nose’ and load in the postcode data. I can do this by going through the triples containing links between organisations and postcodes via the location property. First I set up a few namespace bindings:

# Bind a few prefix, namespace pairs.
store.bind("PROJECT", "http://research.data.gov.uk/def/project/")
store.bind("FOAF", "http://xmlns.com/foaf/0.1/")

# Create a namespace object for the project and FOAF namespaces.
PROJECT = Namespace("http://research.data.gov.uk/def/project/")
FOAF = Namespace("http://xmlns.com/foaf/0.1/")

I can now iterate over the triples in the store and find those whose subject is of type foaf:Organization and which contain the location property. An example of such a triple would be the one we had above:

<http://education.data.gov.uk/id/institution/UniversityOfWalesSwansea>
<http://research.data.gov.uk/def/project/location>
<http://data.ordnancesurvey.co.uk/id/postcodeunit/SA28PP> .

I can then lookup the data behind the postcode URI and load this into the store. This is all done by the following code:

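# For each foaf:Organization in the store, look up its postcode URI
# and load the RDF published at that URI into the store.
for organization in store.subjects(RDF.type, FOAF["Organization"]):
    for postcode in store.objects(organization, PROJECT["location"]):
        try:
            print(postcode)
            store.parse(postcode)
        except Exception:
            # Any failure (e.g. a 404 for an unknown postcode) is reported and skipped
            print('404 not found')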

Now the data in the store will contain a link from organisation to postcode, and a link from postcode to local authority area, so we can traverse the graph to find the link from organisation to local authority area. A simple SPARQL query retrieves a list of projects along with the local authority areas the participating organisations are based in. The SPARQL query to do this is:

select distinct ?label ?districtlabel
where
{
?organisation <http://research.data.gov.uk/def/project/project> ?project .
?project <http://www.w3.org/2000/01/rdf-schema#label> ?label .
?organisation <http://research.data.gov.uk/def/project/location> ?x .
?x <http://data.ordnancesurvey.co.uk/ontology/postcode/district> ?district .
?district <http://www.w3.org/2000/01/rdf-schema#label> ?districtlabel . }

We can now add that into our Python code as follows and print out the query answers:


query = """select distinct ?label ?districtlabel
where
{
?organisation <http://research.data.gov.uk/def/project/project> ?project .
?project <http://www.w3.org/2000/01/rdf-schema#label> ?label .
?organisation <http://research.data.gov.uk/def/project/location> ?x .
?x <http://data.ordnancesurvey.co.uk/ontology/postcode/district> ?district .
?district <http://www.w3.org/2000/01/rdf-schema#label> ?districtlabel . }"""

# Run the query and get the results back as Python tuples
answers = store.query(query).serialize('python')

for (label, districtlabel) in answers:
    print("%s was funded in %s" % (label, districtlabel))

To summarise, this post shows how you just need rdflib and Python to build a simple linked data mashup – no separate triplestore is required! RDF is loaded into a Graph. Triples in this Graph reference postcode URIs. These URIs are de-referenced and the RDF behind them is loaded into the Graph. We have now enhanced the data in the Graph with local authority area information. So as well as knowing the postcode of the organisations taking part in certain projects we now also know which local authority area they are in. Job done! We can now analyse funding data at the level of postcode, local authority area and (as an exercise for the reader) European region.
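If it helps to see the ‘follow your nose’ idea stripped of any library, here is the same organisation -> postcode -> district hop over a plain list of triples. It is a toy sketch rather than the rdflib code from this post, written so that it also runs on Python 3, and the district URI in it is a made-up placeholder.

```python
LOCATION = "http://research.data.gov.uk/def/project/location"
DISTRICT = "http://data.ordnancesurvey.co.uk/ontology/postcode/district"


def districts_for(org, triples):
    """Follow org -> postcode -> district and return the district URIs."""
    postcodes = {o for (s, p, o) in triples if s == org and p == LOCATION}
    return sorted(o for (s, p, o) in triples
                  if s in postcodes and p == DISTRICT)


PC = "http://data.ordnancesurvey.co.uk/id/postcodeunit/SA28PP"
ORG = "http://education.data.gov.uk/id/institution/UniversityOfWalesSwansea"
triples = [
    (ORG, LOCATION, PC),
    # Hypothetical district identifier; the real one comes from the
    # RDF behind the postcode URI.
    (PC, DISTRICT, "http://example.org/district/swansea"),
]
print(districts_for(ORG, triples))
```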

[Python note - WordPress has a habit of mangling indentation, so check that the loop bodies are indented before running the code :) ]

So what can I do with the new Ordnance Survey Linked Data?

October 25, 2010 7 comments

In a previous post I wrote up some of the features of the new Ordnance Survey Linked Data. In this blog post I want to run through a concrete example of the sort of thing you can build using this linked data.

A while ago Talis built their BIS Explorer. The aim of this application was to allow users to “identify centres of excellence at the click of a button” and more can be read about the application here. This data mash-up took different data sources about funded research projects and joined them together using linked data. In the original application you could, for example, look at funded research projects by European Region in Great Britain. This can be seen here. At the time this demo was created Ordnance Survey had yet to publish its postcode data as linked data, but if it had, it would have been very easy to get a more fine-grained view of research projects down at the county and district level. Here’s how…

The basic data model of the original BIS data was fairly straightforward. Universities and businesses have a link to the projects they worked on. For each university there is also postcode information. Things get interesting if, instead of (or as well as) linking to a string representation of a postcode, you link to the URI for said postcode. This can be done by using the property:


http://data.ordnancesurvey.co.uk/ontology/postcode/postcode

So say we wanted to do this for Imperial College. All we need is this (this example is in N-Triples format) in our data:

<http://education.data.gov.uk/id/institution/ImperialCollegeOfScienceTechnologyAndMedicine>
<http://data.ordnancesurvey.co.uk/ontology/postcode/postcode>
<http://data.ordnancesurvey.co.uk/id/postcodeunit/W68RF> .

Now, by the power of linked data, connecting to a resource for the postcode means we can now enrich our university dataset with knowledge of the ward, county and district the university is in. Also, given that the university is connected to a project we have a link from project to region. Through the link from project to university to postcode to region we can now start to have a more finely grained view on which areas are getting more funding.

So how do we do this in practice? These are the steps I followed.

  1. Download the BIS data from here and load it into a triple store (linked data database) of your choice. There are plenty of good open source ones available e.g. Sesame or TDB to name two.
  2. I then added the links to postcode URIs as described above.
  3. Following that I loaded the data for the postcodes in a similar manner to that described here. A relatively simple script retrieved the RDF for the relevant postcodes and loaded the RDF into my store. The nice thing about linked data and RDF is that the stores are like a big bucket of data and you can keep throwing more and more in. Hopefully future linked data tools will make this step trivial, but for now some scripting was required.
  4. Job done. I now have links from research projects to regions.
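For step 3, the first job of such a script is working out which postcode URIs need fetching. The sketch below is one way to do that, not the script I actually used: it pulls the distinct postcode URIs out of N-Triples lines with a deliberately simple regular expression that only handles triples whose three terms are all URIs. The function and constant names are made up.

```python
import re

POSTCODE_PREFIX = "http://data.ordnancesurvey.co.uk/id/postcodeunit/"
# Matches simple N-Triples lines whose subject, predicate and object
# are all URIs; literals and blank nodes are ignored.
TRIPLE = re.compile(r"^\s*<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def postcode_uris(lines):
    """Return the distinct postcode URIs used as objects, in sorted order."""
    found = set()
    for line in lines:
        match = TRIPLE.match(line)
        if match and match.group(3).startswith(POSTCODE_PREFIX):
            found.add(match.group(3))
    return sorted(found)


lines = [
    "<http://education.data.gov.uk/id/institution/"
    "ImperialCollegeOfScienceTechnologyAndMedicine> "
    "<http://data.ordnancesurvey.co.uk/ontology/postcode/postcode> "
    "<http://data.ordnancesurvey.co.uk/id/postcodeunit/W68RF> .",
]
print(postcode_uris(lines))
```

Each URI in the result can then be de-referenced and the RDF behind it added to the store.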

Basically what I created from this was an aggregation of various datasets that you can now query. This is something that is made very easy using linked data and URIs to identify things like postcodes. As more publishers release data in linked data form there is more and more potential for building services and applications on top of aggregations of these datasets.  So that’s what I decided to do…

This application (I make the usual apology for my lack of web development skills and for the slowness, which some caching would no doubt sort out) builds a clickable map view of this data aggregation. The OS OpenSpace API makes it possible to retrieve the unit ID for selected polygons. I can then use this unit ID in a SPARQL query to find the projects funded in that region.

However, it would have been easier if there was a RESTful API on top of the data aggregation that would have let me retrieve these results instead of doing some SPARQL. So that is what I decided to build next using the Linked Data API. The Linked Data API basically lets you create RESTful-style shortcuts to relatively complex SPARQL queries. Due to my lack of PHP skills it was an initially bumpy ride getting it to work (see here) but I got there in the end, and the result was an API that lets you return research projects by selecting regions either through their SNAC codes or Ordnance Survey IDs, e.g.:


http://www.johngoodwin.me.uk/bis/api/project/county/os/{unit id}

e.g. http://www.johngoodwin.me.uk/bis/api/project/county/os/17765

 

Results can be returned in different formats using content negotiation [1] or by simply adding the relevant .html or .json to the URI, e.g.:


http://www.johngoodwin.me.uk/bis/api/project/euro/os/41424.html


http://www.johngoodwin.me.uk/bis/api/project/euro/os/41424.json
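From a client’s point of view the URL patterns above boil down to a region type, a unit ID and an optional format suffix. A small sketch of a helper that assembles them (the function name and argument names are made up; only ‘county’ and ‘euro’ appear in the examples, so other region types are untested guesses):

```python
API_ROOT = "http://www.johngoodwin.me.uk/bis/api/project"


def project_url(region_type, unit_id, fmt=None):
    """Build a project-by-region URL, e.g. region_type 'county' or 'euro'."""
    url = "%s/%s/os/%s" % (API_ROOT, region_type, unit_id)
    if fmt:
        url += "." + fmt  # 'html' or 'json', as in the examples above
    return url


print(project_url("euro", 41424, "json"))
```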

I hope this example shows how linked data can be useful in building applications on top of data aggregations. To summarise:

  1. Publishers release data in linked data format.
  2. Having data in a common format (RDF) with dereferenceable URIs makes it relatively easy to retrieve and aggregate from a number of resources, especially if data is linked to URIs for ‘things’ and not just ‘strings’.
  3. The Linked Data API makes it possible to build a RESTful service on top of a data aggregation so web developers need not be put off by complex SPARQL queries.
  4. Applications can then be built using these services.

[1] for some reason the HTML conneg only seems to work in Firefox.
