Posts Tagged ‘Semantic Web’

Quick Play with Cayley Graph DB and Ordnance Survey Linked Data

June 29, 2014 2 comments

Earlier this month Google announced the release of the open source graph database/triplestore Cayley. This weekend I thought I would have a quick look at it, and try some simple queries using the Ordnance Survey Linked Data.

Cayley is written in Go, so first I had to download and install that. I then downloaded Cayley from here. As an initial experiment I decided to use the Boundary Line Linked Data, and you can grab the data as N-Triples here. I only wanted a subset of this data: I didn’t need all of the triples storing the complex boundary geometries for my initial test, so I discarded the files of the form *-geom.nt and the files of the form county.nt, dbu.nt etc. (these are the ones with the boundaries in). Finally I put the remainder of the data into one file so it was ready to load into Cayley.
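The filtering and merging step could be scripted along the following lines. This is only a sketch in Python: the function name merge_ntriples and the BOUNDARY_FILES set are made up for illustration, and the set lists only the two boundary file names mentioned above, so it would need extending to match the actual download.

```python
from pathlib import Path

# Assumption: names of the boundary files to skip. Only county.nt and dbu.nt
# are named in the post; extend this set to match your own download.
BOUNDARY_FILES = {"county.nt", "dbu.nt"}


def merge_ntriples(source_dir, target_file):
    """Concatenate the .nt files in source_dir into target_file, skipping
    geometry and boundary files. Returns the names of the files merged."""
    merged = []
    target = Path(target_file)
    with target.open("w", encoding="utf-8") as out:
        for path in sorted(Path(source_dir).glob("*.nt")):
            if path.resolve() == target.resolve():
                continue  # never read the output file back into itself
            if path.name.endswith("-geom.nt") or path.name in BOUNDARY_FILES:
                continue  # boundary geometry is not needed for this test
            text = path.read_text(encoding="utf-8")
            out.write(text)
            if text and not text.endswith("\n"):
                out.write("\n")  # keep one triple per line across file joins
            merged.append(path.name)
    return merged
```

Calling merge_ntriples("boundaryline", "boundaryline/all.nt") would then produce the single file used in the next step.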

It is very easy to load data into Cayley; see the getting started section on the Cayley pages here. I decided I wanted to try the web interface, so loading the data (in a file called all.nt) was a simple case of typing:

./cayley http --dbpath=./boundaryline/all.nt

Once you’ve done this, point your web browser to http://localhost:64210/ and you should see something like:

[Screenshot: the Cayley web interface]

One of the first things that will strike people used to RDF/triplestores is that Cayley does not have a SPARQL interface, and instead uses a query language based on Gremlin. I am new to Gremlin, but it seems it has already been used to explore linked data; see the blog post from Dan Brickley from a few years ago.

The main purpose of this blog post is to give a few simple examples of queries you can perform on the Ordnance Survey data in Cayley. If you have Cayley running then you can find the query language documented here.

At the simplest level the query language seems to be an easy way to traverse the graph by starting at a node/vertex and following incoming or outgoing links. So to find all the regions that touch Southampton it is a simple case of starting at the Southampton node, following a touches outbound link and returning the results:

g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches").All()

Giving:

[Screenshot: query results]

If you want to return the names and not the IDs:

g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches").Out("http://www.w3.org/2000/01/rdf-schema#label").All()

[Screenshot: query results]

You can also filter, so to see just the counties bordering Southampton:

g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches").Has("http://www.w3.org/1999/02/22-rdf-syntax-ns#type", "http://data.ordnancesurvey.co.uk/ontology/admingeo/County").Out("http://www.w3.org/2000/01/rdf-schema#label").All()

[Screenshot: query results]

The Ordnance Survey linked data also has spatial predicates ‘contains’, ‘within’ as well as ‘touches’. Analogous queries can be done with those. E.g. find me everything Southampton contains:

g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/contains").Out("http://www.w3.org/2000/01/rdf-schema#label").All()
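Since these queries differ only in the starting node and the spatial predicate, the query text can be generated rather than retyped. The Python helper below is a hypothetical convenience, not part of Cayley: labels_query is an invented name, and it only builds the query string, which still has to be pasted into (or sent to) a running Cayley instance.

```python
import json

OS_ID = "http://data.ordnancesurvey.co.uk/id/"
SPATIAL = "http://data.ordnancesurvey.co.uk/ontology/spatialrelations/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def labels_query(unit_id, relation):
    """Build the text of a Cayley query that returns the labels of everything
    linked from the given unit by the given spatial relation
    (touches, contains or within)."""
    # json.dumps yields a double-quoted string literal that is valid JavaScript.
    node = json.dumps(OS_ID + unit_id)
    predicate = json.dumps(SPATIAL + relation)
    label = json.dumps(RDFS_LABEL)
    return f"g.V({node}).Out({predicate}).Out({label}).All()"


print(labels_query("7000000000037256", "contains"))
```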

So after this very quick initial experiment it seems that Cayley provides an easy way of doing quick, simple queries. One query I wanted to do was find everything in, say, Hampshire: the full transitive closure. This is very easy to do in SPARQL, but in Cayley (at first glance) you’d have to write some extra code (not exactly rocket science, but a bit of a faff compared to SPARQL). I rarely touch JavaScript these days, so for me personally this will never replace a triplestore with a SPARQL endpoint, but for JS developers this tool will be a great way to get started with and explore linked data/RDF. I might well brush up on my JavaScript and provide more complicated examples in a later blog post…
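For reference, the "extra code" for a transitive closure amounts to a breadth-first walk over the contains links. The sketch below is in Python over a toy in-memory graph, not Cayley's own API; the node names are placeholders rather than real Ordnance Survey identifiers. In SPARQL 1.1 the same idea is usually written as a property path with the + operator.

```python
from collections import deque


def contained_in(start, contains):
    """Return every node reachable from start by following 'contains'
    links one or more times (breadth-first traversal)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in contains.get(node, ()):
            if child not in seen:  # guards against revisiting nodes and cycles
                seen.add(child)
                queue.append(child)
    return seen


# Toy data: placeholder names standing in for real URIs.
contains = {
    "county": ["district-a", "district-b"],
    "district-a": ["ward-1"],
    "district-b": ["ward-2"],
}

print(sorted(contained_in("county", contains)))
```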

New Ordnance Survey Linked Data Site not just for Data Geeks

June 3, 2013 1 comment

Ordnance Survey’s new linked data site went live today. You can read the official press release here. One of the major improvements is the look and feel, and as a result the site should be useful to people who don’t care about ‘scary things’ like APIs, linked data or RDF.

One key additional feature of the new site is map views (!) of entities in the data. This means the site could be useful if you want to share your postcode with friends or colleagues as a means of locating your house or place of work. Every postcode in Great Britain has a webpage in the OS linked data of the form:

http://data.ordnancesurvey.co.uk/id/postcodeunit/{POSTCODE}

Examples of this would be the OS HQ postcode:

http://data.ordnancesurvey.co.uk/id/postcodeunit/SO160AS

or the postcode for the University of Southampton:

http://data.ordnancesurvey.co.uk/id/postcodeunit/SO171BJ
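Turning a postcode as people normally write it into one of these URIs is a one-liner. The Python sketch below assumes, based only on the two examples above, that the URI uses the postcode in upper case with the space removed; postcode_uri is an illustrative name.

```python
POSTCODE_BASE = "http://data.ordnancesurvey.co.uk/id/postcodeunit/"


def postcode_uri(postcode):
    """Build the linked data URI for a postcode such as 'SO16 0AS'.
    Assumes the URI form is the upper-cased postcode without spaces."""
    compact = "".join(postcode.split()).upper()
    if not compact.isalnum():
        raise ValueError(f"not a plausible postcode: {postcode!r}")
    return POSTCODE_BASE + compact


print(postcode_uri("so16 0as"))
```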

Click on either of these links and you’ll see a map of the postcode, which you can view at various levels of zoom. You’ll also see useful information about the postcode such as its lat/long coordinate. More interestingly you’ll notice that it provides information about the ward, district/unitary authority, county (where applicable) and country your postcode is located in. So for the University of Southampton postcode we can see it’s located in the ward Portswood, the district Southampton and the country England.

Another interesting addition to the site is links to a few useful external sites such as They Work For You, Fix My Street, NHS Choices and Police UK. This hopefully makes the linked data site a useful location-based hub for information about what’s going on in your particular postcode area.

Why not give it a try with your postcode…:)

Announcing new beta Ordnance Survey Linked Data Site

April 25, 2013 1 comment

Ordnance Survey has released a new beta linked data site. You can read the official press release here.

I thought I’d write a quick (unofficial) guide to some of the changes. The most obvious one, hopefully apparent as you navigate round the site, is the much improved look and feel, including maps (!) showing where particular resources are located. Try this and this for example. Maps can be viewed at different levels of zoom.

Another improvement is the addition of new APIs. The first of these is an improved search function. Supported fields for search and some examples can be found here. The search API now includes a spatial search element.

The SPARQL API is improved. Output is now available in additional formats (such as CSV) as well as the usual SPARQL-XML and SPARQL-JSON. Example SPARQL queries are also included to get users started.

Another interesting addition is a new reconciliation API. This allows developers to use the Ordnance Survey linked data with the OpenRefine tool. A user could, for example, match a list of postcodes or place names in a spreadsheet to URIs in the Ordnance Survey linked data.

In the new release the Ordnance Survey linked data has been split into distinct datasets. You can use the APIs described above with the complete dataset or, if preferred, just work on the Code-Point Open or Boundary Line datasets.

For details on where to send feedback on the new site please see the official press release here.

Update: I blogged a bit more about some of the new APIs here.

What is Linked Data?

July 5, 2012 Leave a comment

I wrote an introductory blog post entitled “What is Linked Data?” over at the newly revamped data.gov.uk. You can read it here.

About Time…

April 22, 2012 8 comments

I’ve had an initial stab at encoding Allen’s interval algebra as an ontology, mainly using this page as guidance for property composition. I’ve done two versions: the first is limited to the subset of the composition rules that can be expressed in OWL 2, and the second contains a hopefully complete axiomatisation using DL Safe SWRL rules.

I’ve included some simple examples in the ontologies to show the inference at work.
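For readers who have not met the algebra, the thirteen Allen relations can be computed directly from interval endpoints. The Python sketch below is separate from the ontologies and only illustrates what the properties mean. The relation names are the usual ones from the literature and may not match the property names used in the ontology. The last lines show the simplest composition rule, that before composed with before gives before.

```python
def allen_relation(a, b):
    """Return the Allen relation that holds from interval a to interval b.
    Intervals are (start, end) pairs with start < end."""
    (a1, a2), (b1, b2) = a, b
    if a1 >= a2 or b1 >= b2:
        raise ValueError("intervals must have start < end")
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Only a proper partial overlap remains at this point.
    return "overlaps" if a1 < b1 else "overlapped-by"


# before composed with before gives before:
x, y, z = (1, 2), (3, 4), (5, 6)
print(allen_relation(x, y), allen_relation(y, z), allen_relation(x, z))
```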

The next step will be aligning this ontology to the OWL Time ontology. Feedback on potential applications etc. would be appreciated.

Introducing RAGLD

December 21, 2011 1 comment

RAGLD (Rapid Assembly of Geo-centred Linked Data) is a project looking at the development of a software component library to support the rapid assembly of geo-centred linked data applications.

The advent of new standards and initiatives for data publication in the context of the World Wide Web (in particular the move to linked data formats) has resulted in the availability of rich sources of information about the changing economic, geographic and socio-cultural landscape of the United Kingdom, and many other countries around the world. In order to exploit the latent potential of these linked data assets, we need to provide access to tools and technologies that enable data consumers to easily select, filter, manipulate, visualize, transform and communicate data in ways that are suited to specific decision-making processes.

In this project, we will enable organizations to press maximum value from the UK’s growing portfolio of linked data assets. In particular, we will develop a suite of software components that enables diverse organizations to rapidly assemble ‘goal-oriented’ linked data applications and data processing pipelines in order to enhance their awareness and understanding of the UK’s geographic, economic and socio-cultural landscape.

A specific goal for the project will be to support comparative and multi-perspective region-based analysis of UK linked data assets (this refers to an ability to manipulate data with respect to various geographic region overlays), and as part of this activity we will incorporate the results of recent experimental efforts which seek to extend the kind of geo-centred regional overlays that can be used for both analytic and navigational purposes. The technical outcomes of this project will lead to significant improvements in our ability to exploit large-scale linked datasets for the purposes of strategic decision-making.

RAGLD is a collaborative research initiative between the Ordnance Survey, Seme4 Ltd and the University of Southampton, and is funded in part by the Technology Strategy Board’s “Harnessing Large and Diverse Sources of Data” programme. Commencing October 2011, the project runs for 18 months.

If you’d like to input into the requirements phase of the project I’d be very grateful if you could fill in one of these questionnaires. Many thanks in advance.

So what can I do with the new Ordnance Survey Linked Data?

October 25, 2010 7 comments

In a previous post I wrote up some of the features of the new Ordnance Survey Linked Data. In this blog post I want to run through a concrete example of the sort of thing you can build using this linked data.

A while ago Talis built their BIS Explorer. The aim of this application was to allow users to “identify centres of excellence at the click of a button” and more can be read about the application here. This data mash-up took different data sources about funded research projects and joined them together using linked data. In the original application you could, for example, look at funded research projects by European Region in Great Britain. This can be seen here. At the time this demo was created Ordnance Survey was yet to publish its postcode data as linked data, but if it had, it would have been very easy to get a more fine-grained view of research projects down at the county and district level. Here’s how…

The basic data model of the original BIS data was fairly straightforward. Universities and businesses have a link to the projects they worked on. For each university there is also postcode information. Things get interesting if, instead of (or as well as) linking to a string representation of a postcode, you link to the URI for said postcode. This can be done by using the property:

http://data.ordnancesurvey.co.uk/ontology/postcode/postcode

So say we wanted to do this for Imperial College. All we need is this in our data (this example is in N-Triples format, where each statement sits on a single line):

<http://education.data.gov.uk/id/institution/ImperialCollegeOfScienceTechnologyAndMedicine> <http://data.ordnancesurvey.co.uk/ontology/postcode/postcode> <http://data.ordnancesurvey.co.uk/id/postcodeunit/W68RF> .

Now, by the power of linked data, connecting to a resource for the postcode means we can now enrich our university dataset with knowledge of the ward, county and district the university is in. Also, given that the university is connected to a project we have a link from project to region. Through the link from project to university to postcode to region we can now start to have a more finely grained view on which areas are getting more funding.
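Generating such link triples for a whole list of institutions is easy to script. The Python sketch below is illustrative only: postcode_link is an invented helper, and it assumes postcode URIs use the upper-cased postcode with spaces removed, as in the example triple.

```python
POSTCODE_PROPERTY = "http://data.ordnancesurvey.co.uk/ontology/postcode/postcode"
POSTCODE_BASE = "http://data.ordnancesurvey.co.uk/id/postcodeunit/"


def postcode_link(subject_uri, postcode):
    """Return one N-Triples line linking subject_uri to the URI for its
    postcode. Assumes the URI form is the postcode upper-cased, no spaces."""
    compact = "".join(postcode.split()).upper()
    return f"<{subject_uri}> <{POSTCODE_PROPERTY}> <{POSTCODE_BASE}{compact}> ."


print(postcode_link(
    "http://education.data.gov.uk/id/institution/ImperialCollegeOfScienceTechnologyAndMedicine",
    "W6 8RF",
))
```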

So how do we do this in practice? These are the steps I followed.

  1. Download the BIS data from here and load it into a triple store (linked data database) of your choice. There are plenty of good open source ones available e.g. Sesame or TDB to name two.
  2. I then added the links to postcode URIs as described above.
  3. Following that I loaded the data for the postcodes in a similar manner to that described here. A relatively simple script retrieved the RDF for the relevant postcodes and loaded the RDF into my store. The nice thing about linked data and RDF is that the stores are like a big bucket of data and you can keep throwing more and more in. Hopefully future linked data tools will make this step trivial, but for now some scripting was required.
  4. Job done. I now have links from research projects to regions.
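To make the result of these steps concrete, here is a toy picture in Python of what the aggregated store lets you do: follow the chain from project to university to postcode to district. All resource and predicate names below are shortened, hypothetical stand-ins, not the real BIS or Ordnance Survey vocabulary, and a real triple store would answer this with a SPARQL query instead.

```python
from collections import defaultdict

# Hypothetical, shortened names for illustration only.
triples = {
    ("project/1", "ledBy", "uni/A"),
    ("project/2", "ledBy", "uni/A"),
    ("project/3", "ledBy", "uni/B"),
    ("uni/A", "postcode", "postcodeunit/AA11AA"),
    ("uni/B", "postcode", "postcodeunit/BB22BB"),
    ("postcodeunit/AA11AA", "district", "district/X"),
    ("postcodeunit/BB22BB", "district", "district/Y"),
}


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the store."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def projects_by_district():
    """Group projects by district via project -> university -> postcode -> district."""
    result = defaultdict(set)
    for project, predicate, university in triples:
        if predicate != "ledBy":
            continue
        for postcode in objects(university, "postcode"):
            for district in objects(postcode, "district"):
                result[district].add(project)
    return dict(result)


print(projects_by_district())
```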

Basically what I created from this was an aggregation of various datasets that you can now query. This is something that is made very easy using linked data and URIs to identify things like postcodes. As more publishers release data in linked data form there is more and more potential for building services and applications on top of aggregations of these datasets. So that’s what I decided to do…

This application (I make the usual apology for my lack of web development skills and for the slowness, which some caching would no doubt sort out) builds a clickable map view of this data aggregation. The OS OpenSpace API makes it possible to retrieve the unit ID for selected polygons. I can then use this unit ID in a SPARQL query to find the projects funded in that region.

However, it would have been easier if there were a RESTful API on top of the data aggregation that would have let me retrieve these results instead of doing some SPARQL. So that is what I decided to build next using the Linked Data API. The Linked Data API basically lets you create RESTful-type shortcuts to relatively complex SPARQL queries. Due to my lack of PHP skills it was an initially bumpy ride getting it to work (see here) but I got there in the end, and the result was an API that lets you return research projects by selecting regions either through their SNAC codes or Ordnance Survey IDs, using URIs of the form:

http://www.johngoodwin.me.uk/bis/api/project/county/os/{unit id}

for example:

http://www.johngoodwin.me.uk/bis/api/project/county/os/17765

Results can be returned in different formats using content negotiation [1] or by simply adding the relevant extension (.html or .json) to the URI, e.g.:

http://www.johngoodwin.me.uk/bis/api/project/euro/os/41424.html

http://www.johngoodwin.me.uk/bis/api/project/euro/os/41424.json
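The two ways of choosing a format can be sketched from the client side as follows. This is a Python sketch that only builds the URL and the request object without sending anything. The helper names are invented, and the media type passed in the Accept header is an assumption about what the server would understand.

```python
from urllib.request import Request

API_BASE = "http://www.johngoodwin.me.uk/bis/api/project"


def project_url(region_type, unit_id, fmt=None):
    """Build an API URL; fmt ('html' or 'json') selects a format by suffix."""
    url = f"{API_BASE}/{region_type}/os/{unit_id}"
    return f"{url}.{fmt}" if fmt else url


def conneg_request(region_type, unit_id, media_type="application/json"):
    """Build (but do not send) a request that asks for a format through
    content negotiation. The media type is an assumed value."""
    return Request(project_url(region_type, unit_id),
                   headers={"Accept": media_type})


print(project_url("euro", 41424, "json"))
print(conneg_request("county", 17765).get_header("Accept"))
```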

I hope this example shows how linked data can be useful in building applications on top of data aggregations. To summarise:

  1. Publishers release data in linked data format.
  2. Having data in a common format (RDF) with dereferenceable URIs makes it relatively easy to retrieve and aggregate from a number of resources, especially if data is linked to URIs for ‘things’ and not just ‘strings’.
  3. The Linked Data API makes it possible to build a RESTful service on top of a data aggregation so web developers need not be put off by complex SPARQL queries.
  4. Applications can then be built using these services.

[1] for some reason the HTML conneg only seems to work in Firefox.
