
Quick Play with Cayley Graph DB and Ordnance Survey Linked Data


Earlier this month Google announced the release of the open source graph database/triplestore Cayley. This weekend I thought I would have a quick look at it, and try some simple queries using the Ordnance Survey Linked Data.

Cayley is written in Go, so first I had to download and install that. I then downloaded Cayley from here. As an initial experiment I decided to use the Boundary Line Linked Data, and you can grab the data as N-Triples here. I only wanted a subset of this data – I didn’t need all of the triples storing the complex boundary geometries for my initial test, so I discarded the files of the form *-geom.nt and the files of the form county.nt, dbu.nt etc. (these are the ones with the boundaries in). Finally I put the remainder of the data into one file so it was ready to load into Cayley.
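The file selection described above can be sketched as a small JavaScript filter. This is only an illustration: the function name, the sample file names and the exclusion list passed in are my own assumptions, and only county.nt and dbu.nt are named in the post, so the full list of boundary files has to be supplied by whoever runs it.

```javascript
// Decide which N-Triples files to keep when building the combined file.
// `excludedNames` is supplied by the caller; the post only names
// county.nt and dbu.nt, so the complete list is NOT known here.
function selectTripleFiles(fileNames, excludedNames) {
  return fileNames.filter(function (name) {
    // Keep only N-Triples files.
    if (!/\.nt$/.test(name)) return false;
    // Drop the geometry files (*-geom.nt).
    if (/-geom\.nt$/.test(name)) return false;
    // Drop the explicitly excluded boundary files.
    return excludedNames.indexOf(name) === -1;
  });
}

// Hypothetical directory listing; "example.nt" stands in for any file to keep.
var example = selectTripleFiles(
  ["county.nt", "county-geom.nt", "dbu.nt", "example.nt", "readme.txt"],
  ["county.nt", "dbu.nt"]
);
```

The files that survive the filter are the ones to concatenate into all.nt.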

It is very easy to load data into Cayley – see the getting started section of the Cayley pages here. I decided I wanted to try the web interface, so loading the data (in a file called all.nt) was a simple case of typing:

./cayley http --dbpath=./boundaryline/all.nt

Once you’ve done this, point your web browser to http://localhost:64210/ and you should see something like:

Screen Shot 2014-06-29 at 10.43.35

 

One of the things that will first strike people used to using RDF/triplestores is that Cayley does not have a SPARQL interface, and instead uses a query language based on Gremlin. I am new to Gremlin, but it seems it has already been used to explore linked data – see the blog post from Dan Brickley from a few years ago.

The main purpose of this blog post is to give a few simple examples of queries you can perform on the Ordnance Survey data in Cayley. If you have Cayley running then you can find the query language documented here.

At the simplest level the query language seems to be an easy way to traverse the graph by starting at a node/vertex and following incoming or outgoing links. So to find all the regions that touch Southampton, it is a simple case of starting at the Southampton node, following an outbound touches link and returning the results:

g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches").All()

Giving:

Screen Shot 2014-06-29 at 10.56.15

If you want to return the names and not the IDs:

g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches").Out("http://www.w3.org/2000/01/rdf-schema#label").All()

Screen Shot 2014-06-29 at 10.58.30

You can also filter – so to just see the counties bordering Southampton:

g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches").Has("http://www.w3.org/1999/02/22-rdf-syntax-ns#type", "http://data.ordnancesurvey.co.uk/ontology/admingeo/County").Out("http://www.w3.org/2000/01/rdf-schema#label").All()

Screen Shot 2014-06-29 at 11.01.17
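Because the queries are JavaScript, the long URIs can be held in variables to make the traversal easier to read. The following is a sketch of the county query in that style, not something taken from the Cayley documentation: the variable and function names are my own, and the graph object is passed in as a parameter so the function does not depend on Cayley's global g.

```javascript
// Short names for the URIs used in the queries above.
// The variable names are illustrative; the URIs are the ones from the post.
var SOUTHAMPTON = "http://data.ordnancesurvey.co.uk/id/7000000000037256";
var TOUCHES = "http://data.ordnancesurvey.co.uk/ontology/spatialrelations/touches";
var RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
var COUNTY = "http://data.ordnancesurvey.co.uk/ontology/admingeo/County";
var LABEL = "http://www.w3.org/2000/01/rdf-schema#label";

// Build the path: start at `id`, follow touches, keep only counties,
// then step to their labels. `graph` is expected to behave like Cayley's g.
function labelsOfTouchingCounties(graph, id) {
  return graph.V(id).Out(TOUCHES).Has(RDF_TYPE, COUNTY).Out(LABEL);
}
```

In the Cayley query box this would be run as labelsOfTouchingCounties(g, SOUTHAMPTON).All(), assuming the build in use accepts var and function definitions in a query.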

 

The Ordnance Survey linked data also has spatial predicates ‘contains’, ‘within’ as well as ‘touches’. Analogous queries can be done with those. E.g. find me everything Southampton contains:

g.V("http://data.ordnancesurvey.co.uk/id/7000000000037256").Out("http://data.ordnancesurvey.co.uk/ontology/spatialrelations/contains").Out("http://www.w3.org/2000/01/rdf-schema#label").All()

So after this very quick initial experiment it seems that Cayley provides an easy way of doing quick, simple queries. One query I wanted to do was find everything in, say, Hampshire – the full transitive closure. This is very easy to do in SPARQL, but in Cayley (at first glance) you’d have to write some extra code (not exactly rocket science, but a bit of a faff compared to SPARQL). I rarely touch JavaScript these days, so for me personally this will never replace a triplestore with a SPARQL endpoint, but for JS developers this tool will be a great way to get started with and explore linked data/RDF. I might well brush up on my JavaScript and provide more complicated examples in a later blog post…
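For what it is worth, the extra code for a transitive closure could look something like the sketch below: a plain breadth-first walk that keeps following links until nothing new turns up. It is not tested against Cayley. The function name, the neighbours callback and the toy graph are all assumptions of mine, used so the walk can be shown on its own.

```javascript
// Collect every node reachable from `start` by repeatedly following links.
// `neighbours` is a hypothetical callback: given a node ID, it returns an
// array of the node IDs directly linked from it (e.g. via 'contains').
function transitiveClosure(start, neighbours) {
  var seen = Object.create(null); // IDs already visited (guards against cycles)
  var queue = [start];
  var result = [];
  seen[start] = true;
  while (queue.length > 0) {
    var node = queue.shift();
    var next = neighbours(node) || [];
    for (var i = 0; i < next.length; i++) {
      if (!seen[next[i]]) {
        seen[next[i]] = true;
        result.push(next[i]);
        queue.push(next[i]);
      }
    }
  }
  return result;
}

// Toy stand-in graph with made-up node names, including a cycle (D -> A).
var toy = { A: ["B", "C"], B: ["D"], C: [], D: ["A"] };
var found = transitiveClosure("A", function (id) { return toy[id] || []; });
// found is ["B", "C", "D"]
```

Inside Cayley the callback would have to wrap a query that returns the outbound contains links of a node as an array. I believe the query API has a ToArray() call for this, but treat that as an assumption to check against the documentation.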

 

 

 

  1. Nolan Nichols
    June 29, 2014 at 11:31 pm

    nice overview, thanks! Surprising that Google went with Gremlin over SPARQL – makes me think that this is built off of something related to the tinkerpop stack (http://www.tinkerpop.com/)

    • John Goodwin
      June 30, 2014 at 6:44 am

      Thanks. I will hopefully get round to looking at some more complex queries in a further post. It probably works well for a web developer wanting to do reasonably simple things with the RDF data – probably less of a learning curve than getting into full SPARQL.
